Revision as of 10:46, 5 July 2023
Data Science in the Government of Canada
What is Data Science?
Data Science is an interdisciplinary field that uses scientific methods and algorithms to extract information and insights from diverse data types. It combines domain expertise, programming skills and knowledge of mathematics and statistics to solve analytically complex problems.
For a short overview of a data scientist's daily work in the Government of Canada, watch this video from a previous Employment and Social Development Canada (ESDC) recruitment campaign. The embedded video can be viewed on the source page.
The COVID-19 Pandemic: A Stark Reminder of the Crucial Role of Data Scientists
Data science allows government agencies and departments to respond quickly to changing economic and social situations. For example, Statistics Canada is using data science to support the COVID-19 response in Canada.
The agency collaborated with Health Canada to visualize supply and demand information for Personal Protective Equipment (PPE). Before the data visualization could begin, the data needed to be extracted and ingested. The data arrived daily from many sources (provincial and territorial governments, other federal departments, and private sector companies hired to help source the PPE) and in many formats (e.g. Word documents, Excel files, PDFs). Creating standardized reports from them required a significant amount of manual work.
To improve this process, data scientists at Statistics Canada created an algorithm that parses the data into separate data entries. Machine learning was used to identify numbers and dates within the text. The structured data were then presented in a Power BI dashboard that was shared with other government departments to meet their information needs and help them better understand the supply and demand for PPE in Canada.
Source: Data Science Centre, Statistics Canada, 2022-09-29
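As a rough illustration of the parsing step described above, the following Python sketch pulls dates and quantities out of one line of free text. It uses regular expressions rather than the machine learning approach Statistics Canada describes. The sample sentence, the `parse_entry` function, and the assumed ISO date format are all hypothetical, because the source does not show the actual report formats or code.

```python
import re
from datetime import date

# Assumption: dates appear in ISO form (YYYY-MM-DD). Real reports may differ.
DATE_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
# Whole numbers, with or without thousands separators (e.g. 12,500 or 300).
NUMBER_RE = re.compile(r"\b\d{1,3}(?:,\d{3})+\b|\b\d+\b")


def parse_entry(text):
    """Extract ISO dates and integer quantities from one line of text."""
    dates = [date(int(y), int(m), int(d)) for y, m, d in DATE_RE.findall(text)]
    # Blank out the dates first so their digits are not read as quantities.
    remainder = DATE_RE.sub(" ", text)
    numbers = [int(n.replace(",", "")) for n in NUMBER_RE.findall(remainder)]
    return {"dates": dates, "numbers": numbers}


# Hypothetical example line, not taken from any real report.
sample = "Received 12,500 N95 masks on 2020-04-15; 300 gowns pending."
entry = parse_entry(sample)
print(entry["dates"], entry["numbers"])
```

Because the number pattern requires a word boundary before the first digit, the "95" in "N95" is not treated as a quantity. A production pipeline would still need to handle the Word, Excel, and PDF inputs mentioned above before any text-level parsing like this could run.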
Data Scientist Job Profile in the Government of Canada
The full description of the following job profile can be found on the GCwiki page of the Data Science Network for the Federal Public Service (DSNFPS).
Data Scientist
Data scientists use data to identify and solve complex business problems. They have an interdisciplinary focus, using techniques and knowledge from a range of scientific and computer science disciplines (for example, economics, statistics, mathematics, predictive analytics, and machine learning) and are generally part of multidisciplinary project teams involving data science engineers, business owners, social scientists, business analysts, project managers, software engineers/designers, and others. The roles and responsibilities of a data scientist may include:
● eliciting problems from business owners, understanding where data science can add value in supporting strategic and operational decision making, and designing data science solutions and metrics for these problems;
● cleaning, processing, and exploring structured and unstructured data to extract actionable insights for making business decisions;
● building and validating statistical models from data, often using advanced statistical techniques such as econometrics, machine learning, predictive analytics, regression, segmentation, and other relevant techniques;
● supporting computer scientists and data engineers conducting the deployment and maintenance of the models;
● using best coding practices to generate reproducible, verifiable work;
● exploring and visualising data to present the ‘story’ of data, based on a thorough understanding of business processes and incentivized behaviour, in a meaningful way to a range of technical and non-technical audiences;
● using an evolving range of data analysis tools and techniques, including open source, some of which must be learnt quickly, as and when required;
● adhering to standards, guidelines and norms around digital solutions and responsible development and implementation of artificial intelligence and machine learning.
