Changes

no edit summary
Line 18: Line 18:  
               <th>[[Tendances_Technologiques|Tendances Technologiques]]</th>
 
               <th>[[Tendances_Technologiques|Tendances Technologiques]]</th>
 
               <th> / </th>
 
               <th> / </th>
               <th>[[Tendances_Technologiques/Lacs_de_donnees|Lac de Données]]</th>
+
               <th>[[Tendances_Technologiques|Lac de Donnés]]</th>
 
             </tr>
 
             </tr>
 
           </table>
 
           </table>
Line 34: Line 34:  
       <tr>
 
       <tr>
 
         <th>Latest version</th>
 
         <th>Latest version</th>
         <td>January 28, 2020</td>
+
         <td>February 17, 2020</td>
 
       </tr>
 
       </tr>
 
       <tr>
 
       <tr>
Line 114: Line 114:  
     <li><p><b>Distributed processing </b>capabilities associated with a logical data warehouse.</p></li>
 
     <li><p><b>Distributed processing </b>capabilities associated with a logical data warehouse.</p></li>
 
   </ul>
 
   </ul>
   <p class="expand mw-collapsible-content"><b>[https://www.datanami.com/2017/10/03/td-bank-made-data-lake-usable How TD Bank Made Its Data Lake More Usable]</b></p>
+
   <p class="expand mw-collapsible-content"><b>[https://www.datanami.com/2017/10/03/td-bank-made-data-lake-usable How TD Bank Made Its Data Lake More Usa]</b></p>
 
   <p class="expand mw-collapsible-content">Toronto-Dominion Bank (TD Bank) is one of the largest banks in North America, with 85,000 employees, more than 2,400 locations between Canada and the United States, and assets nearing $1 trillion. In 2014, the company decided to standardize how it warehouses data for various business intelligence and regulatory reporting functions. The company purchased a Hadoop distribution and set off to build a large cluster that could function as a centralized lake to store data originating from a variety of departments.</p>
 
   <p class="expand mw-collapsible-content">Toronto-Dominion Bank (TD Bank) is one of the largest banks in North America, with 85,000 employees, more than 2,400 locations between Canada and the United States, and assets nearing $1 trillion. In 2014, the company decided to standardize how it warehouses data for various business intelligence and regulatory reporting functions. The company purchased a Hadoop distribution and set off to build a large cluster that could function as a centralized lake to store data originating from a variety of departments.</p>
    
   <h2>Canadian Government Use</h2>
 
   <h2>Canadian Government Use</h2>
<p>In 2019, the Treasury Board of Canada Secretariat (TBS) partnered with Shared Services Canada and other departments to identify a business lead to develop a Data Lake (a repository of raw data) service strategy so that the GC can take advantage of big data and market innovation to foster better analytics and promote horizontal data-sharing<ref>Treasury Board of Canada Secretariat. (March 29th, 2019). Digital Operations Strategic Plan: 2018-2022. Government of Canada. Treasury Board of Canada Secretariat. Retrieved 26-May-2019 from: <i>[https://www.canada.ca/en/government/system/digital-government/digital-operations-strategic-plan-2018-2022.html] </i></ref>. </p>
+
<p>In 2019, the Treasury Board of Canada Secretariat (TBS) partnered with Shared Services Canada and other departments to identify a business lead to develop a Data Lake (a repository of raw data) service strategy so that the GC can take advantage of big data and market innovation to foster better analytics and promote horizontal data-sharing.<ref>Treasury Board of Canada Secretariat. (March 29th, 2019). Digital Operations Strategic Plan: 2018-2022. Government of Canada. Treasury Board of Canada Secretariat. Retrieved 26-May-2019 from: <i>[https://www.canada.ca/en/government/system/digital-government/digital-operations-strategic-plan-2018-2022.html] </i></ref> </p>
<p class="expand mw-collapsible-content">Big data is the technology that stores and processes data and information in datasets that are so large or complex that traditional data processing applications can’t analyze them. Big data can make available almost limitless amounts of information, improving data-driven decision-making and expanding open data initiatives. Business intelligence involves creating, aggregating, analyzing and visualizing data to inform and facilitate business management and strategy. TBS, working with departments, will lead the development of requirements for an enterprise analytics platform<ref>Ibid.<i></i></ref>.</p>
+
<p class="expand mw-collapsible-content">Big data is the technology that stores and processes data and information in datasets that are so large or complex that traditional data processing applications can’t analyze them. Big data can make available almost limitless amounts of information, improving data-driven decision-making and expanding open data initiatives. Business intelligence involves creating, aggregating, analyzing and visualizing data to inform and facilitate business management and strategy. TBS, working with departments, will lead the development of requirements for an enterprise analytics platform.<ref>Ibid.<i></i></ref></p>
 
<p>Data Lake development in the GC is a more recent initiative, mainly because the GC has been focussing resources on the implementation of cloud initiatives. However, some GC departments are developing Data Lake environments in tandem with cloud initiatives.</p>
 
<p>Data Lake development in the GC is a more recent initiative, mainly because the GC has been focussing resources on the implementation of cloud initiatives. However, some GC departments are developing Data Lake environments in tandem with cloud initiatives.</p>
<p class="expand mw-collapsible-content">Notably, Employment and Social Development Canada (ESDC) is preparing the installation of multiple Data Lakes in order to enable a Data Lake Ecosystem and a Data Analytics and Machine Learning toolset. This will enable ESDC to share information horizontally both effectively and safely, while enabling a wide variety of data analytics capabilities. ESDC aims to keep current data and analytics capabilities up to date while exploring new ones to mitigate gaps and continuously evolve its services to meet clients’ needs<ref>Brisson, Yannick, and Craig, Sheila. (November, 2018). ESDC Data Lake – Implementation Strategy and Roadmap Update. Government of Canada. Employment and Social Development Canada – Data and Analytics Services. Presentation. Last Modified on 2019-04-26 15:45. Retrieved 07-May-2019 from GCDocs<i>[https://gcdocs.gc.ca/ssc-spc/llisapi.dll?func=ll&objaction=overview&objid=36624914 ]</i></ref>. </p>
+
<p class="expand mw-collapsible-content">Notably, Employment and Social Development Canada (ESDC) is preparing the installation of multiple Data Lakes in order to enable a Data Lake Ecosystem and a Data Analytics and Machine Learning toolset. This will enable ESDC to share information horizontally both effectively and safely, while enabling a wide variety of data analytics capabilities. ESDC aims to keep current data and analytics capabilities up to date while exploring new ones to mitigate gaps and continuously evolve its services to meet clients’ needs.<ref>Brisson, Yannick, and Craig, Sheila. (November, 2018). ESDC Data Lake – Implementation Strategy and Roadmap Update. Government of Canada. Employment and Social Development Canada – Data and Analytics Services. Presentation. Last Modified on 2019-04-26 15:45. Retrieved 07-May-2019 from GCDocs<i>[https://gcdocs.gc.ca/ssc-spc/llisapi.dll?func=ll&objaction=overview&objid=36624914 ]</i></ref> </p>
 
   <h2>Implications for Government Agencies</h2>
 
   <h2>Implications for Government Agencies</h2>
 
   <h3>Shared Services Canada (SSC)</h3>
 
   <h3>Shared Services Canada (SSC)</h3>
 
   <h4>Value Proposition</h4>
 
   <h4>Value Proposition</h4>
 
   <p class="expand mw-collapsible-content">There are three common value propositions for pursuing Data Lakes: 1) it can provide an easy and accessible way to obtain data faster; 2) it can create a single inflow point for data to help connect and merge information silos in an organization; and 3) it can provide an experimental environment in which experienced data scientists can develop new analytical insights.</p>
 
   <p class="expand mw-collapsible-content">There are three common value propositions for pursuing Data Lakes: 1) it can provide an easy and accessible way to obtain data faster; 2) it can create a single inflow point for data to help connect and merge information silos in an organization; and 3) it can provide an experimental environment in which experienced data scientists can develop new analytical insights.</p>
   <p class="inline-spacer">   </p>
+
   <p class="inline-spacer"> </p>
 
   <p class="inline">Data Lakes can provide data to consumers more quickly by offering data in a more raw and easily accessible form. Data is stored in its native form with little to no processing, and the lake is optimized to store vast amounts of data in their native formats. Because the data remains in its native format, a much timelier stream of data is available for unlimited queries and analysis. A Data Lake can help data consumers bypass applications with strict data retrieval and data structure requirements, such as a data warehouse and/or data mart. This has the effect of improving a business’s data flexibility.</p><p class="expand inline mw-collapsible-content">Some companies have in fact used Data Lakes to replace existing warehousing environments where implementing a new data warehouse is cost-prohibitive. A Data Lake can contain unrefined data, which is helpful either when a business data structure is unknown or when a data consumer requires access to the data quickly. </p>
 
   <p class="inline">Data Lakes can provide data to consumers more quickly by offering data in a more raw and easily accessible form. Data is stored in its native form with little to no processing, and the lake is optimized to store vast amounts of data in their native formats. Because the data remains in its native format, a much timelier stream of data is available for unlimited queries and analysis. A Data Lake can help data consumers bypass applications with strict data retrieval and data structure requirements, such as a data warehouse and/or data mart. This has the effect of improving a business’s data flexibility.</p><p class="expand inline mw-collapsible-content">Some companies have in fact used Data Lakes to replace existing warehousing environments where implementing a new data warehouse is cost-prohibitive. A Data Lake can contain unrefined data, which is helpful either when a business data structure is unknown or when a data consumer requires access to the data quickly. </p>
<p class="inline-spacer">   </p>
+
<p class="inline-spacer"> </p>
 
<p class="inline">A Data Lake is not a single source of truth; it is a central location in which data from all data sources converges and is stored, regardless of its format. </p><p class="expand inline mw-collapsible-content">As a single point for the inflow of data, sections of a business can pool their information together in the Data Lake and increase the sharing of information with other parts of the organization. In this way, everyone in the organization has access to the data. A Data Lake can increase horizontal data sharing within an organization by creating this single data inflow point. Using a variety of storage and processing tools, analysts can quickly extract value from the data in order to inform key business decisions.</p>
 
<p class="inline">A Data Lake is not a single source of truth; it is a central location in which data from all data sources converges and is stored, regardless of its format. </p><p class="expand inline mw-collapsible-content">As a single point for the inflow of data, sections of a business can pool their information together in the Data Lake and increase the sharing of information with other parts of the organization. In this way, everyone in the organization has access to the data. A Data Lake can increase horizontal data sharing within an organization by creating this single data inflow point. Using a variety of storage and processing tools, analysts can quickly extract value from the data in order to inform key business decisions.</p>
 
   <p class="inline-spacer">  </p>
 
   <p class="inline-spacer">  </p>