Line 55: |
Line 55: |
| <h2>Business Brief</h2> | | <h2>Business Brief</h2> |
| <p>In an increasingly hyperconnected world, corporations and businesses are struggling with the storage, management and quick availability of raw data. These data challenges break down as follows:</p> | | <p>In an increasingly hyperconnected world, corporations and businesses are struggling with the storage, management and quick availability of raw data. These data challenges break down as follows:</p> |
− | <ul class="expand mw-collapsible-content"> | + | <ul> |
| <li>Data comes in many different structures. | | <li>Data comes in many different structures. |
| <ul> | | <ul> |
Line 116: |
Line 116: |
| <li><p><b>Distributed processing </b>capabilities associated with a logical data warehouse.</p></li> | | <li><p><b>Distributed processing </b>capabilities associated with a logical data warehouse.</p></li> |
| </ul> | | </ul> |
− | <b class="expand mw-collapsible-content">How TD Bank Made Its Data Lake More Usa</b> | + | <p class="expand mw-collapsible-content"><b>How TD Bank Made Its Data Lake More Usa</b></p> |
− | <p class="expand mw-collapsible-content">[[https://www.datanami.com/2017/10/03/td-bank-made-data-lake-usable/]]<br>Toronto-Dominion Bank (TD Bank) is one of the largest banks in North America, with 85,000 employees, more than 2,400 locations between Canada and the United States, and assets nearing $1 trillion. In 2014, the company decided to standardize how it warehouses data for various business intelligence and regulatory reporting functions. The company purchased a Hadoop distribution and set off to build a large cluster that could function as a centralized lake to store data originating from a variety of departments.</p> | + | <p class="expand mw-collapsible-content">[[https://www.datanami.com/2017/10/03/td-bank-made-data-lake-usable]]<br>Toronto-Dominion Bank (TD Bank) is one of the largest banks in North America, with 85,000 employees, more than 2,400 locations between Canada and the United States, and assets nearing $1 trillion. In 2014, the company decided to standardize how it warehouses data for various business intelligence and regulatory reporting functions. The company purchased a Hadoop distribution and set off to build a large cluster that could function as a centralized lake to store data originating from a variety of departments.</p> |
| | | |
| <h2>Canadian Government Use</h2> | | <h2>Canadian Government Use</h2> |
Line 128: |
Line 128: |
| <h4>Value Proposition</h4> | | <h4>Value Proposition</h4> |
| <p class="expand mw-collapsible-content">There are three common value propositions for pursuing a Data Lake: 1) it can provide an easy and accessible way to obtain data faster; 2) it can create a singular inflow point of data to help connect and merge information silos in an organization; and 3) it can provide an experimental environment in which experienced data scientists can develop new analytical insights.</p> | | <p class="expand mw-collapsible-content">There are three common value propositions for pursuing a Data Lake: 1) it can provide an easy and accessible way to obtain data faster; 2) it can create a singular inflow point of data to help connect and merge information silos in an organization; and 3) it can provide an experimental environment in which experienced data scientists can develop new analytical insights.</p> |
| + | <br> |
| <p class="inline">Data Lakes can provide data to consumers more quickly by offering it in a rawer, more easily accessible form. Data is stored in its native format with little to no processing, and the lake is optimized to store vast amounts of it. Because the data remains in its native format, a much timelier stream of data is available for unlimited queries and analysis. A Data Lake can help data consumers bypass strict data retrieval and structured-data applications such as a data warehouse and/or data mart. This improves a business’s data flexibility.</p><p class="expand inline mw-collapsible-content">Some companies have in fact used Data Lakes to replace existing warehousing environments where implementing a new data warehouse would be cost prohibitive. A Data Lake can contain unrefined data, which is helpful either when a business data structure is unknown or when a data consumer requires access to the data quickly. </p> | | <p class="inline">Data Lakes can provide data to consumers more quickly by offering it in a rawer, more easily accessible form. Data is stored in its native format with little to no processing, and the lake is optimized to store vast amounts of it. Because the data remains in its native format, a much timelier stream of data is available for unlimited queries and analysis. A Data Lake can help data consumers bypass strict data retrieval and structured-data applications such as a data warehouse and/or data mart. This improves a business’s data flexibility.</p><p class="expand inline mw-collapsible-content">Some companies have in fact used Data Lakes to replace existing warehousing environments where implementing a new data warehouse would be cost prohibitive. A Data Lake can contain unrefined data, which is helpful either when a business data structure is unknown or when a data consumer requires access to the data quickly. </p> |
− | <p></p> | + | <br> |
| <p class="inline">A Data Lake is not a single source of truth. A Data Lake is a central location in which data converges from all data sources and is stored, regardless of its format. </p><p class="expand inline mw-collapsible-content">As a singular point for the inflow of data, sections of a business can pool their information together in the Data Lake and increase the sharing of information with other parts of the organization. In this way, everyone in the organization has access to the data. A Data Lake can increase horizontal data sharing within an organization by creating this singular data inflow point. Using a variety of storage and processing tools, analysts can quickly extract value from the data in order to inform key business decisions.</p> | | <p class="inline">A Data Lake is not a single source of truth. A Data Lake is a central location in which data converges from all data sources and is stored, regardless of its format. </p><p class="expand inline mw-collapsible-content">As a singular point for the inflow of data, sections of a business can pool their information together in the Data Lake and increase the sharing of information with other parts of the organization. In this way, everyone in the organization has access to the data. A Data Lake can increase horizontal data sharing within an organization by creating this singular data inflow point. Using a variety of storage and processing tools, analysts can quickly extract value from the data in order to inform key business decisions.</p> |
− | <p></p> | + | <br> |
| <p class="expand inline mw-collapsible-content">A Data Lake is optimized for exploration and provides an experimental environment for experienced data scientists to uncover new insights from data. Analysts can overlay context on the data to extract value. All organizations want to increase analytics and operational agility.</p><p class="inline">The Data Lake architectural approach can store large volumes of data, so cross-cutting teams can pool their data in a central location and complement their systems of record with systems of insight. </p><p class="expand inline mw-collapsible-content">Data Lakes present the most potential benefits for experienced and competent data scientists. </p><p class="inline">Holding structured, unstructured and semistructured data, often in the same data set, can yield business, predictive, and prescriptive insights not previously possible from a structured platform such as a data warehouse or data mart.</p> | | <p class="expand inline mw-collapsible-content">A Data Lake is optimized for exploration and provides an experimental environment for experienced data scientists to uncover new insights from data. Analysts can overlay context on the data to extract value. All organizations want to increase analytics and operational agility.</p><p class="inline">The Data Lake architectural approach can store large volumes of data, so cross-cutting teams can pool their data in a central location and complement their systems of record with systems of insight. </p><p class="expand inline mw-collapsible-content">Data Lakes present the most potential benefits for experienced and competent data scientists. </p><p class="inline">Holding structured, unstructured and semistructured data, often in the same data set, can yield business, predictive, and prescriptive insights not previously possible from a structured platform such as a data warehouse or data mart.</p> |
| | | |