Difference between revisions of "GC Enterprise Architecture/Framework/DataGuide"

Revision as of 08:55, 6 February 2023

Information architecture

Information architecture is defined as the management and organization of data for a business. Its best practices and principles aim to support the needs of a business service and business capability orientation. To facilitate effective sharing of data and information across government, information architectures should be designed to reflect a consistent approach to both structured and unstructured data, such as the adoption of federal and international standards. Information architecture should also reflect responsible data management, information management, and governance practices, including the source, quality, interoperability, and associated legal and policy obligations related to the data assets. Information architectures should also distinguish between personal and non‑personal data. How personal information is treated, including its collection, use, sharing (disclosure), and management, must respect the requirements of the Privacy Act and its related policies. The points below outline the objectives to be fulfilled in order to maintain information architecture standards.

[Figure: Info Arch System]

Collect data to address the needs of the users and other stakeholders

Collecting data is an important activity for defining a scope that meets all stakeholder requirements. Stakeholder requirements refer to what is expected of an activity, such as the business capabilities being addressed, resource requirements, and timeline. Assessing these requirements makes it easier to draw boundaries around the activity and to ensure that a high-quality result is delivered. To track and demonstrate that requirements are being fulfilled throughout the activity, data must first be collected. Data collection is a systematic process of gathering observations for the purpose of addressing programs or policies in a business or the government.

The initial step in collecting data is to identify what information needs to be gathered and where it will come from. For example, when creating a centralized patient database for a hospital, once the user requirements are identified, the next consideration is how the data will be collected. There are several ways of collecting data, such as interviews, surveys or questionnaires, workshops, and observing consumers. The goal is to gather these requirements and prioritize them. Questions for surveys or interviews must be consistent so that the data stays organized and is easier to prioritize. The data itself should be reusable and easily transportable to another system to save time and money.

  • assess program objectives based on data requirements, as well as users, business and stakeholder needs
  How to achieve:
    * Summarize how the architecture meets the data needs of the users and other key stakeholders including:
        * How does the data asset contribute to outcomes/needs of the user and other stakeholders
        * Gaps in the existing data assets to meet the needs of the users and other stakeholders and how the architecture addresses these gaps
        * Gaps in data collection and analysis and how the architecture addresses them so that the department can ensure it is serving the members of our society
        * Alignment to the data foundation of the departmental information/data architecture practice
        * Alignment to the theoretical foundation of the departmental information/data architecture practice
   Tools:
    * For Data Foundation – Implement:
         * Data Catalogue
         * Benefits Knowledge hub
         * Data Lake (growth)
         * Data Science and Machine Platform
    * Stakeholder Requirements
    * Solution Requirements
  • collect only the minimum set of data needed to support a policy, program, or service
   How to achieve:
    * Summarize how the architecture aligns to “collect with a purpose” including:
         * What is necessary (as opposed to what is sufficient) to meet the stakeholder need
         * Supporting Performance Information Profiles (PIPs) used to assess progress towards targets and broader objectives
  Tools:
    * Value Stream (Value Item and Value Proposition – Context on what we measure)
    * KPI  (Linked to benefits, outcomes and objectives)
  • reuse existing data assets where permissible and only acquire new data if required
   How to achieve:
    * Summarize reusability of the architecture’s data assets given:
        * Context of data assets and user and stakeholder needs
        * Data quality and fit for purpose
        * Privacy and Security Regulatory Framework
   Tools: 
    * Legislative / Regulations
  • ensure data collected, including from third-party sources, are of high quality
   How to achieve:
    * Summarize how the architecture meets the data quality requirements of third-party sources:
        * Data quality meets fit for purpose
        * Data quality dimensions including:
             * Relevance
             * Timeliness
             * Consistency
             * Reliability
             * Interpretability
             * Usability
        * Data quality mechanism 
   Tools: 
    * Data Foundation – Implement (Leverage the standard definition)
       * Data Catalogue
       * Benefits Knowledge hub
       * Data Lake (growth)
       * Data Science and Machine Platform
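
The data quality bullet above can be made concrete with a small check that runs before third-party data is accepted. The following is a minimal sketch, not a prescribed mechanism: the field names, the one-year age limit, and the choice to test only two of the listed dimensions (usability, approximated by required fields being present, and timeliness) are assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical required fields and age limit, chosen only for illustration.
REQUIRED_FIELDS = ("record_id", "source", "collected_on")
MAX_AGE = timedelta(days=365)


def quality_issues(record: dict, today: date) -> list[str]:
    """Return the data quality problems found in one third-party record."""
    issues = []
    # Usability: every required field must be present and non-empty.
    for field_name in REQUIRED_FIELDS:
        if not record.get(field_name):
            issues.append(f"missing {field_name}")
    # Timeliness: the record must not be older than the agreed limit.
    collected_on = record.get("collected_on")
    if collected_on is not None and today - collected_on > MAX_AGE:
        issues.append("stale record")
    return issues


records = [
    {"record_id": "A1", "source": "partner", "collected_on": date(2023, 1, 10)},
    {"record_id": "A2", "source": "", "collected_on": date(2020, 5, 1)},
]
# Map each record identifier to its list of issues (empty list means it passed).
report = {r["record_id"]: quality_issues(r, today=date(2023, 2, 6)) for r in records}
```

A report like this could feed the "data quality mechanism" item: records with an empty issue list are fit for purpose under these assumed rules, and the rest are returned to the source or excluded.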

Manage and reuse data strategically and responsibly

Data architecture is defined as the management of data by translating an organization's business requirements into technical requirements. The management of data refers to the collection, storage, and usage of data in an information system. Data management and the direction of data flow are guided by the frameworks of models, policies, rules, and standards used by the organization. Together, these provide a foundation for working efficiently with data and for governing data access by establishing roles, responsibilities, and accountabilities. For example, an organization may have conceptualized a system to store information. To succeed, the system must provide adequate data storage capabilities as well as role-based access functionality. To assess these capabilities, the organization should review the system’s functions and compare them to the user and stakeholder requirements to ensure they adequately support organizational policies. The system must also maintain data lineage so that data can be traced back to its origin.
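
To illustrate what maintaining data lineage can mean in practice, the sketch below records, for each data asset, the assets it was derived from and walks that graph back to the original sources. The class, the function, and the asset names are hypothetical and are not part of any departmental tool.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """A data asset that remembers which assets it was derived from."""
    name: str
    derived_from: list["DataAsset"] = field(default_factory=list)


def origins(asset: DataAsset) -> set[str]:
    """Walk the lineage graph back to the assets that have no parents."""
    if not asset.derived_from:
        return {asset.name}
    found = set()
    for parent in asset.derived_from:
        found |= origins(parent)
    return found


# Hypothetical assets, loosely based on the hospital example in this guide.
admissions = DataAsset("admissions_form")
lab_results = DataAsset("lab_system_export")
patient_record = DataAsset("patient_record", [admissions, lab_results])
ward_report = DataAsset("ward_report", [patient_record])

print(sorted(origins(ward_report)))
```

In a real architecture this information would more likely live in a data catalogue than in application code; the point is only that each derived asset keeps a reference to its inputs.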

Data architectures define and set data standards and principles. Translating business requirements into technical requirements may entail creating blueprints for data flow and data management as well as assessing potential data sources. Plans may be devised to make these sources accessible to all employees while keeping them protected according to existing security and privacy policies. Data architecture identifies the data consumers within an organization, aligns with their varying requirements, and gives them access to usable data whenever they need it. For example, within a hospital, nurses and doctors work with patient data. Depending on their role, some need to update data such as an illness or a prescribed medicine, while others only view the data in order to coordinate rooms and available medical equipment. A centralized system, or a set of interoperable systems, with varying features and permissions is necessary so that all this information can be accessed at any given time. If multiple non-interoperable systems were used for different data sets, it would be difficult to maintain the flow of data throughout the hospital, which could lead to lost opportunities for time-sensitive action and harm patients in critical condition.
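
The hospital example can be sketched as a simple role-based permission table. The roles, actions, and function names below are assumptions made for illustration; a real system would take them from its functional requirements and its access control service.

```python
# Hypothetical roles and permissions for the hospital example above.
PERMISSIONS = {
    "doctor": {"view", "update"},
    "nurse": {"view", "update"},
    "coordinator": {"view"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True when the role has been granted the requested action."""
    return action in PERMISSIONS.get(role, set())


def update_patient(record: dict, role: str, changes: dict) -> dict:
    """Return an updated copy of the record, but only for permitted roles."""
    if not is_allowed(role, "update"):
        raise PermissionError(f"{role} may not update patient data")
    return {**record, **changes}
```

Unknown roles receive no permissions at all, so access is denied by default rather than granted by omission.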

  • define and establish clear roles, responsibilities, and accountabilities for data management
  How to achieve:
    * Summarize how the architecture assists in defining key data management roles and their responsibilities to ensure data is correct, consistent, and complete including:
       * Data steward responsibilities
       * Data consumer responsibilities
       * Data custodian responsibilities
   Tools: 
    * Stakeholders
    * Business Process Model
    * Functional Requirements
    * Business Glossary
  • identify and document the lineage of data assets
  How to achieve:
   * Summarize how the architecture’s data assets demonstrate alignment with department's data governance and strategy including:
       * Alignment to the data foundation of the ESDC information/data architecture practice
       * Alignment to the theoretical foundation of the ESDC information/data architecture practice
  Tools:
   * Target state (solution data elements)
   * Data Foundation – Implement (Leverage the standard definition)
      * Data Catalogue
      * Benefits Knowledge hub
      * Data Lake (growth)
      * Data Science and Machine Platform
   * Theoretical Foundation
      * EDRM (Conceptual and Logical)
      * Business Glossary
      * Departmental Data Strategy
  • define retention and disposition schedules in accordance with business value as well as applicable privacy and security policy and legislation
  How to achieve:
   * Summarize for each key data asset:
      * Retention and disposition schedules
      * Disposition process 
  Tools:
   * Target state (solution data elements)
   * Non Functional Requirements
   * IM Best Practices and Standards
  • ensure data are managed to enable interoperability, reuse and sharing to the greatest extent possible within and across departments in government to avoid duplication and maximize utility, while respecting security and privacy requirements
  How to achieve:
   * Summarize how the architecture enables interoperability, reuse and sharing to the greatest extent possible within and across departments
   * Summarize how the architecture avoids data duplication 
  Tools:
   * Target State
   * Data Foundation – Implement (Leverage the standard definition)
       * Data Catalogue
       * Benefits Knowledge hub
       * Data Lake (growth)
  • contribute to and align with enterprise and international data taxonomy and classification structures to manage, store, search and retrieve data
  How to achieve:
   * Summarize the alignment to departmental/GC:
      * Data taxonomy structure
      * Data classification structure
  Tools:
   * Data Foundation – Implement (Leverage the standard definition)
       * Data Catalogue
   * Theoretical Foundation
       * EDRM (Conceptual and Logical)
       * Business Glossary
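
The retention and disposition bullet in this list lends itself to a small worked example. The sketch below computes the earliest disposition date for a data asset from a retention period. The asset types and retention periods are hypothetical; actual schedules come from departmental IM standards and the applicable privacy and security legislation.

```python
from datetime import date

# Hypothetical retention periods in years, for illustration only.
RETENTION_YEARS = {"application_form": 2, "payment_record": 6}


def disposition_date(asset_type: str, last_used: date) -> date:
    """Return the earliest date on which the asset may be disposed of."""
    years = RETENTION_YEARS[asset_type]
    try:
        return last_used.replace(year=last_used.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: use 1 March.
        return date(last_used.year + years, 3, 1)


def is_due(asset_type: str, last_used: date, today: date) -> bool:
    """Return True once the retention period for the asset has elapsed."""
    return today >= disposition_date(asset_type, last_used)
```

Keeping the schedule in one table, separate from the logic, makes it easier to review against the retention and disposition schedules summarized for each key data asset.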

Use and share data openly in an ethical and secure manner

Organizations should adhere to ethical guidelines on data sharing in order to meet emerging standards and legislative requirements. It is an organization’s responsibility to be transparent about, and respectful of, how data is used within the organization. Using and sharing data in an ethical manner can build trust between the public and the organization. Failing to prioritize privacy, security, consent, and ownership of data can harm an organization’s reputation and credibility and, in the worst case, threaten its survival. To share data ethically and legally, an organization must request participants’ consent. How personal data will be used and shared must be communicated transparently to avoid misleading anyone. Furthermore, to keep data private and generic enough for future sharing, it can be anonymized by removing participants’ tombstone information such as name, address, and occupation. If data anonymization is being considered, it is best planned during the collection phase. Third-party readers must be informed when data has been anonymized. This may be done by marking the places in the text where content has been removed. Additionally, a copy of the original data should always be kept in a separate, secured repository as a record of everything that was anonymized in the final product. Third-party readers should have valid reasons and the right qualifications before being given access to the original data, to ensure it is treated carefully. Data must not be shared when there is a conflict with the need to protect personal identities, when an organization does not own the data, or when releasing the data presents a security risk.
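
The anonymization steps described above (remove tombstone information, mark what was removed, keep the original separately) can be sketched as follows. The field names and the marking text are assumptions for illustration, and removing direct identifiers alone does not guarantee that a record cannot be re-identified.

```python
# Hypothetical field names; the text above names name, address, and
# occupation as examples of tombstone information.
TOMBSTONE_FIELDS = {"name", "address", "occupation"}
REDACTED = "[REMOVED]"


def anonymize(record: dict) -> dict:
    """Return a copy with tombstone fields replaced by a visible marking."""
    return {
        key: REDACTED if key in TOMBSTONE_FIELDS else value
        for key, value in record.items()
    }


original = {
    "name": "Jane Doe",
    "address": "1 Main St",
    "occupation": "nurse",
    "response": "satisfied",
}
# The original is left untouched so it can be stored separately and securely.
shared = anonymize(original)
```

Because `anonymize` returns a new dictionary, the original record is never modified, which matches the requirement to keep an original copy alongside the shared version.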

  • share data openly by default as per the Directive on Open Government and Digital Standards, while respecting security and privacy requirements; data shared should adhere to existing enterprise and international standards, including on data quality and ethics
  How to achieve:
   * Summarize how the architecture supports sharing data openly by default as per Directive on Open Government and Digital Standards given:
       * Existing departmental and GC data standards and policies
       * International data standards and the Privacy Act
       * Fitness for purpose
       * Ethics 
  Tools:
   * Data Foundation – Implement (Leverage the standard definition)
       * Data Catalogue
       * Benefits Knowledge Hub
       * Data Lake (growth)
       * Data Science and Machine Platform
   * Theoretical Foundation
       * EDRM (Conceptual and Logical)
       * Business Glossary
       * Departmental Data Strategy
  • ensure data formatting aligns to existing enterprise and international standards on interoperability; where none exist, develop data standards in the open with key subject matter experts
  How to achieve:
   * Summarize how the architecture utilises existing enterprise and international data standards
   * Summarize how the architecture has developed any data standards through open collaboration with key subject matter experts and the Enterprise Data Community of Practice.
  Tools:
   * Data Standards
       * NIEM
       * OpenData 
       * National Address Register
       * Reference Data Repository
  • ensure that combined data does not risk identification or re‑identification of sensitive or personal information
  How to achieve:
   * Summarize how the architecture ensures the aggregation and combining of data does not pose a risk to information sensitivity or personal information
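
One common way to reason about re-identification risk in combined data is a k-anonymity check: every combination of quasi-identifying values should be shared by at least k records. This technique is not named in the guide; it is offered here as one possible sketch, and the column names and sample rows are hypothetical.

```python
from collections import Counter


def group_sizes(rows: list[dict], quasi_identifiers: list[str]) -> Counter:
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def is_k_anonymous(rows: list[dict], quasi_identifiers: list[str], k: int) -> bool:
    """Return True when every combination appears in at least k rows."""
    return all(size >= k for size in group_sizes(rows, quasi_identifiers).values())


# Hypothetical combined data set, for illustration only.
combined = [
    {"postal_prefix": "K1A", "age_band": "30-39", "benefit": "A"},
    {"postal_prefix": "K1A", "age_band": "30-39", "benefit": "B"},
    {"postal_prefix": "K2P", "age_band": "60-69", "benefit": "A"},
]
safe_to_share = is_k_anonymous(combined, ["postal_prefix", "age_band"], k=2)
```

Here the third row is the only one in its group, so the combined set fails the check for k = 2 and would need further generalization or suppression before sharing.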


Design with privacy in mind for the collection, use and management of personal information

Protecting data is important for keeping sensitive information, such as industry knowledge and personal information, private and stored only on a company's own servers. It is also a legal requirement: federal institutions must protect personal information as outlined in the Privacy Act. Furthermore, when companies work with clients who are investing their money and time, those clients expect, at a minimum, that their data will be held safely. Data privacy also protects the reputation and credibility of a company, as unprotected data can fall victim to hackers and other outside parties. For example, within a hospital, patients' personal information should be kept secure and away from the public. This is expected of hospitals, and failing to comply can result in legal and ethical issues. Additionally, it would be dangerous for strangers to gain access to patient information such as addresses and health conditions.

  • ensure alignment with guidance from the appropriate institutional ATIP Office with respect to interpretation and application of the Privacy Act and related policy instruments
  How to achieve:
   * Describe how the architecture aligns to the guidance of the ATIP Office on the personal information regulatory framework, policy framework, and consent directives
  • assess initiatives to determine if personal information will be collected, used, disclosed, retained, shared, and disposed
  How to achieve:
   * Has the initiative assessed whether personal information will be collected, used, disclosed, retained, shared, and disposed?
  • only collect personal information if it directly relates to the operation of the programs or activities
  How to achieve:
   * Summarize how the architecture ensures the personal information collected relates directly to the operation of the programs or activities
  • notify individuals of the purpose for collection at the point of collection by including a privacy notice
  How to achieve:
   * Does the solution’s privacy notice provide the purpose for collecting this personal information?
   * Does the solution provide a privacy notice at the point of personal information collection?
  • personal information should be, wherever possible, collected directly from individuals but can be from other sources where permitted by the Privacy Act
  How to achieve:
   * Does the architecture collect personal information directly from the individual?
   * If not, what personal information is collected from other sources, and does it comply with the Privacy Act and the consent directive of the source?
  Tools:
   * Target State Architecture
   * Interim State Architecture
  • personal information must be available to facilitate Canadians’ right of access to and correction of government records
  How to achieve:
   * Summarize how the architecture facilitates Canadians' right to access their personal information records
   * Summarize how the architecture facilitates Canadians' right to correct their personal information records
  Tools:
   * Target State Architecture
   * Interim State Architecture
  • design access controls into all processes and across all architectural layers from the earliest stages of design to limit the use and disclosure of personal information
  How to achieve:
   * Summarize how the architecture limits the use and disclosure of personal information in accordance with privacy legislation, policy frameworks, and consent directives
  • design processes so personal information remains accurate, up‑to‑date and as complete as possible, and can be corrected if required
  How to achieve:
   * Summarize how the architecture ensures personal information remains accurate
   * Summarize how the architecture ensures personal information remains up-to-date
   * Summarize how the architecture ensures personal information remains as complete as possible
   * Summarize how the architecture ensures personal information can be corrected if required
  Tools:
   * Non Functional Requirements
   * Functional Requirements
  • de‑identification techniques should be considered prior to sharing personal information
  How to achieve:
   * Outline the de-identification techniques used by the architecture in sharing personal information
  • in collaboration with the appropriate institutional ATIP Office, determine if a Privacy Impact Assessment (PIA) is required to identify and mitigate privacy risks for new or substantially modified programs that impact the privacy of individuals
  How to achieve:
   * Describe how the architecture addresses the recommendations of the PIA
   * If not all recommendations of the PIA are being addressed, outline how the business will address any residual risks identified by the PIA


  • establish procedures to identify and address privacy breaches so they can be reported quickly to the appropriate institutional ATIP Office and responded to efficiently
  How to achieve:
   * Are procedures established to identify and address privacy breaches?
   * Summarize how the architecture enables/supports these procedures
  Tools:
   * Business Process Model