Difference between revisions of "Annex A: Data Loss Prevention"

(Created page with "<div class="center"><div style="float: right; z-index: 10; position: absolute; right: 0; top: 1;">File:JoinusonGCconnex.png|link=http://gcconnex.gc.ca/groups/profile/2785549...")
 
 
  
 
</div></div>
 
</div></div>
{{TOCright}}
{{Delete|reason=Expired Content}}
 
 
 
 
== Overview==
 
This annex to the GC Enterprise Security ConOps document explores, from the users' and operators' perspective, the operational aspects of a GC-wide data loss prevention (DLP) capability to detect and prevent the unauthorized exfiltration of protected GC information. These aspects include information creation, security event generation, policy detection and enforcement, and loss remediation. With responsibility for maintaining large amounts of sensitive data, both classified and unclassified, the GC needs to minimize the risk of unauthorized disclosure of this data. In particular, the GC must ensure that sensitive data cannot be sent outside of the GC without authorization. Unauthorized disclosure of sensitive information could not only result in risks to national security, but also put at risk the well-being of Canadian citizens and other individuals and organizations that do business with the GC.
 
 
 
 
 
For more information, please read the [[Media:GC Enterprise Security ConOps - ANNEX A DLP.pdf|GC ESA ConOps Annex A: Data Loss Prevention]] document.
 
 
 
<br>
 
 
 
== Why Data Loss Prevention (DLP)? ==
 
Computer Network Defence (CND) technologies that include firewalls, intrusion detection and prevention systems, application proxies, etc., are well understood and widely implemented. Whereas CND technologies are focused on preventing intruders from getting into the IT/IS, DLP technologies are focused on preventing sensitive data from getting out of the IT/IS. There is an overlap in that some CND tools can perform some basic DLP functions as by-products of their primary functions (and may even have built-in DLP functions), but a dedicated DLP system is required by any organization concerned about leakage of sensitive information.
 
 
 
=== ''DLP Defined'' ===
 
Gartner's definition of content-aware DLP tools is as follows:<blockquote>"Content-aware data loss prevention (DLP) tools enable the dynamic application of policy based on the content and context at the time of an operation. These tools are used to address the risk of inadvertent or accidental leaks, or exposure of sensitive enterprise information outside authorized channels, using monitoring, filtering, blocking and remediation features."</blockquote>
 
[[File:DLP Tools Overview.PNG|thumb|441x441px|DLP Tools Overview]]
 
An important part of this definition, and a distinguishing characteristic of any enterprise DLP solution, is that of content-awareness. Content-awareness refers to the ability of a DLP tool to examine the actual content of a message and not just information about the message (such as the recipient name for an email). The latter is referred to by DLP vendors as context. In reality, it is not always appropriate to block sensitive data. Instead, it is necessary to consider the destination of the sensitive data, the intended business purpose, and possibly other factors, such as time-of-day. Not doing so is likely to have negative effects on departments performing legitimate government business. This requires DLP solutions to analyze both content and context.
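
The combination of content and context analysis described above can be illustrated with a minimal sketch. It is not taken from the ConOps document or from any DLP product: the marking pattern, the trusted domain suffix, the business hours, and all names (<code>Message</code>, <code>decide</code>, etc.) are assumptions chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical content rule: a protective marking written in the body text.
MARKING = re.compile(r"\bPROTECTED\s+[ABC]\b", re.IGNORECASE)

# Hypothetical context rule: destinations treated as internal to the GC.
TRUSTED_SUFFIXES = (".gc.ca",)


@dataclass
class Message:
    body: str              # content: the actual text being sent
    recipient_domain: str  # context: where the message is going
    hour: int              # context: local hour of day, 0-23


def is_sensitive(msg: Message) -> bool:
    """Content check: does the body carry a protective marking?"""
    return MARKING.search(msg.body) is not None


def is_trusted_destination(msg: Message) -> bool:
    """Context check: is the recipient inside a trusted domain?"""
    domain = msg.recipient_domain.lower()
    return any(domain == s.lstrip(".") or domain.endswith(s)
               for s in TRUSTED_SUFFIXES)


def decide(msg: Message) -> str:
    """Combine content and context into one information flow decision."""
    if not is_sensitive(msg):
        return "allow"
    if not is_trusted_destination(msg):
        return "block"
    # Sensitive data sent to a trusted destination outside the assumed
    # business hours (07:00-19:00) is allowed but flagged for review.
    if msg.hour < 7 or msg.hour >= 19:
        return "allow-and-report"
    return "allow"
```

In this sketch the same sensitive content yields different outcomes depending on destination and time-of-day, which is the behaviour a content-only or context-only tool cannot provide.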
 
 
 
DLP tools can be deployed in different locations to discover and analyze data in different states (see image on the right):
 
* '''Network DLP''' sensors are deployed in network perimeters to analyze data flowing throughout an enterprise network. The type of data processed by a network DLP sensor is known synonymously as '''''Data-in-Motion (DIM)''''' or '''''Data-in-Transit (DIT)'''''.
 
* '''Storage DLP''' sensors are deployed on dedicated enterprise data storage devices, such as Network Attached Storage (NAS) devices, Storage Area Networks (SAN), and Database Management Systems (DBMS). Unlike endpoints that are capable of running a variety of application software, storage devices typically only include software or firmware for performing backup, recovery, and other specialized data management functions. The type of data processed by a storage DLP sensor is known as '''''Data-at-Rest (DAR)'''''.
 
* '''Endpoint DLP''' sensors are deployed as agents on general-purpose computers. Endpoint DLP sensors monitor data leaving the endpoint over wired and wireless interfaces, such as USB, WiFi, Bluetooth, and Near Field Communications (NFC). General purpose computers include end-user devices (e.g. desktops, laptops, tablets) and application servers. Data actively being processed, including moving to and from external interfaces (including portable storage devices), is known as '''''Data-in-Use (DIU)'''''. Data resident in non-removable storage on an endpoint (e.g. hard drive) is a form of '''''Data-at-Rest (DAR)'''''.
 
* Application servers and end-user devices should support both types of endpoint DLP sensor (data-in-use and data-at-rest). Data storage services, including Network-Attached Storage (NAS) and Database Management Systems (DBMS), likely only support data-at-rest DLP sensors.
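
The relationship between sensor types and data states in the list above can be summarized as a small lookup. This is an illustrative sketch only; the names <code>DataState</code>, <code>SENSOR_COVERAGE</code>, and <code>sensors_for</code> are hypothetical.

```python
from enum import Enum


class DataState(Enum):
    IN_MOTION = "Data-in-Motion (DIM)"
    AT_REST = "Data-at-Rest (DAR)"
    IN_USE = "Data-in-Use (DIU)"


# Which data states each sensor type examines, as described in the list above.
SENSOR_COVERAGE = {
    "network": {DataState.IN_MOTION},
    "storage": {DataState.AT_REST},
    "endpoint": {DataState.IN_USE, DataState.AT_REST},
}


def sensors_for(state: DataState) -> list[str]:
    """Return the sensor types that can examine data in the given state."""
    return sorted(name for name, states in SENSOR_COVERAGE.items()
                  if state in states)
```

For example, <code>sensors_for(DataState.AT_REST)</code> returns both the endpoint and storage sensor types, reflecting that data-at-rest exists on endpoints as well as on dedicated storage devices.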
 
As shown in the image below, an automated DLP solution should not be considered a "magic bullet" that prevents all loss of sensitive data. Instead, DLP must be used in combination with other technologies, and appropriate policies and operational procedures must be put in place.
 
 
 
<br>
 
 
 
[[File:Different States of Data and Associated Concerns.PNG|centre|thumb|588x588px|Different States of Data and Associated Concerns]]
 
 
 
=== ''DLP Standardization Efforts'' ===
 
Currently, broad-scope DLP standardization efforts are non-existent. DLP technology has grown organically based on proprietary vendor technologies and interfaces. The industry is consolidating, with major security solution providers acquiring the technology of smaller companies and integrating DLP into broader security suites.
 
 
 
A major challenge for enterprise DLP customers like the GC is the lack of standardization. This leads to issues with vendor lock-in and variability in the effectiveness of different parts of a DLP system.
 
 
 
 
 
For more information about data loss prevention, please read the [[Media:GC Enterprise Security ConOps - ANNEX A DLP.pdf|GC ESA ConOps Annex A: Data Loss Prevention]] document.
 
 
 
<br>
 
 
 
== Current Situation ==
 
 
 
=== ''Background, Objectives, Scope'' ===
 
The GC operates a network-centric IT/IS infrastructure. Endpoint and perimeter protections are primarily focused on preventing attacks from external sources (e.g. the Internet, untrusted partners). These protections include traditional firewalls, IDS/IPS, proxy gateways, and host-based protection suites. While some may have some capabilities for detecting and blocking outgoing traffic based on deep content analysis, this is not their primary purpose, so the capabilities may be rudimentary and may not even be configured.
 
 
 
=== ''Description of the Current Situation for Data Loss Prevention'' ===
 
Data loss prevention within the GC is currently heavily human-dependent. Information owners are responsible for tagging documents with their classification or sensitivity level, and recipients of those documents are responsible for prudent handling of those documents. Document labels are generally only included within the content of the document (e.g. text annotations in headers and footers of the document) and not in metadata labels, making automated identification of tagged documents unreliable. Similarly, USB thumb drives and CDs/DVDs may have the sensitivity level of the information they contain inscribed using permanent marker/felt-tip pens. Prudent handling of sensitive documents may be audited using physical security measures, such as random bag searches at building exits and random clean desk checks.
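
The difference between a label carried in metadata and a label written only in a header or footer can be sketched as follows. This is a hedged illustration, not a description of any GC system: the metadata key <code>classification</code>, the marking pattern, and the function names are assumptions.

```python
import re
from typing import Optional

# Illustrative marking pattern; real markings and their formats may differ.
MARKING = re.compile(r"\b(PROTECTED\s+[ABC])\b", re.IGNORECASE)


def label_from_metadata(metadata: dict) -> Optional[str]:
    """Read an explicit label field (hypothetical key 'classification')."""
    return metadata.get("classification")


def label_from_text(text: str) -> Optional[str]:
    """Guess a label from the first and last non-empty lines,
    where header and footer annotations usually appear."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return None
    for candidate in (lines[0], lines[-1]):
        match = MARKING.search(candidate)
        if match:
            # Normalize case and internal whitespace of the marking.
            return " ".join(match.group(1).upper().split())
    return None
```

The metadata lookup is a direct read, whereas the text heuristic depends on where and how the annotation was typed. A marking that appears only in the body of the text, for instance, is missed by <code>label_from_text</code>, which illustrates why content-embedded labels are unreliable for automated identification.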
 
 
 
Automated CND capabilities may be able to block information transfers using conventional access control techniques based on metadata associated with the information, but are mostly unable to make decisions based on deep analysis of information content. Any gateways that do perform deep content analysis are primarily looking for malware and are intended to prevent infiltration of malicious files rather than exfiltration of sensitive data. Other content may only be discovered and detected after the fact.
 
 
 
When DLP tools are introduced, they will make the overall DLP process more effective, but will never replace the human element. User awareness and training will continue to be an important aspect of DLP. The introduction of DLP tools will also not eliminate the need for CND tools as it will remain critical to keep intruders out of GC networks and systems. From a DLP perspective, intruders may be able to view unauthorized information by masquerading as an authorized external user, and they may be able to disable or bypass DLP tools.
 
 
 
 
 
For more information about the current situation for data loss prevention, please read the [[Media:GC Enterprise Security ConOps - ANNEX A DLP.pdf|GC ESA ConOps Annex A: Data Loss Prevention]] document.
 
 
 
<br>
 
 
 
== Operational States ==
 
This section presents phases of the current DLP lifecycle and then overlays the lifecycle phases on the operational states of the current GC enterprise IT/IS infrastructure.
 
[[File:Data Loss Detection Current Situation.PNG|left|thumb|511x511px|Data Loss Detection Current Situation]]
 
 
 
=== ''DLP Lifecycle Phases'' ===
 
The image to the left illustrates the current DLP lifecycle phases. As shown, four phases of DLP have been defined.
 
 
 
'''Define'''
 
 
 
Information owners develop policies that identify protected classes of data and with whom those classes of data may be shared. These policies may also prescribe procedures for dealing with loss of different classes of data and sanctions for those responsible.
 
 
 
'''Discover'''
 
 
 
This phase consists of discovering channels that may be used to exfiltrate data. Understanding and monitoring these channels is a prerequisite for preventing data loss. Physical channels are generally well understood, but technical channels are less so. The latter is not helped by the large number of zones and perimeters (including Internet access points), often under different administrative control. Even if the technical channels have been identified, data loss cannot be prevented without DLP technical capabilities in place.
 
 
 
'''Detect'''
 
 
 
Physical security measures (e.g. security guards performing bag searches at building exits) are able to detect and prevent some data loss, but without additional technical capabilities, most data loss is only detected after it has occurred (i.e. the loss could not be prevented). Data that has been lost may only be reliably detected if that data is subsequently published on the web or shared with the press (e.g. by a whistleblower). There may be other indications that imply, but do not confirm, that data loss has occurred.
 
 
 
'''Respond'''
 
 
 
If data loss is detected, the goal is to determine how the data was lost, who was responsible for the loss, whether the loss was deliberate or accidental, and how to mitigate the impact of the loss. If attempted data loss was detected, the response may include initiation of disciplinary proceedings. In all cases, policies and procedures should be reviewed to reduce future risk.
 
 
 
In the existing manual system, most effort is spent responding to data loss that has already occurred as depicted by the larger size of the Respond box. This includes damage control and attempting to identify the source of the loss. As DLP becomes more automated, the effort is expected to shift from Respond to Define.
 
 
 
=== ''System Operational States'' ===
 
[[File:Current GC Enterprise IT-IS Infrastructure Operational States Relating to DLP.PNG|thumb|557x557px|Current GC Enterprise IT/IS Infrastructure Operational States Relating to DLP]]
 
The image on the right shows how data loss activities overlay the operational states of the GC enterprise at present (i.e. before the deployment of an automated DLP capability).
 
 
 
'''Pre-Deployment'''
 
 
 
No DLP activities occur in the Pre-Deployment state.
 
 
 
'''Deployed'''
 
 
 
No DLP activities occur in the Deployed state.
 
 
 
'''Operational'''
 
 
 
The activities associated with the Detect and Respond phases of the data loss lifecycle occur in the Operational state:
 
# '''Detect:''' Data loss activities begin when it is determined that an unauthorized transfer of sensitive data has occurred. The determination can occur automatically, for example through a technical security control such as Computer Network Defence (CND), or manually, for example through external notification that GC sensitive data has been discovered outside the GC enterprise.
 
# '''Respond:''' After it has been determined that an unauthorized transfer of sensitive data occurred, the response is manual and performed on an ad-hoc, incident-by-incident basis. Response activities that involve the GC enterprise IT/IS infrastructure may include examining log files to trace the source of the data loss, configuring CND capabilities to reduce the risk of a future loss, cleaning up malware, suspending/disabling accounts, etc.
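
As an illustration of the log examination mentioned under Respond, the sketch below filters transfer records for a file known to have been lost. The record layout (<code>TransferRecord</code> and its fields) is entirely hypothetical; actual log formats depend on the CND and infrastructure components in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRecord:
    timestamp: str    # ISO 8601 in UTC, so lexical order equals time order
    user: str         # account that performed the transfer
    sha256: str       # digest of the transferred file
    destination: str  # e.g. a host name or a removable device


def trace_transfers(records, lost_sha256):
    """Return the records that moved the lost file, earliest first."""
    hits = [r for r in records if r.sha256 == lost_sha256]
    return sorted(hits, key=lambda r: r.timestamp)
```

The earliest matching record is a starting point for identifying the source of the loss; it does not by itself establish whether the loss was deliberate or accidental.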
 
'''Maintenance'''
 
 
 
No DLP activities occur in the Maintenance state.
 
 
 
'''Failed'''
 
 
 
No DLP activities occur in the Failed state.
 
 
 
'''Decommissioned'''
 
 
 
DLP components that store incident data must be purged using GC-approved procedures.
 
 
 
 
 
For more information about DLP operational states and the proposed architecture for the GC DLP solution, please read the [[Media:GC Enterprise Security ConOps - ANNEX A DLP.pdf|GC ESA ConOps Annex A: Data Loss Prevention]] document.
 
 
 
<br>
 
 
 
== User Classes and Other Involved Personnel ==
 
The image below shows the user classes (actors) involved in the definition, discovery, detection, and response to data loss. Each user class is fully described in the [[media:GC Enterprise Security ConOps.pdf|GC ESA ConOps Main Body]] document, together with their organizational relationships. The DLP responsibilities of each user class are provided in the [[Media:GC Enterprise Security ConOps - ANNEX A DLP.pdf|GC ESA ConOps Annex A: Data Loss Prevention]] document.
 
 
 
<br>
 
 
 
[[File:User Classes Relevant to the Current DLP Situation.PNG|centre|thumb|671x671px|User Classes Relevant to the Current DLP Situation]]
 
 
 
<br>
 
 
 
== Justification for and Nature of Changes ==
 
 
 
=== ''Justification for Changes'' ===
 
The GC is responsible for protecting a significant amount of classified and sensitive information. Loss of such information could result in a threat to national security and/or loss of privacy for Canadian citizens. In June 2013, the Privacy Commissioner of Canada indicated that there were more than 3,000 data breaches over a 10-year period that affected about 725,000 Canadians. After an attack in early 2011 on the Treasury Board, Defence Research and Development Canada, and the Department of Finance that resulted in the exfiltration of classified information, it took almost eight months for full Internet access to be restored. The latter incident resulted in a report issued by the Auditor General ("[http://www.oag-bvg.gc.ca/internet/English/osh_20130423_e_38313.html Protecting Canadian Critical Infrastructure against Cyber Threats]").
 
 
 
=== ''Description of Desired Changes'' ===
 
While enhanced user awareness and additional physical security can lessen the amount of data loss, the GC needs automated capabilities that can detect, block, and report on the attempted unauthorized exfiltration of sensitive information from within the GC to the outside, and from one GC organization to another GC organization.
 
 
 
The technical features of an automated DLP capability are itemized and prioritized in the next section. As a prerequisite to installing DLP features, existing CND capabilities should be reviewed and enhanced as necessary. CND is different from DLP and cannot replace DLP. However, some CND capabilities, such as firewalls and application proxies, can implicitly prevent loss of data, and may implement some "channel DLP" features. A robust CND architecture is also necessary to protect DLP tools after they are installed.
 
 
 
=== ''Priorities among Changes'' ===
 
The table below identifies features that are required to complement existing manual processes and CND capabilities to prevent data loss. They are prioritized according to whether they are essential (high priority), desirable (medium priority), or optional (low priority). Features are not further prioritized within each priority. The priorities listed should be adjusted, if necessary, based on an analysis of past breaches within the GC. For more details about the identified features, please read the [[Media:GC Enterprise Security ConOps - ANNEX A DLP.pdf|GC ESA ConOps Annex A: Data Loss Prevention]] document.
 
 
 
<br>
 
 
 
{| class="wikitable"
| style="background: #000000; color: #ffffff;" | '''Feature'''
| style="background: #000000; color: #ffffff;" | '''Priority'''
| style="background: #000000; color: #ffffff;" | '''Rationale for Priority'''
|-
| style="background: #e5e5e5; color: #000000;" | The ability to automatically detect, block (if configured to do so by policy rules), and report on the attempted exfiltration of unauthorized data-in-transit over a network interface (zone perimeter). This includes internal and external perimeters.
| style="background: #727272; color: #ffffff;" | '''Essential'''
|Deploying network sensors does not affect endpoints (or the users operating them) and prevents deliberate exfiltration of information by outsiders (i.e. intruders who hack into the system) and accidental exfiltration by insiders. As network consolidation occurs under Shared Services Canada, network sensors are expected to offer the most benefit for the least cost.
|-
| style="background: #e5e5e5; color: #000000;" | The ability to analyze both data content and context as part of an information flow decision.
| style="background: #727272; color: #ffffff;" | '''Essential'''
|Tools that only perform simple pattern matching on content, or only look at specific metadata attribute types and values, may meet short-term GC requirements to improve on the current situation, but are unlikely to be part of a long-term, comprehensive solution. Simple content or context analysis may be configured initially, and more sophisticated analysis techniques configured as experience is gained. Products should also be configured in permissive (non-blocking) mode initially so policy rules can be fine-tuned without users being inconvenienced by false positives.
|-
| style="background: #e5e5e5; color: #000000;" | The ability to automatically detect, block, and report on the attempted exfiltration of unauthorized unencrypted data-in-use to portable storage devices.
| style="background: #727272; color: #ffffff;" | '''Essential'''
|Portable storage devices, particularly USB thumb drives, present a significant risk for data exfiltration.
|-
| style="background: #e5e5e5; color: #000000;" | A means to define and configure policy rules that are understandable to non-technical information owners.
| style="background: #727272; color: #ffffff;" | '''Essential'''
|The criteria for identifying and recognizing sensitive data should be defined by information owners.
|-
| style="background: #e5e5e5; color: #000000;" | Reporting capabilities that are understandable to non-technical information owners, support (HR, legal, etc.), and oversight personnel.
| style="background: #727272; color: #ffffff;" | '''Essential'''
|Information owners understand the value of their data and the consequences of its loss. Therefore, it must be easy for them to understand what has been lost when a violation occurs.
|-
| style="background: #e5e5e5; color: #000000;" | Capabilities and processes that allow the effectiveness of deployed technical capabilities to be assessed.
| style="background: #727272; color: #ffffff;" | '''Essential'''
|It must be possible to assess the effectiveness of deployed capabilities to determine return on investment, identify improvements in current capabilities, and obtain approval for deployment of future capabilities.
|-
| style="background: #e5e5e5; color: #000000;" | The ability to automatically detect and quarantine (if configured to do so by policy rules) data-at-rest stored in inappropriate locations.
| style="background: #898989; color: #ffffff;" | '''Desirable'''
|Encrypting data-at-rest, particularly on portable devices (laptops, tablets, etc.) that are vulnerable to theft, can significantly reduce the risk of exfiltration. Attempts to exfiltrate data-at-rest in secure physical locations (e.g. in a server room) generally require it to be converted to data-in-use and/or data-in-transit, which means it can be caught by endpoint and network sensors that have already been deployed.
|-
| style="background: #e5e5e5; color: #000000;" | The ability to automatically detect exfiltration of data-in-use via means other than portable storage devices, for example, via Bluetooth and near-field communications (NFC) interfaces.
| style="background: #898989; color: #ffffff;" | '''Desirable'''
|Portable storage devices present the greatest risk, but all interfaces should be monitored.
|-
| style="background: #e5e5e5; color: #000000;" | A hierarchical management capability.
| style="background: #898989; color: #ffffff;" | '''Desirable'''
|A hierarchical capability allows reporting information to be consolidated and analyzed to provide an overall view of the effectiveness of DLP tools within the GC.
|-
| style="background: #e5e5e5; color: #000000;" | A means for users to explicitly tag (apply a label to) a data object to indicate its nature, sensitivity, community of interest, and other types of metadata attributes that can be used to make information flow decisions.
| style="background: #898989; color: #ffffff;" | '''Desirable'''
|The distinguishing characteristic of DLP tools is content-awareness. Metadata labels can be useful for initial identification of unauthorized data objects but may not be sufficiently trusted or complete to substitute for content analysis. Explicit labeling is likely to be a burden for users and result in missing or inappropriate labels.
|}
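
The permissive (non-blocking) mode mentioned in the table can be sketched as follows. This is a minimal illustration under assumed names (<code>Enforcer</code>, <code>handle</code>); it does not represent the configuration model of any particular DLP product.

```python
from dataclasses import dataclass, field


@dataclass
class Enforcer:
    permissive: bool = True  # True: report only, never block
    incidents: list = field(default_factory=list)

    def handle(self, transfer_id: str, violates_policy: bool) -> bool:
        """Return True if the transfer may proceed."""
        if not violates_policy:
            return True
        # Every violation is recorded so policy rules can be tuned
        # from the resulting reports.
        self.incidents.append(transfer_id)
        # In permissive mode the transfer still proceeds; otherwise it is blocked.
        return self.permissive
```

In permissive mode a false positive produces a report entry rather than a blocked transfer, so rules can be refined before <code>permissive</code> is set to <code>False</code>.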
 
 
 
<br>
 
 
 
== Assumptions and Constraints ==
 
There are some assumptions and constraints to implementing a GC DLP solution.
 
 
 
Assumptions:
 
* The number of network zones and associated perimeters will decrease as network consolidation is carried out by SSC.
 
Constraints:
 
* The ability to examine encrypted data-in-transit may be limited unless a key escrow scheme is developed.
 
 
 
 
 
For more information, please read the [[Media:GC Enterprise Security ConOps - ANNEX A DLP.pdf|GC ESA ConOps Annex A: Data Loss Prevention]] document.
 
 
 
<br>
 
 
 
== References ==
 
* [[Media:GC Enterprise Security ConOps - ANNEX A DLP.pdf|GC ESA ConOps Annex A: Data Loss Prevention]]
 
* [[media:GC Enterprise Security ConOps.pdf|GC ESA ConOps Main Body]]
 
 
 
[[Category:Government of Canada Enterprise Security Architecture (ESA) Program]]
 
[[Category:Enterprise Security Architecture]]
 
[[Category:DLP]]
 

Latest revision as of 13:27, 20 April 2021