GC HTTPS Mixed Content
Mixed Content Overview
When an HTTPS website references insecure (HTTP) resources, this is called mixed content. Browsers prevent an HTTPS website from loading most insecure resources, like fonts, scripts, etc. Migrating an existing website from HTTP to HTTPS means identifying and fixing or replacing mixed content. Mixed content comes in two varieties:
- Active mixed content includes resources that can greatly change the behaviour of a website, such as JavaScript, CSS, fonts, and iframes. Browsers refuse to load active mixed content, which often results in affected pages being completely unstyled or broken. Browsers treat these resources very aggressively because of the consequences if they were compromised. For example, a single compromised JavaScript file compromises the entire website, regardless of how other resources are loaded.
- Passive mixed content includes resources that have a smaller impact on the page’s overall behaviour, such as images, audio, and video. Browsers will load passive mixed content, but will typically change the HTTPS indicator.
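The two varieties above can be sketched as a small classifier. This is an illustrative grouping only, assuming a simplified mapping from HTML tag names to the categories listed above; the function and constant names are hypothetical, and real browsers apply more detailed rules.

```python
# Simplified, assumed mapping of tags to the two varieties described above.
PASSIVE_TAGS = {"img", "audio", "video"}          # images, audio, video
ACTIVE_TAGS = {"script", "link", "iframe"}        # JavaScript, CSS/fonts, frames


def classify_mixed_content(page_url: str, tag: str, resource_url: str) -> str:
    """Return 'none', 'active', or 'passive' for one resource reference."""
    if not page_url.lower().startswith("https://"):
        return "none"  # mixed content only exists on HTTPS pages
    if not resource_url.lower().startswith("http://"):
        return "none"  # HTTPS and site-relative references are not mixed content
    if tag.lower() in PASSIVE_TAGS:
        return "passive"
    # Tags in ACTIVE_TAGS, and anything unrecognised, are treated as active
    # so that the sketch errs on the side of caution.
    return "active"


if __name__ == "__main__":
    page = "https://sub.domain.gc.ca/"
    print(classify_mixed_content(page, "script", "http://example.com/app.js"))
    print(classify_mixed_content(page, "img", "http://example.com/logo.png"))
```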
Mixed Content: Migration strategy
Every website’s mixed content situation will be different, but the general approach is:
- Enable https:// for your website, but don’t force a redirect. Continue to present the http:// version as the canonical URL to search engines.
- Identify the most obvious and widespread pieces of mixed content by loading your website in a browser over https:// and observing breakages. Chrome, Opera, and Firefox will log any mixed content warnings to the console, which should point out necessary site-wide changes. Use these to secure your resource links (see "Linking to resources securely" below).
- After fixing them, tackle the long tail by scanning your code (see "Scanning your code" below) and crawling your website (see "Crawling your website" below).
- Finally, force the redirect to HTTPS, turn on HSTS, and tell search engines that your new URL starts with https://.
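As a rough illustration of the final step, the sketch below computes the https:// redirect target for a plain HTTP URL and builds a value for the Strict-Transport-Security header. The function names and the one-year max-age default are assumptions made for the example, not values prescribed by this guide.

```python
from urllib.parse import urlsplit


def https_redirect_target(url: str) -> str:
    """Return the https:// equivalent of an http:// URL; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    return parts._replace(scheme="https").geturl()


def hsts_header_value(max_age_seconds: int = 31536000,
                      include_subdomains: bool = True) -> str:
    """Build a Strict-Transport-Security value (one-year max-age is an assumed default)."""
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return value


if __name__ == "__main__":
    print(https_redirect_target("http://sub.domain.gc.ca/faq/?a=1"))
    print("Strict-Transport-Security:", hsts_header_value())
```

In practice the redirect and the header are usually configured in the web server or load balancer rather than in application code.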
Note: the instructions below use tools suited to an OS X or Linux environment. Documentation for Windows-based tools would be a welcome contribution to this guide.
Mixed Content: Linking to resources securely
Most commonly used third party services, such as Google Analytics or AddThis, will automatically adapt when migrating to HTTPS.
Other services may require manual updates, but have an https:// version ready: <link href="https://fonts.googleapis.com/css?family=Open+Sans" rel="stylesheet">
Generally speaking, for content on your own domain, stick to site-relative URLs wherever possible: <img src="/media/my-picture.png" />
When migrating a site with a lot of user- or staff-submitted content (e.g. a blog), you may find media hot-linked from a third-party domain which doesn’t support HTTPS.
This is a great opportunity to improve your website’s privacy and lessen your dependency on third parties, by copying those media files to your own server instead and hosting them yourself.
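The linking rules in this section could be sketched as follows. The host lists are assumptions for illustration (you would fill them in for your own site), and the function name is hypothetical. A result of None means the reference needs manual attention, for example copying the media file to your own server.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed example values: replace with your own domain(s) and with third
# parties you have confirmed serve the same resource over HTTPS.
OWN_HOSTS = {"sub.domain.gc.ca"}
HTTPS_READY_HOSTS = {"fonts.googleapis.com"}


def secure_reference(url: str) -> Optional[str]:
    """Suggest a secure replacement for a resource URL, or None if manual work is needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url  # already HTTPS or site-relative
    host = (parts.hostname or "").lower()
    if host in OWN_HOSTS:
        # Prefer a site-relative URL for content on your own domain.
        relative = parts.path or "/"
        if parts.query:
            relative += "?" + parts.query
        return relative
    if host in HTTPS_READY_HOSTS:
        return parts._replace(scheme="https").geturl()
    return None  # unknown third party: verify HTTPS support or self-host the file


if __name__ == "__main__":
    print(secure_reference("http://sub.domain.gc.ca/media/my-picture.png"))
    print(secure_reference("http://fonts.googleapis.com/css?family=Open+Sans"))
```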
Mixed Content: Scanning your code
After identifying and fixing the obvious issues, you can scan your website’s files for leads. On a Mac or Linux-based system, grep is very handy:
- Images and scripts:
grep -r "src=\"http:" *
- Stylesheets and fonts:
grep -r "href=\"http:" * | grep "<link"
- CSS imports and references:
grep -r "url(\"http:" *
- Finding links in JavaScript is more challenging, but you can look for all http: references and try to exclude hyperlinks in HTML or Markdown:
grep -r "http:" * | grep -v "href=\"http:"
or
grep -r "http:" * | grep -v "](http:"
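For environments where grep is not available, the first three searches above can be approximated with a short, cross-platform Python script. This is a sketch rather than an exact port: unlike the grep commands, it also matches single-quoted attributes and unquoted url(...) values, and it only reports a stylesheet or font when <link and href appear in the same tag on one line. The function names are hypothetical.

```python
import re
from pathlib import Path

# Patterns loosely mirroring the grep commands above.
PATTERNS = [
    ("image or script", re.compile(r"""src=["']http:""", re.IGNORECASE)),
    ("stylesheet or font", re.compile(r"""<link[^>]*href=["']http:""", re.IGNORECASE)),
    ("CSS import or reference", re.compile(r"""url\(["']?http:""", re.IGNORECASE)),
]


def scan_text(text):
    """Return (line number, label, stripped line) for each insecure reference found."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS:
            if pattern.search(line):
                findings.append((lineno, label, line.strip()))
    return findings


def scan_tree(root="."):
    """Scan every file under root and return (path, line number, label, line) tuples."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            for lineno, label, line in scan_text(text):
                results.append((str(path), lineno, label, line))
    return results


if __name__ == "__main__":
    for path, lineno, label, line in scan_tree("."):
        print(f"{path}:{lineno}: [{label}] {line}")
```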
Mixed Content: Crawling your website
There are free apps and services that you can use to scan your website, but they are limited to a certain number of pages. Because of this limit, you will need to fix the issues they find and then rescan, either to uncover the next set of issues or to confirm that all corrective actions have been taken. These services are a good starting point and give you a head start on manual review.
Mixed-content-scan is a very handy command line tool that can crawl an http:// or https:// website to see if it contains any references to insecure resources. This is especially helpful if your content is primarily managed in a content management system.
Mixed-content-scan requires PHP and Composer. Once Composer is installed, you can run mixed-content-scan on your domain with the following command: mixed-content-scan https://sub.domain.gc.ca
You should see something like this:
[2018-02-15 16:56:48] MCS.NOTICE: Scanning https://sub.domain.gc.ca/ [] []
[2018-02-15 16:56:49] MCS.INFO: 00000 - https://sub.domain.gc.ca/ [] []
[2018-02-15 16:56:49] MCS.INFO: 00001 - https://sub.domain.gc.ca/faq/ [] []
[2018-02-15 16:56:49] MCS.INFO: 00002 - https://sub.domain.gc.ca/hsts/ [] []
[2018-02-15 16:56:49] MCS.INFO: 00003 - https://sub.domain.gc.ca/resources/ [] []
[2018-02-15 16:56:49] MCS.NOTICE: Scanned 4 pages for Mixed Content [] []
Any discovered mixed content will be listed as a WARNING.
You can also get the results as newline-separated JSON objects: mixed-content-scan https://sub.domain.gc.ca --format=json
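If the JSON output has been saved to a file, the warnings can be pulled out with a few lines of Python. This sketch assumes each line is a JSON object with "level_name" and "message" fields, as produced by common PHP logging formatters. Those field names are an assumption and should be checked against the actual output of your installed version. The function name and the scan.json file name are hypothetical.

```python
import json


def warnings_from_json_lines(lines):
    """Return the message of every record whose level is WARNING."""
    found = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON object on its own line
        # "level_name" and "message" are assumed field names (see lead-in).
        if isinstance(record, dict) and record.get("level_name") == "WARNING":
            found.append(record.get("message", ""))
    return found


if __name__ == "__main__":
    # "scan.json" is a hypothetical file holding the saved scanner output.
    with open("scan.json", encoding="utf-8") as handle:
        for message in warnings_from_json_lines(handle):
            print(message)
```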
Mixed Content: Automatically detecting mixed content
Site owners also have the option of using a Content Security Policy header that will instruct browsers to ping a given URL with information about any observed mixed content warnings.
Mixed Content: Using a Content Security Policy (CSP)
Content Security Policy (CSP) is an added layer of security that helps to detect and mitigate certain types of attacks, including Cross Site Scripting (XSS) and data injection attacks. These attacks are used for everything from data theft to site defacement or distribution of malware.
Two CSP directives relate directly to mixed content:
- block-all-mixed-content: Prevents the loading of any insecure (HTTP) content.
- upgrade-insecure-requests: Forces insecure requests to be upgraded to a secure (HTTPS) connection.
A CSP that only reports mixed content warnings, without blocking anything, might look like: Content-Security-Policy-Report-Only: default-src https:; report-uri https://example.com/reporting/endpoint
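A report-only header like the one above could be assembled in application code as follows. This is a minimal sketch: the function name is hypothetical, and the default directive simply mirrors the example policy.

```python
def csp_report_only_header(report_uri, directives=None):
    """Return (header name, header value) for a report-only CSP."""
    if directives is None:
        # Mirrors the example policy: only HTTPS sources are allowed.
        directives = {"default-src": ["https:"]}
    parts = [f"{name} {' '.join(values)}" for name, values in directives.items()]
    parts.append(f"report-uri {report_uri}")
    return "Content-Security-Policy-Report-Only", "; ".join(parts)


if __name__ == "__main__":
    name, value = csp_report_only_header("https://example.com/reporting/endpoint")
    print(f"{name}: {value}")
```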
CSP reporting, especially for larger services, is an advanced approach that will likely require planning and tuning by a dedicated development team.
Some resources:
- An engineer describes Twitter’s approach as of 2014.
- Report-uri.io, which offers a hosted reporting URI for CSP violations (or public key pinning violations) and accompanying dashboard.
- A newly developed CSP extension, Upgrade Insecure Requests, will instruct browsers to automatically upgrade referenced HTTP URLs to HTTPS URLs without triggering mixed content detection. This extension is not finalized, and as of June 2015 is only available in Chrome.
CSP Reporting
Browsers will post violation reports to a URL specified in the CSP header. The policy should be set up and then tuned gradually over time. CSP reporting is essential to proper use of CSP, and can be deployed in report-only mode initially until the CSP rules have matured.
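On the receiving side, a reporting endpoint has to parse what browsers send. The sketch below assumes the JSON body used with the report-uri directive, where the details sit under a "csp-report" key with fields such as "document-uri", "blocked-uri", and "violated-directive". The function name is hypothetical, and a real endpoint would also need request handling, validation, and storage.

```python
import json


def mixed_content_from_report(body):
    """Extract the page and insecure resource from a CSP report body, if any."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None  # not valid JSON
    report = payload.get("csp-report") if isinstance(payload, dict) else None
    if not isinstance(report, dict):
        return None
    blocked = str(report.get("blocked-uri", ""))
    # Only plain-HTTP resources are of interest for mixed content.
    if not blocked.lower().startswith("http:"):
        return None
    return {
        "page": report.get("document-uri", ""),
        "resource": blocked,
        "directive": report.get("violated-directive", ""),
    }


if __name__ == "__main__":
    example = json.dumps({"csp-report": {
        "document-uri": "https://sub.domain.gc.ca/faq/",
        "blocked-uri": "http://example.com/image.png",
        "violated-directive": "default-src https:",
    }})
    print(mixed_content_from_report(example))
```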
CSP guidance / resources
Mixed Content: Why do browsers block mixed content?
If mixed content were not blocked, an attacker could control the main website by conducting a man-in-the-middle (MITM) attack against any of its active resources.
Even with passive content like images, attackers can manipulate what the page looks like, and so the yellow-lock icon is intended to communicate that security has been weakened and user confidence should be reduced. In addition, an attacker will be able to read any cookies for that domain that do not have the Secure flag, and set cookies.
When a website is accessible over http://, loading other insecure resources does not generate any sort of warning, and so websites operating over plain HTTP often accumulate many of these sub-resources.
Mixed Content: Security considerations for third party content
Incorporating or loading content from third party domains creates an additional attack vector. Third party resources should be vetted and managed through appropriate application of web security controls such as Sub-Resource Integrity (SRI) and a Content Security Policy (CSP) to mitigate the risk of compromise due to attacks on linked resources.
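As one concrete example of these controls, a Sub-Resource Integrity value is the base64-encoded digest of the resource, prefixed with the hash algorithm name. The sketch below computes one for a file's bytes; the function names and the example URL are hypothetical.

```python
import base64
import hashlib

SUPPORTED_ALGORITHMS = ("sha256", "sha384", "sha512")


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value such as 'sha384-<base64 digest>'."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(src: str, content: bytes) -> str:
    """Build a <script> tag that pins the expected content of a third-party file."""
    return (f'<script src="{src}" integrity="{sri_hash(content)}" '
            f'crossorigin="anonymous"></script>')


if __name__ == "__main__":
    # Hypothetical third-party script; in practice, read the exact bytes served.
    body = b"console.log('hello');"
    print(script_tag("https://bar.com/library.js", body))
```

The browser refuses to run the resource if the bytes it downloads do not match the pinned digest, so the value must be regenerated whenever the third-party file changes.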
Even if a page has all page elements loaded over HTTPS, variations in HTTPS configurations could result in security vulnerabilities. For example, if ‘foo.gc.ca’ loads a page element over HTTPS from ‘bar.com’ but ‘bar.com’ is not as exacting with its HTTPS/TLS configuration, the page element from ‘bar.com’ may allow injection of malicious software into the page.
If ‘bar.com’ uses a TLS configuration that is known to be weak, a malicious network adversary may be able to modify or replace the page element to inject software that could read the page contents or, potentially, exploit browser vulnerabilities and gain broader access to the client device. Accordingly, just as it’s important to regularly evaluate the HTTPS/TLS configuration of government websites, it will be important to also evaluate the configurations of the domains that serve third-party page elements.
Note that this is still a strict improvement over incorporating content from third party domains over unencrypted HTTP. Attacks on the privacy, integrity, and security of connections to third party domains over unencrypted HTTP are trivial.