Why you should be verifying third-party scripts

It has become common for developers to load third-party scripts from external servers. However, doing so introduces risk. What happens if the script changes from what is expected?

Last week, a supply chain attack vulnerability was flagged for Polyfill.io. Polyfill is/was a widely used JavaScript library that enhances functionality in older Web browsers. It has been used on many legitimate sites. However, the domain was sold; its new owners then started delivering malicious JavaScript code to websites implementing embedded scripts from its CDN.

When compromises like this are identified, the industry generally acts quickly to minimise the vulnerability. For example, Google Ads will reject sites using the compromised scripts, and cloud-based content delivery networks like Cloudflare will set up safe mirrors. Both of these happened for the Polyfill.io vulnerability.

However, not all compromises get detected, or at least not straight away. It's really important that when we use third-party scripts, we verify the source to protect our users and our brand.

Verifying third-party scripts in the browser

When loading third-party resources through HTML we can use Subresource Integrity (SRI) checks to ensure the scripts haven't been tampered with. This has been a feature of Web browsers for almost a decade now.

How it works

When including a <script> or <link> tag that points to an external resource, we include an integrity attribute. The value of this attribute is a base64-encoded cryptographic hash of the resource (file) we are linking to, prefixed with the name of the hash algorithm. The hash is essentially a unique fingerprint associated with the resource.

When someone visits a page, the browser will check for any <script> or <link> elements with an integrity attribute. For these, it will run a check of the hash before executing a script or applying any styling. The integrity hash must match the hash of the fetched resource; if they differ, then the resource has changed from what was expected and the browser will refuse to execute the script or apply the stylesheet.

If someone were to maliciously manipulate a resource, its contents would change and its hash would differ from the expected value. Because the browser would refuse to use it, our users would be protected from any harmful behaviour the resource might otherwise have exposed them to.

When using the integrity attribute for an external resource, browsers will also check the resource using Cross-Origin Resource Sharing (CORS). This is to check that the server delivering the resource allows it to be shared with our site. Therefore, we also need to include crossorigin="anonymous" alongside our integrity attributes.
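To make the comparison concrete, here is a minimal sketch in Node.js of the kind of check described above. It is not the browser's actual implementation; the matchesIntegrity helper and the sample script contents are made up for illustration, and it only handles a single hash value.

```javascript
// Sketch only: mimics the comparison described above, not real browser code.
const { createHash } = require("node:crypto");

// Hypothetical helper: true when `content` hashes to the digest named in `integrity`.
function matchesIntegrity(content, integrity) {
  // An integrity value looks like "sha384-<base64 digest>"; split at the first hyphen.
  const separator = integrity.indexOf("-");
  if (separator === -1) return false;
  const algorithm = integrity.slice(0, separator);
  const expected = integrity.slice(separator + 1);

  // SRI uses sha256, sha384 or sha512; reject anything else.
  if (!["sha256", "sha384", "sha512"].includes(algorithm)) return false;

  // Hash the content we actually received and compare it with the expected digest.
  const actual = createHash(algorithm).update(content).digest("base64");
  return actual === expected;
}

// Made-up script contents standing in for a third-party file.
const original = "console.log('hello');";
const integrity = "sha384-" + createHash("sha384").update(original).digest("base64");

console.log(matchesIntegrity(original, integrity)); // true
console.log(matchesIntegrity(original + "steal();", integrity)); // false
```

Even a tiny change to the content produces a completely different digest, which is why the tampered version fails the check.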

Generating SRI hashes

The simplest way to generate an SRI hash is to use an online tool like the SRI Hash Generator.
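If we would rather not rely on an online tool, a hash can also be generated locally. The following is a minimal sketch using Node.js's built-in crypto module; the sriHash function name and the sample content are assumptions for illustration, and in practice we would pass in the bytes of the real file (for example, from fs.readFileSync).

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: builds an integrity value such as "sha384-<base64 digest>".
function sriHash(content, algorithm = "sha384") {
  const digest = createHash(algorithm).update(content).digest("base64");
  return `${algorithm}-${digest}`;
}

// Made-up content; a real run would hash the downloaded file's bytes instead.
const value = sriHash("console.log('hello');");
console.log(value);
```

The printed value is what would go into the integrity attribute. It should always be generated from a copy of the file that we have checked and trust.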

An example

The following is an example of requiring AlpineJS (a popular JavaScript library) using a <script> element:

<script 
	src="https://cdn.jsdelivr.net/npm/alpinejs@3.14.0/dist/cdn.min.js" 
	integrity="sha384-O8NPfezTLQ/sgLfQYBJEnezJLlum9L6KOqHsfIWauzaFfD1TQSuvA4iUpgWGHeuZ" 
	crossorigin="anonymous"
></script>

The SRI hash is set as the value of the integrity attribute. It is made up of two parts:

O8NPfezTLQ/sgLfQYBJEnezJLlum9L6KOqHsfIWauzaFfD1TQSuvA4iUpgWGHeuZ - the base64-encoded 'hash' part
sha384 - the prefix identifying the type of hash algorithm used

If we were to modify the hash, or if the contents of https://cdn.jsdelivr.net/npm/alpinejs@3.14.0/dist/cdn.min.js changed, then the browser would not run the AlpineJS library. In the browser's developer tools we would see an error reporting that the resource failed its integrity check and was blocked.

Mutable resources

There is one limitation of using subresource integrity. If our resources are likely to change, then we won't be able to set the integrity attribute, otherwise our site will stop functioning properly once the resource is updated. This is particularly problematic if we use non-versioned resources, as is the case with many Google services such as Google Analytics or Google Tag Manager.

Check if the provider offers versioned resources and use those along with subresource integrity where possible. In our example above, the CDN does provide mutable versions of AlpineJS, but it is better to use a tried and tested version that we can lock down in our code.
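As a rough illustration, the sketch below checks whether a jsDelivr npm URL names an exact version. The pinsExactVersion helper is hypothetical, and the pattern is a simplification that only recognises plain major.minor.patch versions (it ignores pre-release tags).

```javascript
// Hypothetical helper: true when a jsDelivr npm URL names an exact x.y.z version.
function pinsExactVersion(url) {
  const { pathname } = new URL(url);
  // Matches "/npm/<package>@<major>.<minor>.<patch>/", allowing scoped packages.
  return /^\/npm\/(@[^/]+\/)?[^/@]+@\d+\.\d+\.\d+\//.test(pathname);
}

// The exact version from the example above is safe to pair with an integrity hash.
console.log(pinsExactVersion("https://cdn.jsdelivr.net/npm/alpinejs@3.14.0/dist/cdn.min.js")); // true

// A version range can resolve to a different file later, so a fixed hash would break.
console.log(pinsExactVersion("https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js")); // false
```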

If the provider doesn't provide immutable resources, there's not much we can do about it. We have to determine if the risk of using the resource is justifiable.

It's important we use trustworthy sources regardless of whether or not we can use subresource integrity. If we can't, then it's even more important. Big providers like Google are likely to have many safeguards in place to prevent malicious code making its way into their services.

npm packages

SRI provides a great way of verifying resources requested directly from our HTML. However, it does not help us if we are using npm packages that directly load a third-party source through JavaScript. For example, several npm packages embed code from the cdn.polyfill.io service. We need to be vigilant about the packages we use and check them for vulnerabilities as compromises are announced.
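One simple way to check is to search package source files for references to a compromised domain. The sketch below assumes the file contents have already been read into an object; the findDomainReferences helper, the package names and the file contents are all hypothetical.

```javascript
// Hypothetical helper: returns the names of files whose source mentions `domain`.
function findDomainReferences(files, domain) {
  return Object.entries(files)
    .filter(([, source]) => source.includes(domain))
    .map(([name]) => name);
}

// Made-up file contents standing in for files read from node_modules.
const files = {
  "node_modules/example-a/index.js":
    "const src = 'https://cdn.polyfill.io/v3/polyfill.min.js';",
  "node_modules/example-b/index.js": "module.exports = {};",
};

const flagged = findDomainReferences(files, "polyfill.io");
console.log(flagged);
```

A plain text search like this will miss URLs that are assembled at runtime, so it complements published security advisories rather than replacing them.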

As with mutable resources, it's important we use trustworthy sources for our packages.

One additional precaution we can take is to make use of a lock file. When we run npm install, a package-lock.json file is generated automatically, and it is updated whenever we change our installed packages. This lock file should be committed to version control and used to install packages whenever someone is setting up a copy of the code, testing or making a deployment. We can do this by using npm ci (instead of npm install), which installs exactly the versions recorded in the lock file. The lock file also contains cryptographic hashes (like those used with SRI) that npm checks during installation to ensure packages haven't been tampered with.
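To show where those hashes live, the sketch below walks the packages section of a lock file and lists any entries that have no integrity value. The lock file object is a made-up, trimmed-down example (with a placeholder hash) and packagesWithoutIntegrity is a hypothetical helper; a real script would load the file with JSON.parse(fs.readFileSync("package-lock.json", "utf8")).

```javascript
// Hypothetical, trimmed-down lock file contents; package names and the hash are placeholders.
const lockFile = {
  lockfileVersion: 3,
  packages: {
    "": { name: "example-site", version: "1.0.0" },
    "node_modules/example-lib": {
      version: "1.2.3",
      resolved: "https://registry.npmjs.org/example-lib/-/example-lib-1.2.3.tgz",
      integrity: "sha512-PLACEHOLDER",
    },
    "node_modules/unverified-lib": {
      version: "0.0.1",
      resolved: "https://example.com/unverified-lib-0.0.1.tgz",
    },
  },
};

// Hypothetical helper: lists installed packages that have no integrity hash recorded.
function packagesWithoutIntegrity(lock) {
  return Object.entries(lock.packages ?? {})
    // Skip the root project entry ("") and locally linked packages.
    .filter(([path, entry]) => path !== "" && !entry.link && !entry.integrity)
    .map(([path]) => path);
}

const missing = packagesWithoutIntegrity(lockFile);
console.log(missing);
```

Here only node_modules/unverified-lib is reported, because its entry has nothing to verify the download against.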

Summary

Using third-party resources introduces a real risk to our websites. However, subresource integrity provides a widely supported means of securing against compromises. If a resource is manipulated to deliver malware, these checks will ensure the only impact on our users is a reduction in functionality. This would be a far better outcome than our users being put at risk.

Sadly, use of the integrity and crossorigin attributes is lacking. Example code often neglects to include them, which leads to poor implementations. As developers, we need to be aware of this and ensure we are writing secure and robust code wherever possible. Just because an example is missing these attributes doesn't mean they shouldn't be used.

As for polyfill.io, if your sites are still using this, update your code now. Remove references to this compromised domain. Cloud-based content delivery networks have acted to minimise the vulnerability, but the risk remains. Andrew Betts, the creator of polyfill, wrote back in February that 'no website today requires any of the polyfills in the http://polyfill.io library.' Just get rid!