DNT and respecting user privacy
By drenwick on 2019-06-13 11:57:57

Around 2011 and 2012, most major browsers added a new privacy feature to their settings menu: "Do Not Track". This was an opt-in setting that would, in theory, stop sites from tracking those who did not wish to be tracked.

In practice, however, all it does is add a DNT: 1 header to every HTTP request. Most sites ignore this header.

I still use the DNT setting, and I endorse it for those of you who are privacy conscious. While most sites may ignore the header, there are still sites out there that respect it, and even if enabling it stops only one site you frequent from tracking you, that's still less tracking for the cost of flicking a switch.

In fact, this very site you're on right now (assuming you're reading this blog post on my blog) supports the DNT header. I use Google Analytics for tracking user activity. I like to know where my readers come from, and where they go. Is this hypocritical? Maybe, but I don't think so.

Because my site supports DNT.

Not everybody is as privacy conscious as I am, and many people are more privacy conscious than I am. (After all, I use Windows 10, Android, and Gmail.)

The DNT preference can be read in two ways: server-side and client-side.

The client-side method uses the JavaScript navigator.doNotTrack property, which returns "1" if DNT is enabled and "0" if it is disabled (although in testing I found that Chrome simply returned null when DNT was disabled).
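To make the client-side option concrete, here is a minimal sketch of such a check. The function name isDoNotTrackEnabled is my own invention for illustration, and it takes the navigator object as a parameter purely so it can be tried outside a browser. It treats anything other than the string "1" (including the null Chrome returned in my testing) as "no DNT request".

```javascript
// Hypothetical helper, not part of any library: takes a navigator-like
// object so the check can also be exercised outside a browser.
function isDoNotTrackEnabled(nav) {
  // Only the string "1" means the user asked not to be tracked.
  // "0", null, undefined, and anything else count as "no DNT request".
  return Boolean(nav) && nav.doNotTrack === "1";
}

// In a browser you would call it as: isDoNotTrackEnabled(window.navigator)
```

The strict comparison against "1" is deliberate: it avoids treating "0" or a missing value as a request not to be tracked.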

The server-side method is the approach I chose for this site. It only involves reading the DNT header from the request.
In PHP, I created this little utility function:

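class Request
{
    // True only when the client sent "DNT: 1" (PHP exposes the header as HTTP_DNT).
    public static function doNotTrack(): bool
    {
        return isset($_SERVER['HTTP_DNT']) && (int)$_SERVER['HTTP_DNT'] === 1;
    }
}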

For those of you who don't speak PHP, this checks that the DNT header (exposed by PHP as $_SERVER['HTTP_DNT']) is set, and that its value, when cast to an integer, is 1.
I'm not a fan of implicit type coercion due to the unexpected behavior it can cause, and the header arrives as the string "1" when set, so explicitly casting to an int and comparing with === was, in my opinion, the cleaner way to do this.

Now, in my header template, I can simply surround my Google Analytics code like so:

<?php if (!Request::doNotTrack()) { ?>
    <!-- Put Google Analytics here -->
<?php } ?>
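If your backend is Node.js rather than PHP, the same idea might look roughly like the sketch below. Everything here is an assumption for illustration: the function names requestHasDoNotTrack and analyticsSnippet are made up, and headers is assumed to be a plain object with lowercased header names, which is how Node's http module exposes req.headers. Unlike my PHP version, this one compares the trimmed string directly instead of casting to an integer.

```javascript
// Hypothetical Node.js counterpart of the PHP check.
// `headers` is assumed to be a plain object with lowercased header names.
function requestHasDoNotTrack(headers) {
  return typeof headers.dnt === "string" && headers.dnt.trim() === "1";
}

// Hypothetical template helper: returns the analytics markup, or an empty
// string when the visitor sent DNT: 1.
function analyticsSnippet(headers, snippet) {
  return requestHasDoNotTrack(headers) ? "" : snippet;
}
```

In a request handler you would pass req.headers and your own analytics markup, then write the result into the page head.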

So now my site totally respects privacy, right?

... Right?

Eh.. maybe not.
See, my site uses four third-party resources (excluding Google Analytics):

  • Animate.css
  • Raleway font from Google Fonts
  • highlight.js
  • FontAwesome

Animate.css is used for the fancy animations you see when you first load a page. It's just a CSS file, and as such can't really do any tracking.
Raleway is the font you see used across this entire site. It's also just a CSS file, which in turn references a woff2 file, so it can't really do any tracking either.
highlight.js is a JS file, which could do tracking in theory, but the whole script is open source, and the source shows no tracking.
FontAwesome is also a JS file, which pulls SVG files as needed. Again, it's open source and has no tracking code.

Cool, so no tracking there right? Wrong again!
Animate.css is served from jsDelivr's CDN, which could track visitors.
Raleway is served from Google Fonts' CDN, which absolutely has tracking.
FontAwesome's CDN also uses tracking, as per their privacy policy.

So how do we combat this?
Well, Animate.css can simply be downloaded and hosted locally. This isn't a problem unless we want auto-updates (we don't, that might break things).
Raleway can also be hosted locally, sort of. It's a little tricky, as the fonts are generated per request, so you need to make sure you download the right files.
FontAwesome can also be served locally.

So if all of these problems have been addressed, what's my point?
Simple: respecting user privacy on your website extends beyond how you develop your website. It extends into how your third-party scripts and libraries behave. Just because I was lucky in that all of my third-party scripts could be hosted locally and were open source doesn't mean that's always the case.

How many of you web devs out there use 3rd party libraries on your website? Maybe it's FontAwesome, maybe it's jQuery, maybe it's a plethora of obtuse NPM modules that you needed just to left-pad a string.

The point is, many of them are likely served from a CDN. That CDN can track your users and disrespect their privacy. Have you ever read, or even looked for, a privacy policy for most of the 3rd party libraries and scripts you use? You've also likely never read the source code of most of the JS scripts you're using on your site. They could contain tracking code that disrespects your users' privacy.
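If you want a quick way to see which outside hosts your own pages pull assets from, something like the sketch below could be a starting point. The function name findThirdPartyHosts is hypothetical, and the regex scan is an assumption made for brevity; a real audit would use a proper HTML parser and would also need to follow what each stylesheet or script loads in turn.

```javascript
// Hypothetical audit helper: lists the external hosts that a page's
// <script>, <link>, <img>, and <iframe> tags load assets from.
function findThirdPartyHosts(html, ownHost) {
  const hosts = new Set();
  // Rough attribute scan, good enough for a first look at simple markup.
  const tagPattern =
    /<(?:script|link|img|iframe)\b[^>]*?\b(?:src|href)\s*=\s*["']([^"']+)["']/gi;
  for (const match of html.matchAll(tagPattern)) {
    let url;
    try {
      // Resolve relative and protocol-relative URLs against the site's own origin.
      url = new URL(match[1], `https://${ownHost}/`);
    } catch {
      continue; // skip values that cannot be parsed as URLs
    }
    if (/^https?:$/.test(url.protocol) && url.hostname !== ownHost) {
      hosts.add(url.hostname);
    }
  }
  return [...hosts].sort();
}
```

Every host in the returned list is a party that sees your visitors' requests, so each one is a privacy policy worth at least skimming.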

Of course, I'm not saying you should analyze the source code of every library you use in fine detail on the off chance it has some privacy-violating tracking code in there. I'm just asking you to think before you include that monolithic JS library.

So next time you go looking for a library that makes your life marginally easier, think about what the impact of that is, not only on the performance of your page, but on the privacy of your page.

Update

After 6 days, I've received a response from FontAwesome's privacy team regarding whether they respect the DNT header:

Disappointing news. Good thing they support serving the library locally!