Recently, while doing some research, I found myself going down the path of Google Analytics IDs and the IDs used by other analytics services. I was looking for ways not only to identify the analytics code embedded in a site, but to correlate it with other sites that may be linked to the same owner. A note before you read further: I make some guesses about what I find, even though guessing runs contrary to the concept of analysis, and I am not presuming to know definitively why I am seeing what I am seeing in this specific case study. It’s all just very curious to me. Dive in and take a look for yourself!


WHAT ARE ANALYTICS ID’S?

Analytics services essentially give you a small chunk of code to embed in your site that reports information back to them whenever somebody views a page. That data is then presented on a fancy dashboard so you can better understand the traffic coming to your site: visitor counts, visitor locations, referrers, and so on. Depending on the service, you may get one unique ID to use across all of your sites or, like in the image above, an ID with a suffix appended to distinguish each site you’re monitoring. If you look at the structure of the ID in the image above, you’ll see that my account ID is 2319990. Then, for each site or “property” I want to track, the service appends a number to the end of it. In the case of this site, it added -11, I suppose because I’ve previously monitored 10 other things that are now dead or gone.

So why is this important? Because the base ID stays the same, especially in the case of Google Analytics, you can use services or search engines to identify other assets linked to that ID and, in turn, find other sites I own or at least have some kind of permission to monitor or edit code on. If sites -1 through -10 still existed, you’d be able to see them all and know that they were possibly related to me in some way.
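To make the idea concrete, here is a minimal sketch of pulling IDs out of page source and grouping sites by the base account number. It assumes the classic Google Analytics format `UA-<account>-<property>`; the three page sources and their `.example` domains are made up for illustration, and only the 2319990 account number comes from this post.

```python
import re
from collections import defaultdict

# Assumed format: classic Google Analytics IDs, UA-<account>-<property>.
UA_PATTERN = re.compile(r"\bUA-(\d{4,10})-(\d{1,4})\b")


def extract_ua_ids(html):
    """Return a sorted list of (account, property) pairs found in the HTML."""
    found = {(m.group(1), int(m.group(2))) for m in UA_PATTERN.finditer(html)}
    return sorted(found)


def group_by_account(pages):
    """Map each base account ID to the set of sites where it appears."""
    accounts = defaultdict(set)
    for site, html in pages.items():
        for account, _property in extract_ua_ids(html):
            accounts[account].add(site)
    return dict(accounts)


# Hypothetical page sources, not real sites.
pages = {
    "site-a.example": "<script>ga('create', 'UA-2319990-11', 'auto');</script>",
    "site-b.example": "<script>_uacct = 'UA-2319990-3';</script>",
    "site-c.example": "<script>ga('create', 'UA-5550000-1', 'auto');</script>",
}

# Keep only accounts that show up on more than one site.
shared = {a: s for a, s in group_by_account(pages).items() if len(s) > 1}
print(shared)
```

A shared account number is only a lead, not proof of common ownership, for the reasons covered in the next paragraph.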

There are two problems with this type of searching and enumeration. The first is that if an analytics ID is ever reassigned, you are going to get false positives. The second is that certain third parties can distribute analytics IDs, so you may find two sites that appear related based on the ID and find out later (or perhaps never) that the only real connection is that they both received IDs from the same third party and have very little material connection to each other. All that said, finding related sites through analytics IDs is not a guarantee, but it is definitely worth checking for the inquisitive investigator or researcher.


A CASE STUDY OF THE FBI

I didn’t actually think this would be a thing. I tried finding analytics IDs for both the CIA and NSA first, but nothing turned up. Personally, I think it’s probably a good idea that these federal agencies aren’t using them, if only in part for the reasons I’ve already mentioned. Then I put the FBI in and… well, let’s see how this analytics research can both work successfully and provide some unnecessary visibility into the people who own the account.

One site that will help you identify these links is called “BuiltWith”. If you enter a domain there and then click the “Relationships” tab, you’ll be greeted with a list of every ID the service has observed that domain using, along with a fancy historical record with graphs for you graph people (shown above). You’ll notice that the same analytics ID linked to the FBI has appeared on a lot of other seemingly unrelated sites over the years. Of note, several of the unrelated sites carried the ID during the same timeframe as one another, which is curious. One site in particular seems to have been up with that ID for over five years. Maybe it’s something they forgot about. Let’s take a look.
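One way to spot sites that carried the ID during the same timeframe is to group them by their first-seen and last-seen months. The sketch below is only an illustration: the rows are copied by hand from the list at the end of this post, and the tuple layout and the `group_by_window` helper are my own invention, not anything BuiltWith provides.

```python
from collections import defaultdict

# (site, first_seen, last_seen) with dates as (year, month) pairs,
# typed in by hand from a few rows of the list later in this post.
observations = [
    ("onebitsnoop.com", (2016, 4), (2016, 5)),
    ("torrentssearch.net", (2016, 4), (2016, 5)),
    ("bitsnoop.in", (2016, 4), (2016, 5)),
    ("doggieyou.be", (2018, 2), (2018, 2)),
    ("doggieyou.com", (2018, 2), (2018, 2)),
    ("siberianhuskypuppiesofravenwood.com", (2011, 4), (2016, 10)),
]


def group_by_window(rows):
    """Return only the date windows shared by more than one site."""
    windows = defaultdict(list)
    for site, start, end in rows:
        windows[(start, end)].append(site)
    return {w: sites for w, sites in windows.items() if len(sites) > 1}


for (start, end), sites in group_by_window(observations).items():
    print(start, end, sites)
```

Exact-match windows are a blunt instrument. Checking for overlapping ranges would catch more, at the cost of more noise.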

This site is straight out of the internet boom and has clearly never been updated since. It obviously isn’t something that should be showing up in relation to the FBI, so how did it get on our list? Well, if we take a brief look deeper, there’s more going on here than we realize. I use the Brave web browser because it has a lot of built-in privacy functions that you’d have to add extensions to replicate in other browsers. One of them is protection against cross-site trackers. On Google.com, Brave blocks 6 trackers, all belonging to Google. On this husky site from the 90s?

Yikes. That’s a whole lot of FBI-hosted JavaScript. What does it do? I have no idea, but you probably shouldn’t go to this site. To the person who read that and now absolutely has to check the site out, godspeed.

So what’s going on here? If I had to guess, I suspect that the FBI may be using fake (or real, seized) pages in their investigations and monitoring them with their own FBI-linked code and analytics IDs. It’s actually rather comical: they’re betting on poor tech practices from the people visiting the pages while making what I’d consider a pretty major oversight in their own. I counted 57 total pages linked to them through analytics. Setting aside the very obviously government or FBI related pages (17, I believe), that leaves 40 on this list that were possibly used at some point to collect analytics data for the FBI but made to look like they didn’t belong to, or weren’t being monitored by, the FBI.

To their credit, many of the sites either no longer exist or no longer have the FBI’s code embedded in them. In fact, though I haven’t checked them all, this page is the only one where I’ve seen current evidence of code related to the FBI. Some other interesting domains that have had the analytics ID on them include an official edu domain for Chattahoochee Technical College, a torrent site, and a proxy site (with clear FBI labeling on the page itself).


GOING WAYBACK

Just because a site doesn’t exist anymore doesn’t necessarily mean we can’t see what it looked like. Using the dates provided from BuiltWith, we can use the “Wayback Machine” on Archive.org to see if any historical snapshots of the pages exist around the time the analytics ID showed up there.
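If you’d rather build the lookup links than click through the calendar, here’s a minimal sketch. It relies on two things I believe to be true of the Wayback Machine: a URL of the form `web.archive.org/web/<timestamp>/<site>` resolves to the capture closest to that timestamp, and `archive.org/wayback/available` is a JSON endpoint that reports the closest snapshot. The function names are mine, and the day of the month is an assumption, since BuiltWith only gave me months.

```python
from datetime import date
from urllib.parse import urlencode


def wayback_url(site, when):
    """Build a Wayback Machine URL for the capture closest to `when`."""
    return f"https://web.archive.org/web/{when:%Y%m%d}/{site}"


def availability_query(site, when):
    """Build a query URL for the Wayback availability endpoint (returns JSON)."""
    params = urlencode({"url": site, "timestamp": f"{when:%Y%m%d}"})
    return f"https://archive.org/wayback/available?{params}"


# April 2011 is when the ID was first reported on this site; the 1st is a guess.
first_seen = date(2011, 4, 1)
print(wayback_url("siberianhuskypuppiesofravenwood.com", first_seen))
print(availability_query("siberianhuskypuppiesofravenwood.com", first_seen))
```

Neither function touches the network. They only produce URLs you can open in a browser or pass to whatever HTTP client you prefer.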

I found some similarities in a random sampling of pages I checked. First, they were all available in the Wayback Machine. This isn’t guaranteed: the Wayback Machine doesn’t capture every page on the web, and whether a small site gets archived often depends on someone submitting it or a crawler happening to find it. Second, of the sampling I took, all the pages had a snapshot taken within the month preceding the date the analytics ID was reported as being added to the site. Was this the FBI? I have no idea, but it seems odd.
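The “snapshot within a month before the ID appeared” check can be sketched like this. It assumes capture timestamps in the 14-digit `YYYYMMDDhhmmss` form the Wayback Machine uses in its URLs; the three timestamps and the first-seen date below are made up for illustration and are not taken from any of the sites I sampled.

```python
from datetime import date, datetime, timedelta


def snapshots_before(timestamps, first_seen, window_days=31):
    """Return capture dates that fall within `window_days` before `first_seen`."""
    hits = []
    for ts in timestamps:
        captured = datetime.strptime(ts, "%Y%m%d%H%M%S").date()
        gap = first_seen - captured
        # Keep captures taken on or before first_seen, but no earlier than the window.
        if timedelta(0) <= gap <= timedelta(days=window_days):
            hits.append(captured)
    return hits


# Hypothetical capture timestamps for a single site.
captures = ["20110105120000", "20110320083000", "20110615000000"]
print(snapshots_before(captures, date(2011, 4, 1)))
```

Because BuiltWith reports months rather than days, the 31-day window is a rough stand-in, and a hit here means only that the timing lines up.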

Sometimes the Wayback Machine has trouble indexing pages for whatever reason. One of these reasons, at least in my experience, is when a site is redirecting somewhere else. If it does have trouble following the redirect however, it displays a log of where it was going.


WHAT DOES THE JAVASCRIPT DO?

The script sources on the page point to paths that are now broken or no longer exist. There are 3 scripts included that link to the FBI’s core site: urchin.js, /fsrscripts/triggerParams.js, and /fsrscripts/stdLauncher.js. Since the code is gone from the live site, I can’t say what it was doing at the time it was actively used on the page. However, I did find that all three were archived in the Wayback Machine. I have linked the archived copy from each script name above so some of you smart JavaScript coders can take a peek for yourself.
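If you want to repeat this check on another page, a minimal sketch using only Python’s standard library is below. It collects every script `src` on a page and keeps the ones hosted on a given domain. The sample HTML and the `example.gov` host are placeholders I made up; only the script paths mirror the ones named above.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ScriptCollector(HTMLParser):
    """Collect the src attribute of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def scripts_from_host(html, domain):
    """Return script URLs whose host is `domain` or one of its subdomains."""
    collector = ScriptCollector()
    collector.feed(html)
    matches = []
    for src in collector.sources:
        host = urlparse(src).hostname or ""  # relative paths have no host
        if host == domain or host.endswith("." + domain):
            matches.append(src)
    return matches


# Placeholder page source; example.gov stands in for the real host.
sample = """
<script src="https://www.example.gov/urchin.js"></script>
<script src="https://www.example.gov/fsrscripts/triggerParams.js"></script>
<script src="https://www.example.gov/fsrscripts/stdLauncher.js"></script>
<script src="/local/menu.js"></script>
<script src="https://notexample.gov/other.js"></script>
"""
print(scripts_from_host(sample, "example.gov"))
```

Matching on the exact domain or a dot-prefixed suffix avoids counting lookalike hosts such as `notexample.gov`.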

It appears that both triggerParams and stdLauncher were related to some kind of survey. I have no idea what the survey was or why it used to be on this page, but it’s gone now. The urchin.js code looks more obfuscated, and I can’t say exactly what this copy does (well, I can’t say that for any of them). The filename, though, is that of the legacy Google Analytics tracking script, which would fit with the analytics ID showing up here.


THE SITES

I went through all the sites from BuiltWith that had the same Analytics ID as the FBI at some point in time and eliminated all of the .gov sites or anything that seemed directly related to the government or FBI. Here’s the list below:

Site Dates
ppcguy.com Jan - Mar 2011
meetinsing.com Jan - Mar 2011
bfslash.org Mar 2011
othx.com Mar 2011 - Feb 2013
siberianhuskypuppiesofravenwood.com Apr 2011 - Oct 2016
bsotpond.com Feb 2012
evilunclesteve.com Feb 2012
gilmerfreepress.net Apr - Nov 2012
gilmerfreepress.com Apr - Nov 2012
mybmb.net Apr - Jun 2012
ansasolutions.co.uk May 2012 - Apr 2015
tapiaadvertising.com Apr - Dec 2013
pacificanovelties.com Aug 2013
50pluslifestyles.net Feb 2014 - Apr 2015
jsplumbing.net Sep 2014 - Oct 2015
stevereicherttraining.com Nov 2014 - Oct 2015
friendbombforfreedom.com Apr 2015
mitdating.dk Sep 2015
picrate.me Sep - Oct 2015
bzr.am Sep - Oct 2015
wtpcnews.com Jan 2016
chattahoocheetech.edu Jun 2016
blog.chandlerreports.com Jun 2016 - Nov 2017
onebitsnoop.com Apr - May 2016
torrentssearch.net Apr - May 2016
bitsnoop.in Apr - May 2016
applerejects.com Jul 2016 - Feb 2017
hashcaddy.com Jul 2016 - Apr 2017
quickprox.com Nov 2016 - Feb 2020
lhorlogestore.com Dec 2016
sho4baby.dk May 2017
ingage.net May 2017
lhorlogestore.com May 2017
zsoor.org Aug - Sep 2017
doggieyou.be Feb 2018
doggieyou.com Feb 2018
doggieyou.nl Feb 2018
poolstuff.ca Jun 2018
bestpoolbuys.com Jun 2018
adasigndepot.com Nov 2019
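
The first pass of that elimination, dropping anything on a .gov domain, is easy to sketch. The input list below is a mixed sample I assembled for illustration (a few domains from the list above plus fbi.gov hosts), not BuiltWith’s actual output. Sites that are government-related but not on .gov still need a manual look.

```python
def is_gov(domain):
    """True if the domain is under the .gov top-level domain."""
    d = domain.lower().rstrip(".")
    return d == "gov" or d.endswith(".gov")


def non_gov(domains):
    """Keep only domains outside .gov, preserving their order."""
    return [d for d in domains if not is_gov(d)]


# Illustrative input only.
candidates = [
    "fbi.gov",
    "quickprox.com",
    "chattahoocheetech.edu",
    "WWW.FBI.GOV",
    "bzr.am",
]
print(non_gov(candidates))
```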

WHAT ELSE?

I don’t know. This is about as far as I took it. There’s probably a lot more information to dig into here, but this post has already gone on long enough, and I’m sure some daring adventurers who see this will come out with research that completely overshadows it. That’s great, and I can’t wait to read it! I, for one, have to push pause. This was a fun find for me, and I hope it was for you too. If anything, I hope it highlights the importance of checking analytics IDs for associations when conducting an investigation. If you find anything else interesting or have any feedback, please feel free to reach out! You can find me on Twitter or by emailing me at my name (at the top of the post) at this domain.