This is post 1 of 2 in a series on data collection. You can view post two here.
Recently, I decided to download my Google and Facebook data archives containing all the information they had on me. I have had a Google account since 2006, and I’ve always been an early adopter of Gmail and other Google services. I have had a Facebook account since roughly 2007. On both sites, I had always been very flippant with my data. I maximized the amount of access I gave the sites, reasoning that I simply wanted a better experience. My motto had always been “if I’m going to be served ads, they might as well be ads for things I want.” Well, the download of my Google data was ready today (at a whopping 32 GB), and I was uncomfortable with, though not surprised by, the things I found in there.
This blog post will be quite transparent: I am sharing actual content from my archive. I am hoping it will help clarify for others what kind of data they’re exposing to these companies and the vendors they share it with, and that my transparency will be met with graciousness.
I don’t know what the data retention policy at Google is, but I suspect there is no set expiration after which they stop retaining data they’ve collected on people. That being said, my archive contained obscenely specific records of everything I’ve done or searched with Google services since 2006. For example, it was interesting to learn that my first Google search was on November 16th, 2006 at 11:38pm for “dictionary”. No clue why I needed a dictionary at that time of night, but it’s a moot point now and really senseless information to be retaining. It was also fascinating to find that just 3 days after creating my Google account, I wanted to find “the end of the internet” at 1:41pm. After skimming through portions of my search history, I was shocked with… myself. There were search queries tied to my name that were immature, sometimes potentially illegal (like searching how to make homemade explosives; I was a high school student), and frankly just ridiculous. All in all, the log of these searches wasn’t very helpful to me. If anything, it’d probably be quite a hindrance if somebody else were to use it as a measure of me.
Those were just the web searches. Google had more than a decade of search queries across every platform or service they’ve ever offered. I found video searches (separate from YouTube) dating back to my high school years for things like “insane fight”, as well as some of my personal favorites: potentially embarrassing searches of the “Shopping” tab for things like “tv laptop” or “google internet”. These are queries I’d laugh at today and would expect to see from some 70+ year old grandparent learning how to use the internet for the first time. Beyond that, the other searches they maintain records on are:
- Google Images
- Google News
- Google Maps (this included thousands of locations I’ve looked up dating back to 2007)
- Google Support (Help)
- Google Goggles
- Google Play Store
- Google Music
- Google ‘Shopping’
- and Google Developer (includes subjects I viewed as well).
Google began as a search engine, and they do searching well, far better than any of their competitors in my opinion. However, they have since extended their grasp into gathering any other data they can get their hands on. Like I said earlier, the data I described above was unsettling but not surprising. The data in this portion was equally unsettling, and it did surprise me. Up until 2 years ago, I had been a staunch Android fan. Ever since the G1, I have owned almost every single ‘Nexus’ or Google-branded device. I switched to iPhone for work and I’ve been very pleased with the switch. After looking at this archive, I have to say I will have a hard time switching back. In my archive was a folder simply labeled “Android”. Upon opening it, I was faced with a time and date log of each time I used every app on my Android phone. I don’t know what this data is for or why they keep a log of it years later, but I was surprised to see it in my records. I should note that in my archive these logs only date back to December 2014. I am not sure if that’s when they started logging or if I somehow dodged this function up until that date, but I would be surprised if others’ logs went back any further than that.
The data in the “Ads” file was also a bit surprising to me for a few reasons. Not only did this file contain a time/date list of every ad I’ve ever clicked through Google, it also contained certain apps that it listed as “Google Ads”. The kicker is that the apps shown in this list are recent, and they’re apps used by my wife on her iPhone. The only thing linking my wife’s Google account and mine is the recovery system. Other than that, we are not in any kind of managed family or any other system that I know of that should make her apps show up in my ads list. I was surprised that 1) this ads list displayed apps and 2) they were from my wife’s phone. The apps recorded here for us were Pandora, CamScanner Lite, The Bump Pregnancy Tracker, and Tinybeans.
Google’s location tracking and Maps app are incredibly useful. Google Maps is still my GPS app of choice, and I also use it to look up reviews for places and to see their hours, website, or phone number. However, I have now turned their logging off after reviewing the content they had stored for me. Since 2007 (in my archive), they have a log of every location I have searched for, called, or asked for directions to. The log also included a time/date list of every time I even opened the app, as well as any time I simply viewed a generic area, such as “Viewed area around Northeast Seattle”, along with a hyperlink to open Google Maps to the area I was viewing. Beyond the log, there is a location history file that contains further location and maps information. My data had over 590 confirmed locations and 480 suspected but unconfirmed locations (that it wanted me to verify) all over the world. This map, with a pin on each location, included areas all over the U.S. and even the locations in Kuwait where I was stationed when I deployed.
Beyond that, Google Fit contained years of logs covering any time I did any activity, even something as minor as walking. Not only did it contain micro logs of activities like that, it also contained GPS coordinate data and the time it took me to get from point A to point B.
I had adopted Google Hangouts (and by proxy Google Voice) as my primary communication method for several years. For a time, I used a Google Voice phone number as my primary number and sent SMS/MMS messages through it. My archive contained just under 1,000 images that were sent in messages on Google Hangouts over the time I used it. There was also a ‘hangouts.json’ file that contained the actual content of all my Hangouts messages as well as other technical information like who initiated individual conversations, message IDs, and what type of notification was produced, like “ring”. For Google Voice, there was not only a list of every call (with a placed/received label, date/time, and duration), there were also .mp3 files of every voicemail I received. There was also a file containing all of my billing information for calls I made, such as international calls; further .mp3 files of the voicemail greetings I personally recorded; and a file listing the phones I’ve used Google Voice on. My archive also contained a full export of my Gmail inbox, including every email I hadn’t deleted, along with Spam and attachments.
Beyond the things above, there was also a copy of everything in my Google Drive storage, all photos that were automatically backed up using Google Photos (a service I love and plan to continue using), and all of my old Google Keep notes. My Google contacts, my Google Calendar events, and everything I had ever listened to on Google Music were also in there.
Another fun find was a file under “Google Pay Send”, which was a transaction log of every Android Pay / Google Wallet transaction I’d ever made, who the transaction was made with, and details like the last 4 digits of my debit/credit card or my bank account.
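If you want a quick overview of your own archive before opening individual files, a small script can total up the file count and size of each top-level folder. This is only a rough sketch: it assumes you have already extracted the download into a single directory (I call it `Takeout` here, but use whatever your folder is named), the function names are my own, and it does not read or parse any of the files themselves.

```python
import os
from collections import defaultdict


def summarize_archive(root):
    """Return {top_level_folder: (file_count, total_bytes)} for an extracted archive."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files sitting directly in the archive root are grouped under "(root)".
        top = "(root)" if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that cannot be read
            totals[top][0] += 1
            totals[top][1] += size
    return {folder: (count, size) for folder, (count, size) in totals.items()}


def print_summary(root):
    """Print one line per top-level folder, largest first."""
    summary = summarize_archive(root)
    for folder, (count, size) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{folder:30s} {count:8d} files {size / 1_000_000:12.1f} MB")


# Example (hypothetical path): print_summary("Takeout")
```

Sorting by size makes it obvious which services account for most of the archive, which is a reasonable place to start digging.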
This is just a snapshot of all of the data contained in the 32 GB of information I downloaded from Google. I have since deleted most of these logs from my account management page and disabled any further logging for the things I can control. Hopefully, going forward, even my searches won’t be stored in a log and I can dig myself out of the data nightmare I’ve gotten myself into. Thank you all so much for taking the time to read this. I truly appreciate it. Be on the lookout for a follow-up post once I get my Facebook archive link and sift through that.
#### ADDITIONAL RESOURCES