After all the fuss recently about the iPhone tracking your location, I thought I’d have a closer look at the data to see what Apple (or any other prying eyes) could possibly know about me from the data being stored on my phone. What I found was surprising – but not in the way you might think.
According to the articles, the iPhone was storing up to a year’s worth of location data. So where have I been in the past year?
Sydney, Canberra, Melbourne, and a little around the snow fields. Well – I never claimed I’d been anywhere interesting.
So, the iPhone knows I’ve spent time in some major (1) eastern cities, and it knows I went to the snow. While that is kind of weird — really, it’s weird to think that my phone “remembers” anything about where I’ve been — I’m not exactly worried about my privacy. It’s hard to distinguish between me and several million other Australians (and/or tourists) at this point.
Where can we go from here? Can we divine anything specific from this data? Anything really worrying — where I live, where I work, where I spend my off-hours? First let’s take a broad look, and then see if we can narrow it down to specifics.
The Big Picture
Let’s start with the big picture. Where have I spent the most time? It’s pretty hard to tell looking at the points — Sydney, Melbourne and Canberra are all pretty crowded. The easiest way to get a quick answer is a density map. This won’t tell me anything specific in terms of absolute numbers — but it will give me an idea of where the most points occurred.
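For anyone wondering what a point density map actually computes, here is a minimal plain-Python sketch of the idea: each output cell gets the number of points within a search radius of the cell centre, divided by the area of that circle. This is a stand-in for illustration, not the ArcGIS tool itself. The function name, the grid bounds and the sample coordinates are all hypothetical, and the points are assumed to be in a projected coordinate system measured in metres.

```python
import math

def point_density(points, cell_size, radius, bounds):
    """Return a grid (list of rows) of points-per-square-metre values.

    Each cell's value is the number of points whose distance from the cell
    centre is at most `radius`, divided by the area of that search circle.
    """
    min_x, min_y, max_x, max_y = bounds
    n_cols = math.ceil((max_x - min_x) / cell_size)
    n_rows = math.ceil((max_y - min_y) / cell_size)
    area = math.pi * radius ** 2
    grid = []
    for row in range(n_rows):
        cy = max_y - (row + 0.5) * cell_size  # rows run north to south
        values = []
        for col in range(n_cols):
            cx = min_x + (col + 0.5) * cell_size
            count = sum(
                1 for x, y in points if math.hypot(x - cx, y - cy) <= radius
            )
            values.append(count / area)
        grid.append(values)
    return grid

# Hypothetical coordinates in metres: a tight cluster plus one stray point.
sample_points = [(100, 100), (120, 110), (90, 130), (110, 90), (900, 900)]
density = point_density(
    sample_points, cell_size=100, radius=150, bounds=(0, 0, 1000, 1000)
)
```

The cells near the cluster end up with the highest values, the cell near the stray point gets a low value, and everything in between is zero. That is all a hotspot on the map really is.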
Well, that settles that then — it’s very obvious from this view that the most points have been recorded in and around Melbourne. That’s getting a little closer to home — now let’s zoom in there and see if we can really invade my privacy.
It looks like I’ve been spending a lot of time in the city area, with some time in Geelong and Frankston/Mornington, and a conspicuous hotspot in Berwick (to the southeast). Although somewhat alarming, it’s still very general — at this stage, I still don’t consider that an invasion of privacy. But can we narrow that down a little?
A Closer Look
Let’s select these points around Melbourne, and run the point density tool again using a smaller search radius – 500 meters. That should be small enough to define some specific areas of interest.
Now there’s a clear hotspot in the CBD (the dark blue region in the centre) extending to a fair amount of activity in the general Brunswick/Northcote area (north of the city). There are a couple of conspicuous hotspots, one around St Kilda (southeast of the city) and another around the Flemington racecourse (northwest of the city). This is getting a little more uncomfortable, but it’s still hardly attributable; you can’t exactly pick out a particular location as my house or workplace from this view. What happens if we try looking a bit closer? Let’s try making one more density map – this time with a search radius of 100m. At 100m resolution, you’re getting towards scary territory – less than a household block. You should be able to pick up my house or my workplace at this scale.
Hmm. At this size, it’s not looking that different from the points shapefile to begin with – it’s just a bunch of dots. We’ve probably gone too small to be useful with the search radius – the points are no longer blending together. However, there are some areas that stick out – there’s one at the MCG (east of the city), and one at Docklands (west of the city). So, according to my iPhone, it’s safe to assume that I live at Docklands, and work at the MCG, or vice versa. Right?
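The "no longer blending together" effect can be checked with a rough rule of thumb: two points can only contribute to the same cell of a density surface if they lie within about twice the search radius of each other. The sketch below counts how many points are isolated under that rule for a few radii. It is an illustration only; the function name and the sample coordinates (in metres) are hypothetical, not taken from my data.

```python
import math

def isolated_fraction(points, radius):
    """Fraction of points whose search circle overlaps no other point's circle."""
    isolated = 0
    for i, (x1, y1) in enumerate(points):
        has_neighbour = any(
            math.hypot(x1 - x2, y1 - y2) <= 2 * radius
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
        if not has_neighbour:
            isolated += 1
    return isolated / len(points)

# Hypothetical points in metres, spaced roughly 300 to 400 m apart.
sample = [(0, 0), (300, 0), (600, 100), (1000, 100), (1000, 500)]
for r in (100, 500, 5000):
    print(r, isolated_fraction(sample, r))
```

With points spaced like these, a 100m radius leaves every point on its own, while 500m and above merges them all. If the recorded points are sparse relative to the search radius, the density map degenerates into the original dots.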
I’ve only been to the MCG once in my life, and that was just last week! When I started this analysis, I had never been to the MCG. Furthermore, I have barely spent any time in the other places where hotspots show up, St Kilda and northwest of the city. To be fair, the Esri Australia Melbourne office is at 100 Franklin St. Melbourne – in the CBD, yes, but on the edges of this map’s “hot zone”, and I don’t spend a lot of time in the city besides sitting in the office. It’s true that my house is in the Brunswick/Northcote area (the area north of the city, shaded pink in the 500m density map), but from that view it’s impossible to even narrow it down to a particular suburb, let alone a house.
To top it all off, the data as a whole implies that I have spent most of the time in the last year in and around the Melbourne CBD – when I only moved to Melbourne 4 months ago! This analysis of my iPhone’s “tracking data” has established absolutely nothing specific about me, and suggests some things which are actively incorrect and incomplete.
To sum up:
- A general analysis of the data can at best identify an incomplete record of general regions I have visited in the last year.
- An attempt at a specific analysis of the data would result in misleading and inaccurate information about where I have been and which locations are significant.
- I have not been to some of the locations reported by the data, and I have been to others which are not represented in the data.
- The actual lat/long coordinates certainly do not indicate specific locations where I have been.
- Taken as a whole, there is no way this data could be used to pinpoint me as an individual, or conclusively prove that I was or was not at a given place at a given time.
The most Apple (or anyone else) could say from this data is “The owner of this iPhone probably lives and/or works in inner Melbourne, and has travelled to Canberra, Sydney, and the ski slopes”. That’s hardly a unique identifier — and it’s not really reflective of the truth anyway. Hopefully this quells some of the worry about Apple keeping tabs on your locations, and also demonstrates some of the analytic capability of ArcGIS.
Are you curious what your smartphone knows about where you’ve been? Why not have a look and let us know?
Note 1: Whether this label includes Canberra is left as an exercise for the reader.
Note 2: I’ve chosen a 30 meter cell size for the output raster so it will be of a similar resolution to my elevation data. I’ve chosen a 5 km search radius to indicate “all points within 5 km of each other can be considered to be in the same place” – this is pretty big because my data here is on a national scale. I’ve decided to symbolise the result using the Geometrical Interval method, excluding the 0 values. The “Distance” colour ramp gives a clear distinction between the categories while also implying some kind of increase – which is exactly what we want in this case. In the Display tab, ask it to draw using Cubic Convolution resampling and at 30% transparency, and off we go!
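As a rough idea of why a geometric classification suits density values, the sketch below builds class breaks that grow by a constant ratio between the smallest and largest non-zero value. This is a deliberately simplified illustration and not Esri's Geometrical Interval algorithm, which also balances how many values fall into each class. The function name and the sample values are hypothetical.

```python
def geometric_breaks(values, n_classes):
    """Upper class breaks that grow by a constant ratio, ignoring zero values."""
    positive = [v for v in values if v > 0]  # exclude the 0 values
    lo, hi = min(positive), max(positive)
    ratio = (hi / lo) ** (1 / n_classes)
    return [lo * ratio ** i for i in range(1, n_classes + 1)]

# Hypothetical density values: mostly empty cells and a few busy ones.
breaks = geometric_breaks([0, 1, 2, 8, 0, 64], n_classes=3)
```

Because density values are heavily skewed (lots of near-empty cells, a handful of very busy ones), breaks that widen as the values increase keep the low end from being lumped into a single class.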
Note 3: The analysis covered in this post is pretty surface-level, but I’ve also investigated it from a spatial statistics perspective, and considered the time the events were logged. Neither reveals anything specific at all – only the general “Melbourne CBD” conclusion, along with a bunch of nonspecific or outright untrue secondary conclusions.