Wednesday 27 April 2011

iPhone Tracking–Apple’s Response

This statement from Apple is just about the only viable explanation for the data that I see on my own iPhone:

This data is not the iPhone's location data-it is a subset (cache) of the crowd-sourced Wi-Fi hotspot and cell tower database which is downloaded from Apple into the iPhone to assist the iPhone in rapidly and accurately calculating location. The reason the iPhone stores so much data is a bug we uncovered and plan to fix shortly (see Software Update section below). We don't think the iPhone needs to store more than seven days of this data.

The far more detailed “local” tables are probably the source for the “crowd-sourced” data referred to above but I’ll be digging a bit more to see if there’s anything else lurking in the system databases.

It’s still technically possible for Apple to be harvesting location specific data that could be used to track users but I really don’t think so in the light of their previous statements, the actual structure of the data* that caused this latest batch of seriously uninformed speculation, and the obvious legal risks (in California, where Apple is based) associated with tracking anything that could then be mapped to real people.

Hopefully some sanity will prevail about this issue from here on.

That’s not to say that there aren’t huge risks associated with using crowd sourced location data, as Apple admit to doing a number of times here, but at least at this stage Apple do appear to be aware of the risks and concerns and have been acting accordingly.

* So far almost nobody appears to have actually examined the data with a view to understanding [a] whether it could be used for tracking or [b] why there are so many entries with identical timestamps. I know it’s silly to expect signs of intelligence on the internet but most of the echo-chamber chatter about this was truly uninformed and pathetic.

Tuesday 26 April 2011

iPhone Tracking.

The initial brouhaha has calmed down although the level of traditional media coverage was fairly awesome. I had a rather odd encounter with a strange woman on the corner of Green St and Barrack St in Cork on Saturday who felt led to [a] ask me was that an “I4” that I was using and [b] did I know that it was tracking my every move. Apparently she’d read about it in that well known tech journal, The Star, earlier and she was extremely concerned. She went on to explain to me that she used to work with US Intelligence, and to be fair her accent made that slightly plausible but I hope that US Intelligence agencies would hire agents that were less arbitrarily voluble and a lot more discerning in their sources than my over eager friend on Barrack St. Gotta love Cork though, it wouldn’t happen anywhere else.

Apart from that hilarious encounter I’ve had a chance to dig a bit further into the whole “consolidated.db” tracking issue. I have to say that the general commentary online is useless. Almost nobody has bothered to actually look at the data which is astonishing given how easy it is to find with the instructions provided.

Some of the best commentary I’ve seen so far has come from Frank Rieger here on his Knowledge Brings Fear blog. I think there is probably some truth to his assertion that this is the sort of thing that happens as a result of companies or individuals using a bug or bug-like feature for plausible deniability, but given the fact (as I’ll explain later) that the data is more or less useless for real tracking this is increasingly looking like a non-issue. Indeed if IT Forensics folks are using this to assert that they can covertly track iPhone users then I don’t think that sort of claim should be allowed to stand up in court – the data is far too vague and inconsistent for that.

As I said before there are four sets of tables. Within each group there are xxxLocation tables and xxxLocationLocal tables. All of the discussions I’ve seen talk about the Location tables and these are definitely the locations of cellular towers or WiFi Access points, not the location of the handset at all. It might be possible to deduce the handset location from these if the timestamps on them made any sense but they do not. The timestamps are grouped into batches, with tens or hundreds of entries sharing the same timestamp, which almost certainly corresponds to the time that Apple sent this location helper data down to the phone. I have 60k or so cell tower locations in my celllocation table but only a couple of hundred unique timestamps. This data might be useful in indicating that I had been within a few km of somewhere at some time but it’s so infrequent that I doubt that there is any “tracking” data that could ever be derived from it, even if the locations were reliable. And the locations aren’t “reliable” – more on that later.
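The batching described above is easy to check for yourself. What follows is only a minimal sketch in Python, run against a tiny made-up stand-in for the real table. The column names (Timestamp, Latitude, Longitude) and the idea that timestamps count seconds from 1 January 2001 are my assumptions about the schema, so verify both against your own copy of the file.

```python
import sqlite3
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are seconds since 2001-01-01 UTC. Check this
# against an entry whose date you know before trusting the conversion.
ASSUMED_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def to_datetime(ts):
    """Convert a stored timestamp to a datetime under the assumed epoch."""
    return ASSUMED_EPOCH + timedelta(seconds=ts)


def timestamp_batches(conn, table="CellLocation", column="Timestamp"):
    """Map each distinct timestamp in the table to the number of rows sharing it."""
    # Table and column names are hypothetical defaults; SQL identifiers
    # cannot be bound as parameters, so they are interpolated here.
    rows = conn.execute(f'SELECT "{column}" FROM "{table}"')
    return Counter(ts for (ts,) in rows)


# Made-up demo data standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CellLocation (MCC, MNC, LAC, CI, Timestamp, Latitude, Longitude)"
)
conn.executemany(
    "INSERT INTO CellLocation VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (272, 2, 100, 1, 325000000.0, 51.90, -8.47),
        (272, 2, 100, 2, 325000000.0, 51.91, -8.48),
        (272, 2, 100, 3, 325000000.0, 51.92, -8.49),
        (272, 2, 200, 4, 325600000.0, 53.35, -6.26),
    ],
)

batches = timestamp_batches(conn)
print(len(batches), "distinct timestamps for", sum(batches.values()), "rows")
for ts, count in sorted(batches.items()):
    print(to_datetime(ts).date(), count, "entries")
```

On a real table, a small number of distinct timestamps relative to the row count is the signature of batch downloads rather than a per-sighting log.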

You can broadly see where someone was over a timeframe of weeks, and by broadly I mean within a few tens of km or so, so I can see that I wandered over and back to Amsterdam, visited Eindhoven, Moscow and Bracknell over the past few months but there’s no way to tell where and when I was in the broader Dublin area on March 20th for example.

Anyway back to the main tables. One is CDMACellLocation and is used for CDMA Cell towers (indexed on MCC+SID+NID+BSID, and dupes are not allowed by the index). Those are Mobile Country Code, System ID, Network ID and Base Station ID, and they uniquely identify CDMA cellular towers globally. For the CellLocation table used for GSM Cellular towers the index uses MCC+MNC+LAC+CI and again duplicates aren’t allowed. The GSM keys correspond to Mobile Country Code, Mobile Network Code, Location Area Code and Cell Identity; again these uniquely identify a GSM cell tower globally. The WiFiLocation table is a lot simpler and it uses MAC addresses as a unique index.

Note that the indexes do not allow multiple entries – so these cannot be used for tracking for any practical purpose. They can be used to tell the last time that a particular cell was possibly nearby but the time resolution is a couple of days and the location is accurate to within a couple of km at best.

I’ve checked the data, not that I don’t trust the SQLite indexes but just to be able to say for certain, and all the tables contain only a single entry for each cell location.
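For anyone who wants to repeat that check, here is a rough Python sketch of the idea. It builds a small in-memory table with a unique index on MCC+MNC+LAC+CI (my stand-in for however the real schema enforces uniqueness), shows that a second entry for the same cell is rejected, and then runs the belt-and-braces query that looks for duplicates anyway. The index name and demo values are made up.

```python
import sqlite3


def duplicate_cells(conn):
    """Return (MCC, MNC, LAC, CI, count) for any cell key stored more than once."""
    query = """
        SELECT MCC, MNC, LAC, CI, COUNT(*)
        FROM CellLocation
        GROUP BY MCC, MNC, LAC, CI
        HAVING COUNT(*) > 1
    """
    return conn.execute(query).fetchall()


# Hypothetical stand-in schema: one row per GSM cell, enforced by a unique index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CellLocation (MCC INTEGER, MNC INTEGER, LAC INTEGER, "
    "CI INTEGER, Timestamp REAL)"
)
conn.execute(
    "CREATE UNIQUE INDEX demo_cell_key ON CellLocation (MCC, MNC, LAC, CI)"
)

conn.execute("INSERT INTO CellLocation VALUES (272, 2, 100, 1, 325000000.0)")
try:
    # Same cell seen again at a later time: the index refuses a second row.
    conn.execute("INSERT INTO CellLocation VALUES (272, 2, 100, 1, 325600000.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("second sighting rejected:", rejected)
print("duplicate keys found:", duplicate_cells(conn))
```

Because only one row per cell can exist, the table can at most record the most recent time a cell was stored, never a history of sightings.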

Let me be 100% clear – the data in the CellLocation and WiFiLocation tables on my iPhone is totally unusable for tracking, or for any other advanced analysis of location patterns. As cached data to help speed up Celltower\WiFi AP based location triangulation it is quite useful though.

The spatial quality of these tables is quite dubious too. There are Celltowers in my table that are more than 100km from any location that I’ve been near in the last year and even one entry in Northern Italy that I know for certain I haven’t been near so even at that level this data is unusable in terms of telling where someone has been. For some reason my database also seems to have missed four days that I spent in Spain and I know I had my phone powered on there, and I was using it all the time although that trip to Spain might just be the one location where I never triggered any Location Aware apps.

In fact, given all of the above, the issues with the temporal and spatial accuracy of the main tables, as far as tracking the user is concerned, make me think that someone decided to structure it this way so the data could not be used to track people.

Actual Tracking data.

Of far more interest to me is the user specific location info that seems to be logged in CellLocationLocal and CDMACellLocationLocal. There are far fewer of these (130 or so vs 60k in the celllocation table in my case), but they have accurate timestamps, locations that appear to be accurate to within a few hundred metres and speed\bearing data. There doesn’t appear to be any pattern to the logged data though so I need to do some more work there, but an average of 1 entry every three days isn’t a whole heap of tracking info. Even with location services enabled for 36 hours it only added an extra 20 or so entries.

Out of curiosity I enabled location tracking, and automatic updating, via Google’s native Latitude app for iPhone on Saturday. Google has now gathered a really impressive track of my movements since then that absolutely can be used to see where I’ve been at very accurate temporal and spatial resolutions – here it is catching a number of spots on my 18 minute walk from Hartlands Rd over to The Evergreen just after 9:00PM on Sunday night. You can also see that I had a walk around the Lough earlier on, including a diversion that I took to get a cup of coffee in a shop on Togher St.

[image: Google Latitude map of the tracked walk]

This is an example of pretty useful tracking data but it generated a huge number of entries (about 1000 over a day and a half). If someone really wanted to secretly maintain a useful tracking database then that’s the sort of level of detail they’d need. The main thing I noticed as a result was that my iPhone ran out of juice after about 3 hours which makes it a bit useless.
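To put the two logging rates side by side, here is a quick back-of-the-envelope calculation in Python. The 1000 entries over a day and a half come from the Latitude experiment; for the local table I am assuming the 130 or so entries were spread over roughly a year, which is my reading of the "1 entry every three days" figure rather than something measured.

```python
def entries_per_day(entries, days):
    """Average number of logged entries per day."""
    return entries / days


# Assumption: ~130 CellLocationLocal entries spread over about a year.
local_rate = entries_per_day(130, 365)
# From the Latitude experiment: about 1000 entries in a day and a half.
latitude_rate = entries_per_day(1000, 1.5)

ratio = latitude_rate / local_rate

print(f"local table: one entry every {1 / local_rate:.1f} days")
print(f"Latitude:    {latitude_rate:.0f} entries per day")
print(f"Latitude logs roughly {ratio:.0f} times as often")
```

Under those assumptions the gap is three orders of magnitude, which is the difference between a usable track and a handful of scattered points.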

Quinn, a name that I hate more and more every day.

I was horrified to hear on the news today that the Government (with a straight face) has indicated that it would consider instituting a 2% levy _on consumers_ to pay for the €600m or so shortfall that Anglo Irish Bank (!) and Liberty Mutual say they will not take on board as part of their acquisition. The levels of crazy in this story are beyond belief.

For starters Quinn was one of the Senior Demons in the inner circle of the hell that brought us to where we are today. Sean Quinn’s insane CFD gambling on Anglo Irish shares was a significant trigger in the collapse. Of course there were many others in that same circle but the Quinn group was one of those sick beyond all belief organisations that frankly we should let die to serve as a lesson to all the others. I’ll admit that the jobs are important, but the blatant disregard for basic financial responsibility that the Quinn group practised for years is the sort of thing that needs to be eradicated with extreme prejudice.

The fact that Anglo Irish Bank are involved in this at all beggars belief. That they are doing so in this way is something I just can’t get my head around at all. Seriously WTF are Anglo doing buying an interest in any significant financial business? I thought they were on the road to structured euthanasia.

Who is owed this debt that is so important that I, someone who has never had a Quinn policy in my life, have to pay part of it? Why can’t they take the hit? What happened to the normal rules of capitalism? Did I miss some new government policy while I was flying all over the western hemisphere these past few months?

Seriously – my answer to this is no. You can’t have it from me, I’m not going to pay and if you want to you can put me in prison before I will allow any of my money to be spent to bail out yet another capitalist’s gambling debts. Sean Quinn may be a nice man but his addiction to high stakes gambling should not be my problem, or yours.

Restructuring (Again)

Richard Portes has a very balanced commentary up on VoxEU that clearly outlines why a restructuring of our debt burden is inevitable, and shouldn’t be viewed as either a disaster or a failure of the system. In a very good overview of the situation he also makes the very valid point that some private burden sharing would go a long way towards countering the vast moral hazard situation that has been brought about as a result of the 100% guarantees that investors in Irish banking have been given by the government.

It’s well worth a read although he makes an odd point at the end about the current account deficit that I don’t really understand. We now have the odd situation that we have a current account surplus (Portes’ data is a little out of date) because it reflects trade balances and our export sector is doing particularly well. This helps avoid some liquidity problems depending on what the various parties do with it but that is not _our_ money, or at least not the Government’s money and it will have no real impact on the Government Deficit which is the thing that is giving me sleepless nights.

As I understand it we are currently running a Government Deficit of around €15-18bn a year. I’m not sure precisely because the numbers are hard to find but total annual expenditure is somewhere between €48bn and €51bn and total annual revenue is around €33bn. We have committed to reducing that difference to around €750m-1000m (2%) by 2014 (or thereabouts). That is on top of an existing fiscal adjustment that adds up to around €4.5bn between tax hikes and spending cuts.
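The deficit range follows directly from the expenditure and revenue estimates, and a few lines of Python make the arithmetic explicit. The inputs are just the rough figures quoted above, not official numbers, and the "share of spending not covered by revenue" is my own derived quantity.

```python
# Rough figures from the text, in billions of euro.
expenditure_bn = (48.0, 51.0)   # low and high estimates of annual spending
revenue_bn = 33.0               # approximate annual revenue

# Deficit = expenditure - revenue, for each end of the range.
deficit_bn = tuple(e - revenue_bn for e in expenditure_bn)

# Share of each euro spent that is not covered by revenue.
unfunded_share = tuple(d / e for d, e in zip(deficit_bn, expenditure_bn))

for e, d, s in zip(expenditure_bn, deficit_bn, unfunded_share):
    print(f"spending {e:.0f}bn -> deficit {d:.0f}bn ({s:.0%} of spending)")
```

On these numbers roughly a third of spending is borrowed, which gives a sense of the size of the gap to be closed.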

Our progress so far has been pretty good – few other countries have ever managed to contract this fast while remaining stable – but the remaining gap is, as Portes points out, an heroically ambitious target. It might be possible but _even_ if we manage to do that we will still end up with a debt that is 120% of GDP and interest rates that are likely to be punitive. Future growth and our ability to maintain a stable society are certainly at significant risk if we do as we are asked, and then still find ourselves paying more to service debts than on anything else. 

The difference between what we spend now, and what we need to be spending is more than we spend on Health, Education or Social welfare. In effect we have committed to cuts that could be achieved by totally eliminating one of those. Obviously that won’t happen but it is useful to clarify the scale of the heroic effort that we are going to have to put in in order to get out of this. Debt restructuring won’t make that particular problem go away, but it will make it slightly less difficult to achieve, and it would at least make it possible that we could look forward to a future where a recovery was possible.

Friday 22 April 2011

More detail on Apple’s iPhone Tracking Data

It seems highly likely to me that the tracking data retained by iPhones and (currently) backed up in the consolidated.db database is fairly innocuous.

I pulled together some Perl scripts to analyse the data and have come to the following conclusions:

  • The CellLocation Table (8000 entries) is a subset of Apple’s global cell tower location database that is cached locally. The timestamps in this only make sense if they are part of a batch update from some other source. In my case there are about 100 distinct groups of locations and they don’t match where I’ve been at any level smaller than about 50km and on a timescale of days. There are some really odd entries too – one in northern Italy that I haven’t even flown within 500 km of in the past year.
  • The WiFi Location table (60000 entries) has similar characteristics. It’s got groups of AP locations that also make no sense. My table correctly has a bunch of locations in Bracknell during March but there are also more than a hundred entries much further North at the same time in a curious east <-> west line that only makes sense if there is some caching process that pre-fetches a bunch of these to improve non-GPS positioning performance. All the MAC addresses listed appear to be infrastructure (AP’s) rather than user systems so the general threat from it is pretty limited.
  • The CellLocationLocal Table looks much more like tracking data but it only has 106 entries. All of these have unique timestamps, include altitude, speed and heading information and most importantly all seem to correspond very accurately (to within a few metres) with times and places that make sense even though they claim the accuracy is only 1500m. Thinking carefully about it all of these seem to correspond with times when I had the GPS function enabled. It looks to me as if this is data that Apple might actually be likely to harvest and could be sending home in order to improve their Cell Phone location maps.
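One way to test claims like "nothing closer than about 50km" or "never within 500 km" is to measure the great-circle distance from each logged entry to the places you know you visited. The sketch below is in Python rather than the Perl I actually used, and it is not my original script. It uses the standard haversine formula, and the coordinates are approximate, made-up demo values for Cork, Dublin and a point in northern Italy.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for this purpose


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_known_place_km(entry, known_places):
    """Distance from one (lat, lon) entry to the closest place actually visited."""
    lat, lon = entry
    return min(haversine_km(lat, lon, plat, plon) for plat, plon in known_places)


# Approximate demo coordinates (assumptions, not values from my database).
known_places = [(51.90, -8.47), (53.35, -6.26)]   # roughly Cork and Dublin
odd_entry = (45.46, 9.19)                          # somewhere in northern Italy

distance = nearest_known_place_km(odd_entry, known_places)
print(f"nearest visited place is about {distance:.0f} km away")
```

Running every row of a table through `nearest_known_place_km` and sorting by the result quickly surfaces the entries that cannot correspond to anywhere the phone has been.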

Having discussed this with some folks and after reading the online commentary I think it’s pretty safe to say that this looks like a storm in a teacup. There is certainly some tracking data in the file, but its privacy risks are a lot less than they seemed. Apple don’t appear to be gathering location data that includes the DeviceID or any other details that could uniquely identify the user or phone. They could be sending that back along with such an identifier but I doubt that they would – having these tables structured in this way only makes sense if the primary purpose is to enable a local cache for quick location lookups.

Ultimately I’m a bit disappointed – I’d wanted to have a reliable record of where I’d been over the last year – but on the plus side it really doesn’t look as if Apple are harvesting data that could be used to spy on me. We’ll have to see what comes out on this over the next few days but I’m not losing any sleep over this.

Wednesday 20 April 2011

iPhone Location Tracking

The subject of iPhones tracking their owners’ locations has hit the Interwebs again because Pete Warden and Alasdair Allan just published a rather nifty application that extracts and maps out the location logs that iOS keeps.

Their application uses the location info that is acquired using cell tower triangulation which is of dubious accuracy (about 500m for me for the most part) and seems to have something weird going on with its timestamps in my case, but that might just be something strange about either my iPhone or O2’s network. They get all of this from the CellLocation table within the iOS database that Apple use to manage most of the OS system configs and functions (Consolidated.db). Finding the Database is a minor challenge given the instructions provided. Extracting the data is the work of a few clicks after that.
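Once you have a copy of the file, any SQLite client will open it. As a minimal sketch in Python, the snippet below lists the tables in a database and counts the rows in each. It runs here against an in-memory stand-in with two made-up tables; for the real thing you would pass `sqlite3.connect()` the path to your own copy of the file, which I am deliberately not guessing at.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def row_counts(conn):
    """Return a dict mapping each table name to its number of rows."""
    return {
        table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        for table in list_tables(conn)
    }


# In-memory stand-in; the table layouts here are hypothetical and minimal.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CellLocation (MCC, MNC, LAC, CI, Timestamp)")
conn.execute("CREATE TABLE WifiLocation (MAC, Timestamp)")
conn.execute("INSERT INTO WifiLocation VALUES ('00:11:22:33:44:55', 325000000.0)")

for table, count in row_counts(conn).items():
    print(f"{table}: {count} rows")
```

Work on a copy rather than the original backup file, so that nothing you do while poking around can upset a later restore.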

Interestingly they don’t mention the fact that there is dramatically richer and far better GPS derived data in the Location table, which has far fewer entries and only seems to log data when you are actively using a GPS app.

And there is a WiFiLocation table for those times that you have WiFi enabled that has logged about 8 times as much data as I have for Cell towers. That particular table intrigues me because it contains MAC addresses, and there appear to be lots of them (120k entries in that table). Interestingly it only records MAC addresses (along with location data and timestamps) not SSIDs, which confirms one of my older assertions that SSIDs are useless for location tracking. I’m going to take a look at that table in much more detail to see whether I’ve been harvesting thousands of MAC address\Location combos silently over the past year.

It would be very interesting to know whether Apple extracts any of this data, and if so what it does with it. Kim Cameron had a lot to say about the risks of this last year when he made some fairly insightful remarks about the massive privacy holes in Apple’s Policy. At that point we were only talking about Apple gathering the user’s own device ID but I am at a loss to explain why Apple would have the phones log all of this location data if they did not intend to harvest it.

Funnily enough all three location tables also have a corresponding “Harvest” table but they are currently empty, on my iPhone at least. Perhaps they have plans for some future capabilities.

I’m thinking of putting something together that will allow us poor Windows users to get the data poured into a nice Map interface. Pete and Alasdair’s version is OSX only at the moment so I’m making do with Perl and Google Apps to see what data SkyNet has been collecting on me. It’s not very accurate as I noted above but it’s logged all 20 of my international trips over the past couple of months at some level.
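As a rough idea of what the map export could look like, here is a small sketch that turns (label, latitude, longitude) rows into a KML document, which mapping tools such as Google Earth can open. It is in Python rather than my Perl, it is not the tool I have in mind, and the function name and demo rows are made up for illustration.

```python
from xml.sax.saxutils import escape


def rows_to_kml(rows):
    """Build a minimal KML document from (label, latitude, longitude) rows."""
    placemarks = []
    for label, lat, lon in rows:
        # KML coordinates are written longitude first, then latitude.
        placemarks.append(
            "    <Placemark>\n"
            f"      <name>{escape(str(label))}</name>\n"
            f"      <Point><coordinates>{lon},{lat}</coordinates></Point>\n"
            "    </Placemark>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Document>\n"
        + "\n".join(placemarks)
        + "\n  </Document>\n</kml>\n"
    )


# Made-up demo rows; real rows would come from a query on one of the tables.
demo_rows = [("Cork", 51.9, -8.47), ("Dublin", 53.35, -6.26)]
kml = rows_to_kml(demo_rows)
print(kml)
```

Writing the returned string to a `.kml` file is all that is needed to load the points into a map viewer.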