Sunday 6 June 2010

Kim Cameron takes on Google’s StreetView

I’ve been following Kim Cameron’s increasingly critical analysis of Google’s StreetView WiFi mapping privacy debacle with some interest of late.

Some background might be in order for those interested in reading where he’s been coming from – start here and work forward. He’s been quite vocal and pointed in his criticism, and I have been surprised that his focus has been almost entirely on Google rather than on the underlying technical root cause. My initial view was that the whole thing was a stupid over-reaction to something that everyone has been doing for years, and that at least Google were being open about having logged too much data. I’m still of the opinion that targeting Google specifically is off base, although I think Kim is right that there is a fundamental problem here.

Kim is probably the pre-eminent proponent and defender of strong authentication and privacy on the net at the moment. His Laws of Identity should be mandatory reading for anyone working with user data in any sort of context, but especially for anyone working with online systems. He’s a hugely influential thought leader for doing the right thing, and as a key technical leader within Microsoft he’s doing more than almost anyone else to lay the groundwork for a move away from our current reliance on insecure, privacy-leaking methods of authentication. Let’s just say that I’m a fan.

For obvious reasons he has spotted the huge privacy problems associated with the practice of gathering WiFi SSIDs and MAC addresses and using them to create large scale geo-location databases. There are serious privacy issues here and, despite my initial cynicism, perhaps it’s a good thing that there has been a huge furore over what Google were doing.

Note that there were two issues in play here – the intentional data (the SSIDs, MAC addresses and geo-location info) and the unintentional data (actual user payloads). I’m only going to talk about the intentionally harvested data right now because that is the much trickier problem – few people would argue that having Google (or anyone) logging actual WiFi traffic from their homes is OK.

The problem that I see with Kim’s general position on this, and with the focus on Google’s activities alone, is that he’s not seeing the wood for the trees. The problem of companies or individuals harvesting this data is minor compared to the problem that enables it. The technical standards that we all use to connect wirelessly with the endless array of devices that we now have in our homes, use at work and carry on our person every day are promiscuous communicators of identifiers that can be easily and extensively misused. Even if Google are prevented by law from doing it, if the standards aren’t changed then someone else will.

First, some history is in order. Google aren’t the first to do this, not by a long shot. Google are the first to admit that they have harvested more data from these signals than just the base identifiers, but I’d bet that the other players did too, and many are probably still doing it. Skyhook were the first to exploit the idea commercially as far as I can tell, and they have been partnering with Apple (and Yahoo amongst others) since at least 2008 to provide the fruits of this data to iPhone users. The geo-location capability of iPhones that was available prior to the release of the 3G, and that is still used when GPS data is poor, uses that data. Navizon provide Microsoft Live with their data – conveniently described as crowd sourced – which has a similarly dubious provenance. All three systems use a combination of WiFi SSID\MAC and Cellular Phone Base Station IDs to provide geo-location in the ±100m range. Cisco provide techniques for companies to leverage their WiFi infrastructure to do something similar in reverse – their WiFi management consoles allow administrators to track the physical location of individual devices (i.e. people) within large sites with a high level of accuracy – IIRC Cisco claim to be able to give location data down to the sub 5m range. This is an excellent covert tracking mechanism and a surveillance technique that is certainly in common use. Wardriving tools (like Kismet, which Google modified for their StreetView scanners, AirScanner, NetStumbler...) have been around since WiFi first became practical. That the technology enabled these sorts of uses is not a sudden revelation. None of this is to claim that any of this is OK mind you, just that it is a blatantly obvious side effect of the technical standard and it will be used like this as a result.
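To make the mechanics concrete, here is a minimal sketch of how a lookup against one of these databases might work. It is not how Skyhook, Navizon or Google actually do it (I don't know their algorithms); it is a plain weighted average, and the database contents, the `AP_LOCATIONS` name and the `estimate_position` function are all made up for illustration.

```python
# Hypothetical survey database: access point MAC (BSSID) -> (latitude, longitude)
# as logged by a passing scanner. Every value here is invented for illustration.
AP_LOCATIONS = {
    "00:11:22:33:44:55": (51.5000, -0.1200),
    "66:77:88:99:aa:bb": (51.5004, -0.1210),
    "cc:dd:ee:ff:00:11": (51.5002, -0.1190),
}


def estimate_position(scan):
    """Estimate (lat, lon) from a scan given as {bssid: signal strength in dBm}.

    Known access points are averaged, weighted by linear received power, so
    stronger (probably nearer) access points count for more. Returns None if
    nothing in the scan appears in the database.
    """
    total = lat_sum = lon_sum = 0.0
    for bssid, rssi_dbm in scan.items():
        location = AP_LOCATIONS.get(bssid.lower())
        if location is None:
            continue  # an access point the survey never saw
        weight = 10 ** (rssi_dbm / 10.0)  # convert dBm to milliwatts
        lat_sum += weight * location[0]
        lon_sum += weight * location[1]
        total += weight
    if total == 0:
        return None
    return lat_sum / total, lon_sum / total


if __name__ == "__main__":
    # A device that can hear two surveyed access points and one unknown one.
    print(estimate_position({
        "00:11:22:33:44:55": -50,
        "66:77:88:99:AA:BB": -70,
        "de:ad:be:ef:00:00": -40,
    }))
```

The point is how little is needed: a table keyed on an identifier that every access point announces, and a scan of whatever happens to be in range.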

Kim very rightly points out that just because Skyhook and others did it before them does not absolve Google of responsibility if what they did was an invasion of privacy. It would help, though, if he pointed out that all the major OS vendors are using exactly the same techniques and that they are all equally guilty here. That Google have patent applications based on novel [ab]uses of broadcast WiFi signals is no real surprise – I know Intel have had similar things in the pipeline in the past and I’d be shocked if all the other major companies had missed out on that trick.

Anyway, the reason all of these companies have used this data is that the 802.11 (and 3GPP\WCDMA\EVO cellular) standards make no attempt to secure these things. In fact the current software\hardware stacks actively discourage users from disabling the broadcast features. Kim’s employer’s flagship end user OS, Windows 7, goes so far as to warn you that you are taking a security risk if you set up an access point that does not broadcast its SSID.

Wireless technologies work by broadcasting data. WiFi uses frequencies that have an effective range of tens of meters in congested, dense buildings and tens of miles in open air. That in itself is no excuse for unauthorised third parties to intercept that data, but if the standard is implemented so poorly that a blind chicken cannot fail to have the data presented to it, then there is no use telling anyone not to make use of it. If it is problematic then the technology should not enable it in the first place. While laws can prevent the likes of Google and Skyhook from harvesting this data in countries that care to be strict about such things, that is no solution to the problem. Kim makes a sincere argument about this that, in my view, totally misses the point – under a strict reading of some fairly outdated legislation, anyone logging their neighbour’s WiFi SSID could be guilty of a criminal offence in some jurisdictions. That may or may not be true, I’m not a lawyer, but in any event the fundamental problem is that it is not possible for me to prevent you logging that data, even if I wanted to, without denying myself the benefits of using the technology I’ve paid for. And a law that says you shouldn’t is no way to protect me from you.

Kim’s hypothetical child molester can stalk a child using a WiFi adaptor’s MAC address because the people who wrote the operating systems and defined the WiFi standards allow the device to leak that data over the air to systems that are untrusted. Google’s [alleged] misuse of the data is a minor issue compared to the failure of those who invented and ratified the standards.
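For anyone who wants to see just how exposed these identifiers are, the sketch below builds a bare-bones 802.11 beacon frame and then reads the BSSID and SSID straight back out of it with nothing more than byte offsets. It is an illustration under simplifying assumptions: the frame is synthetic, the radiotap header and frame check sequence that a real capture would carry are left out, and `build_beacon` and `read_beacon` are my own hypothetical helper names. The same unencrypted header layout carries a client adaptor's MAC address in the frames it transmits, whether or not the payload is encrypted.

```python
import struct


def build_beacon(bssid: bytes, ssid: str) -> bytes:
    """Build a minimal synthetic beacon frame (no radiotap header, no FCS)."""
    header = struct.pack(
        "<HH6s6s6sH",
        0x0080,       # frame control: management frame, subtype 8 (beacon)
        0,            # duration
        b"\xff" * 6,  # address 1: destination, broadcast
        bssid,        # address 2: source (the access point)
        bssid,        # address 3: BSSID
        0,            # sequence control
    )
    # Fixed beacon fields: timestamp, beacon interval, capability information.
    fixed = struct.pack("<QHH", 0, 100, 0x0001)
    ssid_bytes = ssid.encode("utf-8")
    ssid_element = bytes([0, len(ssid_bytes)]) + ssid_bytes  # element ID 0 = SSID
    return header + fixed + ssid_element


def read_beacon(frame: bytes):
    """Return (bssid, ssid) from a beacon frame, or None if it is not a beacon."""
    (frame_control,) = struct.unpack_from("<H", frame, 0)
    frame_type = (frame_control >> 2) & 0x3
    subtype = (frame_control >> 4) & 0xF
    if frame_type != 0 or subtype != 8:
        return None
    bssid = ":".join(f"{b:02x}" for b in frame[16:22])  # address 3
    ssid = None
    pos = 36  # 24-byte MAC header + 12 bytes of fixed beacon fields
    while pos + 2 <= len(frame):
        element_id, length = frame[pos], frame[pos + 1]
        if element_id == 0:
            ssid = frame[pos + 2:pos + 2 + length].decode("utf-8", errors="replace")
            break
        pos += 2 + length  # skip to the next tagged element
    return bssid, ssid


if __name__ == "__main__":
    frame = build_beacon(bytes.fromhex("001122334455"), "HomeNet")
    print(read_beacon(frame))
```

No key, no handshake and no association is involved at any step, which is exactly the problem.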

The reasons why this is the case are not trivial. It’s not simply that the people involved didn’t know how to make things secure, or that they didn’t care. The reality is that the WiFi standard we have now is a trade-off in which those security aspects were not a priority. It’s probably about time that we made them one, and I’d be very happy to see the current crusade move on from focussing on just Google and go after the IEEE and the 802.11 standards body.
