Tuesday 8 June 2010

So how much does a MAC address tell about you?

Kim Cameron responded to my previous post making some very good point – you shouldn’t just dismiss the parts that can’t be solved by fixing the technology.

I’d discounted the payload snooping issue as a distraction because I’d believed (and still do) that it was almost certainly an unfortunate error. I’d then made the point that a legal barrier to a technical problem was insufficient to prevent the bad guys doing bad things but I used that as an excuse to ignore the problem – small scale abuses of this sort of thing are not good but systematic large scale abuses “benefit” from network scaling effects. You might not be able to prevent small scale\illegal abuse through legal means but just because you can’t does not mean that you can’t control large scale abuses this way. The benefits and dangers inherent in this data become exponentially worse as the scale of the database that contains it increases. Large scale means companies and companies react to regulation by being much more careful about what they do. If a technology that is already out there has major privacy issues the regulatory approach is the only way to keep a lid on the problem while the technologists argue about how to fix the bits. Even if we assume that the law was OK about companies creating Geo-location databases using WiFi SSID\MAC mapping, effective regulation would have made the additional mistake made by Google (assuming it was a mistake) much less likely.

Now the obvious question is should scanning for identifiers that are broadcast openly by all WiFi radio signals be acceptable and legal?

802.11 WiFi signals are pretty complex things - Wikipedia has a brief overview here for those who want to see the alphabet soup of standards involved. Despite the range of encoding\modulation schemes and the number of frequency bands and channels almost all 802.11 devices revert to a couple of basic communication modes. This makes it easy for devices to connect to each other, and it’s what makes public WiFi hotspots practical. However it also makes configuring a device to monitor WiFi traffic trivially easy – the hardware does all the heavy lifting and the standards don’t really do anything to stop it happening. An important feature of WiFi is that, even though the payload encryption standards can now be pretty robust, the data link layer is not protected from snooping. This means that the content (my Google searches, the video clip I’m streaming down from Youtube etc) can be pretty well kept away from prying eyes but, at what the Ethernet folks call layer 2, the logical structures called frames that carry your encrypted data transmit some control data in the open.

So even with WPA2’s thorough key management and AES encryption your WiFi traffic still contains quite a bit of chatter that isn’t hidden away. The really critical thing for me is that the layer 2 addresses, the Media Access Control (MAC) addresses, of the sender and receiver (generally your PC\Phone’s WiFi adaptor and your Access Point) for each frame are always visible. And remember that MAC addresses are globally unique identifiers by design. Individual WiFi networks are defined by another identifier, the Service Set Identifier or SSID – when you set up your home WiFi AP and call the network “MyWLAN” you are choosing an SSID. SSID’s are very important, you can’t connect to a wireless LAN without knowing the relevant SSID, but they are not secure even though they can be sort of hidden they are never protected and can always be seen by someone just watching your wireless traffic. Interestingly SSID’s are not globally unique – there’s generally no real issue so long as my chosen SSID doesn’t match that of another network that’s relatively close by.

So SSID’s are possibly visible but MAC addresses are definitely visible, and MAC addresses are unique. While driving along a street or sitting in a coffee shop, hotel lobby or conference room your WiFi adaptor will see dozens if not hundreds of WiFi packets all of which will contain globally unique MAC addresses. It is possible to hack some WiFi hardware to change the MAC address but that practice is rare. Your PC has a couple (one for the wired Ethernet adaptor which isn’t important here, and usually one for WiFi these days), your Wii\PS3\XBox-360 has one, so does your Nintendo DS, iPhone, PSP … you get the picture. Another feature of MAC addresses is that it is very easy to differentiate between the MAC address of a Linksys Access Point, an iPhone and a Nintendo DS – Network protocol analyzers have been doing that trick for decades.

So the systematic scanners out there (Google, Navizon, Skyhook and the rest) can drive around or recruit volunteers and gather location data and build databases of unique identifiers, device types,  timestamps, signal strengths and possibly other data. The simplest (and most) benign use of that would be to pull out the ID’s of devices that are known to be fixed to one place (Access Points say) and use that for enabling Geo-location.

It’s not a big leap to also track the MAC addresses that are more mobile. Get enough data points over a couple of months or years and the database will certainly contain many repeat detections of mobile MAC addresses at many different locations, with a decent chance of being able to identify a home or work address to go with it. Kim Cameron describes the start of this cascade effect in his most recent post, mapping the attendees at a conference to home addresses even when they’ve never consented to any such tracking is not going to be hard if you’ve gone to the trouble of scanning every street in every city in the country. With a minor bit of further analysis the same techniques could be used to get a good idea of the travel or shopping habits of almost everyone sitting in an airport departure lounge or the home addresses of everyone participating in a Stop The War protest.

And remember that even though you can only effectively use WiFi to send and receive data over a range of a few 10’s to maybe a 100m you can detect and read WiFi signals easily from 100’s to 1000’s of metres away without any special equipment.

The plans to blanket London with “Free WiFi” start to sound quite disturbing when you think about those possibilities.

To answer my own title question – MAC addresses can tell far more about you than you think and keeping databases of where and when they’ve been seen can be extremely dangerous in terms of privacy.

What about Bluetooth?

Bluetooth is a slightly different animal. It’s also a short range radio standard for data communications but it was developed from the ground up to replace wires and the folks building the standard got a lot of stuff right. It doesn’t appear to be all that bad from a privacy leakage perspective – when implemented correctly nothing is sent in clear text (the entire frame is encoded, not just the payload) and the frequency hopping RF behaviour makes it much harder to casually snoop on specific conversations. Bluetooth devices have a Bluetooth Device ID that is very like a MAC address (48 bits), with a manufacturer ID that enables broad classification of devices if the ID can be discovered but most Bluetooth devices keep that hidden most of the time by defaulting to a “not visible” mode even when Bluetooth is enabled. When actively communicating (paired) all data is encrypted so the device ID’s are not visible to a third party. Almost all modern Bluetooth devices only allow themselves to remain openly visible in this way for a short period of time before they revert to a safer non broadcasting mode. The main weakness is that when devices are set to “visible” the unique identifiers and other data can be scanned remotely and used in just the same way as scanned WiFi MAC addresses. That’s not to say that Bluetooth doesn’t have its share of security problems but they made an attempt to get some of the fundamentals right. It does also show that there is a practical way to approach the wireless privacy challenge which is good to see.

2 comments:

Dave Berry said...

"I’d discounted the payload snooping issue as a distraction because I’d believed (and still do) that it was almost certainly an unfortunate error."

Do you think this programming decision might reflect internal culture at Google? They had the choice of collecting only the data they believed was needed for the task at hand, or of collecting as much data as possible and then extracting what they were interested in at that time. The latter seems to be Google's general approach to data collection. So perhaps when the programming team were faced with the decision, they (possibly unconsciously) chose the approach most in line with Google's practices elsewhere.

If this hypothesis is true (and we can't tell), then your point about the general effect of a regulatory environment would be even more relevant.

Silvester Norman said...

MAC Address is considered as Identity of a computer system it acts as a important parameter to secure internet connections.

Thanks
Silvester Norman

Change Mac Address