Google sees privacy threats

August 11, 2006

Privacy has taken center stage again with the AOL debacle, in which the company accidentally released search data that could be traced back to individuals. While it’s probably true that some firms are better than others at preventing this sort of leak, I think one of the more important factors is that this information is being collected at all.

It’s not really surprising that this sort of thing is going on. Most marketers would probably sacrifice their firstborn to get a peek at the browsing behaviors of their target markets. There’s the obvious direct marketing route whereby a user who’s been searching for ‘diaper rashes’, ‘milk formula’, or ‘child seats’ could be bombarded by ads for baby products. But there’s also the ability to mine the data further to discern patterns and trends like age groups, locations, brands, environmental correctness, political affiliations, income group, etc. Collecting this sort of data reliably over time can really only happen using a unique ID (such as a cookie) or a login.

For the most part, this data gets tied to an anonymous entity or, in the case of a login, to whatever information you choose to give. However, things get a little closer to home when you start searching for personally identifiable items. Names of friends, family, co-workers, or the amazingly common vanity search can quickly narrow down the identity of the anonymous entity. Ever type in your address? You probably have if you’ve ever gotten directions from a mapping service.
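
To make the mechanism concrete, here is a minimal sketch of how a server-side log keyed on a unique ID turns scattered searches into a profile. Everything in it is made up for illustration: the cookie IDs, the queries, and the `build_profiles` helper are assumptions, not anything taken from AOL’s data or any real search engine.

```python
from collections import defaultdict

# Hypothetical search log: (cookie_id, query) pairs. All values are invented.
SEARCH_LOG = [
    ("c-1001", "diaper rashes"),
    ("c-2002", "cheap flights"),
    ("c-1001", "milk formula"),
    ("c-1001", "child seats"),
    ("c-1001", "directions from 12 example street"),
]


def build_profiles(log):
    """Group queries by the unique ID that accompanied each request."""
    profiles = defaultdict(list)
    for cookie_id, query in log:
        profiles[cookie_id].append(query)
    return dict(profiles)


profiles = build_profiles(SEARCH_LOG)

# Each ID is "anonymous", yet its query history may say a lot about the person.
for cookie_id, queries in profiles.items():
    print(cookie_id, queries)
```

No single query here identifies anyone, but once they are grouped under one ID, the baby-product searches and the address sit side by side.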

But does it matter? Ultimately it just depends on how much you value your privacy; some people live life with the windows open and the lights on, while others prefer a little more discretion. But because these are businesses we’re talking about, you can be sure that this information will be put to commercial use, whether for advertising on behalf of others or for gaining a competitive advantage of their own. In the end, privacy will always be a potential problem with server-based systems that can identify you by an IP address, a cookie, or a login.

Fact200, being a thin client application, can help alleviate these concerns. Of course, when operating as a thin client, it must connect to a server and is thus susceptible to some tracking. However, Fact200 takes great pains to connect to servers without using persistent cookies. This prevents tracking through most means except IP addresses. If you use DSL or cable, chances are good that you have an IP address that is both shared and dynamic, which makes identification difficult. You can go a step further and route your traffic through an onion routing network like Tor.

Conversely, with servers and integrated services, there’s almost always a path to trace: e-mail provides a good consistent ID to start with, search provides general traits, and mapping provides location.

Fact200 also maintains its own independent data library that resides on your local computer. Searches through this library do not go over the Internet and are thus untraceable. Even though all of the data originally comes from the Internet, correlating any sort of search behavior would be difficult. For example, one might do a broad search for ‘porn’ which, without cookies, would already be hard to trace. But one might then do a library-only search for ‘midget’, thus hiding the originally intended search from the Internet.

Data sources like news sites or social bookmarking sites can also be very broad, making it difficult to ascertain any distinguishing individual characteristics. Ultimately, the more often you search within the library, the closer you will be to absolute privacy.
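
The idea of narrowing a search locally can be sketched in a few lines. This is only an illustration under stated assumptions: the `LocalLibrary` class, its methods, and the sample documents are hypothetical and are not Fact200’s actual implementation. The point is simply that the second, more specific query never leaves the machine.

```python
class LocalLibrary:
    """Hypothetical in-memory store of documents already fetched from the Internet."""

    def __init__(self):
        self._docs = []  # list of (title, text) tuples

    def add(self, title, text):
        """Save a fetched document to the local library."""
        self._docs.append((title, text))

    def search(self, term):
        """Case-insensitive substring search; touches no network at all."""
        term = term.lower()
        return [
            title
            for title, text in self._docs
            if term in title.lower() or term in text.lower()
        ]


# Pretend these came back from one broad, hard-to-characterize fetch of a news site.
library = LocalLibrary()
library.add("Markets rally", "Stocks rose sharply on Friday.")
library.add("Election preview", "Voters head to the polls next week.")
library.add("Weather outlook", "Rain is expected across the region.")

# The specific interest is only ever expressed locally.
matches = library.search("election")
print(matches)
```

A remote server would see only the broad fetch; the narrower term exists solely in the local search call.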
