Today the New York Times Opinionator blog ran a piece by Robert Wright made the following assertion about HTML5:
In principle, HTML 5 will allow sites you visit to know your physical location and will make it easier for them to keep track of your browsing and shopping history.
That assertion is based on this news article from the Times, which says:
In the next few years, a powerful new suite of capabilities will become available to Web developers that could give marketers and advertisers access to many more details about computer users’ online activities. Nearly everyone who uses the Internet will face the privacy risks that come with those capabilities, which are an integral part of the Web language that will soon power the Internet: HTML 5.
All of this talk is about one piece of HTML5, client storage. For the details, check out Mark Pilgrim’s chapter on local storage in Dive Into HTML5.
There are two points to make. The first is that Web sites won’t have access to any information that they don’t have already already. In that sense, the talk about “access to many more details” is misleading. It’s not that Web sites will have access to new information, but rather that they’ll have a new place to store information that they already collect that may make it more convenient for them.
For example, if I don’t share my current location with FourSquare, they won’t suddenly be able to retrieve it if I use a browser that supports local storage. However, if I do give them access to my current location, they could store it in local storage on my own computer rather than using their own resources to store it on their server. In that sense, the information may suddenly be worth storing and easier to access, but it’s information they could already obtain and store on their own servers if they chose to do so. This aspect of local storage subjects users to no real risk beyond the risk already posed by cookies or other vectors for storing information about users.
What’s really gotten people wound up is evercookie (mentioned in the New York Times story), a proof of concept that demonstrates how the variety of ways Web sites can store information on the client can be exploited so that it’s nearly impossible to delete tracking cookies. Browser cookies are one way to store information on the client, as is local storage. Flash Local Shared Objects (also known as Flash cookies) can also store information on behalf of Web sites on your computer. evercookie uses a number of other methods for storing information as well. The nefarious thing about it is that when the information is deleted in one of these locations, evercookie replicates it again from another location where it is still stored. So if I delete my browser cookie, evercookie will copy that information from Flash and put it back in place. If I delete the Flash cookie, it will look in one of the other locations where it stashes information and copy it back again.
Using tricks like this to make it difficult for users to prevent Web sites from tracking them is unethical. Web sites who take this approach should be classified as spyware. But the existence of these techniques has nothing to do with HTML5.
What concerns me is that we’re on a path toward HTML5 being perceived negatively by regular users because the only thing they’ve heard about it is that it is likely to compromise their privacy. This perception could become a major stumbling block on the road to wider usage of browsers with HTML5 support. As developers, it’s important to educate users and perhaps more importantly, the media, so that people don’t conjure up risks where they don’t exist and damage the HTML5 brand in the process.