rc3.org Strong opinions weakly held

Posts Tagged ‘security’

Using HMAC to authenticate Web service requests

One weakness of many Web services that require authentication, including the ones I’ve built in the past, is that the username and password of the user making the request are simply included as request parameters. Alternatively, some use basic authentication, which transmits the username and password in an HTTP header encoded using Base64. Basic authentication obscures the password, but doesn’t encrypt it.

This week I learned that there’s a better way: using a Hash-based Message Authentication Code (or HMAC) to sign service requests with a secret key shared between the client and the server. An HMAC is the product of a hash function applied to the body of a message along with that secret key. So rather than sending the username and password with a Web service request, you send some identifier for the key and an HMAC. When the server receives the request, it looks up the user’s secret key and uses it to create an HMAC for the incoming request. If the HMAC submitted with the request matches the one calculated by the server, then the request is authenticated.
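Here’s a minimal sketch of that sign-and-verify exchange in Python, using only the standard library. The KEYS lookup table and the "client-42" identifier are made up for illustration; a real service would keep its keys somewhere safer than a dictionary.

```python
import hashlib
import hmac


def sign(secret_key: bytes, message: bytes) -> str:
    """Return the hex-encoded HMAC-SHA1 of message under secret_key."""
    return hmac.new(secret_key, message, hashlib.sha1).hexdigest()


def verify(secret_key: bytes, message: bytes, submitted: str) -> bool:
    """Recompute the HMAC on the server and compare in constant time."""
    expected = sign(secret_key, message)
    return hmac.compare_digest(expected.encode(), submitted.encode())


# Hypothetical key store: key identifier -> shared secret.
KEYS = {"client-42": b"key"}

body = b"The quick brown fox jumps over the lazy dog"
signature = sign(KEYS["client-42"], body)  # computed by the client

# The server looks up the secret by identifier and checks the signature.
print(verify(KEYS["client-42"], body, signature))         # True
print(verify(KEYS["client-42"], body + b"!", signature))  # False
```

The comparison goes through hmac.compare_digest rather than == so that the time taken doesn’t reveal how many leading characters of a guessed signature were right.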

There are two big advantages. The first is that the HMAC allows the server to verify that the client knows the password (or secret key) without requiring the user to embed it in the request, and the second is that the HMAC also verifies the basic integrity of the request. If an attacker manipulated the request in any way in transit, the signatures would not match and the request would not be authenticated. This is a huge win, especially if the Web service requests are not being made over a secure HTTP connection.

There’s one catch that complicates things.

For the signatures to match, not only must the secret keys used at both ends of the transaction match, but the message body must also match exactly, byte for byte. URL encoding is somewhat flexible. For example, you may choose to encode spaces in a query string as %20. I may prefer to use the + character. Furthermore, in most cases browsers and Web applications don’t care about the order of HTTP parameters.

foo=one&bar=two&baz=three

and

baz=three&bar=two&foo=one

are functionally the same, but the crypto signature of the two will not be.
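One way around this is for both sides to agree on a canonical form before signing. The sketch below is an assumption about how that might look, not a rule taken from any particular spec: decode every pair, re-encode it one fixed way, and sort. The function names are invented for the example.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, quote


def canonical_query(query: str) -> str:
    """Decode the pairs, re-encode them one fixed way, and sort them."""
    # parse_qsl decodes both %20 and + to a plain space.
    pairs = parse_qsl(query, keep_blank_values=True)
    encoded = [(quote(k, safe=""), quote(v, safe="")) for k, v in pairs]
    return "&".join(f"{k}={v}" for k, v in sorted(encoded))


def sign_query(secret_key: bytes, query: str) -> str:
    """HMAC-SHA1 over the canonical form rather than the raw string."""
    message = canonical_query(query).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha1).hexdigest()


key = b"Secret Key"
a = sign_query(key, "foo=one&bar=two&baz=three")
b = sign_query(key, "baz=three&bar=two&foo=one")
print(a == b)  # True
```

With this in place the two orderings above sign identically, and so do q=a+b and q=a%20b. The price is that both ends have to implement exactly the same canonicalization.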

Another open question is where to store the signature in the request. By the time the request is submitted to the server, the signature derived from the contents of the request will be mixed in with the data that is used to generate the signature. Let’s say I decide to include the HMAC as a request parameter. I start with this request body:

foo=one&bar=two&baz=three

I wind up with this one:

foo=one&bar=two&baz=three&hmac=de7c9b8 ...

In order to calculate the HMAC on the server, I have to remove the incoming HMAC parameter from the request body and calculate the HMAC using the remaining parameters. This is where the previous issue comes into play. If the HMAC were not in the request, I could simply calculate the signature based on the raw incoming request. Once I start manipulating the incoming request, the chances of reconstructing it imperfectly rise, possibly introducing cases where the signatures don’t match even though the request is valid.
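A sketch of one way to keep that reconstruction as safe as possible: cut the signature parameter out of the raw string and leave everything else byte-for-byte as it arrived, rather than parsing and rebuilding the parameters. The parameter name hmac follows the example above; the function names are hypothetical.

```python
import hashlib
import hmac


def split_signature(raw_body: str, name: str = "hmac"):
    """Separate the signature parameter from the rest of the raw body.

    The remaining parameters are rejoined exactly as received, so their
    order and encoding are never touched.
    """
    kept, signature = [], None
    for part in raw_body.split("&"):
        key, _, value = part.partition("=")
        if key == name:
            signature = value
        else:
            kept.append(part)
    return "&".join(kept), signature


def is_authentic(secret_key: bytes, raw_body: str) -> bool:
    """Recompute the HMAC over everything except the signature itself."""
    signed_part, signature = split_signature(raw_body)
    if signature is None:
        return False
    expected = hmac.new(secret_key, signed_part.encode("utf-8"),
                        hashlib.sha1).hexdigest()
    return hmac.compare_digest(expected.encode(), signature.encode("utf-8"))


key = b"Secret Key"
body = "foo=one&bar=two&baz=three"
sig = hmac.new(key, body.encode("utf-8"), hashlib.sha1).hexdigest()
print(is_authentic(key, body + "&hmac=" + sig))  # True
```

This only works if the client signs the parameters in the same order and encoding it sends them in, which is a reasonable thing to require of a client.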

This is an issue that everyone implementing HMAC-based authentication for a Web service has to deal with, so I started looking into how other projects handled it. OAuth uses HMAC, with the added wrinkle that the signature must be applied to POST parameters in the request body, query string parameters, and the OAuth HTTP headers included with the request. For OAuth, the signature can be included with the request as an HTTP header or as a request parameter.

This is a case where added flexibility in one respect puts an added burden on the implementor in others. To make sure that the signatures match, OAuth has very specific rules for encoding and ordering the request data. It’s up to the implementor to gather all of the parameters from the query string, request body, and headers, get rid of the oauth_signature parameter, and then organize them based on rules in the OAuth spec.

Amazon S3’s REST API also uses HMAC signatures for authentication. Amazon embeds the user’s access key ID (the public identifier for the secret key) and HMAC signature in an HTTP header, eliminating the need to extract it from the request body. In Amazon’s case, the signed message is assembled from the HTTP verb, metadata about the resource being manipulated, and the “Amz” headers in the request. All of this data must be canonicalized and added to the message data to be signed. Any bug in the translation of those canonicalization rules into your own code means that none of your requests will be authenticated by Amazon.com.

Amazon uses the Authorization header to store the access key ID and HMAC. This is also the approach that Microsoft recommends. I think it’s superior to the parameter-based approach taken by OAuth. Note that the Authorization header is part of the HTTP specification, and if you’re going to use it, you should do so in a way that complies with the standard.

For my service, which is simpler than Amazon S3 or OAuth, I’ll be using the Authorization header and computing the HMAC based on the raw incoming request.
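As a rough sketch of what that could look like, here is a Python version that signs the raw body and carries the result in the Authorization header. The scheme name HMAC-SHA1 and the keyid and signature parameter names are assumptions made up for this example; they are not registered with anyone and are not what Amazon or OAuth use.

```python
import hashlib
import hmac
import re

SCHEME = "HMAC-SHA1"  # hypothetical scheme name, not a registered one


def build_authorization(key_id: str, secret_key: bytes, raw_body: bytes) -> str:
    """Client side: sign the raw body and format the header value."""
    signature = hmac.new(secret_key, raw_body, hashlib.sha1).hexdigest()
    return f'{SCHEME} keyid="{key_id}", signature="{signature}"'


def check_authorization(header: str, raw_body: bytes, keys: dict) -> bool:
    """Server side: look up the key and recompute over the raw body."""
    scheme, _, params = header.partition(" ")
    if scheme != SCHEME:
        return False
    fields = dict(re.findall(r'(\w+)="([^"]*)"', params))
    secret_key = keys.get(fields.get("keyid", ""))
    if secret_key is None or "signature" not in fields:
        return False
    expected = hmac.new(secret_key, raw_body, hashlib.sha1).hexdigest()
    return hmac.compare_digest(expected.encode(), fields["signature"].encode())


keys = {"client-42": b"Secret Key"}  # hypothetical key store
body = b"foo=one&bar=two&baz=three"
header = build_authorization("client-42", keys["client-42"], body)
print(check_authorization(header, body, keys))  # True
```

Because the signature travels outside the body, the server never has to edit the request before checking it.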

I realize that HMAC may not be new to many people, but it is to me. Now that I understand it, I can’t imagine using any of the older approaches to build an authenticated Web service.

Regardless of which side of the Web service transaction you’re implementing, calculating the actual HMAC is easy. Normally the SHA-1 or MD5 hashing algorithms are used, and it’s up to the implementor of the service to decide which of those they will support. Here’s how you create HMAC-SHA1 signatures using a few popular languages.

PHP has a built-in HMAC function:

hash_hmac('sha1', "Message", "Secret Key");

In Java, it’s not much more difficult:

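import java.nio.charset.StandardCharsets;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import org.apache.commons.codec.binary.Hex;

// Build the key and the MAC for HMAC-SHA1.
Mac mac = Mac.getInstance("HmacSHA1");
SecretKeySpec secret =
    new SecretKeySpec("Secret Key".getBytes(StandardCharsets.UTF_8), "HmacSHA1");

mac.init(secret);
byte[] digest = mac.doFinal("Message".getBytes(StandardCharsets.UTF_8));

// Hex-encode the raw digest.
String hmac = Hex.encodeHexString(digest);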

In that case, the Hex class is the hexadecimal encoder provided by Apache’s Commons Codec project.

In Ruby, you can use the HMAC method provided with the OpenSSL library:

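require 'openssl'
require 'base64'

DIGEST = OpenSSL::Digest.new('sha1')

# strict_encode64 omits the trailing newline that encode64 appends.
Base64.strict_encode64(OpenSSL::HMAC.digest(DIGEST,
  "Secret Key", "Message"))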

There are also libraries like crypto-js that provide HMAC support for JavaScript.

The good and bad of the OS X sandbox

Lots of thoughtful posts are cropping up about the new restrictions Apple plans to implement for OS X applications that will be distributed through the App Store. The occasion is, I suppose, the news that Apple is pushing back the deadline for all applications distributed through the App Store to be Sandbox-compliant from the middle of this month to March 2012.

For a basic rundown of the new rules and what they mean, check out this post from Pauli Olavi Ojala.

For an argument that Apple could take a more realistic, less restrictive approach to securing applications, see Wil Shipley’s post. In it, he explains why entitlements and code auditing may be useful in theory, but certificates are a more straightforward solution:

But, in the real world, security exploits get discovered by users or researchers outside of Apple, and what’s important is having a fast response to security holes as they are discovered. Certificates give Apple this.

His proposed solution makes a lot of sense. I’d love to see Apple adopt it.

Ars Technica’s Infinite Loop blog has a useful post on the sandbox features in OS X Lion as well.

Screening systems and the base rate fallacy

Kellan Elliott-McCrea has a great post about the high cost of false positives when it comes to building software that detects fraud, spam, abuse, or whatever. The cost of false positives is explained by the base rate fallacy. The BBC explains the base rate fallacy very well. Here’s a snippet:

If 3,000 people are tested, and the test is 90% accurate, it is also 10% wrong. So it will probably identify 301 terrorists – about 300 by mistake and 1 correctly. You won’t know from the test which is the real terrorist. So the chance that our man in the mac is the real thing is 1 in 301.
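The arithmetic behind that snippet can be sketched in a few lines of Python. The assumption here, which the quote leaves implicit, is that the test is right 90% of the time for guilty and innocent people alike; the function name is invented for the example.

```python
def flagged_breakdown(population, actual_positives, accuracy):
    """Expected true and false positives for a test that is right
    `accuracy` of the time on both guilty and innocent people."""
    true_pos = actual_positives * accuracy
    false_pos = (population - actual_positives) * (1 - accuracy)
    return true_pos, false_pos


true_pos, false_pos = flagged_breakdown(3000, 1, 0.90)
chance = true_pos / (true_pos + false_pos)
print(f"expected false alarms: {false_pos:.0f}")
print(f"chance a flagged person is real: about 1 in {1 / chance:.0f}")
```

The result lands in the same ballpark as the BBC’s 1 in 301. The small difference comes from the BBC counting the one real terrorist as a certain catch, while this version gives the test only a 90% chance of flagging him.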

Anybody who wants to talk about screening systems without an understanding of the base rate fallacy needs to do more homework.

Security is a cost

At work, we’re switching things to encrypt a lot of information in our databases for security reasons. The project has been time-consuming and painful, and in the end, our database is far less usable from a developer’s standpoint than it was before. Soon the days when I can quickly diagnose issues on the production system with a few well-placed SELECT statements will be a thing of the past.

As far as the implementation goes, I’ll tell Hibernate users who want to implement an encryption system that there’s only one way to go — UserTypes. Don’t bother with anything else.

What this project really has me thinking about, though, is the high cost of security. It ties into something from the Bill James interview that I linked to the other day. Here was his response to the question of whether we overestimate or underestimate the importance of crime:

We underestimate it, because it’s our intent to underestimate it. We only deal with it indirectly. We all do so many things to avoid being the victims of crime that we no longer see those things, so we don’t see the cost of it. Just finding a safe place for us to have this conversation, for example — we needed a quiet place, but before that, we needed to find a safe place. A hotel lobby is what it is because of the level of security. I’ve checked out of this hotel, but I’m still sitting here in the third-floor lobby, because it’s safe. When you buy something, it’s wrapped in seven layers of packaging in order to make it harder to steal.

I think that people are generally excessively afraid of crime but underestimate the day-to-day costs that crime imposes. In software engineering, we spend a lot of time and effort on security. If everyone were honest, we wouldn’t need passwords, encryption, or any of the other stuff that occupies a lot of time on every project. We’d still need to take precautions against damage caused by user error, but most of the hours we spend on security could be spent on other things.

The other cost of security, beyond implementation time, is the ongoing cost related to the inconvenience of security. Whether it’s the time we take to unlock our screen or set up SSH tunnels or deal with the fact that we have to decrypt data in the database in order to see it, it all counts. Security is almost always a form of technical debt.

In many cases security precautions are necessary (or even mandated by law), but it’s important to be vigilant and not add more of it than is necessary, because it’s almost always painful in the moment and forever thereafter.

The FBI does not understand Web hosting

The New York Times has a report on an FBI raid that knocked some of my favorite sites offline yesterday. The FBI visited a colo facility and seized at least one full rack of servers leased by DigitalOne, taking down sites like Instapaper and Pinboard. Apparently they were going after a specific host but they had no idea how to seize only the hardware associated with that host, and in the age of virtualization, going after one VM could still cause many hosts to be taken down.

How Microsoft responded to Stuxnet

John Borland at Wired’s Threat Level blog reports on a talk by Bruce Dang, the engineer at Microsoft whose job it was to break down the Stuxnet worm. It’s an interesting look at exactly which vulnerabilities Stuxnet exploits, and how Microsoft’s security team broke down the problem.

A video of the talk will eventually be posted at the Chaos Communication Congress Web site. I’m going to try to remember to go back and watch it.

Update: Video of the talk is available here.

Everything you needed to know about backscatter

Bruce Schneier has rounded up all the links on the backscatter X-ray scanners and related issues. Bullet points:

In this piece (not yet linked by Schneier), TSA screeners surveyed say that conducting the more invasive patdowns makes their job worse. My inclination in the face of this new scanning is to request the patdown for exactly that reason. Walking through the machine imposes a cost on the person being scanned, and no cost on the person doing the scanning. The patdown sucks for the person conducting the patdown and the person being patted down. Seems more fair to me.

As far as predictions go, my guess is that the money has been spent and we are not likely to see the government back off on the scanning. As irritated as people are now, they’ll eventually come to accept it, and it will become one more permanent contributor to the horrible experience that air travel has become.

When should you change your passwords?

One of my closely held beliefs is that expiring passwords reduce rather than increase security because the more often you have to change your passwords, the less likely you are to remember them. That is offset by the fact that people tend to use one password everywhere, so if you force people to change them, that pattern can be broken to some extent.

This week, Bruce Schneier has an essay on the subject. Here’s his bottom line, but read the whole thing:

So in general: you don’t need to regularly change the password to your computer or online financial accounts (including the accounts at retail sites); definitely not for low-security accounts. You should change your corporate login password occasionally, and you need to take a good hard look at your friends, relatives, and paparazzi before deciding how often to change your Facebook password. But if you break up with someone you’ve shared a computer with, change them all.

Blizzard continues to innovate on the security front

It would probably surprise people to learn that Blizzard, a game company, provides better security options for players of its games (World of Warcraft and now StarCraft) than nearly all banks and financial services companies do for their customers. The problem Blizzard faces is that people steal World of Warcraft accounts all the time, either to use the characters to farm gold, or to just strip all of the cash and things that can be sold from the account and pocket the cash.

A number of methods are used to steal passwords, including phishing, catching the passwords using key loggers, and just brute forcing them. Blizzard’s first big attempt to solve the problem was to give users the option of protecting their account using two-factor authentication: their password and an authenticator that is tied to the account. The authenticator is a key fob (or a phone app) that generates a number every few seconds that must be entered in order to log in. Once an authenticator is tied to your account, a stolen password alone is no longer enough to log in.
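The details of Blizzard’s own algorithm aren’t covered here, so the sketch below shows the open standard for this kind of device instead: time-based one-time passwords (RFC 6238), built on the HMAC-based scheme in RFC 4226. Treat it as an illustration of the general idea, not as a description of Blizzard’s authenticator.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low nibble picks 4 bytes
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: float = None, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the current time step."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


# The fob and the server share this secret and a reasonably accurate clock.
shared_secret = b"12345678901234567890"
print(totp(shared_secret))
```

Both sides compute the same number from the shared secret and the clock, so the code that’s typed in is only useful for the current time step.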

Despite the fact that the authenticator app is free and the physical authenticator only costs $6, many players do not use them, and accounts still get stolen all the time. Indeed, account thieves almost always attach their own authenticator to compromised accounts as soon as they’ve been compromised, making it that much more difficult for players to get them back. (I shudder to think about how much money Blizzard spends dealing with account theft.)

To enable players who haven’t gotten an authenticator to secure their accounts, Blizzard has introduced a dial-in authenticator. With it, you can assign a phone number to your account. If there’s something unusual about an authentication attempt, you will be required to dial in to a toll free number from that phone and enter a PIN in order to log in successfully.

There’s bound to be an interesting article written about the economics of account security that explains why Blizzard finds it more worthwhile to implement robust authentication solutions when so many businesses that are susceptible to financial fraud do not. Are people that much more likely to steal your World of Warcraft characters than they are to steal your Amazon.com account and use the credit cards you’ve saved there? Or is it that people are more willing to go to extra trouble to secure their game accounts?

Update: There are lots of smart comments about this at Hacker News as well.

The growing misperception of HTML5

Today the New York Times Opinionator blog ran a piece by Robert Wright that made the following assertion about HTML5:

In principle, HTML 5 will allow sites you visit to know your physical location and will make it easier for them to keep track of your browsing and shopping history.

That assertion is based on this news article from the Times, which says:

In the next few years, a powerful new suite of capabilities will become available to Web developers that could give marketers and advertisers access to many more details about computer users’ online activities. Nearly everyone who uses the Internet will face the privacy risks that come with those capabilities, which are an integral part of the Web language that will soon power the Internet: HTML 5.

All of this talk is about one piece of HTML5, client storage. For the details, check out Mark Pilgrim’s chapter on local storage in Dive Into HTML5.

There are two points to make. The first is that Web sites won’t have access to any information that they don’t already have. In that sense, the talk about “access to many more details” is misleading. It’s not that Web sites will have access to new information, but rather that they’ll have a new place to store information that they already collect that may make it more convenient for them.

For example, if I don’t share my current location with FourSquare, they won’t suddenly be able to retrieve it if I use a browser that supports local storage. However, if I do give them access to my current location, they could store it in local storage on my own computer rather than using their own resources to store it on their server. In that sense, the information may suddenly be worth storing and easier to access, but it’s information they could already obtain and store on their own servers if they chose to do so. This aspect of local storage subjects users to no real risk beyond the risk already posed by cookies or other vectors for storing information about users.

What’s really gotten people wound up is evercookie (mentioned in the New York Times story), a proof of concept that demonstrates how the variety of ways Web sites can store information on the client can be exploited so that it’s nearly impossible to delete tracking cookies. Browser cookies are one way to store information on the client, as is local storage. Flash Local Shared Objects (also known as Flash cookies) can also store information on behalf of Web sites on your computer. evercookie uses a number of other methods for storing information as well. The nefarious thing about it is that when the information is deleted in one of these locations, evercookie replicates it again from another location where it is still stored. So if I delete my browser cookie, evercookie will copy that information from Flash and put it back in place. If I delete the Flash cookie, it will look in one of the other locations where it stashes information and copy it back again.
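The respawning behavior is easy to see in a toy model. The Python sketch below is a simulation only: plain dictionaries stand in for the real storage locations, and none of the names come from evercookie’s actual code.

```python
def respawn(stores: dict) -> None:
    """Copy a surviving identifier back into every store that lost it."""
    survivors = [s["id"] for s in stores.values() if "id" in s]
    if not survivors:
        return  # the user cleared every location at once
    for store in stores.values():
        store.setdefault("id", survivors[0])


# Hypothetical stand-ins for the real storage locations.
stores = {
    "browser_cookie": {"id": "user-1234"},
    "local_storage": {"id": "user-1234"},
    "flash_lso": {"id": "user-1234"},
}

del stores["browser_cookie"]["id"]  # the user deletes the browser cookie
respawn(stores)                     # the next page load restores it
print(stores["browser_cookie"])     # {'id': 'user-1234'}
```

The identifier only goes away if every location is cleared before the next page load, which is exactly what makes the technique so hard for users to defeat.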

Using tricks like this to make it difficult for users to prevent Web sites from tracking them is unethical. Web sites that take this approach should be classified as spyware. But the existence of these techniques has nothing to do with HTML5.

What concerns me is that we’re on a path toward HTML5 being perceived negatively by regular users because the only thing they’ve heard about it is that it is likely to compromise their privacy. This perception could become a major stumbling block on the road to wider usage of browsers with HTML5 support. As developers, it’s important to educate users and perhaps more importantly, the media, so that people don’t conjure up risks where they don’t exist and damage the HTML5 brand in the process.
