Designing a no-fly list

May 5, 2010 / Rafe / 8 Comments

One news item arising from the arrest of accused Times Square bomber Faisal Shahzad was that Emirates Airlines didn’t update its copy of the no-fly list quickly enough, after Shahzad’s name was added, to prevent him from buying a ticket or boarding a flight out of the country. A non-programmer friend of mine was wondering why the airlines keep their own copy of the no-fly list rather than accessing some centralized resource that always has the most up-to-date list of names, and I thought I’d take a stab at explaining a few of the reasons why that may be the case.

The first question is, what’s a no-fly list? In short, it’s a list of names that airlines check passenger names against using some matching algorithm. I have no idea how the matching actually works, but the details aren’t really important here. When someone tries to purchase a ticket or board a plane, the system should run their name against the list and return some kind of indication of what action should be taken if there’s a match. In matching against this kind of list, fuzzier matching will return more false positives, while stricter matching will do a poor job of accounting for things like alternate spellings and people adding or leaving out their middle names.
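To make the trade-off between strict and fuzzy matching concrete, here is a minimal Python sketch. It is not how any airline or agency actually matches names. The list entries, the function names, and the 0.75 similarity threshold are all invented for illustration, and the similarity measure is simply the one that ships in Python's standard difflib module.

```python
from difflib import SequenceMatcher

# Hypothetical list entries, invented for illustration only.
NO_FLY_NAMES = ["John Quincy Example", "Maria Sample"]


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def strict_match(name: str, listed: list[str]) -> bool:
    """Match only when the normalized names are identical."""
    target = normalize(name)
    return any(target == normalize(entry) for entry in listed)


def fuzzy_match(name: str, listed: list[str], threshold: float = 0.75) -> bool:
    """Match when any entry's similarity ratio reaches the threshold.

    The default threshold is an arbitrary assumption, not a recommended value.
    """
    target = normalize(name)
    return any(
        SequenceMatcher(None, target, normalize(entry)).ratio() >= threshold
        for entry in listed
    )
```

With these made-up entries, strict_match misses "John Example" because the middle name is left out, while fuzzy_match accepts it. The same fuzzy_match also accepts "Joan Quincy Example", who may well be a different person, which is the false-positive problem in miniature.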

The question at hand, though, is how best to provide access to the no-fly list. These days, a developer creating a no-fly list from scratch would probably think about it as a Web service. Airlines would simply submit the names they wanted to check to the service, which would handle the matching and return a result indicating whether the person is on the list, or more specifically, which list they’re on. There are a number of advantages to this approach:

A centralized list is always up to date. New names are added immediately and scrubbed names are removed immediately.
The government can impose a standard approach to name matching on all of the list’s end users, avoiding problems with airlines creating their own buggy implementations.
This approach offers more privacy to the people on the list, some of whom shouldn’t be on there. If you’re on the list but you never try to fly, nobody will know that you’re on there except the government agency compiling the list.

Given the strengths of this approach, why would the government instead allow each airline to maintain its own copy of the list, distributing updates as the list changes? I can think of a few reasons.

If the access to the list is provided by a centralized Web service, every airline endpoint must have the appropriate connectivity to communicate with that service. For reasons of security and cost, most airline systems are almost certainly deployed on private networks that don’t have access to the Internet. To get this type of system to work, the airline would have to provide direct access to the government service, an internal proxy, or some kind of direct connection to the government network that bypasses the Internet. All of those solutions are impractical.

Secondly, communicating with a central service poses a risk in terms of reliability. If the airlines can’t connect to the government service, do they just approve all of the ticket purchases and boarding requests that are made? If not, do the airline’s operations grind to a halt until communication is restored? The government probably doesn’t want to make all of the airlines dependent on the no-fly list service in real time.
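The reliability question boils down to a policy choice that has to be written into the code somewhere. The sketch below is only an illustration of that choice. The Decision values, the ServiceUnavailable exception, and the screen_passenger function are hypothetical names, and the lookup is passed in as a plain callable so that no real service is assumed.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    HOLD = "hold"  # stop and wait until the service can be reached again


class ServiceUnavailable(Exception):
    """Raised by the (hypothetical) lookup when the central service is unreachable."""


def screen_passenger(name, lookup, fail_open):
    """Return a Decision for `name`.

    `lookup` is any callable that returns True when the name is listed and
    raises ServiceUnavailable when the central service cannot be reached.
    """
    try:
        listed = lookup(name)
    except ServiceUnavailable:
        # The policy choice: approve everything (fail open) or
        # halt operations until communication is restored (fail closed).
        return Decision.ALLOW if fail_open else Decision.HOLD
    return Decision.DENY if listed else Decision.ALLOW
```

Neither branch of the except clause is attractive. Failing open means an outage silently disables screening, and failing closed means an outage stops ticket sales and boarding.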

And third, a centralized service opens up the airlines to a variety of attacks that aren’t available if they maintain their own copies. Denial-of-service attacks could be used to prevent airlines from accessing the no-fly list, and some man-in-the-middle attacks could be used to return bad information for requests to the no-fly list.

From an implementation standpoint, it’s easier for the airlines to maintain the lists themselves and to integrate that list into their own systems. Doing so is more robust, and the main risks are buggy implementations and out-of-date data. I wonder what sorts of testing regimes the government has in place to make sure that consumers of the no-fly list are using it properly. How do they test the matching algorithm that compares the names of fliers to names on the list?
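One way to picture the out-of-date data risk is a local copy that at least knows how old it is. This is a sketch under my own assumptions rather than a description of any real airline system. The LocalNoFlyList class, its method names, and the max_age_seconds parameter are hypothetical, and the lookup is a deliberately naive exact match.

```python
import time


class LocalNoFlyList:
    """An airline-side copy of the list that remembers when it was last refreshed."""

    def __init__(self, names, max_age_seconds, clock=time.time):
        self._clock = clock  # injectable so the age check can be tested
        self._max_age = max_age_seconds
        self.replace(names)

    def replace(self, names):
        """Swap in a freshly distributed copy of the whole list."""
        self._names = {n.strip().lower() for n in names}
        self._updated_at = self._clock()

    def is_stale(self):
        """True when the copy is older than the allowed maximum age."""
        return self._clock() - self._updated_at > self._max_age

    def contains(self, name):
        """Exact, case-insensitive lookup against the local copy."""
        return name.strip().lower() in self._names
```

Lookups keep working when no update arrives, which is the robustness argument, but is_stale gives the surrounding system a way to raise an alarm instead of quietly screening against an old list.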

Commentary

security software development

8 Comments

John
May 6, 2010 at 2:03 am

“To get this type of system to work, the airline would have to provide direct access to the government service, an internal proxy, or some kind of direct connection to the government network that bypasses the Internet. All of those solutions are impractical.”

I disagree. Proxy systems, secure web services and/or private VPNs between sites aren’t that hard. Besides, the airlines already have to have some sort of connection to get list updates.

I mostly agree on your second and third points, although honestly the list seems so unreliable in general that if it were a live connection and it went down, I don’t know that it would make any practical difference.

Your comments do make me wonder:

If each airline is just getting a list, how are they doing the matching? Is each airline writing its own code? This seems like more of a problem than forgetting to update the list.

But really, it seems to me you get the best of both worlds by having a web service running the same software internal to each airline. All internal systems that need to check the list hit the airline’s internal web service. Those internal services get updates pushed (or pulled via a web service interface) from the Feds as needed, and the entire list is cached locally. So everything is up to date as long as the connection to the Feds is live, and if it goes down there is no general service interruption, just no updates. It would be better than “we forgot to update the list” anyway.

Devdas Bhagat
May 6, 2010 at 4:18 am

Isn’t the airline copy just a fancier name for a cache?

John
May 6, 2010 at 8:06 am

Sure, but typically a cache is updated automatically and transparently, which obviously didn’t happen in this case.

Rafe (Post author)
May 6, 2010 at 9:44 am

It would have to be a cache where all of the content is pre-cached, since they have their own copy of the entire list.

John
May 6, 2010 at 3:29 pm

Right – I was thinking about caching from the standpoint of updates but at that point it’s really just an update to the local copy (which is complete) and not a true cache.

My main point was that updates to each local database should be automatic, not something that has to be manually loaded.

Rafe (Post author)
May 6, 2010 at 4:17 pm

I do not discount the possibility that when there are updates to the database, someone at the airlines gets an email …

David Hall
May 7, 2010 at 10:09 pm

You seem concerned about the privacy of people on the list, but completely unconcerned with the opposite situation of flying and not being on the list. I don’t want my name being sent to government computers on a regular basis. The idea that your approach is less invasive of privacy seems mistaken.

The standardization of name checking can be achieved in many ways. Airlines could be required to link against a library that exposes a single function call, “check_name(string name)”, which would be quite simple. Updates to the list and the library could be enforced with an rsync of these components every 15 minutes. You could even have the list encrypted so it could only be read by the library. Of course, with these types of library updates, you could still have the government mess up and put up a broken library where the function has an infinite loop or something.

Rafe (Post author)
May 8, 2010 at 12:57 am

Your comment makes me wonder if the government keeps track of who flies. Or do they just subpoena the records from the airlines if they need them?

rc3.org

Strong opinions, weakly held
