One news item arising from the arrest of accused Times Square bomber Faisal Shahzad was that Emirates Airlines didn’t update their copy of the no-fly list soon enough after Shahzad’s name was added to prevent him from buying a ticket or boarding a flight out of the country. A non-programmer friend of mine was wondering why the airlines keep their own copy of the no-fly list rather than accessing some centralized resource that always has the most up-to-date list of names, and I thought I’d take a stab at explaining a few of the reasons why that may be the case.
The first question is, what’s a no-fly list? In short, it’s a list of names that airlines use some algorithm to match against. I have no idea how this part works, but it’s not really important. When someone tries to purchase a ticket or board a plane, the system should run their name against the list and return some kind of indication of what action should be taken if there’s a match. In matching against this kind of list, fuzzy matches will return more false positives, and stricter matches will do a poor job of accounting for things like alternate spellings and people adding or leaving out their middle names.
The question at hand, though, is how best to provide access to the no-fly list. These days, a developer creating a no-fly list from scratch would probably think about it as a Web service. Airlines would simply submit the names they wanted to check to the service, which would handle the matching and return a result indicating whether the person is on the list, or more specifically, which list they’re on. There are a number of advantages to this approach:
- A centralized list is always up to date. New names are added immediately and scrubbed names are removed immediately.
- The government can impose a standard approach to name matching on all of the list’s end users, avoiding problems with airlines creating their own buggy implementations.
- This approach offers more privacy to the people on the list, some of whom shouldn’t be on there. If you’re on the list but you never try to fly, nobody will know that you’re on there except the government agency compiling the list.
Given the strengths of this approach, why would the government instead allow each airline to maintain its own copy of the list, distributing updates as the list changes? I can think of a few reasons.
If the access to the list is provided by a centralized Web service, every airline endpoint must have the appropriate connectivity to communicate with that service. For reasons of security and cost, most airline systems are almost certainly deployed on private networks that don’t have access to the Internet. To get this type of system to work, the airline would have to provide direct access to the government service, an internal proxy, or some kind of direct connection to the government network that bypasses the Internet. All of those solutions are impractical.
Secondly, communicating with a central service poses a risk in terms of reliability. If the airlines can’t connect to the government service, do they just approve all of the ticket purchases and boarding requests that are made? If not, do the airline’s operations grind to a halt until communication is restored? The government probably doesn’t want to make all of the airlines dependent on the no-fly list service in real time.
And third, a centralized service opens up the airlines to a variety of attacks that aren’t available if they maintain their own copies. Both denial of service attacks and some man in the middle attacks could be used to prevent airlines from accessing the no-fly list, or to return bad information for requests to the no-fly list.
From an implementation standpoint, it’s easier for the airlines to maintain the lists themselves and to integrate that list into their own systems. Doing so is more robust, and the main risks are buggy implementations and out of date data. I wonder what sorts of testing regimes the government has in place to make sure that consumers of the no-fly list are using it properly? How do they test the matching algorithm that compares the names of fliers to names on the list?
Designing a no-fly list
One news item arising from the arrest of accused Times Square bomber Faisal Shahzad was that Emirates Airlines didn’t update their copy of the no-fly list soon enough after Shahzad’s name was added to prevent him from buying a ticket or boarding a flight out of the country. A non-programmer friend of mine was wondering why the airlines keep their own copy of the no-fly list rather than accessing some centralized resource that always has the most up-to-date list of names, and I thought I’d take a stab at explaining a few of the reasons why that may be the case.
The first question is, what’s a no-fly list? In short, it’s a list of names that airlines use some algorithm to match against. I have no idea how this part works, but it’s not really important. When someone tries to purchase a ticket or board a plane, the system should run their name against the list and return some kind of indication of what action should be taken if there’s a match. In matching against this kind of list, fuzzy matches will return more false positives, and stricter matches will do a poor job of accounting for things like alternate spellings and people adding or leaving out their middle names.
The question at hand, though, is how best to provide access to the no-fly list. These days, a developer creating a no-fly list from scratch would probably think about it as a Web service. Airlines would simply submit the names they wanted to check to the service, which would handle the matching and return a result indicating whether the person is on the list, or more specifically, which list they’re on. There are a number of advantages to this approach:
Given the strengths of this approach, why would the government instead allow each airline to maintain its own copy of the list, distributing updates as the list changes? I can think of a few reasons.
If the access to the list is provided by a centralized Web service, every airline endpoint must have the appropriate connectivity to communicate with that service. For reasons of security and cost, most airline systems are almost certainly deployed on private networks that don’t have access to the Internet. To get this type of system to work, the airline would have to provide direct access to the government service, an internal proxy, or some kind of direct connection to the government network that bypasses the Internet. All of those solutions are impractical.
Secondly, communicating with a central service poses a risk in terms of reliability. If the airlines can’t connect to the government service, do they just approve all of the ticket purchases and boarding requests that are made? If not, do the airline’s operations grind to a halt until communication is restored? The government probably doesn’t want to make all of the airlines dependent on the no-fly list service in real time.
And third, a centralized service opens up the airlines to a variety of attacks that aren’t available if they maintain their own copies. Both denial of service attacks and some man in the middle attacks could be used to prevent airlines from accessing the no-fly list, or to return bad information for requests to the no-fly list.
From an implementation standpoint, it’s easier for the airlines to maintain the lists themselves and to integrate that list into their own systems. Doing so is more robust, and the main risks are buggy implementations and out of date data. I wonder what sorts of testing regimes the government has in place to make sure that consumers of the no-fly list are using it properly? How do they test the matching algorithm that compares the names of fliers to names on the list?