Kellan Elliott-McCrea on scaling:
Scaling is always a catch up game. Only way its ever worked. If you never catch up then something isn’t working, but it isn’t original sin.
See also: Don Knuth on when to optimize.
Matt Gallagher at Cocoa With Love provides his general model for Cocoa applications. The terminology is specific to Cocoa, but the model applies to nearly all GUI applications. Needless to say, the way you fill in the blanks in his model is what makes an application unique and useful.
John Siracusa perfectly captures the developer mentality when it comes to tools:
And so continues one of the biggest constants in software development: the unerring sense among developers that the level of abstraction they’re current working at is exactly the right one for the task at hand. Anything lower-level is seen as barbaric, and anything higher-level is a bloated, slow waste of resources. This remains true even as the overall level of abstraction across the industry marches ever higher.
Eli Bendersky explains why the Fisher-Yates shuffling algorithm works:
What I do plan to do, however, is to explain why the Fisher-Yates algorithm works. To put it more formally, why given a good random-number generator, the Fisher-Yates shuffle produces a uniform shuffle of an array in which every permutation is equally likely. And my plan is not to prove the shuffle’s correctness mathematically, but rather to explain it intuitively. I personally find it much simpler to remember an algorithm once I understand the intuition behind it.
This is the algorithm that the Collections.shuffle() method in Java uses.
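For reference, here is a minimal Python sketch of the shuffle as it is commonly described (the in-place variant that walks the array from the end). It is not Eli Bendersky’s code, and the function name and the `rng` parameter are illustrative choices. The intuition it tries to show: at step `i` the element for position `i` is picked uniformly from the `i + 1` elements not yet placed, so there are n × (n − 1) × … × 1 equally likely sequences of picks, one for each permutation.

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Shuffle a list in place and return it.

    `rng` is anything with a randint(a, b) method that is inclusive on
    both ends, such as the random module or a random.Random instance.
    """
    # Walk from the last index down to 1. Everything above i is final.
    for i in range(len(items) - 1, 0, -1):
        # Pick uniformly among positions 0..i (including i itself, so an
        # element is allowed to stay where it is).
        j = rng.randint(0, i)
        items[i], items[j] = items[j], items[i]
    return items
```

The inclusive upper bound matters: choosing `j` only from `0..i-1` would never let an element stay put, and the result would no longer cover every permutation.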
Tim Bray on answering questions about Android for developers:
Quite a few of the developers who walked up haven’t learned about Practical Open Source; that you can answer an immense number of questions by just downloading the system source code and plowing through it.
One of my standard two-part interview questions of late is, “When is the last time you solved a problem by looking at the source code for a library or framework you use?” followed by asking the candidate to explain what they found out. I consider it a strong warning sign when a developer doesn’t bother to download the source to the open source tools they use or isn’t willing or able to answer their own questions by reading the source code.
Danc has posted the notes to a presentation he gave, Why we turned Microsoft Office into a Game. It’s a great piece on the complexity of applications and how to manage it for users. In it, he gets down to the core problem that faces companies trying to build growing businesses around software — dealing with the fact that different users take advantage of different features, and that applications tend to grow more complex as their user bases grow. It seems to me that the fashionable answer to this problem is to claim to be an auteur of application development, and to only build the features that are appealing to you. But that’s not the way big software companies work, and it’s really not the way they should work. If you’re in the software business, this presentation is a must-read.
One news item arising from the arrest of accused Times Square bomber Faisal Shahzad was that Emirates Airlines didn’t update their copy of the no-fly list soon enough after Shahzad’s name was added to prevent him from buying a ticket or boarding a flight out of the country. A non-programmer friend of mine was wondering why the airlines keep their own copy of the no-fly list rather than accessing some centralized resource that always has the most up-to-date list of names, and I thought I’d take a stab at explaining a few of the reasons why that may be the case.
The first question is, what’s a no-fly list? In short, it’s a list of names that airlines match passengers against using some algorithm. I have no idea how that matching works, but the details aren’t really important here. When someone tries to purchase a ticket or board a plane, the system should run their name against the list and return some kind of indication of what action should be taken if there’s a match. In matching against this kind of list, fuzzy matches will return more false positives, and stricter matches will do a poor job of accounting for things like alternate spellings and people adding or leaving out their middle names.
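To make the strict-versus-fuzzy tradeoff concrete, here is a small Python sketch. It is purely illustrative: the real matching algorithm is unknown, the names are made up, and the 0.85 threshold is an arbitrary assumption. It uses the standard library’s `difflib.SequenceMatcher` as a stand-in for whatever similarity measure a real system would use.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(name.lower().split())


def strict_match(name, watchlist):
    """True only if the normalized name appears on the list exactly."""
    return normalize(name) in {normalize(entry) for entry in watchlist}


def fuzzy_match(name, watchlist, threshold=0.85):
    """Return every list entry whose similarity to the name meets the threshold.

    The threshold is a hypothetical tuning knob: lower values catch more
    spelling variants but also flag more innocent people.
    """
    candidate = normalize(name)
    return [
        entry
        for entry in watchlist
        if SequenceMatcher(None, candidate, normalize(entry)).ratio() >= threshold
    ]


# Made-up list entry, used only for illustration.
WATCHLIST = ["Jonathan Q Example"]
```

With this list, `strict_match("Jonathan Example", WATCHLIST)` misses the entry because the middle initial was left out, while `fuzzy_match` catches it. The cost is that `fuzzy_match("Jonathan R Example", WATCHLIST)` also returns the entry, even though a different middle initial may well mean a different person.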
The question at hand, though, is how best to provide access to the no-fly list. These days, a developer creating a no-fly list from scratch would probably think about it as a Web service. Airlines would simply submit the names they wanted to check to the service, which would handle the matching and return a result indicating whether the person is on the list, or more specifically, which list they’re on. There are a number of advantages to this approach:
Given the strengths of this approach, why would the government instead allow each airline to maintain its own copy of the list, distributing updates as the list changes? I can think of a few reasons.
If the access to the list is provided by a centralized Web service, every airline endpoint must have the appropriate connectivity to communicate with that service. For reasons of security and cost, most airline systems are almost certainly deployed on private networks that don’t have access to the Internet. To get this type of system to work, the airline would have to provide direct access to the government service, an internal proxy, or some kind of direct connection to the government network that bypasses the Internet. All of those solutions are impractical.
Secondly, communicating with a central service poses a risk in terms of reliability. If the airlines can’t connect to the government service, do they just approve all of the ticket purchases and boarding requests that are made? If not, do the airline’s operations grind to a halt until communication is restored? The government probably doesn’t want to make all of the airlines dependent on the no-fly list service in real time.
And third, a centralized service opens up the airlines to a variety of attacks that aren’t available if they maintain their own copies. Both denial-of-service attacks and some man-in-the-middle attacks could be used to prevent airlines from accessing the no-fly list, or to return bad information for requests to the no-fly list.
From an implementation standpoint, it’s easier for the airlines to maintain the lists themselves and to integrate that list into their own systems. Doing so is more robust, and the main risks are buggy implementations and out-of-date data. I wonder what sorts of testing regimes the government has in place to make sure that consumers of the no-fly list are using it properly. How do they test the matching algorithm that compares the names of fliers to names on the list?
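As a rough sketch of the local-copy approach and its main data risk, the hypothetical Python class below answers lookups without any network call and reports whether its copy might be out of date. Everything here is an assumption made for illustration: the class and method names, the exact-match lookup, and especially the 24-hour freshness window, which is not based on any real policy.

```python
from datetime import datetime, timedelta, timezone


class LocalListCopy:
    """Hypothetical in-house copy of a list that is refreshed by pushed updates."""

    def __init__(self, max_age=timedelta(hours=24)):
        self.names = set()
        self.last_updated = None  # no update has been received yet
        self.max_age = max_age    # assumed freshness window, not a real rule

    def apply_update(self, names, received_at):
        """Replace the local copy with a newly distributed version of the list."""
        self.names = {name.strip().lower() for name in names}
        self.last_updated = received_at

    def is_stale(self, now):
        """True if no update has arrived or the last one is older than max_age."""
        return self.last_updated is None or now - self.last_updated > self.max_age

    def check(self, name, now):
        """Look up a name locally; the caller decides what to do if data is stale."""
        return {
            "match": name.strip().lower() in self.names,
            "stale": self.is_stale(now),
        }
```

A lookup keeps working when the network is down, which is the robustness argument above. The `stale` flag only makes the out-of-date risk visible; it does nothing to fix it, and a name added after the last update is still missed.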
A couple of friends have taken a stab at the binary search exercise and posted the results to their blogs. Check out Kellan Elliott-McCrea’s implementation and Erik Kastner’s implementation(s).
I wanted to point the programmers in the audience at a series of three posts by Mike Taylor of The Reinvigorated Programmer. In the first, he challenges programmers to sit down and see if they can write an implementation of binary search on the first try, without testing. I tried the exercise last week without reading the whole post and tested intermittently, so I blew it. My implementation didn’t work the first time, but that’s because I started testing it before it was finished. Or maybe it’s because I suck. I’ll probably give it a try again later when it’s not so fresh in my mind.
If nothing else, the exercise revealed that I’m too eager to start on things for my own good. I started doing the exercise before I read all the rules, and I started testing my implementation before I was done writing it.
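For readers who want something to compare their own attempt against afterwards, here is one common formulation of binary search in Python. It is a sketch, not the version from Mike Taylor’s posts or from Programming Pearls, and the function name and the choice to return -1 for a missing value are illustrative. It assumes the input list is already sorted in ascending order.

```python
def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent.

    Invariant: if target is present at all, its index lies in lo..hi.
    """
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        # Midpoint of the current range; written this way it also avoids
        # overflow in languages with fixed-width integers.
        mid = lo + (hi - lo) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1  # target can only be to the right of mid
        else:
            hi = mid - 1  # target can only be to the left of mid
    return -1  # the range became empty, so target is not in the list
```

For example, `binary_search([1, 3, 5, 7], 5)` returns `2`. The usual places to slip are the loop condition (`<=` rather than `<`) and forgetting the `+ 1` or `- 1` when narrowing the range, which can leave the loop spinning forever.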
He also posted two followups. In the first, Common bugs and why exercises matter, he apologizes for the structure of the first post, which led me to dive in:
The bad: because I produced the Jon Bentley quote from Programming Pearls before stating the rules of the challenge, a lot of people eagerly ploughed straight in, and so inadvertently broke the rules. My bad.
Here’s his explanation of why it’s a good idea to work on exercises like the one he posted, in response to people who argue that it’s useless to write your own binary search routine when perfectly good implementations already exist:
Why would we think that in programming we don’t need to do exercises that are similarly related to our day-to-day work?
I have a hypothesis about that, but it’s not one that’s going to be popular. The boxer, the concert pianist and the sprinter need to be at the absolute top of their game in order to succeed. If the boxer’s not light on his feet, he’ll get beaten up; if the pianist lacks dexterity, he simply won’t get booked, in such a competitive career; the sprinter deals in margins of hundredths of a second. They practice, exercise, do training drills because they must: if they fall to, say, 97% of their best performance, they lose. Could it be that programming is a little too comfortable? Do employers expect too little? Are we content just to stay some way ahead of the pack rather than striving to excel? That’ll work if you’re happy to write Enterprise Beans For The Enterprise for the rest of your career. Not so much if you’re hoping to go and work for Google.
I’ll be blunt. If I were researching a job applicant and found them arguing that doing programming exercises is a waste of time, I’d reject them immediately. Even if you’re not doing programming exercises to improve your skills, you ought to at least know that there’s value in doing them. I don’t want to work with people who don’t take their craft seriously.
In his second followup, he takes on the assertion that the rule against testing the implementation as you work on it is bad. I like the title: Testing is not a substitute for thinking. Here’s his point:
The point is this: testing is only one of the weapons in our armoury, and it’s one that has recently become so fashionable that a lot of programmers are in danger of forgetting the others. Although testing is valuable, even indispensible, it is also limited in important ways, and we need to have facility with other techniques as well.
The entire series is well worth reading if you’re a professional software developer. It certainly shook me out of my comfort zone and made me realize that I could stand to do more basic programming work, rather than figuring out how to glue things together or what the right syntax is in JavaScript to do something I already know how to do in Java or Ruby.
GitHub developer Scott Chacon describes their development process:
At GitHub we don’t have a project tracker or todo list – we just all work on whatever is most interesting to us. No standup meetings, burndown charts or points to assign. No chickens or pigs. It’s sort of the open source software style of business – everyone itches thier own scratch. Inexplicably, it works really well and keeps everyone engaged, new features appearing quickly and bugs fixed rather fast. No managers, directors, PMs or departments – and it’s the most agile, focused and efficient team I’ve ever worked with. Maybe we should write a book about it.
The first question that occurs to me when reading this is, under what conditions would such an approach work? (The second is, do they have a quality assurance department, and if so, how do they plan their work?)
But let’s go back to the first. I can think of a few prerequisites:
There are probably a lot more conditions required to make this sort of arrangement work, but those are the ones that immediately leap out at me. The beautiful thing about this approach is that it ensures that you get exactly the developers you’d like to have. The people who would not want to work under these conditions are not the ones you’d want anyway, and the developers you would want would leap at the chance to work in this fashion.
Thanks to Ryan Tomayko for the link.
Update: Be sure to read Ryan’s comment below; he adds a lot more detail about how things work at GitHub.
© rc3.org.