
The long journey toward production

Last week one of the data analysts at work asked me to help him out with a script he was writing. The script generates a CSV file and uploads it to an FTP server. He had one file containing a sequence of SQL queries, and a shell script that ran those queries and then uploaded the results via FTP. I thought it would be fun to write up what it took to convert those bits of code into something that meets the definition of a production service in our environment.

The first and most obvious problem was that the script was running on the analyst’s development VM, not on a production server. Relatedly, it was running from his personal crontab, so the only traces that this production service even existed were in his personal space. That seemed wrong. It also queried tables in his personal schema and had the credentials for his database account hard-coded in the script.

Fortunately, we already have a production cron server that’s hooked into our deployment system, along with a version-controlled directory for the scripts that run as cron jobs.

Another wrinkle: we are mostly a PHP shop, and we write cron jobs as command line PHP scripts. This may not be to your taste (or mine), but it’s what we do. So the script needed to be ported to PHP. It also needed to extend our standard base class for crons, which provides basic features like logging to our centralized log management system, as well as conveniences like locking to prevent overlapping runs of the script and the ability to accept an email address to which alerts are sent.
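Our real base class is internal, but a minimal sketch of the shape I’m describing might look something like this (the class and method names are hypothetical, not our actual API):

    <?php
    // Hypothetical sketch of a cron base class: locking to prevent
    // overlapping runs, logging, and email alerts on failure.
    abstract class BaseCronJob
    {
        protected $alertEmail;
        private $lockHandle;

        public function __construct($alertEmail)
        {
            $this->alertEmail = $alertEmail;
        }

        // Subclasses put the actual work of the job here.
        abstract protected function run();

        public function execute()
        {
            $lockFile = sys_get_temp_dir() . '/' . get_class($this) . '.lock';
            $this->lockHandle = fopen($lockFile, 'c');
            // Bail out if a previous run is still holding the lock.
            if (!flock($this->lockHandle, LOCK_EX | LOCK_NB)) {
                $this->log('Previous run still in progress; exiting.');
                return;
            }
            try {
                $this->run();
            } catch (Exception $e) {
                $this->log('Failed: ' . $e->getMessage());
                $this->alert(get_class($this) . ' failed', $e->getMessage());
            }
            flock($this->lockHandle, LOCK_UN);
        }

        protected function log($message)
        {
            // Stand-in for shipping to a centralized log management system.
            error_log('[' . get_class($this) . '] ' . $message);
        }

        protected function alert($subject, $body)
        {
            mail($this->alertEmail, $subject, $body);
        }
    }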

To get all of this working I had to rewrite the script in PHP, implementing the functionality to generate the CSV file and then send it via FTP. The SQL queries required to collect the data create a couple of temporary tables and then run a query against those tables. My first thought was that I would just run the queries natively from PHP through our database library, but temporary tables only last for the duration of a database session, and there’s no guarantee that the queries will be run within the context of a single session, so the tables were disappearing before I could retrieve data from them.

Instead I had to put all of the queries into one variable and then run them through the command line client for the database using PHP’s proc_open function, piping the contents of the variable to the external process. I also switched the script over to the appropriate database credentials, which required the analyst to update the permissions for that table. Ideally, we’ll eventually move the data into a production schema.
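Here’s a rough sketch of that technique, assuming a MySQL-style command line client; the table names, database name, and queries are made up:

    <?php
    // Pipe a batch of SQL through the database's command line client with
    // proc_open. Everything runs in a single session, so the temporary
    // table is still around when the final SELECT runs.
    $sql = "CREATE TEMPORARY TABLE tmp_totals AS\n"
         . "    SELECT user_id, COUNT(*) AS total FROM events GROUP BY user_id;\n"
         . "SELECT user_id, total FROM tmp_totals ORDER BY total DESC;";

    $descriptors = array(
        0 => array('pipe', 'r'),  // stdin: we write the SQL here
        1 => array('pipe', 'w'),  // stdout: results come back here
        2 => array('pipe', 'w'),  // stderr
    );

    // Credentials live in an option file, not in the script.
    $process = proc_open('mysql --batch reporting', $descriptors, $pipes);
    if (!is_resource($process)) {
        exit("Could not start the database client\n");
    }

    fwrite($pipes[0], $sql);
    fclose($pipes[0]);

    $output = stream_get_contents($pipes[1]);
    $errors = stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);

    if (proc_close($process) !== 0) {
        exit("Database client failed: $errors\n");
    }

    // $output now holds tab-separated rows, ready to be reformatted as CSV.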

At that point I had a script that would work, but it didn’t have any error handling and it wasn’t a subclass of the base cron script we use. Adapting it to use the base cron script was pretty straightforward. Error handling for these types of scripts is a bit more complex. I opted to do one check to verify that the CSV file was created successfully, and then to catch any errors that occurred with FTP and alert on them. Fortunately, the base cron script makes it easy to send email when failures occur, so I didn’t have to write that part.
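Putting those pieces together, the job ends up looking roughly like this subclass of the base class sketched above (the host, paths, and credentials are placeholders):

    <?php
    // Sketch of the export job as a subclass of the hypothetical base
    // class above. Any exception thrown here is caught by execute(),
    // which logs the failure and sends the alert email.
    class AnalystExportJob extends BaseCronJob
    {
        protected function run()
        {
            $csvPath = '/tmp/analyst_export.csv';
            $this->generateCsv($csvPath);

            // Check that the CSV file was actually created.
            if (!file_exists($csvPath) || filesize($csvPath) === 0) {
                throw new Exception('CSV file was not generated');
            }

            // Treat any FTP failure as a hard error.
            $conn = ftp_connect('ftp.example.com');
            if ($conn === false) {
                throw new Exception('Could not connect to FTP server');
            }
            if (!ftp_login($conn, 'ftpuser', 'secret')) {
                throw new Exception('FTP login failed');
            }
            if (!ftp_put($conn, 'analyst_export.csv', $csvPath, FTP_ASCII)) {
                throw new Exception('FTP upload failed');
            }
            ftp_close($conn);
        }

        private function generateCsv($path)
        {
            // Runs the SQL via proc_open as shown earlier and writes $path.
        }
    }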

Finally, I just had to pick a time for the script to run, add the crontab entry, and push the script through our deployment system. Or at least that was the idea. For whatever reason, the script works when I run it manually but does not appear to be running through cron, so I’m running it by hand every day for now. I also realized that if the script runs before the big data job that generates its input finishes, or if that job fails for any reason, the output of the script will be wrong. That means I need another layer of error handling to detect problems with the big data job and send an alert rather than uploading invalid data.
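I haven’t written that check yet, but the shape of it is straightforward: before uploading, verify that the upstream job produced fresh data today, and throw (so the existing alerting kicks in) if it didn’t. Something like this, assuming a hypothetical watermark table that the big data job updates when it finishes:

    <?php
    // Hypothetical freshness guard. The job_watermarks table is made up;
    // the idea is that the big data job records a timestamp on success.
    function upstreamDataIsFresh(PDO $db)
    {
        $stmt = $db->query(
            "SELECT MAX(completed_at) FROM job_watermarks
             WHERE job_name = 'big_data_export'"
        );
        if ($stmt === false) {
            return false;
        }
        $completedAt = $stmt->fetchColumn();

        // Only trust data generated since the start of today.
        return $completedAt !== false
            && $completedAt !== null
            && strtotime($completedAt) >= strtotime('today');
    }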

Why write this up? It’s to point out that for most projects, getting something to work is just a small, small part of building a production service. Exporting a CSV file from a database query and uploading it to an FTP server takes just a few minutes. Converting that into a service that runs within the standard infrastructure and handles failure conditions smoothly takes hours.

There are a few takeaways here. The first is that anything we can do to make it easier to build production services is almost certainly worth the investment. Having a proper cron base script was really helpful. I’m creating a new class that extends that base class, designed just for these specific kinds of jobs, to make this work easier next time.

The second is an acknowledgement on the part of everyone involved in a project that getting something working is just the beginning, not the end of the project. The work of making a service production-ready isn’t fun or glamorous, but it’s what separates the hacker from the software engineer. Managers need to account for the time it takes to get something ready for production when allocating resources. And everybody needs to be smart about figuring out the level of reliability any service needs. If a service is going to run indefinitely, you need to be certain that it will work when the person who wrote it goes on vacation.

The third is that at any company, people are building services like this all the time outside the production context. You usually find out things went wrong with them at the worst possible time.

Garret Vreeland on taking notes

Garret Vreeland on taking notes:

Every day, I stuff more articles and links in there, in anticipation of the day when I’ll need to take advantage of them. Yet when I have an issue and need to find a solution, or I am looking for a reference … time and again I simply Google.

I find myself in the same boat. I use Evernote mainly as a repository not for notes but for the weekly status reports I compose for work, and a couple of months ago I did go back and read through them all. I have 4674 bookmarks in Pinboard, and I don’t look at them all that often. It’s easier to just use a Web search.

If Google searched your notes in Keep and showed the results alongside your Web search results, that would be compelling. That probably won’t happen, for the same reason that Google doesn’t offer an option to search your Google Drive when you search the Web. It’s a strategy tax.

John Siracusa on Self-Reliance

Self-Reliance

John Siracusa handicaps the players in the mobile industry based on their dependencies on other companies. This is an interesting basis for analysis that could be applied widely.

Tom Ryder on RSS with Newsbeuter

RSS with Newsbeuter

“The Mutt of newsreaders.” Awesome.

Matt Haughey on RSS and news readers

Thoughts surrounding Google Reader's demise

Matt Haughey talks about RSS and news readers. I’m glad to say that I just renewed my annual subscription for NewsBlur, so as a news consumer, I’m not too affected by the impending demise of Google Reader. I do worry that the loss of Google Reader will reduce the readership of RSS feeds in general, and most of the readers of this blog still read it through RSS. I’d still write the blog if only ten people read it, but it’s nice to know people are reading it, or at least marking it read in their favorite reader.

The challenges of redesigning Wikipedia

A questioner on Quora asked whether Wikipedia has ever considered a redesign. In response, Wikipedia designer Brandon Harris has written the clearest explanation I’ve read of the challenges involved in changing the design of a large-scale Web site. If there’s one universal truth that I’ve absorbed more and more deeply with age, it’s that change never comes easy.

Here’s one bit:

How about languages? We support around 300 of them. That’s a scale problem that most people forget. I’ve seen several unsolicited redesigns that may look pretty but nearly all of them ignore (or worse, downplay) what is arguably the greatest feature of Wikipedia: that you can get it in your own language. If we want to change a text label (say, from “Password” to “Enter your password”) it will require hundreds of volunteers to check and localize that text. It’s a daunting task.

That’s just one of many complexities that he catalogs.

The 100 year gun control project revisited

After the Newtown massacre, I wrote about a 100 year gun control project. Here’s a bit of what I wrote:

I would suggest that those people lengthen their time frame. What if we came up with a plan to fundamentally change America’s gun culture over the next 100 years? There are policies that we could start pursuing today that would move us in that direction, and taking those steps beats giving up in every way.

The New York Times reports today that falling household gun ownership is already a long-term trend:

The household gun ownership rate has fallen from an average of 50 percent in the 1970s to 49 percent in the 1980s, 43 percent in the 1990s and 35 percent in the 2000s, according to the survey data, analyzed by The New York Times.

In 2012, the share of American households with guns was 34 percent, according to survey results released on Thursday. Researchers said the difference compared with 2010, when the rate was 32 percent, was not statistically significant.

Gun control advocates would do well to pursue cultural change rather than legal change. Rather than banning handguns or changing laws around concealed carry, groups should be working to stigmatize both. The idea that people should be responsible for defending themselves with firearms in public places should rightly be considered a fringe view in a civilized, urban society.

The design of the index for Facebook’s Graph Search

Under the Hood: Building out the infrastructure for Graph Search

Really interesting post on the design of the search index used by Facebook’s Graph Search feature. How challenging was it to build? Facebook started the project in 2009 and migrated all their other search systems to it before building Graph Search. All three of the engineers listed as working on the project were at Google prior to working at Facebook.

How hard is it to build a lyrics site?

Song lyrics sites are universally terrible: the markup is bad, there are ads everywhere, and the usability is a nightmare. Why? I understand that running such a site exposes you to legal risk, and I’ve always assumed that the outlaw foundation of such sites explains their awfulness. Even so, I’m wondering how difficult it would be to make a better attempt.

Any high quality site in this vein has to have some form of revenue, because it will attract a lot of traffic. The secret is to spend as little as possible on infrastructure. The other day, the NPR News Apps Blog had a post about building a high capacity, low cost site. That seems like a good starting point.
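If I were starting, I’d probably try baking every page out to static HTML and serving the files from cheap static hosting such as S3. A toy sketch of the baking step, with a made-up data source:

    <?php
    // Toy sketch: bake each song's lyrics page to a static HTML file so
    // the whole site can be served from S3 or similar. catalog.json and
    // its structure are made up for illustration.
    $songs = json_decode(file_get_contents('catalog.json'), true);

    foreach ($songs as $song) {
        $dir = 'build/' . urlencode($song['artist']) . '/' . urlencode($song['album']);
        if (!is_dir($dir)) {
            mkdir($dir, 0755, true);
        }
        $html = '<html><head><title>'
              . htmlspecialchars($song['artist'] . ' - ' . $song['title'])
              . '</title></head><body><pre>'
              . htmlspecialchars($song['lyrics'])
              . '</pre></body></html>';
        file_put_contents($dir . '/' . urlencode($song['title']) . '.html', $html);
    }
    // Sync the build/ directory to the static host and you're done.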

The other requirement is a big catalog of songs, organized by artist and album, and then the lyrics for all of them. I think building such a database with absolutely minimal human intervention would be fun.
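The core of that catalog is a simple hierarchy; a first cut at the schema, sketched here with SQLite through PDO just for illustration, might look like this:

    <?php
    // First cut at the catalog: artists have albums, albums have songs,
    // and each song carries its lyrics. SQLite via PDO for illustration.
    $db = new PDO('sqlite:lyrics.db');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $db->exec("CREATE TABLE IF NOT EXISTS artists (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    )");
    $db->exec("CREATE TABLE IF NOT EXISTS albums (
        id        INTEGER PRIMARY KEY,
        artist_id INTEGER NOT NULL REFERENCES artists(id),
        title     TEXT NOT NULL,
        year      INTEGER
    )");
    $db->exec("CREATE TABLE IF NOT EXISTS songs (
        id       INTEGER PRIMARY KEY,
        album_id INTEGER NOT NULL REFERENCES albums(id),
        track    INTEGER,
        title    TEXT NOT NULL,
        lyrics   TEXT
    )");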

Right now I’m just kicking this idea around. If I start working on it, I’ll post about my progress.

How developers use API documentation

Chris Parnin writes about problems with API documentation, as evidenced by developer migration to Stack Overflow. The whole thing is incredibly interesting, and points to a need for a major reconsideration of what makes for good documentation.
