Occupy Wall Street is raising money to buy debt that’s in collections for pennies on the dollar and then forgive it. Incredible example of hacking the system for positive change.
The main thing we learned last night from the massive success of the poll aggregators I wrote about before the election is that polls accurately reflect the variables that have traditionally been thought of as beyond polling. The Republicans launched a massive legislative voter suppression effort that probably affected the results. The Obama campaign put together what was probably the greatest get-out-the-vote effort in history. As it turns out, the impact of both was factored into the polls. Even the naive model used by electoral-vote.com did pretty well (their Rasmussen-free map did as well as Nate Silver’s). Forecasting the election by aggregating state polls is a winning strategy, at least for the time being.
Update: Here’s a list of the individual polling firms that most accurately predicted last night’s results. Good polling is critical, and this year’s polling was very good (as proven by electoral-vote.com), but the main takeaway is that there’s almost no point in looking at individual poll results when you can aggregate all of them.
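For the curious, here’s a minimal sketch of the aggregation idea, with invented margins. Real aggregators also weight polls by recency and house effects and model the remaining uncertainty, but the core of the approach is just averaging within each state:

```python
# Toy state-poll aggregation: average each state's polls and award the
# state's electoral votes to whichever candidate leads the average.
# All margins below are invented for illustration.

state_polls = {
    # state: (electoral votes, [candidate A's margin in recent polls])
    "OH": (18, [2.0, 3.0, 1.5, 2.5]),
    "FL": (29, [-0.5, 1.0, 0.5]),
    "VA": (13, [1.0, 2.0, -1.0]),
}

total_ev_for_a = 0
for state, (ev, margins) in state_polls.items():
    average = sum(margins) / len(margins)
    if average > 0:
        total_ev_for_a += ev
    print(f"{state}: average margin {average:+.2f} ({ev} EV)")

print(f"Candidate A projected to win {total_ev_for_a} EV from these states")
```

Even this naive version illustrates why the average beats any single poll: the individual errors partially cancel.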
Canadian writer Colby Cosh takes a more advanced whack at Nate Silver today, arguing that it’s foolish to equate defending Nate Silver with defending science. I agree. He also argues that Nate Silver’s actual analytical skills are likely overrated, going back to his days as a baseball analyst. For more on that, check out the comment thread on this post at Baseball Think Factory.
This comment from that thread gets pretty close to the truth:
This piece does a good job of arguing that Silver’s baseball projections, like his political projections, aren’t notably better than the projections put together by other smart folks in the field. In 2008 and 2010, Silver’s projections did fine, but not notably better than other folks in the field. This seems like a good and important point – Silver isn’t a “wizard”, he’s a good writer with a good model that spits out results of a quality similar to the models of other folks who aren’t as good at writing.
Indeed, Silver’s projections are broadly in line with those of most other people who have built statistical models of the likely results. Here’s a summary of predictions from a variety of poll aggregators, all of whom use different models:
If you’re interested in how the aggregators differ, check out this post from the Princeton Election Consortium.
If you want to see a fuller list of predictions, Ezra Klein also has his own pundit scoreboard.
The first order of people who don’t get data analysis consists of those who believe it’s impossible to make accurate predictions based on data models. They’ve been much discussed all week in light of the controversy over Nate Silver’s predictions about the Presidential campaign. If you want to catch up on this topic, Jay Rosen has a useful roundup of links.
There are, however, a number of other mistaken ideas about how data analysis works that are also problematic. For example, professional blowhard Henry Blodget argues in favor of using data-driven approaches, but then says the following:
If Romney wins, however, Silver’s reputation will go “poof.” And that’s the way it should be.
I agree that if Silver’s model turns out to be a poor predictor of the actual results, his reputation will take a major hit; that’s inevitable. However, Blodget puts himself on the same side as the Italian court that sent six scientists to jail for their inaccurate earthquake forecast.
If Silver’s model fails in 2012, he’ll revisit it and create a new model that better fits the newly available data. That’s what forecasting is. Models can be judged on the performance of a single forecast, but analysts should be judged on how effectively they adapt their models to account for new data.
Another post that I felt missed the point was Natalia Cecire arguing that attempting to predict the winner of the election by whatever means is a childish waste of time:
A Nieman Lab defense of Silver by Jonathan Stray celebrates that “FiveThirtyEight has set a new standard for horse race coverage” of elections. That this can be represented as an unqualified good speaks to the power of puerility in the present epistemological culture. But we oughtn’t consider better horse race coverage the ultimate aim of knowledge; somehow we have inadvertently landed ourselves back in the world of sports. An election is not, in the end, a game. Coverage should not be reducible to who will win? Here are some other questions: How will the next administration govern? How will the election affect my reproductive health? When will women see equal representation in Congress? How will the U.S. extricate itself from permanent war, or will it even try? These are questions with real ethical resonance. FiveThirtyEight knows better than to try to answer with statistics. But we should still ask them, and try to answer them too.
I, of course, agree with her that these are the important questions about the election. When people decide who to vote for, it should be based on these criteria, and the press should be focused on getting accurate and detailed answers to these questions from the candidates.
The fact remains, however, that much of the coverage is still focused on the horse race. Furthermore, much of that horse race coverage is focused on topics that do not seem to matter when it comes to predicting who will win the election. This is where data-driven analysis can potentially save us.
If it can be shown that silly gaffes don’t affect the ultimate result of the election, there may be some hope that the press will stop fixating on them. One of the greatest benefits of data analysis is that it creates the opportunity to end pointless speculation about things that can in fact be accurately measured, and more importantly, to measure more things. That creates the opportunity to focus on matters of greater importance or of less certainty.
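As a sketch of what “accurately measured” might look like, you could compare how much the poll average moves after gaffe-dominated news days versus ordinary days. All the numbers here are invented, and a real analysis would need a proper significance test and a careful definition of what counts as a gaffe:

```python
# Hypothetical check: does a poll average move more on the day after a
# "gaffe" than on other days? All numbers here are invented.

daily_average = [48.0, 48.2, 47.9, 48.1, 48.0, 47.8, 48.3, 48.1]
gaffe_days = {2, 5}  # indices of days dominated by gaffe coverage

# Day-over-day absolute movement in the poll average.
moves = [abs(daily_average[i + 1] - daily_average[i])
         for i in range(len(daily_average) - 1)]

after_gaffes = [moves[day] for day in gaffe_days]
other = [m for i, m in enumerate(moves) if i not in gaffe_days]

print(f"mean move after a gaffe: {sum(after_gaffes) / len(after_gaffes):.2f}")
print(f"mean move otherwise:     {sum(other) / len(other):.2f}")
```

If the two numbers come out about the same over a real campaign’s worth of data, that’s evidence the gaffe obsession is noise.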
David Roher at Deadspin talks about Silver’s background as a baseball analyst and the critics who cannot accept that statistical modeling might provide more accurate predictions than the anecdote-driven analysis of experts.
From Google Maps: Hurricane Sandy: NYC
This is what I was talking about the other day when I talked about developing a capability in mapping. If Apple is going to catch up with Google in this area at all, they need more than a mobile app.
You should read John Allspaw’s essay, On Being A Senior Engineer, even if you don’t aspire to be a senior engineer. Maybe you already think of yourself as a senior engineer. Maybe you work in some completely unrelated field. What it’s really about is becoming a mature professional: not only mastering the skills of your field, but also passing them on to colleagues in useful ways.
I recently read a post about the value provided by managers. I was not surprised to read that managers do add value, but I was a bit surprised at the means by which that value is added. I would have assumed that it was by keeping people happier, removing distractions that sap productivity, or helping to prioritize work. As it turns out, the actual value is in teaching people how to do their jobs better.
So as a manager, the best way to add value is to help people along their path to becoming senior engineers (if the people who report to you are engineers). As an engineer, you should be looking for opportunities to work for a manager who can teach you to be a better engineer. And perhaps most importantly, you don’t have to be a manager to help other people get better at their jobs, so you should treat helping other people improve as a key aspect of your job. This is one of the key points of John’s essay.
Anyway, you should be reading his post and not mine. It’s a road map to making the most of the incredible opportunity we have to work as engineers.
Tom Ryder’s Arabesque is one of my favorite developer blogs because it’s almost always about how to get more out of your tools. In this case, he talks about a Vim feature that I’m eager to master: registers (think multiple clipboards for Vim). A colleague who happens to be an Emacs user and I were talking about the massive productivity advantages to be gained by mastering Unix text editors. They’re power tools, built over the past two or three decades by and for people who live in their text editors all day. It’s tough for any old graphical editor to compete.
Amazon has gotten a lot of bad publicity today because it canceled the account of a customer named Linn and deleted all of the content on her Kindle after a fraud-detection algorithm linked her account to another account associated with fraudulent activity. Let’s look at what went wrong.
First, a lot of the coverage is focused on DRM. This is the risk of purchasing DRM-protected content. Amazon was able to revoke her access to material that she previously purchased because of the DRM. That’s bad. DRM is bad. Don’t buy books protected by DRM.
What interests me as a software engineer, though, is the fraud-detection part of the equation. Using algorithms to identify related accounts is pretty standard stuff. Amazon is closing fraud-related accounts, and then apparently running an algorithm that finds related accounts and closing them as well. The problem with any algorithm like this is that false positives are inevitable. Some number of accounts identified as being related will actually be unrelated.
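We don’t know anything about how Amazon’s actual system works, but a toy version (hypothetical attributes and accounts throughout) shows why false positives are baked into any heuristic of this kind:

```python
# Hypothetical account-linking heuristic: flag any account that shares a
# "strong" attribute with a known-fraudulent account. Shared attributes
# are evidence of a relationship, not proof, so some flags will be wrong.

fraud_accounts = [
    {"id": "A1", "card_hash": "c9x", "address": "12 Elm St, Apt 3"},
]

other_accounts = [
    {"id": "B7", "card_hash": "k2p", "address": "12 Elm St, Apt 5"},  # a neighbor
    {"id": "C4", "card_hash": "c9x", "address": "99 Oak Ave"},        # shared card
]

def related(a, b):
    # Deliberately crude, as any real heuristic will sometimes be:
    # a shared card or a same-street address counts as a link.
    same_card = a["card_hash"] == b["card_hash"]
    same_street = a["address"].split(",")[0] == b["address"].split(",")[0]
    return same_card or same_street

for account in other_accounts:
    if any(related(fraud, account) for fraud in fraud_accounts):
        print(f"{account['id']} flagged as related")
```

Here B7 gets flagged for living in the same building as a fraudster, which is exactly the kind of false positive a human should review before anyone’s account is closed.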
Given that this is a foreseeable outcome of any algorithm that performs this sort of categorization, Amazon’s business policies should account for it. For one thing, Amazon shouldn’t automatically suspend accounts based on the results of this check alone; doing so is incredibly hostile to customers. Furthermore, the responses from customer service reflect an absolute faith in an algorithm that is certain to be imperfect. That’s bad business.
If a business is going to use an algorithm-based approach to fraud problems like this, it has to understand the limitations of such a system. Ignoring those limitations leads to public relations disasters like the one Amazon encountered today.
Why programmers should study math
One thing I’ve come to appreciate in the past year is the degree to which a solid math education can benefit a software developer. Google software engineer Javier Tordable surveys the math behind a number of Google products in his presentation Mathematics at Google. Inspirational.
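To give a taste of the kind of math involved: PageRank, the algorithm behind Google’s original search ranking, is at heart an eigenvector computation, and you can approximate it with a few lines of power iteration. Here’s a toy version on a four-page web graph (illustrative only):

```python
# Toy PageRank via power iteration on a four-page web graph.
# PageRank is the dominant eigenvector of a modified link matrix;
# repeatedly applying the update below converges to it.

links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}  # page -> pages it links to
n = 4
damping = 0.85

ranks = [1.0 / n] * n
for _ in range(50):
    new_ranks = [(1 - damping) / n] * n
    for page, outlinks in links.items():
        share = damping * ranks[page] / len(outlinks)
        for target in outlinks:
            new_ranks[target] += share
    ranks = new_ranks

for page, rank in enumerate(ranks):
    print(f"page {page}: {rank:.3f}")
```

Linear algebra in a dozen lines, and it built a company. That’s the argument for studying math in a nutshell.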