rc3.org

Strong opinions, weakly held

Author: Rafe

Don’t change sshd’s port

From Arabesque, my favorite blog for Unix geeks. I always change the sshd port, so I’m delighted to read a sound argument against doing so.

Camille Fournier on writing software for humans

I really liked this post by Camille Fournier, who runs the engineering team for Rent the Runway. When confronted with the problem of giving customers the confidence that the garment they rent will fit properly when it arrives, engineers tend to turn to solutions that involve 3D modeling and “virtual fit assistants.” Unfortunately, real humans are put off by these approaches. The solution they arrived at is much lower tech, but much better for customers. There are two takeaways, I think. The first is that diversity of all kinds on a team is valuable because it leads to a wider variety of proposed solutions to problems. The second thing is that this kind of problem really proves the value of experimentation as a product development approach. Try things and measure the results. You’ll probably wind up being surprised.

How will society adjust to ever-easier data collection?

The New York Times ran two opinion pieces this weekend, right next to each other, that both stand at the intersection of how government and politics work and the social change that results from technological change. In the first, Joe Nocera argues that the big question in the resignation of David Petraeus is whether we’re comfortable with the FBI snooping through our email on relatively flimsy grounds:

But the Petraeus scandal could well end up teaching some very different lessons. If the most admired military man in a generation can have his e-mail hacked by F.B.I. agents, then none of us are safe from the post-9/11 surveillance machine. And if an affair is all it takes to force such a man from office, then we truly have lost all sense of proportion.

The second was about what increased use of data in political campaigns means long-term. As I’ve mentioned, I’ve been working in the analytics world this year, so this topic is highly relevant to me. It’s also very complicated. On one hand, improving our ability to collect and analyze data enables us to better understand what people want and expect from our products, or, in the case of campaigns, our politicians. On the other hand, combining our more advanced understanding of human behavior with deeper data sets creates the opportunity for more effective manipulation in addition to more effective communication.

While the people creating big data tools may not be evil, the organizations that use them going forward may not hold to the same principles. The big issue in both the Petraeus case and in the use of big data by campaigns is that, regardless of our level of comfort with the government, campaigns, or companies knowing so much about us, we don’t really have control over the gathering of that information.

Dalton Caldwell on the near future of Twitter

Twitter is pivoting

Dalton Caldwell looks at some recent Twitter moves and tries to predict the company’s upcoming strategy. Here’s what he argues it’s about:

The Discover tab is the future. Rather than forcing normal users to make sense of a realtime stream, they can see what content is trending.

Here’s what I don’t get. You can facilitate the mode of usage that Twitter may envision for everyday users without hurting the power users that have made it what it is. How did celebrities and “brands” figure out how to engage on Twitter? By watching the pioneering users of the service build a following. And many of those users have become celebrities in their own right in the context of Twitter. If Twitter put me on their board (and they should), that’s the advice I’d give them. Passive users may contribute most of the revenue, but the power users contribute most of the energy.

Why programmers should study math

One thing I’ve come to appreciate in the past year is the degree to which a solid math education can benefit a software developer. Google software engineer Javier Tordable surveys the math behind a number of Google products in his presentation Mathematics at Google. Inspirational.

OWS is buying bad debt and forgiving it

The People’s Bailout

Occupy Wall Street is raising money to buy debt that’s in collections for pennies on the dollar and then forgive it. Incredible example of hacking the system for positive change.

What we learned last night

The main thing we learned last night from the massive success of the poll aggregators that I wrote about before the election is that the polls do accurately reflect the variables that have traditionally been thought of as beyond polling. The Republicans launched a massive legislative voter suppression effort that probably affected the results. The Obama campaign put together what was probably the greatest get out the vote effort in history. What we learned is that their impact was factored into the polls. Even the naive model used by electoral-vote.com did pretty well (their Rasmussen-free map did as well as Nate Silver). Forecasting the election by aggregating state polls is a winning strategy, at least for the time being.

Update: Here’s a list of the individual polling firms that most accurately predicted last night’s results. Good polling is critical, and this year’s polling was very good (as proven by electoral-vote.com), but the main takeaway is that there’s almost no point in looking at individual poll results when you can aggregate all of them.
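To make the idea of aggregating state polls concrete, here is a minimal sketch of a naive aggregator: average the margin in each state, then award that state's electoral votes to whoever leads the average. This is not the actual model used by electoral-vote.com, Nate Silver, or anyone else. The poll numbers are made up for illustration, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up poll readings: (state, candidate A share, candidate B share).
# These are illustrative numbers, not real 2012 polls.
POLLS = [
    ("Ohio", 50.0, 47.0),
    ("Ohio", 48.0, 49.0),
    ("Ohio", 51.0, 46.0),
    ("Florida", 48.0, 49.0),
    ("Florida", 49.0, 49.5),
]

# Electoral votes for the two states as of the 2012 election.
ELECTORAL_VOTES = {"Ohio": 18, "Florida": 29}


def average_margins(polls):
    """Return the mean A-minus-B margin for each state."""
    margins = defaultdict(list)
    for state, a_share, b_share in polls:
        margins[state].append(a_share - b_share)
    return {state: mean(values) for state, values in margins.items()}


def tally(margins, electoral_votes):
    """Award each state's electoral votes to whoever leads the average."""
    totals = {"A": 0, "B": 0, "tied": 0}
    for state, margin in margins.items():
        if margin > 0:
            winner = "A"
        elif margin < 0:
            winner = "B"
        else:
            winner = "tied"
        totals[winner] += electoral_votes[state]
    return totals


if __name__ == "__main__":
    state_margins = average_margins(POLLS)
    print(state_margins)
    print(tally(state_margins, ELECTORAL_VOTES))
```

A single outlier poll (the second Ohio reading above) does not flip the state once it is averaged with the others, which is the whole appeal of aggregation. Real aggregators go further, for example by weighting polls by recency or sample size, but this sketch does not attempt any of that.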

Broad agreement among electoral vote models

Canadian writer Colby Cosh takes a more advanced whack at Nate Silver today, arguing that it’s foolish to equate defending Nate Silver and defending science. I agree. He also argues that Nate Silver’s actual analytical skills are likely overrated, going back to his days as a baseball analyst. For more on that, check out the comment thread on this post at Baseball Think Factory.

This comment from that thread gets pretty close to the truth:

This piece does a good job of arguing that Silver’s baseball projections, like his political projections, aren’t notably better than the projections put together by other smart folks in the field. In 2008 and 2010, Silver’s projections did fine, but not notably better than other folks in the field. This seems like a good and important point – Silver isn’t a “wizard”, he’s a good writer with a good model that spits out results of a quality similar to the models of other folks who aren’t as good at writing.

Indeed, what we see is that Silver’s projections are broadly in line with what most people who have built statistical models of the likely results see. Here’s a summary of predictions from a variety of poll aggregators, all of whom use different models:

If you’re interested in how the aggregators differ, check out this post from the Princeton Election Consortium.

If you want to see a fuller list of predictions, Ezra Klein also has his own pundit scoreboard.

People who are wrong about data analysis

The first order of people who don’t get data analysis are those who believe it’s impossible to make accurate predictions based on data models. They’ve been much discussed all week in light of the controversy over Nate Silver’s predictions about the Presidential campaign. If you want to catch up on this topic, Jay Rosen has a useful roundup of links.

There are, however, a number of other mistaken ideas about how data analysis works that are also problematic. For example, professional blowhard Henry Blodget argues in favor of using data-driven approaches, but then says the following:

If Romney wins, however, Silver’s reputation will go “poof.” And that’s the way it should be.

I agree that if Silver’s model turns out to be a poor predictor of the actual results, his reputation will take a major hit; that’s inevitable. However, Blodget puts himself on the same side as the Italian court that sent six Italian scientists to jail for their inaccurate earthquake forecast.

If Silver’s model fails in 2012, he’ll revisit it and create a new model that better fits the newly available data. That’s what forecasting is. Models can be judged on the performance of a single forecast, but analysts should be judged on how effectively they adapt their models to account for new data.

Another post that I felt missed the point was Natalia Cecire arguing that attempting to predict the winner of the election by whatever means is a childish waste of time:

A Nieman Lab defense of Silver by Jonathan Stray celebrates that “FiveThirtyEight has set a new standard for horse race coverage” of elections. That this can be represented as an unqualified good speaks to the power of puerility in the present epistemological culture. But we oughtn’t consider better horse race coverage the ultimate aim of knowledge; somehow we have inadvertently landed ourselves back in the world of sports. An election is not, in the end, a game. Coverage should not be reducible to who will win? Here are some other questions: How will the next administration govern? How will the election affect my reproductive health? When will women see equal representation in Congress? How will the U.S. extricate itself from permanent war, or will it even try? These are questions with real ethical resonance. FiveThirtyEight knows better than to try to answer with statistics. But we should still ask them, and try to answer them too.

I, of course, agree with her that these are the important questions about the election. When people decide who to vote for, it should be based on these criteria, and the press should be focused on getting accurate and detailed answers to these questions from the candidates.

The fact remains, however, that much of the coverage is still focused on the horse race. Furthermore, much of the horse race coverage is focused on topics that do not seem to matter when it comes to predicting who will win the election. This is where data-driven analysis can potentially save us.

If it can be shown that silly gaffes don’t affect the ultimate result of the election, there may be some hope that the press will stop fixating on them. One of the greatest benefits of data analysis is that it creates the opportunity to end pointless speculation about things that can in fact be accurately measured, and more importantly, to measure more things. That creates the opportunity to focus on matters of greater importance or of less certainty.

David Roher on Nate Silver

Nate Silver’s Braying Idiot Detractors Show That Being Ignorant About Politics Is Like Being Ignorant About Sports

David Roher at Deadspin talks about Silver’s background as a baseball analyst and the critics that cannot accept that statistical modeling might provide more accurate predictions than the anecdote-driven analysis of experts.


© 2024 rc3.org
