rc3.org

Strong opinions, weakly held

Month: June 2013

persistent.info: Getting ALL your data out of Google Reader

Google Reader dies this weekend. Mihai Parparita, a former member of the Google Reader team, has created a tool to extract all your data from it. You should definitely look into the alternatives. I am using NewsBlur.

The human body is a tool for experts

Lately I’ve been thinking about exercise, mainly because I’ve been experimenting with Crossfit. Too much has been written about the good and bad of Crossfit, and this isn’t one of those posts. What I’m interested in is some of the exercises people do at Crossfit, and how they make me think about how we use our bodies to do work.

Crossfit emphasizes Olympic weight lifts: the clean and jerk and the snatch. They’re both techniques for getting weight from the ground to over your head. With the clean and jerk you do it in two movements, while the snatch involves one continuous movement. Both require a large amount of skill, and even people who have been doing them for a while tend to be pretty terrible at them.

All but the most skilled can lift more weight using simpler approaches. What’s interesting, though, is that if you can perform these lifts well, you’ll be able to lift more weight than an equally strong person could using other techniques. Mastery enables you to make the most of your own physical potential.

Olympic weightlifting teaches you to be an expert in using a particular tool (your body) for a specific task (getting some weight from the ground to over your head). It’s a pure example of a case where doing things the hard way takes a person further than easier paths can.

I find this really motivating. Currently I am at the point where I feel my weakest when I’m trying to do the proper Olympic lifts. The simpler the approach, the more effective I am. I am intrigued, though, by the idea of learning how to use my own body like an expert.

The spooks and the social media titans and the online commerce goliaths are collaborating to improve data-crunching software tools that enable the tracking of our behavior in fantastically intimate ways that simply weren’t possible as recently as four or five years ago. It’s a new military industrial open source Big Data complex. The gift economy has delivered us the surveillance state.

Andrew Leonard writes about the interaction of open source, private industry, and government intelligence agencies in Netflix, Facebook — and the NSA: They’re all in it together. I think the piece is perhaps a bit too negative; I’ll try to follow up on that later.

How management and teaching are alike

Ta-Nehisi Coates writes about his experience of teaching a writing class at MIT this semester. Here he is talking about motivating students:

I didn’t have to work hard to motivate people. What I found was that if I showed up, and I was excited, they fed off of that, and they got excited. I came to feel that teaching was performance. My job was to communicate my own energy and belief in the importance of the work.

As a manager, this is what I try to bring to my work. Hopefully I’m successful.

I’m going to write more about this some other time, but one thing I’ve come to realize is that there’s nothing I appreciate more than a committed performance, whether it’s from a singer, an actor, a teacher, or a colleague. Commitment makes up for almost any other deficiency.

The whistleblower’s name is Edward Snowden

Today The Guardian prints the identity of the PRISM whistleblower, at his request:

The individual responsible for one of the most significant leaks in US political history is Edward Snowden, a 29-year-old former technical assistant for the CIA and current employee of the defence contractor Booz Allen Hamilton. Snowden has been working at the National Security Agency for the last four years as an employee of various outside contractors, including Booz Allen and Dell.

For those of us who believe that the protection of whistleblowers is an essential component of a functioning democracy, the question before us is how to effectively protest when he is inevitably arrested and tried. Web site defacements by Anonymous aren’t going to keep him out of jail.

You’ll also read a lot of demands that President Obama pardon Snowden, but I think that’s highly unlikely. The Obama Justice Department has aggressively prosecuted whistleblowers, and the pardon power is generally applied through a bureaucratic process handled by the Justice Department. I’m not optimistic about prosecutorial discretion saving the day, either.

In any case, keeping Snowden’s name in the news as much as possible will be important.

See also: Bruce Schneier on whistleblowers.

Analysts and their instruments

As I’ve mentioned previously, currently I’m working in the realm of Web analytics. I don’t have a deep statistics background, and I’m definitely not what anyone would mistake for a data scientist, but I do have a good understanding of how analytics can be applied to business problems.

I gained most of that understanding by way of being a baseball fan. I was hanging out with baseball nerds on the Internet talking about baseball analytics long before Moneyball was a twinkle in Michael Lewis’ eye.

Around the time most baseball teams started hiring their own analysts, I assumed that baseball analytics was a solved problem. Given all of the money at stake and all of the eyes on the problem, new analytic insights would be less common. That has turned out not to be the case, for interesting reasons.

The aspect of baseball that makes it the perfect subject for statistical analysis is that every game is a series of discrete, recordable events that can be aggregated at any number of levels. At the top, you have the score of the game. Below that, there’s the box score, which shows how each batter and pitcher performed in the game as a whole. From there, you go to the scorecard, which is used to record the result of every play in a game, in sequence. Most of the early groundbreaking research into baseball was conducted at this level of granularity.
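To make those levels of aggregation concrete, here is a minimal Python sketch. The play records, the field names, and the `box_score` and `final_score` helpers are all made up for illustration; a real scorecard captures far more detail than this.

```python
from collections import defaultdict

# Hypothetical play-by-play records, one per plate appearance.
# The field names are assumptions for this sketch, not a real data format.
PLAYS = [
    {"team": "home", "batter": "Smith", "result": "single", "runs": 0},
    {"team": "home", "batter": "Jones", "result": "home_run", "runs": 2},
    {"team": "away", "batter": "Lee", "result": "strikeout", "runs": 0},
    {"team": "away", "batter": "Diaz", "result": "home_run", "runs": 1},
    {"team": "home", "batter": "Smith", "result": "strikeout", "runs": 0},
]

HITS = {"single", "double", "triple", "home_run"}


def box_score(plays):
    """Roll plays up to one line per batter: plate appearances, hits, runs."""
    lines = defaultdict(lambda: {"pa": 0, "hits": 0, "runs": 0})
    for play in plays:
        line = lines[play["batter"]]
        line["pa"] += 1
        line["hits"] += 1 if play["result"] in HITS else 0
        line["runs"] += play["runs"]  # runs that scored on this play
    return dict(lines)


def final_score(plays):
    """Roll the same plays up one more level, to runs per team."""
    totals = defaultdict(int)
    for play in plays:
        totals[play["team"]] += play["runs"]
    return dict(totals)


if __name__ == "__main__":
    print(final_score(PLAYS))
    for batter, line in box_score(PLAYS).items():
        print(batter, line)
```

The same list of plays feeds both functions, which is the point: the finer-grained record can always be rolled up into the coarser one, but not the other way around.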

What happened in baseball is that the instrumentation got a lot better, and the new data created the opportunity for new insights. For example, pitch-by-pitch records from every game became available, enabling a number of interesting new findings.

Now baseball analytics is being fed by superior physical observation of games. To go back in time, one of the greatest breakthroughs in baseball instrumentation was the radar gun, which enabled scouts to measure the velocity of pitches. That enabled analysts to determine how pitch velocity affects the success of a pitcher, and to more accurately value pitching prospects.

More recently, a new system called PITCHf/x has been installed at every major league ball park. It measures the speed and movement of pitches, as well as where, exactly, they cross the strike zone. With it, you can measure how well umpires perform, as well as how good a pitcher’s various pitches really are. You can also measure how well batters can distinguish between balls and strikes and whether they’re swinging at the wrong pitches. This data enabled the New York Times to create the visualization in How Mariano Rivera Dominates Hitters back in 2010.
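As a rough illustration of the umpire measurement, here is a small Python sketch that scores ball and strike calls against pitch locations. The `CalledPitch` record, its field names, and the sample values are assumptions invented for this example rather than the actual PITCHf/x format, and the zone check is simplified: it ignores the width of the ball and treats the zone as a flat rectangle at the front of the plate.

```python
from dataclasses import dataclass

# Home plate is 17 inches wide; locations below are in feet.
PLATE_HALF_WIDTH_FT = 17 / 12 / 2


@dataclass
class CalledPitch:
    """A pitch the batter did not swing at (hypothetical record layout)."""
    x_ft: float            # horizontal location at the plate, 0 = center
    z_ft: float            # height at the plate
    zone_bottom_ft: float  # bottom of this batter's strike zone
    zone_top_ft: float     # top of this batter's strike zone
    call: str              # "strike" or "ball", as called by the umpire


def in_zone(pitch):
    """True if the pitch location falls inside the simplified zone."""
    inside_width = abs(pitch.x_ft) <= PLATE_HALF_WIDTH_FT
    inside_height = pitch.zone_bottom_ft <= pitch.z_ft <= pitch.zone_top_ft
    return inside_width and inside_height


def umpire_accuracy(pitches):
    """Fraction of calls that agree with the measured location."""
    if not pitches:
        raise ValueError("need at least one called pitch")
    correct = sum((p.call == "strike") == in_zone(p) for p in pitches)
    return correct / len(pitches)


# Made-up sample: two calls agree with the location, two do not.
SAMPLE = [
    CalledPitch(0.0, 2.5, 1.5, 3.5, "strike"),   # in the zone, called strike
    CalledPitch(1.2, 2.5, 1.5, 3.5, "strike"),   # off the plate, called strike
    CalledPitch(0.3, 4.0, 1.5, 3.5, "ball"),     # above the zone, called ball
    CalledPitch(-0.5, 2.0, 1.5, 3.5, "ball"),    # in the zone, called ball
]

if __name__ == "__main__":
    print(umpire_accuracy(SAMPLE))
```

The same location test, applied to pitches a batter swung at instead of pitches that were called, would give a crude measure of how often he chases pitches outside the zone.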

If you’re working on analytics and you find it’s difficult to glean new insights, it may be time to see if you can add further instrumentation. More granular data will always provide the opportunity for deeper analysis.

One explanation of the hype behind Big Data

Cam Davidson-Pilon talks about 21st Century Problems. Here’s how he describes most of the great technological leaps of the 20th century:

What these technologies have in common is that are all deterministic engineering solutions. By that, I mean they have been created by techniques in mathematics, physics and engineering: often being modeled in a mathematical language, guided by physics’ calculus and constrained and brought to life by engineering. I argue that these types of problems, of modeling deterministically, are problems that our fathers had the luxury of solving.

And here’s the truth behind the hype about Big Data we see so much of these days:

Statistical problems describe the space we haven’t explored yet. Statistical problems are not new: they are likely as old as deterministic problems. What is new is our ability to solve them. Spear-headed by the (constantly increasing) tidal wave of data, practitioners are able to solve new problems otherwise thought impossible.
