Mining the Web for Feelings, Not Facts →
“Translating the slippery stuff of human language into binary values will always be an imperfect science, however. … The simplest algorithms work by scanning keywords to categorize a statement as positive or negative, based on a simple binary analysis (“love” is good, “hate” is bad). But that approach fails to capture the subtleties that bring human language to life: irony, sarcasm, slang and other idiomatic expressions. Reliable sentiment analysis requires parsing many linguistic shades of gray.”
Therein lies the crux of yesterday’s Times piece on quantifying users’ feelings on the Web. As algorithms become more sophisticated and companies more adept at using them to gauge public interest in given subjects, the 25- (er, 30-) year-old Internet ingenue will become even more valuable to this analytic process.
I do admit to fearing otherwise, however; I’ve had several conversations recently with other Internet-y types about the challenge of transitioning analog entities (print, institutions) onto the Web. People need to be an integral part of that process: the aforementioned early adopter should take precedence over data sets collected by marketing firms or the highly sophisticated algorithms that parse them. Investing in a small team of in-house Web enthusiasts who truly understand (and, moreover, identify with) the task at hand acts as a long-term insurance policy on the project; online communities are best understood over time, by those embedded within them.
Analytics are important; I use them, and most people I know do (obsessively, even). At its heart, however, crossing the digital divide is an act governed by psychology. To privilege a mathematical equation over a sociological study, to bank on an algorithm in the name of “efficiency,” is to overlook the very audiences we seek to connect with.
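For the curious, the keyword scan the Times describes can be sketched in a few lines of Python. This is only a rough illustration: the tiny word lists and the `keyword_sentiment` function are invented here, not taken from the article or from any real product.

```python
import re

# Hypothetical mini-lexicon, made up purely for illustration.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "awful", "bad"}


def keyword_sentiment(text: str) -> str:
    """Label text by counting positive and negative keywords."""
    # Lowercase the text and pull out runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Each positive keyword adds one point; each negative keyword subtracts one.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# A sincere statement and a sarcastic one get the same label.
print(keyword_sentiment("I love this phone"))
print(keyword_sentiment("Oh great, I just love waiting on hold for an hour"))
```

Both lines come back "positive," because the scan sees “great” and “love” and has no way to hear the tone. That gap is where a person who actually lives in the community earns their keep.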