Sentiment and Sentimentality

With a wink at Jane Austen, the world seems to have grown rather enamored with Artificial Intelligence recently. That’s not to say that AI hasn’t been a pop-culture archetype for as long as I can remember. In terms of actual software and solutions, however, the ebb and flow of AI is something like a Jane Austen novel played out over the past 50 years. And per Ms. Austen, “If things are going untowardly one month, they are sure to mend the next.” Recently, though, things have jumped up a notch and a discernible phase shift has occurred. Probably due to a Moore’s Law tipping point coupled with unicorn-hunting packs of investors, the widespread commercialization of good old-fashioned AI (GOFAI), turbocharged with GPUs and ASICs, is now very real. But a neural network is not always the best, or only, choice.

Take, for example, a recent blog entry from OpenAI, the Unsupervised Sentiment Neuron. The neuron in question, though, is not your standard neural network variety, but something far simpler. Detecting sentiment in Amazon reviews, with no sentiment labels at all, turned out to hinge on a single ‘neuron’: a model trained only to predict the next character in each review learned an interpretable feature along the way, and a simple linear model fit on top of that learned representation revealed that one unit had, in effect, discovered the concept of sentiment. Per the researchers:

“The sentiment neuron within our model can classify reviews as negative or positive, even though the model is trained only to predict the next character in the text.”
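To make the recipe concrete, here is a minimal sketch of the idea as I read it. This is not OpenAI’s actual code: their character-level model was far larger and trained on a huge corpus of real Amazon reviews, whereas the toy corpus, model size, and hyperparameters below are invented purely for illustration. A small character-level LSTM is trained only to predict the next character; an L1-regularized logistic regression probe is then fit on its hidden state, and the largest-magnitude coefficient points at the most “sentiment-like” unit.

```python
# Minimal sketch (illustration only, not OpenAI's implementation):
# 1) train a tiny character-level LSTM on next-character prediction,
# 2) probe its hidden state with an L1 logistic regression,
# 3) look for the single most sentiment-predictive hidden unit.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

# Toy labeled reviews, invented for this sketch (1 = positive, 0 = negative).
reviews = [("i love this product, works great", 1),
           ("absolutely wonderful, five stars", 1),
           ("terrible quality, waste of money", 0),
           ("broke after one day, very disappointed", 0)]

chars = sorted({c for text, _ in reviews for c in text})
stoi = {c: i for i, c in enumerate(chars)}

class CharLM(nn.Module):
    """Next-character predictor; its hidden state is the learned representation."""
    def __init__(self, vocab, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, 16)
        self.lstm = nn.LSTM(16, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return self.head(h), h

model = CharLM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Phase 1: the only training signal is next-character prediction --
# the sentiment labels are never shown to the language model.
for _ in range(200):
    for text, _ in reviews:
        ids = torch.tensor([[stoi[c] for c in text]])
        logits, _ = model(ids[:, :-1])
        loss = loss_fn(logits.squeeze(0), ids[0, 1:])
        opt.zero_grad()
        loss.backward()
        opt.step()

# Phase 2: fit an L1-penalized logistic regression on the final hidden state.
with torch.no_grad():
    feats = [model(torch.tensor([[stoi[c] for c in text]]))[1][0, -1].numpy()
             for text, _ in reviews]
labels = [y for _, y in reviews]
probe = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(feats, labels)

# The L1 penalty tends to concentrate weight on a few units; the
# largest-magnitude coefficient points at the "sentiment neuron."
top_unit = abs(probe.coef_[0]).argmax()
print("most sentiment-predictive hidden unit:", top_unit)
```

With four made-up reviews this will not discover anything profound, of course; the point is only to show the shape of the trick, a purely predictive character model probed afterward with the simplest possible classifier.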

Classifying sentiment accurately from short bursts of text using a next-character prediction engine is pretty awesome. The thing is, the sentiment use case was never a target, never pursued, and never predicted when the next-character work began. It was one of those sweet side effects that often happen when we pursue the unknown. Like the microwave oven, whose underlying technology was discovered while researching radar gear. Or penicillin, quite accidentally discovered thanks to a sink full of dirty dishes. Analogies notwithstanding, the emergence of an unsupervised sentiment classifier from the training and execution of a supervised next-character model is very cool. And beyond the cool factor, it clearly invites new research into linguistics and information theory.

The epic fail of prediction systems the world witnessed on Election Day in the USA in 2016 may have discredited data science to some degree, or at least given us pause when it comes to machine learning, prediction engines, and the stock we place in such innovations. But old-school methods, like the next-character predictor, may yet hold surprises for us. I am a huge fan of software, and especially of AI systems. But our sentimentality for more mature pursuits like NLP may yet be rewarded.