Wednesday, May 11, 2016

Is Artificial Intelligence Racist?

Twenty years ago, Microsoft launched Clippy, the AI office assistant that became a lightning rod of scorn for the company. Not resting on its laurels, Microsoft decided 20 years later to demonstrate its AI prowess again, this time by releasing a "chatbot" capable of carrying on conversations.
Unfortunately, the chatbot began making racially inflammatory comments, to the point that Microsoft found itself in the somewhat familiar position of shutting an AI project down and apologizing for it.
But what went wrong? Is AI really racist? 
The answer is "it depends." Think of AI as the immature child prodigy who you know could easily pick the locks on your house or reprogram your car's on-board computer. You're somewhat nervous about the kid falling in with the wrong crowd and using his abilities to do bad things before he develops some sort of moral restraint.
Put another way, AI depends on data. To train a chatbot to have conversations, one very logical approach would be to build a massive database of conversations, compile a bunch of questions and statements, and then catalog the most common responses. So if the chatbot reads a lot of Arnold Schwarzenegger movie scripts, it will likely be predisposed to violent tendencies and to telling people "I'll be back." And if the database of conversations contains a lot of dialogue that could be considered racist, then the chatbot's responses will be racist as well.
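To make that concrete, here's a minimal sketch in Python of the "catalog the most common response" idea. The toy corpus is entirely invented, but it shows how the bot's personality is nothing more than a reflection of whatever data it was fed:

```python
from collections import Counter, defaultdict

# Toy corpus of (prompt, response) pairs -- entirely invented. In a real
# system this would be a massive database of scraped conversations.
conversations = [
    ("hello", "hi there"),
    ("hello", "hi there"),
    ("hello", "go away"),
    ("how are you", "i'll be back"),
    ("how are you", "fine, thanks"),
    ("how are you", "i'll be back"),
]

# Catalog the most common response to each prompt.
catalog = defaultdict(Counter)
for prompt, response in conversations:
    catalog[prompt][response] += 1

def reply(prompt):
    """Answer with the most frequent response seen for this prompt."""
    if prompt not in catalog:
        return "i don't understand"
    return catalog[prompt].most_common(1)[0][0]

print(reply("hello"))        # "hi there" -- the majority response wins
print(reply("how are you"))  # "i'll be back" -- blame the training data
```

Swap in a corpus full of racist dialogue and reply() will faithfully serve it back; the lookup logic never changes.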
Along similar lines, many large data sets depend on smartphones or devices like the Apple Watch. It's not hard to imagine that such devices are more likely to be owned by wealthy people. Data sets built with such technology can therefore skew toward the wealthy, and AI built on top of those data sets can in turn "discriminate" against the less fortunate in a society.
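A quick, admittedly contrived simulation makes the point. Assume, purely for illustration, that the likelihood of owning the data-collecting device rises with income:

```python
import random

random.seed(42)

# Invented population: 100,000 incomes from a skewed distribution.
population = [random.lognormvariate(10, 1) for _ in range(100_000)]

# Assumption for illustration only: the richer you are, the more likely
# you are to own the device that's collecting the data.
richest = max(population)
device_sample = [x for x in population if random.random() < x / richest]

avg = lambda xs: sum(xs) / len(xs)
print(f"true average income:   {avg(population):>10,.0f}")
print(f"device-sample average: {avg(device_sample):>10,.0f}")
# The device-based sample reports a much wealthier "world" than the
# real one, and any model trained on it inherits that skew.
```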
So is AI racist? Yes, no, and maybe so.

Monday, May 9, 2016

AI "Accurately" Predicts Your Death

If you're looking for an incredible example of journalists over-hyping the capabilities of AI (sometimes referred to by its supposed synonym, "Big Data"), you need look no further than this article talking about AI predicting death "accurately."
You can rest assured that a bunch of geeks in a lab somewhere haven’t used a computer to crack the code of time and space. And no, it can’t tell you with certainty whether or not you’re going to die next Tuesday at 3:52pm local time. Dissecting this one takes some critical thinking and it helps if you have some experience with predictive analytics. 
First of all, the writer and editor are having a bit of fun here by being somewhat loose with the definition of the word "accurately." Webster defines accurate as "free from error or defect; consistent with a standard, rule, or model; precise; exact." The definition is vague because the same word can be used to mean either "perfect" or "right more than half the time." When you deal with predictive analytics at large scale, the results are almost never perfect. Hence, the article may be guilty of using "accurate" to mean "right more than half the time" while gleaning disproportionate amounts of attention and web traffic from people who read "accurate" to mean nearly "perfect."
A second convenient detail about the article is the nature of the prediction. Given enough time, there's a 100% probability that everyone alive today will die. So if I build a model to predict death, I can cheat and always predict “Yes, you will die” and I’ll be right… eventually.
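Here's that cheat in a few lines of Python, using a made-up cohort where 5% of people die within the study window:

```python
# A made-up cohort where 5% of people die within the study window.
cohort = [True] * 5 + [False] * 95   # True = died within the window

def always_yes(person):
    return True  # the cheat: predict death for everyone

# "Will you die eventually?" -- always_yes scores a perfect 100%.
# "Will you die within the window?" -- the very same model scores 5%:
hits = sum(1 for died in cohort if always_yes(None) == died)
print(f"accuracy within the window: {hits / len(cohort):.0%}")  # 5%
```

The model never changed; only the wording of the question did.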
Once the hype is peeled back a bit, you can look at what the article is really saying and make a strong case that it's really nothing new. Actuaries (a.k.a. "data scientists") have known for decades (centuries?) that while you can't predict specifically what will happen with any one person, you can make reasonable assumptions about a group of people. It's effectively the law of large numbers at work, a close cousin of the Central Limit Theorem and one of the core principles underpinning modern statistics. In fact, the math has long been accurate enough to underpin some very valuable companies that sell life insurance.
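A quick simulation shows the idea, using an assumed (and invented) 1% annual death rate:

```python
import random

random.seed(1)
p = 0.01  # assumed annual death rate for this invented cohort

# One person is a 1-in-100 coin flip; a big group is remarkably stable.
for size in (10, 1_000, 1_000_000):
    deaths = sum(random.random() < p for _ in range(size))
    print(f"group of {size:>9,}: observed rate {deaths / size:.4f} (expected {p})")
```

No individual outcome is predictable, but the group rate gets steadier as the group grows, which is exactly the bet a life insurer makes.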
Mark Twain famously popularized the line that "there are three kinds of lies: lies, damned lies, and statistics." This article feels like yet another example of how a slight tweaking of definitions and statistical models can go a long way toward distorting reality.

Wednesday, May 4, 2016

When Machines Fail

For all the promise of technology making our lives easier, and all the threats of machines taking away our jobs, machines aren't perfect at either function. One thing people are good at that machines... aren't good at... is knowing what to do when something goes wrong.
Whatever the application, whether a function is performed by a machine or a human, something will go wrong. Period. Full Stop.
The next question is who/what will be there to pick up the pieces in a way acceptable to all parties involved. To my knowledge, nobody’s even tried to automate that one yet.

Monday, May 2, 2016

What Machines Are Good At

I had a college professor who frequently quipped that "a computer is a very fast moron.” I don’t think he originated the quote, but I do think it is very instructive.
When you hear a computer CPU described as "1.8 GHz," what that really means is that the processor is (theoretically) capable of doing a simple math operation 1.8 billion times every second. A dual-core processor can do 3.6 billion operations in a second, and a quad-core processor can do 7.2 billion. Suffice it to say that computers do math a lot faster than people do.
When you think about it, a lot of things in life can be modeled as math problems. The morning drive to the office, a game of chess, the trajectory of a missile, preparing dinner, and even large parts of human language can all be modeled reasonably accurately as math problems. Given enough time and money spent defining the rules of a problem, many things can be done more consistently, and better, by a computer than by a human.
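As a toy example, here's the morning commute treated as a math problem: intersections become graph nodes, drive times become edge weights (the streets and minutes here are invented), and the "best route" falls out of Dijkstra's shortest-path algorithm:

```python
import heapq

drive_times = {  # minutes between intersections (hypothetical)
    "home":    {"1st Ave": 4, "Main St": 7},
    "1st Ave": {"Main St": 2, "office": 11},
    "Main St": {"office": 5},
    "office":  {},
}

def fastest_route(graph, start, goal):
    """Dijkstra's algorithm: the computer just grinds out arithmetic, fast."""
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, minutes in graph[node].items():
            heapq.heappush(queue, (cost + minutes, nxt, path + [nxt]))
    return None

print(fastest_route(drive_times, "home", "office"))
# (11, ['home', '1st Ave', 'Main St', 'office']) -- 4+2+5 beats 4+11 and 7+5
```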
Another advantage computers have is that they are amazing at learning in parallel. If a human wants to learn something, he or she faces the fundamental limitation of having only 24 hours in a day, minus eating and sleeping. A machine, on the other hand, can divide and conquer. If a team decides one day that it wants to do something crazy like organize the world's information and make the best of human knowledge available to anyone instantaneously, it can spin up a few million computers and digest all of the world's information a few times a day to account for the continually growing and changing nature of the global conversation.
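Here's a miniature sketch of that divide-and-conquer pattern in Python: split a pile of documents across worker processes and merge the partial results. The three "documents" are stand-ins for a web-scale corpus:

```python
from collections import Counter
from multiprocessing import Pool

documents = [
    "the world's information keeps growing",
    "the global conversation keeps changing",
    "computers digest the world's information in parallel",
]

def count_words(doc):
    """Each worker handles its own slice of the corpus independently."""
    return Counter(doc.split())

if __name__ == "__main__":
    with Pool() as pool:
        partial_counts = pool.map(count_words, documents)
    # Merging the partial counts gives the same answer as one slow pass.
    total = sum(partial_counts, Counter())
    print(total.most_common(3))
```

The same pattern scales from three documents on one laptop to billions of pages on millions of machines; only the size of the pool changes.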
In short, computers are very good at an ever-increasing number of tasks that can be broken down either into data sets of things/events that have come before or a set of pre-defined rules. That said, computers haven’t learned how to define the rules of the game yet.