Monday, November 9, 2015

Will We Meet Rosie From The Jetsons?

If you haven’t seen The Jetsons, it was a 1962 TV cartoon series depicting a middle-class family of the future. They drove their flying cars to their house in the sky, they ate food made automatically by their food-a-rac-a-cycle, and they had flat-screen TVs and made videoconference calls from their watches.

The Jetsons also had Rosie, the robotic maid who took care of house cleaning and other chores that weren’t already automated. Today, we have the iRobot Roomba, the robot that uses AI technologies to “map” a house and vacuum it, and robotics is on the verge of taking more big steps. Researchers have made robots that can learn to cook by watching YouTube videos, and drones that can deliver packages to your house in an hour.

Come with me if you will for a brief thought experiment. What if:
  • Amazon Prime Air got approved for deliveries by the FAA and started delivering a wide variety of foods and other perishable goods within an hour?
  • Cooking websites began providing not only recipes, but whole meal plans?
  • Cooking websites also expanded to pre-populate an Amazon.com/Walmart.com shopping cart with all the necessary ingredients for a given meal, which you could then have delivered via drone in an hour, ensuring that all the ingredients were as fresh as possible?
  • The cooking robot prepared the meal to specification every time with little or no human interaction?
  • After the meal, said robot was programmed to find a nice corner, fold itself up, and sit there until the next time it’s needed?
If Rosie became a reality, which (if any) of the following would be affected:

50 years ago, when The Jetsons first aired, the technologies it depicted probably felt like pipe dreams that couldn’t possibly come true. Today, they don’t feel quite so far-fetched.

Monday, November 2, 2015

Jobs That Machines Can Do

Artificial Intelligence has plenty of limitations. While I don’t foresee a day when people will be machines’ pets, I do think the jobs of tomorrow will be different from the ones that exist today.
Here’s an incomplete list of jobs that seem likely to be at least somewhat automated in the next few years.
Business owners who know how to customize, sell, and deploy AI/robots in a system to automate a task seem likely to be in high demand in the future. If business owners ever get automated, theirs will be one of the very last jobs to go before humans do indeed become robots’ pets.

Monday, October 26, 2015

Training My Artificial Replacement


I’ve been programming for 20+ years. I’ve made a career out of training my (technological) replacements to do my job for me, and lately I’ve read a lot of articles talking about AI/machines taking jobs from people.
Perhaps the biggest example of automation that I’ve seen is Amazon Web Services. IT projects that 10 years ago would have taken 50 people 6 months and millions of dollars of infrastructure can today be completed in minutes by one person feeling frisky enough to spend next Tuesday’s lunch money.
Yes, jobs will continue to be automated - continuing the 500+ year trend dating back before the printing press. The arrival of AI and its synonyms makes the current trend more interesting, but AI still has plenty of limitations.
On the one hand, I’ve yet to see anything that gives me the impression that AI is going to start creating and running businesses any time soon. On the other hand, the future looks increasingly likely to hold a lot of smaller teams at smaller companies accomplishing bigger things.

Monday, October 19, 2015

Business: The Land of 10,000 Assumptions

Image courtesy of Nathan Rupert, “Confused or Disgusted?” September 11, 2015 via Flickr; Creative Commons 2.0 Generic
Telling people that I’m a data scientist is kinda fun. At the time of this writing, it’s still a relatively new job title, and it’s not one that most people understand. Once I reassure people enough to get past the confused stares, I usually move into this “parable”:

Every business is a collection of approximately 10,000 assumptions. Said assumptions represent the Highest Paid Person’s Opinions (HPPOs) of said company’s leadership. Most of the assumptions are correct, or the company would go out of business. In most companies, it’s probably safe to assume that 5-10% of those 10,000 assumptions are wrong. It’s just that nobody in the business is sure how to tell the right assumptions apart from the wrong ones.
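
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It uses only the figures from the parable (10,000 assumptions, 5-10% of them wrong); the function name is made up for illustration.

```python
def wrong_assumption_range(total=10_000, low_rate=0.05, high_rate=0.10):
    """Return the (low, high) count of wrong assumptions implied by the two rates."""
    return round(total * low_rate), round(total * high_rate)


low, high = wrong_assumption_range()
print(f"Somewhere between {low:,} and {high:,} assumptions are probably wrong.")
```

Even at the low end, that is a lot of wrong assumptions hiding among the right ones.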

In the future, I firmly believe that the companies that most aggressively (and efficiently) hunt down, identify, and fix the wrong assumptions will have a distinct competitive advantage over the companies that don’t.

How do you discern which assumptions are right? You use the scientific method. You ask questions about your customers, you look at your company’s data to find the answers, and the answers almost always lead you to better questions. Lather, rinse, repeat. It’s data-driven science, which usually gets shortened to Data Science.
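
As one possible illustration of that loop, here is a minimal sketch that checks a single made-up assumption ("customers who get free shipping reorder more often") against data. The counts are invented for the example, and the two-proportion z-test is just one common way to make the comparison, not the only one.

```python
from math import erf, sqrt


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Compare two rates; return (z statistic, two-sided p-value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the "no real difference" hypothesis.
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Standard normal CDF written with erf, then doubled for a two-sided test.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)


# Hypothetical counts: reorders out of customers, with and without free shipping.
z, p_value = two_proportion_z_test(120, 1000, 90, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference is unlikely to be noise, which usually leads straight to the next question: why do those customers reorder more?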

That’s what I do.

Monday, October 12, 2015

Stuff Siri Says

Siri is Apple’s lovable digital assistant that comes complete with just a touch of sass. (Or if you’re an Android/Microsoft person, Google Now and/or Cortana do roughly the same thing.) Siri represents a significant investment from Apple over a number of years, and is among the world’s leading efforts in Artificial Intelligence and its synonyms.
Despite her best efforts, Siri doesn’t always get it right. There are (sometimes NSFW) web sites that focus on funny mistakes that Siri makes. Among other things, this is a good case study in the limitations of cutting-edge AI.
Whenever I see someone talking about ultra-intelligent machines pulling an I, Robot/Matrix/Terminator style takeover of the world, I usually smile. Then I wonder how machines are going to pull that off when Siri can't even tell if she's screwing up trying to transcribe my text messages. Kinda like that time I told Siri "text my wife on my way home period."

Monday, September 21, 2015

Artificial Intelligence Synonyms

Synonyms are words that mean largely the same thing, although there may be contextual differences. In my view, the following terms are synonyms. 
Artificial Intelligence - often focused on robotics. The AI field used to approach intelligence by building models of the world. There may be pockets of people that still do that, but if there are, they publish papers and attend conferences that I haven’t come across yet. My understanding of the AI field is that it’s largely adopted the machine learning approach.
Machine Learning - largely similar to Artificial Intelligence in its application, although it’s usually defined in a way that simply finds patterns/correlations in past data. Machine learning usually focuses on what’s going to happen and doesn’t care as much about “why”.
Data Science/Data Mining - largely generic terms. I went to KDD2014, a conference whose name stands for “Knowledge Discovery and Data Mining.” There was an entire “visualization” track of presentations that to me felt strikingly similar to Business Intelligence.
Informatics/Bioinformatics - focused on various aspects of the medical profession/business. Applications here are wide and deep, and include everything from real-time disease prediction to analyzing who’s (not) going to pay their bills.
Natural Language Processing - focused on gleaning insights from a large collection of unstructured text. 
Computer Vision - often similar in practice to Natural Language Processing, although it analyzes pixels of pictures/videos rather than collections of text.
Business Intelligence - Anecdotally, my experience is that many BI people focus on questions that they can answer with sums and averages. Data Scientists may look at the practice of BI and assert that BI practitioners don’t ask hard enough questions. Sales professionals from BI vendors have been known in the past to push back on this assertion from Data Scientists. 
If you disagree, I love comments! Please be nice.

Monday, September 14, 2015

Data Science and Dysentery

If you ask 100 experts what Data Science is, you’re likely to get 100 different answers. Often the answer you get will reflect the background and skill set of the person speaking. (Full disclosure: I have a Computer Science degree, and I learned statistics over time.)
I define Data Science as applying the old-school scientific method (observation, question, hypothesis, experiment, analysis, etc.) using 21st century tools. I’m not sure that I agree 100% with Wired magazine that Data Science makes science “Obsolete”, but I do agree that the technology is a game changer.
Data Science is kinda like the invention of air travel. Air travel didn’t make walking completely obsolete, and traditional science isn’t likely to go anywhere either. However, now that airports are commonplace, it is considerably less common for people to walk the Oregon Trail and die of dysentery.
Image courtesy of walknboston, “Hard Drive” September 14, 2015 via Flickr; Creative Commons 2.0 Generic

Why I'm Not a Fan of Big Data

I’ve never been a fan of the term “Big Data” - for several reasons: 
  1. It’s not totally clear how to measure “big” in this context. Is "big" a certain number of rows or columns, or is it a specific size on disk? All three quantities can be manipulated.
  2. “Big” is a subjective term, with a meaning that changes over time. Data contained on a full hard drive from 20 years ago fits on a keychain today - with room to spare. 
  3. “Big Data” sounds suspiciously like “Big Oil”, “Big Tobacco”, and “Big Pharma,” three other groups of entities that the media likes to demonize.
  4. More often than not, when somebody uses the term “Big Data”, it’s a marker indicating that said person doesn’t have a good understanding of what they’re talking about.
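
To make the first point concrete, here is a small sketch showing how the same information can report very different row counts, column counts, and sizes on disk depending on layout and compression. The sensor readings are hypothetical and generated just for the example; only the Python standard library is used.

```python
import csv
import gzip
import io

# Hypothetical data: one reading per sensor per day (100 sensors, 365 days).
SENSORS, DAYS = 100, 365
readings = {(s, d): (s * d) % 100 for s in range(SENSORS) for d in range(DAYS)}

# "Long" layout: one row per reading.
long_rows = [["sensor", "day", "value"]]
long_rows += [[s, d, v] for (s, d), v in readings.items()]

# "Wide" layout: one row per sensor, one column per day.
wide_rows = [["sensor"] + [f"day_{d}" for d in range(DAYS)]]
wide_rows += [[s] + [readings[(s, d)] for d in range(DAYS)] for s in range(SENSORS)]


def shape(rows):
    """Return (data rows, columns), not counting the header row."""
    return len(rows) - 1, len(rows[0])


def to_csv_bytes(rows):
    """Serialize the rows as CSV and return the encoded bytes."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode("utf-8")


long_bytes = to_csv_bytes(long_rows)
wide_bytes = to_csv_bytes(wide_rows)

for name, rows, raw in [("long", long_rows, long_bytes), ("wide", wide_rows, wide_bytes)]:
    n_rows, n_cols = shape(rows)
    print(f"{name}: {n_rows:,} rows x {n_cols} columns, "
          f"{len(raw):,} bytes raw, {len(gzip.compress(raw)):,} bytes gzipped")
```

Both layouts hold exactly the same readings, so whichever number you pick to call the data "big" depends on choices that have nothing to do with the data itself.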
I personally prefer the terms “Data Science” or “Analytics” unless I’m working in a niche that generally uses another synonym. I’m convinced that “Big Data” has mainly survived because a sizable set of journalists who don’t understand the field keep using the term.
The downside is that (for better or for worse) the term has stuck. Using said term has become somewhat of a necessary evil for anybody (like me) who wants to communicate with people interested in the topic.

Image courtesy of Scott Kveton, “You Have Died of Dysentery” August 31, 2015 via Flickr; Creative Commons 2.0 Generic

Tuesday, September 1, 2015

Data: Where Do You Start?

In conversations about how to use data to create value, one common question is “Where do you start?” 
Often, people think that it’s necessary to start on day 1 by deploying some complex algorithm with a name that sounds really technical. Not only do I not suggest that approach, I happen to think that it's a horrible idea.
I start all of my data efforts with a series of questions about the field I’m working in at the time. What’s happened to date? What efforts will move the needle? What are the main things the organization can change to improve the operation? In my experience, data work initiated without a specific question in mind leads to technology without an immediate real-world practical use.
To quote Albert Einstein, “Things should be as simple as possible, but no simpler.”
Photograph by Oren Jack Turner, Princeton, N.J. Modified with Photoshop by PM_Poon and later by Dantadd. [Public domain], via Wikimedia Commons