How to Win in the Age of AI

The promise of Artificial Intelligence (AI) has fascinated us for a long time.

As early as 1951, when by some definitions the first AI programs were written (to play checkers, chess, and so on), there was wild optimism about the promise of AI.

In 1970, Marvin Minsky predicted, “In from three to eight years we will have a machine with the general intelligence of an average human being.”

However, it turned out that we had greatly underestimated how hard these problems were and overhyped the promise of AI.

After a long winter in the AI research community, there is finally a lot to celebrate again due to recent success in the fields of AI and Machine Learning (ML). While we are still far from Strong-AI—where machines can simulate an average human’s intelligence—the code for several extremely hard AI problems has been cracked.

Recently, Google demonstrated that their speech recognition system can understand speech at a higher accuracy than a human being. 

Picture and video analysis has become so good that things that once looked like science fiction look very real today. For example, self-driving cars already look like a reality.

These are very hard machine learning problems, and a decade ago most experts would not have predicted such fast progress on them. Computer scientists are discovering that some operations that our brains do with ease, and that had proved elusive to computer algorithms, are actually achievable if we build a deep enough model and give it enough parameters to learn.

There has been some really amazing progress that we should celebrate.

Naturally, some of this technology is making its way into business software. There is a real sense of excitement about it in the business and analytics community. As far as hype cycles go, this feels like the peak. 

The problem with such hype cycles for business leaders is that it is often hard to predict the future. Is it the next revolution like the Internet, or is it the next bubble like Hadoop, where the excitement outweighs the immediate benefits? Will you be left in the dust if you don’t join in, or will you be left with a lot of expensive lessons that you could have avoided if you’d just waited?

Before co-founding ThoughtSpot, I spent about five years leading a team of ML engineers at Google that was responsible for building ML infrastructure and models for predicting when a user was going to click on an AdSense ad.

Every quarter, in search of better machine learning models, we trained hundreds of models that probably had two orders of magnitude more data and features than anything else that I know of. In the process, most quarters we added more to Google’s revenue than the total revenue of most pre-IPO companies.

So I can say with confidence that technology exists in this domain that can make a tremendous impact in the business world. It’s not an excitement bubble like Hadoop.

The big question is, how do we take advantage of these technologies in a world where a lot of new AI-tech is showing up every day? 

Here is some advice.

Perfect is the enemy of good

Given the right data and the right problem modeling, even the most naive tools will give you significant gains. The relative difference between a good enough solution and the best solution is not that large.

In my previous role, I saw a set of smart engineers who knew nothing about ML. They had the wisdom and good luck to apply machine learning to an important problem and were able to increase yield by as much as 30%. After that, a whole bunch of machine learning PhDs spent years getting the next 5% out of it.

This is very common. Just putting something sensible in place often gets you 70-80% of the benefit. So there’s no point spending months choosing your tool in the beginning. But definitely build good data models so that you can easily swap tools later.
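One way to picture "good data models that let you swap tools" is a small, stable record type plus a narrow interface that any tool can sit behind. The sketch below is only an illustration: the names `Example`, `Model`, `BaseRateModel`, and `mean_squared_error` are hypothetical, and the click-prediction framing is borrowed loosely from the ad example above rather than from any real system.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Example:
    """Hypothetical data model: one row of features plus the observed outcome."""
    features: Sequence[float]
    clicked: bool


class Model(Protocol):
    """Any tool that can be trained on Examples and can score a feature row."""
    def fit(self, examples: Sequence[Example]) -> None: ...
    def predict(self, features: Sequence[float]) -> float: ...


class BaseRateModel:
    """Naive baseline: always predicts the click rate seen during training."""
    def __init__(self) -> None:
        self.rate = 0.0

    def fit(self, examples: Sequence[Example]) -> None:
        self.rate = sum(e.clicked for e in examples) / len(examples)

    def predict(self, features: Sequence[float]) -> float:
        return self.rate


def mean_squared_error(model: Model, examples: Sequence[Example]) -> float:
    """Evaluation depends only on the data model, not on the tool behind it."""
    errors = [(model.predict(e.features) - e.clicked) ** 2 for e in examples]
    return sum(errors) / len(errors)
```

Because the evaluation code only knows about `Example` and the two methods on `Model`, a more sophisticated tool can later replace `BaseRateModel` without touching the data pipeline or the measurement code.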

Not every problem is worth going the extra mile

The journey from good enough to optimal is hard and treacherous. If you do want to make that journey, make sure you have enough leverage that the next 5-10% improvement is worth the investment.

If you are optimizing yield on hundreds of millions of dollars, the path is obvious.

If you are in the business of saving lives, the path is obvious. 

But in other cases you should think hard about whether you need it or not.

Humans are still the most important ingredient

To build a good ML/AI solution for any problem, the most important ingredient is still human intuition. Hiring the right set of people is probably the single biggest contributor to success for any such project.

The most bang for your buck in any AI/ML project comes from intelligent feature engineering.

While some of the feature selection process can be automated, mostly it remains an art.

Often the most valuable feature is not even there to begin with. New data pipelines need to be established to capture it. People who understand both the problem domain and ML techniques are usually the ones who can establish the connections necessary to make this happen.

The other danger of lacking the right experience is overfitting. There is a popular saying in data science (attributed to the economist Ronald Coase): "If you torture the data long enough, it will confess."
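To make the "torture" concrete, here is a small simulation, offered as an illustration rather than a description of any real project. Every label and every feature is a coin flip, so no feature carries real signal. The function name `best_random_feature` and the default sizes (40 rows, 2,000 candidate features) are arbitrary assumptions.

```python
import random


def accuracy(feature, labels):
    """Fraction of rows where a binary feature matches the binary label."""
    return sum(f == y for f, y in zip(feature, labels)) / len(labels)


def best_random_feature(n_rows=40, n_features=2000, seed=0):
    """Pick the noise feature that looks best on training rows, then
    report how that same feature does on held-out rows."""
    rng = random.Random(seed)
    # Labels are coin flips, so there is nothing real to learn.
    train_labels = [rng.randint(0, 1) for _ in range(n_rows)]
    test_labels = [rng.randint(0, 1) for _ in range(n_rows)]
    best_train, best_test = 0.0, 0.0
    for _ in range(n_features):
        # A noise feature's values on held-out rows are just more coin flips.
        train_feature = [rng.randint(0, 1) for _ in range(n_rows)]
        test_feature = [rng.randint(0, 1) for _ in range(n_rows)]
        train_acc = accuracy(train_feature, train_labels)
        if train_acc > best_train:
            best_train = train_acc
            best_test = accuracy(test_feature, test_labels)
    return best_train, best_test


train_acc, test_acc = best_random_feature()
print(f"best feature on training rows: {train_acc:.2f}, on held-out rows: {test_acc:.2f}")
```

With enough candidate features, the best one will look well above chance on the training rows purely by luck, while its held-out accuracy is just another coin-flip draw centered on 50%. Checking against data the search never saw is what exposes the confession as forced.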

Pick the tools that make your team 100 times faster at iterating

Like all important work, Machine Learning is 5% inspiration, 95% perspiration. It takes a lot of bad hypotheses to get to a good one. 

Most organizations that care about the accuracy of their machine learned models work tirelessly to reduce the time between someone having an idea and the time when the idea can be empirically proved or disproved. In fact, even outside of machine learning, if you want to know how innovative an organization is, one of the best ways to find out would be to measure the time it takes them to validate an idea.

This is one of the biggest reasons that we do what we do at ThoughtSpot. Relational Search was designed from the ground up to shrink the inquiry part of validation by 100 times. With SpotIQ we are attempting to make that process another 100x faster.

Build a way to reliably measure tiny successes

This is one thing that very few people realize but which has been critical to our success. In my experience, only one in ten or one in twenty ideas results in any improvement. Even when an idea does improve your results, more often than not the improvement is so small that it drowns in the noise of measurement.

But if you do build a measurement system that can reliably catch small improvements, once you have had 50-100 of them the combined gains are usually unbeatable. So it is not only important to have a fast iteration machine so that you can try a thousand ideas quickly, it is equally important to have a way to know when something moves the needle a tiny bit.
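A back-of-the-envelope calculation shows both halves of this argument: how much data it takes to see a tiny win, and how tiny wins add up. This is a rough sketch, not the measurement system described here. It assumes a simple two-sided two-proportion z-test, and the 2% base rate and 1% relative lift in the usage lines are made-up numbers. `samples_per_arm` and `compounded_gain` are hypothetical helper names.

```python
import math
from statistics import NormalDist


def samples_per_arm(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate sample size per arm needed to detect a relative lift in a
    conversion rate with a two-sided two-proportion z-test."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired detection power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def compounded_gain(relative_lift, n_wins):
    """Total relative gain after n_wins improvements that multiply together."""
    return (1 + relative_lift) ** n_wins - 1


# Assumed numbers for illustration: a 2% base rate and a 1% relative lift.
print(samples_per_arm(base_rate=0.02, relative_lift=0.01))
print(f"{compounded_gain(relative_lift=0.01, n_wins=50):.0%}")
```

Under these assumptions, a single 1% relative lift needs several million observations per arm to detect reliably, which is why it drowns in noise without a deliberate measurement system. Fifty such lifts, if they multiply together, compound to a gain of well over half.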

Overall, as someone who has invested a good part of his professional life into Machine Learning, I believe that things couldn’t be more exciting. I hope you find my experience valuable and can use ML to turn your business into a rocket ship.
