From Doom to Delight: How to Turn Around Any Data Project

If you were going to start a company, where would you begin? Build a headquarters for 2,000 people? Hire a big sales team? List some shares on the New York Stock Exchange? Probably not.

Savvy entrepreneurs know that a successful business starts with identifying a real problem in the market. What’s the pressing need you’re going to solve? Why hasn’t it been solved to date? What unique product or service can you create to solve it? Only then do you build a team, develop a prototype and create a company, moving gradually against clear objectives to minimize risk.

We should think about data initiatives in the same way. If you embark on a “data project” at your company, you’re setting yourself up to fail. That’s because “data projects” almost by definition are ill-defined and unfocused. In fact, Gartner has estimated that 60 percent of big data projects are abandoned before they get past the pilot stage, citing a lack of clear objectives, overinvestment in complex tools, and a failure to get buy-in from stakeholders.

I’d argue the real reason, however, is that we’ve come to believe data has inherent value. Ever since The Economist declared that data is “the world’s most valuable resource,” businesses have been scouring their organizations for every scrap of information that might be useful and throwing it in a massive data lake without thinking through how that data will actually drive the business forward. Data is only valuable when it serves a business goal, and investing in a project not tied to a clear outcome will be costly. Those who ignore this will repeat the mistake businesses made in the 1990s when they built massive ERP systems that never delivered on their promise to move the business forward.

BI practitioners should approach data initiatives methodically and with a purpose. Here are some steps that will ensure your project delivers real value, and that your users won’t turn around when it’s complete and tell you they wanted something else. These are obviously general points, but you can think about how they apply specifically to your business.

Give the people what they want

Start by sitting down with your business users to understand exactly what data they want and how they plan to use it. This is the reverse of what many projects attempt to do. Instead of starting at the beginning of the data pipeline and building a complex delivery infrastructure, give people the data they’re asking for right away. This way, your users aren’t banging down your door while you spend three months building the infrastructure, and your pivot costs will be much lower if they discover it’s not what they wanted after all. In addition, by starting with your business users, your project will be tied to a concrete business goal from the outset.

Start manually

Yes, delivering data right away means you may be supplying it manually for a while, but that’s better than building out a self-service platform, connecting it to various data sources and finding out later it’s not useful. Feed data into your users’ BI tools manually if you have to, and build out the automation technology afterwards. That way, you’re not building infrastructure before you know it will provide immediate value.

Don’t boil the ocean

Data has transitioned over the past decade from a highly targeted asset to an abundant resource. It’s streaming in from mobile apps, marketing tools, IoT devices, ecommerce sites and more. If you believe this data will someday provide value, by all means collect it now; storage is cheap, and future projects that leverage machine learning will benefit from an abundance of historical data. However, don’t invest the time and resources to clean and prepare all this data for use until you know it will provide value. It’s one thing to store data; it’s quite another to invest in data models and delivery pipelines before you know they’ll be useful. Every bit of data you make available to end users should be tied to a business objective. If it isn’t, leave it where it is.

Take a leaf from the agile playbook

Software teams learned early on that an iterative approach to development is faster and more efficient, and leads to fewer errors. Business teams are now adopting this agile methodology, and those in charge of data initiatives should do the same. An agile approach means breaking a project into small parts that can be delivered as they’re completed. This provides faster time to value and makes it easier to pivot if the data you’re collecting isn’t yielding the expected results. It’s a cautious approach that minimizes risk and provides a faster return.

Trust your data

The number one objection I hear from organizations considering a new data initiative is that they don’t trust their data. How can they allow business users to make decisions affecting the future of the company if they’re not certain the data is 100 percent current and accurate? Well, I have news for you: Your employees are making decisions already, either based on no data at all or on fragments of the very same data that you’re concerned about. Don’t be paralyzed into inaction because you don’t fully trust your data; it’s better to get started and learn along the way.

Data is a valuable asset, but data initiatives can be costly, so minimizing risk and maximizing returns are essential. These costs include not just the infrastructure and resources to build a data platform, but also the opportunity costs of not getting it right. If you spend three months building a data delivery system and your users still can’t get the answers they want, they’re unlikely to ask you to build it all over again. They simply won’t ask those questions, and that could be a huge missed opportunity for the company. Building incrementally and validating your platform along the way is how you get it right the first time. That way, your data project is far more likely to succeed.