and what to do about them
Chief Data Strategy Officer
Senior Analytics Evangelist
2023 promises to be a watershed year for data and analytics
A faltering economy fuels investments in data and analytics
CDOs start to make a play for CEO
Data fluency becomes a board room and national mandate
Decision intelligence grows up with insight to action
Data mesh outpaces data fabric
FinOps gains ground as cloud costs blow budgets
The metrics layer becomes the new battleground
Analytics engineers with data modeling skills command higher salaries
Dirty data lets companies pretend to invest in ESG
ABOUT THE AUTHORS
Have a hot take?
Ready to uncover more insights?
We are in the defining decade of data. Threats to business models and economic shocks abound. Power is shifting everywhere you look, from the market to our tech stacks to the relationship between businesses and consumers. Disruption and change have become the norm.
Data-savvy organizations are poised for victory in a winner-take-all market. The gap has widened between analytics leaders and laggards in the last two years. Organizations that have transformed digitally, embraced innovation and agility, and fostered a data fluent culture boast higher revenues and profits. Those late to the game, clinging to outdated mindsets and tech stacks, are floundering — if they’re even still in business.
The secret to these organizations thriving amidst such volatility: data. But it’s not just any data. It’s fine-grained, near real-time data, coming from myriad data sources, flowing through hybrid cloud platforms to every user in an organization at the point of impact. Realizing this data bliss has never been more achievable, but over-hyped technologies, flawed organizational design, and prioritizing the wrong use cases remain real threats for every data leader in 2023.
The brightest data and analytics professionals do not want to work on legacy platforms, bound by legacy processes. They want to elevate their careers by delivering higher business impact, and they want all the modern tools and thinking required to do so.
So, as you craft your 2023 plans for data and analytics, here are the top trends, predictions, and resolutions to keep you ahead of your competitors.
The global economy continues to falter, with some countries declaring a recession while others can only say the signals are mixed. Tech behemoths have slashed thousands of jobs, and Meta has lost $750B of its market cap. This comes as no surprise, as many businesses trim budgets to weather whatever storm 2023 holds.
And yet one area of the economy has not shrunk: data and analytics. Tech providers that are part of the modern data stack continue to show healthy growth.
Overall, the sentiment among business leaders is that investments in data, analytics, and AI are the key to competing in a digital economy.
Even prior to the pandemic, organizations that had been investing in data and analytics were already outperforming those that weren’t, and the gap has only widened in the last two years. Then there are the “leapfroggers,” organizations that only recently began their cloud and digital transformation journey. These organizations grew at 4X the rate of laggards, and even faster than analytics leaders, between 2018 and 2020.
This compressed transformation is fueling tech spending in this space, but also contributing to unprecedented churn in CDO roles. Leadership teams who want to transform and become more data-driven need greater alignment in just how quickly they can modernize technology, transform business processes, change culture, and reskill talent.
However, this is not to say that spending on technology innovations will feel like a party. Budgets will be scrutinized based on value, modernization, and differentiated capabilities. In any downturn, the knee-jerk reaction is often to freeze all budgets and go into cost-cutting mode. But this approach only takes you so far. Studies of airlines in the 9/11 aftermath and retailers in the 2008 recession show it is more impactful for businesses to focus on operating efficiencies such as supply chain management and customer loyalty. With supply chains, ensuring products are available in the right store or digital distribution center, without excess inventory, is key. In terms of customer loyalty, retaining existing customers who are also your most profitable ones is a priority. The key to maximizing efficiencies and revenue retention? Data.
Align to business outcomes and value delivered.
Recognize that technology is just one piece of transformation and that only modernizing technology will not yield desired business results.
Formally invest in upskilling and reskilling with specific goals and learning time allotments.
Proactively decommission legacy technology and processes.
Let’s face it: the role of the CDO is not for the faint of heart. It requires a degree of grit, tenacity, vision, and cross-functional collaboration that rivals that of the CEO. As every facet of the world around us becomes more digital, the importance of data as a strategic business asset has further increased. In this digital economy, a CEO must at the very least understand data. The most successful ones leverage it as a competitive advantage.
And yet, the CDO role is still in its infancy, with one-quarter of CDOs still in their first year on the job. This first generation of CDOs — the defensive CDOs — is primarily concerned with safeguarding data and tightly governing the keys to the proverbial data kingdom. These CDOs usually report to the IT manager or CIO.
Second-generation CDOs see how data can be used to improve business operations and create data products. They have expanded from a defensive gatekeeper role to focus increasingly on value delivery and, in some cases, data monetization. This maturation is similar to that of the CFO role, which historically focused on keeping finances in order and now drives business results. Savvy CIOs have elevated these offensive, business-minded CDOs to report to the Chief Digital Officer or COO, while command-and-control CIOs may have felt threatened by the CDO’s expanding sphere of influence.
With data now ubiquitous, touching every part of an organization in a way that only money ever has, CDOs have insight into all corners of an organization. This global view, paired with a nuanced understanding of the business, sets CDOs up to make a real run at CEO. And yet, the role of the CDO will continue to remain precarious. One reason CDOs have such short tenures, usually less than two years, is a lack of alignment among business leaders on what it really takes to be data-driven, or unrealistic expectations of how quickly change can be effected. On the other hand, CDOs who do drive change and upset the status quo may get pushed out by the old guard. This is one reason we have seen such clean sweeps of executive leadership teams — a trend that won’t slow down in 2023.
Consider these examples:
Google and Facebook are, at their core, data companies.
Capital One, the first company to have a CDO, no longer has one. And yet data is part of the core business.
One company's CEO loves data, and their CDO recently became an executive leading the e-commerce line of business.
Recognize the degree of grit required of the role. Expect conflict as you drive change in your organization. Assess your allies and detractors through this lens.
Know when an organization lacks either the culture or the leadership to transform at a pace this economy requires. Know your worth in a market hungry for leaders who understand both business and data.
Invest in yourself, building out those skills required to get you to that next level. You are most likely well-versed in technology and data skills, but what about leadership, negotiation, and business domain?
Rewrite the data-business relationship to guide business counterparts to the questions that should be asked, not just answering what they’ve asked.
As the digital economy becomes the economy, the need for a more data-fluent workforce has grown commensurate with organizations’ desire to become data-driven. And yet, the ability to correctly interpret data is a weakness for the general public, both personally and as a business skill.
Consider the following:
Amid the pandemic, most people did not understand the data, whether because of gaps in case counts when early testing was unavailable or because of the later shift to unreported home tests.
During last October’s deadly hurricane in Florida, only 18% could correctly interpret the cone of certainty.
Only 22% of business people feel confident in using data.
According to a Gartner survey, 61% of business leaders say their organization “cherry picks” data to tell their own version of a story.
The U.S. Federal Data Strategy includes a provision for improving data fluency, and the USA has once again restored the role of Chief Data Scientist. However, none of this addresses teaching data fluency in schools. Statistics, a key part of data fluency, is not consistently offered. Organizations such as Data Science for Everyone have moved in to fill this void. Other countries such as Singapore are further along, with a formal data literacy program that more than half of federal employees have completed.
In business, CDOs increasingly own the charge for building data fluency, working collaboratively with people teams and business leaders to specify the level of fluency required by job function and role. However, some organizations have made the mistake of conflating data fluency with technical literacy. Forcing business people to learn to code in SQL or Python, or to sit through days of training in hard-to-use BI tools, is a mistake, one that further exacerbates business people’s fear of data.
Create a graphic that shows the flow of data in a business process by major subject areas. This is not a pipeline orchestration model; it’s a high-level map of how your business works and where data comes from.
Create a cross-functional tiger team for a line of business with roles representing the systems, data, processes, and business operations. Then, conduct a day-in-the-life-of-a-customer for each role.
Partner with a university or consultancy to provide executive data fluency workshops.
Flip your training priorities to spend 80% on data training and enablement and 20% on tools and technology.
Within your community, volunteer time with youth organizations like Girls Who Code or the Boy and Girl Scouts to spread awareness of basic data concepts. Bring fun subjects to these workshops, such as data in everyday life, sports, or outer space.
Decision intelligence was once the realm of rules-based models: Approve a loan if the credit score is above 700, decline it if the score is below 700. Interview the candidate if they went to an Ivy League school, reject if there is a career gap.
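That legacy, rules-based style of decision intelligence can be sketched in a few lines (a minimal illustration; the threshold and field name are invented):

```python
def approve_loan(credit_score: int) -> bool:
    # Hard-coded threshold: no context, no exceptions, no human in the loop
    return credit_score > 700

# A borderline applicant is treated the same as a clearly unqualified one,
# regardless of payment history or other context
decisions = [approve_loan(s) for s in (650, 700, 720)]  # [False, False, True]
```

The brittleness is the point: every signal other than the single rule's input is discarded, which is exactly what richer data and human-in-the-loop review address.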
With more data and modern cloud ecosystems, decision intelligence can now be smarter, fueled by the best combination of human insight and technical advances.
The combination is critical. Blackbox AI that lacks transparency risks bias at scale. There are too many decisions where an exception to the rule may yield a better result. These decisions require a human in the loop. Should Apple Pay really have offered a husband a far higher credit limit than his wife when their finances were shared? Would better insights have addressed this biased AI and prevented a PR blunder?
Bringing analytics to bear on decision intelligence, one might have run a query to ask, “How do credit scores between male and female applicants compare when salaries are the same?” A follow-up query could be, “Give me a list of those applicants whose direct deposits and on-time payments have been consistent for the last 3 years.” With the results of this query, the list of customers to flag in the system for deeper review could be automated by reverse ETL. With reverse ETL, insight-related data is taken out of the cloud data platform and exported to a spreadsheet or written back to the operational app. Alternatively, the analytics platform relies on an open API to communicate from the analytics platform to the operational application. The shift of work to cloud-based business applications with open APIs enables a more closed-loop insight-to-action process.
[Diagram: ETL is extract only, source to target; reverse ETL goes source to target, then back to source.]
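A minimal sketch of that write-back step, assuming a hypothetical operational app that accepts review flags; the field names and the three-year rule mirror the example queries above:

```python
def build_review_flags(warehouse_rows):
    """Turn a warehouse query result into write-back payloads for an
    operational app's (hypothetical) open API — the reverse ETL step."""
    return [
        {"customer_id": row["customer_id"], "flag": "deep_review"}
        for row in warehouse_rows
        if row["consistent_payment_years"] >= 3  # mirrors the follow-up query
    ]

# In a real pipeline, a reverse ETL job would POST these payloads to the
# operational application, or export them to a spreadsheet.
rows = [
    {"customer_id": 1, "consistent_payment_years": 4},
    {"customer_id": 2, "consistent_payment_years": 1},
]
flags = build_review_flags(rows)  # only customer 1 qualifies for deeper review
```

The closed loop comes from the last step: the insight leaves the analytics layer and lands back in the system where the decision is actually made.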
Evaluate analytics workflows that include subsequent manual processes for automation opportunities. Look specifically for manual exports to spreadsheets that then involve re-entering data in cloud-based business applications.
Identify high volume decisions that could be augmented with AI. Include diverse stakeholders earlier in the design processes to minimize the risk of unintentional biases.
Recognize the degree that culture, poor data fluency, and incentives interfere with the desired action being taken.
The world of data frameworks is at a crossroads. Never before have the debates raged more fiercely as designs and doctrines clash.
Last year, we predicted that a combination of data mesh, data fabric, and lakehouse concepts would finally dethrone the data warehouse. Throughout 2022, data mesh proved to be the hottest and most debated topic. Data conference organizers have been knocking on the door of Zhamak Dehghani, the creator of data mesh and author of Data Mesh, to be a keynote speaker. Innovative brands such as Roche, McKesson, and Capital One have all openly talked about how they are adopting a data mesh.
The four fundamental pillars of data mesh appeal to data leaders driven to attain faster speed to market and shorter time to insight.
Even with all the attention that data mesh is getting, questions and concerns still remain:
Will it work in a smaller organization?
Is the actual problem our belief in monolithic centralized data warehouses and fear of change?
Can domain-driven design (DDD) and microservice principles of software engineering be adapted to data management and analytics?
Gartner’s hype cycle added further fuel to the debate, predicting that the data mesh will be obsolete before maturing. We disagree with this assessment and suspect they will reverse course next year.
Despite these concerns, data mesh has been gaining mindshare and traction with both business and technology leaders. While both data mesh and data fabric are frameworks and not architectures, data mesh centers on organizational change and data as a product, a framing that is resonating within the industry. On the other hand, data fabric is often seen as a technology-centric framework that is falling short as organizations migrate to more modern data stacks.
Check emotions at the door. As a data leader you are here to solve business problems, not embark on holy wars based on technological dogma.
Attend or coordinate an executive education on data mesh.
Conduct an evaluation of data mesh for your organization with a cross-functional leadership team including the line of business and technology leaders.
Understand data mesh as a sociotechnical approach to managing and analyzing data. Highly centralized organizational models may lend themselves to a data warehouse approach, whereas federated organizational models fit a data mesh approach.
With economic uncertainty fueled by a host of issues, including the war in Ukraine, inflation, and politics, data and IT organizations will have a tall order in 2023. They will look to recession-proof the business while protecting the cloud investments they made already over the past several years.
While the move to the cloud was already underway, the pandemic accelerated this transition for countless companies. Initially, the “need for speed” and data modernization outweighed the need for cost controls, transparency, and even FinOps.
But the pendulum has swung back. While data and IT organizations will continue to see the value of investments into cloud infrastructure, optimizing existing budgets and managing future costs will be paramount to organizations.
Consider the following:
According to marketing intelligence firm Slintel, Snowflake controls 19.73% of this year’s data warehousing market share, second only to Amazon Redshift’s 22.16% share. Certainly, there are many factors contributing to Amazon’s continued leadership in the data warehousing segment, not the least of which is its fixed-pricing model, which provides customers with predictable costs.
Define metrics and analytics that are needed within your organization to control cost and determine what actions can be taken.
Define a FinOps capability within your organization. That means all cloud costs, not just data cloud or public cloud costs.
Evaluate how your public cloud provider provides discounts for committed usage. Every public cloud provider has a mechanism for discounts based on usage commitments.
Create a waterline analysis that shows which cloud services are running on optimized or discounted resources.
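To illustrate the committed-usage point above, a back-of-the-envelope comparison; the rate and discount are hypothetical, since every provider publishes its own commitment tiers:

```python
def committed_use_cost(on_demand_rate, discount, hours):
    """Compare on-demand vs committed-use spend for a single resource.
    The inputs are hypothetical; check your provider's actual tiers."""
    on_demand = on_demand_rate * hours          # pay-as-you-go spend
    committed = on_demand * (1 - discount)      # same usage, discounted rate
    return on_demand, committed, on_demand - committed

# e.g. $2.50/hr on demand, a 30% commitment discount, 720 hours in a month
od, com, saved = committed_use_cost(2.50, 0.30, 720)
```

The catch, of course, is the commitment itself: the discount only pays off if actual usage stays at or above the committed waterline.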
We’ve had wars over databases, BI tools, and data warehouses. In 2023, the metrics layer becomes the new battleground as organizations aspire to establish a single version of the truth, universally accessible by best-of-breed analytics products in the modern data stack.
This battle is not entirely new, nor is the problem itself. What has changed is the technology potentially enabling interoperability between different components in the data and analytics stack. Potentially.
Since the beginning of the data warehouse era, the hope was that the data warehouse would provide a central version of the truth. And yet, within each BI tool, data analysts are able to create their own metrics, leading to different definitions and multiple versions of the truth.
In an attempt to rein in these different versions of the truth, some BI vendors exposed their semantic layers to third-party BI tools. OBIEE was one of the first to do this, exposing its semantic model as an ODBC data source to other BI tools. Performance was poor and few ever adopted this approach. Moreover, this approach did not enable business data analysts with domain expertise to control these data models; semantic models of first-generation BI tools were largely controlled by IT. Enter dbt.
dbt began as a SQL-based transformation tool enabling analytics engineers to create reusable data sets that could be accessed and persisted by a range of modern cloud data platforms. ThoughtSpot was the first analytics platform to enable interoperability with it. In October 2022, dbt Labs announced its new metrics store at its Coalesce conference to 20,000 attendees.
While dbt seems to have the most momentum, others are joining in. AtScale has shifted its positioning from data lake accelerator to metrics layer. Then there’s Google, who has been trying to resuscitate Looker by making it a metrics layer. The first sign of this was in Tableau and Looker’s joint press release that enabled Tableau access to Looker data models. Frankly, there was no real innovation here. The announcement seemed more an indictment of Looker's visualization abilities and Tableau’s lack of robust data models. The subsequent Looker announcement at Google Next, however, does show a product evolution. What remains to be seen is how well this works in non-GBQ data stores and with the broader ecosystem.
Assess how open the potential metrics layer is in terms of databases it can access and persist to, as well as by which analytics products can interact with it.
Don’t expect first or second generation BI and visualization tools to be able to consume modern cloud metrics layers. The degree that they can will depend on the openness of their APIs.
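To make the metrics layer idea concrete, here is a minimal sketch of "define once, consume everywhere." The registry shape, names, and compile function are invented for illustration and do not reflect any vendor's actual API:

```python
# Hypothetical central metric registry: the metric is defined once,
# rather than re-implemented inside each BI tool.
METRICS = {
    "revenue": {
        "expression": "SUM(order_total)",
        "table": "fct_orders",
        "time_dimension": "order_date",
    }
}

def compile_metric(name, grain):
    """Compile a governed metric definition into SQL, so every tool
    that asks for 'revenue' gets the identical calculation."""
    m = METRICS[name]
    return (
        f"SELECT DATE_TRUNC('{grain}', {m['time_dimension']}) AS period, "
        f"{m['expression']} AS {name} FROM {m['table']} GROUP BY 1"
    )

query = compile_metric("revenue", "month")
```

The battleground question is who owns this registry and how open the compile step is to the rest of the stack, which is exactly what the resolutions above ask you to assess.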
In 2022, the analytics engineer displaced the data scientist as the sexiest job of the year. Next year, analytics engineers will be called on to do even more as they’re tasked with advanced data modeling. The modern data stack promises faster time to insights, self-service analytics, and the democratization of data. However, that promise is predicated on quality data models.
How can you tell if you have a quality model? It enables the business (not the underlying data platform) to scale.
The analytics engineer is a hybrid role, blending software engineering, data engineering, and data analysis. According to Madison Schott, “an analytics engineer is someone who moves and transforms data from the source so that it can be easily analyzed, visualized, and acted upon by the data analyst or business user.” The skills needed for an effective analytics engineer are software engineering practices, great collaboration skills, SQL proficiency, dbt, and an understanding of data. These are the same skills needed to design and deliver quality data models.
As backlogs are shifting from data engineers to analytics engineers, the operational demands to maintain data pipelines will also increase.
Analytics engineers who can build reusable and scalable data catalogs, semantic layers, and data models will further extend the gap between data leaders and data laggards, scale analytics capabilities, and succeed with self-service analytics. All too often, the default approach is immediate and tactical, building models that are just a bunch of tables or, worse, one big table. By contrast, the strategic and well-positioned route will be to design domain-specific data products or dimensional models, while leveraging data catalogs and semantic models that enable self-service analytics at scale.
Deliver domain-specific models that answer entire classes of analysis instead of just tactical dashboards.
Learn about data catalogs and semantic layers to see how they impact the scalability of your analytics capabilities.
Provide training to analytics engineers and data analysts on the creation and usage of dimensional models.
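To make the contrast between a dimensional model and "one big table" concrete, here is a minimal star-schema sketch; the tables and columns are invented for illustration:

```python
import sqlite3

# Minimal star schema: one fact table plus a conformed dimension, so a
# single model answers a whole class of questions, not one dashboard.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fct_orders (customer_key INTEGER, order_total REAL);
    INSERT INTO dim_customer VALUES (1, 'EMEA'), (2, 'AMER');
    INSERT INTO fct_orders VALUES (1, 100.0), (1, 50.0), (2, 75.0);
""")
# Revenue by any customer attribute falls out of the same join
rows = con.execute("""
    SELECT d.region, SUM(f.order_total)
    FROM fct_orders f JOIN dim_customer d USING (customer_key)
    GROUP BY d.region ORDER BY d.region
""").fetchall()
```

Adding a new slice (say, customer segment) means adding a column to the dimension, not rebuilding a wide table, which is what lets the business, rather than the platform, scale.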
Is self-service BI attainable? Benefits and historical concerns of self-service BI
Power BI – Star schema or single table
The return of the JBOT
Understanding the components of the dbt Semantic Layer
Follow #bringbackdatamodeling on LinkedIn, where industry experts and hands-on architects discuss best practices.
Let’s be clear, we love the promise of Environmental, Social, and Governance programs in the data space. The idea that investors, customers, and employees have full transparency into how well a company is performing on these aspects and that as a society, we are all on the same page, is our version of data making the world better.
And yet, this is one area where data is used to lie. Vanity metrics and data gaps abound. These ESG vanity metrics now have a name: greenwashing. The greenwashing has been so bad that some investors estimate half of all so-called ESG funds will be reclassified.
Here are a couple of ways organizations lie with these metrics:
Manipulating the data to tell the story companies want to tell — rather than the reality of the actions they are taking — threatens to undermine efforts across the market. Further, our own industry is part of the environmental problem. With increased digitization, we are consuming more electricity, with an estimated 5 to 9% of all electricity consumed by digital technologies. As we capture and store more data, we are creating the equivalent of data “junkyards,” storing data “just in case.”
And yet, signs of progress are there. Amazon has pledged to be carbon neutral by 2040 and made material progress in the last year in reducing emissions. Businesses can choose green data centers that are powered by renewable energy sources. Some data centers are using AI to lower compute based on demand and usage patterns. Unilever provides transparency on its sustainability data for employees, citizens, and investors. These are the companies and efforts that we should aspire to and emulate as leaders in our industry.
Purge legacy data and legacy software.
Report CO2 on everything from dashboards to expense reports.
Make diversity reports accessible to the entire organization, not just line managers.
ABOUT THE AUTHORS
Cindi Howson is the Chief Data Strategy Officer at ThoughtSpot and host of The Data Chief podcast. Cindi is an analytics and BI thought leader and expert with a flair for bridging business needs with technology. As Chief Data Strategy Officer at ThoughtSpot, she advises top clients on data strategy and best practices to becoming data-driven and influences ThoughtSpot’s product strategy. She recently was awarded Motivator of the Year by Women Leaders in Data and AI.
Sonny Rivera is a Senior Analytics Evangelist at ThoughtSpot. Sonny is a modern data stack thought leader and expert. He brings with him over 25 years of experience in delivering data solutions that drive business value and increase the speed to insights. He also has a long history of bridging the gap between business needs and technical capabilities.
His non-data prediction for the year? 2023 will be the last year of Daylight Saving Time in the United States, as new legislation will eliminate it.