10 data and analytics trends for 2024

and what you can do about them

Cindi Howson

Chief Data Strategy Officer

Benn Stancil

Field CTO





Sonny Rivera

Senior Analytics Evangelist

INTRO 

For the last three decades, data has been out of reach for many.

Instead of bringing data-driven ideas to life, we were hardcoding SQL statements, wading through technical debt, and sitting on silos of fragmented data—unable to extract timely insights. A handful of powerful experts were left to tell a blessed version of the truth.

While we’ve experienced bursts of progress, the dream of truly democratized data remained elusive. But Generative AI (Gen AI) has sparked the rebirth of our industry, breaking down technical barriers and firmly placing your business, people, and ideas at the center of data and analytics. 

This is the Data Renaissance.

By embracing the opportunities of this moment, you can empower your entire community—your team, colleagues, customers, and partners—to reimagine every part of their world, making the impossible possible.

That doesn’t mean it will be an easy course to chart. This opportunity only exists for those willing to step up and embrace it. The names that go down in history will be the bold first movers. Leaders who seek to understand not only the opportunities of Gen AI, but also the limitations and pitfalls, will set their organizations up to thrive in this new world.

In the following chapters, we’ll share insights, resources, and actionable resolutions—covering everything from legislation and ecosystem innovation to career paths and product building. This guide is built to illuminate how you can make the most of data and analytics in 2024 and beyond, but we know it’s just the beginning. And we want to hear your thoughts. Join the conversation.

TREND
01

Gen AI kicks off 5th wave of analytics and BI

Gen AI is transforming the entire data and analytics ecosystem, kicking off the fifth wave of analytics and BI—a wave that will be bigger, faster, and more furious than any of its predecessors. As we predicted in February 2023, the first movers to embrace Gen AI with ethics and trust will have a lasting advantage. Lack of imagination and technical debt remain the key blockers to unlocking this opportunity.

Right now, organizations are so mired in creating dashboard widgets or waiting for data loads that it’s hard to fully reimagine a radically new way of going from collected data to insight to action, without all the current plumbing and processes.

Think about 1997: Before high-speed Internet and smartphones, did you ever imagine shopping from your phone, trying those trendy jeans on virtually, having them magically appear on your doorstep within hours, paying with a click of your watch or phone, and receiving a photo as proof of your delivery? In 1997, you would have laughed.

We need to stop micro-optimizing the archaic processes and start reimagining them in a Gen AI world—across the whole data-to-insight value chain.

Reimagining the future of healthcare

A reimagined Gen AI future in healthcare, for example, might include medical data shared from a patient’s Fitbit and smartphone with all their doctors. Or, health alerts that are simultaneously sent to the patient and doctor, benchmarked against a population, with recommended actions. If the alert were blood pressure related, the recommended actions might include a class at a yoga studio, lower-sodium foods from the grocery store, or a refill of the patient’s medication.

Five waves of analytics and BI innovation

The waves of disruption in our industry have gotten shorter, while the rate of creative destruction has accelerated across these waves:

01

The first wave was about enabling report developers to create reports without needing to code in SQL. It took 15 years to reach maturity, with SAP BusinessObjects and IBM Cognos being the market leaders.

02

Wave two focused on data analysts’ ability to visualize data locked in spreadsheets, cubes, and data warehouses. It took 10 years to reach maturity, with Tableau, Qlik, and Microsoft Power BI being the market leaders.

03

The third wave was augmented analytics, which brought Natural Language Processing (NLP) and automated insights into on-premises data stores and centralized data warehouses. It took 3 years to reach maturity, with ThoughtSpot pioneering this category and others rapidly acquiring companies to imitate it. For example, Tableau acquired ClearGraph and launched Ask Data, and SAP made several acquisitions and a rebranding that resulted in SAP Analytics Cloud.

04

The modern data stack ushered in the fourth wave of data and analytics, aided by a pandemic in which companies accelerated their digital transformation and cloud migration plans. Looker, Mode, and Sigma were the leaders in this space, with deep integration into cloud data platforms and transformation tools like dbt. But the modern data stack wave never fully matured; its peak was curtailed by a global war and an economic crisis, and it is now being replaced by the fifth wave.

05

The fifth wave is the Gen AI wave that’s actively reshaping every phase of the data and analytics workflow. This wave began at the start of 2023 when ThoughtSpot was the first to announce and launch an AI-Powered Analytics and BI experience in production, ThoughtSpot Sage. Microsoft Copilot for Power BI and Google Duet AI for Looker are both in preview as of November and August 2023, respectively, and Tableau Pulse is expected to be released in the Spring of 2024.

Customers buying and deploying analytics and BI products in this wave must consider:

  • Trust with human-in-the-loop, always.

  • Flexibility of LLMs used to balance accuracy, cost, and provider safeguards. This may include walled-garden options and industry-specific LLMs. 

  • Degree to which products are built for the cloud, offer open APIs, interoperate across data ecosystem components, and can leverage all data, including semi-structured data.

Your 2024 resolutions:

Prioritize use cases. Continue (or start) co-innovation sessions with key stakeholders to prioritize the highest-value use cases. Use the approach of quick wins where the data is ready enough, the value is high enough, and there is cultural readiness in that business unit or department.

Get hands-on experience. Experiment hands-on with different Gen AI products to understand trust, accuracy, and cost trade-offs. Consider every aspect of the data pipeline—data capture, data modeling, natural language query, natural language generation, action, and chatbot support and interactions. Leverage free trials for rapid proofs of concept.

Require humans in the loop. With hallucinations a given, evaluate risk-mitigation strategies to ensure human-in-the-loop whenever Gen AI is leveraged. Recognize the value of directionally accurate insights and the need for higher accuracy in certain use cases.

Focus on business value. Resist chasing shiny objects and focus on how disruptive solutions solve business problems. As a best practice with any innovation, take a baseline of your KPIs today, then compare how Gen AI capabilities improved these KPIs.

Get ready to fight for budget. Sharpen your negotiation skills as CDAOs and CFOs continue to wrestle with balancing innovation, spending visibility, and SaaS app sprawl. Recognize that legacy tools and processes contribute to missed opportunities, higher manual costs, and technical debt.

TREND
02

Dark data gets a bit brighter

For as long as we’ve heard about the potential of big data, we’ve also heard about the potential of unstructured data.

Each year, the predictions have been the same:

Unstructured data—the information contained in text documents, images, videos, and other formats, as opposed to the structured data contained in precisely defined tables of numbers—holds a vast amount of useful insight. Technological advances will soon help us unlock it. 

So far, the results have been underwhelming. While there have been pockets of progress, most businesses are making relatively light use of their unstructured data. Chances are, if someone tells you that they’ve uncovered an interesting insight while analyzing data, you’ll assume they’ve been working with numbers and spreadsheets—i.e., structured data.

In other words, for most companies, “data” is synonymous with “structured data.”

Why haven’t we done more with unstructured data? Often, it’s still too hard to work with. To extract meaningful value from thousands of images and text files, we can’t just search through them; we have to parse their contents and summarize them into something useful.

With structured data, we can combine millions of individual data points into metrics and aggregate summary statistics. But how do you condense millions of emails or audio files in the same way? Though some vendors provide ways to extract customer sentiment from social media posts or common themes from sales calls, these tools can only answer a limited number of questions on specific datasets.

The process of working with more general datasets has been more or less manual: someone reads the text, learns from it, mentally aggregates it, and then summarizes what it means for other people. This is extremely expensive, relatively slow, and prone to all sorts of errors. But last year, that changed.

The explosion of accessible large language models (LLMs) now makes this sort of analysis—the very human practice of taking in a bunch of messy information and summarizing it—far cheaper, faster, and more reliable than it’s ever been before.

There are two major ways these technologies will profoundly impact business:

Aggregation:

First, they provide a means for aggregating this data into something digestible. For example, an LLM that’s been trained on or fine-tuned with thousands of customer support tickets can summarize those tickets similarly to the way an average summarizes thousands of data points. Just as computers replaced the need for us to tediously compute various statistics by hand, LLMs will replace the need for us to read hundreds of individual documents and manually compress them into learnings and key takeaways.
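
To make the aggregation idea concrete, here is a minimal sketch in Python of a map-then-reduce summarization loop over support tickets. The call_llm helper is a placeholder for whichever LLM provider or internally hosted model your organization uses, and the prompts and batch size are illustrative assumptions rather than recommendations.

    def call_llm(prompt: str) -> str:
        # Placeholder: route this call to whichever LLM provider or
        # internally hosted model your organization has approved.
        raise NotImplementedError

    def summarize_tickets(tickets: list[str], batch_size: int = 50) -> str:
        # Map step: summarize manageable batches of raw tickets.
        batch_summaries = []
        for i in range(0, len(tickets), batch_size):
            batch = "\n---\n".join(tickets[i:i + batch_size])
            batch_summaries.append(call_llm(
                "Summarize the recurring themes and customer pain points "
                "in these support tickets:\n" + batch
            ))
        # Reduce step: combine the partial summaries into one digest.
        return call_llm(
            "Combine these partial summaries into the five most common themes, "
            "with a rough sense of how often each appears:\n"
            + "\n".join(batch_summaries)
        )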

Extraction:

Second, LLMs will make it easier for us to extract structured data from unstructured data. Tools like Snowflake’s Document AI use LLMs to pull specific pieces of information from documents. Want to know what percentage of your sales contracts include non-standard terms, even when those terms are recorded in PDF documents? Looking to analyze how frequently your customer reviews mention product quality and what percentage talks about it positively or negatively? LLMs can parse and classify these documents faster, cheaper, and just as proficiently as humans can.
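
The extraction pattern can be sketched just as briefly. The snippet below asks an LLM to turn one free-text customer review into a small structured record; the field names, labels, and the injected call_llm function are illustrative assumptions, and a production system would also validate the model’s output and handle malformed responses.

    import json
    from typing import Callable

    def extract_review_fields(review_text: str, call_llm: Callable[[str], str]) -> dict:
        # Ask for a fixed JSON shape so the result can be loaded into an
        # ordinary table and analyzed like any other structured data.
        prompt = (
            "Return only JSON with keys 'mentions_product_quality' (true or false) "
            "and 'sentiment' ('positive', 'negative', or 'neutral') "
            "for this customer review:\n" + review_text
        )
        return json.loads(call_llm(prompt))  # validate this output in production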

All that said, LLMs are still a new and rapidly evolving technology. Most companies are just starting to work with them. Moreover, in order to make use of unstructured data, companies still have to collect it, build reliable pipelines to ingest it, and figure out how to deal with a range of complex security and compliance concerns. These are not trivial pieces of data infrastructure, and will take time—years, probably—for some companies to build.

Though we’re not there yet, the future is promising. Until recently, it’s taken some degree of faith to believe in the value of unstructured data. We’ve been told it’s in there, like some precious ore buried deep inside an impenetrable rock. LLMs are getting us closer to the center.

Your 2024 resolutions:

Start assessing the value of your unstructured data. Ask yourself, if you had a million researchers who could read every contract, customer review, or interview feedback form and summarize them in some way, what would you ask them to do? What initiatives could you launch, or what difficult decisions could you make with the help of these researchers?

Prepare to invest in data collection. LLMs are only as useful as the data that’s fed into them. Collecting clean and reliable data—especially from new sources that you haven’t historically prioritized—will take time. While unexpected setbacks and frustrating problems will happen, they’ll all be solvable. Just don’t expect it to happen overnight.

Prioritize security and compliance from the beginning. There may be lots of valuable information in emails or in user content that’s created in your application—but can you use it? If you can, can you use it with external vendors like Microsoft and Anthropic? And how do you make sure your AI applications don’t hallucinate private information back to their users? Ask yourself these questions early so you don’t invest in building tools and products that you can’t use.

TREND
03

Data mesh and data contracts face off with data quality

Recent research by Experian makes a startling claim: companies typically lose between 15% and 25% of their revenue due to poor data quality.

These losses manifest in various forms, including:

  • Time and resources spent correcting errors

  • Verifying data through alternative sources

  • Adhering to compliance requirements

  • Mitigating the consequences of ensuing inaccuracies


Gartner's analysis underscores that poor data quality significantly impairs an organization's competitive edge, impeding crucial business goals. Similarly, a Harvard Business Review report citing IBM reveals a broader economic impact, where the U.S. economy annually incurs a staggering $3.1 trillion loss owing to data-related inefficiencies. These losses are attributed to reduced productivity, frequent system outages, and increased maintenance expenses.

The research clearly demonstrates that prioritizing high-quality, reliable data is not only a technical necessity but also a strategic imperative for Gen AI to achieve its maximum potential. Enter data mesh and data contracts.

Zhamak Dehghani, CEO of NextData and author of “Data Mesh,” states that “Trusted data still isn’t accessible enough to support the AI revolution. Decentralized data is the future.”

Chad Sanderson, CEO and co-founder of Gable.ai, is creating a data collaboration platform based on data contracts to meet the growing needs of BI and AI. The emergence of startups focused on data contracts and data mesh indicates a growing interest in these trends, but data quality issues have plagued the industry for decades, leaving many to question: Is the solution data contracts or data mesh?

Is the solution data contracts or data mesh?

Both sociotechnical approaches strive to enhance the agility, efficiency, and value derived from data in large enterprises. Data mesh focuses on a decentralized, domain-oriented approach to data management and architecture; treating data as a product, rather than relying on a centralized data warehouse, is a key concept.

Data contracts focus more on greater collaboration to decentralize data ownership, manage schema evolution, provide data quality measures, and enforce SLAs, enabling data to reach its fullest value.
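
For readers who have not seen one, here is a minimal, tool-agnostic sketch of what a data contract can pin down in code: schema, quality checks, and a freshness SLA. The dataset name, fields, and thresholds are invented for illustration; real tooling expresses the same ideas in its own syntax.

    from datetime import datetime, timedelta, timezone

    # A data contract, at minimum, pins down schema, quality checks, and a
    # freshness SLA that the producing team agrees to uphold for consumers.
    ORDERS_CONTRACT = {
        "dataset": "orders",                              # hypothetical dataset
        "owner": "checkout-team@example.com",
        "schema": {"order_id": str, "amount_usd": float, "placed_at": datetime},
        "checks": [
            lambda row: row.get("amount_usd", -1) >= 0,
            lambda row: row.get("order_id", "") != "",
        ],
        "freshness_sla": timedelta(hours=6),
    }

    def validate_row(row: dict, contract: dict) -> list[str]:
        # Return a list of human-readable contract violations for one record.
        # Assumes timestamps are timezone-aware.
        errors = []
        for field, expected_type in contract["schema"].items():
            if field not in row or not isinstance(row[field], expected_type):
                errors.append(f"schema violation on '{field}'")
        errors += [
            f"quality check {i} failed"
            for i, check in enumerate(contract["checks"]) if not check(row)
        ]
        placed_at = row.get("placed_at")
        if placed_at and datetime.now(timezone.utc) - placed_at > contract["freshness_sla"]:
            errors.append("freshness SLA violated")
        return errors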

 


Sounds amazing, right? Yet, the controversy still remains. Here’s why:

  • Aren’t data contracts just a sub-component of a data quality tool?

  • What are the standards, tooling, and processes for data mesh and data contracts?

  • Aren’t data contracts just a practical application of the underlying data mesh concepts, not a product or practice unto itself?

  • Will they become a standard or die a slow and silent death like other approaches? 

Even in the face of these questions, NextData and Gable.ai raised funds in the Fall of 2023. Zhamak Dehghani secured $12 million in funding for NextData, and Chad Sanderson raised $7 million to start Gable.ai. Clearly, investors see value in both data mesh and data contracts.

The demand for quality data at scale from AI and BI, coupled with industry-wide data quality issues, suggests that data mesh and data contracts have the potential to revolutionize data management in BI and AI.

However, the future remains uncertain, with their widespread acceptance and efficacy yet to be determined.

Your 2024 resolutions:

Analyze the costs of bad data to your organization. Use this discovery to determine if data mesh or data contracts could help you. Decide on your key objective(s) for implementing data contracts in your organization. What problems do you want to solve first, and why are they essential to the business?
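
A back-of-the-envelope starting point, using the Experian range quoted earlier in this chapter, might look like the sketch below; the revenue figure is a placeholder to replace with your own numbers, and the result is a rough bound, not a measurement.

    # Rough, illustrative estimate of the annual cost of poor data quality,
    # applying the 15%-25% revenue-loss range cited earlier in this chapter.
    annual_revenue = 250_000_000  # placeholder: substitute your organization's revenue
    low_estimate = annual_revenue * 0.15
    high_estimate = annual_revenue * 0.25
    print(f"Estimated annual cost of poor data quality: "
          f"${low_estimate:,.0f} to ${high_estimate:,.0f}")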

Use your leadership platform to promote data quality within your organization. Ensure the conversation focuses on value realization and creation, and highlight the critical role of data quality in delivering value.

Evaluate your 1st-party data to determine if there is hidden value. No data is too small to add value. Learn about smart-sizing your data for Gen AI from Andrew Ng.

Recognize that data mesh is not something you can buy. It requires modern cloud technology but, more so, cultural readiness to shift power to business units.

TREND
04

Data fluency expands to AI fluency

In 2023, we saw data literacy and data fluency initiatives rise on the list of CEO priorities. In 2024, data fluency will evolve to include AI fluency, and organizations that already had a jump start with their data fluency programs will be in a better position to leverage AI.

Organizations have made missteps in conflating data fluency with technical literacy. The two are very different. Technical literacy is about using the technologies themselves: SQL, Python, Tableau, Mode, ThoughtSpot, Alteryx, and so on. Data fluency is about understanding where data originates, business definitions of data elements, statistical concepts, visual storytelling, and interpretation.

Gen AI will substantially lower the technical proficiency required to interact with data. True natural language processing makes it easier to interact with data. GPT-like interfaces, such as ThoughtSpot Sage, enable business users to more easily ask questions of their data, generate insights in a few clicks, and automatically summarize visualizations and anomalies through text-based explanations. These innovations allow organizations to balance their once-tech-heavy literacy programs and focus more on the language of the business, an understanding of incomplete or biased data, mathematics, and basic statistics.

With AI becoming infused in our daily lives and business processes, key aspects of AI literacy include:

  • Recognizing when AI is involved and AI-generated content

  • Understanding how digital interactions fuel AI models

  • Being aware of a model’s degree of accuracy

  • Understanding inputs to algorithms including training data and emphasis of features/attributes

  • Realizing how using public AI platforms may contribute to data unintentionally becoming part of the training data set


For example, when Hurricane Ian was approaching the West Coast of Florida in Fall 2022, many citizens within the cone of uncertainty did not prepare or evacuate. The storm veered slightly south, devastating Fort Myers.

Public officials only realized later that too many people misinterpreted the cone of uncertainty. The whole concept of a prediction remains lost on many citizens—whether in hurricanes, snowstorms, or even how long your Uber ride will take.

As Gen AI starts being used in classrooms—with schools having wide-ranging differences in policies—students, parents, and teachers are all faced with AI literacy challenges. Popular tools that check for plagiarism are also confounded by Gen AI content, producing false positives. Teachers who better understand how the tools are trained, and why they are less accurate for students for whom English is a second language, are less likely to falsely accuse a student of plagiarism.

Regulatory requirements to label AI-generated content may help with identifying such content. But that alone isn’t enough. In a survey of over 1,500 Americans conducted by the Allen Institute for AI, 84% failed basic AI literacy skills.

Your 2024 resolutions:

Launch a data fluency program today. If you have not already launched a data fluency program, do so now. Although the CDAO should own this initiative, collaborate with HR and People Ops as part of a comprehensive learning and upskilling program. Evaluate providers such as Datacamp, The Data Literacy Academy, The Data Lodge, or local universities to formalize and scale.

Train your team on Gen AI. Enable basic training on Gen AI concepts for all employees; include what data gets shared and persisted in public models.

Leverage gamification to drive critical concepts. Gamify human-in-the-loop practices to demonstrate AI limitations and benefits—for example: resumes rejected/accepted or marketing emails most likely to elicit a response.

Measure and communicate your efforts. Publish and communicate data fluency and AI fluency KPIs within the organization. Measure progress over time and by role. 

TREND
05

Semantic layers garner more debate than deployment

Will the adoption of standalone semantic layers take off in 2024? Probably not, at least if history is any guide. Despite the recent releases by dbt Labs, Looker, and their industry competitors’ efforts to rejuvenate these systems, the impact of semantic layers in 2024 appears grim. Combined with the disruptive force of Gen AI, this suggests that code-centric and cube-centric semantic layers will soon be regarded as the gatekeepers of a bygone analytics era.

The semantic layer promises to bridge the gap between complex data structures and downstream stakeholders—like analysts, data scientists, and non-technical business users. Self-service analytics is a core capability of that promise. This is achieved by presenting data in a format that is not only accessible and consistent but, most importantly, easy enough for a business user to understand and use in the context of their day-to-day work.

While interest in the promise of semantic layers has grown and the need has not diminished, recent incarnations of the standalone semantic layer must overcome the longstanding and still unresolved challenges.


Four longstanding challenges to the semantic layer include:

1. Integration and interoperability

Integration and interoperability continue to be immature across the multiple areas of the modern data stack—not just semantic layers. So, the semantic layer becomes another layer that needs to integrate with security, catalogs, lineage, observability, analytics tools, query tools, and custom applications.

The standalone semantic layer has an additional integration challenge since it resides between analytics tools and data warehouses or data lakes. It must support multiple languages like SQL and MDX for data access, multiple interfaces like REST, GraphQL, or JDBC for programmatic access and metadata discovery, and multiple SQL dialects for data access across data lakes, warehouses, and cubes. Integration between tools and technologies can be complex and costly, leaving tool vendors to choose only a select few to support. 

2. Governance and data quality

Universal semantic layers, the holy grail of semantic models, have suffered from poor data governance, duplicated metrics, poor data quality, and a lack of flexibility and agility due in part to the reality that not all metrics are predefined, shareable, and universal. Analysts are often creatively solving problems with data—so, if the needed data is not readily available in the semantic layer, they’ll get it by hook or by crook. This often leads to “a hellscape of metrics,” as one data leader put it. Modern data operations and deployment strategies will help, especially when code-first interfaces and markup languages are available, but that rarely solves the underlying governance and data quality issues.
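
To show what a code-first, version-controlled metric definition looks like in spirit, here is a stripped-down sketch; the metric name, SQL, and synonyms are invented, and real semantic layers (dbt’s semantic layer, Looker’s LookML, and others) each use their own richer syntax.

    # A single, version-controlled definition of a metric that every
    # downstream tool, and any Gen AI interface, can reference.
    MONTHLY_ACTIVE_USERS = {
        "name": "monthly_active_users",               # hypothetical metric
        "description": "Distinct users with at least one event in the month.",
        "owner": "data-platform-team",
        "sql": """
            SELECT DATE_TRUNC('month', event_at) AS month,
                   COUNT(DISTINCT user_id)       AS monthly_active_users
            FROM analytics.events
            GROUP BY 1
        """,
        "synonyms": ["MAU", "active users"],          # helps natural language interfaces
    }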

3. Data privacy and security

Data privacy and security are among the top priorities for all data and technology leaders, making security compliance non-negotiable. The universal semantic layer acts as an intermediary between data sources and downstream applications, so it must not only ensure consistent definitions across an organization but also uniformly apply privacy and security policies across the organization.

Few standalone universal semantic layers can meet mandated security standards, support role-based access control (RBAC) and row-level security (RLS), and integrate with authentication and identity management systems like Azure Active Directory. While these issues may not be insurmountable for organizations, they will cause significant delays in deployment.

4. Financial frugality

Given the current climate of greater cloud cost accountability and tool rationalization, adding more tools and capabilities to the modern data stack is challenging. Cost and benefit analyses are under more scrutiny than ever as organizations continue to struggle with FinOps and clearly identifying tangible business value. The ROI bar is very high, and the benefit of the semantic layer is difficult to measure as it is one of those assets belonging more to the centralized data team.

Throughout its evolution, the semantic layer's core mission has remained constant, even as the demand to be more dynamic, intelligent, and user-centric has increased.

With the emergence of cloud computing and Gen AI, the semantic layer still promises to bridge the gap between complex data structures and downstream stakeholders. More broadly, the idea that every solution should have a conversational AI coupled with human-in-the-loop capabilities is seeping into the collective consciousness of data leaders, having the effect of ‘freezing’ the market for products that don’t yet have these capabilities. On the upside, LLMs enhanced with a semantic layer could potentially resolve the issue of hallucinations.

The concept of the semantic layer as a valuable component of a data stack is not obsolete. However, the future lies in agile, AI-driven systems that offer real-time insights with minimal human overhead. And in the future, the standalone universal semantic layers may find themselves conspicuously obsolete.

Your 2024 resolutions:

Create cross-functional teams. Include data engineers, analysts, and business users to identify the business value gained by having an agnostic semantic layer. Categorize the benefits as hard, soft, or foundation. Clearly identify the impact on the business and the measures needed.

Evaluate available integrations before buying. Go beyond marketing claims to evaluate the depth of integration between semantic layers and analytics and BI platforms, as well as specific cloud data platforms, paying close attention to updates, metadata, and data governance capabilities.

Prep your semantic layer for Gen AI. In the Gen AI era, evaluate the degree to which Gen AI is used to co-develop and create synonyms for business terms.

TREND
06

FinOps evolves to LLMOps

FinOps will expand into managing the costs of operating and optimizing the value of LLMs. Evolution is a critical concept in FinOps. Executives and data leaders have learned, first-hand, the value of FinOps through their experiences with cloud migrations and digital transformations. By now, we know how Adobe racked up an $80K-per-day cloud bill without noticing for over a week, resulting in unexpected costs of over $500K. Adobe is not the only one; the list includes notable companies like Intuit, Pinterest, and Coinbase. And it’s not just the big names racking up big bills. 1 in 5 companies face an unexpected cloud bill, according to a recent article in Technology Magazine.


AI services are appearing daily, bring-your-own-LLM (BYOL) is rising, and training AI on private 1st party data is becoming critical to creating a competitive advantage.

It’s easy to jump into the deep end before you realize the water is over your head. But today’s leaders are clear-eyed about the potential cost of new AI offerings.

The enthusiasm for Gen AI and LLMs has not distracted leaders from their commitment to responsible cloud cost management. The lessons learned from the COVID-19-fueled adoption of cloud computing and digital transformation are still fresh in their minds and wallets. According to McKinsey & Company, 89% of large companies have ongoing digital and AI transformations, with only 31% of companies capturing the expected revenue lift and only 25% realizing the expected cost savings.

The underwhelming outcomes of digital transformations have driven the maturation of FinOps over the past five years, leading to a sharp increase in its adoption among top-performing organizations.

That won’t deter the onslaught of real and perceived AI hype in 2024. However, the organizations that come out on top will be proactive in making FinOps for LLMs part of their AI initiatives.

  • 82% of organizations now have formal FinOps in place

  • 58% of organizations present FinOps KPIs to the C-Suite or board of directors

  • 71% state achieving IT goals without FinOps would be challenging

Your 2024 resolutions:

Build buy-in for FinOps. Educate yourself about FinOps principles and practices. Then, advocate for FinOps by explaining its benefits to the CFO, line of business leaders, and product managers. Present case studies and best practices to demonstrate how FinOps can optimize costs and improve efficiency.

Upskill your key leaders across your workforce with FinOps training. Train and certify resources on FinOps to advance your organization's knowledge and understanding.

Engage with the FinOps community outside your organization. Attend conferences, participate in forums, and network with peers to stay updated on the latest trends and best practices in FinOps for LLM/AI.

Understand cost and pricing models for LLM services. Strive to understand the various options that are available, including OpenAI, Anthropic Claude, Llama 2, Google PaLM 2, and Cohere LLM APIs. Evaluate if a SaaS-based solution, managed service, or self-hosted solution is best for your organization.
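
One concrete habit is to model token costs before committing to a provider or architecture. The sketch below compares hypothetical per-token rates; the prices and workload numbers are made up for illustration, since actual rates vary by provider, model, and contract.

    # Illustrative only: substitute each provider's current published rates
    # and your own workload estimates before drawing any conclusions.
    hypothetical_rates = {  # USD per 1,000 tokens: (input, output)
        "provider_a": (0.003, 0.006),
        "provider_b": (0.001, 0.002),
    }
    monthly_requests = 500_000
    input_tokens_per_request = 800
    output_tokens_per_request = 300

    for provider, (in_rate, out_rate) in hypothetical_rates.items():
        monthly_cost = monthly_requests * (
            input_tokens_per_request / 1000 * in_rate
            + output_tokens_per_request / 1000 * out_rate
        )
        print(f"{provider}: ~${monthly_cost:,.0f} per month")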

TREND
07

Data roles get even more confusing

2024 is the year of the analytics engineer, AI engineer, prompt engineer, Chief AI Officer... who knows?

Back in the early 2010s, the taxonomy of a data team was fairly simple.

Data analysts worked with lines of business to build reports and solve business problems. Data scientists—who often had PhDs in statistics, physics, or economics—built complex predictive models to create forecasts or build recommendation engines. And data engineers built technical systems that would collect, store, and process data for the data analysts and data scientists. The lines weren’t perfectly clear, but they were reasonably well understood. Back then, data leaders could’ve drawn boundaries around the roles in the same way that residents of a city draw lines around a neighborhood: They might not line up perfectly, but everybody would roughly agree on where the Upper West Side is.

Then, in 2012, data scientists—not data analysts or data engineers, but data scientists—were famously declared the sexiest job of the 21st century. And all hell broke loose.

Companies became enamored with the possibilities of data science, and data scientists became unicorns that everyone wanted to hire. In 2016, Glassdoor declared it the best job in the United States. Understandably, job seekers started looking for data science roles, favoring them over their lowly cousin—the data analyst.

Wanting to hire the best talent, companies responded by rebranding analysts as data scientists. Lyft explained it all in a 2018 article.

Nicholas Chamandy, then Director of Data Science, Mapping at Lyft, wrote:

“Even if only a minority of the tech industry has shifted their definition of Data Science towards the business analytics end of the spectrum, the presence of even one or two major players in that group (and there are several) immediately puts us on the back foot.

For this reason, we’re shifting our data analysts over to the Data Scientist title. To maintain a functional distinction between the roles, we’re also rebranding our data scientist job title to Research Scientist.”

Things have only gotten more confusing since then. Analytics engineering burst onto the scene in 2019, occupying a role somewhere between analytics and data engineering. By 2022, it had replaced data science as the trendiest job in the industry. If you asked data leaders to draw role boundaries this year, many of them would say the same thing that the residents of Staten Island said about a few blocks caught in no man’s land: “Nobody knows what to call this neighborhood.”

Unfortunately, data teams shouldn’t expect any relief in 2024. AI’s explosion in popularity will add three layers of complexity to an already strained taxonomy:

1. It will create even more titles for data leaders to choose from. 

Companies looking for AI experts can no longer just look for data scientists; they will also have to look for AI researchers, prompt engineers, and other titles that have gone mainstream in 2023.

2. People will want AI in their title—regardless of what they do.

Because of the hype around AI, job seekers will likely start looking for titles that are more explicitly related to AI. An opening for an “AI researcher” will probably attract more candidates than an opening for a “data scientist,” even if the job responsibilities are exactly the same. This includes C-suite leaders as well: Chief Data Officers (CDOs) and Chief Data and Analytics Officers (CDAOs) are now starting to remarket themselves as Chief AI Officers (CAIOs). 

3. AI jobs won’t always be data jobs.

Perhaps worst of all, companies will struggle to figure out which teams should own their AI initiatives. Historically, AI and ML programs have been owned by either centralized or embedded data science teams. This makes sense when a company is building its own models on its own data. The new wave of Gen AI technologies—tools like GPT from OpenAI, Claude from Anthropic, and Vertex AI from Google—don’t need the same amount of bespoke training; in some cases, they can be used straight out of the box. Will engineering teams own these initiatives? Data teams? A complex matrix of dotted lines and cross-functional owners? 

All of this will create further confusion about what we should call data teams and what they should do. There are solutions, though, so long as data leaders find a balance between experimenting with the latest trends without getting caught up in them. 

Your 2024 resolutions:

If you’re hiring for data roles, make your job descriptions extremely clear. AI roles are going to draw hundreds of applicants, often on the title of the role alone. Though it’s okay to rebrand roles to attract top talent—and it may even be necessary, as it was for Lyft in 2018—you should ensure candidates know exactly what the role is and isn’t responsible for. Moreover, you can’t rely on traditional sources of market information like Radford to always be up-to-date. The AI landscape is moving too fast.

Be careful about chasing fads. Updating titles to reflect what’s popular is one thing; creating roles because they’re trending is another. Before you hire someone to be a prompt engineer or an AI writer—in title and responsibility—make sure it’s a job you absolutely need. And if you aren’t sure, try to move someone into that position part-time to see if they feel there will be a durable and full-time need for that role in a year or two. 

Make sure that AI initiatives have clear owners. Though data teams have historically been responsible for AI research, times have changed. Some companies will train their own LLMs from scratch, some will fine-tune open source models, and many (most?) will simply customize and prompt the foundational models provided by vendors like OpenAI, Anthropic, and Google. Rather than making data teams responsible for AI initiatives, consider exactly what type of expertise you need. It might be application engineers, who can work with providers’ APIs; it might be data engineers, who are sourcing and cleaning data; it might be data scientists, who are training and evaluating fine-tuned open-source models. To avoid any confusion, choose the team that makes the most sense for your goals, and make sure everyone understands that they’re the project’s primary owner.

Executive leadership can help, but isn’t a panacea. Though some organizations would benefit from hiring a Chief AI Officer (especially those already making significant investments in AI and needing someone to be responsible for thinking about the thorny issues of compliance and security), CAIOs can’t outright replace CDOs or CDAOs. As Microsoft CEO Satya Nadella argued, you can’t have an AI strategy without a data and analytics strategy. 

Don’t forget the humble analyst. In 2018, Cassie Kozyrkov, then the chief decision scientist at Google, said that analysts were more important than the trendy data scientists: “If you overemphasize hiring and rewarding skills in machine learning and statistics, you’ll lose your analysts. Who will help you figure out which problems are worth solving then? Your data will lie around useless.” You could make the same argument today. Even if the hype around AI is real, companies will still need people who can quickly help decision makers diagnose and evaluate business problems. In other words, companies will still need analysts.

TREND
08

Multi-experience analytics delights business users but rattles central CIOs and CDAOs

Organizations have always wished for one BI tool that would do it all. Meanwhile, users have always wanted a BI tool that is ideal for them, based on their skills and task at hand. 

Now, you throw Gen AI into the mix. It’s a race to bring in this innovation without throwing out all the plumbing of whatever analytics and BI platforms are already deployed. What is a CIO or CDAO to do? Buy five or six different analytics tools optimized for each user persona? Or, settle for good-enough capabilities across a few use cases and limit how many specialty products are added to the mix?

It’s a fool's game. I find myself thinking about family cars and modes of transportation. Settling for good enough is like using one of those Duck Boats and assuming it will work equally well on snowy roads and high-speed highways as it does on city streets and the Boston harbor. But then again, how often does a family buy a different vehicle for each use case? And even if they did, is that a responsible use of resources? Some things need to be shared and common, like electric car chargers, and then the mode of transportation can be optimized for the use case and driving conditions.

Busy Executive

A busy executive wants KPI charts on a mobile phone. 

Finance user

A finance user wants everything delivered via a spreadsheet. 

Data analyst

A data analyst wants the flexibility of SQL and Python. 

Business user

A business user wants Liveboards they can assemble and tweak themselves.

There was a time when analytics and BI platforms had a safe investment period of three to five years. Earlier in the first wave of BI, that investment horizon was easily ten years. The waves of innovation (as discussed in the first chapter) have accelerated, but the mindsets of central BI buyers have not. They should.

No longer do organizations have to sign long-term contracts with big upfront perpetual licensing commitments. Nowadays, with pay-as-you-go, consumption-based pricing, procurement approaches should be more flexible. This shift in licensing models is an industry shift that enables every analytics persona to have their own optimized experience, without sacrifice. License the product for a month or a year, and if your needs evolve or a competitor innovates faster, switch. That’s the theory. 

In reality, interoperability across the analytics ecosystem and even within a single analytics platform remains a work in progress. Even when an organization has implemented a cloud data platform, legacy processes and visualization tools continue to extract, replicate, and aggregate that data into their own proprietary stores—often unnecessarily so. Metadata between different modules may not work together, and newer capabilities may not work consistently.

Just look at Tableau’s dbt integration, which only works in Tableau Desktop, not within Tableau Server or Online. Or consider Power BI’s smart narratives, which work differently within the desktop and the browser, with constraints in updating visuals.

Optimized experiences by persona, using a shared foundation and flexible licensing, solve the wishes of both the users and the procurement team. ThoughtSpot was already executing on this vision. For example, ThoughtSpot for Sheets uses the same shared semantic model as ThoughtSpot Liveboards. Multi-experience was also the vision behind the ThoughtSpot and Mode merger: let data analysts have their code environment and business users their no-code environment, while the data model can be defined in dbt and visualizations created in one environment can be surfaced in another.

This multi-modal experience extends to embedding analytics into business applications like Salesforce and ServiceNow—either as Liveboards or as natural language search interfaces. It’s unclear how excited the market is about extending the data experiences to virtual reality with glasses like XReal or HoloLens. But, I can imagine field workers appreciating this hands-free ability to call up critical KPIs.

Your 2024 resolutions:

Banish the “good enough” mindset; what you save in bundled licensing costs, you lose in user adoption, delight, and insight opportunities.

Develop a strategy to manage the analytics portfolio while embracing innovation and reducing complexity. Recognize that different skill levels and personas will have distinct requirements. This may also apply by vertical or functional domain.

Evaluate analytics platform vendors not only on the depth of capabilities to serve multiple personas, but also on integration beneath the covers for a shared data foundation and seamless workflow.

Ruthlessly upskill and reskill. Do not let legacy mindsets and technical debt prohibit you from continuing to modernize your analytics portfolio. Recognize that you are not only modernizing technology but also having to break away from what may be decades of processes and habits.

TREND
09

AI will make products worse

If these predictions were written eight years ago, we might’ve said that the future of data and AI was on the blockchain. This revolutionary new technology, we might’ve continued, will change tens of thousands of businesses. Adopt it, we’d conclude, or get left behind.

Though people made these sorts of forecasts—according to one survey, more than half of global business leaders thought that ten percent of global GDP would be stored on the blockchain by 2027—nobody got left behind because they ignored the warnings. Instead, the opposite happened: A number of companies chased the hype, pushed blockchain-powered products out the door that were more sizzle than substance, and shut them down a few years later.

Now, the technology world is turning its attention away from Bitcoin and towards AI. A majority of CEOs say that Gen AI is one of their top priorities, and companies from small startups to multinational enterprises like Walmart and PwC are rushing AI-backed products and tools out to their customers and employees. Every day, it seems, a SaaS vendor launches a new chatbot.

The problem, however, is that building a chatbot is relatively easy. Building a good chatbot—one that’s more than demoware and genuinely improves the experience of using a product—is hard. 

Simple wrappers around models from OpenAI and Anthropic aren’t enough. Vendors have to build and maintain much more than that—think:

  • Integrations into metadata repositories

  • Infrastructure for storing that metadata and evaluating its usefulness

  • Systems for evaluating the accuracy of results

  • Tests for hundreds of versions of different prompts

  • Feedback loops to help models improve

But that’s not all. Vendors might also have to train their own LLM, or choose from what is quickly becoming a huge catalog of domain-specific options. And most of all, they have to figure out if you need a chatbot at all, or if AI is better integrated into their products in a completely different way. 
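
As one illustration of the list above (specifically the systems for evaluating accuracy and testing prompt versions), here is a minimal sketch of a golden-question evaluation harness; the questions, expected answers, and generate_answer parameter are placeholders, not a description of any vendor’s actual test suite.

    from typing import Callable

    # Hypothetical golden set: questions paired with human-verified answers.
    GOLDEN_SET = [
        {"question": "How many orders shipped last week?", "expected": "1,482"},
        {"question": "Which region had the highest revenue in Q3?", "expected": "EMEA"},
    ]

    def evaluate(generate_answer: Callable[[str], str], prompt_version: str) -> float:
        # Score one prompt version against the golden set; rerun on every change.
        correct = sum(
            1 for case in GOLDEN_SET
            if generate_answer(case["question"]).strip() == case["expected"]
        )
        accuracy = correct / len(GOLDEN_SET)
        print(f"prompt {prompt_version}: {accuracy:.0%} accurate on {len(GOLDEN_SET)} cases")
        return accuracy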

We know these intricacies intimately because we encountered them when building and refining ThoughtSpot Sage. For example, we couldn’t just send a question like “How many accounts did we have in Asia last quarter?” to an LLM and get back an accurate SQL query.

Instead, Sage is designed to:

Understand company nuances:

LLM-based analytics products need to know what a user means by accounts, what countries the company includes in Asia, and if questions about quarterly metrics should be based on a fiscal quarter or calendar quarter. 

Ensure consistency:

Despite LLMs being probabilistic, Sage needs to guarantee that two people asking the same question with the same intent will get the same answer. Moreover, follow-up questions—like, of those accounts, which ones were new?—should build on the exact same logic as the first question.

Return team-specific results:

Engineers typically use the term account to refer to a team workspace in a SaaS product while sales reps typically use the same term to reference a business that they’re selling to. Different teams might ask the same question, and expect different results.

Work in the wild:

And, Sage has to do all of this securely, reliably, accurately, and quickly, on enterprise datasets with enterprise-grade performance SLAs.
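
To make the general pattern concrete, the toy sketch below shows how a question might be grounded in a semantic model before any SQL is generated, so that terms like “accounts,” “Asia,” and “last quarter” resolve to agreed definitions rather than the model’s guess. This is a simplified illustration of the idea, not ThoughtSpot Sage’s actual implementation; every name and definition in it is invented.

    # Toy sketch: ground a natural language question in a semantic model
    # before asking an LLM for SQL. All definitions below are invented.
    SEMANTIC_MODEL = {
        "account_definitions": {
            "sales": "a row in crm.accounts (a business we sell to)",
            "engineering": "a row in app.workspaces (a team workspace)",
        },
        "asia_countries": ["JP", "SG", "IN", "KR"],
        "fiscal_quarter_start_month": 2,
    }

    def build_prompt(question: str, team: str) -> str:
        # Attaching the team's definitions means two people asking the same
        # question with the same intent are grounded in the same logic.
        model = SEMANTIC_MODEL
        return (
            "Write SQL for the question using ONLY these definitions.\n"
            f"An 'account' for the {team} team means: {model['account_definitions'][team]}\n"
            f"Countries counted as Asia: {model['asia_countries']}\n"
            f"Fiscal quarters start in month {model['fiscal_quarter_start_month']}.\n"
            f"Question: {question}"
        )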

See Sage in action

Companies that do this deep-thinking work will be rewarded with more useful products and happier customers.

But the companies that don’t—the companies chasing AI because it’s the latest fad—will actually end up making their products worse. Their bolted-on, half-baked chatbot will work for simple requests, but its inconsistency will constantly frustrate customers. People won’t use the feature, and product teams will stop investing in its success. Over time, that chatbot will slowly decay into an awkward barnacle that everyone learns to ignore.

In other words, in 2024, AI will transform thousands of products—some for better, and some for worse.

Your 2024 resolutions:

Start with real customer problems. Before building any new AI products, make a list of problems your customers want to solve. Then, figure out if you can solve those problems with LLMs. Don’t convince yourself that a problem is important to solve just because an LLM might be able to solve it. 

Think outside the bot. By the end of next year, I predict that the most successful LLM-powered products won’t be chatbots. Instead, they’ll integrate the power of AI—synthesis, creativity, and the ability to understand imprecise instructions—in much more seamless ways. When brainstorming ways to improve your products with Gen AI, ask yourself what the ideal customer experience is and whether Gen AI can help you build that experience.

Real results will require real investments. If you’re building an AI-powered product or feature, make sure everyone—especially leaders—is on the same page about how much time and effort is needed for the project. Today’s technology makes it easy to ship AI products, but it still takes a lot of time and sustained effort to make them good. If you want to build something impactful, it will likely require a durable, lasting commitment.

Avoid overselling and under-delivering. Not all products have to be great! It’s OK to build features that primarily attract attention, and to experiment with new technology. If you do, it’s important to be honest with yourself about your goals, and even more important not to mislead customers about what the product can actually do.

TREND
10

AI innovation continues to outpace legislation

While the global AI innovation race speeds ahead, regulators are working to mitigate potential harm to humanity. The risks range from full annihilation to more insidious harms.

No matter where you stand on the AI fear spectrum, the impacts to businesses and individuals are real. The recent drama of OpenAI’s dismissal, then rehiring, of Sam Altman shows the high stakes and extreme philosophical differences that exist. To all, the reality is clear: regulation risks stifling innovation—a risk no country in a global, digital economy wants to take.

While the playing field is not level globally—China, for example, will take a different view on AI regulation than the USA—tech innovators want the playing field to be level within regional economies. Gen AI is evolving so quickly that there is zero chance regulation can keep pace with the ever-expanding capabilities.

Recent regulatory efforts around the world:

The European Union

To date, the EU has been at the forefront of AI guidelines, just as it was with privacy and GDPR. The strongest of these guidelines include requirements to specify the training data set and for AI-generated content to be labeled as such. Germany, Italy, and France are also working on specific guidelines that control the application of the models rather than the technology itself.

The United States

In the USA, President Biden issued an executive order in October 2023 that guides the use of AI in federal agencies, but it is so broad that it offers little beyond what is already common sense. Current proposals regarding deepfakes in the US House of Representatives have more substance, but they were proposed four years ago and only recently resurfaced.

The United Kingdom

The UK hosted a safety summit that led to 28 countries and the EU signing the Bletchley Declaration, which commits signatories to international collaboration to ensure safe AI for all. The UK National AI Strategy also includes some notable specifics around funding AI scholarships and education.

These efforts may be useful in raising awareness and building consensus, but builders and deployers of AI models must be more proactive right now.

Your 2024 resolutions:

Form a governance and ethics committee. For builders of AI, ensure a governance and ethical review committee is in place before building begins. Articulate your organization's ethical values and outline how the company will live up to them.

Don’t let your tech be misused. Include a misuse and abuse clause in software licensing that enables you to deny service to anyone using your product in a way that harms others.

Bring in outside ethical experts. Hire an external ethicist to identify potential unintended outcomes. Internal teams may lack the objectivity to identify such blindspots and may have misaligned incentives to call them out when under the pressure of deadlines and revenue goals. 

Understand bias, especially in training data. Recognize that many biases start in the training data sets; account for this in the model and evaluate using synthetic data to compensate for biased training data.

Make ethics part of your buying decision. For deployers and buyers of AI, evaluate a tech provider’s protocols for ethics by design and fail-safe switches.

Help the wider community understand the implications of Gen AI. As experts in data, analytics, and AI, be an active participant in training citizens and policymakers by partnering with groups such as Data Science for Everyone, Mark Cuban Foundation, and Responsible AI Institute.

Bring ethics into education for data careers. For educators, require ethics as part of the data science and business analytics curriculum; leaders should offer the same for employees, just as topics such as cybersecurity awareness require annual certification.

GET A DEMO 

2024 is your year to become a Renaissance business.

The world’s most innovative companies use AI-Powered Analytics from ThoughtSpot to empower every person in their organization with the ability to ask and answer data questions, create and interact with data-driven insights, and use these insights to make informed decisions.

About the authors

Cindi Howson

Chief Data Strategy Officer
ThoughtSpot

Cindi Howson is the Chief Data Strategy Officer at ThoughtSpot and host of the award-winning The Data Chief podcast. Cindi is an analytics and BI thought leader and expert with a flair for bridging business needs with technology. In her role at ThoughtSpot, she advises top clients on data strategy and best practices for becoming data-driven and influences ThoughtSpot’s product strategy. She was awarded Motivator of the Year by Women Leaders in Data and AI and named one of the most influential people in data in the DataIQ 100.

Sonny Rivera

Senior Analytics Evangelist
ThoughtSpot

Sonny Rivera is a Senior Analytics Evangelist at ThoughtSpot. Sonny is a modern data stack thought leader and expert. He brings with him over 25 years of experience in delivering data solutions that drive business value and increase the speed to insights. He also has a long history of bridging the gap between business needs and technical capabilities.

Benn Stancil

Field CTO
ThoughtSpot

Benn Stancil is the Field CTO at ThoughtSpot. He joined in 2023 as part of its acquisition of Mode, where he was a Co-Founder and CTO. While at Mode, Benn held roles leading Mode’s data, product, marketing, and executive teams. He regularly writes about data and technology at benn.substack.com. Prior to founding Mode, Benn worked on analytics teams at Microsoft and Yammer.