Data is the backbone of modern businesses. When leveraged correctly, it helps organizations grow revenue 10-30% more than their peers, delighting customers, employees, and investors in the process.
With the massive scale of cloud data platforms, capturing and storing data has become incredibly cheap, while the data itself has never been more valuable. As companies look to capitalize, data volumes have exploded. Awash in so much information, organizations now have a new challenge: managing all this data. Traditional data architectures were designed to manage structured data in a highly centralized manner, which might not be sufficient for the needs of modern businesses. Enter data mesh – a new approach to data architecture that is gaining popularity among tech leaders.
A data mesh is a type of data platform architecture built on a domain-oriented, self-service design that enables teams to create, secure, process, and share data products across domains in an organization. This approach allows organizations to access distributed datasets without relying on monolithic data infrastructures and processes.
Data meshes focus on creating highly scalable architectures to optimize the flow of data while ensuring security, governance, and interoperability. They are designed to ensure that domains within an organization have complete control over their data and make it easy to access the right information at the right time. By using a data mesh and pairing it with a self-service analytics tool, organizations can create an environment where stakeholders have access to up-to-date, trusted data at all times.
Traditional data architectures were designed to manage structured data in a centralized manner. In this approach, data is usually managed by a team of specialists, often in a center of excellence or a centralized business intelligence team that maintains a single, monolithic data warehouse. For many organizations, this makes it hard to leverage data effectively across domains. A central ETL pipeline can limit each domain's access to and control over growing data volumes for different use cases, which, in turn, puts a heavy load on the data team and prevents it from engaging in more strategic, high-value initiatives such as predictive analytics or implementing tools like GPT for analytics. Consequently, traditional data architectures can hinder companies from adapting to rapidly changing market conditions, customer expectations and needs, or disruptions to their operations, leading to delays, missed opportunities, and unrealized revenue. In contrast, data mesh is a distributed architecture that emphasizes domain data ownership, self-service data platforms, and cross-functional collaboration.
While implementing a data mesh will vary organization to organization, there are several key principles that apply across the board, including:
Each domain owner is responsible for the quality, availability, and compliance of their data as products. This approach enables cross-functional collaboration and communication across domains, even though the data itself is distributed.
In data mesh, data governance is federated rather than centralized. While each domain is responsible for its own data governance, there is a universal set of data standards and policies that every domain must agree to. This helps ensure that the data delivered to consumers is compliant and of high quality.
Data mesh emphasizes self-service data platforms that allow domain experts to store, access, transform, and manage their data without relying on a central data team. While this can result in some duplication, the approach not only improves data agility but also reduces the burden on the central data team. These platforms are particularly powerful when paired with a self-service BI tool that allows business people to explore the data without requiring the data team’s intervention.
Data mesh emphasizes treating data as a product that can be decentralized and owned by the domain experts. This approach not only enables cross-functional collaboration but also makes it easier to scale data infrastructure.
While data products can refer to many things, such as building a data app, in a data mesh context, data products are autonomous data assets that are managed by domain experts who are responsible for their quality, accessibility, and usability. These domains can be structured around business processes, customer segments, product lines, or any other relevant dimension.
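To make the idea of an autonomous, owned data asset more concrete, here is a minimal sketch of how a data product might be described in code. The `DataProduct` class, its fields, and the `conforms` check are illustrative assumptions, not part of any data mesh standard or tool.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data product owned by one domain."""
    name: str
    domain: str    # e.g. a business process, customer segment, or product line
    owner: str     # the domain team accountable for quality and usability
    schema: dict   # column name -> Python type; the published contract
    description: str = ""

    def conforms(self, record: dict) -> bool:
        """Return True if a record has exactly the published columns and types."""
        if record.keys() != self.schema.keys():
            return False
        return all(isinstance(record[col], typ) for col, typ in self.schema.items())


orders = DataProduct(
    name="orders",
    domain="sales",
    owner="sales-data-team",
    schema={"order_id": int, "amount": float},
    description="Confirmed customer orders",
)

print(orders.conforms({"order_id": 1, "amount": 9.5}))    # matches the contract
print(orders.conforms({"order_id": "1", "amount": 9.5}))  # wrong type for order_id
```

Publishing the schema alongside a named owner is one simple way to make quality and usability someone's explicit responsibility.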
Identifying data products requires a deep understanding of the business, its use cases, and associated processes. It involves mapping out the data flows and identifying the areas where data ownership is ambiguous. Once the data products have been identified, domain experts can be assigned to take ownership and start managing them independently.
A data mesh approach requires the right setup for the data teams. These teams are composed of domain experts and tech specialists who are responsible for managing the data products. They should have the skills and knowledge to ensure that the data is of high quality, accessible, and usable, along with a thorough understanding of the data KPIs that underpin these qualities. These teams should also have the autonomy to make decisions related to their domains.
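As a rough illustration of that mapping exercise, the sketch below takes a made-up inventory of ownership claims and data flows and flags every dataset that has no owner or more than one. The dataset names, team names, and the `find_ambiguous_ownership` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory: (dataset, team that claims to maintain it)
claims = [
    ("orders", "sales"),
    ("customers", "sales"),
    ("customers", "support"),
    ("web_events", "marketing"),
]

# Hypothetical data flows: (producer dataset, consumer dataset)
flows = [
    ("orders", "revenue_report"),
    ("customers", "revenue_report"),
    ("web_events", "attribution"),
]


def find_ambiguous_ownership(claims, flows):
    """Return datasets with zero owners or more than one owner."""
    owners = defaultdict(set)
    for dataset, team in claims:
        owners[dataset].add(team)
    # Every dataset that appears in a flow should have an owner too.
    datasets = set(owners) | {d for flow in flows for d in flow}
    return {
        d: sorted(owners.get(d, set()))
        for d in sorted(datasets)
        if len(owners.get(d, set())) != 1
    }


for dataset, teams in find_ambiguous_ownership(claims, flows).items():
    print(dataset, "->", teams or "no owner")
```

In this toy inventory, `customers` is claimed by two teams, while `revenue_report` and `attribution` appear in flows without any owner. Those are the places where a domain expert still needs to be assigned.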
"I think what a lot of people misjudge is that data mesh is an architecture paradigm and it is not a architecture paradigm. It is a ways of working with data in an organization for scaling."
See why online retailer Zalando was first to embrace the Data Mesh.
It is important to note that building data teams is not just about hiring new people. It may involve reorganizing existing teams, providing them with the necessary training and support to become data experts, and helping these data professionals grow their careers. This can be achieved by providing training programs, mentoring, and coaching.
Creating a data mesh infrastructure involves providing the necessary tools and technologies to support the data teams. This may include data storage, data processing, data access, data pipelines, and data security technologies. The infrastructure should be designed to support the autonomy of the data teams while ensuring that the data is still consistent, reliable, and secure.
The data mesh infrastructure should also support the self-service analytics platforms that enable domain experts to access and manage their own data. These platforms should be easy to use, flexible, and customizable to meet the specific needs of each domain, empowering them to take advantage of the data assets they have developed.
Data governance involves defining the global policies, procedures, and standards that govern the use, management, and sharing of data within the organization. Data governance should also establish the roles and responsibilities of the data teams, define the processes for data management and sharing, and ensure that the data is compliant with legal and regulatory requirements.
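One way to picture global policies applied across autonomous domains is a shared check that every data product must pass before it is published. The following is a minimal sketch under assumed rules: the policy names, metadata keys, and the `check_global_policies` function are invented for illustration and do not come from any specific governance tool.

```python
from dataclasses import dataclass, field

# Hypothetical organization-wide standards that every domain agrees to.
GLOBAL_POLICIES = {
    "required_metadata": {"owner", "description", "pii_classification"},
    "allowed_pii_classifications": {"none", "masked", "restricted"},
}


@dataclass
class DataProductMetadata:
    domain: str
    name: str
    metadata: dict = field(default_factory=dict)


def check_global_policies(product: DataProductMetadata) -> list:
    """Return a list of policy violations; an empty list means compliant."""
    violations = []
    missing = GLOBAL_POLICIES["required_metadata"] - set(product.metadata)
    for key in sorted(missing):
        violations.append(f"missing required metadata: {key}")
    pii = product.metadata.get("pii_classification")
    if pii is not None and pii not in GLOBAL_POLICIES["allowed_pii_classifications"]:
        violations.append(f"unknown pii_classification: {pii}")
    return violations


orders = DataProductMetadata(
    "sales",
    "orders",
    {"owner": "sales-data-team", "description": "Confirmed orders", "pii_classification": "masked"},
)
clickstream = DataProductMetadata("marketing", "clickstream", {"owner": "marketing-data-team"})

for product in (orders, clickstream):
    print(product.domain, product.name, check_global_policies(product))
```

Each domain still decides how it manages its data. The shared check only enforces the small set of standards that everyone has agreed on.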
Data mesh offers a wide range of benefits for businesses and organizations. Some of these include:
With data mesh, each team is responsible for the data within its domain and therefore has a vested interest in ensuring that the data is accurate, up-to-date, and relevant. This can help to reduce the inconsistencies and unmanaged duplication that often arise when data is managed centrally, while driving a high degree of ownership. As a result, teams are more likely to hit their data quality metrics.
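The article does not say which data quality metrics a team should track. As one possible illustration, the sketch below computes two common ones, completeness and freshness. The function names, field names, and thresholds are assumptions.

```python
from datetime import datetime, timedelta, timezone


def completeness(rows, required_fields):
    """Share of rows in which every required field is present and not None."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(f) is not None for f in required_fields) for row in rows
    )
    return complete / len(rows)


def is_fresh(last_updated, max_age, now=None):
    """True if the data was updated within the allowed age."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated <= max_age


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
    {"id": 4},
]
print(completeness(rows, ["id", "email"]))

last_load = datetime.now(timezone.utc) - timedelta(hours=2)
print(is_fresh(last_load, max_age=timedelta(hours=24)))
```

A domain team could run checks like these on every load and publish the results alongside its data product.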
By allowing each team to own and manage the data within its domain, the teams can respond more quickly to changes in their business environment. This means that they can adapt their data models and processes to better serve the needs of their customers or clients, without adding more technical debt to central data teams.
A data mesh architecture promotes the sharing of data as products across teams and departments. The decentralized strategy behind data mesh creates a new mindset within the organization by viewing data as a product, an asset that can and should drive business performance. This is especially true when data mesh initiatives are paired with modern analytics tools like ThoughtSpot that let all kinds of users engage with data directly.
Within a data mesh architecture, data from disparate systems can be streamed, integrated, transformed, and analyzed within each domain team. Because data no longer has to flow through a single ETL pipeline, efficiency within and across teams increases, while a centralized observability and monitoring infrastructure keeps the whole system protected. This allows domain teams to design and develop according to their use cases while maintaining full control of their data products and services.
Nothing good in life comes without challenges. Some data mesh challenges include:
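The combination of independent domain pipelines and shared monitoring can be sketched in a few lines. In this hypothetical example, each domain writes its own pipeline function, and a shared `observed` decorator records the outcome of every run in a central log. The in-memory `RUN_LOG` list is a stand-in for a real monitoring system, and all names are assumptions.

```python
import functools
import time

# Shared, centrally owned run log (a stand-in for a real monitoring system).
RUN_LOG = []


def observed(domain):
    """Decorator that records status and duration of each domain pipeline run."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "failed"
            try:
                result = func(*args, **kwargs)
                status = "ok"
                return result
            finally:
                # Runs whether the pipeline succeeded or raised.
                RUN_LOG.append({
                    "domain": domain,
                    "pipeline": func.__name__,
                    "status": status,
                    "seconds": time.perf_counter() - start,
                })
        return wrapper
    return decorator


@observed("sales")
def daily_revenue(orders):
    return sum(order["amount"] for order in orders)


@observed("marketing")
def campaign_clicks(events):
    return sum(1 for event in events if event["type"] == "click")


daily_revenue([{"amount": 10}, {"amount": 5}])
campaign_clicks([{"type": "click"}, {"type": "view"}])
for entry in RUN_LOG:
    print(entry["domain"], entry["pipeline"], entry["status"])
```

The domain teams own the pipeline logic, while the platform team owns only the decorator and the log, so monitoring stays consistent without routing data through one central pipeline.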
Traditional centralized data management systems are deeply ingrained in many organizations, and it can be challenging to convince stakeholders to adopt a new approach to data management. Change management continues to be one of the biggest barriers to successful data initiatives, including implementing a data mesh. It may take time and effort to convince stakeholders of the benefits of data mesh and to get their buy-in.
Data mesh requires a different mindset and approach to data management, one that prioritizes decentralization, collaboration, and agility. It may take time and effort to shift the culture within an organization to embrace these principles.
Data meshes are quickly gaining popularity as an effective alternative to traditional data architectures. By understanding the four principles of a mesh, you can more easily and successfully implement one in your organization. You will likely experience numerous benefits, such as improved data quality and data agility. However, to truly reap the rewards of these investments, organizations must make these data products accessible, meaningful, and actionable for every user. Companies like Gilead and CDW are pairing data mesh with ThoughtSpot to turn these data products into business impact. If you want to see the same, sign up for a ThoughtSpot free trial today and witness the power of self-service analytics for data mesh.