Data Mesh

Data Mesh was shaped by Thoughtworks and is a decentralized, cross-domain data architecture that focuses on performance, agility and self-service. areto offers Data Mesh solutions with Snowflake, Data Virtuality and Azure.

What is Data Mesh?

The term data mesh describes a domain-driven analytical data architecture in which data is treated as a product. In this decentralized architecture, data from individual business units (domains) is not combined in a large platform, but is managed, processed and stored by the corresponding business units. The goal of Data Mesh is to create a scalable self-service platform where data supply and demand come together on a domain-structured level.

The term Data Mesh was coined in 2019 by Zhamak Dehghani at Thoughtworks. The paradigm is intended as an alternative to traditional data management solutions such as data warehouses and data lakes. Instead of a monolithic platform, data is decentralized and ownership of it is assigned to individual business units. Rather than channeling their data into a central platform, the business units host their own domain and provide the data in a way that is easily consumable. The data can still physically reside on centralized infrastructure; only the content and its ownership are assigned to the individual units.

“A data mesh considers domains as a first-class concern, applies platform thinking to create self-serve data infrastructure, treats data as a product, and introduces a federated and computational model of data governance.”

This data can then be duplicated for different areas and put into the format that is appropriate for each use. This requires a shift in thinking from the traditional push-and-ingest ETL model to a serve-and-pull model for all business domains. Interfaces between domains require a common language and shared understanding as well as technological solutions.
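The serve-and-pull idea can be illustrated with a minimal Python sketch. It is not taken from any specific product: the `DataProduct` class, its `serve` method and the sample `orders` data are hypothetical names chosen for illustration. The owning domain keeps the data and consumers pull it in the format they need.

```python
import csv
import io
import json
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A dataset that one domain owns and serves to others (hypothetical)."""
    name: str
    owner_domain: str
    rows: list = field(default_factory=list)

    def serve(self, fmt: str = "json") -> str:
        """Return the current rows in the format the consumer asks for."""
        if fmt == "json":
            return json.dumps(self.rows)
        if fmt == "csv":
            if not self.rows:
                return ""
            buffer = io.StringIO()
            writer = csv.DictWriter(
                buffer, fieldnames=list(self.rows[0]), lineterminator="\n"
            )
            writer.writeheader()
            writer.writerows(self.rows)
            return buffer.getvalue()
        raise ValueError(f"unsupported format: {fmt}")


# The sales domain owns the product; the sample rows are made up.
orders = DataProduct(
    name="orders",
    owner_domain="sales",
    rows=[{"order_id": 1, "amount": 120.0}, {"order_id": 2, "amount": 80.5}],
)

# Consumers pull on demand instead of waiting for a central ETL push.
as_json = orders.serve("json")
as_csv = orders.serve("csv")
```

In a real data mesh the serving interface would typically be a SQL endpoint, an API or a data share rather than an in-process method; the sketch only shows the direction of the data flow.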

The basic idea is to align the data management architecture, and the effort of extracting value from existing data with AI and analytics, with the complexity of the organization, and to keep both scalable for the future.

Data Mesh Features

Results-oriented by thinking in terms of data products

Focused on operations and analysis

Capturing data in real time

Decentralized architecture

Data streams as standard processing

Direct connection of data producers and consumers

Integrated security and transparency

Rethinking from ETL to CTL

Why Is Data Mesh Useful?

In a data mesh, data is not managed by a central team that brings together and maintains the data from all business units. Instead, each business unit is responsible for its own data and the data products created from it, because the units work with these data products most, know them best and can prepare them. Live data is streamed via data pipelines, so that nothing is lost during preparation or transmission. This provides a high-quality basis for data-driven decisions. A data mesh can be placed on top of existing systems.

In classic BI architectures such as data warehouses or data lakes, a central data team is responsible for the centralized data. Because data volumes keep growing and project teams and data scientists all have to be supplied, these teams are often overloaded. As a result, both the speed of provisioning and the quality of the data suffer. The problem is usually not the technology but the organizational structure. This is where the Data Mesh approach steps in: it aims to prevent the problem from occurring in the first place.

Data Mesh Architecture

For Data Mesh, the software principle of domain-driven design applies. According to this principle, the software, or in this case the technical architecture, should be based on the logic and structure of the company, not the other way around. This is the best way to ensure fast value creation and reusability.

There are various approaches and expansion stages of Data Mesh. Large organizations in particular like the decentralized architecture approach. However, some of them fear that merging data can lead to duplicates, the formation of data silos and overlapping areas of responsibility. A hybrid approach and partial implementation are also possible and partly resemble a data lakehouse architecture.

Every organization has its own data architecture and unique challenges to consider. Not all organizations are ready for full-scale decentralization. Our Data Mesh experts will be happy to advise you on which topology is the best fit for you.

4 Basic Principles of Data Mesh

From Classical Data Architecture to Data Mesh

From centralized ownership to decentralized ownership

From centralized data storage to an ecosystem of data products

From a main focus on data pipelines to a main focus on individual business unit data

From data as a by-product to data as a product

From a centralized data team to cross-functional divisional data teams

Benefits of Data Mesh in Numbers

clarity on the value of enterprise data: 47 %
operational data availability through data pipelines: 0 %
faster innovation cycles by shifting from ETL to CTL: 0 x
less data engineering: 25 %

5 Essential Actions for Successful Implementation of Data Mesh

The organization needs to carry out a cost-benefit analysis up front. Data Mesh makes sense for companies with multiple business units that have the resources for structural change and new roles. Among these new roles is a product manager in each business unit. This position combines data management tasks, data analysis, and traditional product manager tasks. Each business unit also needs a data team to prepare and share data.

A team of experts should be available across the business units for special topics and assistance; this team also clearly defines the tasks of the individual departments. In addition, a data office team, as a new organizational unit, ensures the development and implementation of a suitable business strategy. The result is a decentralized, data-driven culture.

The basic principle of data networking applies. Each area must share data access with other areas. To ensure efficiency and rich analytics, data must be prepared before sharing. The data infrastructure for this must be fundamentally aligned. If organizational units use different cloud platforms, for example, this data must also be available to other domains across platforms. Some providers already offer their own exchange platforms in the cloud, so that no data pipelines are necessary for sharing. It is possible to extend the model to collaboration with customers and partners. This way, they can also access the organization’s live data on demand, without the need for ETL or copies in the cloud.

In order for data to be shared and used across departments, individual business units must prepare the data and make it accessible. Metadata should also be included in the data product to ensure understanding from a technical and business perspective.

Another option is a data catalog that acts as a metadata repository in which each domain publishes its data products. Governance and access rights can also be set there, as a management layer, by defining groups and roles. Some cloud providers already offer data catalog services to their customers, which simplifies integration.
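As a rough illustration of such a catalog, the following Python sketch keeps metadata for each published data product and checks access by role. All class, method and role names (`CatalogEntry`, `DataCatalog`, `sales_analyst` and so on) are assumptions made for this example and do not correspond to any particular catalog service.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata that a domain publishes for one data product (hypothetical)."""
    name: str
    domain: str
    description: str                 # business metadata
    schema: dict                     # technical metadata: column -> type
    allowed_roles: set = field(default_factory=set)


class DataCatalog:
    """In-memory metadata repository; a real catalog would be a shared service."""

    def __init__(self):
        self._entries = {}

    def publish(self, entry: CatalogEntry) -> str:
        """Register a data product under the key '<domain>.<name>'."""
        key = f"{entry.domain}.{entry.name}"
        self._entries[key] = entry
        return key

    def search(self, term: str) -> list:
        """Find products whose name or description contains the term."""
        term = term.lower()
        return sorted(
            key
            for key, entry in self._entries.items()
            if term in entry.name.lower() or term in entry.description.lower()
        )

    def can_read(self, key: str, role: str) -> bool:
        """Role-based access check on the catalog (management) layer."""
        return role in self._entries[key].allowed_roles


catalog = DataCatalog()
key = catalog.publish(
    CatalogEntry(
        name="orders",
        domain="sales",
        description="Confirmed customer orders, one row per order",
        schema={"order_id": "int", "amount": "float"},
        allowed_roles={"sales_analyst", "finance_controller"},
    )
)

found = catalog.search("customer")
finance_ok = catalog.can_read(key, "finance_controller")
intern_ok = catalog.can_read(key, "marketing_intern")
```

The point of the sketch is that discovery and access rules live next to the metadata, while the data itself stays with the owning domain.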

Data governance is often cited by enterprises as a top priority. Traditionally, a central data warehouse team is responsible for it. In the data mesh, each domain owns its data and accordingly has its own governance. Coordination between the individual domains is essential for this.

A data office is established for this purpose: a small team led by a Chief Data Officer. It is primarily responsible for coordinating data management. The exact tasks vary from organization to organization and may include developing global governance policies or quality control of data products.
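A global governance policy of this kind can be expressed as an automated check that every domain runs against its own data products. The Python sketch below assumes a simple, made-up policy (every product needs an owner, a description and a schema, and columns flagged as personal data must be masked); the field names and the rule set are illustrative, not a standard.

```python
# Assumed global policy, defined once by the data office.
REQUIRED_FIELDS = ("owner", "description", "schema")


def check_product(product: dict) -> list:
    """Return the list of policy violations for one data product descriptor."""
    violations = []
    for name in REQUIRED_FIELDS:
        if not product.get(name):
            violations.append(f"missing {name}")
    # Columns flagged as personal data (pii) must also be flagged as masked.
    for column, meta in product.get("schema", {}).items():
        if meta.get("pii") and not meta.get("masked"):
            violations.append(f"unmasked PII column: {column}")
    return violations


# Hypothetical descriptors published by two domains.
sales_orders = {
    "owner": "sales",
    "description": "Confirmed customer orders",
    "schema": {
        "order_id": {"pii": False},
        "email": {"pii": True, "masked": True},
    },
}
hr_employees = {
    "owner": "",
    "description": "Employee master data",
    "schema": {"ssn": {"pii": True}},
}

sales_report = check_product(sales_orders)
hr_report = check_product(hr_employees)
```

Each domain remains responsible for fixing its own violations; the data office only defines the rules and reviews the results.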

Self-service is a key principle of Data Mesh and should be enabled across the board. This applies to the entire lifecycle of data and analytics. Individual business units can thus access and manage all the data they need across the enterprise, independently of a central infrastructure team. To support this, the Data Office can recommend tools such as:

  • Data management tools such as a cloud platform environment that makes data shareable
  • Data catalog tools that support the discovery of data across domains
  • Data analysis tools for self-service such as the ability to set up a development or test environment

Data Mesh Use Cases

AI and ML

Machine learning (ML) and artificial intelligence (AI) models can be easily fed with data from multiple sources without running the data through a centralized location.

Marketing

Marketing teams are able to execute the right campaigns for the right customers, at the right time, and through the right channels.

IT

Data latency can be reduced with instant access to query data from nearby areas without access restrictions.

Customer360

More automated processes that provide a better and more contextual customer experience. This results in lower average handling time, better first-contact resolution, and higher customer satisfaction.

Data Protection

Security policies can be easily applied by integrating external data governance, policy, and security tools at a global level before making them available to data consumers in business units.

Data Mesh with Snowflake

The Snowflake platform is the cornerstone of the data cloud and is specifically designed to connect businesses of all sizes and industries. A unified architecture makes it possible to integrate a wide variety of data from a wide variety of sources into the platform. Snowflake is very powerful and can be used for a variety of workloads. This makes it the fastest growing data platform on the market today.

One of the biggest challenges of data mesh is how separate teams can create data products cross-functionally and distribute them across many domains. Through its multi-cluster architecture, Snowflake provides a unique way to share data: the Snowflake Marketplace. Here, data owners can give data consumers access to live data in minutes, without having to duplicate or move it in any way. This also applies across organizations, for example to customers and partners. This increases transparency, customer satisfaction and thus business performance.

The governance of the data is also simplified by Snowflake. You can track data usage and historical access at any time. In addition, you can see which data is most frequently used internally and externally.

It works

Replace manual processes with automation to operate at scale, optimize costs, and minimize downtime.

Performance

Run any number or type of job across all users and data volumes quickly and reliably.

Collaboration

Extend access and collaboration across teams, workloads, and data seamlessly and securely.

Additionally, the Snowflake Cloud Data Platform's independence from any single cloud provider is very relevant for data mesh architectures. Snowflake runs on AWS, Azure and Google Cloud. In the future, Snowflake will also be available in many more cloud regions than solutions from other cloud providers. It can be used across regions and clouds and connects regions and cloud systems with each other, which greatly simplifies the adoption of Data Mesh.

Become a data-driven company with the areto Snowflake experts!

BENEFITS WITH SNOWFLAKE CLOUD DATA PLATFORM

Powerful utilization for multiple workloads

Data sharing through the Snowflake Marketplace

No need to duplicate or move data for access

Simplified Governance

Data mesh implementation through cross-cloud usage

The Snowflake Cloud Data Platform gives you improved scalability with a low cost of ownership. Even in the event of data loss, data can be recovered. Reduce the complexity of your technology landscape with Snowflake as a central component. This allows your specialists to focus on data development rather than maintenance and administration.

Data Mesh with Data Virtuality

The Data Virtuality platform enables distributed architectures that are essential to the data mesh concept. It is designed to create a unified and secure data layer across multiple distributed data systems.

  • Data integration capabilities allow domains to manage and prepare their own data
  • Data models can be created and shared by individual domains in the virtual layer
  • Data products can be accessed via SQL or APIs
  • A central team can perform global governance, quality and security measurements with the Data Virtuality platform
  • Metadata stores make data discoverable
  • Various tools for governance, policies and data catalogs can be easily integrated
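The idea of one SQL layer over separately owned data stores can be sketched with Python's built-in `sqlite3` module. This is only a stand-in for a virtual data layer and does not use the Data Virtuality platform or its API; the `sales` and `marketing` databases, their tables and the sample rows are assumptions for the example.

```python
import sqlite3

# One connection plays the role of the shared access layer.
conn = sqlite3.connect(":memory:")

# Each attached database stands in for a store owned by one domain.
conn.execute("ATTACH DATABASE ':memory:' AS sales")
conn.execute("ATTACH DATABASE ':memory:' AS marketing")

conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
conn.execute(
    "CREATE TABLE marketing.campaigns (customer_id INTEGER, campaign TEXT)"
)
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?)",
    [(1, 120.0), (2, 80.5), (1, 30.0)],
)
conn.executemany(
    "INSERT INTO marketing.campaigns VALUES (?, ?)",
    [(1, "spring"), (2, "summer")],
)

# A consumer combines both domains' data with a single SQL query,
# without first copying either table into a central store.
revenue_by_campaign = conn.execute(
    """
    SELECT c.campaign, SUM(o.amount)
    FROM sales.orders AS o
    JOIN marketing.campaigns AS c ON c.customer_id = o.customer_id
    GROUP BY c.campaign
    ORDER BY c.campaign
    """
).fetchall()
```

A data virtualization platform applies the same principle to physically separate systems and adds governance, security and metadata on top.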

Data Mesh Principles with Data Virtuality

Domain-oriented ownership

Scalability through decentralized data ownership

  • Independent setup of datasets and models of domains
  • Data virtualization with or without data replication
  • Use of different tools in the domains through SQL

Data as a Product

Data becomes a product when it can be easily understood and securely used

  • Web Business Shop makes data discoverable
  • 200 connectors for easy access to data stores
  • Data delivery layer as a powerful foundation for data utilization
  • Transformation option within the platform

Self Service

Low complexity by supporting autonomous data usage in the self-serve platform

  • Self-service access through Data Marketplace
  • Data modeling layer can be easily shared among users
  • Easy combination of analytical and operational data

Federated Governance

Compatible ecosystem through global rules for federated governance

  • Unified data management
  • Security and privacy features also at row and column level
  • Data creation within the platform
  • Easy integration with external centralized identity and policy management platforms

Become a data-driven company with the areto Data Mesh experts!

Overtake the competition by making faster and better decisions!

Find out where your company currently stands on the way to becoming a data-driven company.
We analyze the status quo and show you what potential exists.
How would you like to start?

Free consultation & demo appointments

Do you already have a strategy for your future data mesh solution? Are you already taking advantage of modern cloud platforms and automation? We would be happy to show you examples of how our customers are already using areto’s agile and scalable data mesh solutions.

Workshops / Coaching

Our workshops and coaching sessions provide you with the necessary know-how to build a modern data mesh architecture. The areto Data Mesh TrainingCenter offers a wide range of learning content.

Proofs of Concept

Which architecture is right for us? Are the framework conditions suitable? Which prerequisites have to be created? Proofs of concept (PoCs) answer these and other questions so that you can make the right investment decisions. This way, you start your project optimally prepared.

Data Mesh Know-How Video Library

Introduction to Data Mesh – Zhamak Dehghani

Keynote – Data Mesh by Zhamak Dehghani

The Shift to Data Mesh

Leverage your data. Discover opportunities. Gain new insights.

We look forward to hearing from you

Till Sander
CTO
Phone: +49 221 66 95 75-0
E-mail: till.sander@areto.de