Data Mesh

Data Mesh is a decentralised data architecture approach in which data ownership, responsibility, and technical implementation rest with the individual domain teams rather than with a central data-engineering team. Each domain treats its data as a product that it offers to other domains.

Central data lakes and data warehouses often fail to scale in large organisations, because a single central team cannot hold the domain knowledge of every business unit and eventually becomes a bottleneck.

The Four Principles

  1. Domain-Driven Data Ownership: Each domain owns and is responsible for its own data and data pipelines.
  2. Data as a Product: Data is developed with a product mindset, with attention to quality, documentation, and service-level objectives (SLOs).
  3. Self-Serve Data Platform: A central platform provides infrastructure and tooling so that domains can independently create data products.
  4. Federated Computational Governance: Global standards (security, data protection, quality) are enforced in a decentralised but consistent manner.
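
The interplay of principles 1 and 4 can be illustrated with a minimal Python sketch: each domain publishes a descriptor for its data product, and a single set of global policies is evaluated automatically against every descriptor. All names here (`DataProduct`, `GLOBAL_POLICIES`, `check_product`, and the three policy functions) are hypothetical and chosen only for illustration; they do not belong to any Data Mesh standard or tool, and a real platform would likely run such checks in its publishing pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataProduct:
    """Hypothetical descriptor that a domain team publishes for its product."""
    name: str
    domain: str       # owning domain (principle 1)
    owner: str        # accountable team or contact
    documented: bool  # part of the product mindset (principle 2)
    pii_columns: List[str] = field(default_factory=list)
    masked_columns: List[str] = field(default_factory=list)


# A policy is a predicate over a descriptor. The policy set is defined once
# (globally) and applied to every domain's products (federated enforcement).
Policy = Callable[[DataProduct], bool]


def has_owner(product: DataProduct) -> bool:
    return bool(product.owner.strip())


def is_documented(product: DataProduct) -> bool:
    return product.documented


def pii_is_masked(product: DataProduct) -> bool:
    # Every column declared as personal data must also be declared as masked.
    return set(product.pii_columns) <= set(product.masked_columns)


GLOBAL_POLICIES: Dict[str, Policy] = {
    "has_owner": has_owner,
    "is_documented": is_documented,
    "pii_is_masked": pii_is_masked,
}


def check_product(product: DataProduct) -> List[str]:
    """Return the names of the global policies that the product violates."""
    return [name for name, policy in GLOBAL_POLICIES.items() if not policy(product)]


# Two example products from different domains (illustrative values only).
orders = DataProduct(
    name="orders_daily", domain="sales", owner="sales-data-team",
    documented=True, pii_columns=["customer_email"],
    masked_columns=["customer_email"],
)
shipments = DataProduct(
    name="shipments", domain="logistics", owner="",
    documented=True, pii_columns=["recipient_address"],
)

violations = {p.name: check_product(p) for p in (orders, shipments)}
print(violations)
# {'orders_daily': [], 'shipments': ['has_owner', 'pii_is_masked']}
```

The domains stay free to build their products however they like; only the shared policy set is central, which is one way to read "decentralised but consistent".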

Comparison to Centralised Approaches

  • Data Lake: Central repository for all raw data; without clear ownership it often degrades into a "data swamp".
  • Data Warehouse: Centralised and schema-on-write; difficult to scale with many source systems.
  • Data Mesh: Decentralised; each data product defines and publishes its own schema; scales with organisational size.

Focus: Scaling Through Decentralisation

Data Mesh addresses the problem that data expertise cannot be concentrated in a single central team: responsibility for data moves to the domains that understand it best.

FAQ

Do we really need Data Mesh, or is it just hype?

Data Mesh becomes worthwhile once an organisation is large enough that a central data-engineering team turns into a bottleneck. For small to medium-sized organisations, a well-managed data lake or a modern data warehouse is often sufficient.

What is a Data Product in practice?

A Data Product is an asset made available by one domain to other domains: a cleaned, documented, versioned dataset with defined SLOs for freshness and quality.
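
To make this definition more concrete, here is a minimal Python sketch of a versioned, documented data product release with SLOs for freshness and quality, plus a check of whether a given release meets them. The class and field names (`DataProductSLO`, `DataProductRelease`, `max_staleness`, `min_completeness`, `slo_report`) and all values are assumptions made for illustration; completeness is used here as just one possible quality measure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict


@dataclass(frozen=True)
class DataProductSLO:
    """Hypothetical SLO definition for a data product."""
    max_staleness: timedelta  # freshness: maximum allowed age of the data
    min_completeness: float   # quality: required share of non-null values (0..1)


@dataclass(frozen=True)
class DataProductRelease:
    """Hypothetical record describing one published version of a dataset."""
    name: str
    version: str           # versioned
    description: str       # documented
    slo: DataProductSLO    # defined SLOs
    last_updated: datetime
    completeness: float    # measured share of non-null required values


def slo_report(release: DataProductRelease, now: datetime) -> Dict[str, bool]:
    """Compare the measured state of a release against its SLOs."""
    return {
        "fresh": now - release.last_updated <= release.slo.max_staleness,
        "quality": release.completeness >= release.slo.min_completeness,
    }


# Fixed, arbitrary timestamps so the example is reproducible.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
release = DataProductRelease(
    name="orders_daily",
    version="1.2.0",
    description="Cleaned daily order totals per customer.",
    slo=DataProductSLO(max_staleness=timedelta(hours=24), min_completeness=0.99),
    last_updated=datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc),
    completeness=0.995,
)

report = slo_report(release, now)
print(report)
# {'fresh': False, 'quality': True}  (data is 30 hours old, the SLO allows 24)
```

Consuming domains could use such a report to decide whether to trust the current release, while the owning domain could use it for alerting.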
