Data Vault modeling is quickly becoming the standard approach to modeling a data warehouse. Compared to other popular approaches, Data Vault modeling represents a paradigm shift – a new way of thinking.
DATA VAULT 2.0 // DATA VAULT AUTOMATION
Data Vault is a modeling technique for data warehouses that is particularly suitable for agile data warehouses. It offers a high degree of flexibility for extensions, a complete unitemporal historicization of the data and allows a strong parallelization of the data loading processes. Data Vault modeling was developed in the 1990s by Dan Linstedt. After its first implementations in 2000, it gained greater attention from 2002 onwards through a series of articles. In 2007, Linstedt won the support of Bill Inmon, who called it the “optimal choice” for his DW 2.0 architecture.
areto specializes in the package of modeling, architecture and methodology approaches that Linstedt has promoted since 2013 under the name Data Vault 2.0. Also noteworthy are the publications by Hans Hultgren on Data Vault modeling and by John Giles on creating Data Vault models using patterns.
Ralph Kimball’s dimensional modeling focuses on simple data analysis and is optimal for the access layer of a data warehouse.
Bill Inmon advocated an enterprise integration layer in third normal form, which transforms all source systems into a uniform, historicized departmental model. Modeling in third normal form is optimized for operational systems and quickly reaches its limits when it comes to data integration.
ARCHITECTURE AND MODELING
Data Vault allows flexible and fast customization of the data warehouse. A real advantage for companies. Static data warehouses become increasingly complex over time. This automatically leads to higher costs for continuously occurring extensions and changes to the data warehouse. However, the extensive implementation and test cycles not only lead to an increase in costs, but also often to personnel bottlenecks, innovation backlogs and an exhaustive search for ETL and modeling experts.
Companies that want to survive in today’s competition cannot afford these waiting times. They need to respond quickly to ever-changing market needs, and the data warehouse implementation has to reflect this. Data Vault is the solution.
Modern data warehouses are agile!
Modern
Data Vault combines the best of the dimensional and normalized modeling worlds. Data Vault is specifically designed to solve agility, flexibility, and scalability issues. It was developed as a granular, non-volatile, auditable, historical repository for enterprise data from multiple operational systems.
Modular
Changes extend the model without altering existing structures. As a result, there is hardly any impact on existing processes, and the regression testing effort stays minimal.
Scalable
Complete parallelization of loading. Different interfaces can be loaded independently of each other. Incremental approach. Content is insert-only and historicized in the style of Slowly Changing Dimension Type 2 (SCD2). ETL or ELT can, and should, be automated.
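The insert-only, SCD2-style historicization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Version` layout, the MD5-based `hash_diff` change detection and the function names are hypothetical and not taken from any specific Data Vault tool.

```python
from dataclasses import dataclass
from datetime import datetime
import hashlib


@dataclass(frozen=True)
class Version:
    """One immutable, historicized row for a business key (hypothetical layout)."""
    business_key: str
    load_date: datetime
    hash_diff: str
    attributes: dict


def hash_diff(attributes: dict) -> str:
    # Hash the attributes in a fixed key order so equal payloads hash equally.
    payload = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def load_insert_only(history: list, business_key: str,
                     attributes: dict, load_date: datetime) -> bool:
    """Append a new version only if the attributes changed; never update or delete."""
    new_hash = hash_diff(attributes)
    versions = [v for v in history if v.business_key == business_key]
    latest = max(versions, key=lambda v: v.load_date, default=None)
    if latest is not None and latest.hash_diff == new_hash:
        return False  # unchanged record: nothing is written
    history.append(Version(business_key, load_date, new_hash, dict(attributes)))
    return True


def as_of(history: list, business_key: str, point_in_time: datetime):
    """Return the version that was current at point_in_time, or None."""
    candidates = [v for v in history
                  if v.business_key == business_key and v.load_date <= point_in_time]
    return max(candidates, key=lambda v: v.load_date, default=None)
```

Because existing rows are never modified, loads for different business keys or different source interfaces do not need to coordinate updates with each other, which is what makes them straightforward to run in parallel.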
With Data Vault, you model your data warehouse additively and with agility!
Data Vault was not developed as a pure data model, but rather as a comprehensive collection of methods:
Data Modeling Methods
Methods of data processing
Architectural Principles
Agile development process
The Data Vault architecture essentially consists of three layers:
Data Vault 2.0 offers a high degree of flexibility for extensions of the DWH, a complete historicization of the data and allows a strong parallelization of the data loading processes. In modeling, all information belonging to an object is divided into three categories and strictly separated from each other.
The first category “Hub” includes information that clearly describes an object, i.e. gives its identity (e.g. product number for the product).
Hub – Is the “root” of an entity (integration anchor):
The second category “Link” describes relationships between objects (e.g. assignment of a product to a sales channel).
Link – Maps the relationships between hubs:
Attributes that describe an object (e.g. product name) belong to the third category, the “satellite”.
Satellite – Stores the detailed data from hubs and links:
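The separation into hubs, links and satellites can be illustrated with a small in-memory SQLite schema, using the product and sales channel examples from the text. Everything named here is an assumption made for illustration: the table and column names, the MD5-based `hash_key` helper and the sample values are hypothetical, and a real implementation would follow the conventions of the chosen platform.

```python
import hashlib
import sqlite3


def hash_key(*business_keys: str) -> str:
    # Hypothetical convention: MD5 over the trimmed, upper-cased business keys.
    normalized = "|".join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


DDL = """
CREATE TABLE hub_product (              -- identity of a product
    product_hk     TEXT PRIMARY KEY,
    product_number TEXT NOT NULL UNIQUE,
    load_date      TEXT NOT NULL,
    record_source  TEXT NOT NULL
);
CREATE TABLE hub_sales_channel (        -- identity of a sales channel
    sales_channel_hk TEXT PRIMARY KEY,
    channel_code     TEXT NOT NULL UNIQUE,
    load_date        TEXT NOT NULL,
    record_source    TEXT NOT NULL
);
CREATE TABLE link_product_sales_channel (  -- relationship between the two hubs
    link_hk          TEXT PRIMARY KEY,
    product_hk       TEXT NOT NULL REFERENCES hub_product (product_hk),
    sales_channel_hk TEXT NOT NULL REFERENCES hub_sales_channel (sales_channel_hk),
    load_date        TEXT NOT NULL,
    record_source    TEXT NOT NULL
);
CREATE TABLE sat_product (              -- descriptive attributes, one row per version
    product_hk    TEXT NOT NULL REFERENCES hub_product (product_hk),
    load_date     TEXT NOT NULL,
    product_name  TEXT,
    record_source TEXT NOT NULL,
    PRIMARY KEY (product_hk, load_date)
);
"""


def build_example() -> sqlite3.Connection:
    """Create the schema and load one product assigned to one sales channel."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    load_date, source = "2024-01-01", "erp"
    product_hk = hash_key("P-100")
    channel_hk = hash_key("WEB")
    conn.execute("INSERT INTO hub_product VALUES (?, ?, ?, ?)",
                 (product_hk, "P-100", load_date, source))
    conn.execute("INSERT INTO hub_sales_channel VALUES (?, ?, ?, ?)",
                 (channel_hk, "WEB", load_date, source))
    conn.execute("INSERT INTO link_product_sales_channel VALUES (?, ?, ?, ?, ?)",
                 (hash_key("P-100", "WEB"), product_hk, channel_hk, load_date, source))
    conn.execute("INSERT INTO sat_product VALUES (?, ?, ?, ?)",
                 (product_hk, load_date, "Widget", source))
    return conn
```

In this sketch, adding a new attribute group for products would mean creating another satellite next to `sat_product`, while the hub and link tables stay untouched.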
This type of modeling allows changes to be made flexibly: existing tables do not need to be adapted, new tables are simply added. Because the data loading processes follow a strongly standardized pattern, templates can, and should, be used. As a result, changing or extending a loading process usually requires nothing more than adjusting the configuration.
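A template-driven loading process of this kind might look like the following sketch. It assumes a hypothetical staging layout and naming scheme (`stg_product`, `hub_product`, `product_hk` and so on) and uses Python's standard `string.Template` to render one hub load statement per configuration entry. It is not the mechanism of any particular automation tool.

```python
from string import Template

# One generic load pattern for every hub: insert business keys not yet present.
HUB_LOAD_TEMPLATE = Template(
    "INSERT INTO $hub ($hash_key, $business_key, load_date, record_source)\n"
    "SELECT DISTINCT stg.$hash_key, stg.$business_key, stg.load_date, stg.record_source\n"
    "FROM $staging AS stg\n"
    "WHERE NOT EXISTS (\n"
    "    SELECT 1 FROM $hub AS hub WHERE hub.$hash_key = stg.$hash_key\n"
    ");"
)

# Hypothetical configuration: adding a hub means adding one entry here.
HUB_CONFIG = [
    {"hub": "hub_product", "staging": "stg_product",
     "hash_key": "product_hk", "business_key": "product_number"},
    {"hub": "hub_sales_channel", "staging": "stg_sales_channel",
     "hash_key": "sales_channel_hk", "business_key": "channel_code"},
]


def render_hub_loads(config: list) -> list:
    """Render one SQL load statement per configured hub."""
    # substitute() raises KeyError if an entry lacks a placeholder value.
    return [HUB_LOAD_TEMPLATE.substitute(entry) for entry in config]


if __name__ == "__main__":
    for statement in render_hub_loads(HUB_CONFIG):
        print(statement, end="\n\n")
```

The design choice is that the loading logic lives in one template, so supporting an additional source object is a configuration change rather than new hand-written code.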
In the interest of our customers, areto ensures that data integration is as standardized as possible. The proliferation of Data Vault as a data modeling method for the data warehouse has led to the development of numerous Data Warehouse Automation (DWA) solutions. The combination of leading DWA tools, analytical databases such as Exasol or Snowflake and the technical expertise of areto leads to large time and cost savings. areto offers market-leading solutions from our partners WhereScape, Data Vault Builder and Matillion or our own open source solution areto Data Chef (which we successfully use in many customer projects).
The Data Vault architecture and modeling approach, with its simple and understandable modeling paradigms and naming conventions, enables a quick understanding of both the source and the transformed data. This makes modeling scalable, flexible and internally consistent. It can be adapted to individual company needs and offers optimal support for agile process models.
Data Vault is revolutionizing the architecture of the data warehouse with its new way of data integration and data delivery. Because of the strong standardization of processes, it is possible to automate the data provisioning to a very high degree.
With Data Vault, you create new opportunities and perspectives to grow your business and lead it into the future.
Book a support appointment with one of our Data Vault 2.0 experts! Get quick solution approaches and best practices for your concrete problems with this innovative modeling and architecture approach to agile data warehousing!
Costs
0.5 hours – 110 €
1.0 hours – 200 €
2.0 hours – 350 €
The Data Vault 2.0 consultation hour offers you the opportunity to receive support for small and large questions at short notice. Benefit from the experience of our experts in solving your problem. This way, you can quickly get back to your actual work.
Becoming a data-driven company with the areto Data Vault experts!
Find out where your company currently stands on the way to becoming a data-driven company.
We analyze the status quo and show you what potential exists.
How do you want to get started?
Free consulting & demo appointments
Do you already have a strategy for your future DWH solution? Are you already taking advantage of modern cloud platforms and automation? We would be happy to show you examples of how our customers are already using areto’s agile and scalable DWH solutions.
Workshops / Coachings
Our workshops and coaching sessions provide you with the know-how you need to set up a modern DWH. The areto DWH TrainingCenter offers a wide range of learning content.
Proofs of Concept
Till Sander
CTO
Phone: +49 221 66 95 75-0
E-mail: till.sander@areto.de