Optimizing the Data Vault architecture on Snowflake & free SQL!

Optimizing Data Vault on Snowflake Cloud Data Platform
Data Vault physical Data Model designed for the Snowflake Cloud Data Platform using SQLDBM

Following on from my recent presentation at the virtual Munich Data Modeling Meetup on how to optimize the Data Vault architecture on Snowflake, along with the session recording, the SQL scripts are now available.


Data Vault is a modern, agile approach to building an enterprise data warehouse and analytics repository. It provides auditable historical data which can be loaded at speed.

Snowflake is a fully-managed service for data warehousing, data lakes, data engineering, data science, data application development, and for securely sharing and consuming data.

My session followed a series of blog posts by Kent Graziano, Chief Technical Evangelist at Snowflake. Areas covered included parallel data loading of a Data Vault on Snowflake, separation of workloads, key management, and managing big data to support schema-on-read.

Session recording

A recording of the session can be found here.

SQL Scripts on Github

The SQL scripts used during the presentation are now available on Github.

SQL Scripts summary

Script nameDescription
01 – DDL.sqlThe DDL used to create the required Data Vault tables etc. on Snowflake.
05 – Multi table inserts.sqlLoads the Data Vault from ‘Staging’ (sample Snowflake database) using Multi Table Insert statements. Uses Overwrite All for testing purposes.
12 – Create Warehouses.sqlCreates 3 x Snowflake Virtual Warehouses to load Hubs, Links, and Satellites separately.
15 – MultipleWarehouses – 01 – Load Hubs.sql
15 – MultipleWarehouses – 02 – Load Sats.sql
15 – MultipleWarehouses – 03 – Load Links.sql
Each script loads Hubs / Links / Satellites separately using a dedicated Virtual warehouse for separation of workload.
30 – VARIANT – Data Load.sqlLoad JSON into the Data Vault
40 – BusinessVault.sqlParse JSON data and optimization of the Business Vault.
Tagged on:

2 thoughts on “Optimizing the Data Vault architecture on Snowflake & free SQL!

  1. Jimmy Lin

    Hey Dan, Thank you for providing the information and scripts in the GitHub repository. One thing I did notice was that the OpenWeather sample data in Snowflake was deprecated so your Variant – Data Load.sql doesnt work correctly anymore.

    1. Dan Galavan Post author

      Hey Jimmy,

      Thanks for the message. Yes it depends; that script is tailored for the sample data that’s available with some of the earlier regions e.g. AWS Ireland.

      Best regards,

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.