Data warehouse naming convention guide your team will actually follow
Views
3.9K
Copies
1.1K
Likes
701
Comments
0
Copy rate
28.0%
Prompt
You are a principal data engineer. Write a clear, opinionated naming convention guide for a [warehouse_type] data warehouse that follows a [modeling_approach] pattern.
Cover:
1. Layer prefixes (raw, staging, intermediate, marts)
2. Table naming (fact vs dim, event tables, snapshot tables)
3. Column naming (keys, timestamps, booleans, currencies)
4. dbt-specific conventions
5. Rules for acronyms, abbreviations, and plurals
6. Examples for each rule plus 3 anti-patterns to avoid
Keep it concise, practical, and enforceable.Customise this prompt
Fill in 2 variables to personalise this prompt
Preview
You are a principal data engineer. Write a clear, opinionated naming convention guide for a [warehouse_type] data warehouse that follows a [modeling_approach] pattern.
Cover:
1. Layer prefixes (raw, staging, intermediate, marts)
2. Table naming (fact vs dim, event tables, snapshot tables)
3. Column naming (keys, timestamps, booleans, currencies)
4. dbt-specific conventions
5. Rules for acronyms, abbreviations, and plurals
6. Examples for each rule plus 3 anti-patterns to avoid
Keep it concise, practical, and enforceable.
Example output
# Data Warehouse Naming Convention Guide
This guide is the single source of truth for naming schemas, tables, and columns in our Snowflake warehouse. We follow a dimensional (Kimball) modeling approach with a dbt-driven transformation pipeline. If it is not in this document, ask in #data-platform before inventing a pattern.
## 1. Schema and Layer Prefixes
Every schema name communicates its role. We use four layers:
- **raw_** — untouched ingest from source systems (raw_stripe, raw_salesforce). Read-only, append-only, typed loosely.
- **stg_** — one-to-one cleaned views of raw tables. One stg_ model per source table, nothing more.
- **int_** — intermediate transformations that are reused across marts but are not themselves end-user facing.
- **marts_** — business-facing schemas, subdivided by domain (marts_finance, marts_product, marts_marketing).
Never mix layers in one schema. Never query raw_ directly from a dashboard.
## 2. Table Naming
Fact tables are plural nouns prefixed with fact_, describing a business event: fact_orders, fact_page_views, fact_subscriptions. Dimension tables are singular nouns prefixed with dim_: dim_customer, dim_product, dim_date. Snapshot tables (slowly changing type 2) end in _history: dim_customer_history. Bridge tables go in int_ and end in _bridge.
## 3. Column Naming
All columns are snake_case and lowercase. Use full words over abbreviations. Specific rules:
- Primary key: table_name_id (customer_id, order_id). Never just "id".
- Foreign key: matches the referenced primary key exactly.
- Timestamps: suffix with _at (created_at, updated_at, shipped_at). All timestamps are UTC.
- Dates (no time): suffix with _date (order_date, signup_date).
- Booleans: prefix with is_, has_, or can_ (is_active, has_trial).
- Currency: suffix with _amount and store in minor units where possible; add a companion _currency column for multi-currency tables.
- Counts: suffix with _count (order_count, session_count).
## 4. dbt Conventions
Model filenames exactly match the resulting table name. Test files live alongside models. Sources go in _sources.yml, documentation in _models.yml. Every mart model has a description, every column has a description, every primary key has a unique + not_null test.
## 5. Acronyms, Abbreviations, Plurals
Write acronyms lowercase in snake_case: url, api, sku, utm_source. Spell out words unless the abbreviation is universally known in the business (ltv, mrr, arr are acceptable). Fact tables are plural, dimensions are singular — no exceptions.
## 6. Anti-Patterns to Avoid
**Anti-pattern 1:** Generic table names like "data", "records", or "info". Every table must answer "what grain of what thing".
**Anti-pattern 2:** Mixed-case or CamelCase identifiers. Snowflake will upper-case them and make querying miserable in other clients.
**Anti-pattern 3:** Datestamps baked into production table names (orders_2024). Partition instead; tables are forever, dates rotate.
When in doubt: be boring, be explicit, be consistent. Clever names age badly; clear names compound.