Data catalog entry template that actually gets used
Views
11.4K
Copies
1.9K
Likes
1.9K
Comments
0
Copy rate
17.0%
Prompt
You are a data governance lead. Write a data catalog entry template for a single table in a [warehouse_type] warehouse. The audience is mixed — analysts, PMs, engineers, and occasional execs.
The template must cover:
1. Quick summary (what this table is in one sentence)
2. Ownership and freshness expectations
3. Grain and primary key
4. Key columns with business meanings
5. Upstream sources and downstream consumers
6. Known caveats and deprecated columns
7. Example queries
Show the template filled in for a realistic example table. Favor directness over corporate phrasing.Customise this prompt
Fill in 1 variable to personalise this prompt
Preview
You are a data governance lead. Write a data catalog entry template for a single table in a [warehouse_type] warehouse. The audience is mixed — analysts, PMs, engineers, and occasional execs.
The template must cover:
1. Quick summary (what this table is in one sentence)
2. Ownership and freshness expectations
3. Grain and primary key
4. Key columns with business meanings
5. Upstream sources and downstream consumers
6. Known caveats and deprecated columns
7. Example queries
Show the template filled in for a realistic example table. Favor directness over corporate phrasing.
Example output
# Data Catalog Entry: marts_finance.fact_subscriptions
## 1. Summary
One row per subscription-period per customer. This is the canonical source of recurring revenue — if you are building any MRR, ARR, churn, or expansion metric, start here.
## 2. Ownership and Freshness
- **Owner:** Finance Data Team (Slack: #data-finance, primary: Priya Desai)
- **On-call rotation:** PagerDuty schedule "Finance Data"
- **Refresh cadence:** Every 2 hours, via the dbt Cloud job "finance_marts_hourly"
- **Freshness SLA:** < 3 hours from Stripe event time
- **Last run status:** See Monte Carlo monitor "fact_subscriptions_freshness"
If the SLA is breached, the owning team is paged. Do not DM individual engineers — use the on-call channel.
## 3. Grain and Primary Key
**Grain:** one row per (customer_id, subscription_id, period_start). A subscription that renews monthly produces 12 rows per year.
**Primary key:** subscription_period_id (surrogate, MD5 of the three grain columns).
This is not a snapshot table. Historical rows are immutable once the period closes — late-arriving corrections are written as adjustments to a dedicated fact_subscription_adjustments table.
## 4. Key Columns
- **customer_id** — FK to dim_customer. Stable across renewal events.
- **plan_id** — the plan in effect for this specific period, not the customer's current plan.
- **period_start, period_end** — timestamps in UTC, inclusive start, exclusive end.
- **mrr_amount** — monthly recurring revenue for this period in USD, converted from source currency using the period-start exchange rate. Annual plans are divided by 12.
- **is_trial** — true if the customer is still in a free trial during this period.
- **churn_flag** — set on the final period before cancellation. Not the same as cancellation_date, which can be in the future for scheduled cancellations.
## 5. Upstream and Downstream
**Upstream:** raw_stripe.subscriptions, raw_stripe.invoices, stg_fx_rates, dim_plan.
**Downstream:** marts_finance.mrr_daily, marts_product.feature_usage_by_plan, the executive revenue dashboard, the Finance close process.
Changes here ripple widely — coordinate breaking schema changes through #data-platform at least one sprint in advance.
## 6. Caveats and Deprecations
- **legacy_plan_name** is deprecated as of 2025-08-01. Use plan_id joined to dim_plan. The column will remain for backfill purposes until 2026-08-01.
- Subscriptions created before 2023-01-01 have an imputed period_start based on invoice dates. This is flagged via the is_backfilled column.
- Refunds are not subtracted here — see fact_subscription_adjustments.
## 7. Example Queries
- **Current MRR:** sum mrr_amount where CURRENT_DATE between period_start and period_end.
- **Trailing 12-month churn rate:** count of churn_flag over customers active 12 months ago.
- **Expansion MRR by month:** window delta of mrr_amount per customer.
Working examples live in the "finance/examples" folder of the dbt repo.