stg_
to indicate that they are staging tables.
models/staging/product
directory
of the example-dbt-project.users
table need the following changes:
user_id
name
- Should be renamed to user_name
email_address
- Should be renamed to user_email
lat
- Should be renamed to latitude
long
- Should be renamed to longitude
createdAt
- Should be renamed to created_at
updatedAt
- Should be renamed to updated_at
users
table.
merchants
table need the following changes:
merchant_id
merchant_name
geo_loc
- This is a custom datatype that stores “latitude/longitude” as a single value. We should split this into two columns latitude
and longitude
createdAt
- Should be renamed to created_at
updatedAt
- Should be renamed to updated_at
merchants
table.
items
table need the following changes:
item_id
merchant_id
name
- Should be renamed to item_name
because explicit is better than implicitcategory
- Should be renamed to item_category
because explicit is better than implicitprice
- Should be renamed to base_price_usd
so that it’s clear that the price is in USDcreated_at
updated_at
items
table.
promotions
table need the following changes:
promotion_id
product_id
- The use of the word ‘product’ is because of legacy reasons. We should rename this to item_id
to match the items
tablediscount
- By default we store the discount percentage as an integer e.g. If we offer 20% discount, this value is 20.
We will convert this to a float by dividing by 100 to make it easier to work with. We will also rename it to discount_percentage_as_float
created_at
updated_at
promotions
table.
inventory
table need the following changes:
inventory_id
merchant_id
item_id
qty
- Should be renamed to quantity
for claritydate_acquired
- Should be renamed to acquisition_date
because we keep the units as suffixescost
- Should be renamed to unit_cost_usd
so that it’s clear that the price is in USD and we’re talking about the cost of a single unitcreated_at
updated_at
total_cost_usd
which is the product of quantity
and unit_cost_usd
since this is a common calculation that we’ll need to do.
The following dbt model creates a staging table for the inventory
table.
orders
table need the following changes:
order_id
customer_id
- Should be renamed to user_id
to match the users
tablestore_id
- Should be renamed to merchant_id
to match the merchants
tabledate
- Should be renamed to order_placed_at
because it’s a timestamp and not a date, and we want to clarify that it’s the time when the order was placedcreated_at
updated_at
orders
table.
order_items
table need the following changes:
order_item_id
order_id
product_id
- Should be renamed to item_id
to match the items
tableqty
- Should be renamed to quantity
for claritycreated_at
updated_at
order_items
table.
snapshot
functionality in dbt to create snapshots of our staging tables.
These snapshots will allow us to track changes to our data over time and create a history of our data.
Such a table that contains historical records of your data is called a slowly changing dimension (SCD) table.
Once we have our SCD tables, we will create our fact tables.