Offer Snapshots

Data Contract Template

Ensure data from merchant offer snapshots is fresh, complete, and reliable before it is used for pricing operations, market availability analysis, and competitive benchmarking.

Data contract description

This data contract is designed for Idealo, where daily offer snapshots capture merchant pricing, availability, and market data across supported countries. By governing the offer_snapshots dataset with a composite primary-key uniqueness check on offer_id and date_utc, a 24-hour freshness window, and market- and enum-level constraints, the contract helps Idealo keep offer state data dependable for downstream pricing operations, availability monitoring, and competitive benchmarking.
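To make the two dataset-level rules concrete, here is a minimal Python sketch of what a composite-key uniqueness check and a 24-hour freshness check compute. It is not how Soda evaluates them. The sample rows, the function names, and the assumption that date_utc holds an ISO date string (YYYY-MM-DD) interpreted as midnight UTC are all illustrative.

```python
from collections import Counter
from datetime import datetime, timezone


def duplicate_keys(rows):
    """Return the (offer_id, date_utc) pairs that occur more than once."""
    counts = Counter((r["offer_id"], r["date_utc"]) for r in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def hours_since_latest(rows, now):
    """Hours between `now` and the most recent date_utc.

    Assumption: date_utc is a YYYY-MM-DD string, read as midnight UTC.
    """
    latest = max(
        datetime.strptime(r["date_utc"], "%Y-%m-%d").replace(tzinfo=timezone.utc)
        for r in rows
    )
    return (now - latest).total_seconds() / 3600


# Hypothetical sample data, not taken from any real feed.
rows = [
    {"offer_id": "o-1", "date_utc": "2024-05-01"},
    {"offer_id": "o-1", "date_utc": "2024-05-02"},  # same offer, new day: allowed
    {"offer_id": "o-2", "date_utc": "2024-05-02"},
    {"offer_id": "o-2", "date_utc": "2024-05-02"},  # repeated composite key
]
now = datetime(2024, 5, 2, 0, 15, tzinfo=timezone.utc)  # hypothetical run time

print(duplicate_keys(rows))                 # composite keys that violate uniqueness
print(hours_since_latest(rows, now) < 24)   # True when the data is fresh enough
```

Note that offer o-1 appears on two different days without being flagged: only a repeated (offer_id, date_utc) pair counts as a violation.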

offer_snapshots_data_contract.yaml

dataset
variables:
  ASSET_NAME:
    default: "offer_snapshots"
  ASSET_TYPE:
    default: "merchant_offer_feed"
  BUSINESS_DOMAIN:
    default: "Offers"
  OWNER_TEAM:
    default: "Pricing Team"
  ONCALL_ROUTE:
    default: "#pricing-support"
  DATA_DICTIONARY_REFS:
    default: |
      offer_id -> dict.offer_id.v3
      price -> dict.price.v2
  STORAGE_LOCATION_BASE_PATH:
    default: "s3://idealo-offers/state/"
  PARTITIONING_SPEC:
    default: "date_utc, country"
  SLA_DUE_TIMES:
    default: "00:15 UTC daily"
  PRIMARY_KEYS_AND_UNIQUENESS:
    default: "offer_id, date_utc"
  SCHEMA_SPEC:
    default: "offer_id: string, date_utc: string, price: float, currency: string, availability_status: string, merchant_id: string, country: string"
  CONSTRAINTS_AND_ENUMS:
    default: "price >= 0, currency IN (USD, EUR, GBP), availability_status IN (in_stock, out_of_stock, preorder, unknown)"
  FRESHNESS_AND_COMPLETENESS_EXPECTATIONS:
    default: "≥90% of active merchants present by SLA due"
    
checks:
  - schema:
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0
  - failed_rows:
      name: "Ensure primary keys are unique"
      qualifier: pk_uniqueness
      query: |
        SELECT offer_id, date_utc
        FROM datasource.database.schema.offer_snapshots
        GROUP BY offer_id, date_utc
        HAVING COUNT(*) > 1
      threshold:
        must_be: 0
  - freshness:
      column: date_utc
      threshold:
        unit: hour
        must_be_less_than: 24
columns:
  - name: offer_id
    data_type: string
    checks:
      - missing:
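      # No column-level duplicate check here: offer_id repeats across daily snapshots.
      # Uniqueness is enforced on (offer_id, date_utc) by the pk_uniqueness check above.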
  - name: date_utc
    data_type: string
    checks:
      - missing:
  - name: price
    data_type: float
    checks:
      - missing:
      - invalid:
          name: "Price must be non-negative"
          valid_min: 0
  - name: currency
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid Currencies"
          valid_values:
            - USD
            - EUR
            - GBP
  - name: availability_status
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid Availability Status"
          valid_values:
            - in_stock
            - out_of_stock
            - preorder
            - unknown
  - name: merchant_id
    data_type: string
    checks:
      - missing:
  - name: country
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Supported Markets"
          valid_values

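The FRESHNESS_AND_COMPLETENESS_EXPECTATIONS variable states that at least 90% of active merchants should be present by the SLA due time, but the checks shown above do not include one for it. As a rough illustration of what that expectation means, the following Python sketch computes merchant coverage for a snapshot. The function names and the idea of passing in a list of active merchant IDs are assumptions; the contract does not say where that list comes from.

```python
def merchant_coverage(snapshot_rows, active_merchants):
    """Fraction of active merchants with at least one row in the snapshot."""
    active = set(active_merchants)
    if not active:
        raise ValueError("active_merchants must not be empty")
    present = {row["merchant_id"] for row in snapshot_rows} & active
    return len(present) / len(active)


def meets_completeness(snapshot_rows, active_merchants, minimum=0.90):
    """True when coverage reaches the threshold (0.90 mirrors the contract's 90%)."""
    return merchant_coverage(snapshot_rows, active_merchants) >= minimum


# Hypothetical data: ten active merchants, nine of them present in the snapshot.
active = [f"m{i}" for i in range(10)]
snapshot = [{"merchant_id": f"m{i}"} for i in range(9)]

print(merchant_coverage(snapshot, active))
print(meets_completeness(snapshot, active))
```

Merchants that appear in the snapshot but are not in the active list are ignored, so stale or unknown merchant IDs cannot inflate the coverage figure.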
How to Enforce Data Contracts with Soda

Embed data quality through data contracts at any point in your pipeline.

# pip install soda-{data source} for other data sources

pip install soda-postgres

# verify the contract locally against a data source

soda contract verify -c contract.yml -ds ds_config.yml

# publish and schedule the contract with Soda Cloud

soda contract publish -c contract.yml -sc sc_config.yml

Check out the CLI documentation to learn more.
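If you want to gate publishing on a successful local verification, the two commands above can be chained from a script. The sketch below is one way to do that in Python. It uses only the two CLI invocations shown above and assumes that `soda contract verify` exits with a non-zero status when verification fails; confirm that behavior in the CLI documentation before relying on it. The function names are hypothetical.

```python
import subprocess


def soda_commands(contract, ds_config, sc_config):
    """Build the verify and publish command lines shown above."""
    verify = ["soda", "contract", "verify", "-c", contract, "-ds", ds_config]
    publish = ["soda", "contract", "publish", "-c", contract, "-sc", sc_config]
    return verify, publish


def verify_then_publish(contract, ds_config, sc_config, run=subprocess.run):
    """Publish only if verification succeeds.

    Assumption: the CLI signals failure through a non-zero exit code.
    `run` is injectable so the flow can be exercised without the CLI installed.
    """
    verify, publish = soda_commands(contract, ds_config, sc_config)
    if run(verify).returncode != 0:
        return False  # verification failed, so skip publishing
    return run(publish).returncode == 0


# Example call (requires the soda CLI and real config files):
# verify_then_publish("contract.yml", "ds_config.yml", "sc_config.yml")
```

Passing the arguments as a list rather than a shell string avoids quoting problems if a file path contains spaces.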

How to Automatically Create Data Contracts in One Click

Automatically write and publish data contracts using Soda's AI-powered data contract copilot.

Make data contracts work in production

Business knows what good data looks like. Engineering knows how to deliver it at scale. Soda unites both, turning governance expectations into executable contracts.

Start trusting your data. Today.

Find, understand, and fix any data quality issue in seconds.
From table to record-level.
