Sales Transactions

Ensure flawless consistency across all your data products

Ensure Sales Transactions data is fresh, accurate, and reliable before it's used for revenue reporting, margin analysis, and channel performance insights.

Data contract description

This data contract enforces schema stability, a 24-hour freshness SLA based on order dates, and required identifiers for orders, customers, and products. It prevents missing or invalid quantities and prices, restricts sales channels to approved values, blocks duplicate order line items, and enforces financial integrity rules to ensure net amounts correctly reflect quantity, unit price, and discounts. Together, these checks protect revenue accuracy, prevent inflated sales metrics, and ensure downstream reporting for margin, channel performance, and financial reconciliation remains trustworthy.

sales_transactions_data_contract.yaml

dataset: datasource/database/schema/sales_transactions 

variables: 
  FRESHNESS_HOURS: 
    default: 24
checks:
  - schema:    
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0  
  - freshness:
      column: order_date
      threshold:
        unit: hour
        must_be_less_than_or_equal: ${var.FRESHNESS_HOURS}
  - failed_rows:
      name: "order_date must not be in the future"
      qualifier: order_date_not_future
      expression: order_date > CURRENT_TIMESTAMP    
  - failed_rows:
      name: "No duplicate line items per order"
      qualifier: dup_order_product
      query: |
        SELECT order_id, product_id
        FROM sales_transactions
        GROUP BY order_id, product_id
        HAVING COUNT(*) > 1
      threshold:
        must_be: 0      
  - failed_rows:
      name: "net_amount must equal (quantity * unit_price) - discount_amount"
      qualifier: net_amount_formula
      expression: net_amount <> ((quantity * unit_price) - discount_amount)     
  - failed_rows:
      name: "discount cannot exceed gross amount"
      qualifier: discount_le_gross
      expression

columns:
  - name: order_id
    data_type: string
    checks:
      - missing:
      - invalid:
          valid_min_length: 1
          valid_max_length: 64         
  - name: customer_id
    data_type: string
    checks:
      - missing:
  - name: product_id
    data_type: string
    checks:
      - missing:     
  - name: quantity
    data_type: integer
    checks:
      - missing:
      - invalid:
          name: "Quantity must be positive"
          valid_min: 1        
  - name: unit_price
    data_type: float
    checks:
      - missing:
      - invalid:
          valid_min: 0          
  - name: channel
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Allowed sales channels"
          valid_values
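
To make the row-level rules in the contract concrete, here is a minimal Python sketch of what the four failed_rows checks look for. It does not use Soda and is not how Soda evaluates a contract; it only mirrors the expressions on a few hypothetical rows. The helper names (make_row, gross_amount, find_failed_rows) and the sample data are assumptions made for illustration, and the discount rule is an assumed reading of the check name, since its expression is cut off in the contract above.

```python
from collections import Counter
from datetime import datetime, timezone
from decimal import Decimal


def make_row(order_id, product_id, quantity, unit_price, discount, net, year=2024):
    """Build one hypothetical sales_transactions row (illustrative data only)."""
    return {
        "order_id": order_id,
        "product_id": product_id,
        "quantity": quantity,
        "unit_price": Decimal(unit_price),
        "discount_amount": Decimal(discount),
        "net_amount": Decimal(net),
        "order_date": datetime(year, 1, 1, tzinfo=timezone.utc),
    }


ROWS = [
    make_row("A1", "P1", 2, "10.00", "5.00", "15.00"),   # valid, but repeated below
    make_row("A1", "P1", 2, "10.00", "5.00", "15.00"),   # duplicate line item
    make_row("A2", "P2", 1, "20.00", "0.00", "25.00"),   # net_amount mismatch
    make_row("A3", "P3", 1, "10.00", "12.00", "-2.00"),  # discount above gross
    make_row("A4", "P4", 1, "5.00", "0.00", "5.00", year=2030),  # future order_date
]
NOW = datetime(2024, 6, 1, tzinfo=timezone.utc)  # fixed "current" time for the example


def gross_amount(row):
    """Gross amount before discount: quantity * unit_price."""
    return row["quantity"] * row["unit_price"]


def find_failed_rows(rows, now):
    """Return the offending rows (or keys) per check qualifier."""
    pair_counts = Counter((r["order_id"], r["product_id"]) for r in rows)
    return {
        # expression: order_date > CURRENT_TIMESTAMP
        "order_date_not_future": [r for r in rows if r["order_date"] > now],
        # query: GROUP BY order_id, product_id HAVING COUNT(*) > 1
        "dup_order_product": sorted(k for k, n in pair_counts.items() if n > 1),
        # expression: net_amount <> ((quantity * unit_price) - discount_amount)
        "net_amount_formula": [
            r for r in rows
            if r["net_amount"] != gross_amount(r) - r["discount_amount"]
        ],
        # ASSUMPTION: the contract's expression is truncated; this is one
        # plausible reading of "discount cannot exceed gross amount".
        "discount_le_gross": [
            r for r in rows if r["discount_amount"] > gross_amount(r)
        ],
    }
```

Calling find_failed_rows(ROWS, NOW) returns one entry per qualifier, each listing the rows that would count as failures. The sketch uses Decimal so that the equality test is exact; the contract declares unit_price as float, and with floating-point columns a strict `<>` comparison can flag rows that differ only by rounding, so a small tolerance may be worth considering.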

How to Enforce Data Contracts with Soda

Embed data quality through data contracts at any point in your pipeline.

# pip install soda-{data source} for other data sources

pip install soda-postgres

# verify the contract locally against a data source

soda contract verify -c contract.yml -ds ds_config.yml

# publish and schedule the contract with Soda Cloud

soda contract publish -c contract.yml -sc sc_config.yml

Check out the CLI documentation to learn more.
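
As one way to embed verification in a pipeline step, the following Python sketch wraps the `soda contract verify` command shown above in a small function. It uses only the flags that appear in this page (-c and -ds). The function names are hypothetical, and treating a nonzero exit code as "verification failed" is an assumption to confirm against the CLI documentation.

```python
import subprocess


def build_verify_command(contract_path, data_source_path):
    """Build the `soda contract verify` invocation as an argument list."""
    return [
        "soda", "contract", "verify",
        "-c", contract_path,
        "-ds", data_source_path,
    ]


def verify_contract(contract_path, data_source_path, runner=subprocess.run):
    """Run the verification and return True when the CLI exits with code 0.

    ASSUMPTION: a nonzero exit code means the contract failed or errored.
    The `runner` parameter exists so the call can be stubbed in tests.
    """
    completed = runner(build_verify_command(contract_path, data_source_path))
    return completed.returncode == 0
```

A pipeline task could call verify_contract("contract.yml", "ds_config.yml") and stop downstream steps when it returns False. Passing the command as a list rather than a shell string avoids quoting problems with file paths.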

How to Automatically Create Data Contracts.
In One Click.

Automatically write and publish data contracts using Soda's AI-powered data contract copilot.

Research-Driven AI Data Quality

Our research has been published in leading journals and conferences such as NeurIPS, JAIR, and ACML, the same venues that advanced the foundations of GPT and modern AI.

4.4 out of 5

Start trusting your data. Today.

Find, understand, and fix any data quality issue in seconds.
From the table level down to the record level.

Trusted by
