Account Balances

Ensure flawless consistency across all your data products

Ensure Account Balances data is fresh, complete, and reliable before it is used for financial reporting, reconciliation, and regulatory purposes.

Data contract description

This data contract enforces schema stability, a 24-hour freshness SLA, and required identifiers and balance timestamps to ensure reliable account balance reporting. It prevents duplicate account snapshots, blocks future-dated balances, enforces valid currency formatting, and ensures opening and closing balances are always present. Together, these checks protect financial reporting accuracy, prevent reconciliation discrepancies, and ensure downstream risk, accounting, and regulatory processes rely on consistent point-in-time balance data.

account_balances_data_contract.yaml

dataset: datasource/database/schema/account_balances

variables:
  FRESHNESS_HOURS:
    default: 24
checks:
  - schema:
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0
  - freshness:
      column: balance_date
      threshold:
        unit: hour
        must_be_less_than_or_equal: ${var.FRESHNESS_HOURS}
  # Dataset integrity rules
  - failed_rows:
      name: "balance_date must not be in the future"
      qualifier: balance_date_not_future
      expression: balance_date > CURRENT_TIMESTAMP
  - failed_rows:
      name: "No duplicate account balance snapshots (account_id + balance_date)"
      qualifier: duplicate_account_snapshot
      query: |
        SELECT account_id, balance_date
        FROM account_balances
        GROUP BY account_id, balance_date
        HAVING COUNT(*) > 1
      threshold:
        must_be: 0
  - failed_rows:
      name: "Opening and closing balances must not be negative beyond allowed overdraft tolerance"
      qualifier: excessive_negative_balance
      expression

columns:
  - name: account_id
    data_type: string
    checks:
      - missing:
          name: No missing values
      - invalid:
          name: "account_id length guardrail"
          valid_min_length: 1
          valid_max_length: 64
  - name: balance_date
    data_type: date
    checks:
      - missing:
          name: No missing values
  - name: opening_balance
    data_type: decimal
    checks:
      - missing:
          name: No missing values
  - name: closing_balance
    data_type: decimal
    checks:
      - missing:
          name: No missing values
  - name: currency
    data_type: string
    checks:
      - missing:
          name: No missing values
      - invalid:
          name: "Currency must be ISO-4217 (3 uppercase letters)"
          valid_format:
            name: ISO-4217 code
            regex: "^[A-Z]{3}$"
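
To make the row-level rules above concrete, here is a minimal Python sketch of three of them (future-dated balances, duplicate snapshots, and currency format) applied to in-memory rows. It is only an illustration of the logic, not how Soda evaluates a contract; the function names and the sample rows are hypothetical.

```python
import re
from collections import Counter
from datetime import date

# Same pattern as the currency check in the contract.
CURRENCY_RE = re.compile(r"^[A-Z]{3}$")


def future_dated(rows, today):
    """Rows whose balance_date is later than `today`."""
    return [r for r in rows if r["balance_date"] > today]


def duplicate_snapshots(rows):
    """(account_id, balance_date) pairs that occur more than once."""
    counts = Counter((r["account_id"], r["balance_date"]) for r in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def invalid_currency(rows):
    """Rows whose currency is not exactly three uppercase letters."""
    return [r for r in rows if not CURRENCY_RE.fullmatch(r["currency"] or "")]


# Hypothetical sample data, for illustration only.
rows = [
    {"account_id": "A1", "balance_date": date(2024, 1, 1), "currency": "USD"},
    {"account_id": "A1", "balance_date": date(2024, 1, 1), "currency": "USD"},
    {"account_id": "A2", "balance_date": date(2024, 1, 3), "currency": "usd"},
]
today = date(2024, 1, 2)

print(future_dated(rows, today))   # the A2 row
print(duplicate_snapshots(rows))   # the (A1, 2024-01-01) pair
print(invalid_currency(rows))      # the A2 row
```

Each function returns the offending rows or keys, mirroring the idea that a failed_rows check passes only when nothing is returned.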

How to Enforce Data Contracts with Soda

Embed data quality through data contracts at any point in your pipeline.

# pip install soda-{data source} for other data sources

pip install soda-postgres

# verify the contract locally against a data source

soda contract verify -c contract.yml -ds ds_config.yml

# publish and schedule the contract with Soda Cloud

soda contract publish -c contract.yml -sc sc_config.yml

Check out the CLI documentation to learn more.
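
If you want to run the verify step from a pipeline task instead of a terminal, one option is a thin Python wrapper around the same command. The sketch below uses only the CLI invocation shown above. The function names are hypothetical, and it assumes the CLI returns a non-zero exit code when verification fails, which you should confirm in the CLI documentation.

```python
import subprocess


def build_verify_command(contract_path, data_source_path):
    """Build the argument list for the `soda contract verify` command shown above."""
    return ["soda", "contract", "verify", "-c", contract_path, "-ds", data_source_path]


def verify_contract(contract_path, data_source_path):
    """Run the CLI and report success.

    Assumption: a zero exit code means the contract passed.
    """
    result = subprocess.run(
        build_verify_command(contract_path, data_source_path),
        check=False,  # inspect the return code ourselves instead of raising
    )
    return result.returncode == 0
```

A pipeline task could then call verify_contract("contract.yml", "ds_config.yml") and stop downstream steps when it returns False.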

How to Automatically Create Data Contracts.
In one Click.

Automatically write and publish data contracts using Soda's AI-powered data contract copilot.

Research-driven AI data quality

Our research has been published in leading journals and conferences such as NeurIPS, JAIR, and ACML, the same venues that advanced the foundations of GPT and modern AI.

4.4 out of 5

Start trusting your data. Today.

Find, understand, and fix any data quality issue in seconds.
From the table level down to the record level.

Trusted by
