BCBS 239

BCBS 239

Assurez une cohérence irréprochable entre tous vos produits de données

Assurez une cohérence irréprochable entre tous vos produits de données

Santiago Viquez

DevRel chez Soda

Ensure accounting data is complete, consistent, and reconcilable to support BCBS 239-compliant risk reporting.

Data contract description

This data contract enforces BCBS 239–aligned controls on the exposures dataset. It validates schema stability, detects orphan counterparties, duplicate records, and invalid exposure amounts via failed_rows checks, enforces LEI format and referential integrity at the column level, and reconciles total exposures against accounting balances. Together, these controls help ensure exposure reporting is complete, accurate, and consistent across risk and finance systems.

bcbs239_data_contract.yml

dataset
checks:
  - schema:
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0
  - failed_rows:
      name: exposures_without_valid_counterparty
      qualifier: counterparty_integrity 
      query: |
        SELECT e.counterparty_id
        FROM datasource.db.schema.exposures e
        LEFT JOIN datasource.db.schema.counterparty_master m
          ON e.counterparty_id = m.counterparty_id
        WHERE m.counterparty_id IS NULL
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
      description: "Integrity: every exposure must reference an existing counterparty in the master data."
  - failed_rows:
      name: duplicate_exposure_records
      qualifier: duplicate_records 
      query: |
        SELECT counterparty_id, facility_id, as_of_date
        FROM datasource.db.schema.exposures
        GROUP BY counterparty_id, facility_id, as_of_date
        HAVING COUNT(*) > 1
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
          - P4
      description: "Accuracy and completeness: no duplicate exposure records per counterparty, facility, and reporting date."
  - failed_rows:
      name: invalid_exposure_amount
      qualifier: exposure_amount_validity 
      query: |
        SELECT counterparty_id, exposure_amount
        FROM datasource.db.schema.exposures
        WHERE exposure_amount IS NULL
           OR exposure_amount < 0
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
      description: "Accuracy and integrity: exposure amounts must be populated and non-negative."
columns:
  - name: lei_code
    data_type: string
    checks:
      - invalid:
          name: lei_code_format
          valid_format:
            name: LEI must be 20 alphanumeric
            regex: '^[A-Z0-9]{20}$'
          attributes:
            bcbs239:
              - P3
            description: LEI format is 20 uppercase alphanumerics
  - name: counterparty_id
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: every exposure must include a counterparty identifier."
      - invalid:
          name: counterparty_id_in_master
          valid_reference_data:
            dataset: datasource/db/schema/counterparty_master
            column: counterparty_id
          attributes:
            bcbs239:
              - P3
            description: "Integrity: all counterparties in exposures must exist in the master data."
  - name: exposure_amount
    data_type: decimal
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: exposure amounts must be populated."
      - invalid:
          name: exposure_amount_non_negative
          valid_min: 0
          attributes:
            bcbs239:
              - P3
          description: "Accuracy: exposure amounts must be zero or positive."
  - name: as_of_date
    data_type: date
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: every exposure record must include a reporting date."
  - name: facility_id
    data_type: string
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: each exposure must reference a facility."
      - invalid:
          name: "facility_id length guardrail"
          valid_min_length: 1
          valid_max_length: 64

reconciliation:
  source:
    dataset: datasource/db/schema/accounting_balances
  checks:
    - metric_diff:
        name: total_exposures_vs_total_balance
        source_expression: SUM(balance_amount)
        target_expression: SUM(exposure_amount)
        threshold:
          must_be_less_than: 1000
        attributes:
          bcbs239:
            - P3
            - P7
          description: "Reconciliation vs accounting: overall tolerance across the books (sum vs sum)"

Data contract description

This data contract enforces BCBS 239–aligned controls on the exposures dataset. It validates schema stability, detects orphan counterparties, duplicate records, and invalid exposure amounts via failed_rows checks, enforces LEI format and referential integrity at the column level, and reconciles total exposures against accounting balances. Together, these controls help ensure exposure reporting is complete, accurate, and consistent across risk and finance systems.

bcbs239_data_contract.yml

dataset
checks:
  - schema:
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0
  - failed_rows:
      name: exposures_without_valid_counterparty
      qualifier: counterparty_integrity 
      query: |
        SELECT e.counterparty_id
        FROM datasource.db.schema.exposures e
        LEFT JOIN datasource.db.schema.counterparty_master m
          ON e.counterparty_id = m.counterparty_id
        WHERE m.counterparty_id IS NULL
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
      description: "Integrity: every exposure must reference an existing counterparty in the master data."
  - failed_rows:
      name: duplicate_exposure_records
      qualifier: duplicate_records 
      query: |
        SELECT counterparty_id, facility_id, as_of_date
        FROM datasource.db.schema.exposures
        GROUP BY counterparty_id, facility_id, as_of_date
        HAVING COUNT(*) > 1
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
          - P4
      description: "Accuracy and completeness: no duplicate exposure records per counterparty, facility, and reporting date."
  - failed_rows:
      name: invalid_exposure_amount
      qualifier: exposure_amount_validity 
      query: |
        SELECT counterparty_id, exposure_amount
        FROM datasource.db.schema.exposures
        WHERE exposure_amount IS NULL
           OR exposure_amount < 0
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
      description: "Accuracy and integrity: exposure amounts must be populated and non-negative."
columns:
  - name: lei_code
    data_type: string
    checks:
      - invalid:
          name: lei_code_format
          valid_format:
            name: LEI must be 20 alphanumeric
            regex: '^[A-Z0-9]{20}$'
          attributes:
            bcbs239:
              - P3
            description: LEI format is 20 uppercase alphanumerics
  - name: counterparty_id
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: every exposure must include a counterparty identifier."
      - invalid:
          name: counterparty_id_in_master
          valid_reference_data:
            dataset: datasource/db/schema/counterparty_master
            column: counterparty_id
          attributes:
            bcbs239:
              - P3
            description: "Integrity: all counterparties in exposures must exist in the master data."
  - name: exposure_amount
    data_type: decimal
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: exposure amounts must be populated."
      - invalid:
          name: exposure_amount_non_negative
          valid_min: 0
          attributes:
            bcbs239:
              - P3
          description: "Accuracy: exposure amounts must be zero or positive."
  - name: as_of_date
    data_type: date
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: every exposure record must include a reporting date."
  - name: facility_id
    data_type: string
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: each exposure must reference a facility."
      - invalid:
          name: "facility_id length guardrail"
          valid_min_length: 1
          valid_max_length: 64

reconciliation:
  source:
    dataset: datasource/db/schema/accounting_balances
  checks:
    - metric_diff:
        name: total_exposures_vs_total_balance
        source_expression: SUM(balance_amount)
        target_expression: SUM(exposure_amount)
        threshold:
          must_be_less_than: 1000
        attributes:
          bcbs239:
            - P3
            - P7
          description: "Reconciliation vs accounting: overall tolerance across the books (sum vs sum)"

Data contract description

This data contract enforces BCBS 239–aligned controls on the exposures dataset. It validates schema stability, detects orphan counterparties, duplicate records, and invalid exposure amounts via failed_rows checks, enforces LEI format and referential integrity at the column level, and reconciles total exposures against accounting balances. Together, these controls help ensure exposure reporting is complete, accurate, and consistent across risk and finance systems.

bcbs239_data_contract.yml

dataset
checks:
  - schema:
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0
  - failed_rows:
      name: exposures_without_valid_counterparty
      qualifier: counterparty_integrity 
      query: |
        SELECT e.counterparty_id
        FROM datasource.db.schema.exposures e
        LEFT JOIN datasource.db.schema.counterparty_master m
          ON e.counterparty_id = m.counterparty_id
        WHERE m.counterparty_id IS NULL
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
      description: "Integrity: every exposure must reference an existing counterparty in the master data."
  - failed_rows:
      name: duplicate_exposure_records
      qualifier: duplicate_records 
      query: |
        SELECT counterparty_id, facility_id, as_of_date
        FROM datasource.db.schema.exposures
        GROUP BY counterparty_id, facility_id, as_of_date
        HAVING COUNT(*) > 1
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
          - P4
      description: "Accuracy and completeness: no duplicate exposure records per counterparty, facility, and reporting date."
  - failed_rows:
      name: invalid_exposure_amount
      qualifier: exposure_amount_validity 
      query: |
        SELECT counterparty_id, exposure_amount
        FROM datasource.db.schema.exposures
        WHERE exposure_amount IS NULL
           OR exposure_amount < 0
      threshold:
        must_be: 0
      attributes:
        bcbs239:
          - P3
      description: "Accuracy and integrity: exposure amounts must be populated and non-negative."
columns:
  - name: lei_code
    data_type: string
    checks:
      - invalid:
          name: lei_code_format
          valid_format:
            name: LEI must be 20 alphanumeric
            regex: '^[A-Z0-9]{20}$'
          attributes:
            bcbs239:
              - P3
            description: LEI format is 20 uppercase alphanumerics
  - name: counterparty_id
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: every exposure must include a counterparty identifier."
      - invalid:
          name: counterparty_id_in_master
          valid_reference_data:
            dataset: datasource/db/schema/counterparty_master
            column: counterparty_id
          attributes:
            bcbs239:
              - P3
            description: "Integrity: all counterparties in exposures must exist in the master data."
  - name: exposure_amount
    data_type: decimal
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: exposure amounts must be populated."
      - invalid:
          name: exposure_amount_non_negative
          valid_min: 0
          attributes:
            bcbs239:
              - P3
          description: "Accuracy: exposure amounts must be zero or positive."
  - name: as_of_date
    data_type: date
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: every exposure record must include a reporting date."
  - name: facility_id
    data_type: string
    checks:
      - missing:
          attributes:
            bcbs239:
              - P4
          description: "Completeness: each exposure must reference a facility."
      - invalid:
          name: "facility_id length guardrail"
          valid_min_length: 1
          valid_max_length: 64

reconciliation:
  source:
    dataset: datasource/db/schema/accounting_balances
  checks:
    - metric_diff:
        name: total_exposures_vs_total_balance
        source_expression: SUM(balance_amount)
        target_expression: SUM(exposure_amount)
        threshold:
          must_be_less_than: 1000
        attributes:
          bcbs239:
            - P3
            - P7
          description: "Reconciliation vs accounting: overall tolerance across the books (sum vs sum)"

How to Enforce Data Contracts with Soda

Embed data quality through data contracts at any point in your pipeline.

Embed data quality through data contracts at any point in your pipeline.

# pip install soda-{data source} for other data sources

# pip install soda-{data source} for other data sources

pip install soda-postgres

pip install soda-postgres

# verify the contract locally against a data source

# verify the contract locally against a data source

soda contract verify -c contract.yml -ds ds_config.yml

soda contract verify -c contract.yml -ds ds_config.yml

# publish and schedule the contract with Soda Cloud

# publish and schedule the contract with Soda Cloud

soda contract publish -c contract.yml -sc sc_config.yml

soda contract publish -c contract.yml -sc sc_config.yml

Check out the CLI documentation to learn more.

Check out the CLI documentation to learn more.

How to Automatically Create Data Contracts.
In one Click.

Automatically write and publish data contracts using Soda's AI-powered data contract copilot.

Qualité des données IA basée sur la recherche

Nos recherches ont été publiées dans des revues et conférences de renom, telles que NeurIPs, JAIR et ACML. Les mêmes lieux qui ont fait progresser les fondations de GPT et de l'IA moderne.

4,4 sur 5

Commencez à faire confiance à vos données. Aujourd'hui.

Trouvez, comprenez et corrigez tout problème de qualité des données en quelques secondes.
Du niveau de la table au niveau des enregistrements.

Adopté par

4,4 sur 5

Commencez à faire confiance à vos données. Aujourd'hui.

Trouvez, comprenez et corrigez tout problème de qualité des données en quelques secondes.
Du niveau de la table au niveau des enregistrements.

Adopté par

4,4 sur 5

Commencez à faire confiance à vos données. Aujourd'hui.

Trouvez, comprenez et corrigez tout problème de qualité des données en quelques secondes.
Du niveau de la table au niveau des enregistrements.

Adopté par