Streaming Event Contracts

Ensure flawless consistency across all your data products

Ensure data from streaming event contracts is complete, structured, and reliable before it is used for event-driven data governance, producer-consumer alignment, and integration validation at scale.

Data contract description

This data contract is designed for Teknasyon, where application event streams feed an event-driven data platform used for downstream reporting and machine learning datasets at very large scale. In this use case, streaming_events governs the metadata for a user_interaction stream. It defines which team owns the stream, which topic and environment it belongs to, which schema and version it should follow, and which platform and client versions it is expected to cover. Event definitions can then be validated consistently before they move through Teknasyon’s broader ingestion and processing pipeline.
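
To make the shape of that metadata concrete, here is a minimal Python sketch of a single row. The class name `StreamingEventMetadata` is hypothetical and not part of Soda or the contract; the field names mirror the column names in the contract, and the values are the `default:` entries from its variables block.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StreamingEventMetadata:
    """One metadata row; field names mirror the contract's column names."""

    contract_name: str
    event_name: str
    product_app: str
    owner_team: str
    owner_contact: str
    environment: str
    topic_name: str
    schema_ref: str
    schema_version: str
    platform: str
    client_version_min: str
    expected_volume_band: str


# Values copied from the `default:` entries of the contract's variables block.
example_row = StreamingEventMetadata(
    contract_name="teknasyon_streaming_event_contract",
    event_name="user_interaction",
    product_app="Getcontact",
    owner_team="Mobile App Team",
    owner_contact="#mobile-app-support",
    environment="prod",
    topic_name="user_interaction_topic",
    schema_ref="https://schemas.teknasyon.io/events/user_interaction.json",
    schema_version="1.0.0",
    platform="ios",
    client_version_min="1.5.0",
    expected_volume_band="high",
)

# Print the row as a plain dict, e.g. for logging or serialization.
print(asdict(example_row))
```

Every field is a string because the contract declares `data_type: string` for all twelve columns.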

streaming_events_data_contract.yaml

dataset
variables:
  CONTRACT_NAME:
    default: "teknasyon_streaming_event_contract"
  EVENT_NAME:
    default: "user_interaction"
  PRODUCT_APP:
    default: "Getcontact"
  OWNER_TEAM:
    default: "Mobile App Team"
  OWNER_CONTACT:
    default: "#mobile-app-support"
  ENVIRONMENT:
    default: "prod"
  TOPIC_NAME:
    default: "user_interaction_topic"
  SCHEMA_REF:
    default: "https://schemas.teknasyon.io/events/user_interaction.json"
  SCHEMA_VERSION:
    default: "1.0.0"
  PLATFORM:
    default: "ios"
  CLIENT_VERSION_MIN:
    default: "1.5.0"
  EXPECTED_VOLUME_BAND:
    default: "high"

checks:
  - schema:
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0
columns:
  - name: contract_name
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Contract Name length"
          valid_min_length: 1
          valid_max_length: 128
  - name: event_name
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Event Name length"
          valid_min_length: 1
          valid_max_length: 128
  - name: product_app
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Product App name length"
          valid_min_length: 1
          valid_max_length: 128
  - name: owner_team
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Owner Team length"
          valid_min_length: 1
          valid_max_length: 64
  - name: owner_contact
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Owner Contact format"
          valid_min_length: 3
          valid_max_length: 128
  - name: environment
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid Environments"
          valid_values:
            - prod
            - rc
  - name: topic_name
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Topic Name length"
          valid_min_length: 1
          valid_max_length: 128
  - name: schema_ref
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Schema Location URL"
          valid_min_length: 5
          valid_max_length: 255
  - name: schema_version
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Schema Version format"
          valid_format:
            name: Version pattern
            regex: '^[0-9]+\.[0-9]+\.[0-9]+$'
  - name: platform
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid Platforms"
          valid_values:
            - ios
            - android
            - web
            - backend
            - multi
  - name: client_version_min
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Client Version format"
          valid_format:
            name: Version pattern
            regex: '^[0-9]+\.[0-9]+\.[0-9]+$'
  - name: expected_volume_band
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Expected Volume Band"
          valid_values
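
The column checks above can be approximated in plain Python to see what they accept and reject. This is an illustrative sketch, not how Soda evaluates a contract: the function name `validate_row` and the rule tables are hypothetical, `None` is assumed to be what the `missing` check flags, and column order is not modeled. Because the `valid_values` list for `expected_volume_band` is cut off in the contract as shown, that column is only checked for presence here.

```python
import re

# Same pattern as the contract's "Version pattern" regex.
VERSION_PATTERN = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+$")

# (valid_min_length, valid_max_length) per column, copied from the contract.
LENGTH_RULES = {
    "contract_name": (1, 128),
    "event_name": (1, 128),
    "product_app": (1, 128),
    "owner_team": (1, 64),
    "owner_contact": (3, 128),
    "topic_name": (1, 128),
    "schema_ref": (5, 255),
}
# valid_values per column, copied from the contract.
VALID_VALUES = {
    "environment": {"prod", "rc"},
    "platform": {"ios", "android", "web", "backend", "multi"},
}
VERSION_COLUMNS = ("schema_version", "client_version_min")
ALL_COLUMNS = (*LENGTH_RULES, *VALID_VALUES, *VERSION_COLUMNS, "expected_volume_band")


def validate_row(row: dict) -> list[str]:
    """Return human-readable problems for one metadata row (empty list = OK)."""
    problems = []
    # Mirrors `allow_extra_columns: false` at the row level.
    for extra in sorted(set(row) - set(ALL_COLUMNS)):
        problems.append(f"{extra}: unexpected column")
    for column in ALL_COLUMNS:
        value = row.get(column)
        if value is None:  # assumption: None stands in for a missing value
            problems.append(f"{column}: missing")
        elif column in LENGTH_RULES:
            low, high = LENGTH_RULES[column]
            if not low <= len(value) <= high:
                problems.append(f"{column}: length {len(value)} outside {low}..{high}")
        elif column in VALID_VALUES:
            if value not in VALID_VALUES[column]:
                problems.append(f"{column}: {value!r} is not an allowed value")
        elif column in VERSION_COLUMNS:
            if not VERSION_PATTERN.fullmatch(value):
                problems.append(f"{column}: {value!r} is not MAJOR.MINOR.PATCH")
    return problems


# A row built from the contract's variable defaults passes every check.
defaults_row = {
    "contract_name": "teknasyon_streaming_event_contract",
    "event_name": "user_interaction",
    "product_app": "Getcontact",
    "owner_team": "Mobile App Team",
    "owner_contact": "#mobile-app-support",
    "environment": "prod",
    "topic_name": "user_interaction_topic",
    "schema_ref": "https://schemas.teknasyon.io/events/user_interaction.json",
    "schema_version": "1.0.0",
    "platform": "ios",
    "client_version_min": "1.5.0",
    "expected_volume_band": "high",
}
print(validate_row(defaults_row))
print(validate_row({**defaults_row, "environment": "dev", "schema_version": "1.0"}))
```

The second call shows the two kinds of failure the contract is most likely to catch in practice: an environment outside `prod` and `rc`, and a version string that does not have three numeric parts.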

How to Enforce Data Contracts with Soda

Embed data quality through data contracts at any point in your pipeline.

# pip install soda-{data source} for other data sources

pip install soda-postgres

# verify the contract locally against a data source

soda contract verify -c contract.yml -ds ds_config.yml

# publish and schedule the contract with Soda Cloud

soda contract publish -c contract.yml -sc sc_config.yml

Check out the CLI documentation to learn more.
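
If verification needs to run from a pipeline step rather than a terminal, the `soda contract verify` command shown above can be wrapped in a small Python helper. This is a sketch under assumptions: the function names `verify_command` and `run_verify` are hypothetical, only the `-c` and `-ds` flags shown above are used, and treating a nonzero exit code as a failed verification is an assumption to confirm against the CLI documentation.

```python
import shutil
import subprocess


def verify_command(contract_path: str, data_source_path: str) -> list[str]:
    """Build the argv for `soda contract verify` using the flags shown above."""
    return ["soda", "contract", "verify", "-c", contract_path, "-ds", data_source_path]


def run_verify(contract_path: str, data_source_path: str) -> int:
    """Run the verification and return the CLI's exit code."""
    if shutil.which("soda") is None:
        raise RuntimeError("soda CLI not found on PATH; install it first (e.g. soda-postgres)")
    completed = subprocess.run(verify_command(contract_path, data_source_path), check=False)
    # Assumption: a nonzero exit code means the run or the contract failed.
    return completed.returncode


if __name__ == "__main__":
    # Show the command that would be executed, without running it.
    print(" ".join(verify_command("contract.yml", "ds_config.yml")))
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.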

How to Automatically Create Data Contracts. In One Click.

Automatically write and publish data contracts using Soda's AI-powered data contract copilot.

Research-Driven AI Data Quality

Our research has been published in leading journals and conferences such as NeurIPS, JAIR, and ACML, the same venues that advanced the foundations of GPT and modern AI.

4.4 out of 5

Start trusting your data. Today.

Find, understand, and fix any data quality issue in seconds.
From table level to record level.
