
Source-Aligned Events
Ensure flawless consistency across all your data products
Ensure data from source-aligned event streams is fresh, complete, and reliable before it is used for product analytics, user engagement tracking, and marketplace performance measurement.
Data contract description
This data contract is designed for Adevinta Spain, where source-aligned data products form the bronze layer of a medallion architecture, governing raw event streams from the moment they are ingested. The event_data dataset captures clickstream interactions — such as job_viewed, apply_clicked, and search_performed — across Adevinta Spain's marketplace platforms, with a one-hour freshness expectation on ingest_time and enum-level constraints on event names, platforms, entity types, and action outcomes. The contract helps keep event telemetry trustworthy and reusable for downstream analytics across its marketplace ecosystem.
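The one-hour freshness expectation on ingest_time can be illustrated with a minimal Python sketch. It assumes the check compares the newest ingest_time against the current time; the function name is_fresh and its parameters are hypothetical and are not part of Soda.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(ingest_times, now=None, max_age=timedelta(hours=1)):
    """Return True if the newest ingest_time is less than max_age old.

    Assumption: freshness is measured as (now - max(ingest_time)).
    """
    now = now or datetime.now(timezone.utc)
    if not ingest_times:
        # No rows at all: there is nothing to measure freshness against.
        return False
    return now - max(ingest_times) < max_age


if __name__ == "__main__":
    reference = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    recent = [reference - timedelta(minutes=30), reference - timedelta(hours=5)]
    stale = [reference - timedelta(hours=2)]
    print(is_fresh(recent, now=reference))
    print(is_fresh(stale, now=reference))
```

Only the newest timestamp matters in this reading, so a single recently ingested row keeps the dataset fresh even if older rows are present.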
source_aligned_event_data_contract.yaml
dataset:
variables:
  CONTRACT_NAME:
    default: "event_data"
  CONTRACT_VERSION:
    default: "1"
  DESCRIPTION:
    default: "Product and event analytics data for consumer internet / marketplace platforms."
  START_DATE:
    default: "2024-01-01T00:00:00+00:00"
  SCHEMA_SOURCE:
    default: "url"
  SCHEMA_LOCATION_URL:
    default: "https://github.com/adevinta/event-schemas/event_data_v1.json"
  SCHEMA_FORMAT:
    default: "jsonSchema"
  SCHEMA_VERSION:
    default: "1.0"
  LANDING_SOURCE:
    default: "pub.mytopic"
  SDRN:
    default: "@client.@id"
  RELATION_KEY:
    default: "event_id"
checks:
  - schema:
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0
  - freshness:
      column: ingest_time
      threshold:
        unit: hour
        must_be_less_than: 1
columns:
  - name: event_id
    data_type: string
    checks:
      - missing:
      - duplicate:
      - invalid:
          name: "Event ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: event_name
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid event names"
          valid_values:
            - job_viewed
            - apply_clicked
            - review_submitted
            - comment_posted
            - search_performed
  - name: event_time
    data_type: timestamp
    checks:
      - missing:
  - name: ingest_time
    data_type: timestamp
    checks:
      - missing:
  - name: user_id
    data_type: string
    checks:
      - invalid:
          name: "User ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: anonymous_id
    data_type: string
    checks:
      - invalid:
          name: "Anonymous ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: session_id
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Session ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: platform
    data_type: string
    checks:
      - invalid:
          name: "Valid platforms"
          valid_values:
            - web
            - ios
            - android
            - backend
  - name: app_version
    data_type: string
    checks:
      - invalid:
          name: "App version length guardrail"
          valid_min_length: 1
          valid_max_length: 64
  - name: entity_type
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid entity types"
          valid_values:
            - job
            - company
            - review
            - post
            - comment
            - message
  - name: entity_id
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Entity ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: action_result
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid action results"
          valid_values
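To make the column-level rules concrete, here is a minimal Python sketch that applies the missing, length, enum, and duplicate checks from the contract to in-memory rows. It is an illustration only, not how Soda evaluates contracts: the names check_row and duplicate_event_ids are hypothetical, null values are assumed to be skipped by the validity checks, and the action_result enum is left out because its values are truncated in the contract above.

```python
from collections import Counter

# Columns that carry a "missing" check in the contract.
REQUIRED = (
    "event_id", "event_name", "event_time", "ingest_time",
    "session_id", "entity_type", "entity_id", "action_result",
)

# Length guardrails: column -> (valid_min_length, valid_max_length).
LENGTH_LIMITS = {
    "event_id": (1, 128),
    "user_id": (1, 128),
    "anonymous_id": (1, 128),
    "session_id": (1, 128),
    "app_version": (1, 64),
    "entity_id": (1, 128),
}

# Enum constraints: column -> allowed values.
ALLOWED_VALUES = {
    "event_name": {"job_viewed", "apply_clicked", "review_submitted",
                   "comment_posted", "search_performed"},
    "platform": {"web", "ios", "android", "backend"},
    "entity_type": {"job", "company", "review", "post", "comment", "message"},
}


def check_row(row):
    """Return a list of human-readable problems found in one event row."""
    problems = []
    for col in REQUIRED:
        if row.get(col) is None:
            problems.append(f"{col}: missing")
    for col, (lo, hi) in LENGTH_LIMITS.items():
        value = row.get(col)
        # Assumption: nulls are handled by the missing check, not here.
        if value is not None and not lo <= len(value) <= hi:
            problems.append(f"{col}: invalid length")
    for col, allowed in ALLOWED_VALUES.items():
        value = row.get(col)
        if value is not None and value not in allowed:
            problems.append(f"{col}: invalid value")
    return problems


def duplicate_event_ids(rows):
    """Return the event_id values that appear more than once."""
    counts = Counter(row.get("event_id") for row in rows)
    return sorted(k for k, n in counts.items() if k is not None and n > 1)
```

A row such as {"event_id": "e1", "event_name": "job_viewed", "platform": "smart_tv", ...} would be reported with "platform: invalid value", while a row that omits the optional user_id passes the length guardrail because nulls are skipped.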
How to Enforce Data Contracts with Soda
Embed data quality through data contracts at any point in your pipeline.
# pip install soda-{data source} for other data sources
pip install soda-postgres
# verify the contract locally against a data source
soda contract verify -c contract.yml -ds ds_config.yml
# publish and schedule the contract with Soda Cloud
soda contract publish -c contract.yml -sc sc_config.yml
Check out the CLI documentation to learn more.
How to Automatically Create Data Contracts in One Click
Automatically write and publish data contracts using Soda's AI-powered data contract copilot.

Research-Backed AI Data Quality
Our research has been published in leading journals and conferences, such as NeurIPS, JAIR, and ACML. These are the same venues that advanced the foundations of GPT and modern AI.
Explore more data contract templates
One new data contract template every day, across industries and use cases
4.4 out of 5
Start trusting your data. Today.
Find, understand, and fix any data quality issue in seconds.
From the table level down to the record level.








