Product Interaction Events

Data Contract Template

Ensure data from product interaction events is fresh, complete, and reliable before it is used for behavioral analytics, user journey analysis, and product performance measurement.

Data contract description

This data contract is designed for Glassdoor, where high-volume product events such as job_viewed, apply_clicked, review_submitted, comment_posted, and search_performed help the company understand user behavior across jobs, reviews, and workplace conversations on its platform. The contract sets a one-hour freshness expectation on ingest_time and governs key interaction context such as session, platform, entity, and action outcome. This supports Glassdoor’s shift-left approach to data quality: product telemetry becomes more dependable at the point of production, downstream breakage is reduced, and trust is preserved in the event data that powers analytics at scale.
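
To make the one-hour freshness expectation concrete, here is a minimal Python sketch of the idea. It assumes freshness is measured as the age of the newest ingest_time value relative to the current time. The function name is_fresh and the sample timestamps are hypothetical; this illustrates the concept and is not Soda's implementation.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(ingest_times, now, max_age=timedelta(hours=1)):
    """Return True when the newest ingest_time is less than max_age old.

    Assumption: freshness compares `now` with the most recent timestamp.
    """
    if not ingest_times:
        # With no rows there is no recent event to point to.
        return False
    newest = max(ingest_times)
    return now - newest < max_age


# Hypothetical sample data, fixed in time so the result is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
recent = [now - timedelta(minutes=50), now - timedelta(minutes=10)]
stale = [now - timedelta(hours=3), now - timedelta(minutes=90)]

print(is_fresh(recent, now))  # newest event arrived 10 minutes ago
print(is_fresh(stale, now))   # newest event arrived 90 minutes ago
```

Because the threshold is a strict "less than", a newest event that is exactly one hour old counts as stale in this sketch.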

product_interaction_data_contract.yaml

dataset
checks:
  - schema:
      allow_extra_columns: false
      allow_other_column_order: false
  - row_count:
      threshold:
        must_be_greater_than: 0
  - freshness:
      column: ingest_time
      threshold:
        unit: hour
        must_be_less_than: 1
columns:
  - name: event_id
    data_type: string
    checks:
      - missing:
      - duplicate:
      - invalid:
          name: "Event ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: event_name
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid event names"
          valid_values:
            - job_viewed
            - apply_clicked
            - review_submitted
            - comment_posted
            - search_performed
  - name: event_time
    data_type: timestamp
    checks:
      - missing:
  - name: ingest_time
    data_type: timestamp
    checks:
      - missing:
  - name: user_id
    data_type: string
    checks:
      - invalid:
          name: "User ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: anonymous_id
    data_type: string
    checks:
      - invalid:
          name: "Anonymous ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: session_id
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Session ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: platform
    data_type: string
    checks:
      - invalid:
          name: "Valid platforms"
          valid_values:
            - web
            - ios
            - android
            - backend
  - name: app_version
    data_type: string
    checks:
      - invalid:
          name: "App version length guardrail"
          valid_min_length: 1
          valid_max_length: 64
  - name: entity_type
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid entity types"
          valid_values:
            - job
            - company
            - review
            - post
            - comment
            - message
  - name: entity_id
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Entity ID length guardrail"
          valid_min_length: 1
          valid_max_length: 128
  - name: action_result
    data_type: string
    checks:
      - missing:
      - invalid:
          name: "Valid action results"
          valid_values
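
The column checks above (missing, duplicate, valid values, and length guardrails) can be read as simple per-row rules. The following Python sketch applies the event_id and event_name rules to a small batch of rows. The function check_events and the sample rows are hypothetical and only illustrate what the checks mean; they do not reflect how Soda evaluates a contract.

```python
from collections import Counter

# Allowed values copied from the "Valid event names" check in the contract.
VALID_EVENT_NAMES = {
    "job_viewed",
    "apply_clicked",
    "review_submitted",
    "comment_posted",
    "search_performed",
}


def check_events(rows):
    """Return a list of (row_index, problem) tuples for a batch of event dicts."""
    problems = []
    id_counts = Counter(row.get("event_id") for row in rows)
    for i, row in enumerate(rows):
        event_id = row.get("event_id")
        if event_id is None:
            problems.append((i, "event_id missing"))
        else:
            if id_counts[event_id] > 1:
                problems.append((i, "event_id duplicate"))
            # Length guardrail: between 1 and 128 characters inclusive.
            if not 1 <= len(event_id) <= 128:
                problems.append((i, "event_id length out of range"))
        name = row.get("event_name")
        if name is None:
            problems.append((i, "event_name missing"))
        elif name not in VALID_EVENT_NAMES:
            problems.append((i, "event_name invalid"))
    return problems


# Hypothetical rows: a duplicated ID, an unknown event name, and a missing ID.
rows = [
    {"event_id": "e1", "event_name": "job_viewed"},
    {"event_id": "e1", "event_name": "apply_clicked"},
    {"event_id": "e2", "event_name": "page_scrolled"},
    {"event_id": None, "event_name": "search_performed"},
]
for problem in check_events(rows):
    print(problem)
```

Running checks like these at the point of production is what keeps an unexpected event name or a reused event_id from reaching downstream analytics.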

How to Enforce Data Contracts with Soda

Embed data quality through data contracts at any point in your pipeline.

# pip install soda-{data source} for other data sources

pip install soda-postgres

# verify the contract locally against a data source

soda contract verify -c contract.yml -ds ds_config.yml

# publish and schedule the contract with Soda Cloud

soda contract publish -c contract.yml -sc sc_config.yml
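
In a pipeline, the two commands above are often chained so that a contract is published only after it verifies. The Python sketch below shows one way to do that with subprocess. It assumes the soda CLI is on the PATH and that a failed verification produces a non-zero exit code; check the CLI documentation to confirm that behavior. The function verify_then_publish is a hypothetical helper, not part of Soda.

```python
import subprocess

# Commands copied from the steps above; the file names are the same placeholders.
VERIFY_CMD = ["soda", "contract", "verify", "-c", "contract.yml", "-ds", "ds_config.yml"]
PUBLISH_CMD = ["soda", "contract", "publish", "-c", "contract.yml", "-sc", "sc_config.yml"]


def verify_then_publish(run=subprocess.run):
    """Run verify, then publish only if verify succeeded.

    Assumption: the CLI signals a failed verification with a non-zero exit code.
    `run` is injectable so the gating logic can be exercised without the CLI.
    Returns True only when both commands exit with code 0.
    """
    verify = run(VERIFY_CMD)
    if verify.returncode != 0:
        print("Verification failed; skipping publish.")
        return False
    publish = run(PUBLISH_CMD)
    return publish.returncode == 0
```

Calling verify_then_publish() with no arguments runs the real commands. Passing a stand-in for run makes it possible to try the gating logic on a machine where Soda is not installed.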

Check out the CLI documentation to learn more.

How to Automatically Create Data Contracts in One Click

Automatically write and publish data contracts using Soda's AI-powered data contract copilot.

Make data contracts work in production

Business knows what good data looks like. Engineering knows how to deliver it at scale. Soda unites both, turning governance expectations into executable contracts.

4.4 of 5

Start trusting your data. Today.

Find, understand, and fix any data quality issue in seconds.
From table to record-level.

