Soda has acquired ML monitoring startup NannyML
Jun 9, 2025

Maarten Masschelein
CEO and Founder at Soda



Today we’re kicking off Soda Launch Week with a major announcement: Soda has acquired NannyML.
Together, we’re building the most intelligent, context-aware data quality platform on the market: one that helps you prevent issues before they become business problems, detect anomalies that actually matter, and trace root causes across the entire stack, from data ingestion to automated decision-making.
This move brings together two teams with a shared goal: helping data and AI teams ship reliable, production-grade systems they can trust, whether those systems power dashboards, models, or autonomous agents.
Let’s get into what this means, why we’re doing it, and what’s coming next.
The Gap in Data Quality Is Getting Worse
If you’ve worked on data or AI infrastructure, you’ve lived this:
A pipeline silently drops a column. No schema check fails, but a downstream metric flatlines.
A dashboard suddenly shows revenue down 30%, and nobody knows why.
A model in production starts drifting due to subtle shifts in user behavior.
An agent retrains or reacts based on corrupted inputs, and no one catches it until decisions are made.
Most data quality tooling today can’t handle this. It was built for a different era of batch jobs, static schemas, and predictable data flows. It flags too much noise, misses critical context, and rarely shows downstream impact.
At the same time, the systems we’re building today are more dynamic than ever:
Agents
LLM-powered decisioning
Real-time personalization
Hybrid batch-streaming pipelines
Continuous retraining loops
In this world, traditional checks and anomaly detection aren’t enough. Data quality isn’t just about correctness anymore; it’s about consequence.
Why NannyML
NannyML tackled one of the hardest problems in modern AI systems:
How do you monitor model performance in production when there’s no ground truth yet?
Their open-source library introduced estimation-based performance monitoring, robust drift detection, and alerting designed for real-world ML pipelines. It became the go-to toolkit for teams running models where labels are delayed, sparse, or unavailable.
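To make the idea of estimating performance without labels more concrete, here is a minimal sketch in plain Python. It does not use NannyML’s API: the function name `estimate_accuracy` and the sample scores are hypothetical, and the approach only holds if the model’s predicted probabilities are well calibrated.

```python
from typing import Sequence


def estimate_accuracy(probabilities: Sequence[float], threshold: float = 0.5) -> float:
    """Estimate the accuracy of a binary classifier without any labels.

    Assumption: each value is a calibrated probability of the positive class.
    """
    if not probabilities:
        raise ValueError("need at least one prediction")
    expected_correct = 0.0
    for p in probabilities:
        predicted_positive = p >= threshold
        # A positive prediction is correct with probability p,
        # a negative prediction is correct with probability 1 - p.
        expected_correct += p if predicted_positive else 1.0 - p
    return expected_correct / len(probabilities)


# Hypothetical scores from a model running in production.
scores = [0.9, 0.8, 0.2, 0.6]
print(f"estimated accuracy: {estimate_accuracy(scores):.3f}")
```

Tracking this estimate over time gives an early signal of degrading performance while the real labels are still on their way.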
But more importantly, they saw what was coming:
That models don’t fail in isolation. They fail when data pipelines degrade, when user behavior shifts, when upstream assumptions break. And they believed the only way to solve this was to close the loop between data quality and AI behavior.
We’ve believed the same from day one.
By bringing our teams and platforms together, we’re unifying those layers and delivering a product that can monitor your entire system, not just pieces of it.
What We’re Building Together
With NannyML’s team and tech now integrated into Soda, here’s what this unlocks:
Smarter detection at the DQ layer
NannyML’s algorithms will power a more intelligent core in Soda’s checks and observability: reducing noise, surfacing real issues faster, and adapting to change.
Context-aware alerting across the stack
Trace anomalies across systems: from a column drift in your warehouse, to a prediction shift in your model, to a behavior change in your agents.
End-to-end observability: from data to decision
Monitor the full lifecycle, not just tables or checks. See how upstream issues ripple into downstream systems. Know what changed, why it matters, and what to fix.
AI-native quality infrastructure
Whether you’re running batch analytics, near-real-time features, or LLM orchestration, we’re building foundational infrastructure that keeps data and behavior aligned.
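As an illustration of what a column drift check can look like, here is a minimal sketch of the Population Stability Index (PSI), one common drift statistic. This is an assumption for illustration only: the announcement does not say which statistics Soda or NannyML use, and the function names and sample data below are hypothetical. The sketch expects non-empty numeric inputs.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], low: float, high: float, bins: int) -> List[float]:
    """Share of values per equal-width bin; out-of-range values land in the edge bins."""
    counts = [0] * bins
    width = (high - low) / bins
    for value in values:
        index = int((value - low) / width) if width > 0 else 0
        counts[min(max(index, 0), bins - 1)] += 1
    # A small floor avoids log(0) for empty bins.
    return [max(count / len(values), 1e-6) for count in counts]


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """PSI between a reference sample and a current sample of one numeric column."""
    low, high = min(reference), max(reference)
    ref = bin_fractions(reference, low, high, bins)
    cur = bin_fractions(current, low, high, bins)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical column values: last week's data versus a shifted version of it.
last_week = [i / 100 for i in range(100)]
this_week = [value + 0.5 for value in last_week]
print(population_stability_index(last_week, last_week))
print(population_stability_index(last_week, this_week))
```

Identical samples score zero, and the score grows as the current distribution moves away from the reference. Which threshold should trigger an alert is a judgment call for each column.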
And yes, NannyML’s open-source project will remain open, maintained, and fully supported. We’re not sunsetting it. We’re expanding it.
Why Now
Because the cost of bad data is rising, and fast.
The systems data powers today are higher-stakes, faster-moving, and harder to debug.
If your tooling doesn’t understand impact, it’s not helping. If it can’t handle emergence and drift, it’s irrelevant. And if it’s not built for AI-native environments, it’s already behind.
We’re not here to slap “AI” on legacy checks. We’re here to make data quality actually intelligent:
Impact-aware
Context-rich
Lifecycle-connected
And ready for systems that learn, adapt, and act
This acquisition accelerates that mission.
What’s Coming This Week
This is Day 1 of Launch Week. All week long, we’ll be announcing new capabilities and product drops that show what intelligent, AI-first data quality looks like in practice.
Here’s a preview of what’s coming:
The fastest and most accurate metrics observability
Collaborative data contracts
A free forever tier and transparent pricing
We’re just getting started, and we’re building fast.
Where To Go Next
Watch the full announcement webinar
Hear directly from Maarten and Hakim about what’s changing, and what’s coming next.
Try Soda
See how our platform is evolving to support AI-native teams. No fluff, just the signals that matter.
This is the next chapter for data quality.
Smarter. Faster. AI-ready.
And built for teams like yours.
Case studies
Trusted by the world’s leading enterprises
Real stories from companies using Soda to keep their data reliable, accurate, and ready for action.
At the end of the day, we don’t want to be in there managing the checks, updating the checks, adding the checks. We just want to go and observe what’s happening, and that’s what Soda is enabling right now.

Sid Srivastava
Director of Data Governance, Quality and MLOps
Investing in data quality is key for cross-functional teams to make accurate, complete decisions with fewer risks and greater returns, using initiatives such as product thinking, data governance, and self-service platforms.

Mario Konschake
Director of Product-Data Platform
Soda has integrated seamlessly into our technology stack and given us the confidence to find, analyze, implement, and resolve data issues through a simple self-serve capability.

Sutaraj Dutta
Data Engineering Manager
Our goal was to deliver high-quality datasets in near real-time, ensuring dashboards reflect live data as it flows in. But beyond solving technical challenges, we wanted to spark a cultural shift - empowering the entire organization to make decisions grounded in accurate, timely data.

Gu Xie
Head of Data Engineering
4.4 of 5
Start trusting your data. Today.
Find, understand, and fix any data quality issue in seconds.
From table to record-level.
Trusted by



