Published
Dec 12, 2023
Achieving Zero Defects with Data Testing Automation



FirstParty, a leader in data services, provides comprehensive solutions for organizations to efficiently manage and leverage their data assets. Founded with a mission to empower organizations of all sizes and industries, FirstParty offers expertise in assessing, organizing, and deploying data to maximize its value. Its team of seasoned data professionals brings a wealth of knowledge and tools to unlock the business value of data.
In 2022, FirstParty first set out to improve the quality of its data products to exceed industry standards and customer expectations. This led to a partnership with Soda, marking a significant step in the evolution of their data management practices and establishing new industry benchmarks.
The challenge: achieving zero defects across complex data products
FirstParty faced a significant challenge in ensuring data quality across its various data products due to the complexity of the data types and sources they deal with. Managing data quality in such a diverse environment required advanced techniques and tools to identify and correct inconsistencies and errors. The evolving nature of the data, with constant updates and changes, added to the complexity. All of this made it a considerable task for FirstParty to maintain its commitment to delivering the highest quality data.
"Finding a data issue on your own is like finding a needle in a haystack. In the past, we were checking the data manually at the point of delivery, like right at the very end of our data pipeline." — Jolie McDonnell, Data Scientist at FirstParty
Before implementing Soda, FirstParty's quality assurance process, while rigorous, was fundamentally reactive. The team employed a three-fold approach: peer-reviewed code, Airflow monitoring for job failures, and manual review of output files using Excel or Python notebooks. But this left them exposed to a critical vulnerability.
The problem wasn't just the manual effort; it was the timing. With pipelines containing 20 to 30 transformation steps, catching issues only at the end made troubleshooting nearly impossible. Finding the source of a problem meant retracing dozens of transformations and even model logic. Even worse, the team often didn't know if queries ran as intended.
In 2022, the company embarked on a journey to implement a 'zero defect policy' for its data offering. This initiative was not only about maintaining data accuracy but also about instilling a culture of excellence and reliability. Operating in the financial services and hedge fund space, where clients make high-stakes investment decisions based on FirstParty's data products, even a single error could be catastrophic.
The challenge went beyond technical solutions; it required a change in organizational mindset, where every team member became a guardian of data integrity. This holistic approach to data quality was essential to meet the high standards FirstParty set for itself and its customers.
The solution: proactive quality management
Several factors made Soda the right choice for FirstParty's diverse, fast-paced environment.
The ability to connect to multiple data sources was crucial—clients use various platforms like Redshift, and FirstParty needed flexibility to work with JSON files, Parquet files, and different partitioning schemes across projects.
YAML-based checks also enabled rapid iteration. Once FirstParty built checks for initial pipelines, they could reuse and adapt them for new projects. The readability meant non-technical stakeholders could understand what was being validated, creating transparency across the organization.
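The reuse pattern described here can be illustrated with a minimal sketch. It is written in plain Python rather than Soda's YAML check language, and every name in it (`min_row_count`, `no_missing`, `BASE_CHECKS`, the `id` and `price` columns) is a hypothetical stand-in, not FirstParty's actual checks or Soda's API. The idea is only that checks are declared once as named, readable rules and then extended for a new project.

```python
from typing import Any, Callable

Row = dict[str, Any]
Check = Callable[[list[Row]], bool]


# Hypothetical check builders; illustrative only, not Soda's API.
def min_row_count(n: int) -> Check:
    """Pass when the dataset has at least n rows."""
    return lambda rows: len(rows) >= n


def no_missing(column: str) -> Check:
    """Pass when no row has a missing (None or absent) value in the column."""
    return lambda rows: all(row.get(column) is not None for row in rows)


# A base set of named checks, written once for an initial pipeline.
BASE_CHECKS: dict[str, Check] = {
    "has rows": min_row_count(1),
    "id is present": no_missing("id"),
}


def run_checks(rows: list[Row], checks: dict[str, Check]) -> dict[str, bool]:
    """Evaluate every named check and return a pass/fail result per name."""
    return {name: check(rows) for name, check in checks.items()}


# A new project reuses the base set and adds one project-specific check.
project_checks = {**BASE_CHECKS, "price is present": no_missing("price")}

# Made-up sample rows for illustration.
sample = [{"id": 1, "price": 9.5}, {"id": 2, "price": None}]
results = run_checks(sample, project_checks)
for name, passed in results.items():
    print(f"{name}: {'pass' if passed else 'FAIL'}")
```

Because each check has a human-readable name, the pass/fail report stays legible to people who never read the implementation, which is the property the team valued in the YAML format.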
“The YAML format was a huge benefit to us. We’re a very diverse team, we come from a variety of skill sets, and sometimes just having easy-to-read YAML checks to understand what’s going wrong in the data across the company and the business is really imperative.” — Jolie McDonnell, Data Scientist at FirstParty
Real-time notification integrations completed the picture, ensuring the team knew immediately when something went wrong rather than discovering issues hours or days later.
By integrating Soda into its data processes, FirstParty not only streamlined its operations but also brought a new level of accuracy and efficiency to its data management. The key to their approach was adopting a "shift left" philosophy—implementing quality checks as early and as often as possible throughout the data pipeline, rather than waiting until the end.
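As a rough illustration of the "shift left" idea, the sketch below runs a check immediately after each transformation step and stops at the first failure, so the failing stage is known by name. It is a simplified stand-in under stated assumptions: the stage names, the two transformations, and `run_pipeline` are hypothetical, and a real pipeline would delegate the checks to a tool such as Soda rather than inline lambdas.

```python
from typing import Callable, Optional

Rows = list[dict]
Step = Callable[[Rows], Rows]
Check = Callable[[Rows], bool]


def run_pipeline(
    rows: Rows, stages: list[tuple[str, Step, Check]]
) -> tuple[Rows, Optional[str]]:
    """Apply each step, then its check; stop at the first failing stage.

    Returns the rows produced so far and the name of the failing stage,
    or None when every stage passed.
    """
    for name, step, check in stages:
        rows = step(rows)
        if not check(rows):
            return rows, name
    return rows, None


# Hypothetical transformations, for illustration only.
def drop_negative_amounts(rows: Rows) -> Rows:
    return [r for r in rows if r["amount"] >= 0]


def add_amount_cents(rows: Rows) -> Rows:
    return [{**r, "amount_cents": round(r["amount"] * 100)} for r in rows]


stages = [
    # Each stage pairs a transformation with the check that validates its output.
    ("filter", drop_negative_amounts, lambda rows: len(rows) > 0),
    (
        "convert",
        add_amount_cents,
        lambda rows: all(isinstance(r["amount_cents"], int) for r in rows),
    ),
]

# Made-up input: every row is filtered out, so the first stage's check fails.
_, failed_stage = run_pipeline([{"amount": -1.0}], stages)
print(f"failed stage: {failed_stage}")
```

With 20 to 30 steps, returning the failing stage name is what replaces retracing the whole pipeline from the final output.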
Soda's advanced solutions enabled FirstParty to proactively identify and resolve data issues early in the pipeline, significantly reducing the time and resources spent on troubleshooting. This integration went beyond mere technical improvements; it fostered a culture of proactive quality management, where data quality became a shared responsibility across teams.
Soda's tools empowered FirstParty's team to not just react to data issues, but to anticipate and prevent them, ensuring that their 'zero defect policy' was not just an aspiration, but a practical, achievable goal.
Catching Issues at the Top of the Funnel
FirstParty's integration strategy centers on validating data at every transformation stage, starting the moment data arrives from clients. This top-of-funnel validation proved immediately valuable. When a client recently started dropping duplicate records, FirstParty caught it the same day the data arrived, before running any downstream transformations.
“If you're catching stuff at the end, you can't even use the data, and you don't even know where to fix it. That's what allows us to be super-surgical, right? We know exactly where that failed." — Ben Sgro, VP Engineering at FirstParty
Without checks in place, they wouldn't have discovered the problem until much later in the pipeline, making it far more difficult to diagnose and communicate to the client.
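A top-of-funnel duplicate check of this kind might look like the following sketch. The record layout, the key fields (`account_id`, `trade_date`), and the sample values are all invented for illustration; the source does not describe the client's data or how FirstParty defines a duplicate. Here a duplicate is assumed to mean two records sharing the same business key.

```python
from collections import Counter


def find_duplicate_keys(records: list[dict], key_fields: list[str]) -> dict[tuple, int]:
    """Return each business key that appears more than once, with its count."""
    counts = Counter(tuple(rec[field] for field in key_fields) for rec in records)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up incoming batch; the third record repeats the first one's key.
incoming = [
    {"account_id": "A1", "trade_date": "2023-11-01", "amount": 100},
    {"account_id": "A2", "trade_date": "2023-11-01", "amount": 250},
    {"account_id": "A1", "trade_date": "2023-11-01", "amount": 100},
]

duplicates = find_duplicate_keys(incoming, ["account_id", "trade_date"])
if duplicates:
    # Stop here, before any downstream transformation runs on bad input.
    print(f"Halting before transformations: {len(duplicates)} duplicated key(s)")
```

Running the check on arrival means the failure report points at the raw delivery itself, which makes the conversation with the client about the source of the problem much simpler.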
The impact: cultural transformation and client confidence
The collaboration with Soda has been the catalyst for a significant change in FirstParty's approach to data quality. The reduction in the number of data errors and the creation of an organized hub for quality assurance were only the first benefits to be gained.
Perhaps the most tangible impact has been on the team's quality of life and operational peace of mind. The team can now trust their pipelines are continuously validated, with issues caught and addressed during normal working hours rather than discovered in crisis mode.
With a small team supporting a growing portfolio of data products for demanding financial clients, FirstParty needed tools that multiplied their effectiveness. Soda's automation freed the team from manual validation work, allowing them to focus on building rather than performing repetitive quality checks.
The paradigm shift in data management also improved collaboration and communication across the business, fostering an organizational culture that prioritizes data quality. The impact of this change is far-reaching, resulting in delivering more reliable and higher quality data products.
These improvements have already resonated with clients, increasing their confidence in and satisfaction with FirstParty's services, and ultimately reinforcing FirstParty's reputation as a leading provider of world-class data services. Their competitive advantage isn't just technical capability; it's the trust they've built through transparent, proactive data quality management.
Looking ahead: from afterthought to core process
As the team has matured in their use of Soda, they've moved from treating data quality checks as an afterthought to making them core to the pipeline development process from day one.
"As we move forward with new pipeline builds and new clients, Soda is going to be core—no pun intended—to our pipeline-building process, because it's really important to have those checks every time a production-level table is built." — Jolie McDonnell, Data Scientist at FirstParty
This shift represents a fundamental change in methodology—moving from retrofitting quality checks to designing them in parallel with data transformations. Quality expectations are now captured as executable specifications during the design phase, creating living documentation that both validates data and serves as a technical contract with clients.
FirstParty's approach to data quality is deliberately iterative, starting with foundational checks, then expanding coverage based on real-world incidents. Each problem that occurs drives new preventive checks, creating increasingly resilient pipelines over time.
Soda plays a central role in this automation strategy, eliminating the manual QA processes that once consumed significant engineering time. As the team continues to grow, this focus on automation allows them to scale efficiently while maintaining their zero-defect standard.
Disclaimer: This material was created in 2023. Please note that figures and statistics may have changed since its publication.
Listen to the Podcast
In Conversation with FirstParty at Club Soda New York
Soda's CEO Maarten Masschelein joins FirstParty staff Jolie McDonnell (Data Scientist), Ben Sgro (VP Engineering), and Tommy Dodge (Director of Analytics).
The topic is data products and the conversation centers on FirstParty's mission to provide businesses with the capabilities to maximise the value of their data assets.

Get in Touch
For data services companies, consultancies, or any organization where data products are the core deliverable, FirstParty's journey offers a clear lesson: quality cannot be an afterthought or a post-processing step. It must be embedded from the beginning, automated at scale, and made transparent to stakeholders.





