Published

Feb 5, 2021

Key Takeaways from Subsurface LIVE

Domien Declercq

Former Business Development at Soda

The Soda team recently participated in Subsurface LIVE Winter Edition, Dremio’s cloud data lake conference, which took place online.

It was fascinating to see how rapidly data management has changed over the last few years. Most notable is the rising importance of data monitoring and of the data engineer’s role, which (unashamed plug here) is exactly what we at Soda were showcasing at the conference with our new open source project, Soda SQL. Coincidence or perfect timing? Read on.

In the opening keynote, Tomer Shiran, Chief Product Officer at Dremio, touched upon some of the changes that have occurred in the data management space. Tomer spoke about the change in architecture from a monolithic, client-server approach often using proprietary software, to a more loosely connected, cloud-based approach that relies on open-source software.

The key driver for this is the need to manage much larger datasets. The modern data stack is increasingly massive and complex. Organizations are all too aware of the need to deliver the right data to the right people at the right time and, as Tomer highlighted, to make data available 24/7 to different users across the entire organization. Even so, we need to recognize that many businesses are struggling to keep on top of all the known (and unknown) data quality issues.

Through my virtual-quasi-real interactions and conversations at the Soda booth, I can confirm that the demand is real!

Data engineering and infrastructure teams are under immense pressure to manage the relentless demand for supreme-quality, analytics-ready data from an ever-increasing number of data sources.

These challenges - and the possible solutions - are what so many attendees discussed throughout the conference. In fact, ‘on-demand data availability’ might be the next big thing that business analysts and the industry will be talking about in 2021 and beyond.

Another key takeaway from the conference was the need for data compatibility and consistency across the data pipeline, from the source to the user. For example, a change in the data type of a column in a source database needs to be reflected in the schema of the data lake that a user accesses. Validating data quality consistently across the pipeline is one of the reasons we built the Soda data monitoring platform. It was fun to introduce the platform to attendees during the event.

Image of the Soda data monitoring platform in action
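To make the column-type example above a little more concrete, here is a minimal sketch in plain Python of what a schema consistency check between a source and a data lake could look like. It is not how the Soda platform works internally; the function name `find_schema_drift`, the table columns and the type names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def find_schema_drift(source: Dict[str, str], lake: Dict[str, str]) -> List[str]:
    """Compare two column-name -> data-type mappings and describe the differences."""
    issues = []
    # Columns that exist in the source but are missing or typed differently downstream.
    for column, source_type in source.items():
        if column not in lake:
            issues.append(f"{column}: missing from data lake")
        elif lake[column] != source_type:
            issues.append(
                f"{column}: source is {source_type}, data lake is {lake[column]}"
            )
    # Columns that only exist downstream.
    for column in lake:
        if column not in source:
            issues.append(f"{column}: not present in source")
    return issues


# Hypothetical schemas: the type of "amount" changed in the source
# but the data lake schema was never updated.
source_schema = {"id": "BIGINT", "amount": "DECIMAL(10,2)", "created_at": "TIMESTAMP"}
lake_schema = {"id": "BIGINT", "amount": "DOUBLE", "created_at": "TIMESTAMP"}

drift = find_schema_drift(source_schema, lake_schema)
for issue in drift:
    print(issue)
```

In practice the two mappings would be read from each system's metadata rather than written by hand; the comparison itself stays this simple.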

Undeniably, we are seeing the rise of data engineers, data product owners and data scientists. Hurrah! But with that comes the realization of how hard it is to keep on top of all the known (and unknown) data quality issues. Resource-strapped data engineering and infrastructure teams are everywhere, struggling to manage the growing demand for analytics-ready data from an ever-increasing number of data sources. We need to bring additional software engineering principles into the data engineering workflow, and we’ve started doing that at Soda.

Is it a hopeless situation? Absolutely not! This fast-growing community already uses a plethora of open-source developer tools, such as Spark or dbt, to facilitate modern-day data product management. Now, we need to bring additional software engineering principles into the data engineering workflow. Let’s keep exploring that, together.

And really, the best takeaway was the shared understanding that data needs to be monitored, tested and validated as early as possible, and ultimately before it reaches the user.

If you were unable to attend the conference yourself, the sessions are available on demand.

TL;DL: Of all the informative and thought-provoking talks at Subsurface, make sure to listen to Tomer Shiran’s keynote and Roy Hasson’s AWS presentation on data lakes. But, depending on where you are in the data pipeline, all are worthwhile.

I started by talking about change, and so I’ll end with it. Like many, I missed the in-person interaction that this community does so well and thrives on. Still, it was a great event, and the DJ session was unexpectedly fun! Soda was proud to sponsor, and I personally got a lot out of the conference.

Of course, I would love for you to go and explore Soda SQL, whose release appears to be so well timed.

Soda SQL is our newly released open source project. Soda is championing the engineering principles of Test-Driven Development (TDD) in its data monitoring platform and we’d like for you to give it a try.
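For readers who have not applied test-driven thinking to data before, the idea can be sketched roughly as follows: state the expectations for a table first, then run them against the data before anyone downstream consumes it. This is a plain Python and SQLite illustration under stated assumptions, not Soda SQL's actual syntax or API; the `run_checks` helper, the `orders` table and the check names are hypothetical.

```python
import sqlite3


def run_checks(conn, table, required_columns):
    """Run simple data tests and return a dict of check name -> pass/fail.

    Table and column names are interpolated into the SQL, so they are
    assumed to be trusted constants, never user input.
    """
    results = {}
    # Expectation 1: the table is not empty.
    row_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    results["row_count > 0"] = row_count > 0
    # Expectation 2: required columns contain no missing values.
    for column in required_columns:
        missing = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
        ).fetchone()[0]
        results[f"missing({column}) == 0"] = missing == 0
    return results


# Hypothetical dataset with one bad record (a missing email address).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_email TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "a@example.com"), (2, None)],
)

results = run_checks(conn, "orders", ["id", "customer_email"])
for name, passed in results.items():
    print(f"{'PASS' if passed else 'FAIL'}  {name}")
```

A pipeline could refuse to publish the table whenever any check fails, which is the "before it reaches the user" part of the argument.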

Go on, go and test yourself some good quality data.

Enjoy!

‼️ Soda SQL has become Soda Library.

Go here for more information: Introducing Soda Library

Start trusting your data. Today.

Soda fixes data. End-to-end. From detection to resolution.
All automated with AI.

Request a demo

Case studies

Trusted by the world’s leading enterprises

Real stories from companies using Soda to keep their data reliable, accurate, and ready for action.

At the end of the day, we don’t want to be in there managing the checks, updating the checks, adding the checks. We just want to go and observe what’s happening, and that’s what Soda is enabling right now.

Sid Srivastava

Director of Data Governance, Quality and MLOps

Investing in data quality is key for cross-functional teams to make accurate, complete decisions with fewer risks and greater returns, using initiatives such as product thinking, data governance, and self-service platforms.

Mario Konschake

Director of Product-Data Platform

Soda has integrated seamlessly into our technology stack and given us the confidence to find, analyze, implement, and resolve data issues through a simple self-serve capability.

Sutaraj Dutta

Data Engineering Manager

Our goal was to deliver high-quality datasets in near real-time, ensuring dashboards reflect live data as it flows in. But beyond solving technical challenges, we wanted to spark a cultural shift - empowering the entire organization to make decisions grounded in accurate, timely data.

Gu Xie

Head of Data Engineering

4.4 of 5

Start trusting your data. Today.

Find, understand, and fix any data quality issue in seconds.
From table to record-level.

Trusted by
