Soda Introduces Operational Analytics Dashboards to Gather Insights on Platform Usage & Data Quality Efforts
Organizations persistently struggle to ensure that their tech investments deliver the value and benefit they promised. With increased investment in new data quality management solutions, we’ve heard many data teams ask for a standard way of reporting on their data quality efforts with Soda. Data teams usually want to answer the following questions:
- Are our users using the data quality monitoring solution?
- Are our users adopting it into their way of working and adding data quality checks before they use data?
- Are our data quality standards improving?
- How broad is our test coverage?
I’m excited to introduce a new Reporting API for Soda Cloud that enables you to build dashboards using Key Performance Indicators that help you understand Soda’s impact on ensuring data quality in your organization. Use the API to assess tool adoption and analyze the health and test coverage of your datasets.
To get organizations going, we’ve released several API endpoints designed to address some of the common questions that our users need to answer.
To understand user adoption, you can gather data that gives you insight into:
- Sign-ups: Who has signed up for Soda Cloud in your organization?
- Sign-ins: Who is using Soda Cloud every day?
- Daily account activity: How active is your team within Soda?
- Scans run: Which Soda scans has your team been running and how many tests have been executed?
To understand whether Soda is having an impact and helping you discover data issues before they have a downstream impact, you can gather data on:
- Alerts sent: How many alerts has Soda Cloud sent to notify users of a data issue?
To get a sense of whether your data is meeting data quality agreements, you can gather data on:
- Dataset health: How healthy is each dataset, based on the number of tests that are passed during each Soda scan?
- Test results: What are the results of all our tests and how have these changed over time?
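As a rough illustration of the dataset health idea above, here is a minimal Python sketch that computes the percentage of tests passed per dataset and scan. The field names (`dataset`, `scan_time`, `result`) and the sample rows are assumptions made up for this example, not the actual shape of the Reporting API response, so check the API documentation before adapting it.

```python
from collections import defaultdict


def dataset_health(test_results):
    """Return {(dataset, scan_time): percentage of tests passed}.

    Assumes each row is a dict with hypothetical keys
    "dataset", "scan_time", and "result" ("pass" or "fail").
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for row in test_results:
        key = (row["dataset"], row["scan_time"])
        total[key] += 1
        if row["result"] == "pass":
            passed[key] += 1
    return {key: 100.0 * passed[key] / total[key] for key in total}


# Hypothetical sample rows, standing in for data gathered from the API.
sample_results = [
    {"dataset": "orders", "scan_time": "2021-11-22", "result": "pass"},
    {"dataset": "orders", "scan_time": "2021-11-22", "result": "fail"},
    {"dataset": "orders", "scan_time": "2021-11-23", "result": "pass"},
    {"dataset": "customers", "scan_time": "2021-11-23", "result": "pass"},
]

health = dataset_health(sample_results)
for (dataset, scan_time), percent in sorted(health.items()):
    print(f"{dataset} {scan_time}: {percent:.0f}% of tests passed")
```

Keeping the scan time in the key lets you plot the same number over time, which is one way to see whether results are changing from scan to scan.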
We also provide endpoints that enable you to understand how good your testing coverage is, where datasets are failing, and which specific rules and metrics are causing the majority of data quality problems:
- Tests: A list of all of the tests and the datasets to which they apply.
- Datasets: A list of all of the datasets that Soda Cloud accesses, including the last scan time and test failure counts.
- Dataset coverage: A score derived from the number of tests that apply to a dataset compared to the number of tests applied to all other datasets accessed by Soda Cloud.
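To make the dataset coverage idea more concrete, here is one possible reading of it, sketched in Python: each dataset’s test count is scaled against the most-tested dataset. This is an assumption for illustration only and may well differ from how Soda Cloud calculates its score. The dataset names and counts are invented.

```python
def coverage_scores(test_counts):
    """Scale each dataset's test count against the most-tested dataset (0-100).

    This is a simple relative measure chosen for illustration,
    not Soda Cloud's own coverage formula.
    """
    if not test_counts:
        return {}
    highest = max(test_counts.values())
    if highest == 0:
        # No dataset has any tests, so every score is zero.
        return {name: 0.0 for name in test_counts}
    return {name: 100.0 * count / highest for name, count in test_counts.items()}


# Hypothetical number of tests that apply to each dataset.
sample_counts = {"orders": 8, "customers": 4, "payments": 0}

scores = coverage_scores(sample_counts)
least_tested = min(scores, key=scores.get)
print(scores)
print("Least tested dataset:", least_tested)
```

A relative score like this makes it easy to rank datasets and spot the ones that need more tests, even when you have not agreed on an absolute target for test counts.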
To give you real-time visibility into your data’s quality and reliability, all platform impact endpoints run in real time. All endpoints relating to adoption and usage are refreshed daily.
Working with poor quality data is time consuming and resource intensive. Using these endpoints, you can build data quality dashboards that help you quickly understand where data quality is falling short.
Improving data quality improves business
Here’s another advantage: a data quality dashboard is a powerful communication tool. It shows the impact and business value that your initiatives and programs are delivering, and it helps elevate the visibility and importance of data quality. It’s a great mechanism to get everyone interested in data and its quality, because every data-enabled organization knows that good data quality is fundamental to the success of its business.
By providing a simple way for everyone to understand where data is falling short and where data is making a difference, we are bringing everyone closer to the data and increasing their trust in it. That’s what we’re particularly proud of at Soda: making data quality a team sport.
The Reporting API enables you to build dashboards that provide insights that are useful and interesting to a head of data or of business operations, while providing data engineers and analysts with the granularity they need to prioritize their data reliability efforts.
Organizations need data quality management that brings all users together on a unified platform and meets their diverse needs in environments they are familiar with.
Know Your Datasets and Gauge Adoption
Assessing the quality of data within a dataset helps organizations get ahead of silent data issues and provide end-to-end transparency. With the Reporting API, you can answer questions such as:
- Which datasets are least tested?
- Are critical datasets healthy?
- Which data issues should we tackle first?
- What can we learn about our data quality issues in order to improve and deliver trusted data to the organization?
- How often is our team running scans?
- How are people on our team using Soda Cloud?
Get Started
The Soda Cloud Reporting API is available as of November 23, 2021, to all registered users of Soda Cloud. Access our API documentation at docs.soda.io.
With a data engineer and an analytics engineer on your team working together, you’ll be able to use the API to build reporting dashboards in under an hour. Our docs have a step-by-step guide to get you going.
As a data engineer, once you’ve captured the data you need from the Reporting API, move it into the storage or compute systems that power your existing reporting or visualization tools.
Engage your analytics engineer to transform the data, bake it into the business logic, and create beautiful dashboards that answer the questions that will help your team understand and trust the data.
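The two steps above, loading the captured data into storage and then transforming it, can be sketched roughly as follows. This example uses Python’s built-in `sqlite3` module with an in-memory database as a stand-in for your own warehouse. The table name, column names, and sample rows are hypothetical, and the sketch assumes the API response has already been parsed into a list of dictionaries.

```python
import sqlite3


def load_rows(conn, rows):
    """Store captured test results in a hypothetical test_results table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_results ("
        "dataset TEXT, scan_time TEXT, result TEXT)"
    )
    # Named placeholders map each dict key to a column.
    conn.executemany(
        "INSERT INTO test_results (dataset, scan_time, result) "
        "VALUES (:dataset, :scan_time, :result)",
        rows,
    )
    conn.commit()


def failures_per_dataset(conn):
    """Return (dataset, failed test count) pairs, most failures first."""
    cursor = conn.execute(
        "SELECT dataset, COUNT(*) FROM test_results "
        "WHERE result = 'fail' "
        "GROUP BY dataset ORDER BY COUNT(*) DESC, dataset"
    )
    return cursor.fetchall()


# Hypothetical rows, standing in for data captured from the Reporting API.
rows = [
    {"dataset": "orders", "scan_time": "2021-11-22", "result": "fail"},
    {"dataset": "orders", "scan_time": "2021-11-23", "result": "fail"},
    {"dataset": "customers", "scan_time": "2021-11-23", "result": "fail"},
    {"dataset": "customers", "scan_time": "2021-11-23", "result": "pass"},
]

conn = sqlite3.connect(":memory:")  # swap for your own storage system
load_rows(conn, rows)
failing = failures_per_dataset(conn)
print(failing)
```

In practice, the aggregate query is the kind of logic an analytics engineer would move into the transformation layer that feeds your dashboards.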
If you’re new to Soda, sign up for a free Soda Cloud account today. You’ll need a Soda Cloud account connected to an instance of Soda SQL installed in your environment. If your team has already defined data quality tests and run some scans against datasets, you’ll have data that the Reporting API can retrieve. If you’d like to see the Reporting API in action, get in touch with us to request a demo, or use the easy-to-follow guide available in our docs.
Why an API and not ready-made dashboards?
Our data engineers and data scientists unanimously agreed that providing an API for our users would address the need to get quick visibility into user adoption, understand the data that matters most to the business, and ensure that data quality standards are improving. Speaking with our community, we also realized that we don’t need to replace the reporting and visualization tools that they already use. We aim to make it easy to integrate with Soda and this API means teams can continue using the tools they love – one less change to support greater adoption.
The Future is Built on Trusted Data
This is just the beginning. Customers should always expect more value from the tools and platforms that they’re investing in, and we’re committed to delivering it, quickly and consistently. We’re continuing development on Soda’s Reporting API to connect you to the data that helps you understand and trust your data.
With that in mind, we have taken another cue from our community of users to offer Incident Management within Soda Cloud. Keep an eye out for the imminent release of a Slack-integrated feature that helps teams collaborate to quickly resolve data quality issues, before they have a downstream impact.
We’d love to hear how the Soda Reporting API works for you and we’re available for any questions or feedback. Join our Soda Community on Slack and let us know what you think!