As organizations change how they work with data, data teams are moving to a decentralized, domain-driven approach. For that to work, domain teams need the support, accountability, and ability to use and share their data.
Setting Everyone Free
At the same time, there is a critical business need for data consumers and business subject matter experts to take ownership of their data quality and get involved in data quality management. In a modern data architecture, data management is built into every step of the data flow and product lifecycle. Most data teams today are organized by domain and made up of people in different roles, such as data or analytics engineers and data product managers. Because data is cross-functional, teams depend on each other to provide reliable, high-quality data every day.
Today, we’re introducing the next milestone in Soda’s mission to bring everyone closer to the data. Available now, in preview mode, is a new set of features and capabilities that takes Soda Cloud to the next level. We built them for the analysts and data consumers in companies that are building innovative new products using data, as well as for the teams responsible for producing data or managing contracts with data vendors. Together, they bring a fresh approach to how teams can get ahead of data issues.
Are You Being Served?
Following on from the GA of SodaCL, the domain-specific language for data reliability, we’re very excited to make this preview available. Within Soda Cloud, analysts can write their own checks and run Soda Scans. Analysts are now on their way to fully self-serving and managing their own data quality.
To date, many of the existing tools for checking data quality have been built for a technical audience: users who can read and write SQL. At Soda, we’re building data reliability tools and an observability platform to help the entire data team discover, prioritize, and resolve data quality issues. We’re simplifying a traditional, cumbersome process that has made data analysts heavily reliant on data engineers to implement data checks.
Write Your Own Checks and Balances
Every day, we talk to organizations far and wide that are unable to scale their data operations without running into bottlenecks that can cripple their business. It’s clear that analysts need to be able to fully self-serve: when analysts can write their own data quality checks, the business can begin to scale with reliable, high-quality data.
Now, analysts who have the data domain knowledge can author their own checks, analyze incidents, and fix issues so that data stays trustworthy and available. Data engineers are no longer a bottleneck and data analysts are no longer under-served. In fact, data engineers are free to do what they do best: embedding data reliability checks-as-code into data pipelines using Soda Core, our open-source framework, powered by SodaCL.
Let me describe the new capabilities that are now available in Soda Cloud Preview Mode.
- Soda Agent, a managed Soda Core service. Soda Agent is a secure, cloud-native infrastructure component that organizations can deploy across clouds and on-premises. It’s the first of its kind to enable business domain experts to get involved in data quality workflows while ensuring best-in-class security. Soda Agent is hosted within your own environment, and no record-level data leaves your premises, private cloud, or public cloud. It performs its operations using Soda Core, the industry-leading open-source data reliability and quality framework developed and maintained under the Apache 2.0 open-source license.
- Self-serve. Data quality needs to be made available to the people who really understand the data. It is no longer, and never should have been, the purview of engineers alone to understand data quality expectations. It’s the business domain experts who need an active role in managing the data they use to meet the evolving needs of the business. Self-serve data quality reduces the engineering bottleneck when it comes to creating reliable, high-quality data products. With one language that is human-writable and human-readable, everyone on a data team can define the checks that describe what good data looks like and address specific business issues across multiple business domains.
- Automated data source onboarding. With Soda Agent, automatic data source onboarding is now available in Soda Cloud. As data sources continue to grow in number and complexity, uniting and sharing good-quality data lets businesses get the most value from that data.
- Automated monitoring. To monitor data quality at scale, Soda now offers automated monitoring of key data quality metrics for your datasets. Automated monitoring covers checks like freshness, volume, and schema. Teams can add further default checks to the configuration.
- Data sharing agreements. With Soda’s self-serve ability to create data agreements, data providers and consumers can clearly articulate the quality expectations for data products as they are created, and as they evolve into trusted sources of truth. With the agreements workflow, users can define data quality expectations that stipulate the freshness, accuracy, and completeness of data products, so that consumers can rely on the data they are working with and data product teams get alerts about quality issues that they can quickly investigate and address. Data domain teams can collaborate easily, track agreements, and ensure that data is, and remains, fit for purpose.
- SOC 2 Type II accredited. Soda is accredited against an industry-leading standard for the security, availability, and confidentiality of customer data. Our SOC 2 Type II certification gives our global customers peace of mind that our internal controls, systems, and policies are designed and implemented to ensure the highest level of security and compliance when it comes to managing their data.
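To make the automated monitoring checks above a little more concrete, here is a minimal Python sketch of what freshness, volume, and schema checks test for. This is an illustration only, not how Soda implements them: `DatasetSnapshot`, `run_default_checks`, the 50% volume tolerance, and all field names are hypothetical and chosen purely for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DatasetSnapshot:
    """Hypothetical summary of a dataset at scan time (not a Soda type)."""
    row_count: int
    columns: dict[str, str]      # column name -> data type
    latest_record_at: datetime   # timestamp of the newest record


def check_freshness(snap: DatasetSnapshot, now: datetime, max_age: timedelta) -> bool:
    # Fresh if the newest record is no older than max_age.
    return now - snap.latest_record_at <= max_age


def check_volume(snap: DatasetSnapshot, previous_row_count: int, tolerance: float = 0.5) -> bool:
    # Pass if the row count changed by at most `tolerance` (as a fraction)
    # compared with the previous scan. Assumed rule, for illustration only.
    if previous_row_count == 0:
        return True  # no baseline to compare against
    change = abs(snap.row_count - previous_row_count) / previous_row_count
    return change <= tolerance


def check_schema(snap: DatasetSnapshot, expected_columns: dict[str, str]) -> bool:
    # Pass only if column names and types match the expected schema exactly.
    return snap.columns == expected_columns


def run_default_checks(snap, now, max_age, previous_row_count, expected_columns):
    """Run all three illustrative checks and return a name -> passed mapping."""
    return {
        "freshness": check_freshness(snap, now, max_age),
        "volume": check_volume(snap, previous_row_count),
        "schema": check_schema(snap, expected_columns),
    }
```

In practice, a team would express checks like these in SodaCL rather than hand-written Python; the sketch only shows the kind of question each check answers.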
Free With Your Thresholds
In Soda Cloud, business domain experts use the agreements workflow to create an agreement. Think of this as a contract between the business domain expert and the upstream teams, such as data engineers or data producers. It takes just five steps for data teams to define and align on data quality expectations:
- Select a data source
- Write checks
- Identify stakeholders
- Set notifications
- Schedule when to run a scan
Check out our docs to learn more.
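As a rough way to picture the five steps above, the sketch below models an agreement as a simple Python data structure and reports which steps are still open. Agreements are created in Soda Cloud itself, so everything here is an assumption made for illustration: the `Agreement` class, its fields, the example dataset and column names, the email addresses, and the cron-style schedule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agreement:
    """Hypothetical model of an agreement; not a Soda Cloud API."""
    data_source: str = ""                               # step 1
    checks: list = field(default_factory=list)          # step 2
    stakeholders: list = field(default_factory=list)    # step 3
    notifications: list = field(default_factory=list)   # step 4
    schedule: str = ""                                  # step 5

    def missing_steps(self) -> list:
        # Map each step to whether it has been filled in, in workflow order.
        steps = {
            "select a data source": bool(self.data_source),
            "write checks": bool(self.checks),
            "identify stakeholders": bool(self.stakeholders),
            "set notifications": bool(self.notifications),
            "schedule when to run a scan": bool(self.schedule),
        }
        return [name for name, done in steps.items() if not done]


# Example draft: only the first two steps are complete.
draft = Agreement(
    data_source="analytics_warehouse",  # hypothetical data source name
    checks=["row_count > 0", "missing_count(email) = 0"],  # illustrative check expressions
)
print(draft.missing_steps())

# Filling in the remaining steps completes the agreement.
draft.stakeholders = ["analyst@example.com", "data-eng@example.com"]
draft.notifications = ["notify stakeholders when a check fails"]
draft.schedule = "0 6 * * *"  # assumed cron-style schedule: daily at 06:00
print(draft.missing_steps())
```

The point of the structure is that an agreement is only complete once both sides of the contract are captured: what is checked and where, and who is told about it and when.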
At Soda, we believe that we’re building the most robust platform for data teams. It’s very exciting to place these powerful new features into the hands of our customers and community in preview mode. We’re looking forward to seeing how they put them to the test and get closer to self-serve data quality management, a capability that will enable organizations to scale, conquer data issues across the entire business, and create end-to-end observability.
Get On Board!
If you’re an existing customer and user of Soda Cloud, please get in touch with our team and we’d be happy to get you going on the road to self-serve data quality management. We can’t wait to show you how we can free your data engineers and empower your data analysts.
Soda Cloud is expected to be Generally Available later in 2022, following the (European) summer break.
It’s a great time to be working with good data.