In 2021 I had the pleasure of getting to know and speaking with Zhamak Dehghani, Director of Emerging Technologies at ThoughtWorks, in season one of the Data Dream Team series. Zhamak is a software engineer and architect who is (in)famously known as the founder of the data mesh concept, a paradigm shift in how we manage data-driven value at scale.
I was happy to welcome Zhamak back to The Soda Podcast, following our first conversation in 2021, which I would now categorize as an introduction to data mesh. This episode is different to others in the Data Dream Team series. When Zhamak and I talked about this episode, it was clear that she didn’t want to have just another conversation introducing data mesh. We also had the good fortune that the universe was aligned with our desire to delve into data mesh, with her freshly published book, Data Mesh: Delivering Data-Driven Value at Scale, giving us a great base to have an in-depth discussion and not just another interview.
Our discussion happens over two parts - we tallied over two hours of good conversation and so thought it best to break it up.
Part 1
In part one, we reflect on what’s been happening since we first spoke.
We reconnect and speak during Zhamak’s three-month sabbatical, which ThoughtWorks gives to all employees (at the time of print) on their ten-year work anniversary. A sabbatical is typically a time people can take to reflect, consider, and appreciate their work and life. Since we first spoke, Zhamak has had the time to finish her book, reflect on the impact of data mesh, and think about what’s next.
I really appreciate Zhamak’s desire and ability to criticize and critique the idea. “You must disassociate yourself from the idea. The idea will have a life of its own, will get adopted, will get used and misused and evolve. And that shouldn't be a personal concern.”
When you have a great idea, it tends to land on a hype cycle. “I think we are still early days in that rising up to the top of the hype cycle.” But she doesn’t think people are just adopting data mesh because it’s on a hype cycle. “In reality, what has happened is that there was a pain and a problem that we hadn't spoken of. Data mesh surfaced that problem and that challenge, and also proposed an alternative [...].”
I find the history of the industry to be fascinating. The technology industry follows a repeating pattern, a kind of boom/bust cycle of ideas and technologies. I asked Zhamak to imagine the scenarios in which data mesh suffers an ignominious defeat.
Zhamak: “We went too fast and burned ourselves really bad. In both experiences, individuals and companies tend to blame the methodology or technology for any problems. I feel that if people are looking for short, quick solutions today, and we don't go through this thoughtful and perhaps a longer term process of evolving and making the shift, the paradigm shift, and pick up a solution off the shelf, retrofit it, jam a technology into the organization, and then say, well, that didn't work. And we blame the paradigm and we move on to the next buzzword, to the next buzzword. Buzzwords in the data space are very short-lived, right?”
I worry that data mesh needs smarter-than-average developers, and that we will task them with projects that will eventually bore them. Zhamak’s response: “[...] software engineers are fantastic problem solvers. You've got to throw the right problem at them. And what data mesh says with domain-oriented data teams and domain-oriented, cross-functional teams is trying to throw the right problem.” By giving software engineers and data engineers an interesting problem, we’ll keep them engaged.
A missing part of the data mesh equation is the ROI. There hasn’t been much talk about how long it takes to start getting value from data mesh. Zhamak clarified, “so yes, so it's not weeks, it's really months and years to get the platform and infrastructure in place to materialize and exploit, get value from that investment at scale. And I really hope that the future technology shortens that lead time and investment.”
Part 2
There have been a lot of introductions to data mesh out there, including my own, but there aren't too many that delve deeply into the subject. There are common criticisms of data mesh that I have seen online and heard in conversations. However, I haven’t seen much of this covered in a podcast. In this part of the podcast, Zhamak and I discuss the finer points of data mesh and its criticisms, including my own. This is an episode you won’t have seen anywhere else.
Zhamak was working on her book when we first talked in season one and the book is out now. We can see the fruits of her labor. One of the things I loved about her book is the diagrams. You can tell she spent a lot of time working on them and perfecting them.
Zhamak: “I always try to convey a concept in different ways. And for me, I'm a visual learner. And I guess I'm a conceptual thinker and to be able to conceptualize complex or demonstrate complex ideas in figure and shapes and their relationship [...].” Make sure you check out her book and diagrams. They make the concepts around data mesh easier to understand.
Writing a book isn’t easy, and the work doesn’t end once the book is out. The book lives on in other people’s minds. Zhamak: “[...] reading reflections, criticism, or constructive feedback, it's the most wonderful, [...] I highly recommend that if people are reading it, share back what you think and what's missing or where the conversation needs to go. One of my intentions behind the book was elevating the conversation beyond what it is and really moving on into how to make data mesh accessible to a wide range of organizations, not the ones that have a lot of money to spend on the technology.”
I’ve been thinking about the times that data mesh is applicable, or the right choice for companies. For both of us, it all comes down to complexity.
Zhamak: “There is a complexity, inherent complexity in building distributed systems and even more complexity building a data-oriented, distributed system or ecosystem. So unless your business has an equal inherent complexity, I wouldn't pick data mesh as a starting point.” This advice isn’t limited to data mesh. “I would say it's the same advice as I would've given, you know, 10 years ago, seven years ago, people wanting to do microservices. If your business is small, if the number of diverse sources and teams where the data originates from, the diversity of use cases for your data are small, perhaps a monolithic, centralized solution of a lake or a warehouse with that long pipeline, value stream pipeline that we talked about, perhaps that's okay.”
There are a few things I disagree with Zhamak on. In the book, she recommends that teams bring their own compute. Zhamak’s response was: “I think ideal space that you want to get to is where business domain or cross-functional domain teams are just focusing on the business problem at hand, right? The necessary elements of software and data that is directly impacting the outcome.”
I’ve experienced non-data engineering teams trying to make data engineering decisions. In data mesh, a downstream team can make their own decisions on which technologies they use. In my experience, these teams get the decision wrong more often than not. I asked Zhamak her opinion and she answered: “The self-serve platform of data mesh is a product, is probably multiple products, right? And having a platform product owner, which is one of the roles that I talk about briefly in the book, is really important, because you're right that the end users that are using, they're focusing on curating the playlist, perhaps are not the people to be in charge of running a cluster of, I don't know, Spark or whatever compute they're using, or even choosing what could be the best technology right now.”
Oftentimes these downstream teams are software engineers. There is a difference between software engineers and data engineers. I think a key difference is that the data engineer comes from that software engineering background but also has an in-depth understanding of data. Zhamak says: “[...] if your software engineers don't care about data, you shouldn't do data mesh. [...] In my mind, future generalists will be able to work with data and create and share data through data products, or use them for feature engineering and machine learning training when the model has already been developed by specialist data scientists. Essentially, they use AI as a service.”
This is a future state I think will happen too.
Zhamak, in both the podcasts and her book, often talks about the future state of data technologies. It’s almost as if she lives in the future. Meanwhile, teams have to work in and deal with the here and now. Zhamak: “I say you are very right. According to StrengthsFinder 2.0, I am a futurist. It is one of my strengths, I suppose, according to them. But living in the now, I think let's, you know, let's, we don't have to solve the global scale data mesh problem today. We have to solve an enterprise or even a domain within an enterprise scale problem. And let's look at the technology we have today, let's find the ones that work better, let's keep pushing vendors.”
If you’ve been looking for a deeper perspective on data mesh (from the founder herself!), look no further than this discussion. Be sure to check out both parts of the interview. I really enjoyed it and I am already excited to see what Zhamak does next!