Zhamak Dehghani is a software engineer, architect, and founder of data mesh. Zhamak founded the concept of data mesh as the paradigm shift needed in how we manage data at scale. Meet Zhamak and learn how data mesh will help organizations achieve data-driven value at scale.
A thoughtful conversation examining the paradigm shift and the unlearning required to build a data-driven organization at scale. Hear Zhamak, the founder of data mesh, discuss what data mesh is and what it isn’t. This conversation provides insights, failsafe tips, and inspiration to use data to augment and improve business and life.
Welcome to the Soda Podcast. We're talking about the Data Dream Team with Jesse Anderson. There's a new approach needed to align how the organization, the team, the people are structured and organized around data. New roles, shifted accountability, breaking silos, and forging new channels of collaboration. The lineup of guests is fantastic. We're excited for everyone to listen, learn, and like. Without further ado, here's your host, Jesse Anderson.
Hello, and welcome to the Data Dream Team Podcast. My name is Jesse Anderson. With me today, I have Zhamak Dehghani, she is known as the person who created the data mesh. We're going to get deep into data mesh and learn more about that, and why even management should know about this. Zhamak, welcome to the show. Would you mind introducing yourself a little bit more?
Hi, Jesse. Thank you for having me. I'm Zhamak, I work at ThoughtWorks as the Head of Emerging Technologies in North America. As you mentioned, I founded the concept of data mesh, and that takes a lot of my time, evangelizing it, and I also work with our clients in implementation and the transformation for adopting data mesh for their organization.
Excellent. And what sorts of things were you doing before ThoughtWorks? Or is there a time not before ThoughtWorks?
There is a time before ThoughtWorks. I've been in the industry for 20-something years. I've been a programmer, I've been an architect, I guess, a technologist of all forms and shapes and sizes. During my 20-something years of experience, I've worked in R&D, I've built hardware, really, really small computers that you put in devices like pens and things like that, I've worked in large-scale architecture, building networking and interoperability to connect many nodes and many computers, and observe them and make sure they all work. Yes, I guess I've worked in all layers of the stack - from hardware and firmware, up to enterprise architecture on the microservices and operational side, and now, I guess, putting my nose into data and BI, and causing a little bit of trouble there. But for the last three, four years I've been immersed in adopting modern engineering practices and modern digital organizational design, considering data and ML as an integral part of organizations, and that's been with ThoughtWorks and ThoughtWorks' clients.
People may not realize this, but when you're a consultant, instead of having a pretty myopic view of things, you get to see tons of industries, tons of companies all over the world.
I completely agree with you. I've been with ThoughtWorks, I think, over eight years now, about nine years. But before that, I worked for product companies; we were deep tech, building products. Your view becomes quite narrow in the space and the problem that you are dealing with, and you go very deep. When you are a consultant, and particularly at ThoughtWorks, we're usually involved in the execution. So we're building, whether it's a digital transformation, or platforms, or infrastructure, and we work in many different areas.
But as you said, it's a fantastic vantage point, especially for people who are pattern recognizers. I am a big-picture person, and I see patterns and try to connect the dots, and you have this great point of view of seeing problems that just repeat themselves, maybe in a slightly different shape or form, in a different industry, in a different organization. But once you see these problems and patterns repeating, you can't help but also find solutions that often solve those problems, maybe in a different or adjacent space, and try to adapt them - create magic at the intersection, by taking solutions that have worked somewhere else similar, bringing them to a new space, and adapting them. It is a fun place to be.
How does it feel to be the creator of this?
Exhilarating, exciting, and terrifying at the same time. In fact, the genesis of it was a question, a hypothesis that I put forward to challenge some of the assumptions we had made, about three years ago now, I think. And the night before I published the article - I had talked about it at conferences, but the night before I wanted to publish the article - I remember my manager at the time tapped me on the shoulder over email and asked whether I really wanted to do this, because it was quite a different point of view, challenging some of the status quo we had accepted for decades, and she wondered if I would be okay with possibly getting attacked.
So it is, I guess, terrifying in a way, that you're really putting yourself out there as an individual, expressing the things you have seen work and not work, and building up the courage to question. And, of course, we can't just question, we always have to come up with a solution as well, and offer a solution that you've been refining or building over a short period of time. It's not that we've been doing this for decades, it's only been a few years.
So on one hand, it's terrifying, and on the other hand, I think I felt really encouraged and exhilarated, because the very first session, very first conference that I spoke at, I had tons of people that came to me after the conference and said, "All of these pain points that you described are my pain points. You are describing my life and my career in this space." So it really resonated, the concept resonated with folks, and it's really gone viral, and been... Of course, like anything, as you know, when something sees a very rapid adoption, sometimes it's misrepresented or misused. But nevertheless, it has captured people's imagination. That's very exhilarating, because I can see the possibility of all of these wonderful solutions, and the changes that will come after. And data mesh might be just a catalyst in that transformation and change, but there is a possibility of change, and I'm really excited about that.
Tell us about how you actually came up with the concept of data mesh.
It all started with observing the problems that we were seeing with our clients, the demands that they had. At the time, I think in 2018, I was the technical director for a portfolio of our clients on the west coast of the US, in Silicon Valley, in that area. And as you can imagine, a lot of these clients had invested quite a lot in their data and analytics; they were technologically quite advanced organizations, they weren't new to data or analytics. But despite the investments they had made, they weren't really getting the results. They had hired quite a lot of data analysts and data scientists to improve their business with data and AI, but a lot of them were waiting to have access to data.
So it really started with questioning why that's the case, why, despite the investment, we're still not seeing the results, and then going a bit deeper and scratching the surface on some of the deep assumptions that organizations had made. And in fact, I was coming from a very different part of the industry: I was coming from distributed system design, having seen organizations scale out their operations in the operational world, in microservices, and in the technology world, the applications and services that run the business. At that point, I had gone through the evolution of the mid-to-late two thousands, where we saw this massive movement toward digitalization, where every function of the company started adopting digital touchpoints and building applications, and I had seen how we had coped with that scale, and thought about why we're not approaching the problem of data by applying similar techniques. And those techniques were really around alignment of business with technology. In the world of data and BI and analytics, we actually haven't done that. We separate the data from the rest of the technology and the rest of the organization.
So in short, really, the nucleus of this idea came from questioning some of the, I guess, assumptions we had made around how we should organize ourselves and what the technology around managing data should look like, comparing that to adjacent areas in technology where we had managed to scale our practices, and adopting those practices in the world of data to see what comes out of it. And from there, we started looking at all of the data projects we were doing globally, and looking at the ingredients of success. Where are we successful? Where are we seeing results, and seeing results fast, with data and ML? And what are the practices we're using there?
So it was looking at, again, cross-cutting all of these problems and solutions in the adjacent space, in microservices and distributed architecture, and looking at successful stories that we had, and coming up with this, most importantly, question, not so much of an answer. And my initial writings are more around pointing to some of the challenges and some high level solutions, not really a prescribed answer.
It's great that you were able to have that purview, to see all this data and put it together. That's the difficult part that people don't realize, I think. So if you were to explain data mesh to a management person, how would you explain it?
Sure. I would say that data mesh is really an approach that affects both organizational structure and architecture in managing and sharing data for analytical use cases. So at a high level, it's a socio-technical approach to managing and sharing data, most importantly, for analytics. But at its heart is a decentralized approach. It's an approach that believes in organizing around tech-aligned business domains - such as, if you are a retailer, an order management or ecommerce team, a payments team, a retail team. These teams have the accountability and the support to share data and use data for analytical purposes.
So it's not an approach that externalizes that responsibility to some data part of the organization. It really makes it an intrinsic element of everything we do. If you're an application development team, if you are a microservices development team, you're responsible not only for building applications around the business, but also for sharing the data that supports building analytics on top. So the core of it is the idea of decentralizing data sharing to the domains. And then once you do that, there are a few other principles behind it to support all of the challenges that arise from doing so. Once you decentralize data management to different teams, we have to address this concern of data siloing. If I'm the ecommerce team, and I have the most advanced recommendation or personalization AI to give the best recommendations to my users, and I'm using the data that my ecommerce team generates, why should I have any incentive to share that data with anybody else? Why should I share this data with a retail team that's trying to personalize the experience of the customer in the shop?
The second pillar is about shifting our perception and understanding of data from an asset that we need to collect - measuring success by the number of tables or bytes - to data being a product, measuring success by how well this data is shared, adopted, and used by other people.
The third principle is really around how we empower these domain teams to use data for analytics, or develop and run ML internally within the teams. What sort of platforms do we have to build? So it's a pillar around self-serve platforms to enable and empower these autonomous teams that share their data.
Ultimately, the last pillar is around a model of data governance that allows data sharing in a decentralized fashion. How can we have these different pieces of data being provided by different teams, and yet have some measure of interoperability between them? How do we make the data interoperable? How do we make sure these data products respect privacy? How do we make sure access to them is secure? How do we make sure we make a set of global policy decisions that can be encoded and embedded into every data product? So there is a pillar around federated computational governance, based on this idea of taking policies, contracts, and decisions, building them as automated code, and embedding them into every data product. That's the only way I can see a model of governance working in a decentralized fashion. That was a very, very, very long answer. But in short, all of these pieces are necessary so that we can decentralize the responsibility of sharing and providing secure, trustworthy, and usable data, to build data-driven organizations at scale.
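To make the idea of policies as embedded, automated code a little more concrete, here is a minimal illustrative sketch - not from the conversation, and with purely hypothetical names and a hypothetical policy: a globally agreed rule (masking PII, in this example) expressed as code and applied by every data product at serving time.

```python
# Illustrative sketch only; the policy and all names here are hypothetical.
from dataclasses import dataclass, field

PII_COLUMNS = {"email", "phone", "date_of_birth"}  # example of a globally agreed policy input

def mask_pii(row: dict) -> dict:
    """Global policy: redact personally identifiable fields before data leaves the domain."""
    return {k: ("***" if k in PII_COLUMNS else v) for k, v in row.items()}

@dataclass
class DataProduct:
    """A domain-owned data product that applies shared, federated policies at serving time."""
    name: str
    rows: list = field(default_factory=list)
    policies: list = field(default_factory=lambda: [mask_pii])  # decided federally, enforced locally

    def serve(self) -> list:
        served = self.rows
        for policy in self.policies:          # every embedded policy runs on every row served
            served = [policy(row) for row in served]
        return served

orders = DataProduct("ecommerce.orders", rows=[
    {"order_id": 1, "email": "a@example.com", "total": 42.0},
])
print(orders.serve())   # [{'order_id': 1, 'email': '***', 'total': 42.0}]
```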
Excellent. And you described this as data mesh, as being socio-technical. I find, in our engineering technical environments, we think we just put some technology in place, and that changes things magically. What would you say to somebody thinking along those lines?
First of all, I completely agree with you. I think, particularly on the data side of the house, in data processing or ML, a lot of our decision making is really fueled by: what's the next silver bullet I'll be sold? What's the next silver bullet technology I'm going to adopt, and then it'll be great? And in fact, in a lot of the client conversations I have, when we ask them, "Okay, what's your data strategy?" - "Oh, we're going to have a stream here, and we're going to have a lake there, and a warehouse there," just a menu of technologies they're going to adopt.
Of course, technology is an enabler, but it's just a means to an end. Ultimately, the purpose of the technology is the empowerment of the people and the organization, and at the top of that hierarchy is getting value for our customers and our partners, optimizing our products for their experiences. So I think we have to have a bit of a top-down view of the situation, not so much this bottom-up view. Technology-driven decision making is a very bottom-up approach - yes, we need these enablers, but for what purpose? And people are a big part of the ecosystem of an organization. We know this from what we've seen with Conway's law: your organizational structure affects your architecture, and in the same way, your architecture affects your organization. These are very tightly related concerns. Even though I'm an architect, I'm a technologist, I used to be a programmer, and I get excited about technology, this couldn't be just a technology conversation, because it would not get us far.
Exactly. I saw the same things, and had the same conclusions. You've talked about how your value and belief system has played a significant part in your life and choices. Would you mind sharing with our listeners what your values are, and how they've permeated so strongly about what you do and how you do it?
Sure. This is a really good question. Maybe I'll reflect on some of the fundamental values that I have in my life, and in my technology decisions as well, and the ones that apply to data mesh. Autonomy, a sense of freedom with responsibility, is a big one for me. I actually grew up under a very authoritarian government, so that sense of freedom and autonomy is somewhere very close to my heart. But I know that with it comes a sense of accountability and responsibility. So thinking about the whole, while thinking about individual freedom and autonomy. And I think that's reflected in data mesh as well: we can get value as an organization by empowering individual teams to care about data, by incentivizing them to care about data, by educating them to use data in a way that benefits their application development, their feature development, and how they organize themselves and their workflows.
So empowering, incentivizing, and educating individual tech-aligned domain teams - teams that focus on a particular aspect of the business and the technology for it - while putting a governance structure in place that makes sure those people have accountability and responsibility with regard to the data, that they have a set of KPIs and incentives around delighting the data consumers, that they have incentives to share that data, and at the same time empowering them with a platform. So autonomy with responsibility, and that sense of freedom to be able to move fast, but move fast responsibly - I think that's very important to me. And also a sense of playfulness, and a sense of experimentation.
Particularly with ML solutions, or data-driven solutions, we always start with a hypothesis. We're not exactly sure what's hidden in that data; we want to explore it. That's very different from the imperative, logic-oriented programming that a lot of us are used to. With logic-oriented, imperative programming, we have an outcome we want to provide - I don't know, a new type of service, selling shirts to our customers - and we go and code it, we build a bunch of logic. If a customer did this, then propose them this product; if they did that, take them to the sales page. The way we program our solutions with the imperative programming model is very deterministic.
When it comes to the data-driven programs that we write with ML, it's a bit more hypothetical, and starts with a question, is it possible that I make a recommendation to my users based on their past purchases, or based on the purchases of their family members? Is it possible that would influence them in a way that they find the product that they like, or not? So it all starts with a question, and you explore answering that question by exploiting the past data that you've had. But if it takes so long for you to get your hands on that data, to exploit it, to write the models, train the models, observe them, that experimentation is dead already before it gets started. You don't even dare to ask the question, because the cycle to go from the idea and the question, to actually validate it, is so long that you don't even bother.
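As a toy illustration of that contrast - purely hypothetical data and names, not an example from the conversation - the sketch below puts a hand-coded, deterministic rule next to a recommendation learned from past purchase data.

```python
# Illustrative contrast only (hypothetical data and names): a deterministic,
# hand-coded rule versus a recommendation learned from past purchase data.
from collections import Counter
from itertools import combinations

def imperative_recommendation(customer: dict) -> str:
    # Imperative style: the outcome is fully spelled out as explicit logic.
    if customer["last_viewed"] == "shirt":
        return "shirt"
    return "sale_page"

# Hypothesis-driven style: "can past purchases tell us what pairs well together?"
past_baskets = [["shirt", "tie"], ["shirt", "belt"], ["shirt", "tie"], ["shoes", "socks"]]
co_occurrence = Counter()
for basket in past_baskets:
    for a, b in combinations(sorted(set(basket)), 2):
        co_occurrence[(a, b)] += 1
        co_occurrence[(b, a)] += 1

def learned_recommendation(item: str) -> str:
    # Recommend whatever was most often bought together with `item` in the past data.
    candidates = {b: n for (a, b), n in co_occurrence.items() if a == item}
    return max(candidates, key=candidates.get) if candidates else "sale_page"

print(imperative_recommendation({"last_viewed": "shirt"}))  # shirt
print(learned_recommendation("shirt"))                      # tie
```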
So making sure that we can have that intrinsic experimentational approach, fueled by data and powered by data, available to our teams, and encourage them and empower them. That's also, I guess, important to me. Sometimes it works, and sometimes it doesn't work. Failure is part of that learning.
Well, that's super interesting. I had a similar experience growing up - with something authoritative, not a government - and it changes you when you experience it that much: I don't want to be under this, and we can't put our teams under this. It's completely stifling in all different ways. So it's very interesting that you saw that and you talk about this, but how do we keep the training wheels on? That's great.
So you've started undertaking one of the more difficult, and probably unsung parts of life, and that's writing a book. It's going to be called Data Mesh: Delivering Data-Driven Value at Scale. It's really showing how much data is changing right now, and all the twists and turns. Tell us about how that book came about, and the impact it's having on your life.
Sure. You have gone through that experience successfully - you got to the point that many of us don't get to, actually publishing the book. So congratulations for that. And you probably have some empathy for what I'm going through. It actually started with a bunch of blog posts. I had this urge to share. I spoke at a few conferences, wrote the first blog post, and it had a lot of readership, and more questions arose. So as I was developing these ideas with our clients - actually building platforms, building technology, seeing the organizational challenges - I had this series of blog posts that I was writing late last year, and Martin Fowler graciously shared his platform with me. He has a fantastic platform, and, of course, he curates the content that goes on it. He has this idea that the content has to stand the test of time. It can't be a fad. It has to be something that people can come back to in five years and read. But he was kind enough to host my content.
So I kept sending these long articles to him, and he would come back and say, "No, no, no, no, this is 10 different blog posts. You've got to break this up." And that was the point I realized that, okay, maybe this is actually a book. Maybe this is not just a few blog posts. That's when I started working with O'Reilly as the publisher, first turning the series of blog posts that I had lined up to publish into chapters of the book, and then going from there.
I think I'm in the last mile before the last mile right now. I'm just finishing the last part of the book, which is going to be a short part on execution and strategy, and then going into the copy edit. But during the process, I've been putting out early releases, unedited chapters. So I've been getting a lot of feedback - positive feedback, negative feedback, critical feedback. This is all engagement, great engagement, and that propelled me to move forward and write the next chapter and get there. So it's been a year. The days have been long, and the year has been short, but I'm looking forward to the end of it.
There's three parts to writing a book. There's the writing, then there's the editing, and then there's the marketing. Because if you took all this time to write your book, and nobody knows about it, sometimes your ideas don't just go viral on their own. But hopefully, yours does. Hopefully, you have great success there. And I think that's what we're trying to do as authors, we're trying to get our ideas out there. So congratulations on that. Do you have an idea of when it will be completely published? It sounds like it's out there a little bit.
Yeah. So we've got, I think, about six chapters pre-released, and the O'Reilly platform allows you to put out chapters unedited. So people who are interested in getting their hands on the early content, or at least looking at the table of contents, can do that on the O'Reilly platform. The digital version, fingers crossed, if everything goes as planned, and I don't get sick, and my child doesn't get sick, and all of that, will be out late this year, just before the end of the year. The print version will be available in January.
Oh, congratulations. So the book is going to be linked on dreamteam.soda.io, and you can go get it there. I will go get it there, too. I'm excited to read it.
Thank you. Would love to hear your feedback.
I've done a lot of tech reviews, and I'm always trying to help out. I love seeing it, especially in early forms.
Let's break down some of the myths before we get started. There are quite a few different interpretations and degrees of understanding of what data mesh is. How do you think it should be understood?
That's a really good question. I think it has to be understood, at least, holistically. We often have people that come from the technology side and look at it, and they go, "Okay, how can I just wire the existing technology that I have in a different way that it feels a little bit more decentralized?" And you have people that come from the organizational side and say, "Oh, this is just an organizational change, and maybe I just use the technology I have, but I organize my team differently."
So I think it needs to be understood a bit more holistically. That's why I call it a socio-technical approach. So understand the different facets of it, the organizational facets as well as the technical facets. One of the other, I guess, challenges with understanding a concept that is multifaceted is that not a lot of people cross those boundaries - not a lot of people look at both the infrastructure, the platform, and the technology, and the organization. So maybe, if organizations want to understand it and adopt it, there's a group of people that need to come together, complementing each other's skill sets and backgrounds and experiences, to look at the big picture and understand it holistically.
I guess the other thing I would say is, try to also understand it a bit more deeply, and its implications, rather than superficially. Because superficially, we look at it and say, "Yeah, all of these pillars are intuitive, and make sense to my organization. It does make sense to distribute responsibility, so I can scale out, I can have parallelization, I can remove points of synchronization." These are just the basic fundamentals of how to build scaled-out systems.
Now, that system could be a technical system, a human system, an organizational system. But if we don't go deeper and look at the implications of those decisions around decentralization, and don't have a really good answer both for the organizational operating model and for the technology underneath - if the technology itself doesn't lend itself to providing cohesive and reliable querying and access to data in a decentralized way, and we just prematurely build something decentralized - we're going to see pretty bad consequences a couple of years down the track.
So understand it holistically, look at all the players together, bring the right experts into the conversation, and go deep on each of these. Go deep on the consequences of that style of architecture, and look at the technology. And I try to do that in the book: I try to both go broad and introduce the pillars, the reason for their existence, and the pros and challenges, and also go really deep on the architecture. Every portion of the book is for a different persona. Most importantly, I would say, if you try to understand it, if you try to adopt it, start with the problem it tries to solve. Why is data mesh for you? Why should you care? What problem is it exactly trying to solve? Is that your problem? And also understand that, at this point in time, there is a level of investment required. There's a little bit of a build-out that needs to happen. Is this for you now, at this point in time, or is your organization the type of organization that needs to wait another five years before it can adopt it?
So let's play that out. Let's say you have two different types of organizations, one that needs to wait, and one that's doing this now. What's the value in either waiting or doing this now?
Sure. I think the ultimate value is the same for all kinds of organizations. The ultimate value that organizations take away, implementing data mesh, is to shift and reform and change themselves in a way that they can really embed intelligent data-driven decision making into every aspect of their business, and do that at scale, do that without compromising ability to change, without compromising ability to move fast. So sustain agility, sustain the speed of change, and yet, embed data-driven decision making in every part of your business, and do that globally at the level of your organization. So that's the Nirvana, that's where we want to get to.
Organizations that can do that today fall into the lead-adopter, innovator part of the adoption curve for a new approach. So with that comes a set of aptitudes and a set of capabilities that organizations doing it now need to have. Well, to start with, they need to have the problem of scale. If you don't have it, don't bother. They need to have an appetite for taking risks, learning, and refining - perhaps on their own, maybe with a few fellow organizations. We haven't really reached the majority-adopter phase of the diffusion of this approach. So you have to have that appetite for risk-taking and experimentation.
Because it's a paradigm that has transcended existing technology, there is a level of investment needed. So yes, you will reap the value - hopefully, you will reap the value - but with that comes a level of commitment that you have to have the capability for. Organizations that wait, I think, fall into the persona of organizations that are okay with not reaping all the value we talked about, the benefits of scale and sustainability and so on. They're, perhaps, comfortable with moving slower, with still not being resilient to change - change remains difficult.
When I say change, I mean changes to the data landscape - everything from something as simple as changing the fields and models of the data, to a merger or acquisition of part of your organization. So you remain, perhaps, slower. Your ambitions around data, and how the data can be used, are a little bit more constrained and limited, focused on a few areas, so you don't get the scale of proliferation of use cases. However, you also don't take any risk, you don't invest, perhaps, in building things out, and you wait until other people have tried and failed, and more playbooks are available to you, and there is technology you can buy off the shelf. But that comes with the cost of opportunity loss, of course. And that's how, I guess, I'd describe it to organizations: is this for you? Do you fall into the innovator part of the adoption curve, or do you fall into the late-adopter part of the technology diffusion curve?
Excellent. What's a favorite data story that you've never told before?
Ooh, I don't know. There are many data stories, but maybe I'll tell a more recent one, one relevant to data mesh. I do have a lot of fun data stories from trying to apply data-driven decision making to my own personal life, and finding how painful it was. I won't share those ones. But I'll tell you why it's painful. I've built a few systems where I tried to predict my own behavior and the things that I like and don't like. And then I realized, actually, a big part of how these systems work is that you have to train them, you have to tell them about yourself. So you're a human in that training loop, and it's a lot of work on the human side. So we'll keep those for maybe another time. But the story that's related to data mesh, that was a moment of seeing the results: we worked with a client in the healthcare space, quite large, and COVID hit. At the time, we had been working with them since, I think, April 2019, around that time. We had been working with them for half a year, and we had built the initial foundation of the platform. We had built the structure of the teams and data products that were very domain-oriented, and there was a sense of autonomy and decentralization being built in. The platform wasn't totally mature, but it was mature enough to be able to produce data products - this domain-oriented, shareable, trustworthy, usable data for a variety of use cases, anything from analytics to machine learning. And then COVID happens, and this organization tries to engage with their members, with their audience, using chatbots, chatting on the internet. But they needed to relay every piece of information they were learning - about the symptoms, about the questions people were asking, the help they needed - very quickly to the providers, so they could get insight into what was happening in the industry.
The whole end-to-end solution - from putting their chatbot into the application and into the websites, at the points of engagement, which is not a very analytics-focused thing, it's a very operationally focused thing - to getting the content of those conversations to become data products, applying NLP on them to extract insights, and getting that to the providers so they could learn and respond. All of that was done in a span of a few days, maybe a week or two. Of course, there was a lot of hard work and late hours. But on the data product and data sharing side, and the analytics and ML, because of the investment in the platform, we could stand up a quick team around COVID conversations, independent of all of the other longitudinal health records, and clinical trials, and lab resources - all of those were separate. We had this domain-oriented decomposition and autonomy, and we had a platform that allowed these new data products to be built independently, end-to-end, delivering value. We could really put these up in response to something completely unforeseen and unpredictable, in a span of a few days, and start providing value for the organization.
So that was the point, that it was a confirmation that really, that's how we can get scale, bottom-up, creating this autonomy around high quality data sharing, right from the point of origin, in this case a chatbot, to getting that data in a form that can deliver meaningful insights to a provider, again, an operational point, and empowering those providers with intelligent information to deliver value to those members and patients.
That's an inspiring story. That's awesome. This podcast itself is focused on data dream teams. How do you think data mesh fits into that from an organizational structure point of view?
I'm excited to talk about teams, because, as I said, organizational structure is a big part of it. I think the way data mesh fits into that narrative is that it paints a very different picture of how we envision the teams and people coming together, both to share data and to use data for ML and analytics. And that picture is very cross-functional, very much oriented around how that data actually delivers value. So with data mesh, if we fast forward - say it's five years in the future, and a hypothetical organization has put data mesh together - in fact, there won't be any data team. Maybe there is a platform team underneath, but there is no isolated data team somewhere in the universe of that organization.
The data teams have really become embedded into the application teams, into the business and tech teams. So we have this biz-dev-data-ops cross-functional team where part of their work is purely functional, developing a feature; part of their work is around providing the data that is a byproduct of running those features and surfacing it as usable, trustworthy data that you can run analytics workloads on - which is a very different shape and form from the data they store today in their databases; and part of their work is to see how that data, and the data produced by other teams, can power intelligent features built into their application.
So I think one of the last silos, the wall that splits organizations around this bifurcation of "this is where we build applications, and this is where we do data analytics" - hopefully, that wall will come down, and every team is a data team, and every team is an application team. There's no bifurcation there. That's the ideal place to get to, and maybe the intermediate state doesn't quite look like that.
I agree with you. That's what I talk about in my chapter about data ops. I think we did ourselves a disservice by saying data ops was this small thing where we were changing a few things. No, we need to organizationally change things. My definition in the book was that it was an end-to-end team - the team could do end-to-end value creation. And it wasn't just value creation from the data, it was the application, it was everything. And I saw this as how we reduce that friction; it was overall friction reduction. Is that what you've seen, as well?
Absolutely. I think that's where we need to get to. Of course, if we think about why we made this decision, to separate things the way they are today - that decision was made because of a lack of evenly distributed knowledge around data and data tooling across the organization. Knowledge of how to use all these data tools, how to do data processing at scale, how to do ML training - that knowledge sat very centrally with a very few individuals who are so hard to find, and they were put into the data team. And perhaps it was a bit of a residue of business intelligence happening separately, for a very different purpose than running the applications of the business, so it latched itself onto that model of organization. But also, access to data tooling and data knowledge was limited to very few, so we centralized.
I've seen this centralization, decentralization, centralization, decentralization cycle every time we have a new trend in our technology, and I think that decision was right at that point in time, but we've long moved on from that point in time. And if there are organizations that really want to explore taking advantage of and exploiting all of this data everywhere in their solutions, then that separation just can't possibly work. And I see that even with early implementations of data mesh.
What I see is that the organization as a whole buys into this idea of decentralization, of distributed ownership of the data. Where they fail is in actually distributing it - really creating this responsibility and accountability and motivation for using data in every domain team. The organization doesn't wake up one day and say, "Let's imagine how we can use data in each of the business functions that we have."
If they do that - if they wake up one day and say, "Let's use data... Let's first get educated, of course. After a bit of education, let's imagine and visualize how our applications, the routing of our products, our logistics, our order management, our sales, all of these functions can change if we can exploit all this knowledge and data that we have about the interactions and touchpoints of our people and customers with our organization. And from there, let's see what ML we need. Let's then work backward and see what data we need access to."
If we start from that end result, then we see this motivation, those application teams wanting the data and wanting to be part of the data ecosystem. But it often doesn't work that way. The organizations say, "Oh, we wanted to have a little bit of recommendation maybe here, maybe a little bit of optimization of our pricing, or our seasonal product design." And they have a very niche few use cases for ML, and they've pushed those aside to a data science team sitting separate from the business, and that's good enough.
If you do that, why would your application team have any incentive to share data, to do the good things that you're talking about, that end-to-end? They're just not motivated about any part of that end-to-end, because they're not part of it; it's not intrinsic to what they do. So I think for that to actually happen, we need to change the Maslow's hierarchy of needs for our teams, to have data-driven application development and feature development as a mandate, almost, as a requirement for how they build applications.
So you shared a few best practices. What are some other best practices about data mesh you recommend?
Good question. I think, organizationally, really look at it as a strategy, as a transformation. We're really talking about a big change that affects your technology architecture, decentralized ownership, federated governance. So we have to have a good understanding of the big picture. As a best practice, have the stakeholders, have the buy-in of the folks who set the vision for the organization, almost. One of the most successful cases I've seen started with an organization whose CEO, in his vision, in his strategic initiatives, had data and ML baked into every one of them. Structurally, and from an investment point of view, they were investing in their data.
So we have to have those business drivers around the data. There has to be some element of top-down visioning around why data mesh matters to us organizationally, from a business perspective, and then working backward from that vision statement to, okay, what are some of the strategic initiatives that can benefit from data mesh? And we can also use those initiatives as a vehicle for executing this transformation. So we are transforming and building the platform while we're delivering value for those strategic initiatives.
That's exactly what we've done in the past, we look at, okay, you want to... This particular organization, you want to change the experience of every single customer that you have with data. Let's work backwards from that, look at the touch points that the customer has. Let's work backwards from that to see, what are the data driven business changes we have to put in place? And then work backward to ML, work backward to data products, work backward to the platform you have to build. So one of the, I guess, best practices would be, yes, look at this as a transformation, but to slice it, and iteratively build it and mature it, driven by the business value and the business initiatives.
The other thing I would say is that from very early on, bring in and engage those business domains. The antipattern that I see is that people buy into the distributed or decentralized idea, but they do it still confined to the walls of the data analytics group. So they decentralize, but within the data analytics group. The tech and business domains, the application development teams, are still oblivious to this whole event around data that's happening.
So go to the source very early on. You probably have a data warehouse, you have a data lake - many of them, perhaps. And some people think about data mesh as something downstream from their lake, not upstream. And data mesh is all about upstream. It's actually about peer-to-peer. So engage those business and tech teams, those application developers, early on in every data product that you enable.
The other best practice, which I think applies to really every transformation, is to approach this in an evolutionary fashion. And by that, I mean that, yes, you are incrementally delivering value, you're incrementally maturing the organization, but there is no way you know exactly how this path will play out. You will take many squiggly lines and reroute your plans. So have a way of measuring whether you are moving in the right direction. Have a compass. And that compass could be a set of litmus tests, a set of measurements that you take along the way to make sure that, yes, I'm taking all these detours because the business priorities changed this quarter, or we ran out of money here or there.
But I have this compass that tells me every step that I'm taking, still, I'm going closer to that target state of sharing value at scale, and getting value from data at scale. So define those tests and fitness functions, or KPIs, and track them along the way, so that you can navigate your way in an iterative and evolutionary fashion to this target state.
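As a rough sketch of what such fitness functions might look like in practice - the metrics and thresholds here are purely hypothetical examples, not anything prescribed in the conversation - they can be expressed as small automated checks run at each step of the journey:

```python
# Minimal sketch, assuming hypothetical metrics an organization might track;
# the threshold values are illustrative examples only.
from typing import Callable

metrics = {
    "domains_sharing_data_products": 7,
    "total_domains": 12,
    "median_days_to_new_data_product": 9,
    "data_consumer_satisfaction": 0.72,   # e.g. a survey score between 0 and 1
}

fitness_functions: dict[str, Callable[[dict], bool]] = {
    "coverage: majority of domains publish data products":
        lambda m: m["domains_sharing_data_products"] / m["total_domains"] > 0.5,
    "speed: a new data product ships in under two weeks":
        lambda m: m["median_days_to_new_data_product"] <= 14,
    "value: consumers report the data is usable":
        lambda m: m["data_consumer_satisfaction"] >= 0.7,
}

# Run the compass check: every detour is fine as long as these keep trending to PASS.
for name, test in fitness_functions.items():
    print(f"{'PASS' if test(metrics) else 'FAIL'}  {name}")
```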
So you talked about some teams that are having difficulty making changes, and I've... And sometimes it's technical, and sometimes it's ways of working. So how do we help people unlearn what they've come to know and love, and have dealt with for most of their careers, so that they can start adopting data mesh?
Yeah. I would recommend a book from a friend of mine, Barry O'Reilly, Unlearn. He has a fantastic book on the topic of unlearning. But I love that language you used, Jesse - unlearn. It's really about unlearning. I think we have to stop for a moment, suspend a lot of assumptions, a lot of the things that we thought were best practices, and trust a system to take us in a different direction. If we don't change direction, we end up where we're heading right now. So that suspension of biases, and questioning our biases, is really important.
What I've found, talking to a lot of different technologists playing different roles, is that people want the outcome - the outcome is not something they argue with. We're all striving to achieve the same level of getting value from data, but we have these cognitive biases. So when we see a solution, we try to match it to what we know and what we're comfortable with, and somehow twist it and do some jujitsu on it so it looks like what we have always done, and yet expect a different outcome.
I would think that we really need to inject some, perhaps, even new people, new expertise, into the teams, so that we can depart from those very traditional, biased ways of approaching the problems. I will give you an example - even with the folks that I work with. At ThoughtWorks, very early on, I worked with some of our most experienced data people, authors who have worked in the data space for a long time. And they were asking really good questions, but through the lens of the existing solutions they were familiar with. They would ask, "Oh, I need master data management. We've decentralized data, but I need to master my data."
So then my answer to that would be: why do you need to master your data? Let's go back to the beginning of why master data management matters. What is it about? So when you peel back this onion of solutions, and the accidental complexity we built along the way to create those solutions, and get to the core of the problem we were trying to solve - well, okay, in the case of master data management, it's about having a consistent view of a few core entities, so that when we try to do sales prediction, or product planning, or inventory planning, we're talking about the same thing when we talk about a product, or at least similar things.
So that problem remains. That is a very valid objective and need and requirement. But the way we have approached it - the concept of master data management, or the solutions to master data management - is through this very centralized, single canonical model, which has had a lot of challenges. So now we need to challenge a lot of those solution assumptions that we built and say, "Okay, how can we achieve access to master data for some of these entities, but in a very decentralized fashion?" And we arrive at a very different solution, and, hopefully, one that's more effective than what we've done.
So, yeah. Go back to really understand the objective - what are we trying to achieve - and try to forget many of the solutions that we built, or approach it based on a new set of laws or basic principles. And then I think it's actually a very exciting, and, for me, invigorating exercise, because now we're given a blank slate to be creative, to start from first principles, to apply the first principles of data mesh. You have a very good understanding of the objective, you've probably seen a lot of things that haven't worked, but try to imagine and create a new set of solutions, and that's exciting.
Well, that's one thing I find people are the most passionate about - data lineage, master data management - and it's solving a problem that we're putting some duct tape and hope on, instead of saying, let's solve it the right way. We should have solved it better, but we didn't.
I think it's not so much the right way or wrong way. I think the way we had approached, traditionally, master data management was meaningful for a state of entropy that organizations were in at the time. As in, in terms of the scale of the organizations, the dimensions of the scale, the dimensions of the business, the speed of change to the business, perhaps it was okay to think about it as one canonical model, one centralized solution, put all the data, and then model it into this multidimensional thing.
But if you fast forward to today, where the state of organizational entropy is in this really fast accelerated change, constant change, the amount of energy that you have to put in to sustain that solution is not proportional to the value that you get, and it's not proportional to the speed of access to the data that you want, to the real time access, to the latest version of this master data that you want. So perhaps the solution was more meaningful and relevant at a point in time. It's not relevant at this point in time, because we cannot ignore the state of change and the acceleration of organizational entropy, I think.
Everybody was thinking, we could have this point in time where we put this line in the sand, and everything's now good, and I'm not sure if we can really say that ever, with data. There's always going to be something wrong with it. Maybe we should just accept that and move on.
Yeah, completely agree. I would always say, don't work against the grain, work with the grain. Be in step with the organizational change, and embrace it. Embrace the chaos, entropy, and complexity of the system, and then come up with adaptive approaches that can adapt and scale with the constant change.
Okay. So we were talking about some of the changes that teams need to make, and I would say there are going to be some key skill set changes that have to happen for data mesh. So what sorts of changes need to happen, and how would we upskill and train these people?
Good question. I think when we say the responsibility and accountability around data needs to come to the application teams, of course those application teams need to be skilled up and augmented with the right tools, so that they can generate data, share data, and use data the same way they build applications or call APIs. So there's a bit of education around data sharing for analytical use - understanding analytical workloads, and how analytical use cases want to access the data. So there is that change.
But I do also think that with the design of these next-generation self-serve platforms that are native to data mesh, we would actually look at the end-to-end application delivery and try to level up the platform capabilities in a way that is more natural and organic to generalist, full-stack developers, and not so nuanced and specific to data. I think we would raise the level of abstraction. So we provide easier tools, so that that level of specialty can be extracted from the teams and abstracted away into the platform.
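To give a feel for what raising that level of abstraction could look like - purely a hypothetical sketch, with made-up platform names and fields rather than any real product - a generalist developer might only declare what the data product is, and let the platform handle the provisioning details:

```python
# Minimal sketch, assuming a hypothetical self-serve platform API: the developer
# declares the data product; storage, pipelines, catalog, and access control are
# stand-ins handled by the platform. All names and fields are made up.
from dataclasses import dataclass

@dataclass(frozen=True)
class DataProductSpec:
    """Declarative description of a data product; provisioning details are abstracted away."""
    name: str
    domain: str
    owner: str
    source_topic: str          # e.g. an event stream the application already emits
    output_format: str = "parquet"
    refresh: str = "hourly"

def provision(spec: DataProductSpec) -> None:
    # Stand-in for platform work: storage, pipelines, catalog entry, access policies.
    print(f"Provisioning '{spec.name}' for domain '{spec.domain}' "
          f"({spec.output_format}, refreshed {spec.refresh}) from {spec.source_topic}")

provision(DataProductSpec(
    name="covid-conversations",
    domain="member-engagement",
    owner="chat-team@example.com",
    source_topic="chatbot.conversations.v1",
))
```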
So I think a bit of both needs to happen. There's this idea of data product ownership - looking at data as a product. I think this is another magical intersection: between the usage of the data, the sharing of the data, and treating that as a product; treating its users - the data scientists, data analysts, or application developers building intelligent applications - as the users; seeing how that data needs to be delivered as a product; having long-term ownership of that responsibility of sharing data as a product; and measuring success based on product thinking.
I think that that is a magical intersection, between product thinking and data, that gives birth to new roles, such as the data product owner, again, built into the same team. And it might be the same person. Maybe one day the application product owner and the data product owner are the same person, they have different roles, or maybe they're different people. But nevertheless, that role needs to be introduced. I think around governance, there would be new roles that we create, at least new responsibilities.
Again, governance can't be a third-party, externalized responsibility of somebody else trying to bring security and privacy into the data late in the game. So we need to shift those governance concerns left, back to the team, so the accountability and knowledge around how to build in privacy from day one, how to build in access control at a fine-grained level - all of that needs to be, again, pushed down to the teams and shifted left, with incentive structures created so that all of those domain-oriented data product owners are accountable for governance. Those would be some of the main changes to the teams and roles that need to happen, I think.
You talked about quite a few different titles. Do you think there's ever going to be a data mesh engineer title?
I hope not. I hope not. Actually, I am a generalist myself. I jump into different boxes and different disciplines; I've jumped into many different disciplines in my, I guess, career. So I really don't feel data mesh lends itself to specialization. I don't want to see this as yet another specialization. In fact, data mesh, I hope, is a catalyst for the adoption of data by our generalist population. I think having data mesh experts and specialists is in contrast to the mission and one of the founding principles of data mesh. So I hope not, but I suppose there will be some existing data architects and data engineers who get familiar with a new architectural paradigm, and it will be just another tool in their toolbox.
So related to that, what do you think is the most challenging role on a data team?
So on, I guess, a data mesh data product team, what would be the most challenging role? That's a really good question. I have to put myself in the shoes of that person. I would think, initially, perhaps the most challenging role would be a data product owner working alongside an application product owner in a particular domain. Let's say we're an application team responsible for ecommerce. There has been an existing application product owner for the ecommerce application and what its features need to be, and now either that person takes on a new role around the ownership of the data products that ecommerce uses and generates, or there is a new person.
So working out how to reconcile product thinking across the lines of application and data within the same team, I think, is going to be a challenging and interesting place to be. Also, this new role will have some intersection with what has traditionally been the job of the governance team, because this person now needs to be responsible for security, compliance, and all of those objectives of governance, but built right then and there into the data that the team shares. So there's going to be a bit of an overlap with governance, and working out how we cooperate with each other, how the data product owner role fits into governance. I think that would be the most challenging role, just because there are a lot of nuanced changes to the existing roles and structures that need to happen, and they intersect with this role.
Excellent. Going back to your background, you've worked at quite a few different roles within your career. Could you share any of the life-changing, more impactful or defining moments that you've had during your career?
Oh, there's been learning every step of the way. We become like whales, with barnacles attached to us as we learn, and as we unlearn, sometimes we shake them off. But defining moments - I actually share this with a lot of folks who are more junior, or early in their careers: make sure you have a good entry. Especially early in your career, make sure you have really great mentors and coaches who have established good common sense, because that establishes your sense of what seems right and what doesn't smell good.
So for my first job, I would say I got super lucky that I worked for a company that was innovating in, I don't know, some workflow management system - but that didn't matter. It introduced me to the whole Unix philosophy, the Unix way of building applications, from very low levels to high levels of the stack. I think that was impactful, just learning the ways of, I guess, working with the Unix operating system, and building applications with a similar philosophy.
Again, data mesh has some residue of that in it: building small applications that do only one thing, and do that one thing really well. So that was really impactful for, I guess, the person I became. I also worked in distributed system design at the level of networking with data processing, when real time wasn't a cool, hip thing. We were doing real-time data processing from ATMs, and bank systems, and financial systems, processing them and detecting anomalies - and I gained an appreciation for the complexity of distributed systems, and also an appreciation for very simple protocols that can be so powerful.
So I've worked on building applications in distributed systems, and I have a great appreciation for network design and protocols - how from really simple and beautiful protocols, like TCP/IP and so on, such powerful systems can emerge. I think that's been pivotal. Working with hardware was fun, too. And then in consulting - I would definitely encourage people to take a detour at some point in their career and go into consulting for a little bit, even if it's just for a little bit, because your learning curve gets exponentially faster, as you can imagine. You have probably been in that world.
You get really exposed to real-world problems, and problems of large organizations, not the fancy cool tech organizations, like the 100-year-old organizations, the legacy they carry, the challenges that come with the legacy. It's really fascinating to go from... go to different domains, go to healthcare, go to finance, go to tech, and just see the differences and similarities. Definitely, being a consultant at ThoughtWorks, with a deep focus on technology aspect, has been eye opening, to say the least.
I'm going to take a wild guess that there may be some openings at ThoughtWorks, then, if people are interested in applying there and doing some data mesh. Related to that, you have a daughter. What do you tell her about what you do each day, and what can she take away from it?
I don't know, do you have... Jesse, do you have children?
I have two girls.
Two girls. What age?
They're nine and 12. A lot of what I do on a daily basis, they don't understand. But personally, what I've been trying to do is - I want tech to be a better place for women. So I think we're all trying to lay that groundwork, so that if they do choose to go that route, they're going to have a significantly better experience than, I would say, maybe not just now, but 10 years ago, or 20 years ago.
I agree. I agree. Mine is five years old, and it's really interesting to see how her personality takes shape, even at such a young age. Is she a follower? Is she a leader? Can she think for herself? And it's a difficult balance, because at this age, you want to teach them to follow the rules and be obedient - listen to your mom, go to bed at this time, listen to your teacher - but at the same time, I want her to challenge some of the rules. So she comes home, and the preschool she was going to didn't allow wearing any shirts related to commercial products, or TV shows, or cartoons, and things like that - which I actually personally quite liked as a rule, as a guide that they had. But I wanted her to challenge that, to understand why.
One of the things that I try to teach her, which was always, I guess, intrinsic in me, is to question, to ask why. Follow the rules, or follow your friends, or your teacher, but make sure you understand why you're doing it. Don't just do it because everybody else said so, or because they're doing it that way. I think I'm going to be in trouble when she turns 15, having taught her this, but trying to question and understand why, and if change is needed, being that change agent - that's something I would try to encourage.
I hate to break it to you, but as they hit their teenage years, that's going to happen inevitably. My daughter's about-
I know.
... to hit 13.
Right.
It's started already.
Yeah. Sometimes I wonder, am I doing myself any service?
So one final question for you, what do you never compromise on?
I guess the first principles. You always start with some first principles in problem-solving, in building solutions. Those are the things that I won't compromise on, because I just feel like they're non-negotiable, at least in a problem-solving space. And integrity, I guess, on a personal level. That can't be compromised - having a sense of integrity. In fact, in a lot of, for example, data mesh conversations - people come to me as part of a piece of consulting, or I run tutorials twice a year - and they say, "Give us the answer, tell us exactly what it looks like." And I could very easily do that, and pull that exact solution out of thin air, using some creativity and imagination. But I always try to be honest and say, "This is the rough shape and guide based on what we know today. And of course, we have these first principles we can work from, but the exact shape of the solution has so many unknown parameters that we're yet to discover. It's so contextualized to your organization, it's so contextualized to your decisions."
People actually get annoyed with me that I don't give a fully baked answer and put it in front of them, because I want to be truthful to the maturity of the state of data mesh. Yes, we all love the objectives, we get the first principles, they're great. But exactly how it looks - let's not jump to the solution. In fact, I push back and say, if somebody comes and sells you a fully baked data mesh solution, run the other way. We've only been at this for a few years, and we're talking about a paradigm shift, so it can't be true. So have integrity, and try to be truthful as much as possible. That's something I won't compromise on, even if I lose business for it.
Thank you so much, Zhamak. This has been great. A reminder that Zhamak's book will be linked on dreamteam.soda.io. I encourage anyone looking for practical advice and different in-depth perspectives to get it and read it.
Thank you, Jesse. It was wonderful to be here. I'm excited we crossed paths with your work on data teams and data mesh, and, hopefully, we can make some magic.
Another great story, another perspective shared on data, and the tools, technologies, methodologies, and people that use it every day. I loved it. It was informative, refreshing, and just the right dose of inspiration. Remember to check dreamteam.soda.io for additional resources and more great episodes. We’ll meet you back here soon at the Soda Podcast.