Gwen Shapira is an Engineering Leader at Confluent and author. Gwen is well-known for her work with Apache Kafka, and has over fifteen years of experience working with customers, code, and big data.
In this episode Gwen gets ready to talk to Jesse about building highly functional, collaborative teams, and helping the next generation of database architects. She shares how she leads and nurtures individuals to be motivated and energized to realize enormous potential.
You’ll hear Gwen’s style as a smart, humble, and empathetic technical leader who builds and leads a strong team of brilliant engineers, who transform how organizations manage and stream data.
Welcome to the Soda Podcast. We're talking about the Data Dream Team with Jesse Anderson. There's a new approach needed to align how the organization, the team, the people are structured and organized around data. New roles, shifted accountability, breaking silos, and forging new channels of collaboration. The lineup of guests is fantastic. We're excited for everyone to listen, learn, and like. Without further ado, here's your host, Jesse Anderson.
Hello and welcome to the Data Dream Team Podcast. My name is Jesse Anderson. With me is Gwen Shapira. I have known Gwen Shapira for a long time. We’ll talk about a wide range of things because Gwen has 15 years of experience in doing these sorts of things, perhaps even more. If you recognize that name, it's probably because she's written a book or two that you've read. Gwen, would you mind introducing yourself a little bit more?
Yes, thank you Jesse, it's so good being here with you. So yeah, as you said, I have 15 plus years' experience just moving data around, is a good way to put it. Most recently I've spent the last six plus years at Confluent, in various roles, and I'm currently leading Confluent's Cloud-Native Kafka team, so we're basically taking Apache Kafka and trying to make the experience in Confluent Cloud be what one would expect from a service that was built from the ground up for the cloud. And yeah, as you mentioned, I wrote Kafka: The Definitive Guide, and before that the Hadoop Application Architecture book, both were really awesome courses.
Speaking of Confluent, you've had a pretty awesome growth period. I always marvel at it whenever I think about it and see you. Could you tell us more about what your journey has been going from a database architect all the way to an engineering leader?
Oh, wow, that was a very long journey. Basically I think I was a database architect up to maybe 2013, and my last project as a database architect was for a company in Japan asking me to integrate their brand new Exadata with their brand new Hadoop. And Hadoop at the time looked super exciting, so I just spent a lot of time learning more and more about it. And as I did, I saw more and more companies really moving a chunk of their data processing that was not the best fit for relational databases into this new way of handling data, one that was far more scalable and was really built for large-scale processing. So, that was exciting. And when I got back to the US, I basically gave Cloudera a call and found myself working for Cloudera.
And I got to do a lot... Most of what I know about big data and the problems that it solves, and also the problems that it creates, I learned at Cloudera. And then toward the end of my time there, I learned from my customers... It's funny how everything important I learn from my customers. I learned that it's really hard to manage the way data now flows in their architecture, that they have relational databases, they have NoSQL databases, they have document stores, they have Hadoop, they have all those things. And then increasingly they choose something interesting called Kafka to manage it. And I started looking into that, and you know how first you look into a new technology and you're like, "Bah, why does anyone need it?"
But, the more you look at it the more it's like, "Oh yeah, this is actually a fairly good way to solve all those problems." And after spending some time working on Kafka for Cloudera I moved to Confluent, which was at that time like a 10 person company, so I could really get to focus on Kafka myself. And then my journey inside Confluent was kind of interesting as well, because I joined as an engineer, but I joined like a 10 person startup. So, the first year I was writing code, reviewing code, managing the release, handling support calls, delivering some training. I did some professional services, you kind of do a bit of everything. And obviously you were around with us doing all of the training and the events and all that.
I remember you doing our first demo, that was pretty exciting. And then when Confluent grew a bit and we needed a product role, they gave me the opportunity to move into product. So, then I spent the next two years doing product management, and I was... There were a lot of products, I did some Connect, and then Replicator, data balancing, yeah, and then Confluent Control Center was my last product that I managed. From there I moved into marketing, mostly because we had a really amazing CMO and I wanted to learn from her as much as possible, which I did. I learned a ton about product marketing. And then I moved back into engineering, and you know how leadership journeys usually start, when they are like, "Oh, we have this small thing to do, and we have two or three people.
"Can you just run the team meeting and make sure that things stay on track and write the weekly report?" You can still write code, it's not really a management job, not a big deal. And then you start doing this, and then you look around three years later, and suddenly it's 20 plus people and you're actually managing two managers, and the whole area that you are leading just completely exploded. It was a very interesting journey.
It was, I've always loved watching your journey from afar, and I've loved what you've done, how successful you've been. Probably the most interesting thing I ever saw was a picture of you when you were in the Israeli Defense Force with your camo and your M16, that was pretty awesome.
That was a pretty awful picture though.
I disagree, I think it was awesome. So your ethos is leading people to build large data systems. What does this mean on a daily basis?
On a daily basis it obviously means that you need to have really good people, and good people can be from all levels of experience. You need some people who are very experienced, been there, done that, saw all the problems, know what not to do, know which pitfalls to avoid. And then you really need some, also junior people with just a lot of energy, a lot of drive, who want to learn how to do amazing things. So, you kind of need this very balanced team. And the thing about large systems is that they are complex by nature, and you really don't want to make them more complex than needed. And I do think that teams need some guidance around it, and it doesn't necessarily need to be from me. I hired good leaders that can also lead big projects, but you really...
As a leader, part of the job is to set a quality bar for your team, and to kind of set what is expected, what kind of trade offs to encourage, can you reduce complexity by taking some trade off somewhere else? I think that's really important. Some people hear large systems, and they kind of assume that they're going to be complicated, and they don't always keep an eye on how to make it as simple as possible given all the constraints, and that's a mistake that I've seen a lot. I really appreciated how my team thinks about doing things that are reliable and scalable from the ground up. Doing just scale without reliability is something that is incredibly dangerous, especially if you know that you're going to be the on-call person for your own product.
Related to that, there's a rise of engineers, specifically data engineers, some people are calling them analytics engineers. What sort of advice would you have for people who are wanting to transition from software engineering into data engineering?
Interesting. I think a lot of it is around learning the toolset, because the toolset is very different, and that's true in both directions. I have amazing engineers on my team, and I was fairly surprised a few years back to discover that they actually don't know SQL, or they know SQL but they're not very good at writing SQL, not very experienced. They spent their entire career writing in Java. So, I think the main thing... The mindset is almost the same, like you're an engineer, you build products, you provide solutions, you think about interfaces, and you think about the design. The main difference is really the tools in which you do it. Another thing to keep in mind, and that's I believe true for every part of engineering but maybe more than most in the data engineering space, the thing just evolves so fast.
Like, even if you've been in the space for 15, 20 years, it's so hard to keep up, to learn about everything. You really want to create a good network of practitioners that you can learn from and keep your knowledge up to date, hear about new things, discuss, is this new way actually better or will it make my life easier? Will it make things simpler, or more reliable, or more scalable? And I think a lot of the questions that I get from people moving into the field are really, who do I follow, who do I read, who do I listen to? So in a way, this podcast is maybe one of the better ways for someone to move into the field and also stay relevant in the field, which is a lot harder than moving into it.
Yeah, as self-serving as it sounds, plus one for this podcast. And I will also put some words into Gwen's mouth, Gwen is a really good person to be following on this, because she does think and talk about this. I also saw that thing that you were talking about with the SQL, it surprised me completely. I always thought it was just kind of a given that a software engineer would have SQL skills.
I know, right?
It's crazy. So yeah, software engineers, yes. SQL's kind of a bare minimum in my opinion, but maybe I'm old and - I mean, they don't teach it in the universities, all right? So, it's definitely not a mandatory course. Even if you took a database class, usually SQL is an afterthought, they teach you about, and LSM, and all those things. But, they rarely actually teach you good SQL. So, depending on where did you work, you may or may not have a chance to learn it, it's pretty weird.
Yes, that is. Just so people know, or... For the management that is listening, it's important to be able to go back and forth between them, because some problems are better solved in SQL, and some problems are better solved in code. Is that your opinion as well, Gwen?
Oh, totally. And again, the range of tooling is very large, and also the types of databases that people use, it's very wide. You run into a lot of different data stores, and each one of them has its own SQL variant. And then when you start integrating them and really want to bring all the data together, or you want to build an application on top of it, you will really need to think very deeply about the tools you use and how to integrate them together. Another really weird part is that there are a lot of databases these days that nobody thinks of as a database. Like, if you think about a typical company, they could have tons of really important organizational data in Workday, tons of customer data in Salesforce. You don't think about them as databases, but in a sense they are.
They hold your most important data in the organization. So, you really need to also learn to work with those APIs, and how to tie them together into meaningful systems both for your data warehouse, maybe even integrate some of this information in the product that you build.
Given that you've done so much training and I've done a decent amount of training as well, you've done the O'Reilly Trainings, how do you think we should be upskilling and training individuals on data teams to get them to this next level?
You know that's really interesting, because as much as I believe in creating training material, the best learning that I've seen is by giving people ideas for projects and making them explore the material rather than structuring it, start to finish. So, I'm not obviously against people listening to classes, but I think retention when you just sit and listen is a big concern. And the reason retention is a big concern is because you are listening to a lot of information when you don't yet know that it's relevant to you. So, a lot of times, go through the class once, then start trying to implement what you learned from the class on a project that actually matters to you, and then continue going back to that material.
And I think every class now has online materials you can reference and keep going back to, and as you think about your problems, what in this is relevant for my problem, where can I apply it, where does it actually help? And when you engage with what you learned at that level, that's when the real learning happens, because you are naturally very curious because you're trying to solve a meaningful problem that you have, and it's a very different experience. Does that make sense to you, Jesse? You have way more experience training than I do.
I've done some training, just a little bit. But, I always love to hear your experience. But yes, it echoes my experience as well. There's a trend now, and I wholeheartedly disagree with it, that you sit there, you learn passively, you hear somebody talk, and you've somehow learned it by osmosis. And frankly, I worry about what's going to happen in the future with people coming in with those expectations, and those people sitting in seats, trying to create these systems.
Yeah. The other thing, I don't know if you still do it Jesse, but mentoring is so important. When you have questions, you try to make sense of them. You can read a lot of things online and 50% of them contradict the other 50%. You really want to have a good relationship with someone experienced that you can discuss those things with, when does this approach make sense, when does the other approach make sense, and really help you integrate, help you think what could be the next step in my learning, what makes sense for me to do now. It's just so important, like I learned so much from my mentors, and I strongly advise it for everyone. No matter how much experience you have, you will always have those gaps in your experience that you will need to fill. And filling them with a human being, with a real personal connection is so important.
Yes, in answer to your question, yes, I still do mentoring of companies and teams. One of the things that I'll share, and I think this is important for anybody who's listening to this, management, individual contributors, think of anybody in sports, and then think, do they have a coach? Cristiano Ronaldo, best football player in the world, soccer here in the US, Tom Brady, Kobe Bryant when he was still alive, didn't just have one coach, they had several coaches. And it wasn't that Kobe was bad at basketball, it's that he knew he could get better, and that there were things he could work on. So in a similar way, I really encourage people to not just think that they've mastered everything, but to think about what they can improve on in the future.
Yeah. Honestly, nobody masters everything. Like, the more you know, the more you see how much you still have to do and improve and learn. And as I said, the space is changing so fast. So, even if you're absolutely perfect today, which you're not, tomorrow there will be a new way of doing things, and you will want to have someone to have the conversation with. Does this make sense, when does it make sense? And usually doing it with a pre-existing connection, with someone that you already know and trust and you understand each other, is incredibly useful.
Speaking of management on your side, what's been your biggest realization from being a manager?
Oh, man, there's been so many. You definitely don't know what you're getting into until you actually do it, and I think it's true for everything. But I think these days you... Again, it goes to learning, you read so much online that you think you're prepared for this, and then 50% of what you've read online absolutely does not apply in your situation. One of the biggest misconceptions that I've seen from new managers is the idea that if you have really good engineers on your team, like experienced and motivated, you don't need to do much as a manager. You can just point them at projects, and let them do their thing. And I did it from the beginning, it kind of backfires, because there are just so many different things that can go wrong.
Like, they will not quite get what you want out of the project, so you really have to keep discussing, aligning, making sure they understand your vision, or if it's someone else's vision really make sure that they understand the product vision, and what the things that they're working on are supposed to achieve, so they can really make good decisions on that. And then you really want to make sure that they're not just motivated in general, that they're motivated by this specific project. So, you really want to align the project with the things that they're interested in, if it fits their career, their public visibility, what they find technically interesting. You really want to look for ways to create this alignment.
And then in some cases, you want to make sure that they don't get distracted, because in any normal company there is so much going on, everything always looks interesting and important and urgent. So, you really want to keep tabs that this is still a priority, that people are still focused on it, are thinking about it, did not get pulled into too many projects, too many directions. And even with very similar people, it requires work, attention, and obviously that's just on a project. The next thing is obviously being challenged, are they always interested, do they feel like they are growing their career? And, this is true that even someone with 20 plus years of experience will still have things that they have not done yet and will be interested in doing.
They hopefully are still excited about what they do, or you want to find ways to excite them about what they do, give them mentoring opportunities, give them opportunities to participate in the community, give them opportunities to appear in podcasts, whatever helps them feel like they're not stuck in place just executing the same thing over and over is incredibly important. And I think that's the fun part about the management job, and it's weird for me that so much advice online is encouraging new managers to not do the things that are actually the most fun.
I told the CEO the other day, who is a new CEO, the most difficult part of being a CEO is hearing all this conflicting advice and choosing a path. It's actually really difficult. Same thing for management, everybody's got an opinion but there's wildly varying context that they're delivering that in, and they're not always giving you the caveats.
Very true. And the other thing, protecting your team, a lot of managers' instinct is to be very protective of your team, and I am the same way. But, as a manager your goal is actually to balance protecting your team and doing the right thing for the company. That's the reason there are actually managers who align what a team needs and what the company needs, and make sure that we can do both at the same time. I notice that a lot of times managers really focus, and again a lot of it is driven by online advice, all you have to do is protect the team, be this kind of umbrella without thinking a lot about when does the umbrella actually have to let some rain in, both because it helps the engineers grow, like being exposed to something is a way of growing.
And that's true for kids, and it's also true for engineers. And also being exposed to the organizational context will just help them make better decisions, and help the company have a team that is actually focused on the customers and the business decisions, and I think this is very powerful for everyone involved. Jesse, do you also feel that one of the most powerful things that engineers in a team can do is really focus on where the business needs value, and try to actually be very proactive and initiate ideas and projects that are aligned with the business goals, rather than just do projects that are given to them by the business?
I wholeheartedly believe that. If you were to ask me what changed as I started a business and started running a business, it's that I'm able to see a clear path between the technical ROI and the business ROI of it, where you're kind of achieving both at the same time, and I think that's a real key. If I were to suggest things to either individual contributors or even managers, it's cross-train as much as you can so that you can create that business value quicker. And as a direct result, you'll rise much, much faster.
Exactly that.
When we were connecting for this interview, you talked about how you're really interested in how shared services and managed services are going to be changing things for companies, and not just for companies, for individual contributors, for managers. Tell us more about your thoughts there.
Yes. In general, and it's not new, we've seen a huge rise in managed services. But, I think in the data space it's maybe a tiny bit newer? I mean, there's very old school things, like you could always use things like X-Ray. But, these days pretty much any kind of tool, application or data store that you want to use, whether it's Kafka, or MySQL, or Mongo, or Elastic, pretty much anything is available as a managed service. Also, all the workflow management. There is the Airflow company now, dbt, all those things can be done, managed. And if you look at today's financial environment, especially if you're a newish company, actually getting money is a lot easier than getting engineers. So, you really want to think about what is important that your data engineers will do, that...
Usually it is organizing, exploring, and designing data products for your company. And less actually configuring MySQL, and Elastic, and Cassandra, and Kafka in the most optimal way. So, leaning hard into those managed services, and really shifting the mindset, because not all data engineers are database configuration-obsessed. But, a lot of them seem to feel like they are, say, half SRE, half data engineer. And in some cases it's still important, but I would say that in other cases you don't want to spend time doing something that you can offload to a service. You want to spend your time thinking about the unique data of the company that you are working in. We talked about business value, the things that are really valuable for your business that are unique for your business.
And everything that is not unique, not differentiated, gazillion other companies do, gazillion other engineers could do. Try to find someone that your company can give a bit of cash to, and have them do it for you. Because, I would say that Kafka is normally very easy to manage, but it doesn't help when something goes wrong at 4:00 AM. Like, you are still there, staring at a bunch of logs at 4:00 AM trying to figure out what went wrong, and there is no reason for people to have to do it about really any system these days.
I agree with you, one of the things that really makes me cry myself to sleep at night is teams focusing on solving problems and configuring Kafka. That is not where your ROI is.
Exactly.
So when I work with companies I say, "Get out of the operations game as much as you can. This doesn't mean that you're going to fire your entire operations team, it just means that you have your operations team focus on the business value, the specific business value you have." Is that your thought as well?
Exactly, there are always unique places where your team actually adds value. And operations team I think really adds tons of value on the integrations, because all those managed services do not always play super well together out of the box. You will still go into a lot of your own applications, and a lot of the ways those applications work together for a fairly long amount of time, I would say. Can I do a quick plug? Like, if someone is actually really, really interested in operating Kafka, we are hiring.
So, if you want to... Yeah, I've talked about that, as, there's a few people that I would say, if Gwen has an opening on her team, this isn't just, "Hey, you can go plug away at Kafka." This is, "You can go learn from Gwen," and that in and of itself is a very interesting proposition. So yes, you can avail yourself of that, and avail yourself of Gwen's mentoring. So, you've written two books. What's a good way for people to think about wanting to either start a book, or how would they get that initial oomph to do it?
Yeah, I think for me, first question would be, why do you want to write a book? Because the book is a large project that is in many ways, even though O'Reilly and others now do kind of iterative releases, it still feels quite waterfully, in the sense that at some point you say, "Hey, I'm not adding any more new content to my book." And, the software keeps evolving, so... And it obviously just, it takes years. So you want to think, if your goal is to get information out there, do you want to maybe write a blog, and then after a few blogs maybe you can go and check if it actually plays together as a book, but you'll get this ongoing interaction and ongoing audience that you can work with? Same thing if you want a mailing list, if you want to be part of a community.
There are so many other ways to get your knowledge out there, have a YouTube channel, be a vlogger. I don't know that today I would still tell people, "Go write a book," to be honest. If you do want to write a book, basically start with the table of contents. Just like we learned in the writing classes in school, you start with the outline, and then you have a structure to fill in. And as you write the outline, you can gauge your level of excitement because pretty much every morning for the next year and a half of your life you will need to wake up and write a part of it, so it better be very, very exciting for you.
So, you and I have both written books. You still can't keep that excitement going after a year, it's difficult, it's really, really difficult.
The second time I wrote The Definitive Guide, I learned a trick that you actually... You always want to start with the exciting parts, but you really need to keep two or three chapters that really excite you for the end. Otherwise, as you said, you will just lose all motivation halfway through.
That's a good trick, thanks for sharing. Speaking of things that are exciting, what is out there that's a project or some technology that you think is game-changing that you're excited about?
Oh wow, there is always a bunch, right? GraphQL is an interesting one. I first learned about it actually in a podcast. I think Nick Schrock, who invented it, was interviewed on a podcast. And then I talked to friends at Facebook who were using it, because it all started at Facebook. And they said, "Oh, it's actually a really nice way to create a data product on top of a database, and just give people very flexible data APIs." And you know this thing where you have a data product and it has a UI, and every time someone wants to change something in the UI you have to go and add new APIs on the back end, and this means go and implement a bunch of abstractions on top of your database? But, the database is actually fine, it already has all this data.
GraphQL seems to really make a lot of that a lot easier. The main downside is that most people don't know how to use it yet, so you kind of go and provide APIs, and people look at that and they're like, "Okay, but can I have my REST now? Because I know how to write REST code, there's good SDKs for it." The entire GraphQL ecosystem is really just starting, but it's starting in a very exciting way. So, for me that's kind of like, not something I would put in a product right now, it feels too early in a way. But, something that I would do as an intern project, check out, do as kind of a hackathon, just try to see what is there, form some opinion, and really watch that space. Another interesting area that I don't really understand yet but looks like something I should.
Where it seems to be ETL automation, people are so excited about it that I'm sure there's something extra there that I'm not quite seeing. But, it's essentially ETL with GitHub, from what I saw. It feels like there is a lot more. Oh, reverse ETL is pretty exciting, like the idea that you can take stuff out of your data warehouse and pipeline it into online products, versus what we always did, which is the other way around. That your data warehouse has insights that you actually want to push out to your users, that seems incredibly exciting, and I saw one or two companies doing this, and it definitely seems like something we'll be seeing tons more of. Because it's like, why have we not done it before, kind of thing.
That's interesting, what you're talking about with Graph. I had heard from a friend saying the winter of Graph is over, because... You probably remember six, seven years ago, you'd walk Strata and it was nothing but Graph companies, and then it died very quickly. It seems like Graph is back, I'm going to keep my eye on it. Tell us something that your team would be pleasantly surprised to learn about you.
Oh, wow. I'm a very transparent person, I don't know that there's anything that my team doesn't know about me at this point.
Do they know you were in the IDF?
I believe so, I think they do. And I don't know if they will be very pleased to hear that, I think a lot of my management journey was to unlearn habits that I learned as a military officer. Like, you definitely do not want to run your engineering org like you do a military unit, that absolutely does not work. So yeah, I don't know if they will be surprised at all, and if it's a surprise I don't know if it'll be a pleasant one.
Okay, maybe they don't know this, but... And I don't know this, what was your ultimate rank?
You know, not only do I not remember, I actually don't know how it translates to English. Interesting. I started as a Lieutenant and I got promoted, does that make a Captain? I'm not sure.
Yeah, it depends on the armed forces. I think it goes Lieutenant, Captain, Colonel? So, you retired as a Captain.
I believe so, yeah.
Okay, you will be forever known as Captain Gwen Shapira.
It actually sounds pretty cool, I can live with that. Thank you.
It actually sounds even better than General Gwen. But yeah, Captain Gwen, that's pretty awesome, okay.
It is, I do appreciate you coming up with that.
Okay. One last question for you. What do you never compromise on?
Oh, whoa. I'm actually a fairly stubborn person, there is a lot. It sounds basic, but honesty is something that is not... If I find someone in a very clear way factually lying, like not accidentally giving incorrect information then correcting themselves, but actually giving incorrect information and digging in in the face of proof, that would probably be a relationship breaker. I can't really imagine having any kind of a relationship as a manager, as a report, as a coworker, as a friend, where I can't just trust what another person tells me. Right? Trust is the basis of all human relationships.
Aye aye, Captain Gwen. Listeners, as always, we'll be linking to Gwen's recommended reading and her books on dreamteam.soda.io. That's where you can access and listen to more great guests talking about data dream teams. I'll be back soon with another great guest. Thank you again, Gwen.
It was great being here, thank you Jesse.
Another great story, another perspective shared on data, and the tools, technologies, methodologies, and people that use it every day. I loved it. It was informative, refreshing, and just the right dose of inspiration. Remember to check dreamteam.soda.io for additional resources and more great episodes. We’ll meet you back here soon at the Soda Podcast.