Erick Webbe is Head of Data Science at bol.com. Erick is building a data science team of bright minds to push the boundary of science, explore new methodologies, and experiment with new technologies. As a practical and thoughtful leader, his focus is on giving each individual the freedom to explore, the freedom to experiment, and figure out what will work and what will make bol.com a gamechanger in the online retail market. Hear Erick and Jesse as they discuss the scale of data science, building and managing teams, and making a difference.
Welcome to the Soda Podcast. We're talking about the Data Dream Team with Jesse Anderson. There's a new approach needed to align how the organization, the team, the people are structured and organized around data. New roles, shifted accountability, breaking silos, and forging new channels of collaboration. The lineup of guests is fantastic. We're excited for everyone to listen, learn, and like. Without further ado, here's your host, Jesse Anderson.
Hello, and welcome to the Data Dream Team Podcast. My name is Jesse Anderson. With me today is Erick Webbe, he is the head of data science at Bol.com. Welcome Erick, would you mind introducing yourself a little bit more?
Thanks Jesse, for sure. Hi, I'm Erick, the head of data science at Bol.com, where I currently oversee dozens of teams. I'm tasked with making sure that the topics that will benefit all of them end up on my plate, while also raising the bar on how we adopt and apply data science at Bol.com.
Excellent. Since we were talking about Bol.com, could you go a little bit deeper into it and what it is?
Sure. Bol.com is the biggest online retailer in the Benelux region, so in the northwestern part of Europe serving about a dozen million customers, and a general retailer as such - so offering you everything you would need in daily life. For people outside of Europe, compare it to what Amazon has to offer, but then in a Dutch and local flavor.
I just realized, B-O-L, is that an acronym for "Best Online" retailer?
It should. No, it comes from the heritage, where we are a former subsidiary of Bertelsmann, which is a German company, which back in about 1995 was exploring whether or not they should move online. That's basically where the name comes from, Bertelsmann Online, which used to have several brands across Europe in different countries. But we're the only ones that managed to... Bol.com burst basically, and still has the international allure with the "dot com" there.
Okay. You mentioned before that you're expanding into the French market. Is that correct still?
Yeah, it sure is. Bol.com is one of the brands of Ahold Delhaize, which is mainly a supermarket chain with brands across Europe and the US. One of our supermarket partners, Delhaize, also has a large collection of stores in the French-speaking part of Belgium. We were always present in the northern, Dutch-speaking part of Belgium, but we're now also expanding to the southern part, to make sure that we get the most value out of bringing the brands together, and also using the stores there, for instance, as a pickup or drop-off point. But in order to do that, we should also cater to the French-speaking market, and as such we are indeed translating to French.
Excellent. You shared that you have about 12 million users. Are there any other numbers that you can share publicly? For example, orders per second, traffic, that sort of thing?
If you look at the website, all the big numbers are there, but I think the biggest one we share is that it is a multi-billion revenue company, so indeed quite sizable. We store a few terabytes of data, on the one hand to power all of our core processes, but also to power all of our data science teams. If you look at the numbers during the peak season, you see we have thousands of search queries a second, and so many interactions that it's just unfathomable how big it can become. But also know that whatever number I tell you today is already outdated tomorrow.
Excellent. Excellent, so let's talk about you personally. You personally have a Master's in applied physics. Could you tell me more about how you've taken that Master's in physics and applied it in analytics?
Sure, I'd love to do so. When you finish secondary school, you get, maybe for the first time, the big question of, "What do you want to do with your life? What do you want to become when you grow up?" There I figured, "I want to understand how the world works." For me, the best way to understand how the world works is by learning physics. You see a phenomenon, you build a hypothesis of how it could work, and then you test that using experiments. That's I think a philosophy that I still apply to my work every single day.
We have a challenge with a customer or with a partner. We think about how we can best help them overcome that problem or solve it, and then test that in real life as soon as we can. That mindset is something I use every day. Every data scientist will be familiar with experimentation. Experimentation is your bread and butter there, so that's something you apply every single day. But also the critical mindset, and being able to separate main contributions from smaller contributions, which I think is a key part of an applied physics background, is something that still is very useful. Yes, there are a thousand things you could think of or could further improve, but these are the three things that matter most to make an impact. That is something I still make a lot of use of.
I find that interesting, because I see some people are trying to chase after every single stone or every single rabbit that they see. What I found interesting as you were talking is figuring out what was the most interesting rabbit to chase after? What suggestions would you give to data scientists and say, "Here is how you find the most interesting rabbit to chase after?"
I think first would be, understand why you chase a rabbit at all. For instance, if you chase the rabbit for the sake of having food, be sure that that's the reason why you chase it. It's not about building the most beautiful trap or the most elaborate way to catch it, you catch it because you're hungry. If you always keep the end goal in mind and then start to think about how you can achieve that, that's when you'll become most effective. To translate that into the real-life application of a data scientist every single day: if you get to solve a business problem by writing 10 lines of SQL code, scheduling that, and making that available, that's the way to be effective. That's the way to build credibility, and to build a reputation for yourself of being someone that can actually solve a problem, which you can then always leverage at a later point in time for the more fancy stuff, for the more modeling stuff.
That, I think, is key for what I'm seeing, at least for the more junior people entering the field is that I've learned modeling, so now I should do modeling, instead of I'm here to chase that same rabbit because I'm hungry. If it's a lot easier to pick berries, you should go for that. Apologies if I took the metaphor too far there, Jesse.
No, I love berries too. Yes, I think there's some key early lessons for early data scientists that I see happening too much, where I think in school people are taught how to solve a problem, but they're not taught what the problem to solve is. It's kind of here's your ABC test, rather than, okay, which one of ABC should you seek after? Another …
For sure. Maybe to build on that, what I typically see happening in the field of data science is that people enter the field from one of three angles. Either you come from the angle of engineering, you know how to build things. You come from the angle of a formal data science background, you know how to model the world. Or you come from the angle of business analytics, you know how to solve a problem. It takes years to complement the one skill that you get from university with one of the two others before it becomes truly effective in the field. I always love to see people indeed making that journey and making the transition of, "Okay, I already mastered one of these three things. Which is the other one that I need to be effective, and how can I complement the third one with the people around me, within a team, to make sure I can be as effective as I want to be?"
Good. You were told that you were one of the most effective data scientists without knowing you were a data scientist. Tell me about what that meant and why that happened.
Yeah, that was a happy surprise. No, I still vividly remember that moment. We were walking down the hall and having one of these serendipitous conversations when someone actually told me that.
"I think you are the most effective data scientist that we have in the company." "Oh, why is that?" "Yeah, you're pragmatic, and even without too much tooling and technology, you just go out there and try stuff." That to some extent made me realize, "Oh wait, maybe I am already working as a data scientist without being fully aware of that."
Because indeed as a trained physicist, you're not that exposed to the whole field of data science. Back in the day when I was in university, data science as such I still think was very much emerging. But that was the first experience. I think from that point on, having a gut feel for the power of data and how it can improve our business has always been there, just the vocabulary wasn't aligned with what is most common in industry by now. But that luckily has changed a lot over the last few years.
Now going deeper into that physicist or being a physicist, we've talked about experimentation. You said before, throw hard problems at physicists. Tell me more about why you said that.
Yeah. I think if you look at history in the last hundred years, I think in quite a few cases you've seen that if there's a big, hard audacious beast, quite often, it has been physicists that started to look at a problem and approach it from different angles to see if they could find a solution to make it work. Of course indeed, many famous examples are known from wartime, when we needed to have bigger, stronger weapons and to build things nuclear powered even.
But also if you look at more recent times, the most, or at least the initial solutions, for instance in making feeds more relevant - news feeds, social media feeds - quite often, that's been physicists that were looking at that, on how to overcome that. If you look at the original designs, for instance, for the Enigma machine for cryptography, those have been physicists. I know for a fact that I'm hugely biased in this perception, being one myself as well.
But I think it shows that if you have a problem and you don't know where to begin at all, it's also very useful to add a generalist to it. Someone that knows a little bit about a lot of different things, to move it one stage further in our understanding before adding more specialized people to it. I think to translate that back to daily life and to the teams that we run also here at Bol.com: when we are faced with a new problem, a problem we've never seen before, we tend to first offer that to a generalist. Someone who knows how to ask the right questions, knows how to do exploratory data analysis, knows how to build a first model and get it to production, much more so than already introducing two, three, four specialists as a team, because you simply don't know yet what the problem will look like once you look deeper into it.
That might be one of the reasons why I also stumbled into the field of data science in the stage where I was at, because no one knew what data science would exactly entail for us as a company. Maybe the generalist there could indeed help shape that and build our understanding. I think there it's... If you look at history, I think I've seen that a physicist is typically the generalist that can bring the problem from the original challenge to the first sixty percent solution, and after that you should surely add more specialists to bring it a lot further.
It's interesting that you talk about generalists that way. Another friend of mine, and a person who's been on the show, Paco Nathan, said pretty much the same thing. He said, "We need more generalists, these generalists will be able to help us." Then I think you touched on something that's pretty important, which is that they'll help shape or form that problem. You've said before that physicists can get you to that sixty to eighty percent. Is that your take as well?
It's what I've seen happen a lot. Of course, there is no way of generalizing all of them, but that's what I'm seeing a lot. Although the people that I've studied with, myself included, have all been trained in applied physics, they have ended up in a wide range of fields where they're currently working. Many are still in hardcore physics. Quite a few of them are entrepreneurs, some have ended up in venture capital, in FinTech. Some of them have become teachers, but most of them are indeed ending up in new fields, in which creativity and solving problems that few people have solved before are their bread and butter, which they're also attracted to. It might be a chicken and egg problem: is it that the type of people that come from that education tend to look for problems that they like to solve, or is it the other way around? For that one, I'm not quite sure. But that's something that I see happen quite a bit, yeah.
Now let's reverse the problem. How do you finish off that last twenty to forty percent?
Again, in essence, it's a simple one. It's about realizing two things. First, you never have one person that can solve the entire value chain from zero to one. Second, it's about understanding how specialties in our teams work to make that come about. For the first one, although I think I am quite capable of breaking down a challenge into its core components, there are many things that I'm a lot less good at. For instance, don't expect me to structure things really well. I'm not one who focuses on efficiency as well as I have seen others do.
Realizing that, internalizing that, acknowledging that, and being vocal about that opens up the stage for other people to step in and to help. There I think you also need to be able to step over your own shadow, and realize that the person that got the team, the department, from A to B might not be the same person that the organization needs to bring it from B to C as well, which I think is the first one.
I think the second is about generalists versus specialists. I feel that people's skill sets, to some extent, span a kind of space. This space as a whole can grow over time, but its shape is either wide, or narrow but then going deeper. Where a generalist typically knows something about a lot of things, a specialist typically knows a lot about a few things. If your problem is already more advanced, more detailed and more known in what you'll be facing, finding a specialist to tackle that is exactly what you need.
If I take an example from our day-to-day life and one of our teams, we have been recommending products to our customers for over a decade. We know that it helps our customers explore their ways and their needs within our catalog of tens of millions of products. We know that the need is there. However, how to do that effectively at that scale requires quite a lot of detailed knowledge about data processing, engineering, neural nets, everything that comes with that. Knowing we have a need for that allowed us to hire a few dedicated specialists to really look into that, several people who have PhDs in the fields of image processing, information processing.
There, we had the luxury of knowing that we need people with a specialized bit of knowledge within neural nets. We're able to hire them, because one, we could tell them the challenge we have and why it's interesting for them. We also knew that making the investment would last us for the years to come. I'm really glad that we did that, because now indeed we have a lot of exciting projects that are ongoing, where we not only recommend products based on who you are, but also to inspire you, or to pique your interest, or to offer you products based on the look that you have. Which dress matches your shoes, those kinds of things. Which I think sets us apart in online retail, which is just exciting stuff.
Now I'm curious about your experience on this. In my experience, we have a long tail problem where the amount of effort to get to, let's say, sixty to eighty is much smaller than the last twenty percent. Is that your experience as well?
Often so. A few examples may indeed undermine the truth of it, but mostly so, yes.
Okay. We were originally introduced by Niels Basjes. For those who don't know, he's one of the most phenomenal both architects and programmers I've talked to in a while. I met him at Berlin Buzzwords. If I were a recruiter and trying to poach somebody, I'd be going after Niels.
No, no.
No, don't try to poach him. Okay, forget that name.
Yeah.
Yeah, don't take him. In this case, he is really good at what he does. We had some pretty in-depth conversations that you're not able to have most of the time. Let's say you don't have somebody like Niels. What do you do?
There I would to some extent argue that within every company, you'll always have a Niels. However, with growing scale and growing complexity, the level of depth and weight of knowledge just becomes even bigger. But you'll always need someone who, in a certain area within your company, pushes the bounds in one direction. I think there are two things that you need to make that happen. I think the first one you already shared there, Jesse: you need someone that can translate a problem that you have into the steps that need to be taken. They don't have to fully internalize a problem themselves and know all the steps to take, but they know, "This is what I want to solve and I don't have this expertise. I need to find someone that can actually do this for me."
Second, the thing that you need is you need to have some level of urgency or drive to actually go for that. Because acquiring a new technology, acquiring new people, is an investment upfront that comes with uncertainty and it comes with risk, and you need to have a reason to take those. But if you have those two, if you have a drive to do something new, bold, challenging, and you have someone that can translate that challenge into what's technically looked for or required, then you have the core ingredients. Because then it's for people, also like Niels, it's for them, “Okay, this is what I want to solve, can you help me fix that?” He'll say, "Yes, just let me have a crack at it and I'll get back to you."
That is relying on the craftsmanship of others, for them to solve a problem and to get back to you on how to fix that, which I think is a wonderful distribution to have also in how you organize internally.
Now let's switch a little bit back. Let's switch back to you. You recently, or within the past few years, switched from being an individual contributor to a manager. Why did you make that switch?
Long story short, I wanted to have the experience. I wanted to know what it is like to have a different role. Is that something that would suit me? Is that something that I would be skilled at? Is that something that would challenge me for days, months, years to come? And as a data scientist, I ran the experiment. It's curiosity that got me there, because philosophizing about is this something I would want, is this something that would appeal to me can only get you so far. So when the opportunity came, I grabbed it with both hands and it got me where I am today.
We've seen this theme of you liking to experiment. As I've thought about the experiments in my life, I don't think I've called them experiments, I think of it more as an engineer of how far can I go without breaking? We can test this, but let's not test it to failure, let's test it just slightly less than failure. How do you do that in life, where you have to eat, you have to live, you... How do you do an experiment that doesn't break you, as it were?
The way you phrase it, I have not done an experiment. For me, at least how you describe it, it's just as much an experiment as anything else. Let's go into the area that's unknown, let's see what happens. Let's build a feedback loop of, "Hey, it hasn't broken yet. Can I push it further? Do I want to push it further?" If the answer's yes, why not? Maybe you're also a bit more of a data scientist than you already realized there, Jesse.
But then to come to your question on how you balance that? I think the key deliberation I typically make there, and I think I actually stole the metaphor from Jeff Bezos, is about realizing whether the choice that I'm making is a one-way door or a two-way door. Which in essence means: if I say yes to testing this, to doing this experiment, and it fails, is there a way back, yes or no? As long as there is a way back that doesn't cost you too much, go for it. Try it, see what happens, and the experience will always make you richer than when you came in.
The thing there is, although my hair is graying, I'm not quite sure whether people can actually hear that on the podcast already, but it's graying. I am just 15 years into my working career out of 45. That's a third in. That's the stage where you should be exploring a lot more than you should be exploiting, because you have 30 more years to go. You better end up in a place where you are both convinced and very passionate about where you are and where you want to spend your time. For me, the only way to confirm that is by running experiments, by testing things out and by experiencing firsthand what it's like. As long as you do that in areas where even if the experience is not what you were hoping for but you can always turn back, always go for those.
Okay. You say something interesting and valuable I think for people who are listening, so I wanted to just dig into that. That is, in my personal life as I've made a decision, that's the calculus I've done too. For example, when I started my business. If I start my business and I fall flat on my face, what's the undo ability of that? Then to add on to what you were talking about, there is a certain level of cachet or benefit to what you're doing. Let's say I had started my business and fallen flat on my face. I now have this view that others, let's say engineers, don't have. If I actually ran a business, I know what it takes to market and sell, and do this and that, and that's valuable to a company.
Similar to you, you saw that there was value. Even if you went into management and said, "Oh, I don't want to do this," there's value to being able to say, "Yeah, I managed for a bit. Maybe I'm not a line manager, maybe I'm not a director, but maybe I'm a team lead, and that background in management helps me out and I just don't have to deal with some of that." Is that about what you're thinking as well?
For sure, for sure. To take one example, I think, for what you've just said. In the various roles I've had over the last few years, I've had times where I was a line manager for 14 people as well. Currently I am officially a line manager for two, which to some people means having 12 fewer people as your direct reports. For which also in the beginning, I figured, "Okay, is that something I care about? Is that something that I find important? Is that something that I want to have in my job and my career?" But I didn't know. Again there, I figured, "Okay, let's try it."
The thing I found in how I currently operate and do my work is that I spend almost as much time coaching others, in giving them feedback and trying to make them as effective as they can be without it being my formal responsibility. Just because I've experienced how much fun I find it to be able to do that with people, to give them those experiences and those perspectives back for them to reflect on and to adopt if they would prefer that, and not requiring that to be formalized on paper as well.
I've learned that for me, leadership is something that's a mindset, not something that is stored in a database, and the only way that I could have learned that is by picking up all these different roles, and I'm loving it to this day.
Speaking of that leadership and making a change, what did you do to prepare you to start leading a data science team?
That's a good question. I think being chucked into the deep end at some point, I'm not quite sure whether there was that much time to prepare. But no, I think a mixture of two things. One, I think there's a wonderful corpus of books and experiences from people that have gone before you already available out there. Just reading up on books on leadership, on different styles, on different perspectives that can help you find your own way and view on leadership.
Second, which might not so much be preparation upfront, but more something that I've learned quite rapidly into my tenure and into management. Is being open about what you're good at, where your weaknesses are at, what you know, what you don't know, and sharing that with the people around you, and not fake it until you make it and try to bluff your way through things. Because people will catch that, they will see through it, and it will hamper both of your learning experiences and relationships, and it's just way too detrimental.
You mentioned reading some books. What books would you recommend to everybody?
I think first of all it's The 7 Habits of Highly Effective People, I think the classic from Covey. Which, for me, always sounded very clickbait-y, but it's genuinely a really good book and a framework to help you move forward. I've highly enjoyed Turn the Ship Around!, about a submarine captain who literally was thrown into the deep end with his leadership style, which shows very practically how he did that in a very unexpected setting. Within a military setting, which is typically a lot more hierarchical, he adopted a completely different leadership style, which in my perspective was a lot more effective. I'm trying not to spoil the book too much for future readers.
I've really appreciated Trillion Dollar Coach by Bill... Well, the story about Bill Campbell, he didn't write it himself. It's a wonderful look into how having a few core principles and living up to those can almost make life ridiculously easy. But in practice, it's harder than it seems. Which for me is also truly inspiring on how to make the tough choices, how to pick who you should coach and who not, and how to prepare for that. Which I think are three books already that people could just dive into.
I've read all those, those are all very good recommendations. What methodology do you use on a day-to-day basis in your teams? Is this scrum or something else?
If you look at the teams that we have, we have a little over 20 different teams within Bol.com that have one of our data scientists in them. For all of them we give them the mantra, you build it, you run it, you love it, which is our equivalent of we provide you with all the context that you would need to solve your problem, but how you solve it, that's up to you. As such, there's not one way how the teams run or operate, it's up for them.
Of course, we're also seeing that some methodologies are more effective than others in order to build great products. One tangible example there is, we've seen that the traditional scrum, so you work on a fixed set of challenges for two weeks with the marketplace, and scoping and everything there. It just doesn't fit the data science workflow as well as other activities would, simply because you don't know how long it will take to clean and to prepare your data. You might come to a very messy data set, you might run into a very clean data set, not that often. But that's often very hard to predict until you actually dive in.
We are finding that Kanban typically suits the teams a lot better, where you always have a stack rank of the most important problems that you would like to fix. You scope only the three most important problems that will keep you busy for the next three to six months, and everything beyond that, it's on the backlog. I don't even care, because I'm not going to spend and waste time on detailing those. Which indeed you then break down into more manageable tasks, but always pick the most valuable one first and work it as hard as you can. Only if things take a lot longer than you were expecting up front, or the professionals indeed run into challenges that they were just not foreseeing, then reiterate on your approach. But Kanban is typically the most effective way of working that we're finding today.
As we've seen from your style, you're very highly experimental. Tell me about the team experiments that you've run.
Ooh, there are a few there. Maybe the best one that I've run is... This is all the way back in 2018 or 2019, I think. We had a team already working on forecasting, and we knew that within our entire logistics operations, the deployment of data science was very slim. Especially within product operations, you should be looking more into operations research than pure data science, but for the sake of argument, we'll put them under the same denominator. Basically I had one person to spare, one person with some time free on her hands. I said, "Hey, go out there, talk to a few people, see who you can find that is interested in trying something new." To basically build a beachhead, a first foothold where we can build a reputation, a name for ourselves, and then continue the work there.
That was wonderful, because she had all the time in the world. No requirements, no need for stringent reporting. I trusted her to make all the right choices in where the biggest opportunities were. That freedom, I think, allowed her to basically explore without being held back. I think it was a few months before we started to build our own in-house solution for a traveling salesman problem on how we do order picking within the warehouses. The fun thing is that now, four years later, we have 16 full-time data scientists within the entire logistics operations, all resulting from that single experiment. And they are now really pushing the bounds on not only solving the simple problems, but also looking at, okay, now we've solved five of these problems. How do we make sure that we do not end up in five local minima, but that all of these solutions together also reach a global minimum? And that's a really cool thing to do.
They're looking into challenges like, how do we make a digital twin, or a full... We can do reinforcement learning over our entire setup to make it even better than what we have. All of those things basically resulted from giving one person free time to explore and to figure out, can we make this work? Which resulted in an already wonderful team. I think if we were to look back in three more years, I'm expecting that team to have doubled or maybe even tripled, because we're doing a lot of really cool stuff.
Speaking of goals for your teams, you have much more overarching or much more long-term goals for data teams. Could you talk more about what goals you have?
My role specifically needs to focus on things that make all of the individual teams more effective. In terms of, how can we share knowledge more effectively? How can we build better products? How can we make sure that the time spent on OPEX is lower? Those kinds of topics. As a result, the goals that I and the rest of my leadership typically look at are indeed a lot more long-term focused and maybe slightly different than most teams would have. For instance, we look at, is data science self-evident for everyone within the company? Which is our way of trying to create basically 6,000 eyeballs within the company that look at our business in a way that data scientists might also look at it, because those can generate so many more ideas on where we can adopt the discipline than just the data scientists themselves would. That's one of the goals that we're looking into.
But the other one, for instance, is making sure that we are the best company in the region to deploy data science, and to learn how to put that into practice. For which we're now only looking into churn. Are we able to keep people with the company, and keep them engaged for a lot longer compared to relevant benchmarks? For instance, compared to other companies, or compared to other roles within a company? Because we believe that we have something unique to offer that should result in people staying with us for longer. But of course, that's typically a metric and a goal that's already measured in years, so influencing that is of course also immediately a long term goal as well.
You also mentioned some impact in academia. What would you like to change in academia?
What I would like to change mostly is making sure that the gap between academia and industry becomes smaller and smaller. Because there we have a lot of bright minds pushing the boundary of science, and looking into new methodologies and new technologies. For which as long as they remain with academia, they do not benefit society to the extent that they could. There, we're looking into ways to basically make that gap small and to bridge it, to make sure that if new forecasting technology becomes available, we can also use it for our operational planning. If we have new technologies in learning to rank or in multimodal search, we can translate that into products we can actually give to our customers to basically help them find the things they were looking for a lot faster and easier, or even things they were not yet aware they were looking for.
I think one great example that we've recently done is bring people together, for instance on recommender systems to make sure that the goals that we had, making things relevant, recommended, and for instance also using the same information that you provide to us as a customer within your current session, within your current visit. Feed that back into a recommender system that can already incorporate that between different page visits. The moment that you click a certain product and you're transferred to the next page, that second page already shows you information which is more relevant, because it incorporates information from the page you've just visited into that next one. Which as a concept, as a human is not that hard to comprehend. But if you translate that into what it means on a technical level, that's quite impressive, and that's where I'm really happy that we, for instance, also work with the University of Amsterdam to incorporate everything that they know into the teams that we currently have.
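As a rough illustration of the idea described above, feeding what a customer viewed earlier in the session into the recommendations on the next page, here is a minimal sketch. It assumes a base relevance score per candidate item and a hypothetical item-to-category mapping, and it simply boosts candidates whose category was viewed in the current session. This is a toy re-ranking rule, not the approach Bol.com or the University of Amsterdam use.

```python
from collections import Counter
from typing import Dict, List


def rerank(
    candidates: Dict[str, float],    # item -> base relevance score (assumed given)
    item_category: Dict[str, str],   # item -> category (hypothetical catalogue data)
    session_views: List[str],        # items viewed so far in this visit
    boost: float = 0.5,              # illustrative strength of the session signal
) -> List[str]:
    """Order candidates by base score, boosted by how often their category
    was viewed in the current session."""
    seen = Counter(item_category[i] for i in session_views if i in item_category)
    total = sum(seen.values())

    def score(item: str) -> float:
        # Share of session views that fall in this item's category.
        share = seen[item_category.get(item)] / total if total else 0.0
        return candidates[item] * (1.0 + boost * share)

    return sorted(candidates, key=score, reverse=True)
```

With an empty session the ranking falls back to the base scores; after a single page view the order can already change, which is the behaviour described in the answer above.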
I have one request for our listeners. If you are listening to this and you're part of one of those academic institutions and you have something that you want to share on the show, please do reach out so that we can... Because mostly what we've done has been really focused on industry, I would like to hear from academicians as well. But I really do love that you're trying to make that change in academia. You're coming at it from the data science side. I've been on various boards, review boards, that sort of thing, and to a T, the issue is what's being taught in college for engineering, software engineering is very different from what we're doing in industry. You could say, "Well, there's that problem."
But it's causing a problem for the students. Those students generally have to come out and get a job on day one. The further they are from getting a job, the further they are from paying off their student loan debt if they're from the US, or in Europe, getting that job. It's a really key thing. Then on the employer side, that means that it's going to take even longer for that person to become productive. It is a key thing that we need to be doing as a whole industry, not just in software engineering, data engineering, but also data science.
For sure, and I think it's, again, one of those key examples where you see two things. On one hand, the whole field of data science is relatively speaking still quite immature, so we're still learning how we can best embed that into products and how to develop that. But also, we're seeing that a lot of areas within the field of data science are a true team sport. You need to have an analyst, a software engineer, a product owner and a data scientist come together to build great products. That's on a team level. But I think also on the longer timelines that you just mentioned, in order to keep advancing the field, we also need to have the feedback loop between industry, academia and society work as well, to make sure that we share with each other what we have and what we need, so that educational programs are indeed fit for need, and that research directions are tailored to things that people are looking for.
As an example, I find it wonderful to see how well image recognition programs and algorithms can already support a physician or a doctor in spotting malignant tissue in all kinds of CT and MRI scans, much faster and with a higher level of certainty than a human could. However, the way our legislation and governance currently are organized makes it really hard to embed data science within the healthcare business. Which for me is still a terrible shame, because machines are capable of ingesting so much more data and learning so much better from feedback on how to spot the images that a professional should look at. We should be making a lot more use of that, but you need a conversation between industry, technology, society, in this case healthcare on, okay, what do you need, what do you look for and how can we help each other? We still have exciting times ahead, because I'm sure that we'll get there in the next 10 to 20 years as well.
Well, I think we will. Of all the changes I'm most excited about, it's going to be some of this automation, and then the automation in turn will push down some of the pricing. By pushing down some of the pricing, we'll now get this into third world countries. We'll improve the medical care all around. In this sense, a rising tide will float all these boats, so I'm excited to see that change.
For sure, yeah.
Going back to Bol, one thing I found interesting about your expansion plans was that they were more language-based than they were locale-based. What sorts of things change when you go from one language to another for your data teams?
Well first off yeah, the obvious one. Most fields of information that you have need to be duplicated. That's the obvious one. I think second, the things that you see happen a lot is about mindset. Do I see this as okay, we had one, we're now going to add number two and that's it? Or do I see this going from one to N? Do I now implement a technology or structure that allows us to scale across the world, or is it just this?
I think what I've seen so far is the teams take radically different approaches to that question, resulting from the time they have available, the guidance that they receive, the approach that they have to the problem. That makes it really exciting to see different approaches. For which on one hand, you could argue that, "Hey, if you are fully aligned, then everyone would adopt the same approach." However, by allowing for teams to have different approaches and to tailor their solution more to what's currently best fitting for their product, you allow for a lot more learning to happen as it goes, also for a question like this.
Which for me, and you've now got a glimpse of that, I like to have that. I like to have a different perspective, different approaches. A different methodology, because that gives you a new perspective on what will happen. Because the future is uncertain. You might have an idea now that we'll move into different companies, different countries, different languages. But for now, it might just be a plan. For as long as we're not there yet, that's all it is, and that allows you to basically apply statistics and probability to what is the chance that we'll move there, and different teams might assess the chance and risk differently.
What do you find exciting in analytics right now?
I think a few things are very exciting about analytics at the moment. First off, the sheer volume of data is growing. We have data about everything. I generate more data from the watch that I wear every single day, than we would have had from the global census a few decades ago. It's bizarre. The reason why I find that exciting is because it allows you to pick up new use cases and new applications that were not available before. That's one thing that's exciting.
Second, and that's where I'm still also a bit of a techie myself, is that with that growing volume also comes an increase in the quality of the tools that you use. For instance where in the past, basically you had a shell script and SQL, that was about it. You're now seeing dedicated tooling coming up to also support what's coming up as the analytics engineer, which allows you to both raise the bar in terms of quality, but also go into the next level in terms of what you're capable of doing, which again allows for more use cases.
I think thirdly, the fact that more and more people have been growing up with data and the possibilities it offers has also allowed more people to think about what is possible with data. How to use it, how not to use it, and how it can actually make an impact in our daily lives by doing things smarter, or better, or faster or easier. Which again is all about growing the impact that data can have, because that's I think the overarching theme of all three of those. I am all about applying, making it work, finding the application that can make your life easier, smoother or just put a smile on your face. The more we can do that, the more excited I'll get.
You've only briefly touched on your teams. You've talked about data scientists being in various teams. Could you sketch out your recommended team makeup for us?
The recommended team for... Let's for now assume a semi-mature product, because I think there where we spoke earlier also on... If you're just getting started, if you have a generalist, if you have a single software engineer, that's fine. But let's assume you've gone through the initial stages, you have a rough idea of what your product typically does and solves. What would the team then look like? For which first and foremost, I would always have a product owner, a product manager, someone that understands who is my customer and what's the problem I'm solving? Because from that understanding always follows how do I do this? How do I solve this problem?
If you then need to look into the teams that are more frequent in my neck of the woods, so more of the data teams, typically I would always look into having some redundancy in the core skills in terms of engineering and in data. That would mean that you would have two engineers, software engineers, and two data scientists, data analysts, but people that are very data savvy as a first set of five. That means that you have redundancy, that means that you have enough people to build quite a few things yourself without growing the team too big.
Then the third element to look into would always be what does my product specifically need? What sets my product apart from others? For instance, in the world of online retail, some of our products are actually customer-facing and some of them are not. For instance, if you look at our search team, of course they produce search results in terms of a list, but that list can be displayed in a wide range of ways. For those teams, I would add designers. How do I present my results to my user in such a way that it's intuitive to look at?
If I look at one of the teams that we have, for instance our experimentation team; we have a team dedicated to supporting all of our... Well, close to 200 software engineering teams that we now have at Bol.com in running experiments every single day. They have a full-time statistician on board, someone who has a PhD in Bayesian statistics, to make sure that not only are the methodologies that we use sound, but there is also a sounding board for other teams that want to do an experiment. There you look into what does my team specifically need, and how do I add that.
Finally, the moment your team hits eight people, really start to think about whether there are logical ways to redistribute tasks or to further refine what my customer needs. Because if you start to look at teams of nine, ten, eleven people, just communicating and making sure that all of those people indeed have sufficient trust in each other becomes really hard to do. Then I would always look into splitting those up into smaller teams. But that's basically the basic anatomy of a team for me.
Of all the things you said, the one that was most interesting was having somebody with a PhD in Bayesian statistics. I've heard people with PhDs in statistics, but never somebody called out specifically in Bayesian for the experimentation.
I know there's a vivid, what do you call it, debate about the various ways on how to look at statistics and probability. I believe in this case it is specifically a PhD in Bayesian statistics, though I might be mistaken there. If you experiment on the scale at which we do, with dozens of experiments running at any time, you need to make sure that the methodology is sound, because that's what you rely on every single day. Doing that with people switching devices from a desktop to a mobile phone. Having people switch between different IP addresses, from VPN to not. Just tracking customers, making sure that your bucketing is okay, it's no easy feat. That's a thing where I'm really happy to work at a company that has the scale, that can afford to have a specialist like that also on board. But also, it allows us to offer a very fulfilling and very deep job for some of the people we really care about, which I think is a really good thing.
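To show why bucketing is tied so closely to tracking customers, here is a minimal sketch of deterministic experiment bucketing. It assumes a stable, hypothetical customer ID is available (for example from a logged-in account) and hashes it together with the experiment name, so the same customer lands in the same variant on desktop, mobile, or a different IP address. The function and variant names are illustrative assumptions; the interview does not describe how Bol.com's experimentation platform implements this.

```python
import hashlib
from typing import Sequence


def assign_bucket(
    customer_id: str,
    experiment: str,
    variants: Sequence[str] = ("control", "treatment"),
) -> str:
    """Map a customer to a variant using a hash of (experiment, customer).

    The assignment depends only on the two identifiers, never on device,
    IP address, or time, so it is stable across visits. Including the
    experiment name keeps assignments independent between experiments.
    """
    key = f"{experiment}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and fold it onto the variants.
    index = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[index]
```

The hard part in practice is the assumption itself: when a visitor is not logged in, there is no stable identifier to hash, and that is where switching devices or networks starts to break the bucketing.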
What do you never compromise on? Other than Bayesian statistics?
I was going to say nothing. I'm the type of person that sees merits in making trade-offs in a lot of things.
But that's the Dutch part of you.
Yeah, it could very well be. It's the one that never says never, or indeed, it then starts to become very literal on the word never. I think the first one that's most prominent and that I hope that people will recognize me for is that I don't believe in keeping secrets, and I don't believe in telling people half the truth because you think you can make better choices than them in certain areas. I'm the type of person that is open and transparent on what I know, what I think, what I feel. Because I believe that bringing everything to the table allows for better decision making. Although it might be scary at times, although it might every now and then not get the result that you were hoping for, it's the way that I believe that you build relationships that last, which are a lot more valuable than transactional decisions you make regularly on a single day. I would argue that's a trait that's hardly ever compromised on.
If there's one thing I have to pick to never compromise on, it's about making puns and making sure that people laugh every now and then and not taking yourself too seriously. That's I think the one that I hope I'll never grow too old and grumpy to do, because taking yourself too seriously just never works.
I like that. Never grow too old and grumpy to do a good pun. That's pretty punny.
Right? Yeah, I think so. I'm still not sure whether my wife stuck with me because of the puns or despite the puns. Either way, I'm happy with that.
Another great story, another perspective shared on data, and the tools, technologies, methodologies, and people that use it every day. I loved it. It was informative, refreshing, and just the right dose of inspiration. Remember to check dreamteam.soda.io for additional resources and more great episodes. We’ll meet you back here soon at the Soda Podcast.