Start With Data Governance to Generate Faster, More Accurate Product Analytics

Insights From The Trenches | Mammoth Growth Podcast

Data is the foundation of every technological initiative. But what happens if you build on a shaky foundation? Data governance is frequently missing from conversations about the value of product analytics, and Avo is out to change that. In this episode of the Mammoth Growth Podcast, Avo CEO and Co-Founder Stefania Olafsdottir chats with our EMEA President Stuart Scott about the challenges companies face with poor data governance, and how they can resolve these issues and spend more time building great products.

Product analytics is a discipline just like sales or marketing, and just like these, it will reward a high level of planning. While it’s one thing to understand the impact of a company’s product analytics on their overall decision making (analytics for your analytics), data governance addresses the very core of product analytics:

  • How can we possibly make good decisions with bad data?
  • How do we restore trust in our data?
  • If we have a gatekeeper to confirm data quality before we ship, how do we deal with that bottleneck?
  • How do we ship accurate product analytics based on good data, faster?

Stefania has seen this sequence play out over and over again, from really bad data to good data to self-serve data governance for product analytics. 

One of Stefania’s breakthroughs was starting small, by finding the one person on a team who was passionate about data quality. Starting there, it’s much easier to demonstrate success with data governance by focusing on tactical wins - for example, improving adoption of one feature at a time, or increasing the conversion rate for a single cohort of users. How to approach data governance projects is just as important as the technology you implement along the way. Stefania’s insights into successful data governance led her and her team to create Avo, which fixes data quality for product analytics so every company can build on a solid foundation.


Stuart Scott (00:05):
Hello and welcome to another episode of the Mammoth Growth Podcast. At Mammoth Growth, one of the things we do is provide growth and data teams as a service. And a key part of that is often helping our clients fix their data quality and collect consistent, reliable and accurate data, which they can then use to make better business decisions. And a tool that we often recommend to our clients is Avo. So Avo is a tool for improving the data quality across your organization and setting up some structured data governance processes. And so I'm here today with Stefania Olafsdottir, who is the CEO and founder of Avo. Welcome.

Stefania Olafsdottir (00:47):
Right. Thank you. So excited to be here, Stuart.

SS (00:51):
And so I think it would be great to start this conversation by talking a bit about data quality and data governance and I guess what those things mean to you at Avo and how they have an impact on an organization.

SO (01:06):
Yeah, great question. This is such an open subject in my perspective. And the interesting part to me was my surprise that when you talk about data governance, it sort of depends on where people specialize and on their maturity in data, whether they think about data quality and making sure data is consistent across sources and products and teams and decisions are being made on the same definitions of metrics, or whether they start thinking about data security and personally identifiable information and such. But in our case, I think we are thinking about data governance from this perspective of making sure data is available to make decisions on it and that people are making consistent decisions based on consistent data.

SS (01:58):
Yeah, I think that makes sense. And I guess the worst thing to end up in is a situation where you have lots of different sources of truth and every team has implemented their own data and every team has a slightly different definition. And as a result, you end up spending your time in meetings debating whether you've got the right data or what the right number is rather than how you can improve the number or how you can act on it.

SO (02:26):
Exactly, yeah. Whether you're on the right track, just minor placement here. Yeah, definitely. And in an even worse case scenario, you don't even realize that you don't have the right data and you make a decision to change your onboarding process or something based on the data that you have, and then all of a sudden your onboarding conversion plummets and you're like, what the hell happened here? This seemed like such a good decision. And then you look into the data for the second or the third time, and then you realize that this decision was actually made on incorrect data, and then you are slowing your business down so much. It probably took maybe two, three weeks or even months, depending on how quickly companies and people move, to actually change the onboarding process according to this bad data. And then it takes maybe a few weeks to a month or two to realize that everything is broken now and that it's due to bad data, and then it might take a few weeks to a few months to actually revert that onboarding experience back to what it was prior to that decision making.

So the cost of this bad data can be immense. And what do we mean by bad data? I think that's also a thing that can stay abstract in people's minds, but for example, what we fixed at Avo is when people are going through their user journey in any application like Spotify or something, then what you can do is you can do all sorts of things in the app. And for example, if we take Spotify for example, you can add a song to a playlist or something, and you can do that from so many different locations in the app. You can do it by sliding the song to the right and then it automatically adds to a queue. Or you can sort of tap some three buttons somewhere and then you can choose whether you add it to the queue or to a specific playlist.

And then you can do it from the song full screen mode and from the home screen. There's so many different locations, and these locations in the app are maintained by so many different software teams at Spotify. So there are so many different developers that touch this experience, and they end up creating "playlist added song," "added to playlist," "song added," "song add," "add song," all of these different versions of the same user action. And then when a data person or a product manager is looking at the conversion from creating a playlist to adding five songs to a playlist to playing it through, they don't have a full picture of the data, because they don't know that they're looking for 50 different versions of this single user action when they're looking at their database.
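To make the fragmentation problem described above concrete, here is a minimal Python sketch. The event names, the mapping, and the counts are all hypothetical (they are not Spotify's actual tracking); the point is only that a query for one canonical name undercounts when the same user action has been logged under several names.

```python
from collections import Counter

# Hypothetical raw event log: one user action ("song added to a playlist")
# tracked under different names by different teams.
raw_events = [
    "Song Added To Playlist",
    "playlist_song_added",
    "add song",
    "Song Added To Playlist",
    "song add",
    "Playlist Created",
]

# Hypothetical mapping from every known variant to one canonical name.
CANONICAL = {
    "Song Added To Playlist": "Song Added To Playlist",
    "playlist_song_added": "Song Added To Playlist",
    "add song": "Song Added To Playlist",
    "song add": "Song Added To Playlist",
    "Playlist Created": "Playlist Created",
}


def count_events(events, mapping=None):
    """Count events by name, optionally normalizing variants first."""
    if mapping is None:
        return Counter(events)
    return Counter(mapping.get(name, name) for name in events)


naive = count_events(raw_events)
normalized = count_events(raw_events, CANONICAL)

print(naive["Song Added To Playlist"])       # 2
print(normalized["Song Added To Playlist"])  # 5
```

An analyst who only knows the canonical name sees two events where five actually happened, which is the "incomplete picture" situation described in the conversation.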

SS (05:24):
They don't know if they need to add them all together or if they need to pick a subset of them or if they just need to pick one. And we had that exact same situation at Calendly actually on a project where they had 35 different events just for sharing a meeting invite. And so it ended up people were spending more time thinking, trying to figure out which data to use than they were actually using it and making decisions.

SO (05:47):
Exactly. And that's the best case scenario that you're describing, that you actually have to spend time on figuring out what data to use and then you end up using the data. That's the best case scenario when you have this sort of data. But the worst case scenario is what I just described, where you actually end up making a terrible business decision based on this data and then you have to revert it and just costs a business like months of work and business, even revenue.

SS (06:14):
And is that something you've come across? Have you got any examples of where you've seen that happen?

SO (06:19):
Yeah, it's actually one of the triggers for starting Avo. Maybe we can save it for later in the podcast, but yeah, I can tell a good story about that.

SS (06:27):
Okay, cool. And so I think it'd be good to talk about why, I guess one of the things I often see is people think that data governance and data quality are the responsibility of the data team.

SS (06:46):
I think one thing that's really important is that ultimately it's not the data team who are impacted, or maybe they are, but it's not them who are impacted the most often when things go wrong. And so what are your thoughts on who should own data governance or data quality within an organization, and how, I guess, you get stakeholder buy-in within an organization for improving the culture?

SO (07:08):
This is such a great question and I'm almost triggered when you say this. So for backstory, I had the opportunity to be the founding analyst of a mobile game called QuizUp, which had a hundred million users around the world and was backed by Sequoia and Tencent and world-class investors. And we were the fastest growing app in the app store at the time, later beat by Flappy Bird, just a tidbit for those who know what either QuizUp or Flappy Bird were. And we went through this interesting journey of data ownership and sort of data quality ownership. And I think it was on my first day, the CEO said to me, okay, Steph, so you're responsible for retention. And I'm like, okay, I don't, okay, but as a data person, I have very little ability to actually impact retention. I have ability to impact how we leverage the data and how we use the data, but at that point, I didn't even have ability, any ability to impact the quality of the data, let alone how we actually leverage that as a company in our product and marketing.

I don't have decision making power over that. I have input power over that. And I think this was one of my first learnings as a data scientist is how important it is to actually rally together stakeholders and bring ownership in people like product managers and specifically developers. And that was the journey that we went through at QuizUp and that I have since also helped a bunch of companies go through just like when I was consulting after QuizUp, and then obviously after we started Avo is sort of taking the data culture from a place where the CEO says something like, you have one job, we should be able to make decisions based on data. And I'm like, oh my god, but we can't and I don't have any control over it to just having gone through the process of making people excited about using data, that's the first step, and then making them excited about making sure that the data that they want to use is actually usable.

And so I think that's the way to do that. And how we did that at QuizUp was, and how I've consulted and helped a bunch of companies do that since, is as a data person, find the person in the product team, which is a product manager and ideally also a developer. Those are typically the people that I try to speak to who is excited about data, a little bit excited about it, and then try to convert people from seeing data as a chore that has to get done for the data team that has nothing to do with the end user to seeing it as actually one of the fundamental tools that they use to understand the impact of the stuff that they're building and the impact of the decisions that they make on the end user experience. And so we ended up at QuizUp building this, I guess, well, both internal solutions, technical solutions to ensure data quality and data management was there.

But also this toolkit, a meeting that we called purpose meeting, and it was a sit down with a data scientist, a product manager, a designer, and developers from each platform that we were releasing for: iOS, Android, web. And we would talk about the upcoming release and we would define its goals: why are we working on this feature at all? What's the purpose of it for the end user? Its metrics: how will we measure the success of those goals? And then the actual data that we need to implement: what do those analytics events look like for us to be able to visualize this metric on a chart? And then we would design those together. We have to define the goals with the product visionary, that's the designer or the product manager. We have to define the metrics also with that team, and then we have to define the event structures based on the ability of the code base.

So you have to measure things how you can. And so that's why the conversation with every single platform developer is so important when you're designing data. You have to design your data from the perspective of: how can I actually implement these analytics events, and how can they be used to visualize something that I can make a decision based on? Then, after we had started doing this kind of sit-down meeting with people before they started working on a feature, eventually we got to a place where we went from having, in the beginning of the journey, maybe a single developer that was occasionally excited about asking data-related questions, like how are people using this feature and things like that, to over 50% of the developers coming to us with this question before they even started working on the feature, to make sure they would be able to measure the success of the things that they were working on. And so that they could go into Mixpanel or Amplitude directly after the release and be like, well, how did this go? The thing that I was building? And that was a really wonderful success moment from my perspective, and it took a while and it took some rallying, but it makes for a way more empowered product team in general.
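The output of a purpose meeting as described above (goal, then metrics, then the analytics events needed on each platform) can be written down as a small data structure. The following Python sketch is one possible shape, not the format used at QuizUp or Avo; the class names, field names, and the example feature are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class EventSpec:
    """One analytics event agreed on in the meeting (hypothetical shape)."""
    name: str
    properties: dict[str, str]  # property name -> expected type, as text
    platforms: list[str]        # platforms that have agreed to implement it


@dataclass
class PurposeSpec:
    """Goal -> metrics -> events for a single upcoming release."""
    feature: str
    goal: str
    metrics: list[str]
    events: list[EventSpec] = field(default_factory=list)

    def missing_platforms(self, required: list[str]) -> dict[str, list[str]]:
        """Return, per event, the required platforms not yet covered."""
        gaps = {}
        for event in self.events:
            missing = [p for p in required if p not in event.platforms]
            if missing:
                gaps[event.name] = missing
        return gaps


spec = PurposeSpec(
    feature="Playlist sharing",
    goal="Help users show playlists to friends",
    metrics=["share rate per active user"],
    events=[
        EventSpec("Playlist Shared", {"share_channel": "string"}, ["ios", "android"]),
    ],
)

gaps = spec.missing_platforms(["ios", "android", "web"])
print(gaps)  # {'Playlist Shared': ['web']}
```

Keeping the platform list on each event mirrors the point made in the conversation: the event design has to be checked against what each platform's developers can actually implement.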

SS (13:02):
And often you start out in a place where because the data's a mess and because people are struggling to find what they need, there's almost like a fear of going to the data because if I go to the data, I'm going to spend two days trying to first of all fix the tracking or work out whether the existing tracking is working, how I'm going to use it, then building a report and then trying to convince everyone else that the report is accurate. And that whole process can be so kind of frustrating that people shy away from it. And I think

SO (13:34):
Yeah, I mean almost no one does that unless their job is data because you don't have time for three days of digging into the data just to know how you can use the data in most cases.

SS (13:45):
Yeah, exactly. And so I really like that story because I think it shows that if you do get the fundamentals right, then actually everything else becomes much easier and people start to engage with the data, because they already trust the data, they know that everyone else can trust the data. The data's structured in a way that's accessible to them. It's consistent, it's consistently named, it's consistently structured. You've got a manageable number of events at the right level of granularity. You're not having to search through 50 different events that all correspond to the same thing. And I think it can feel like an inaccessible place. It can feel like a mountain to climb, but actually, I'd be interested in your perspective, but my experience is that taking steps towards that can actually be quite manageable. And it's just about working out what the right steps are to start with.

SO (14:39):
Exactly. I couldn't agree more. And this is the recommendation that we give every single company or team that we talk to is do not try to boil the ocean and always start small with something manageable and use it as an internal case study. And I like to use this analogy that if you are vegan and you're trying to convert the world to being a vegan, you should not start by trying to convert the people that are like, I want 70% of my diet to be meat. Forget about anything else. That's just going to be such a difficult lift and it's going to be high effort, low impact. But there are loads of people that are like, ah, yeah, I can buy the vision, but I just can't imagine a world without cheese or something. And then you're like, ah, I can help you get there.

And then you have a group of people that are sold onto your vision and then they can help bring the next people. And that's how you sort of get this journey internally and how that analogy applies in this case. And how I encourage people to think about this is if you're a data person, you're the first data scientist or you're a data leader, try to find a product team or someone that owns a part of the product, ideally functional, and find a data curious developer and a data curious product manager to start doing this process with you for that part of the journey. Start really small. Do not try to start revamping your entire analytics tracking. Just start by measuring this feature successfully, the next release or something, and then build that into a case study for why you should do that in general with this team maybe. And then leverage that to sort of get more teams to want to do this with you as well, because you can show that they've become more impactful with less pain, because this is painful in the beginning, but when you do it efficiently, and you can, then it's not painful. It's fun.

SS (16:50):
Yeah, I agree. And I think this is probably slightly controversial, but I really don't like starting with North Star metrics because I think they can be too abstract and too strategic. And sometimes what you need is a tactical win. I want to improve the conversion rate in my acquisition funnel, or I want to get more people using a particular feature. Start with those tactical wins that demonstrate the value of the data and help you ultimately prove that the model can work, and that probably actually have more impact in the very short term than starting at the very high level of what's the ultimate goal of the company or what's the ultimate goal of the product.

SO (17:35):
Yeah, that's a really great point. I do definitely like to help people think about those big vision things, but when you're starting on this sort of journey to better data, then I agree that you shouldn't maybe try to fix that core metric that typically requires at least like five input metrics, and then you need to revamp the entire tracking for your entire app to be able to measure that. So I think that's a really good point, and I think that sort of ties into the data governance journey that I typically see and recommend, or just recommend people be aware of. And it sort of maps onto what we did at QuizUp, which is like we start off in the Wild West, every single team is doing whatever they want, just like you were describing at Calendly. And we've seen that with tons of our customers.

And so basically the team is shipping bad data fast, so they're not able to really use the data efficiently and then they realize that's a problem, all of those problems that we were describing in the beginning of this conversation. And then they go to a centralized governance model where there's a gatekeeper, there's a data engineer or data analyst or something that they realize that this is such a huge problem that they do not want anything released without it going through them. And that of course works not efficiently. You keep shipping really bad data, but you try to ship maybe less bad data and it's really inefficient, so you're shipping better data, but really slowly. And then from there, the teams realize that they want to go into shipping good data fast. They do not want product releases to be blocked by data design or data quality management.

And then at that point they start figuring out, okay, what are the tools that we have and can use to do that? And they start designing processes like tracking plans and some sort of quality review and automation on that, and what can they do to do that? And that's where Avo is that ultimate self-serve data governance utopia. In my experience, that's how we built it, because that's based on what we built at QuizUp, but we also support people in going through that journey from the Wild West, from really bad data to good data to actually self-serve analytics governance. And I think that's the realistic expectation for how people should go through that journey.
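The "tracking plan plus automated quality review" step mentioned above can be sketched as a check that compares each tracked event against an agreed plan. This is a minimal illustration under stated assumptions, not how Avo or any other product implements it: the plan format, the event names, and the `validate_event` helper are hypothetical.

```python
# Hypothetical tracking plan: event name -> {property name: expected type}.
TRACKING_PLAN = {
    "Playlist Created": {"playlist_id": str},
    "Song Added To Playlist": {"playlist_id": str, "source_screen": str},
}


def validate_event(name, properties, plan=TRACKING_PLAN):
    """Return a list of problems; an empty list means the event matches the plan."""
    if name not in plan:
        return [f"unknown event: {name!r}"]

    problems = []
    expected = plan[name]
    for prop, expected_type in expected.items():
        if prop not in properties:
            problems.append(f"missing property: {prop!r}")
        elif not isinstance(properties[prop], expected_type):
            problems.append(f"wrong type for property: {prop!r}")
    for prop in properties:
        if prop not in expected:
            problems.append(f"unplanned property: {prop!r}")
    return problems


print(validate_event("song add", {"playlist_id": "p1"}))
# ["unknown event: 'song add'"]
print(validate_event("Song Added To Playlist", {"playlist_id": "p1"}))
# ["missing property: 'source_screen'"]
print(validate_event("Song Added To Playlist",
                     {"playlist_id": "p1", "source_screen": "home"}))
# []
```

Run as part of automated tests or a release check, something like this lets teams ship without waiting on a human gatekeeper while still catching events that drift from the plan.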

SS (20:13):
Yeah, you're right. I think you sometimes have to centralize in the interim, or at least, maybe you don't centralize everything. The Calendly example was a good one because what they ended up deciding to do was centralize the core funnel events, so those things that really measure their growth loop or their acquisition funnel, but still allow individual teams to track other things in a less structured way. And I think they did that because it got them the value of having real trust in those core metrics without having to, I guess, slow the teams down when they were tracking something that no one else cared about. But I think you're right that that's a middle ground and it doesn't get you all the way there in the short term, and it'll never get you all the way there, I guess.

SS (21:04):
Okay. And I think, I guess it would be maybe useful to talk a bit more about how Avo solves some of those problems. Almost a bit of me thinks of it a bit like I guess like GitHub or dbt, but for product analytics and tracking plans. But maybe you've probably got a more eloquent way of describing it.

SO (21:28):
Yeah, I love that. Thank you. I always love to hear how people frame Avo differently. Well, maybe it's relevant to start by summarizing a little bit where we came from. I mean, Avo fixes data quality for product analytics. We focus on that, and what does that mean? Well, we help world leading product teams like Fender and Wolt and IKEA, but also startups and smaller companies, we help them make sure they don't fly blind with their data so that they can continue to build great user experiences. That's what it means in practice. And the reason why we focus on product data, even though more generally it's a problem that's applicable to any event-based data, is that what's uniquely complex about product data is that it is ever evolving. The data structures are ever evolving because your product is ever evolving.

And this doesn't apply to all data structures. For example, if you take Stripe data or something like that and you download the schema structures of Stripe data, which is event-based data, then they're probably going to look the same or at least very similar to what they looked like three years ago. But that does not apply to your product data. Every time you release a new product update, and modern product teams are just continuously iterating on their product, there's no such thing as a ready-made product, then you have to update your data structures, and the process of managing this is what gets people down. It slows people down or they ship bad data all the time, inconsistent across all of the products. And this is what we learned at QuizUp. So I've been now in data science for over a decade, which always seems weird to say, started in genetics and then got pinged by the startups to join them as the founding analyst. And I was like, definitely this PhD degree can wait and I'm going to build up a data team at a startup, that sounds really fun. And I went to do that and got this opportunity to find this problem and solve it. Avo fixes data quality for product analytics. What does that mean in practice? Well, we help forward-thinking product teams like Fender and Wolt, but also smaller startups, to make sure they don't fly blind when they're trying to rely on their data to build great user experiences.

So that's what drives us. We want to help people build better products, ultimately more informed. But another thing that really drives me personally, and I have to share that here, is to make people less miserable while doing it, especially data people, because that sort of brings me to the backstory. And we've touched on many of these points already. The journey in QuizUp, it was rewarding in the end, but it was a really painful experience to have to manage, to try to figure out what the core of this problem is and how to solve it. And after QuizUp, a couple of us started another company and we had the exact same problem again. Just within months, we were faced with a decision of whether we should actually spend time to build internal solutions for this again. That's a really ridiculous conversation to be having when you're a tiny team and you should be building your product and not the hammer that allows you to build the product.

And so we ended up talking to hundreds of data people, product people, and developers, and they were all as miserable as us and as me, but 1% of them or something had converted to building something internally, and they were still miserable about it because they had to maintain it and it wasn't perfect and all those things, but we realized that this is something that needs to be solved. And I was like, I had a personal vendetta to really make this better. And that brings me back to that: you asked whether I had a story where we had that business impact, and yes, we did. At that startup after QuizUp, before Avo, we realized that we could simplify our onboarding journey because 98% of our users were using email to sign up and not the phone number, which is a classic way, you choose between those signup methods.

And we're like, okay, great. Let's simplify both the code and the product and the experience by just dropping the phone number option and allow people to run through it a little bit more quickly. We released it and our onboarding conversion fell by half, from 80% to 40% or something like that. And we're like, what happened? And then we saw in the data afterwards that we had the signup completed analytics event, and that had the property of authentication method, and that had the values of email and phone number, and we had seen that 98% was email and 2% was phone number. When we took a second look at the data, it was actually around 48% email, phone number, and then 51% was phone number without a B. So that's like a tiny detail that seems so insignificant, like a developer presses the B on the keyboard a little bit too mildly when they're implementing the analytics events for one of the major locations of the signup button.

And then you end up with this kind of a decision that plummets the onboarding conversion. It made for a really terrible experience for an early customer of ours, and we had to take a few weeks to build that simplification process, a few weeks to realize the mistake, and then a few weeks to revert back to the original option of having those two options. Very expensive in Startupland, just a few weeks. But this entire process takes months for bigger organizations. And you can imagine how frequent it is. It's just everywhere, and how expensive this is for everyone.
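One way a misspelled property value can produce a misleading "98% email" reading is if the analysis only considers the values it expects. The Python sketch below illustrates that mechanism with counts chosen to roughly echo the percentages mentioned in the story; the exact counts, the property values, and the assumption that the first analysis filtered on known values are illustrative, not taken from the actual dataset.

```python
from collections import Counter

# Illustrative "Signup Completed" events. The counts are hypothetical and
# only roughly echo the percentages mentioned in the conversation.
signups = (
    [{"authentication_method": "email"}] * 48
    + [{"authentication_method": "phone_number"}] * 1
    + [{"authentication_method": "phone_numer"}] * 51  # typo: missing "b"
)

KNOWN_VALUES = {"email", "phone_number"}


def email_share(events, only_known_values):
    """Share of signups using email, optionally ignoring unexpected values."""
    if only_known_values:
        events = [e for e in events if e["authentication_method"] in KNOWN_VALUES]
    emails = sum(e["authentication_method"] == "email" for e in events)
    return emails / len(events)


first_look = email_share(signups, only_known_values=True)
second_look = email_share(signups, only_known_values=False)
unexpected = Counter(
    e["authentication_method"]
    for e in signups
    if e["authentication_method"] not in KNOWN_VALUES
)

print(f"{first_look:.0%}")   # 98%
print(f"{second_look:.0%}")  # 48%
print(dict(unexpected))      # {'phone_numer': 51}
```

Listing the unexpected values, rather than silently dropping them, is what surfaces the typo before a product decision is made on the filtered number.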

SS (28:15):
And the impact you can have in that time if you're focused on building the right things.

SO (28:19):
Exactly. So there's a huge sunk cost and just, yeah, so it's very painful. And when I found that phone number typo issue thing, I had to go home that day. I was like, oh my God, I'm not going to be able to communicate with people without shouting today. And soon after that, we started, actually, we had those conversations with those people all over the world that had either been miserable or built something internally, and then we ended up going into building Avo and now we're solving this for an incredible group of people. And we're just getting started. I like to show this logo slide, seeing the incredible companies that are using Avo today. And the thing is that we have everyone going into the digital journey. You won't be a viable company if you're not going into the digital journey. And so we're probably entering at least early majority, if we're not far into the early majority, of people wanting to do great digital products and experiences. You won't be viable as a business if you don't. And then inevitably they'll have the data quality problems that we're talking about, and then inevitably every company in the world is going to be our customer within a few years.

SS (29:41):
And how does Avo solve those problems and how does it differ from some of the other solutions that are out there?

SO (29:47):
Yeah, it's a great question. So we have three use cases, three core use cases. It's data observability for your analytics events, for your event-based data, which is a super simple to install single player tool that you can use. We call it Inspector, and it both allows you to observe continuously the releases and whether they match your schemas, but it also allows you to basically build up an audit. So what the hell is wrong with your analytics today? And this is something that we just saw so many customers spend months on, just collecting all of the analytics events data to be able to just know where to start in fixing it. So Avo helps you do that with Inspector, with the observability use case. And then the second one is sort of the management of releases. So that's like a collaboration platform. Avo solves better than anyone the collaboration process between analysts or data scientists and developers and product managers, which all have to touch the process of designing data, designing analytics events, implementing them, reviewing the quality of them, and then shipping the product release to the analytics platform so that you can use it again to design your next release.

And then you go into designing data, designing metrics, designing analytics events, implementing them. So it's a continuous cycle that happens along with your product release. And Avo streamlines that process immensely. And for example, one of our customers said that they went from spending three to four days of every single sprint, a two-week sprint, on that, that's 40% of their time, to 30 minutes. And so that's like a huge time saver in addition to all of the downstream effects it has on the data quality of course, and time saved of the cost that we just talked about. And then the third use case is specifically catered to developers. And it's sort of something that we don't push onto our customers. It's something that, if there's an appetite among developers to really streamline this process, we have code generation that generates type safe code based on your tracking plan design, so that developers just out of the box implement analytics events correctly from the get-go, and they don't have to spend a lot of time chasing their tails after QA or the data scientist finds a bug somewhere, and they just implement them correctly. So it just takes so much less time to implement your analytics events correctly across iOS, Android, Apple TV, all of these platforms that you might have for your products. And so saving a lot of time there. And then of course, downstream quality.
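The general idea behind type-safe tracking code can be shown with a hand-written Python sketch. This is not Avo's generated output or API: the `AuthenticationMethod` enum, the `signup_completed` function, and the `sent` list (a stand-in for a real analytics SDK call) are hypothetical names used only for illustration.

```python
from enum import Enum


class AuthenticationMethod(Enum):
    """Allowed values for the hypothetical authentication_method property."""
    EMAIL = "email"
    PHONE_NUMBER = "phone_number"


sent = []  # stand-in for a real analytics SDK; events are just collected here


def signup_completed(authentication_method: AuthenticationMethod) -> dict:
    """Track a signup; only values defined in the enum are accepted."""
    if not isinstance(authentication_method, AuthenticationMethod):
        raise TypeError("authentication_method must be an AuthenticationMethod")
    event = {
        "event": "Signup Completed",
        "properties": {"authentication_method": authentication_method.value},
    }
    sent.append(event)
    return event


signup_completed(AuthenticationMethod.PHONE_NUMBER)

try:
    signup_completed("phone_numer")  # a free-form string with a typo is rejected
except TypeError as error:
    print(f"rejected: {error}")

print(len(sent))  # 1
```

Because the property value has to come from the enum, a misspelling fails at the call site (or in a type checker) instead of silently creating a second value in the analytics data.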

SS (32:35):
Yeah, you're right. And a lot of these problems become worse the more platforms you have. So if you're like a media product where you've got five different connected TV products plus iOS and Android apps plus a website, suddenly you've got immediately like six or seven versions of every event that you've got to manage. And it's so easy for them to diverge, as you said, by dropping a B from the name on one or having different properties on one versus the other, and then your analysts are making errors without realizing they're making errors, or your product managers can't actually get used to using the data because there's so many pitfalls and you almost need a degree in the data and all the issues before you can use it.

SO (33:23):
And how I like to think about it also is another different way that, excuse me, we think about this at Avo, we think about it in the way that it is way less expensive for you in the long run to have proactive data management, prescriptive like this, than to be reactive and have it, let me be even more specific. Another way that I love to think about this also is it is way less expensive in the long run to have analytics management and data management be a proactive team sport, a cross-functional team sport, than have it be a reactive task or project or major overhead that a data scientist or a data team has to do. It just takes so much longer, it's way less efficient, and you end up with lower quality data, because your data is so much better when it's designed as a team sport, when it's designed by bringing together the product manager's vision for how do you want to use it, the developer's vision for how can I implement it, how can I design optimal data structures?

And then the data scientist's view on what is a good way to query the data, for example, or visualize it. And so that's why it's a cross-functional team sport, and it's way better when it's proactive than when it's reactive.

SS (34:55):
Yeah, you're right. And I think that enables you to think about it from the perspective of the user and structure the data around what's the user doing and what do we care about? Rather than sometimes when a single team think about it, they think about it more like developers might think of it more as monitoring, and how do I make sure this thing's functioning correctly? Or a product manager might think about it in another way. And you're right, tying all those people together, bringing all those people together and coming up with a common definition, I think increases the chances that you get it right and you're thinking about what do we actually care about as a business and what do our users care about?

SO (35:30):
Yeah, and I just have to say, obviously I'm coming at this from a very strong personal passion. I'm very passionate about data cultures and building data cultures well, but I'm also very passionate about, like I said, achieving this milestone with less misery among data scientists. And so that's why we are able to solve this problem. We support both of those, but I'm always excited to hear from people who have solved it by building something internally or solving it in a different manner. So please reach out to me. I'm sure you'll find some way in the meeting notes or something, go to avo.app or stefania@avo.app to tell me how you're solving this yourself, and I'd love to have that discussion.

SS (36:20):
Well, I think that's a great place to finish, but thank you. I think it's been a great conversation, really interesting to hear a bit more about what led you to start Avo and yeah. Well, let's stay in touch.

SO (36:33):
Thank you so much for having me, Stuart. It was a pleasure. Look forward to working with you more.
