
Making Better Decisions using Data & AI with Cassie Kozyrkov, Google's First Chief Decision Scientist

Richie speaks to Google's first Chief Decision Scientist and CEO of Data Scientific, Cassie Kozyrkov, covering decision science, data and AI.
Updated Sep 2023

Guest
Cassie Kozyrkov

Cassie Kozyrkov founded the field of Decision Intelligence at Google where, until recently, she served as Chief Decision Scientist, advising leadership on decision process, AI strategy, and building data-driven organizations. Upon leaving Google, Cassie founded Data Scientific, where she serves as CEO. In almost 10 years at Google, Cassie personally trained over 20,000 Googlers in data-driven decision-making and AI and helped over 500 projects implement decision intelligence best practices. Cassie also previously served in Google's Office of the CTO as Chief Data Scientist, and the rest of her 20 years of experience was split between consulting, data science, lecturing, and academia.

Cassie is a top keynote speaker and a beloved personality in the data leadership community, followed by over half a million tech professionals. If you've ever gone on a reading spree about AI, statistics, or decision-making, chances are you've encountered her writing, which has reached millions of readers.


Host
Richie Cotton

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.

Key Quotes

I just want the world to be so much better at decision making. We're going to embarrass ourselves in 500 years time. We're going to look at all the things we thought were okay, and we're going to be amazed that we ever allowed ourselves to do this as a species, like the way we look at bloodletting and leeches now. And I'm just so excited that we're on the precipice of change, accountability, and actually taking decision making as a skill.

When looking for data science work at a new company, find out who is going to give you enough of the business knowhow so that you are guaranteed to be solving the right problems. Because if someone just brings you in a room and says, ‘okay, data scientist, just do things’ there is such a great chance that you are going to run up against the type three error of statistics. For those who don't remember their errors: Type one is when you incorrectly reject the null hypothesis. Type two error is when you incorrectly fail to reject the null hypothesis, and type three error is when you correctly reject the wrong null hypothesis. In other words, using all the right mathematics to solve the wrong problem entirely. And if you don't have someone who actually understands the business, understands the strategy, helping you work on the right things rather than the wrong things, how are you actually going to be impactful? And the more mathematically sophisticated or highbrow your work is going to be, the more support and buy-in you need, the more scoping of those problems you’ll need from somebody in leadership. Because if you work on something super complicated in a company that is just not ready for that level of model and that level of contribution, it is likely to get looked at briefly and thrown away.

Key Takeaways

1

Beware of the Type Three Error: In decision-making, it's crucial to avoid the pitfall of correctly solving the wrong problem, emphasizing the need to align data science efforts with real-world business challenges.

2

Data scientists should not work in isolation. Collaborating with domain experts, decision-makers, and data engineers is crucial. This multidisciplinary approach ensures that the right problems are addressed, and solutions are effectively implemented.

3

Beware of Being the "Everything of Data"—While versatility is valuable, trying to handle every aspect of data—from collection to analysis—can be overwhelming and counterproductive. It's essential to understand one's role, collaborate with specialists in other areas, and ensure that the focus remains on impactful decision-making.

Links From The Show

Transcript

Richie Cotton: Hi Cassie, thank you for joining us on the show.

Cassie Kozyrkov: Hi, thank you for having me. Great to be here.

Richie Cotton: Brilliant. So, one thing I've heard you say before is that you don't think that data science lives up to its brand name. So can you tell me a little bit more about what you mean by that?

Cassie Kozyrkov: Yes. I think that when people think about data science, two kinds of images come to mind. One is the professor-ish type at the whiteboard with, let's say, chalk to make it extra old school, and a bow tie and lab coat for no good reason, that kind of stereotype doing very careful mathematics, something so intelligent.

That's one kind of stereotype. The other kind of stereotype is the hacker who listens to the business problem for two seconds, grabs a laptop, and everything is solved. And those are so, so far from reality. And I think that a lot of people fall into data science for this sense that the glamour of doing complicated mathematics is what the job is about, or the glamour of solving enormous problems really quickly, in a way that might have happened a few times back in the low-hanging-fruit days of data. That attracts people to the field.

They then find themselves deeply disappointed, sitting around and asking, where have the last six months gone? Never mind a neural network, not even a single regression has happened here. And so the brand problem is partly this sort of disappointment in what the job actually turns out to be and how it isn't well connected with what people think they should study to then get it, and also partly just the realities of stakeholder management and what it means to solve useful problems.

Richie Cotton: So, I certainly agree with the idea that it would be very cool if you could just tap a bit on your keyboard and solve all the stuff, but it sounds like there are a lot of tasks that data scientists end up doing that are fairly low value. So can you give me an example of where you think data scientists are wasting their time?

Cassie Kozyrkov: Well, that's actually where I was going to go, so I love this collaboration. Thank you. There are broadly two kinds of time wasting that happen. One has to do with the data scientist not being set up for success when they join the company, and the other has to do with the data scientist not being trained to be useful to the company.

And these are different. So the one that I wanted to bring up is some advice that I have for folks just getting started in data science. Some questions that you should absolutely consider asking in your job interview. I wrote a blog post about this, so if you want to go find them there again and then forward them to whoever is suffering with you listeners, you can feel free to do that.

The title for that was pretty cheeky. It said, Data Science: the sexiest job of the 22nd century. So that's a callback to an article that was written about it being the sexiest job of the 21st century. And the idea is that maybe it's the sexiest job of the 22nd, when companies are actually ready to make use of you. So here's my advice.

I want you to ask any potential employer about the following things. First, whose job is it to make sure that you have data? Because it is very easy to take a gig where there is no actual data for you to be a data scientist with. And that is a great way to be ineffective because here you are expecting to flex your muscles on data that's just simply not there.

If it is your job to get the data, that might be okay, but watch out. First, are you trained and willing to deal with the nitty gritties, the real world nitty gritties of data collection? Because there's going to be a lot of stuff that does not feel glamorous to somebody who didn't expect to get their hands dirty with the real world.

Things like, if you are doing survey questions, do you have pens and pencils for the people filling out the survey? Did you check that the battery was charged on the laptop? There are little things that feel a bit janitorial, for lack of a better word, that folks who thought they were going to take all the highest mathematics classes to make themselves as marketable as possible just feel like they might be above, and then you get pulled into this.

Now, I think that you learn a lot by doing tasks that you have the impression you're above. When you meet the real world face to face, you learn a heck of a lot. So it might actually be a great experience. Now, that said, that's the nitty-gritty of data collection. Data design, how do you design that collection?

If you have a social science background, you are probably best positioned for that data design piece. But even so, it's not something that we teach in school. And I would love for us to get a little bit more into this topic later. You'll be flying blind. That's also very uncomfortable. Now, what if you are doing data collection at scale?

Maybe you're working with a tech company and the data collection is working with the engineers to start logging things that weren't logged before. Well, that data, those logs have to then end up in an analyzable form for you, the data scientist, and the piece in between is data engineering with a whole lot of IT problems sprinkled on the top, usually.

Are you qualified for that? Because that is a different job. Data engineering and data science are not the same job. And you might be saying, I want to be the everything of data. I'm going to step up. I want to do all of it. Well, that's kind of like, if you are hired ostensibly to translate between, say, Mandarin and Swahili, and that's your speciality, and then you're told instead you're going to translate between French and Japanese, okay, it might be fun for you to learn another two languages, but at the end, once you've mastered your French and Japanese, what happened to your Mandarin and Swahili?

So be careful. Trying to be the everything of data all on your own is a great way to put yourself in a risky position. So always ask: who's responsible for data? What kind of data is already around? Who checked it for quality? How has it been documented? Did it just come from a vendor? Was it collected by the company?

Are there adults around? Now, speaking of adults, who is responsible for making you successful as a data scientist? Who is the decision maker who actually frames the problems that you work on, who provides the domain expertise, who gives you enough of the business know-how so that you are guaranteed to be solving the right problems?

Because if someone just brings you in a room and says, okay, data scientist, just do things, there is such a great chance that you are going to run up against the type 3 error of statistics. And so for those who don't remember their errors, type 1 is when you incorrectly reject the null hypothesis, type 2 error is when you incorrectly fail to reject the null hypothesis, and type 3 error is when you correctly reject the wrong null hypothesis.

In other words, using all the right mathematics to solve the wrong problem entirely. And if you don't have someone who actually understands the business, understands the strategy, helping you work on the right things rather than the wrong things, how are you actually going to be impactful? And the more mathematically sophisticated or highbrow your work is going to be, the more support and buy-in you need. If you work on something super complicated in a company that's just not ready for that level of model and that level of contribution, it is likely to get looked at briefly and thrown away. Also, who's on the hook for implementing what you make? Because the data science side is often not the hardcore engineering side.

And a great lesson of this goes all the way back to the Netflix Prize. This is a very famous machine learning prize; it was about improving the Netflix recommendation algorithm by at least 10 percent. And the winning team got the 10 percent improvement, and Netflix said, thank you very much.

There's absolutely no way in hell that we are putting that into production. So, you can make complicated, sophisticated data science work, and it might fall on infertile ground. It might just not go anywhere. If it's not put into production, it may as well not have happened, right? If a model falls in a forest and no one hears about it, then what's the point?

So, how you're going to work with stakeholders, who is responsible for making sure that you work on the right problems, is a big, big thing that you have to worry about. And if the answer there, again, is you: are you a domain expert in this domain? Do you have good relationships already with all the people around?

If you're coming straight out of school, the answer to that is no, no, and no. So be very careful and do ask those questions. And then also tools. What are the tools that you're going to use and who is responsible for making sure that those tools work correctly? Do you want to be on the hook for figuring out which particular platform is going to be approved by your IT department?

What's going to create security issues? That is a long time that you will be spending in meetings and running around and not doing what you think of as data science. So you have to understand that the data science piece really relies on this enormous team and this enormous real world context of stuff.

And not every company has that, but a lot of companies just want to hire a data scientist because they expect that you got one, you got a unicorn that's magical and you just put them in the business and magic happens. And that is not the real world.

Richie Cotton: There's a lot to think about there. And I have to say, I love the idea of a type three error where you've solved the wrong problem. so you described quite a few ways in which a data science job can go wrong. I'm just wondering, is there like a utopian data science job? Like what does a good data science job look like to you?

Cassie Kozyrkov: Oh, first let's get ourselves past that stereotype of the quick fingers on the keyboard and everything is solved. In the early days of any discipline, you have these quick successes available. And that low-hanging-fruit scenario, it goes away. Data science has been around for a long, long time.

And there might be pockets, there might be pockets where you're like, I made a scatter plot, here is 10 million dollars. Doesn't work that way most of the time. And so,

broadly speaking, you've got two kinds of gig. You've got the gig on the team that's already very much a data science team that is solving complicated problems. Now, there's pros and cons here. The pros are that you get to work with your own kind. Chances are that the person fielding requests to this big team of data scientists is themselves a director of data science, something like that, a data science leader, speaks your language, speaks a stakeholder's language.

What comes to you is the right kind of work. The low-hanging fruit has already been squeezed out here, so the chances are you are making improvements at, like, the 8th decimal place or something. You're not doing these wild, write-home-to-your-parents impactful kinds of projects. You're making incremental improvements, because here's a large team of people.

They've already done most of what there is to be done. And so here we've got a little plus epsilon that we're all doing. So pros and cons there. Alternatively, you could be the singleton dropped into some organization that may or may not know what a data scientist is. And this is where it's really dangerous not to have someone looking after your career who gets what your work is and how you're supposed to be doing it, what is your fault if it goes wrong, and what is a problem with the company, what should you be held accountable for, and what excellence looks like.

So, here there's a lot of opportunity to contribute in generalist kinds of ways. But at the same time your contributions may be devalued or misunderstood, or you might be misdirected. I mean, the worst kind of thing is where you end up being used as a pawn in a territorial dispute between leaders, where you're just backing up decisions that they've already made and really adding no value except helping them prop up their claims and kind of bamboozle and bully everybody else with data.

So you really have to watch out for that. On the other hand, the pros are that there may actually be opportunities for larger impact, because if no one's looked at any data at all, you might do even some basic analytics and discover, aha! Here is a way that I might propose we could create some changes in this org, on this team.

You could actually end up creating some things that you could feel very great about. The chances that what you need to be doing in a setting like that is really, really sophisticated are lower, of course. So if you want to do sophisticated work, that much larger team probably makes sense for you. Not all companies have that.

And if you want to do some analytics for inspiration, get people excited about potential directions to explore, and things are not all low-hanging fruit but a little more wild west, then being a singleton could be a good fit. But again, it comes with that fundamental misunderstanding and potential lack of empathy from your coworkers.

Richie Cotton: So it sounds like a lot of it's really about your appetite for risk. Do you want to have that chance of an amazing impact, but you're living in chaos? Or do you want to be part of a larger machine?

Cassie Kozyrkov: Well, I'm not sure I would characterize that as being about risk, exactly. The larger machine comes with risks too. It's the risk of standing out career-wise. If you believe that you are brilliant, if you believe you're the best data scientist in the world, you have a very major risk of getting mothballed on that team.

Because projects that would allow you to shine just don't get to you.

Richie Cotton: Yes, certainly being able to work with your colleagues and not telling them you're the greatest person in the world, that's general good career advice.

Cassie Kozyrkov: Yeah, yeah,

Richie Cotton: I like that. Okay. So, I'd like to talk a little bit about how generative AI is changing the role of the data scientist. I guess it's been like a year since generative AI went mainstream.

So, to begin with, what do you think has been the biggest impact it's had so far?

Cassie Kozyrkov: Cynically, I could say it is a new kind of Stack Overflow with a bunch of pros and cons. So it does help you write your code faster, if you learn how to test what you're doing. Faster overall requires some debugging know-how, because if you just take it, plug it in, hope for the best, and never check it, it's gonna slow you down in the long run.

But having some ability to get some help with code can be very useful. Where the actual interesting stuff is, if we step away from our cynicism, is that the way to think about generative AI, the revolution, is that it is more of a design revolution than an AI revolution. So from the point of view of the large enterprises, the Googles, the OpenAIs of the world, the foundation models follow the same principles,

in terms of AI and software development, as the big foundation models of all the other kinds from the last 10 years. There's actually not that much difference from, say, Google's perspective. The difference is that before, it was highly embarrassing, from a user experience point of view, for the user to know how the experience was delivered.

Like, you shouldn't have to know whether there's JavaScript on the web page. It would be embarrassing if you as the user knew that, right? The web page just works. And similarly, you get your recommendations from Netflix. And you get your search results from Google, and at no point is anything yelling at you, we've got AI, we've got AI, it just works.

That's how it's supposed to be by that old user experience design philosophy. Now we've got this completely different design philosophy, which is very much AI for AI's sake. AI in your face as a raw material. So from the perspective of the Googles of the world, it's still one large monolithic system.

It is designed to be performant in certain ways that are set by the company. And then this is all given to users and users are told here, use this thing. This is AI. It's a raw material, build with it, whatever you want. And there is so much opportunity for trouble and also delight in something like that.

And what data scientists should be trained to do, particularly if they are of the statistical bent, is testing things, the question of: does this raw material work for our needs? And now we are building upon a raw material that comes from a completely opaque process. Like, you don't know what kind of data the foundation model was trained on.

You can't go debug that. You don't actually know what the objective functions were. So it's totally opaque. You're now going to take the raw material that comes out of that. You're going to use that for whatever your own needs are. If the task is unimportant, who cares, right? If I'm like, hey, chat GPT write me a poem about bananas in the style of John Milton, who cares, right?

I'm going to get something out of that. Maybe I like it. Maybe I don't. It's up to me. No statistics required. But what if I'm going to start basing some kind of solution that I'm selling on it? I'm a startup. I'm going to build on top of ChatGPT or one of the other things, and it has to work, has to work for my clients.

It has to be useful and effective. How do I think about testing that? How do I think about building safety nets and testing those safety nets? What kind of data should I use for that testing process? How do I know if it works? This business of how do I know if it works is very much core skill set for data scientists.

And I think that we'll get past that very individual usage. You know those poems about bananas? And then we'll start wanting to actually build things that we can sell on top of these products. And then, oh my goodness, there will be so much fascination with does it work or not? And that's a great career opportunity for data scientists.
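For readers who want to see what "how do I know if it works" can look like in practice, here is a minimal sketch in Python. The `generate_answer` wrapper, the test cases, and the pass threshold are all illustrative assumptions, not part of any particular product or API.

```python
# Minimal sketch of an evaluation harness for a feature built on an opaque model.
# `generate_answer` is a hypothetical wrapper around whatever foundation model
# you build on; the cases, scoring rule, and threshold are illustrative only.
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str
    must_contain: list[str]  # a crude stand-in for "useful and effective"

def generate_answer(prompt: str) -> str:
    raise NotImplementedError("call your chosen model here")

def passes(case: EvalCase, answer: str) -> bool:
    # Swap in whatever notion of quality matters to your clients:
    # rubric scoring, human review, exact-match checks, and so on.
    return all(term.lower() in answer.lower() for term in case.must_contain)

def evaluate(cases: list[EvalCase], threshold: float = 0.95) -> bool:
    results = [passes(c, generate_answer(c.prompt)) for c in cases]
    pass_rate = sum(results) / len(results)
    print(f"pass rate: {pass_rate:.0%} over {len(cases)} cases")
    return pass_rate >= threshold  # the safety net: don't ship if this fails
```

The specifics are replaceable; the point is that the test set, the metric, and the threshold are decisions the builder owns, precisely because the model underneath is opaque.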

Richie Cotton: That does sound fantastic but it does seem like it's a very different skill set being able to test the results of some generative AI model compared to, I don't know, running a logistic regression or something. So,

Cassie Kozyrkov: Well, yeah. So the analogy I want to use for this, I have this image and I should just send it to you so you can put it up, but I have an analogy that is my kitchen analogy for the AI space. Before generative AI, it would just have four images, and it would have, like, a tomato, and that would be ingredients and data, right?

Those are analogies for one another: your data is your ingredients. Then a microwave or an oven, and I'd have appliances and algorithms, analogies for one another. Your paper recipe, an actual recipe that chefs would want to cook from, and the model. And then at the end, your pizza, and that would be your dishes or your predictions.

And that's from the enterprise point of view. Now we've got this new layer, which is that you take that pizza as a raw material, and that's your data product. Maybe you could do anything with that pizza. You could spray paint it gold and call it art, right? And then that's a very different quality bar. That pizza doesn't have to taste good to potentially be good art.

You could put it on the wall. So that leap is the difference: instead of just serving pizzas at scale, which is what companies would have done, now you are giving that out not in a take-it-or-leave-it fashion, but as: build on top of this, do your own thing with it. And there's a whole can of worms about who takes responsibility for that.

What does it mean to do it safely? Where should the boundaries be? Where should companies prevent those who build on top of it from doing certain things? I think what's really key is that, forget the data product, the gold-spray-painted pizza: in a large enterprise setting, each piece of this process is somewhat discoverable.

You should know who your data vendors are. You should be able to go interrogate those data sets and see what's in there. Are those ingredients garbage? Any Michelin star chef would tell you that you can't make something great if you're working with bad ingredients. The same thing holds for data.

So, maybe you made the data yourself and you know all about it. Maybe you bought it from a vendor. At least you can go interrogate, see what's there. And what about the appliances, the algorithms, the math that you're using? That's somewhere in your code base. You should hopefully kind of know what you did.

And to some extent, to the extent that you really understand these algorithms, you could see some mistakes coming. Like, let's say you do regression. Well, we know that regression doesn't handle outliers really well. So, what's going to happen when you get an outlier? We expect major failure. The recipes themselves and the models, you get those out explicitly.
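Her regression example is easy to demonstrate. The sketch below, with made-up numbers, fits ordinary least squares to clean data and then to the same data with a single corrupted point, and the estimated slope moves substantially.

```python
# Illustration: one outlier can drag an ordinary least squares fit well off course.
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=50)  # true slope is about 2

slope_clean = np.polyfit(x, y, deg=1)[0]

y_dirty = y.copy()
y_dirty[-1] = 1_000.0  # a single corrupted measurement
slope_dirty = np.polyfit(x, y_dirty, deg=1)[0]

print(f"slope without the outlier: {slope_clean:.2f}")
print(f"slope with one outlier:    {slope_dirty:.2f}")  # roughly doubles here
```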

You know what's coded in there. You could look at it. Now, it's going to be too complicated to read if you're solving these at-scale, big data, AI-type problems. Like, you're not going to go read the output of the neural network. But theoretically it's there, you could, and you at least know the scale, and you know the number of coefficients and whatever.

And you see the inputs and outputs directly. You lose that when you're just building on top of the output. Like, you don't know what kind of data went in, you just don't. And you don't know exactly which algorithms, and you don't know what corners were cut. You don't even know what kind of ensemble it is.

How many models are in there? People are guessing how many models are behind ChatGPT, right? There's a huge difference in terms of trust and testing and safety when you don't get that full view. And I'm not sure that people have really come face to face with how difficult that will make the generative AI space.

Richie Cotton: Do you think we need more transparency about what goes into these large language models then?

Cassie Kozyrkov: That's an interestingly phrased question. I would say that there are many things that we would love to have. I would love to be able to ask anyone anything that's private to them. That doesn't mean I'll be able to, and there is absolutely no reason why an enterprise providing this service would give away their proprietary stuff.

I mean, for you to truly interrogate the data, you would have to get the data set, but the data set is where the moat is. The data set is the expensive thing that the company would have put so much time, effort, money into creating. And you want to look at it. That means you want a copy of it. Hell no, you won't be given that.

And that makes perfect sense. I think,

Richie Cotton: That makes sense, because I guess it's becoming easier to create these models from a machine learning point of view. So actually a lot of the value is in the data rather than...

Cassie Kozyrkov: It is in the data.

Richie Cotton: ...the models. That's an interesting thought. So, I'd like to talk a bit more about the impact on different data roles.

Do you think generative AI is going to make some data roles more important or less important over the next few years?

Cassie Kozyrkov: Something that I have found a little bit ironic is that when I ask people to complete the following sentence, and I'll ask you: garbage in,

Richie Cotton: Garbage out.

Cassie Kozyrkov: garbage out. Right.

They all seem to agree: garbage in, garbage out. It will be garbage. You base your data stuff on garbage data and you're going to get some garbage model out that's not going to have good impact, that's not going to work. So we don't seem to have any disagreement about this fundamental statement. And then I do this slightly spicy thing after I give talks. I give a lot of conference talks, people are interested in having me speak, my link for that is MakeCassieTalk.com, we have ways of making you talk.

And afterwards I go and I hang out with the audience, and I play this little devious game. So I go around the crowd and I ask them, okay, what's your job role? And they're like, statistician, and this one's like, leader, and UX design. And then I go around again and I say, who in your organization is responsible for data quality?

Whose job is it to make sure that the data are well designed, intelligently collected, curated, managed, and documented effectively? And most of the time, the answer will be the same job role that they have just told me they are. This is chaos. This is unacceptable. How do we permit this? We are building our whole foundation on data.

We all understand garbage in, garbage out. And yet nobody can tell me even the name of the job that ensures that what goes in is not garbage. Wow! I have no words. How did we get so many years into this and we still have this problem? And maybe part of that is that we have not made this job sexy. It's an afterthought.

Everyone wants to do some fancy mathematics. Nobody wants to think about data collection. If we go back to the story of Fei-Fei Li, who I'm so glad is now getting recognized for what she did, which was very brave: as a PhD student at Caltech, working on AI, everyone told her, go do some fancy math.

She said, we need data. Let me go and get volunteers to label photographs and just say what's in the photograph. Is it rocket science to look at a photograph and be like, there's a cat, I'm going to write the label cat? No. So she got a lot of flak from her advisors. She kept expecting that she would have to go and work at, I think, her family's laundromat, or something, I'm not sure I remember the details of the story, because it didn't feel Caltech-y enough to be doing that.

I think that's such a heroic act that she did. Without her, we wouldn't have ImageNet, we wouldn't have progress in vision AI, because someone had to go and do the foundational bit. No one wants to do the foundational bit. Everyone expects somebody else to do it, and then they get to do the fancy mathematics on top of it.

This is going to be a huge problem. Now all this generative AI stuff, it is fed by one thing and one thing only, and that is data. And the better your data, the less fancy your math needs to be. Whose job is it? How do we make new, fresh, talented young people actually study all the things that you would need to know to get the data side right?

I mean, that's a pinch of social science, design of surveys and human psychology, a pinch of data engineering, a pinch of statistics. How much of it do you need? What kinds? What if it's stratified? How do you actually design the sources you're going to go forward with? The randomization schemes, all the real-world practical details, plenty on security.

If you're going to work, say, in the medical field, you're going to need to know some things about HIPAA, more than you thought you wanted to know if all you were interested in is mathematics, and a whole host of other stuff. So it's actually quite a complicated discipline in its own right.

But we insist on telling people that it is not valuable, it's not as good as data science. And we're building data science on a foundation of junk. And I had this conversation with a data science influencer. I was asking this person, so how do we make this job sexy? How do we get young people to want to study this, to contribute to quality in, quality out, not garbage in, garbage out?

And what should we call it? And the off the cuff response there was, oh, isn't that just data janitor? And I thought, this is the whole problem. Beautifully encapsulated. Like, imagine you tell young people, come take a degree in being a data janitor. What are they going to do? Call mom and dad back like, Hi mom, I am studying to be a data janitor.

No, that's never going to work. So it's on all of us to get a little bit more humble and to say how desperately, desperately we need this. For all data science, and now even more desperately than ever before, generative AI. I mean, again, that's the moat, good quality data. Someone's got to get the good quality data.

Whose job is that? Who knows about it? How does it work?

Richie Cotton: Yeah, I can certainly see how saying data janitor is the sexiest job of the 21st century. That's a very tough pitch,

Cassie Kozyrkov: Yeah, very tough.

Richie Cotton: but it sounds like a lot of this is quite close to the idea of data stewardship, making sure that you've got high quality data like throughout the whole sort of data pipeline. Are there any things that you think data professionals need to know about stewardship of data?

Yeah.

Cassie Kozyrkov: Look, I've kicked around a lot of different potential job titles for this. Data steward doesn't feel active enough as a position. It's like being the data's security guard. You're just there and you hope the data does its thing. And yeah, I'm not sure it carries the gravitas that you would want, because that's where I want young people going.

If we're building on data, if we're building a data-driven, data-fueled world and society, we want that data to be good data. Data hero is what I would much prefer to data steward. But you know, that has its own problems. One of the things I kick around is data designer, but this is so much more than design.

There are a lot of things you're gonna have to think through and design and deal with the consequences of. And it's not a soft discipline. It's got plenty of mathematics behind it, too. So, I don't know, I personally cannot think of a satisfyingly hearty term for this that really captures the art piece and the science piece and the just incredible importance of the role that is so neglected right now.

And what I would say as a piece of advice for people getting interested in it is that data documentation is so, so hard, especially because that's where the moat is. Because you don't want to share copies of the data, you need some way of allowing your potential client or potential user or data scientist using that data to interrogate what might be in there and whether they want to pay money for it and so on.

And for that, you need some notion of transparency, which is hard to define, and you need some notion of well-documented information about its provenance, its schema, but also where the gotchas are. So how do you do all that without giving out the actual data set? That's a really hard problem. And there have been nice attempts to make some headway, some from PAIR, the People + AI Research team at Google, some from Microsoft and others. But really what a lot of those papers are saying is, my goodness, we're stumped. There is not just one way to do data documentation. And when you get that situation, everyone's going to start doing it differently, there are no standards on it, and what chaos, what chaos we'll all get.

And so I think that a lot of people who just want to do some shiny mathematics on data that's arrived to them through everyone else's sweat and labor don't realize just how wobbly and subjective the whole procedure is to get it to them in the first place, and how much effort and attention we should be putting into ensuring that.
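As a concrete, hedged illustration of what documentation without sharing the data can look like, here is a minimal data-card sketch in Python. The field names are loosely inspired by the datasheet-style proposals mentioned above; the schema and the example values are illustrative assumptions, not a standard.

```python
# Minimal sketch of a dataset "card": documentation you can publish without
# publishing the data. Field names and values are illustrative, not a standard.
dataset_card = {
    "name": "support_tickets_2023",  # hypothetical dataset
    "provenance": "in-house helpdesk exports, Jan-Dec 2023",
    "collection_design": "census of tickets; weekend traffic under-represented",
    "schema": {
        "ticket_id": "string, unique",
        "created_at": "UTC timestamp",
        "channel": "enum: email | chat | phone",
        "resolution_minutes": "float, missing in ~8% of rows",
    },
    "quality_checks": ["deduplicated on ticket_id", "timestamps validated"],
    "known_gotchas": ["chat channel only exists from March onward"],
    "intended_uses": ["staffing analytics"],
    "not_suitable_for": ["individual performance reviews"],
    "owner": "the named person accountable for data quality",
}
```

Even a skeleton like this forces someone to answer the provenance, schema, and gotcha questions by name.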

Richie Cotton: So yeah, on that note, maybe a lot of the challenge with data documentation is that the data engineers are the ones who know the exact details of how some data was cleaned, but it's actually the business people that understand it. I don't know whether you have any thoughts on how you make a system that is suitable for both data engineers and business people.

Like, how do you reconcile those two different points of view?

Cassie Kozyrkov: So, interesting that you bring that up. When I was part of the PAIR project on data transparency and data documentation, one of the things that we realized early on is that to make some headway, it would be worth having workshops or trainings, at least in what to think about and what to contemplate documenting before getting started.

So that's available. We've published that. We're very proud of it. People can go check that out. But there are many, many more than two personas. So from a UX point of view, the personas are not just business people and data engineers; there are a lot of different personas that you as the data documentation expert would need to consider.

And yes, it's hard just with two. I completely agree with you, Richie. It's worse when we're talking dozens of personas that you might want to think about, particularly when you think about data vendors. So a data vendor might be selling to all kinds of clients. Maybe it's a government client. Maybe it's a regulator.

Maybe it's a machine learning team that wants to automate based on that data. Maybe we're just curating it and we don't care, we don't know what we want to do with it right now. We're just making this big old hoarder's storage locker full of stuff, and good luck to whoever comes next. Different personas will have different needs.

There's a whole guessing game of who's going to use it how that you can't perfectly guess. And something else I want to say about data is that in the old days, when dinosaurs like myself were crawling the earth as statisticians, there was quite a premium on data storage, so you didn't have really big datasets.

Because you didn't have really big datasets, you would be very careful about what you collected, and you would design and curate carefully. Maybe the mental model then would be that this is like a gallery exhibition or a museum: intentionally selected shiny things that make sense, each column there for a good reason. And then as data storage became cheaper, you can imagine the free-for-all; it's more like a hoarder's storage locker.

Nobody cares. It's just throw everything in there. And this is really great for analysts who are supposed to pull inspiration out of stuff. This gives them more sources of potential inspiration, so they don't want you keeping only these five columns and, toss everything else away because you can't perfectly curate, document, and design it.

They're like, yeah, just toss everything in there. We'll deal with it later. That's literally their careers. They have every incentive to want more sources of inspiration. That said, when we start talking about documentation, maybe the analogy there is the, the labels in a museum. If you want to chuck more things in without worrying about it and bring that storage cost way down, you're explicitly going to not document, right?

The more things you document, the fewer things you can keep, because you need personnel hours to do the documentation. And so this business of, like, just toss everything into that hoarder's attic and worry about it later does not lend itself well to the act of data documentation. So there is this push and pull.

There's that statistician, museum-curation, fully documented, perfect thing. And then there's the data lake, hoarder's-attic chaos: don't worry about it, let them deal with it later, which is antithetical to that first line of thinking. I'm not even sure that organizations and leaders face this thing head on and actually ask themselves, okay, what are we optimizing for? You can't really have both, not at any reasonable price point. So which one are we trying to do? And chances are that they've hired people who come from particular schools of thought, lines of reasoning, who are not questioning why they have the knee-jerk reaction that they do.

So maybe you've got a bunch of stats people who are like, document the hell out of it. Maybe you've got a bunch of analysts who are like, deal with it later. Even that is something that tends to go under the leader's nose and never be addressed head on. It's literally the leader's job to make those sorts of trade-offs and judgment calls.

Richie Cotton: Yeah, I can see how it might be quite hard to get a chief data officer's attention to be like, hey, we need to worry about documentation, it's really, really important. But actually,

Cassie Kozyrkov: But is it? Because what are we going to use it for?

Richie Cotton: yeah,

Cassie Kozyrkov: Yeah, like it might not even be important, so someone's going to make that call.

Richie Cotton: Yeah. I suppose, related to this, it seems like, again, there are some design issues around how do you make sure that this data is accessible and usable by the people who need it? Do you have any advice on what data people need to know about design?

Cassie Kozyrkov: Oh, yes, that is literally what those workshops are for. I would say go and take one of those workshops that the folks from People + AI Research at Google have put out. That is for creating empathy between the people creating, collecting, and stewarding the data, the people analyzing the data, and all the other personas in the space, and how you might actually think about who needs which bit of it for what, and what would make sense to document under those settings.

So, that's actually a really rich topic that I think saying, here's the training, it exists, go get it. That might be the best way to tackle it.

Richie Cotton: Okay, fair enough. And just going back to generative AI, I suppose that's had a bit of a design revolution in that, hey, we've got this conversational chat interface now, but I'm not quite sure that's the end point. Do you have any thoughts on where that's going and whether there'll be other ways of accessing generative AI?

Cassie Kozyrkov: Well, sure. How sci-fi should we be? I mean, let's go all the way sci-fi. The idea of a landing page: there's some, whatever your favorite bank is, Bank of America's landing page, and it is one page. It's designed for all users. You show up there and you get that page's experience,

with a little bit of customization based on cookies or something, right? That's the internet as you know it today. Why would you have to have landing pages if you could generate experiences on the fly, on the internet? And if you could have an experience generated on one side, representing the user's interests, and you could have an experience generated on the other side,

representing a company's interests, brokered together into a seamless experience, that would completely do away with the concept of landing pages, which would do away with many of the internet's concepts. So that would be pretty revolutionary. You asked me for sci-fi: you could have an on-demand internet.

Yeah, you could theoretically have an on demand internet.

Richie Cotton: Okay, that's cool.

Cassie Kozyrkov: Yeah. And there

Richie Cotton: Yeah, I do like this idea. The internet is just completely personalized for you. Okay,

Cassie Kozyrkov: would be so many issues with that, because all the problems with the social media echo chambers are barely a fingernail clipping on the problems you could have if the whole internet was a personal experience,

Richie Cotton: Absolutely. Yeah, you never need to have another point of view. Ever.

Cassie Kozyrkov: right?

Richie Cotton: I think I'd like to talk about the sort of the democratization of data science. So, it seems like generative AI, business intelligence platforms, better tooling in general, it's just taken away a lot of barriers to entry to data science.

So, how do you think the increased accessibility has changed the role?

Cassie Kozyrkov: So I wrote this article that is titled, Is Data Science a Bubble? And there I was asking myself, what are the consequences to the data science career of a big influx into the field of data science: role dilution, and just that evolution of a profession that starts with very few hardcore people.

It becomes democratized. And there I introduced the concepts of the 'and' data scientist versus the 'or' data scientist. The 'and' data scientist would be somebody who is a statistician, and they're an analyst, and they're an AI engineer, and they're a machine learning specialist, and, and, and. Somebody who thinks that data science is the everything of data and that you should be expected to be maximally hardcore at all of it.

You should be paid a lot because you are this amazing unicorn who can do everything. And the 'and' data scientist is incredibly offended by the 'or' data scientist: just somebody who dares to wear the exact same job title as them, a job title that's supposed to come with money, and is an 'or' data scientist as in they just do one of these things.

Maybe not even that well. They touched a regression at some point. They made a scatterplot. Now they call themselves data scientists. You can imagine the horror from that somebody's point of view: I spent 20 years becoming this good. How dare you? How actually dare you do this? Okay, I see both sides. It is horrible to have had the impression that in order to practice in the first place, you have to have a level of quality that's insane, and you put yourself through that very difficult schooling, and you show up and you're like, right, this is why I should be paid these dollars, because I am rare and highly qualified.

And then you see other people jumping onto the scene to pretend to be you, to take your salary and dilute the market. How horrifying. I get that. On the other hand, the 'and' data scientist is a threat to getting things done, because there are just so few of them. Like, if everybody has to be that truly top hardcore level, nothing's going to get done.

And it makes much more business sense to specialize. And even if you are an 'and' data scientist, you will not be doing all of those things every day or every week or every month, even. Projects come in phases, and during some phases it's mostly statistics, and during other phases a whole lot of programming, and so on.

And does that really all need to be housed in one person if it's not used all the time? Maybe it does make sense to specialize. If you have this expanding universe of data science professional roles, data-science-adjacent roles, then maybe it's okay if people aren't the everything of data. And maybe it's okay if they only do some part of it.

That means that you can have more people in it and you can get more done with data. I think that we are having that dilution. I think that there are a lot of people who have absolutely no shame, who wouldn't even qualify as 'or', who just put data scientist somewhere on their resume and hope for the best.

And it does not carry that gravitas, that emblem of quality anymore. It just simply doesn't. And so when everybody is a data scientist and it doesn't mean anything as a role, the market's all over the place, high variance in pay. And then you see a lot of people who want to have those hardcore credentials tucking and rolling and trying to call themselves something else.

Welcome to decision intelligence, folks, still hardcore around here. But I think that fundamentally, if what data science is, is the discipline of making data useful, that is never going out of style. No matter what we call that job, the more data we have, the more data-fueled technologies we have, the more we need people who know how to make data useful.

Do they need to know everything about it? I don't think so. At the end of the day do we need a little more honesty between hiring managers and folks about what the skills are? And that means that hiring managers might need to be more educated on, what a good candidate looks like and not just take it from the job title.

We absolutely need that too. But making data useful: never going out of style. Job roles, job titles: there may come a point when people are too embarrassed to call themselves data scientists because of the way that economies go.

Richie Cotton: So maybe naming fashions change, but do you think the end game is going to be lots more 'or' data scientists?

Cassie Kozyrkov: I think so. Yeah, I think so. Because it just doesn't make actual economic sense to hire a person who is vastly overqualified, who's just not going to do all of the things that they're qualified to do on the job, right? Like, you could hire somebody who is very, very good at being a brain surgeon and also a car mechanic and also a fighter pilot.

But are you going to be doing all those things at once? It probably doesn't make sense.

It's a big waste. It's a big waste for you to have the qualifications for the thing that you're not doing at any given time.

Richie Cotton: Absolutely. You don't want someone with many, many data skills then being the person, like you mentioned before, that checks that the laptop has enough power.

Cassie Kozyrkov: Right?

Richie Cotton: Okay. So, if the role is changing, are there any skills that you think are becoming more important? Like what should people be learning about?

Okay.

Cassie Kozyrkov: Well, the role is fragmenting. So the data science universe is expanding, and it means that it's now time to ask, what flavor of data professional are you? If you are an analyst, for example, then your excellence is speed: how quickly can you get inspiration out of the data into your own head and into your stakeholders' heads without wasting their time, without bringing them red herrings?

So that is a game of speed. And when you are leaning into that, you will be leaning into how do I make myself faster at this business of extracting inspiration from data; every skill relevant to that, you will want a little more of. If anything, really advanced analysts are the ones who most need to know how the algorithms work, much more so than statisticians, who just need to know that it's the right thing.

Or machine learning folks who are just like, I'll get it off the shelf and run it and see if it works. The analyst has this massive multi dimensional space that they have to try to make sense of. So, for them, a p value is not a philosophical concept, as it is for a statistician or decision maker.

For them, if they understand the mathematics behind the p-value, then they understand how this big space is being projected down into a single number. And when they see the single number, and they see the formula, which they know, they can make some guesses about constraints on this big, confusing, hard-to-visualize thing that they were dealing with.

So knowing stuff like that is very useful to them, because they're trying to get all that inspiration into their own brains quickly. Domain knowledge: leaning into domain knowledge also helps them be faster, helps them not snooze past the gems and also avoid the red herrings. So, instead, leaning into which part of it you actually do is where the core thing is.

Statisticians are helping decision makers, or maybe they're the decision maker themselves, they're helping themselves make one or a few big important decisions under uncertainty. There are a lot of skills you could lean into about that. And you will also notice that there are skills that seem data adjacent that are just so much less relevant.

So think about what you're actually contributing, not the name of the discipline, but what do you actually do around here? Do you make automation work? That's a whole other thing; you might lean more into engineering-type skills if you do that. And even within these three categories of AI, stats, and analytics, it fragments further: this kind of approach on this kind of problem, this kind of setting, and certain skills will be more relevant than others again.

So maybe let go of this desire to be the 'and' data scientist and know everything, and just ask yourself, what am I actually doing around here? How do I make that go faster? Never mind what it's called. How do I help myself do it better?

Richie Cotton: So we've got back to the idea of personalization again. So from the sci fi future internet where everything's personalized, actually your job's going to be personalized as well because the data space is just fragmented.

Cassie Kozyrkov: I think that this is nothing special. I think that you had doctors in the early centuries of humanity being the everything doctor and doing it badly on their own. And now you have hospitals full of all kinds of staff who are specialized in much narrower parts of the process.

They know exactly what their job is, they know what it means to be good at their job, and they collaborate together to create outcomes more effectively. It's an expansion of the universe of healthcare. Well, similarly, we're having an expansion of the universe of data and data science. And so within that, you could now have interdisciplinary collaboration where you would have thought that it was all one discipline, but it isn't, it's people working on different aspects.

The soft skills that are backed up by technology and data are fundamentally very different already, even between a statistician and an analyst. And the more the universe expands, the more everybody has to specialize and become a version of a version of a version within what used to be a bigger category.

Richie Cotton: That's really interesting. And you mentioned domain knowledge and collaboration. Do you think the soft skills that are most important are changing as well?

Cassie Kozyrkov: I think, again, the universe is expanding, and so there is a different set of soft skills. And okay, here's what I want to say about soft skills. I hate that we call them soft skills. What I much prefer to call them is the skills that are hardest to automate. When technology becomes better, when the UX of these tools becomes better, so that you can program faster, for example, when there's less fiddling, when you don't have the stringsAsFactors problem anymore.

Then you can spend your time on what matters, on why you're here fundamentally in the first place. And there's going to be some part that resists automation. A lot of what we were so excited about in data was just that we could, we could add two numbers together. How wonderful. We didn't worry about what it was going to be for.

We were just like, well, we can do this. And then, well, we can have a spreadsheet and, well, we can make a scatterplot from it. Not what it's for or why, just that we could do it. Well, now there's so much that we can do. The reasons that we're doing it all are hanging on, on the soft skills. What are we trying to achieve and why are we trying to achieve it?

And the better our tools get, the less we huff and puff and do the equivalent of working with punch cards when you could work with a laptop. Is it less hardcore to work with a laptop than punch cards and vacuum tubes and building your own computer from scratch? I would say no. I'd say that the soft skills become revealed faster.

Because if you're spending all your time building vacuum tubes and playing with punch cards and all that, you will look very busy achieving a relatively small amount. As the tools become easier, you will have to be accountable for the why and the thought process, and you'll start to see very different thought processes, very human parts, behind data science. I mean, what statistics is, is effectively epistemology.

It is numbers-based epistemology, it's philosophy, and it's decision making. If you spend so much time fiddling with deriving a t-test or whatever it is, and then going to the back of the textbook and looking things up in that table, you might not even remember why you're doing what you're doing. You just go through the pantomime, the dance of, well, I guess someone told me to do a t-test if I have this kind of data, so I may as well do it. Well, what if that happened, as it does in R, with: there's your t-test function, go?

Now, all of a sudden, you're forced to do the bit that was actually important, which is, like, why this? Why does it make sense? How do I think about this philosophically? That bit's not automatable. Is that soft skills? I would say that those are the skills that are hardest to automate.
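To make the thinking-versus-thunking point concrete, here is a small sketch in Python standing in for the R `t.test` call she mentions. The mechanics really are one function call, and the whole dataset gets projected down to a statistic and a p-value, exactly as in the analyst discussion earlier. The data are made up, and nothing in the code answers the question that matters, namely whether this test was the right framing at all.

```python
# The "thunking" is one function call; the thinking is everything around it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=10.0, scale=2.0, size=30)    # made-up measurements
treatment = rng.normal(loc=11.0, scale=2.0, size=30)

# Sixty numbers get projected down to a statistic and a p-value.
result = stats.ttest_ind(treatment, control)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")

# The un-automatable part: why a two-sample t-test? Are independence and
# roughly equal variances even plausible assumptions for this decision?
```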

Richie Cotton: Yeah, I can certainly see how understanding what you care about and why, and communicating that to your colleagues or whoever else, is very difficult to automate, and I think these skills are going to be around for a long time.

Cassie Kozyrkov: Yeah, all these tools are just ways of executing the gaps between the soft skills, and that's going to shrink. And most of it will look like soft skills. I think that's brilliant. The less time we spend fiddling on things we don't need to fiddle with, the better. Because the fiddling bits, honestly... I have this distinction I like to make between thinking and thunking.

Thinking being those flashes of brilliance, the moments when you're actually thinking about what needs to be done, why it needs to be done, where you're cognitively engaged, where you're communicating, solving. And then thunking is: you already know what to do, now do it.

The way in which we glorified the STEM disciplines has led us to wholeheartedly pretend that all this mathy stuff is filled to the brim with thinking. I would not say that I agree that it is inherently any more intellectually engaging than a whole lot of other disciplines. Like, yeah, mathematics is this way of being very careful, very precise, saying relatively little in a very precise way, mastering symbolic manipulation, but it's not that much more glorious than following recipes in a kitchen.

They're similar, actually. So what I'm trying to say here is that those very technical pieces are filler between a lot of the thinking. Yeah, there's some thinking in there too, but a lot of it's not really. A lot of it is once you've learned how to do it, safe and simple minded. Yes, it's hard to learn how to do it in the first place.

But once you've learned how to do it, it's safe and simple-minded. And then when you take that away, oh my goodness, now this is the deer-in-headlights moment, when you have to think about the what and the why and the open-ended chaos of: what should we do next?

Richie Cotton: Absolutely. So yeah, I can see how once you've learned your Python or R, then you can do a lot of thunking there. And it seems like what we're leading up to is that maybe the most important bit is how you go about making decisions. So, yeah, I think decision science is really your forte.

So, can you just tell me how decision science is related to data science, and where the differences lie?

Cassie Kozyrkov: Oh, okay. So, so this is a place where the branding is awful and the words are confusing. And so a little bit about the history. Decision Sciences, not Decision Science singular, Decision Sciences plural, used to refer to the collection of disciplines that had something to say about decision making. So you would think about your economics, psychology, neuroscience, your classic candidates.

And of course, you would also start to bring in operations research, statistics, AI. And this is a bucket that was rendered deeply unsexy many decades ago. When I think about what brand colors might make sense there, it's like that grayish green of Excel; no one's excited about it. And I think that that is because it was interrogated a bit before its time.

Now, decision intelligence is the discipline of turning information into better action in any setting, at any scale. And how this is different from the decision sciences is that not only is it a modern look, but it's also very impatient with the idea that you have to stop once there's data, right? That, like, it's some soft thing, and you did your pros and cons list, and then, ah, now you have petabytes of data.

Yeah, well, oh no, you have to hand it off to the AI person. No: if you were truly an expert in decision making, you should be able to turn information into better action, any scale, any setting. No stopping. Not with petabytes of data, not with no formally written-down data at all. But how do we think about making the most out of 10 lunch conversations to make a career decision?

You should be able to handle any of it. And so that's the kind of academic umbrella, a more modern one, that pulls into it everything that every discipline would have to say about decision making, with zero patience for throwing your hands up in the air and running away screaming when we get to the data sciency side of things.

So then, in that setting, data science is part of decision intelligence. The decision sciences, the traditional ones, which were not highly quantitative, are also part of that umbrella. But all the bits in those disciplines that have very little to do with decision making are now another major's required reading, and just for-interest reading in this one.

So decision intelligence is the ruthless pursuit of better decision making: everything that you would need to know, from every point of view, specifically about better decisions. If decision making is turning information into action, you'd better be really good at information. The data science piece is there.

You'd also have to contend with the judgment piece. Now, in psychology, they tend to call their decision making discipline judgment and decision making. The judgment piece is about selecting the way that you want to decide in the first place. There's not one right way to do that. And so, there are smarter ways to frame your decisions, there are dumber ways, but there's not one right way.

Makes things very exciting. It also is fun to realize, if you have a graduate degree in statistics like I do, at some point, hopefully in your degree: my goodness, my entire degree, the judgment piece was done for me by my professors. I have never been taught to do this. The hypotheses were there and I had to use the methodology that I was taught to most effectively test them.

But the why was never there. In fact, you almost get the opposite habit. You are disincentivized from questioning whether the problem is worth working on. You're incentivized towards type 3 error, because it's some totally inane thing, some hypothesis about a bunch of rabbits that Sally has in a field.

And instead of saying, why the hell am I dealing with this, professor? You're just like, okay, Weibull distribution, blah, blah, blah. So that first judgment piece, how we're going to go about structuring our decision in the first place, is so, so important, and it also sits under that umbrella. Now, decision science today is confusing, and I think I probably didn't help very much, because I was one of the big advocates in this space.

And I chose to go from the chief data scientist title to the chief decision scientist title. The reason I chose chief decision scientist and not something with decision intelligence is that we joked about it. I'd be like, I'm chief decision intelligence whatnot. But it's not pithy. So I was just like, all right, I'm going to flip data scientist to decision scientist to make a nod to the importance of decisions as the why that kicks off all the data stuff.

Because information on its own, if it falls in a forest, who the hell cares? It's through our decisions, through our actions, that we affect the world around us. So the decisions are the important bit. I want to really point that out, even in the title. And then more people got excited by that, and now there are decision scientists all around.

Now, I would say that if I had to set a general direction for what a decision scientist would have to be for me to accept them as a decision scientist, it's that they would have to be trained in at least some aspects of decision making, but that their role, the way that they would execute it, is not as decision makers but as decision advisors, who have the humility, if they are helping a decision maker figure out, say, between A and B, to absolutely not care about A and B.

And this is a weird thing to say, but when I do decision science advising I have to really truly embody not caring about the options and caring only that the decision is made as well as possible. So it's not me who's making the decision. It's not me thinking like, I like A, so I'm going to nudge this decision maker into A, or the decision maker has already chosen A.

Let me help them feel better about the A that they've chosen. Instead, I'm going to make sure that this decision is made as well as possible. So I'm going to take my trained decision making brain and I'm going to let the decision maker borrow it for a bit through their eyes. They're the actual decision maker.

What's important to them, their priorities. I'm going to help them make sure they thought of everything, but in their own way. So somebody who does that professionally, I would say, great, I would love to call you a decision scientist. And somebody who studies that as well. So that's how it all fits in.

Some people also love to say decision science is the qualitative piece and data science is the quantitative piece within decision intelligence. This is where we realize that things need tighter branding; otherwise, we will get confused. That really is, in a nutshell, the genesis of the mess.

Richie Cotton: No, that's amazing. That involves so many different disciplines. You talked about the data science side, you talked about psychology, and then there's all the understanding of: how do you actually persuade people? How do you teach people to make decisions? There's a lot there. Are there any aspects of decision science that you think data scientists ought to know about?

Cassie Kozyrkov: So I think that everybody would be better served taking a decision intelligence approach. That is, having a hard boundary between data science and decision science, I think, is not useful. So it would be better if we blurred a little bit there. It would be better if data scientists could take a more decision-oriented approach. It would be better if decision scientists didn't throw their hands up the minute things got real with data.

That said, what data scientists should absolutely appreciate is that they are a downstream worker of everybody else. And if you haven't trained yourself in the upstream pieces, you are a hazard, you're a wrecking ball in that space. So thinking that you've learned a bunch of mathematics and now you're ready to go and do the upstream stuff that you have zero qualification for is a hazardous level of hubris.

So don't do that. Similarly, because you are downstream of a whole bunch of things, your effectiveness depends on the effectiveness of the decision makers above you. If you're working with a bad one, that will have major implications for your career. So there's another reason why you might want to go and train yourself to actually be good in that space.

So that if there isn't already an adult in the room, maybe you could train yourself and learn to be the adult in the room. However, if you haven't studied that stuff either, you are not the adult in the room. Knowing some mathematics does not make you so. So, there's a lot to know in this area.

Richie Cotton: It seems like it's going back to the idea of collaboration, and you need to know what your position is relative to others. So just to wrap up, is there anything you're particularly excited about in the world of data or AI at the moment?

Cassie Kozyrkov: What I'm excited about most of all is that there is an interest in accountability for decision makers, for leaders, that wasn't there before. And what this means is that decision making necessarily has to become more interesting. So the way that everybody was passionately running around after data science for its own sake, data for its own sake.

No why in the mix, just: let's do regressions. Why? Who cares? I feel like we're on the cusp of major change there, where we might actually start to do it the sensible way around: what's worth doing, why do we want to do it, how do we want to structure it, and holding people accountable for wisdom in those things.

And that's going to be great for practicing data scientists, because the problem of dealing with loose-cannon decision makers, who don't know what they want and don't know what they're asking for, which has terrible impacts on data science careers, is going to be somewhat mitigated by more scrutiny and interest.

And for me, I just want the world to be so much better at decision making. We're going to embarrass ourselves in 500 years time. We're going to look at all the things we thought were okay. And we're going to be amazed that we ever allowed ourselves to do this as a species. Like the way we look at bloodletting and leeches now, goodness, why would people have thought that was okay?

Or witch hunts? I think we're going to look at ourselves the same way in 500 years time. And I'm just so excited that we're on the precipice of change, accountability, and actually taking decision making as a skill.

Richie Cotton: That sounds fantastic. Having data science with a purpose rather than just for the fun of it. And it can be fun, but yeah, having a purpose sounds like a good idea. Thank you so much for your time, Cassie. It was really great having you on the show.

Cassie Kozyrkov: Thank you, Richie. It was really a delight to be here.
