Are Spreadsheets Still Relevant For Data Analysis? with Jordan Goldmeier, Author of Data Smart

Jordan is an entrepreneur, a consultant, a best-selling author of three books on data, and a digital nomad. He started his career as a data scientist in the defense industry for Booz Allen Hamilton and The Perduco Group, before moving into consultancy with EY, and then teaching people how to use data at Excel TV, Wake Forest University, and now Anarchy Data. He also has a newsletter called The Money Making Machine, and he's on a mission to create 100 entrepreneurs.

Adel is a Data Science educator, speaker, and VP of Media at DataCamp. Adel has released various courses and live training on data analysis, machine learning, and data engineering. He is passionate about spreading data skills and data literacy throughout organizations and the intersection of technology and society. He has an MSc in Data Science and Business Analytics. In his free time, you can find him hanging out with his cat Louis.
Key Quotes
I think a successful data mindset is the ability to think critically about the products in data and just the data space. Everyone wants things to be clean. Data plays on this idea of objectivity: these are just the facts. But there are no real facts anymore. There's what you present. Even a sensor that gets a reading directly from a physical measurement still runs through people's idea of what it should say, of its precision. There's an interface between it. So we don't have true facts anymore. So that means that you need to be really skeptical about the things that are presented your way and the decisions you have to make.
Excel does slot in very nicely within data science, and many data scientists are, at their own expense, not including it as one of the skills that they should offer. So for me, assuming that it does continuously evolve, the way Excel really slots in, as we talk about in this book, is creating algorithms where the components are actually very visual and very seen and they have formulas associated with them. So you can really understand the logic and you can walk someone through it. If you were going to start a data science project, and you had to communicate it to leadership, you can come in with all this Python code, you can run a Jupyter notebook, and maybe that works for your next level manager because they were you five years ago, but for their manager, their manager's manager, they just want the facts.
Key Takeaways
Despite the availability of advanced tools, Excel remains a valuable asset for data scientists, especially for data cleaning, prototyping, and presenting data to non-technical stakeholders.
When presenting data science projects to executives, Excel’s general familiarity can bridge the gap between technical and non-technical audiences, facilitating better understanding and buy-in.
Modern Excel features such as dynamic arrays and lambda functions can significantly enhance your ability to handle complex data tasks and perform operations more efficiently.
Transcript
Adel Nehme (00:40):
Jordan Goldmeier, great to have you on the show.
Jordan Goldmeier (00:43):
It's great to be here.
Adel Nehme (00:44):
Thank you so much for coming. So I'm going to start off by saying that in many ways your book Data Smart broke a lot of the preconceptions that I had about Excel. There's a trope in the data space that data scientists will refuse to do anything in Excel no matter how ea...
Jordan Goldmeier (01:14):
Well, I think that the way Excel presents itself in data science is continually changing. If you think about the history of Excel first, your choice was Excel or MATLAB. Your choice was Excel or S or JMP or something like that. So either you did a lot of work in another coding language or you tried to make it work in Excel, and your computer's freezing and overheating and catching on fire, but sometimes it works. So I think what happened was all these new open source technologies came about, so Python and R, and there was still, in executive-level leadership, this idea that we always need Excel. So this developed a lot of hatred from data scientists who were forced to use this product, who were like, this is not the right way to do this.
(02:03)And having lived on both sides of that, I think that Excel does slot in very nicely within data science, and many data scientists are, at their own expense, not including it as one of the skills that they should offer. So for me, assuming that it does continuously evolve, the way Excel really slots in, as we talk about in this book, is creating algorithms where the components are actually very visual and very seen and they have formulas associated with them. So you can really understand the logic and you can walk someone through it. As well, if you were going to start a data science project and you had to communicate it to leadership, you can come in with all this Python code, you can run a Jupyter Notebook, and maybe that works for your next level manager. They were you five years ago. But for their manager, their manager's manager, they just want the facts and they want something they can click on that makes sense. And even if they are programmers, chances are the same. I have my own company, I have staff, so I'm a CEO, and I tell them all the time, just the facts. I only want the facts. And so Excel gives you just the facts if you do it right.
Adel Nehme (03:13):
Okay, that's really great. And we're definitely going to uncover a lot of the use cases that could be relevant for data science and Excel. I mentioned that trope of data scientists not wanting to use Excel at all, but there's also another really popular meme in the data space of Excel literally carrying the world's GDP on its back, right? So Excel is widely used by a lot of folks. It has improved quite a lot as a tool. Why do you think Excel still has a bad rap today despite all of these improvements? Expand on that notion a bit.
Jordan Goldmeier (03:44):
Sure. So I'm just going to throw the book up so we have a visual there.
Adel Nehme (03:50):
It's really great. It has a spreadsheet vibe and everything. The cells.
Jordan Goldmeier (03:52):
Yeah, totally. Right. Yeah, I always wanted to do this for the cover of a book and then finally I was able to.
Adel Nehme (03:59):
Yeah, it's really great.
Jordan Goldmeier (04:01):
I like to compare the use of Excel to the TI-83 calculator. I guess it's the TI-83 Plus now. I don't know what the kids are using. All right, maybe they're not even using that one anymore. During those years when I was a kid, I'd hang out with these tech nerdy people who were coding. They had the jobs I wanted. I'd go to synagogue, ask them a million questions, and when I talked about the TI-83 Plus, they went on this thing about how Casio makes the better calculator. That may be true. There may be products that do what Excel does better, but the way tech works in our world is it's not about what we idealize, it's about what people are using. So we can debate the degree to which Microsoft made people use Excel or took them away from other products, but as you said, it does run the world's financial system, for better or for worse.
(04:53)I can sit here and list all the things that it does well, and you can list the things you don't like. It's not like you're wrong, but it doesn't change this fact. So I think if you are an analyst, and you can send your hate mail to Jordan net, anarchy.com, you need to know Excel. As a data scientist in the career field, the data scientist sits above the analyst on the ladder. But the truth is, in the broad sense, you're an analyst, okay? This is what you do. You use your brain to analyze this stuff. You need to know Excel. I mean, if you work for a big bank, you may be right about all these products and this and that, but over the years they've been using Excel. And so even though I love VBA, the macro coding language behind Excel, for those who don't know, I wouldn't recommend people use it for most problems. But the entire banking system is run on it, and there are cottage industries to improve that, but it wouldn't change the fact that a lot of things that are important are run on it and people don't want to change those things quickly.
Adel Nehme (05:52):
Okay. And you mentioned earlier in our discussion that there are quite a few use cases where Excel could be very relevant for data scientists. One is where you have to build systems or algorithms where it's very visual. One is when you have to show a prototype to leaders. Maybe walk us through these use cases in a bit more detail, and walk us through how Excel could potentially work with other popular tools in the data space, such as R and Python.
Jordan Goldmeier (06:15):
For sure. So it's not like I'm an Excel maximalist on this. There are certain things you have to do elsewhere, even things that we do in the book. The book is more of a textbook to explain all these things. We built a spam detector using bag-of-words with naive Bayes, so a classifier. I wouldn't do that in Excel. I mean, it's not deployable in Excel, but if you wanted to explain to someone how it worked, as a professor, as an analyst, as a manager, Excel is great. And if you didn't want to run the whole entire training set, but rather just needed to explain it to someone using a small amount of data, Excel is great for that. Again, you can't put that in production, but it's good for that. The other thing Excel is great for is data cleaning.
(07:12)If you've used dplyr in R or you've used pandas in Python, it's very similar. But if you had to explain it to someone, again, what you can do in Power Query is actually much quicker than what you can do with dplyr. And I love dplyr and I can crank it out, but again, Power Query makes it a little bit easier to understand what's going on. I can, for my own work, use R, but for my clients, I'm going to use Excel because I'm talking to people who are finance people. They're not coders, but they understand what Power Query does. And then I'm trying to think here, because it's not really about what is right now. Right now in the world of data science, all these other technologies are the better choice, but there will be a point in the future where you could plop something in Excel, send it off to be worked on by some algorithm in the cloud, and get the results right back very quickly. The same with Power BI as well. So again, it's good to know it now, because the difference between these tools right now is very obvious, but I think that we're going to reach a point where they converge.
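Editor's illustration (not part of the conversation): the bag-of-words naive Bayes spam detector Jordan describes is small enough to explain on a handful of rows, which is the point he makes about teaching it with a small amount of data. The sketch below is a minimal Python version under stated assumptions: the four training messages are made up, the function names `train` and `classify` are hypothetical, and Laplace (add-one) smoothing is an assumed choice. It is not the workbook from the book.

```python
import math
from collections import Counter

# Tiny made-up training set (hypothetical examples, not data from the book).
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

def train(examples):
    """Count words per class (the bag of words) and documents per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the class with the highest log-probability score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in word_counts:
        # Start from the log prior: how common the class is overall.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing so an unseen word/class pair is never zero.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(classify("free money", model))
print(classify("monday meeting", model))
```

Each intermediate quantity here (word counts, priors, smoothed likelihoods) corresponds to something that could sit in a visible column of a spreadsheet, which is why the method is easy to walk someone through.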
Adel Nehme (08:24):
It's interesting that you mentioned convergence, because there's also another big elephant in the room here, which is generative AI, right? Even DataCamp has DataLab, which uses generative AI to fix your code and to generate code. We've also seen Power BI having Copilot, where you can generate charts and generate functions in DAX. Do you see that for most analysis use cases we're going to converge on a set of tools that have a GUI, for example, and that leverage generative AI in the backend to come up with insights? Is that the kind of future that you see when you mention convergence here?
Jordan Goldmeier (09:04):
Yes. I still think there are going to be coders. I think it's really going to get the people who are analysts, who are doing analytics, let's say. So they're less code and they're more reporting. For them, it's going to be a boon. And I think that it's going to fundamentally change the way we use Excel. And I have to say this with a sense of nostalgia, because the way I used it was all about automation. And so that way is going away, and now it's going to be more like using Google. If I didn't know how to code something back in the day, I used Google. And people were like, how do you know this? Well, I learned how to Google. Now a next generation of people will learn how to Copilot or learn how to use ChatGPT, and this is going to help them. And I'm going to be left behind. I'm going to have to move into a mentorship role. I'm not looking to master those skills. Rather, I'd rather help people who are mastering them get jobs, get the things that they want, and live the life that they want.
Adel Nehme (10:01):
Yeah. And maybe you can expand on use cases for generative AI in Excel. How do you see generative AI transforming the Excel experience?
Jordan Goldmeier (10:08):
Right now, if you've got Copilot, it's very limited, but you can sort of see what they want. So if you've got the Copilot preview, you can use an Excel table, which, for those who don't know, is actually kind of similar to a data frame in Excel. So not just a range of data, but rather what they call a Ctrl+T table. I used to call it a capital T table. In that, you don't think in terms of cells, you just think row-wise, the same as the way you would in dplyr: just add this to this column and then it runs it down. So right now, if you want to add a new column, you say, hey, add a column that does this. It will work in Copilot. Copilot can do Excel tables.
(10:50)If you type in, can you highlight cell A7, it will work. Can you generate insights? Okay, it will generate insights. Those insights are terrible. They don't do anything. They're just like, this value was noticeably greater than this value. It's like, no shit. I don't know. It's like, this has the most percentage of this. It's not very good right now, but you can imagine a future where you hit a button on your computer and the mic turns on, and you say, Copilot, let's filter this, this, this, and this. That's someone's idea. Boom, it does it. And then someone says, hey, Copilot, save that and do this. So we're getting close to this idea that we tell the computer what to do and it just does it. Excel is going to be a great platform for this, especially for very basic stuff.
Adel Nehme (11:36):
Yeah, I can really imagine, because the visual component here of working with data visually rather than through a coding interface is going to be extremely valuable for Excel. And you mentioned voice, for example: you press the button and the mic opens, right? The interface will definitely be very interesting in the future. Now you mentioned Power Query as well, and how it can be a much simpler way of doing data analysis and data cleaning than dplyr or pandas in certain use cases. Maybe walk us through Power Query in more depth, and how it differs from Excel's VBA or other scripting features within Excel.
Jordan Goldmeier (12:11):
For sure. So Power Query is a data cleaning tool. The best way to think about it, and I usually take people through this in my Power Query classes, is that basically in 2010 or 2013, everyone was concerned with big data, and Microsoft wanted a way to deal with what I call the power, or excuse me, what I call the business intelligence workflow. So if I asked you what business intelligence is, or what analytics is, or data science, or if someone asked me what it was, we'd all come up with different answers. And usually these answers are insight, blah blah, decision making, who cares? But really to me, it's like asking what is love? What are relationships? Meaning the answer might be 42, but most of us are mortals and don't know. So
(13:08)Basically the Power BI, or excuse me, the BI workflow, not Power BI. The BI workflow works like this: you get data, you transform that data, clean it up in some way, and then you present it. So really Microsoft came up with three technologies to do that: Power Query, Power View, and Power Pivot, or really it's Power Query, Power Pivot, then Power View, in that order. So Power Query lets you take data, transform it, move everything around, and it keeps every step for you as you do it. So in dplyr, you would write select and chain all your things, and it keeps the steps. This abstracted that and gave it to the everyday regular person, alright? And then big data happened. And so Power View and Power Pivot actually moved off to another product called Power BI. But this product, Power Query, exists across multiple spaces in Excel, and it lets you take in a data set from just about any source, notwithstanding whether you have to pay for it. Even then, you may have to pay for a connector to that data source. It brings it in. And of course we know data never looks good when we bring it in. So you can just imagine a table with something on the top and something on the side,
Adel Nehme (14:17):
The values and... exactly.
Jordan Goldmeier (14:19):
So what you want to do is, so Power Query lets us knock out all this stuff, take it all and turn it into a data frame. And from a data frame, we can actually analyze it, whether that's doing a report as a pivot table or we take it to another piece of software to actually do the analysis from there.
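Editor's illustration (not part of the conversation): the idea that Power Query "keeps every step for you" can be sketched in plain Python as an ordered list of named transformations that is replayed against the raw data. This is an analogy only, not Power Query or M. The sample rows, the helper function names, and the step labels are assumptions chosen for illustration.

```python
# Raw export with a title row on top and a total row at the bottom (made-up data).
RAW = [
    ["Quarterly sales report", "", ""],
    ["region", "product", "sales"],
    ["East", "Widget", "100"],
    ["West", "Widget", "250"],
    ["Total", "", "350"],
]

def remove_top_rows(rows):
    """Drop the title row that sits above the real header."""
    return rows[1:]

def promote_headers(rows):
    """Use the first remaining row as column names."""
    header, *body = rows
    return [dict(zip(header, r)) for r in body]

def filter_out_totals(records):
    """Remove the summary row so only observations remain."""
    return [r for r in records if r["region"] != "Total"]

def change_types(records):
    """Convert the sales column from text to integers."""
    return [{**r, "sales": int(r["sales"])} for r in records]

# The ordered, named steps are the recorded "recipe" (labels are illustrative).
STEPS = [
    ("Removed Top Rows", remove_top_rows),
    ("Promoted Headers", promote_headers),
    ("Filtered Rows", filter_out_totals),
    ("Changed Type", change_types),
]

def run(raw, steps):
    """Apply each step in order, keeping the intermediate result of every step."""
    history = []
    data = raw
    for name, fn in steps:
        data = fn(data)
        history.append((name, data))
    return data, history

clean, history = run(RAW, STEPS)
for name, snapshot in history:
    print(name, "->", len(snapshot), "rows")
```

Because every intermediate result is kept, any step can be inspected on its own, which mirrors the "this is the next step, and that was the last step" way of reading a query.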
Adel Nehme (14:37):
So Power Query in a lot of ways is a data transformation tool that lets you clean up your data. And if you look at Power BI, DAX is very well known to the audience here. How would you compare it to something like DAX in terms of syntax complexity?
Jordan Goldmeier (14:53):
I think it's easier than DAX. The language behind it is called M, which stands for mashup. It does have a little bit of a learning curve, but in truth, you don't need to know any of it because you can just point and click and it'll write it for you. It's only when you get to really harder things that you have to learn it. But I will say that I used Power Query for about three or four years before I thought I needed to learn M. Now I'm learning M, and it's not that hard to understand once you get the hang of Power Query. It's like, this is the next step, and that was the last step. This is the next step, and that was the last step. Which, again, is dplyr in R.
Adel Nehme (15:33):
Yeah, a hundred percent. And then given here that we're talking about a lot about Excel use cases for data science, and you mentioned Power Query, which I think may be something the audience hasn't heard of before. What are other strong Excel features that people may not be aware of that you use to operationalize a lot of the use cases mentioned in the book?
Jordan Goldmeier (15:52):
Well, the big thing that this book does that the last book could not do and did not do is use dynamic arrays. So this is a major change to Excel, which allows us to reference cell ranges that are not just one cell but can automatically grow to other lengths. So that's one of the things. Now, I didn't really do a whole lot of Lambdas in the book. I just talk about Lambda, but you can think of a Lambda as like an lapply in R, or I know other languages have lambdas. Basically you can create a function out of nothing very quickly, and you can use Excel formulas to do that. And so this opens the door to things like recursion. So if you have an algorithm that requires recursion, and it's not just data science, computer science too, right? This is very easy for it to do.
(16:39)If you wanted to fake recursion before in Excel, you had to deal with circular references, which was just a matter of how many times it circles until you give it a threshold to stop. So the dynamic array thing changes the game, because it was also very hard and clunky to do before. And although I won't say it's not clunky compared to writing it in code, with dynamic arrays, where before you had to write all this and drag down and change that, the way I've set it up is you type in one formula and, boom, it comes up with your covariance matrix, right? So I made it a lot easier. The other function is the LET function. With the LET function, you can declare variables in your formulas. So that's a lot better too. The thing that has not improved is optimization.
(17:27)So it's good to optimize in Excel. You can show people how the optimization works, but if I had a choice, I would use the algorithms in another program. It is just very confusing when you run these optimizations. And again, is the linear optimizer in Excel better than the choices you get in packages you could get access to if you use other languages? I don't think so. I think those are better. But in terms of these other things that I've mentioned, just building out the infrastructure of your tables using dynamic arrays, it's so much easier than it used to be, and intuitive.
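Editor's illustration (not part of the conversation): for readers who know Python better than Excel, the three ideas in this answer (a function that calls itself, named intermediate values inside one expression, and one call that returns a whole matrix) can be sketched as below. These are loose Python analogies to recursion with LAMBDA, to LET, and to a dynamic-array formula that spills a covariance matrix. The function names and numbers are hypothetical, and none of this is Excel syntax.

```python
from statistics import mean

def factorial(n):
    """Recursion: the function calls itself until it hits the base case."""
    return 1 if n <= 1 else n * factorial(n - 1)

def margin_pct(revenue, cost):
    """Named intermediate value, in the spirit of declaring a variable once."""
    profit = revenue - cost  # defined once, reused below
    return profit / revenue * 100

def covariance_matrix(columns):
    """One call returns the full matrix (sample covariance, n - 1 divisor)."""
    n = len(columns[0])
    means = [mean(col) for col in columns]
    return [
        [
            sum((a - means[i]) * (b - means[j]) for a, b in zip(columns[i], columns[j])) / (n - 1)
            for j in range(len(columns))
        ]
        for i in range(len(columns))
    ]

# Made-up numbers for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

print(factorial(5))
print(margin_pct(200.0, 150.0))
print(covariance_matrix([x, y]))
```

The covariance example uses the sample (n - 1) divisor as an assumption; a population version would divide by n instead.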
Adel Nehme (18:06):
Yeah, that's really great. And it kind of circles back to what we discussed when we set the stage on how Excel is relevant or not relevant for data science. Maybe to take that question a notch further, how would you advise data scientists to start learning Excel, and what use cases should they start with? If I'm a data scientist working in a bank right now, when should be the next time I consider using Excel?
Jordan Goldmeier (18:33):
Well, I mean one obvious use case for me is dashboards. So you could use Shiny in R. Is that still a thing? I guess I haven't coded in it a lot.
Adel Nehme (18:41):
Yeah,
Jordan Goldmeier (18:42):
Python's got a similar thing. I can't think of the name. Dash, okay. Yeah,
(18:47)You could do that. That's one way to go. It turns it into a website. I think doing it in Excel is very useful. I think it's even quicker if you do it in Excel, and it exposes the data a lot more easily. It's not like I have to create a table. I just right click on a chart and ask, what's the data? And then it takes me to the guts really quickly. So I think that it's great for dashboards, if that's something you want to start with. I also think that you should start learning Lambdas, because that's going to fulfill your desire for those things if you're more of that sort. With a Lambda, like I said, you can run a specific function on every row, you can do running totals, you can do a lot of what you could do in these other languages, and it's very flexible. And I think that would satisfy many people that it is a data science tool. And then same thing with LET: you should learn the LET function, as well as Excel tables. Excel tables will make your life a lot easier. You'll just see it. If you use R, or RStudio I should say, you can't really interact with a data frame the same way you can in Excel with an Excel table.
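Editor's illustration (not part of the conversation): the two Lambda uses mentioned here, applying a function to every row and computing running totals, look like this in Python. In Excel these roles are roughly played by helper functions such as BYROW and SCAN combined with LAMBDA; the Python below is only an analogy, and the sample numbers are made up.

```python
from itertools import accumulate

# Made-up monthly sales figures.
sales = [100, 250, 75, 300]

# Running total: each output is the sum of everything up to that point.
running_total = list(accumulate(sales))

# Apply one function to every row of a small table.
table = [
    [1, 2, 3],
    [4, 5, 6],
]
row_sums = list(map(lambda row: sum(row), table))

print(running_total)
print(row_sums)
```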
Adel Nehme (20:02):
Let's maybe pivot a bit in our, we were talking pivot here, no pun intended. Maybe let's pivot in our conversation and talk about that prototyping use case. Right? I love the prototyping use case because I think the visual interface of Excel shows the power to bridge the gap between a data professional and a non-data professional and an executive to illustrate the power of a use case. Because we always know that a common thing in data science that you may have the best algorithm in the world, if you're not able to convince an executive to get it operationalized, it's not going to drive any value. So I would love if you could comment on the role Excel plays in gaining executive buy-in on data science use cases, illustrating the concepts that have a lower barrier to entry for non-technical profiles and how it can be used as a tool to scale data literacy potentially within the organization.
Jordan Goldmeier (20:53):
Absolutely. And I just want to add that Copilot is part of that data literacy idea: you type it in, and then it even shows you what it's doing. And so in terms of Excel and using it with non-technical professionals, or even technical professionals, obviously you can build something very quickly and you can show it to them. You can use your Excel as PowerPoint. You can show them exactly what you're doing and give them a very constrained way of moving dials and things like that. Now, a lot of people will still say, well, there are other ways to do that and there are better ways to do that. But what they don't realize is that in most organizations there's an Excel test: if we can't do this and understand it in Excel, we don't want it, because everyone has Excel. Microsoft, I can call them up and say, give me Excel.
(21:43)Okay, here you go, right? Can I do that with, we're going to run our own R server? Well, I'm sure there are easier ways to deploy that now compared to when I used to code. But the implementation of these other technologies is a lot more complicated. And at the end of the day, your algorithm is great, but to deploy it, you need to be frictionless. And part of being frictionless is giving it to management in a way, getting them to understand it in a way, that has no friction. This is what it is, this is how we can understand it. Does it pass the Excel test? A lot of people don't understand it because it's tacit. They think they're going to argue. They think that they're going to come in and show how smart they are, and this is the best way to go, this is what to do. But it's like, we have an infrastructure based off of Microsoft. This is a 30-year-old tool. It's the most popular tool in the world, the most used application in the world. If you can't pass this test and we've got to get something else, we don't want it.
Adel Nehme (22:38):
I couldn't agree more, especially when you think about a lot of the time when you mentioned passing the Excel test and being able to get executives to buy in on what the concepts are. Maybe can you give me some use cases of ways that you do that in the book, for example, of ways to illustrate a technical concept to an executive using Excel that you think works really well to illustrate this concept that we're discussing here?
Jordan Goldmeier (23:04):
I think that I should also add that this is more some of the content of Becoming a Data Head, which is more about how to communicate technical topics to non-technical people. So the way to fuse that in Excel is just to use things like charts and graphs and data visualizations that convey these concepts very quickly. I think, though, what's not being said is that to make someone make a decision, what they need is simplicity. They need to understand this concept very easily. They need transparency. They need to know that if someone needed to take it apart, maybe it's not them, they can know what they're doing. And they need just the facts. Here are just the facts. This is all that's important, and that's it. I don't want to feel overwhelmed. So I think that Excel is great in its limitations. It forces you to think about what is the most important stuff. And then there's just one last thing that's non-Excel related that people need, which is you need to appeal to people's emotions. I mean, that's just a fact. As data scientists, we don't want to hear that. We love facts. But if you want to make someone make a decision, you need it. Commercials don't give facts. They appeal to your emotions around a fact.
Adel Nehme (24:14):
And maybe extending that even a bit more, because data storytelling is such an integral part of driving impact with data science. What are some of the best practices that you advise? You mentioned appealing to your emotions. How do you best approach that as a data scientist? What's your advice here?
Jordan Goldmeier (24:29):
So my advice is, first of all, lay out all the facts that you want to say. Get it all out. This is how I write my books. I get it all out. When we turned in Data Head, it had 400 pages, and then I chopped it down to 215. For my co-author, Alex, that was the first time he'd really seen me do that. But that's how I operate. So get it all out and then start looking at what are the most important things, what can I combine? I mean, think of yourself as running principal components on it. What are the key components of all these facts that basically explain most of them? So that's what I would do to start out. And then to construct a story, it's really not that hard. We overcomplicate this. So beginning, middle, and end: here's how we started, here's what we tried, here's the result. And if you want to get more advanced, you run it as up and down. So we tried this project, it didn't really work, but then we figured this out, middle point,
(25:29)And then we kept running this way until we ultimately realized the problem. Third act break. So what are we going to do now? Well, now we realize this is our end goal and this is what's going to take us home. You can run plus, negative, and then end on a plus. Alright? Or you end on a negative. That's the essence of that. I haven't put that into a full course yet because I'm still putting ideas together, but plenty of people have written about it. There's a great book Nancy Duarte, I think, wrote. I can't think of it right now. I don't want to Google.
Adel Nehme (26:03):
We'll definitely add it in the show notes. We can mention it. And you mentioned here your book Becoming a Data Head. It's actually a great book. I highly recommend that everyone reads it. That book really touches upon the underlying concepts needed to develop a data mindset and think like a data professional. Maybe expand on what a successful data mindset looks like here, Jordan.
Jordan Goldmeier (26:28):
Well, I mean lots of people will probably disagree with this. I think a successful data mindset is the ability to think critically about the products in data and just the data space. Everyone wants things to be clean. Data plays on this idea of objectivity. Oh, these are just the facts. But there are no real facts anymore. There's what you present. Even a sensor that gets a reading directly from a physical measurement still runs through people's idea of what it should say, of its precision. There's an interface between it. So we don't have true facts anymore. So that means that you need to be really skeptical about the things that are presented your way and the decisions you have to make. We take for granted, for instance, when someone gives us a study using a t-test, right? 95% is the usual confidence level and we take that for granted.
(27:23)So I mean one in 20, are you comfortable with one in 20 results being potentially random that it could look like it's correct, but it's not saying this in the most data science statistical way. But I think people, I'm saying, are you comfortable with that one in 20? We don't think about that. It's like, oh yeah, it passes. It's good. I mean, for me, it's just a matter of being skeptical. It's about understanding that you're not going to save the world with your data project and really realizing that even though we hold data scientists up here, we work in teams. So you need to be able to understand how your influence and what you want to try to get done spreads across that team. I know in the back of the book we talk about different scenarios of data, project failures as a result of lack of communication. And anyone who reads that is going to be like, oh my god, that's me too.
(28:15)And I just want to add one last thing. We got a recent review on the website or on Amazon I should say, that said, this book takes a really negative view. Nothing is good about ai, and they gave us one out of five stars. So if you want a book that praises AI and talks about how data's great, it's not that book. If you want a realistic assessment that's going to help you become a manager because you didn't make the wrong mistake because you put too much faith in your own abilities or the abilities of others. It's a great book for you.
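Editor's illustration (not part of the conversation): the "one in 20" point about the 95% confidence level can be checked with a small simulation. The sketch below repeatedly compares two samples drawn from the same distribution, so any "significant" result is a false positive. To stay within the Python standard library it uses a two-sample z-test with a known standard deviation of 1 rather than the t-test Jordan mentions; that substitution, the sample size, the number of trials, and the function name are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

def false_positive_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Share of null comparisons that look 'significant' at the given alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    hits = 0
    for _ in range(trials):
        # Both groups come from the same N(0, 1), so there is no real effect.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        # z statistic for a difference in means with known sigma = 1.
        z = (mean(a) - mean(b)) / (2 / n) ** 0.5
        # Two-sided p-value from the standard normal distribution.
        p = 2 * (1 - std_normal.cdf(abs(z)))
        if p < alpha:
            hits += 1
    return hits / trials

rate = false_positive_rate()
print(f"Share of null experiments flagged as significant: {rate:.3f}")
```

With alpha set to 0.05, the printed share should land near 5%, which is the one in 20 that a passing result quietly accepts.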
Adel Nehme (28:41):
And maybe you can walk us through what the book tries to say about AI that you think the community, or some members of the community, would disagree with here.
Jordan Goldmeier (28:49):
Well, I mean, we wrote the book before generative AI became such a big thing. So there are some mentions of generative AI in the potential future, but clearly it's taking over our businesses in a way that this book does not describe. Although I wouldn't say that. I mean, we were never trying to be that. So honestly, I think by and large people will not disagree and they'll find humor in the irony of what data promises versus what their life really is in working as an analyst. But I do think people might disagree that we talk about neural networks as just being one giant formula, and this makes them not as special as we think. They were born out of a time when we kept saying that the brain was like a computer. And in some ways it is, but the way that a neural network works is a metaphor for how our brain works and maybe approximates the randomness of it or just the little that our neurons make, but it's not our brain.
(29:43) And a neural net will not become a brain yet. And I think this, if you were hired as an AI person, or if you felt your career was going away because you planted your flag on the internet of things, and now you're going to plant your flag on generative AI and you get to go to all these conferences and they talk about how great it is and you write a report to your CEO, but at the end of the day, they're just trying to keep you out of projects too optimistic about, but you're some emeritus status. You probably don't want to hear this.
Adel Nehme (30:14):
Not
Jordan Goldmeier (30:15):
Higher critique.
Adel Nehme (30:16):
I mean, definitely, everyone do read that book. It's a great one. And you mentioned as well projects failing because of lack of communication. I think such an important part of developing a data mindset is the ability to communicate with others by using a common data language. So maybe walk us through these common failure patterns that you've seen and how best to avoid them.
Jordan Goldmeier (30:39):
One of my favorite patterns that we talk about in the book is the telephone game. So the idea is that an analyst looks at a bunch of survey results and says, oh, 75% of people said that they would rebuy our product. Okay, 75% of people said they would rebuy our product. So they're at a manager meeting and they say, well, early analysis shows 75% of survey respondents want to buy our product again. So over time, this idea morphs into something that it's not. So the analyst, she goes back, I think it was a she in this case, in the book, she goes back and looks at the data and sees, oh, well, only 8% of respondents actually filled out that information and said that they would want to buy the product again.
(31:30) So that really doesn't look very good. But that little factoid takes on a life of its own. So suddenly at a manager's meeting, she hears someone say, and guess what, we have a 75% rebuy rate based off of the analysis of our data scientist. And that's not what the data was. It was eight respondents said that they would think about or that they would buy the product again, but they haven't actually done it yet. So that's an example of just how these things don't communicate well. And then the other thing is we use terms like average, and we use average to mean a lot of different things. So if someone said, well, the usual way it happens is, are they talking about the mode? I mean, that feels usual to me. Are they talking about the average? Well, you can make a case for that.
(32:24) Are they talking about the median? Maybe less of a case, but what does it really mean? Are we talking about the upper quartile? We don't actually have a definition for those things. So when it comes to talking about data, what we say in the book is: say average or mean. That's okay. If you say average or mean, the average refers to the arithmetic average. Words like typical, usual, sometimes, often have very wiggly meanings. And so we want to stay away from those words. And one last thing I want to point out is that people often misunderstand. Even smart people will say something like, well, if the average person is this, then 50% of the population is below average. Think about that, right? But that's not average. That's median.
Adel Nehme (33:07):
That's median, yeah.
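The difference between "below average" and "below the median" is easy to check with a small Python sketch. The order values below are made up purely for illustration (one large outlier skews the data) and do not come from the book or the episode.

```python
from statistics import mean, median, mode

# Hypothetical, made-up order values with one large outlier (illustration only).
order_values = [10, 12, 12, 13, 15, 18, 20, 400]

avg = mean(order_values)          # arithmetic average
mid = median(order_values)        # middle value of the sorted data
most_common = mode(order_values)  # most frequent value

# Share of values strictly below each measure.
share_below_mean = sum(v < avg for v in order_values) / len(order_values)
share_below_median = sum(v < mid for v in order_values) / len(order_values)

print(f"mean={avg}, median={mid}, mode={most_common}")
print(f"below mean: {share_below_mean:.1%}, below median: {share_below_median:.1%}")
```

With skewed data like this, most values sit below the mean, while only half sit below the median, which is why "50% of people are below average" only holds when average means median.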
Jordan Goldmeier (33:08):
And I've talked to people about this and someone said to me, well, in my school we learned, we call these the measurements of average. I don't know, but he said that mean, median, and mode were all measures of average, is what his instructor called it. Okay. I don't know. I mean, I've never heard of that, but let's say it's possible. Well, all the more reason that you should be specific in your words and create common language within your organization. I mean, we also see this in machine learning. Precision and recall have a bunch of different names. In statistics, type one and type two error. You will never hear me say those words. Okay.
Adel Nehme (33:45):
Bias. Bias is used in so many different use cases.
Jordan Goldmeier (33:49):
Absolutely. I want to stick with the type one and type two real quick. I think this was named by a lazy protein. False positive, false negative. That's it. Those are the words I will use. I use type one or type two,
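For readers who want the terms side by side: a type one error is a false positive and a type two error is a false negative, and precision and recall are built from those same counts. The sketch below uses hypothetical counts and hypothetical function names, purely to show the definitions.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Of everything flagged as positive, the share that really was positive."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Of everything that really was positive, the share that got flagged."""
    return true_pos / (true_pos + false_neg)


# Hypothetical counts from a made-up classifier (illustration only).
true_positives = 40
false_positives = 10   # "type one" errors
false_negatives = 20   # "type two" errors

print(f"precision = {precision(true_positives, false_positives):.2f}")
print(f"recall    = {recall(true_positives, false_negatives):.2f}")
```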
Adel Nehme (34:02):
Don't even. That's a bit jargony. And then maybe as we're discussing these pitfalls, Jordan, how do you grow that data mindset? What are ways that you can build it in yourself and your team and your wider organization? How do you build that?
Jordan Goldmeier (34:17):
Well, the official response that I will give you is that you read my book and you set aside time. Maybe once a week you discuss a chart or you discuss something, and nobody's wrong. You just sort of try to tear it apart and think together as a team: what are the critical things we're missing? And it'll become very clear what everyone's good at. I think some people, it's hard to get them to agree and they're just going to argue about everything, but maybe they're the person to have there and you just don't take it personally, because they're going to. But even if they do get hooked on things, which they often do, I think that there are people who have led small teams who are able to bring this critical thinking mindset. That's the official answer. The less official answer is that I've read Carl Jung. I don't think people change. We think that they do, and that there is no true answer, and that we have to accept people's personalities as they are. And it may just be a matter of time, because the next generation works in data and they just may be better at it than we are.
Adel Nehme (35:20):
Let's just say we'll find out in the future. But I like the official answer and the unofficial answer. And then as we close out, Jordan, at what point do you reach a limit in working in Excel and need to jump to other tools?
Jordan Goldmeier (35:36):
The easy answer for that is you'll know it. You'll know when you see it and when you hit it. I can't say specifically. I think that for me it's just a matter of, I sit down and I'm like, this is too hard, I'm going to move on to another language, and sometimes I'll move back to Excel. I think that when your data is at risk, when your work is at risk of crashing, you hit a run button and it goes white, and maybe if you wait 10 minutes it will likely go, but you don't want to risk that because this is critical stuff, then it's time to move. And if you have to set your calculation mode to manual, I don't like doing that, turning it on and off. If it can't stay automatic in Excel, then it's too big for Excel. But a lot of this stuff just can be optimized. So if you take my online class, and you can read in my books, that we do things very optimized so that we're not, well, I mean I won't get into it too much, but for those who have a computer science background, there's big O notation for sets of instructions, and you can apply that in Excel. You can think about optimizing your work so that you're not doing a hundred VLOOKUPs, you're doing one.
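The "one lookup instead of a hundred" idea can be sketched in Python as an analogy. This is not a model of Excel's internals, and mapping it to a single MATCH feeding many INDEX calls (instead of one VLOOKUP per column) is an interpretation, not something spelled out in the episode. The table, key names, and helper function are all hypothetical.

```python
def find_row(keys, wanted):
    """Linear scan for an exact match; returns (row index, comparisons made)."""
    for i, key in enumerate(keys):
        if key == wanted:
            return i, i + 1
    raise KeyError(wanted)


# Hypothetical table: 1,000 rows and 100 value columns (illustration only).
keys = [f"id{n}" for n in range(1000)]
columns = [[n * c for n in range(1000)] for c in range(100)]
wanted = "id999"  # worst case: the key sits in the last row

# Approach A: repeat the full search once per column (like 100 separate lookups).
comparisons_a = 0
row_a = []
for col in columns:
    i, cost = find_row(keys, wanted)
    comparisons_a += cost
    row_a.append(col[i])

# Approach B: search once, then reuse the row position for every column.
i, comparisons_b = find_row(keys, wanted)
row_b = [col[i] for col in columns]

print(f"per-column search: {comparisons_a} comparisons")
print(f"search once:       {comparisons_b} comparisons")
```

Both approaches return the same values, but the first repeats the search for every column, so its cost grows with rows times columns, while the second pays for the search only once.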
Adel Nehme (36:43):
That's pretty great. And maybe as we close out, Jordan, what is one final piece of advice that you would give to practitioners or leaders listening in on the podcast today, that they can take away for their next week of work?
Jordan Goldmeier (36:56):
I think the big takeaway for me is that these changes that we want, it may seem like, well, you install this software and your life gets better. You do this with your team and life gets better. But in truth, it's more like the way your team operates right now, whether it's data or anything else, it's like a ship. You can move the rudder a little bit, and in a few years it will have changed the direction dramatically. But you can't do that tomorrow unless you start tomorrow. So my view is, if you found something useful here, pick one thing that you want to try to fix and make that the focus of your group or your team. And when you've netted that win, pick one more thing and just keep growing it. But don't try to do it all at once. It won't work.
Adel Nehme (37:48):
I couldn't agree more. Thank you so much, Jordan, for coming on the podcast.
Jordan Goldmeier (37:52):
For sure. And I'm just going to show this
Adel Nehme (37:54):
Just yeah, definitely do show the books again.
Jordan Goldmeier (37:57):
Yes. If you guys want to find these, amazon.com is the best way to get them, also in ebook format. And of course, if you want to find me and ask me questions, the best way to get to me is LinkedIn.
Adel Nehme (38:09):
Everyone definitely get Jordan's books. They're really great books. Highly recommend them. Thank you all so much. Thank you Jordan, for coming on the podcast.
Jordan Goldmeier (38:18):
Absolutely. Thanks for having me.