Making AI Applications like Greased Lightning with William Falcon, CEO at Lightning AI
William Falcon is an AI researcher and the CEO of Lightning AI. He is the creator of PyTorch Lightning, a lightweight framework designed for training models of any size. As the founder of Lightning AI, he leads the development of Lightning AI Studios and the AI Hub. Falcon also shares his expertise in AI research and machine learning engineering through educational content on YouTube and X (formerly Twitter). He is passionate about leveraging AI for social impact.

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.
Key Quotes
For AI to really take off, the focus has to be on how to deliver a business outcome to people, not what tools to build, right? People don't necessarily care so much about the tools, they care about what it lets you do and what the outcomes are of using those tools, right? And this is coming from a tool builder!
I think that models should be open source. The thing that's interesting about the models is not the model itself, but what you do with the models, right? So you don't regulate. You don't regulate technology, you regulate applications of technology. That's what the government really should be focused on.
Key Takeaways
AI success hinges on delivering tangible business outcomes rather than just building tools; focus on how AI can solve domain-specific problems to create real value.
Software developers, even those without AI expertise, can leverage platforms like Lightning AI to build AI products, emphasizing the importance of software engineering skills in AI development.
Organizations should focus on improving data quality and context to enhance AI performance, as better data often leads to more significant improvements than tweaking models.
Transcript
Richie Cotton
I want to pick up on something that you said at a meeting I attended. So you said that you think artificial general intelligence is going to come out of New York, not San Francisco. Can you talk me through why you think that?
William Falcon
I think that in New York you have people from different backgrounds who are coming at problems from different perspectives. You have lawyers, you have doctors, you have marketing people, etc. And I think that variety, that breadth of viewpoints, is what will create the next generation of foundational companies.
William Falcon
Right. And to some extent, something like a Harvey is a good example of that, where it's lawyers who are involved, doing something very specific. Not that there are no lawyers in San Francisco, but in terms of scope, people there tend to think more uniformly about what they build, and it tends to be solving their own problems.
William Falcon
And most people there are in tech. So that's kind of the issue that I see there.
Richie Cotton
Okay. So there's a difference between just thinking about the technology side of things versus thinking about the domain side of things, and then how the technology can help the domain.
William Falcon
Yeah. Because I mean, for AI to really take off, the focus has to be on how to deliver a business outcome to people, not what tools to build, right? People don't necessarily care so much about the tools; they care about what it lets you do and what the outcomes are of using those tools. And this is coming from a tool builder!
William Falcon
Right. But we focus a lot on how we can enable you to build products.
Richie Cotton
Okay. That's interesting. Do you want to expand on that? How do you get closer to solving domain problems with Lightning?
William Falcon
Yeah. I mean, there's a journey that people go through from idea to a real product in production, and there's a path. It always follows something like the ML lifecycle. You first collect data, you label it, you prepare it — there are a bunch of things you do there — then you might run some sort of distributed workflow where you do embeddings, and at some point you end up with a clean dataset, I guess.
William Falcon
And now you need to decide: am I going to fine-tune a pre-trained model? What approach am I going to take there? And sometimes you might not — you might just start with an API and then use RAG anyway. So there are so many decisions along the way. And ultimately I think where most people land is at a POC deployment, something where they're like, wow, look at this cool thing that works for like seven people.
William Falcon
But the real question is, how do you get something that's actually live in production and really reliable? And so the kind of approach that we have is, we have the AI Hub, which is like the end of the journey. You don't have to go through all of that if you just want something that is an agent or an expert, or if you want a fully built RAG system, or you want an actual vertical application. And it's funny, because think about what it actually takes to build that.
William Falcon
So let's say something like a Harvey. That's an end application, right? On the Hub you can find those kinds of things. They're very specialized to a domain. You just hit one, click deploy, and then you've got that. But to build something like that you usually need a workflow, which is kind of where most ML companies are sitting.
William Falcon
They're like, well, we'll do it with RAG, we'll do it with fine-tuning. There are like 20 million different approaches for how to do it. So those workflows are things you can find on the Hub as well, and they kind of give you that final step. And the workflows themselves are made up of atomic pieces — models, vector DBs, etc. — which you can also find as atomic things on the Hub.
William Falcon
And then if you don't like what's on there, we have a whole platform of tools that you can use to build those things yourself, which is Studios. You want to train a model, go for it. You want to embed some data, go for it, whatever it is. There's a suite of tools. So that's kind of the approach we take, and our goal is to really solve that last-mile problem for people.
William Falcon
And I think that's where a lot of the value lies.
Richie Cotton
Wonderful. So basically you're saying that a lot of the hard part is just getting these things into production and getting that end-user application. Is that the biggest challenge with development, do you think?
William Falcon
Well, it's hard once you've already decided what problem you wanted to solve, you had the creativity to decide how to apply it in your enterprise, then you hired the people who could do the work, then those people decided what tools to build and what techniques to use. And yes, if you had all that, then the hard thing is getting into production.
William Falcon
But like there was so much hard stuff before that.
Richie Cotton
I want to pick up on something you said there. There are lots of different types of tooling. So you've got the end-user applications, but for developers there are different layers of things. You've got workflow tools, you've got tools for RAG, you've got tools for fine-tuning. There are so many different tools. I try to follow the space.
Richie Cotton
I can't keep up. Can you give me an overview of what all the different tasks and tools are, and what the tech stack looks like?
William Falcon
I want to start with the outcome, the solution. So for example, if you go to the Hub you're going to find something called the stock research expert — or sorry, stock research agent. You can literally give this agent a ticker and it'll go do research on that stock and give you a report. That's the kind of thing you'd be building if you were at an investment bank.
William Falcon
Okay, so that's the ultimate application. That in itself is potentially a whole company, where all you do is make that thing really, really good. How did you get to that? Well, you go one level down. You take a workflow — like a fine-tuning and deployment workflow — that takes some model, fine-tunes it on your data, deploys it, and then teaches it how to use tools.
William Falcon
So that second level is something that you can build on the platform with the tools you have. We have a way for you to create, for example, a job that does fine-tuning, and then a way for you to create a different thing, which is a deployment based on a Docker image. These are usually two individual things where you'd need two different products to do them.
William Falcon
With us, it's really a single platform that will let you do both at once, with the same workflow — it's very, very straightforward. So that gives you the workflows. And then if you wanted to build each particular piece of that, for the fine-tuning you might actually need something like data generation.
William Falcon
And so on. So ultimately, when you're building those particular — sorry, it's really confusing because we're talking about so many layers here — when you're building those, I guess you can call them atomic pieces of an AI system, you're going to need a set of tools. And it all starts with the Studio.
William Falcon
A Studio is like a cloud development environment. Think of it like having your VS Code set up, but it's not on your laptop, it's in the cloud, and you've got GPUs there. Whatever you could do on your laptop, you can do there. So if you say, look, I'm fine-tuning a model, you're going to need some GPUs, okay?
William Falcon
So you get GPUs — it takes a few seconds. Now you can clone your GitHub repo, or whatever you're going to do, and start developing, testing, iterating, swapping GPUs if you want to test a different type of GPU. And it's going to persist, so you're not going to lose your progress.
William Falcon
You don't have to install anything again or whatever. So you can do a lot of the iteration, the development side of it, there. Then you say, okay, I'm ready to train the model, and from there you package it up and submit it as a job, and it goes and trains. And then you say, okay, now I need to deploy the model.
William Falcon
So you start a separate Studio, and in there you set up your model checkpoint, you set up your server, you debug it, you run the server live. You can hit it because it's on the web — it's got ports — so you can really load test the server and do all that stuff. Then you ultimately package that up into a Docker container, ship it, and it gets deployed serverlessly as well.
William Falcon
So really, no matter what the workflow is, the Studio is usually the base foundation for it.
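That "hit it because it's on the web" step — load testing a live dev server before packaging it up — might look roughly like the sketch below. The endpoint URL, payload shape, and request counts are hypothetical placeholders, not anything specific to Lightning.

```python
# A minimal load-test sketch against a hypothetical local model server.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

ENDPOINT = "http://localhost:8000/predict"  # assumed dev server address
PAYLOAD = {"prompt": "Summarize AAPL's latest earnings call."}  # placeholder request

def one_request(_):
    start = time.perf_counter()
    resp = requests.post(ENDPOINT, json=PAYLOAD, timeout=60)
    return resp.status_code, time.perf_counter() - start

if __name__ == "__main__":
    # 200 requests with 16 concurrent workers -- arbitrary illustrative numbers.
    with ThreadPoolExecutor(max_workers=16) as pool:
        results = list(pool.map(one_request, range(200)))
    latencies = sorted(t for code, t in results if code == 200)
    print(f"ok: {len(latencies)}/{len(results)}")
    print(f"p50 latency: {latencies[len(latencies) // 2]:.3f}s")
```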
Richie Cotton
Okay. So it's interesting that there are really discrete tasks within this AI application development space, or within agent development. Maybe fine-tuning is a task, maybe deployment is a task, and you can deal with them one at a time. It sounds like with Lightning Studios, you try to have the entire workflow in a single tool, which is quite different from a lot of other companies that focus on one segment of development. So what's your motivation for that approach?
William Falcon
Well, that was the number one thing we had to do from day one. When I started doing AI more professionally — I started as an undergrad messing around in 2015, then research, and then when I did my PhD I was at Facebook AI at the time — you could already tell there were a lot of tools coming out.
William Falcon
There were a lot of different things. You had different types of GPUs, you had different experiment management stuff — things that we take for granted today. So when I built PyTorch Lightning, the first thing I focused on was: how do you get all those tools to work together nicely in the training framework?
William Falcon
It's one level below where the platform sits — the training framework. And from that day, which was 2019, I had come from web development and all that, and I knew you end up with an explosion of libraries. I thought, one day there are just going to be a million tools, and they're going to come at us fast.
William Falcon
And I wanted to build a platform that is basically tool-agnostic and can pivot very quickly, where the value to a customer is that when a new tool comes out, we curate it, put it on there for you, and show you how to use it so you don't have to figure it out.
William Falcon
And we'll also tell you if it's not a good tool. So to your point, how do you keep up? I don't think you do. I think we keep up and we tell you what's interesting. That was really the genesis of the idea, and then it just took about five years to figure out how to do that well.
William Falcon
Because it's a very hard problem, of course.
Richie Cotton
Okay. Not having to worry about keeping up with all the latest tools, and having to use them in different manners, does seem like an incredibly powerful thing. So I guess the advantage of Lightning here is that it gives you a consistent interface to whatever tool you want in order to develop your AI applications?
William Falcon
Yeah, I mean, look, it's a natural progression, right? Your phone — this thing was a replacement for carrying a calculator in a bag, and a separate device for note-taking, and a flashlight, and a camera, and all these other things. It became a platform that now has all those things in it.
William Falcon
Something like Slack was a replacement for a special tool you could send images on, a special tool for texting, a special tool for sending videos, and so on. So I think all roads lead to an end-to-end platform for pretty much any type of new technology, and for AI it's going to happen too.
William Falcon
You're not going to have specialized tools for training or serving — those are just point solutions, Band-Aids today for the real products. So that's kind of the approach we took. To us, all those things are ultimately plug-ins; they're just built into the thing.
Richie Cotton
You're probably starting fights with a fair few of the startup companies by saying, all right, your company's just a plug-in. But yeah, it's an interesting approach. All right. So we talked a bit about the tooling side of things, and I'd like to know who's going to work on this. Tell me what sort of talent you need in order to build these applications. Is there a general role of AI engineer?
Richie Cotton
Can one person do everything, or do you need different people with different skills?
William Falcon
With the right tools, you can. I think a good analogy is car mechanics, maybe, or construction. Could you work on a car today? Could you fix stuff in your house? You're not a specialist — maybe you are,
William Falcon
I don't know, but I'm not a specialized mechanic or construction builder. But I guarantee I could go to Home Depot and find the tools that would actually help me do something that I couldn't have done 50 years ago without being very, very specialized — cutting wood or something. So I think the right tools help you level up the people that you work with.
William Falcon
So our goal is ultimately to make it accessible for anyone in the world to build AI products. And right now, a huge part of our users are actually developers who know nothing about AI, and they're coming on here saying, great, I don't really care about AI, I care about building a product.
William Falcon
And I don't care how that gets done. But they're familiar with things like Kubernetes or whatever else. So they can deploy things on the platform — they can grab a Docker container, or they can put it up serverless — and then when they train a model, it's kind of a no-code experience. So maybe that's how we do it.
William Falcon
For us, the no-code stuff is: if you're not an expert, go for it. If you're an expert, we give you the full-code experience and you should be able to tweak your models and do whatever you want. But if we're successful, anyone in the world, no matter who you are, should be able to build AI in whatever shape that means to you.
William Falcon
AI to my grandparents means something like a thing they can chat with — they should be able to do that. What does that look like to you? And AI to a PhD means a model with 24 layers, built in this particular way, as a transformer with this type of attention or whatever — they should be able to do that as well, right?
Richie Cotton
Okay. So you need different skills for different types of AI. You also mentioned that a lot of the people who are developing AI things come from a software development background rather than from a data or machine learning background. So can you talk me through the most important skills you need in order to develop things with AI?
William Falcon
You know, I came from the PhD background in AI. And the researchers — they love the research. They love to get caught up in the type of model and the way that you fine-tune and this and that. But for building products, that really makes a marginal difference most of the time.
William Falcon
Those kinds of decisions are actually being made for you already in open source. We've all done it — the models are there, they've already been set up with the standards, you just have to run them. So I think software developers in general are best equipped to do this, because they're treating this stuff as a tool to achieve an actual business goal.
William Falcon
So they won't get caught up in all these details, and they actually have pretty solid software engineering techniques — CI/CD workflows, etc. — and that's mostly what you need. I think you mostly just need to be a fairly reasonable software engineer who's not afraid of it, who's willing to try something and see if it fails, and who's okay with having something that works without knowing exactly why it's doing what it's doing.
William Falcon
And that's okay. I don't think anyone drives a car knowing every detail about everything that happens within that car, or learns physics to drive a car. You don't do that today, right? So there's no reason to hold yourself to that level. I think open-mindedness and being okay with not knowing everything is probably the core stuff.
Richie Cotton
Okay. It's definitely an important skill to be okay with not knowing everything, because at this point it's just impossible to keep up with everything there is. And for this audience, a lot of them come from a data or machine learning background. Are there any particular software engineering practices that you think are good
Richie Cotton
to know if you want to work with AI?
William Falcon
Use notebooks for scratch work — things you're going to throw away. But when you're finished doing the work, put it into Python scripts, actual Python scripts with real entry points and so on, so that you can put it into production — meaning you can train it, you can deploy it, etc. We provide a lot of tools to kind of bridge that gap.
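For readers wondering what "actual Python scripts with real entry points" looks like in practice, here is a minimal, generic sketch; the file name, arguments, and training stub are hypothetical placeholders, not tied to any particular framework.

```python
# train.py -- a minimal command-line entry point, instead of notebook cells.
import argparse

def train(data_path: str, epochs: int, lr: float) -> None:
    # Load data, build the model, and run the training loop here (stubbed out).
    print(f"training on {data_path} for {epochs} epochs at lr={lr}")

def main() -> None:
    parser = argparse.ArgumentParser(description="Fine-tune a model")
    parser.add_argument("--data-path", required=True)
    parser.add_argument("--epochs", type=int, default=3)
    parser.add_argument("--lr", type=float, default=3e-4)
    args = parser.parse_args()
    train(args.data_path, args.epochs, args.lr)

if __name__ == "__main__":
    main()
```

Run as, for example, `python train.py --data-path ./data --epochs 5`; the same file can then be submitted as a training job or wired into CI without any out-of-order notebook state.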
William Falcon
So if you, for example, wanted to deploy a model and that model is sitting in a notebook — I think it comes from the fact that in legacy data science, you could train an XGBoost model in a CPU notebook in like three lines of code, and it's fine. But in AI you need GPUs, you need distributed processing, you need all the stuff that a notebook cannot do, because it's literally a single Python process.
William Falcon
It cannot go and do multi-process, multi-machine things — it's impossible. And so that's why you have to break out of notebooks if you're doing AI. If you're doing data science, stick to notebooks, that's great. But if you want to grab that model, then we have something called LitServe, for example, where you can grab that model checkpoint and drop it in.
William Falcon
It's kind of like PyTorch Lightning, but for serving — a harness, I guess. You have a few lines of code that you need to write, and then the infrastructure is handled for you automatically behind the scenes. It doesn't have to be a PyTorch model; it can be an XGBoost model.
William Falcon
It can be whatever. But there are these types of tools that help bridge people out of these things. So I would say just use the right tool for the right job. If you're experimenting and doing analysis, stay in a notebook, because you need the interactivity, the ability to visualize things — 100%. But if you're going to train or deploy things for real, then you need to learn Python without a notebook.
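A rough sketch of the "few lines of code" serving harness he describes, using LitServe's LitAPI/LitServer hooks; the checkpoint path, model-loading step, and request schema here are placeholders and would depend on your own model.

```python
# serve.py -- a minimal LitServe-style harness for a saved checkpoint.
import litserve as ls
import torch

class CheckpointAPI(ls.LitAPI):
    def setup(self, device):
        # Load your trained model once per worker; "model.ckpt" is a placeholder
        # and assumes the whole model object was saved, not just a state dict.
        self.model = torch.load("model.ckpt", map_location=device)
        self.model.eval()

    def decode_request(self, request):
        # Turn the incoming JSON into a tensor (illustrative schema).
        return torch.tensor(request["input"])

    def predict(self, x):
        with torch.no_grad():
            return self.model(x)

    def encode_response(self, output):
        return {"output": output.tolist()}

if __name__ == "__main__":
    server = ls.LitServer(CheckpointAPI(), accelerator="auto")
    server.run(port=8000)
```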
Richie Cotton
Okay, that's interesting, and probably fighting talk to a lot of data scientists. So there are definitely some problems with notebooks. But does that mean there's a gap in the market for a notebook that does scale to distributed computing and working with GPUs? Or is it that notebooks are fundamentally limited and everyone should switch to writing scripts?
William Falcon
No, I think people have tried — there have been plenty of products that have tried to do that. And I'm not saying never; maybe there will be something. But I think people should really just switch to writing scripts. It's not even that that's the hard part; it's the out-of-order execution of a notebook.
William Falcon
How can you guarantee that the code was executed in exactly the same order every time? You can't, because in notebooks people do things like run cell two, cell three, and then cell seven, and that path is what led you to the model. And that's not reproducible, right?
Richie Cotton
So yeah, that sounds like a fundamental problem with notebooks. If users aren't constrained to executing things in order, I can see how that's going to cause problems in production.
William Falcon
Don't get me wrong, I use notebooks, but I use them for the right thing, which is where order doesn't matter and it's exploratory work. But when I'm going to do serious work, I'm going to use Python scripts.
Richie Cotton
Okay, that seems like a good thing to know about. So PyTorch is very much the dominant player in terms of frameworks for doing machine learning research. Why did you feel the need to create Lightning? What was missing from base PyTorch?
William Falcon
I equate PyTorch to JavaScript. Would you build a website using raw JavaScript, or would you use something like React?
Richie Cotton
Right. Okay. So yeah, it's going to take you a lot longer to do raw JavaScript everywhere.
William Falcon
Exactly. And you're going to reinvent the wheel every time. Do you really need to reinvent state management for every single website you build? It's a cool experiment if you've never done it. It's the same thing with PyTorch. Do you really need to reinvent the way to save checkpoints, the way to do distributed training, the way to plug into FSDP, the way to do multi-GPU training, the way to log things, manage experiments, collaborate?
William Falcon
Why would you do that every time? It doesn't make sense. So that's really why, ultimately: if you're doing anything serious, you're going to need a framework. It's only the purists who go, oh, let me do this myself. And I'm like, yeah, cool story, you wrote a website in raw JavaScript, but Facebook isn't built that way, right?
William Falcon
Facebook is running React, and Google built Angular — the best people in the world aren't using raw JavaScript directly. There's a reason for it, right?
Richie Cotton
Okay, cool. It seems very sensible to have a framework just to speed things up. Are there any particular tasks you think Lightning makes easier compared to lower-level PyTorch?
William Falcon
Yeah. Generally, just the ability to experiment. But I would say the thing where it shines the most is distributed training specifically. With Lightning, you can write your code in a way where it can run on a CPU, or on one GPU, or four GPUs, or a thousand GPUs, without changing anything about the code.
William Falcon
And it'll just work out of the box. You can even do it on other chips, like Trainium chips or TPUs or whatever. I think that's really what gives you the power to go between laptop, cloud, whatever you want to do. Not to mention all the best practices that have been embedded in there.
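A minimal sketch of that "same code on a CPU, one GPU, or a thousand GPUs" idea with PyTorch Lightning; the toy model and random data below are placeholders, and the Trainer arguments shown are just examples of the knobs you would change for different hardware.

```python
import torch
import lightning as L
from torch.utils.data import DataLoader, TensorDataset

class LitRegressor(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(32, 1)  # toy model, placeholder

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.net(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

if __name__ == "__main__":
    data = TensorDataset(torch.randn(1024, 32), torch.randn(1024, 1))
    # The same script runs on CPU, one GPU, or many GPUs/nodes by changing only
    # the Trainer arguments (e.g. devices=8, num_nodes=4, strategy="ddp").
    trainer = L.Trainer(max_epochs=1, accelerator="auto", devices="auto")
    trainer.fit(LitRegressor(), DataLoader(data, batch_size=64))
```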
William Falcon
So I'll give you context. Lightning has probably close to a thousand contributors at this point. We did the math recently: something like 400,000 developer hours. And that's what gets you stability and correctness — knowing that the code is not going to have an issue when you run it.
William Falcon
You didn't mix up the backward pass with the optimizer step, or release the gradients at the wrong time, or whatever — all that stuff is sorted out for you already, and it's very, very robust. So unless you can tell me that you're going to invest 400,000 hours into your own codebase to make sure you have no bugs, it seems a bit silly not to leverage the things that people have built.
William Falcon
So those are interesting stats, but there are so many small optimizations in there as well — not even small, I guess; to me they seem small, but they're not — like precision, 16-bit versus 32-bit, or the ability to just test one batch of data and see what happens, to try to overfit one batch.
William Falcon
All these things you learn during your PhD are embedded in there. And they were put in there by me, by people I've met, by people on the PyTorch team, by people at Facebook — by a lot of amazing researchers all over the world who put the best tricks of the trade into this thing so you don't have to learn what they are anymore.
William Falcon
Right?
Richie Cotton
That seems very useful. Building on the shoulders of giants, I guess, is a big point of open source. Can you tell me what the relationship is between the PyTorch Lightning developers and the vanilla PyTorch developers? I think it's mostly people at Meta working on PyTorch, right?
William Falcon
Yeah. Over the years it's varied. PyTorch is something that came from Meta, so it's not something that we developed. But when PyTorch was coming out, there were a few of us who were very early. My CTO, Luca, was actually working on PyTorch much earlier than I was — very, very early.
William Falcon
He was actually on the PyTorch paper itself when it was published, and the autograd paper before it, which led to PyTorch as well. If you've ever read the book Deep Learning with PyTorch, he's one of the authors on there. And another one of the authors is Thomas Viehmann. Thomas works with us as well.
William Falcon
Thomas wrote TorchScript. And over the years we've had people who come from there — for a while we had a handful of people from the PyTorch core team here working on optimizing things. There's kind of a revolving door to some extent: some of us go there, some of them come here, people go back and forth. And now lots of organizations have Lightning teams internally.
William Falcon
Meta has a Lightning team internally that works on their PyTorch Lightning-like equivalent, because at some point they were using PyTorch Lightning for a lot of their production stuff. All these things have derivatives — Nvidia has a Lightning team too. These companies will have something internally, and we collaborate with all of them today as well.
William Falcon
So it is a big effort. And we support PyTorch beyond just Lightning — we serve on the PyTorch Foundation today, and Luca, who I was mentioning, our CTO, is the chair of the PyTorch Foundation this year as well. So we also take part in the governance of how the project is shaped.
William Falcon
And it's not just us, it's a handful of companies doing that today. I believe PyTorch is a community project, as is PyTorch Lightning, and we try to keep it that way for everyone. Yeah.
Richie Cotton
Okay. It seems pretty fantastic that both these projects have good collaboration going, and that you're working closely with the people who work on vanilla PyTorch. Now, both PyTorch and PyTorch Lightning are open source projects, and it seems like whether large language models should be open source is quite a divisive issue. You've got the Meta Llama models that are kind of semi open source —
Richie Cotton
they've got open weights — and a lot of the competitors are closed source. We're yet to see a completely open model. Do you have a position on how open these large language models should be?
William Falcon
Yes. Actually, before I answer that, I forgot to mention one thing. This month there was an interesting milestone that we surpassed. Most people are familiar with Keras on the TensorFlow side. TensorFlow and PyTorch are at kind of the same level, and Keras and Lightning sit a little bit above that.
William Falcon
They kind of automate a lot of stuff, although I think Lightning is built more for research and high-performance stuff. Keras is one of the most popular projects in the world. And this was the first month where Lightning was used more than Keras worldwide — it had more downloads.
William Falcon
That's been really interesting from our perspective. It's hard to tell exactly what the relationship is, but PyTorch is growing partly because we make it easier to use, and because they're growing, it makes us easier to use. It's a two-way relationship.
William Falcon
So it's been really cool to see that pair grow. That was just an interesting tidbit we saw this month. But in terms of open models, I think in the asymptote it all becomes open. In the limit, it's all going to be open source at the end of the day.
William Falcon
Whether we do it or China does it, someone will do it. And the best models already, I think, are the open source models — DeepSeek proved that. You're going to see a few more models come out. I'm surprised — this is March 2025 — that we haven't heard from Meta yet, and I'm sure very soon you'll hear something about it.
William Falcon
I have no insider knowledge, I'm just guessing. And I'm sure that model will change things for the industry as well. But no, I think models should be open source. The thing that's interesting about the models is not the model itself, but what you do with the models, right?
William Falcon
You don't regulate technology, you regulate applications of technology. That's where the government really should be focused. What's a good example of that? Cars, I guess, or maybe weapons is a good one. You wouldn't regulate the ability to build a weapon or to use a weapon in general.
William Falcon
You would regulate when and how you can use it, and who can use it, for example. Same thing for a car: you wouldn't regulate the building of a car, you would just say, hey, you can't use the car for certain things that are illegal, like running into stuff. And really, time has already shown that that's how you manage technology.
William Falcon
Ultimately, it just feels scary, but I don't actually think it is.
Richie Cotton
Okay. So in this case you would want to regulate what people use the AI for, not necessarily put regulations on what a model contains or the capabilities of the model. Is that about right?
William Falcon
Yeah. Like make it illegal to generate deepfake images, for example. Make it illegal to use AI for, I don't know, trading smuggling, those kinds of things. But don't make the actual use of AI itself illegal.
Richie Cotton
Okay. And I guess the other part of it is that at the moment, for a lot of the open models, the datasets aren't publicly available, so it's not completely possible to reproduce them. Do you think we'll see more openness in that area in the future?
William Falcon
Good question. I hope so. Maybe someone will, but it'll be an activist type that does it. I don't think the big labs can open the data, because it's probably not all obtained legally, or at least no one wants to get sued over the data they used. That's ultimately what it is.
William Falcon
No company will tell you what it is, because it would open them up to lawsuits. But yeah, if you had the data, that would be a lot better for really understanding what went into these things. I just don't think it's feasible — no one's incentivized to do it.
Richie Cotton
Yeah. I don't think it's necessarily feasible, at least in the short term. So we've got some economic and legal challenges to having true openness.
William Falcon
That raises the question: why do we have to be so specific about this "true openness"? That's kind of a purist, moral stance on open source — that everything must be open. Look, the model weights are open, the architecture is open.
William Falcon
Sometimes they give you the way that it was trained. They don't tell you exactly what data was used, but they tell you at a high level, like, we used books or whatever. I don't know — that's pretty good, I have to say.
Richie Cotton
All right. So with what we have, we get the benefits of researchers being able to share ideas, but not necessarily the commercial competition, since no one can completely reproduce what you've done.
William Falcon
You know, it's funny, because for the longest time there was this debate — even before AI took off, back when you were in data science — people were asking, is the model the value? Remember data science back in like 2020, where people were like, why are all these deep learning people open sourcing their models? They spent $10 million and they're open sourcing this model?
William Falcon
Well, it's funny, because now people understand it's not the model that matters, it's the data. And we've been saying that forever: the moat is the data. Ultimately, for an enterprise, it's not the model, it's the fact that you have unique data that no one else in your industry has.
William Falcon
The way you train your employees, your customer activity, like all that stuff is the moat.
Richie Cotton
Okay, so the data is more important than the actual model. In most cases, if you tweak a model you're going to get slightly better performance, but if you get better data, that's going to give you dramatically improved performance for your predictions or for whatever the AI is generating.
William Falcon
Yeah. I think the most compelling thing for people working on LLMs is when they swap models — Phi-3, and then, you know, Llama — and nothing really changes, and then they go change their data or the prompts and they're like, wow, this is way better. Yes, the data is the main thing that matters. Context is king, is what they say, right?
Richie Cotton
Okay. So does that mean organizations need to be investing in their data capability before they start worrying too much about AI performance?
William Falcon
Yeah. I think you grab something like Lightning, you put a V1 of a product out there very quickly, in a few days — deploy something, no code, whatever. It's out there, you get it working in your production system, and you start measuring its success. And then you treat it like an A/B test going forward.
William Falcon
What if I change my data? If I fine-tune it this way or that way, what's the effect on my customers? And you just keep iterating on that over and over.
Richie Cotton
Okay, so this seems incredibly important for a lot of companies. You mentioned Chinese AI before, and at the start of the year DeepSeek released the R1 model and caused a lot of controversy. It seemed to be very cheap and very powerful, and there's quite a wide range of views about how it's going to impact the AI ecosystem.
Richie Cotton
We've had a month or two for this news to settle, so what's your takeaway, having had time to digest it?
William Falcon
I think it's a great thing, actually, in my opinion, because this goes back to the question of open source. A lot of these foundation model companies want you to think that it is really hard to pre-train a model, that it's millions of dollars and it's too expensive. It's not. I have people on the platform doing this for thousands of dollars, literally.
William Falcon
Sometimes people fine-tune for $0.12, because it's that quick. But why do they do that? Because the harder they make it feel, the more you're going to pay them to just use their black box instead of an open source thing. So I think it's great that a research lab from China put this out and said, hey, this can be done cheaper.
William Falcon
Now, is that a real claim or not? I don't know — they may have used a thousand GPUs behind the scenes, who knows? But from a researcher's perspective, I can tell you that the techniques they used are real, and they can give you those improvements in performance. And, probably worst of all, those techniques have been around for years.
William Falcon
There are no new techniques there. They've been around for a while, and they just put them together in a good way that yielded their results. And I've always said — my prediction last year, for the new year, was that a 1 billion parameter model will do just as well as a 70 billion parameter model.
William Falcon
And you're seeing a lot of that start to happen now with smaller models. Ultimately, what I was also saying with that is that smarter science will beat engineering. At Facebook, a lot of what we did — and what I spent most of my PhD on — was smarter science for how to train a model, foundational methods for training a model.
William Falcon
What do I mean by that? A model can take a year to train if you're calculating, say, an MSE loss, depending on what you want to do. Then you can move to something like a cross-entropy loss, and the computational time of that might be slower.
William Falcon
And then there are all these techniques like negative sampling, and the NCE loss, for example, which takes positive pairs and negative pairs, and the way it does that is it computes an n-by-n multiply that gets you all these scores, and then you sum them up and normalize. That multiplication can be pretty heavy.
William Falcon
So people started using distributed training to do that, where the negative samples go on one GPU and get passed to the others, so the computation gets faster. But making that thing better — making the quality of the model better — came down to: what math function are you using? What is the loss? How did you structure the loss?
William Falcon
How do you regularize, how do you sample your data? That's the science element. The engineering of making it better is just: how can you scale out to more GPUs in a smarter way, or this or that? That's already being done out there — anyone can do that. But how can you come up with better science?
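To make the kind of loss he's describing concrete, here is a minimal sketch of a contrastive loss with in-batch negatives (an InfoNCE-style formulation). The embedding size, batch size, and temperature are arbitrary placeholders, and this is a generic illustration rather than the exact formulation from any particular paper.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(queries, keys, temperature=0.07):
    """Contrastive loss with in-batch negatives.

    queries, keys: (batch, dim) tensors where row i of `keys` is the positive
    for row i of `queries`; every other row acts as a negative sample.
    """
    q = F.normalize(queries, dim=1)
    k = F.normalize(keys, dim=1)
    # The n-by-n multiply: a similarity score between every query and every key.
    logits = q @ k.t() / temperature          # shape (batch, batch)
    targets = torch.arange(q.size(0))         # positives sit on the diagonal
    # Cross-entropy sums and normalizes each row over positives + negatives.
    return F.cross_entropy(logits, targets)

# Toy usage with random embeddings (placeholders):
queries = torch.randn(8, 128)
keys = torch.randn(8, 128)
print(info_nce_loss(queries, keys).item())
```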
William Falcon
That's where I think we're going to see the biggest leaps. So to me it's a good step, because it's exactly what should happen. And someone else will do it — someone else will come up with better science. They're going to pre-train a model in like seven days or something crazy, and it'll be like, wow, okay, that's crazy.
William Falcon
How could they do that? Well, because before, you had the wrong loss function, or you didn't regularize, or there was something you could have done better. It's not the first time this has happened. Remember, AlexNet kicked off deep learning, and it was literally a single paper — a couple of students who just happened to grab this thing, put it on a GPU, and blew away decades of computer vision research.
William Falcon
Very, very simple. So it was done once already. I think the second time was OpenAI itself. They took something that already existed — Transformers. They didn't invent Transformers, that was already there. They took that and just scaled it out to more GPUs, and guess what you get? You get this crazy model that works really well.
William Falcon
That's GPT-2. So it's happened twice. Is it going to happen again? Yeah, 100%. Could it be one single person? Yeah — it's usually been one single person.
Richie Cotton
That's very optimistic and I like it. It's possible to get a major scientific breakthrough that improves the state of what everyone's doing, as long as everyone finds out about it, it's clear what they did, and the research is open and shared, so it becomes available for everyone.
William Falcon
Yeah. That's why the importance of open source is big. I personally care less about the literal model; I care more about what you learned, what worked for you, what didn't work, so we can try it out.
Richie Cotton
Okay, I guess this brings us to another topical point, which is that there's a lot of money being thrown into AI infrastructure — data centers and things like that. We've seen the first big IPO from one of the recent AI infrastructure companies, with CoreWeave going public. Do you think there's a bubble in AI infrastructure?
William Falcon
100%, no doubt. I mean, look, GPU prices drop all the time. On Lightning, you're able to get GPUs from different cloud providers — we kind of give you the lowest price across a bunch of them — and you see this trend: the H100 is dropping consistently. So prices will continue to go down.
William Falcon
I think there is a delayed effect from Covid. Covid made GPU production a little bit bottlenecked, and then AI came on top of that in the year after, and it just compounded the whole thing. So you had this massive demand, and then Nvidia produced so many GPUs — probably overproduced GPUs, I would bet. That's my unpopular opinion.
William Falcon
And then you've got all these people with all these GPUs, and most of them are sitting there idle — most people can't use those GPUs today. So a lot of what we do is partner with those cloud providers to give access to our whole community of users and everyone else, so that the providers have someone who can actually use their GPUs.
William Falcon
And that also benefits our users, because they get the cheapest GPU prices at the same time. So there's this massive imbalance in the market that we're seeing as a platform that serves both sides. And I think Blackwell can exacerbate that more, because now people are moving to the Blackwell chips.
William Falcon
So what's going to happen to these older chips? I do think there's a massive bubble happening there. And I think the financials of someone like a CoreWeave are interesting. I'm not going to judge it — maybe it works out, maybe it doesn't. But taking on a lot of debt to buy things, and then with high interest rates, it compounds and becomes really hard to deal with as well.
William Falcon
We'll see. But there can't be a thousand cloud providers, let's just be clear about that. And there's a new one popping up every three weeks.
Richie Cotton
So what happens if GPUs get really cheap? What happens to everyone else then?
William Falcon
Well, I think it allows people to actually do their AI work more easily, because for many years you couldn't even get GPUs for people, so they couldn't even try the workflows. So actually I think it'll accelerate AI quite a bit. In my opinion, though, if GPUs are cheaper, I think the stock market will have a massive problem.
William Falcon
I think Nvidia and all these other companies will have issues — although Nvidia sells so many GPUs that it probably won't affect them that much, I don't know. When prices of things drop, it hurts some group of people and benefits a lot of other people. I think it's a net benefit for everyone in the market.
William Falcon
And it'll just accelerate who can build AI — it'll democratize it. So I think it's overall a great thing for people.
Richie Cotton
Okay. So maybe the infrastructure providers won't be so happy, but for everyone else, GPU prices dropping is going to be good because you can do more AI with them.
William Falcon
Exactly. And from our perspective, we're sitting here giving customers access to them. My goal is to give people the cheapest GPUs possible, and give them the infrastructure and the tooling that they can use to build stuff. So cheaper prices are great for us.
Richie Cotton
So, related to this, I think the two big trends recently have been reasoning models and AI agents. Both of these consume a lot of tokens, so they get quite expensive, and sometimes in unpredictable ways. Do you have any advice for organizations who want to make use of these technologies?
William Falcon
Yeah: stop paying for tokens. It's not a good business model for you. You can exchange tokens for compute, ultimately. You can either pay by the token and not run the compute yourself — but then you're limited by the arbitrary throttling that they have and the pricing that they set — or you can go to, I'll say Lightning AI for example, grab R1, and deploy it yourself in about five minutes.
William Falcon
You'll have your own private dedicated endpoint, and you're not paying anyone by the token — you're paying for the compute only. We do a lot of work to make sure the models fit on cheaper machines, etc. And when you're not using it, they go to sleep and there's no cost for the infrastructure.
William Falcon
Right. So I think if you're paying this token tax, like it's kind of self-inflicted at this point.
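As a rough back-of-the-envelope comparison of the two pricing models being contrasted here, a sketch like the following can help; every price and throughput number in it is a made-up placeholder, not a quote from any provider.

```python
# Hypothetical numbers only -- plug in your real prices and measured throughput.
PRICE_PER_1M_TOKENS = 2.00        # $ per million tokens from a hosted API (assumed)
GPU_PRICE_PER_HOUR = 2.50         # $ per hour for a dedicated GPU (assumed)
TOKENS_PER_SECOND = 1_500         # measured throughput of your own deployment (assumed)

monthly_tokens = 5_000_000_000    # e.g. 5B tokens/month of agent + reasoning traffic

api_cost = monthly_tokens / 1_000_000 * PRICE_PER_1M_TOKENS
gpu_hours_needed = monthly_tokens / TOKENS_PER_SECOND / 3600
self_hosted_cost = gpu_hours_needed * GPU_PRICE_PER_HOUR

print(f"pay-per-token: ${api_cost:,.0f}/month")
print(f"self-hosted:   ${self_hosted_cost:,.0f}/month ({gpu_hours_needed:,.0f} GPU-hours)")
```

The break-even depends heavily on utilization: if the endpoint sits idle most of the day, per-token pricing can still win, which is why the scale-to-zero point ("they go to sleep") matters.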
Richie Cotton
Okay. So your recommendation is to build things yourself, host them in a way you control, and then you control costs a bit more. Is that the idea?
William Falcon
Yeah. Why would you want to pay by the token to a black box? You can't control their uptime, and you're sending your data to someone else. On something like Lightning, you can get your own private, dedicated endpoint on your own private cloud, fully secure, and you're not paying by the token — you're just paying for the compute that's being used under the hood.
William Falcon
And it's been optimized to be pretty good. The other thing is that your model won't change. If you're hitting an API, they might change the model under the hood, and then your prompt won't work anymore. You had built this whole system on top of it, relying on an output that you just won't get anymore. Here,
William Falcon
you're guaranteed that your model is not going to change.
Richie Cotton
Okay, yes. Certainly if your model changes and you're getting a completely different output for your use case, that's going to be a very difficult maintenance problem. So, just to give me some more motivation, can you talk about some of the cooler things that people have built on top of your platform?
William Falcon
What have people built? We have a ton of people developing drugs on the platform — a lot of science companies, people sequencing DNA. We have a lot of people testing games, for example, using reinforcement learning to test what can happen in a game.
William Falcon
What else do we have? There's a ton of LLM work — people fine-tuning, training, and deploying LLMs, hosting their own models, that kind of stuff. But I personally think the most exciting stuff is in the sciences. There are recommendation engines as well, and a lot of big companies use our products today.
William Falcon
One of the last things we did was work with LinkedIn — they pre-trained a 100 billion parameter LLM, and we helped them do a bit of that. That was all done using PyTorch Lightning as well. But yeah, for me, the sciences are the most interesting stuff.
William Falcon
We work with a lot of foundation model companies as well. Nowadays robotics is a big thing that we work on — they're simulating all these things and training the models. It's pretty wild. I think the future is going to be crazy in the next few years. I don't think we're at crazy yet; I think this is just the beginning.
Richie Cotton
When you say that people are developing new drugs on the platform, is this some kind of sketchy dark web thing? I'm presuming these are pharmaceuticals.
William Falcon
Yeah, yeah, pharma. They're trying to find proteins to synthesize and take into clinical trials, that kind of thing.
Richie Cotton
Okay, so legit stuff. Now, since you said that you scaled to using large language models for LinkedIn's AI applications, I presume that the scaling side is pretty much taken care of.
William Falcon
Yeah. We routinely deploy 600 billion parameter models. You can find the full R1 right now on the platform, without quantization — 670 billion parameters — and you can deploy it just like that. It'll do multi-node inference, it's scalable, it works. People use that for distributed training ultimately, and our team has so much experience distributing models.
William Falcon
So it is our bread and butter today, I would say.
Richie Cotton
Okay. Nice. Just to wrap up, what are you most excited about in the world of AI?
William Falcon
I don't know — robotics, I guess. I think that's really interesting. I don't think it's going to replace workers; I think it's really more about automating a lot of the factory work that people are doing. But actually, I'll say it again: I think it's the ability to find better drugs that can actually cure things like cancer, and do it very quickly. You have to think about how this works.
William Falcon
If you want a cancer vaccine — I'm probably going to butcher this, so sorry if you're at one of these companies — there's a very specific protein sequence that you need to generate. There is a combination of letters that will get you that particular protein.
William Falcon
Back to biology 101: there are, what, four nucleotides, I guess, and it's about how you mix them up, how you connect them, etc. So say there are 100 potential spots in that sequence, with four options at each spot — that's 4 to the power of 100 combinations, and sequences are probably even longer.
William Falcon
How many possibilities is that? Trillions upon trillions. It gets pretty wild. If you didn't have AI, you'd have to loop through all of them: okay, sequence one, sequence two, sequence three. And testing a sequence can take years, because it has to go through clinical trials and you have to synthesize the drug.
William Falcon
So imagine if you could reduce the time that it takes and find sequences that have a high probability of being an actual drug. Then you could actually solve most diseases, because those sequences do exist — we just need to find what they are. So AI has a massive ability to speed up drug discovery.
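For a sense of the scale being gestured at, here is a quick back-of-the-envelope calculation; the sequence length of 100 and the four-letter alphabet are just the illustrative numbers from the conversation.

```python
# Search space for a length-100 sequence over a 4-letter alphabet (DNA bases).
options_per_position = 4
sequence_length = 100

search_space = options_per_position ** sequence_length
print(f"{search_space:.3e} possible sequences")        # ~1.607e+60

# Even screening a billion candidates per second would take far longer than
# the age of the universe, which is why models that propose high-probability
# candidates matter so much here.
seconds = search_space / 1e9
print(f"{seconds / 3.15e7:.3e} years at 1e9 sequences/second")
```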
Richie Cotton
Absolutely. Working on these big healthcare problems and big social problems is certainly one of the great reasons for having AI. We've got real problems that are very difficult to solve right now, and hopefully in the future we'll be able to solve them with some technology assistance. So yeah, it's a bright future. Now, final question.
Richie Cotton
I always need follow recommendations. Whose work are you most excited about? Who should I follow?
William Falcon
I mean, you've got to follow my advisors. Kyunghyun Cho was one of the original authors on the sequence-to-sequence with attention paper. Attention, I guess, was created in that paper, or in another paper by Ilya — I think they were both working on this at the same time.
William Falcon
They have very similar papers, actually — Ilya Sutskever, who was the chief scientist at OpenAI. And Cho was in Yoshua Bengio's lab when he was working on this, which eventually led, many years later, to attention, Transformers, "Attention Is All You Need", etc. So they're doing super interesting work.
William Falcon
And he also happens to be working on a lot of this cancer stuff at Genentech, for example. So that's interesting. And then Yann LeCun — he's been at the head of a lot of this. It's funny, because Yann has been doing the same thing for a long time. Even the things I worked on in my PhD — I saw a paper recently and I was like, that's kind of what we were doing.
William Falcon
If I had continued that work, it would have been roughly what they're doing now, but it was learning from videos. So I would say a lot of people at Meta — Facebook AI — have a lot of really good ideas, and they probably have the tooling nowadays to start executing on them. They're probably one of the last few labs that actually focus on advancing science and AI research.
William Falcon
I can't speak too much for DeepMind or Google, but Facebook has really still pushed a lot of this. Llama is a good example of something that they pushed to really open source. So I think they're still pioneering a lot, and yeah, it's been good.
William Falcon
Who else? Kevin Murphy at Google — he's great, I loved his books, he does a lot of really good work. I mostly follow researchers, I guess; I don't really follow the influencer types, so I can't speak for them.
Richie Cotton
No, researchers are definitely good to follow. It sounds like I've got some reading to do. All right, wonderful. Thank you for your time, Will.
William Falcon
Yeah. Thank you for having me. It was fun.