
What to Expect from AI in 2024 with Craig S. Smith, Host of the Eye on A.I Podcast

Richie and Craig explore the 2023 advancements in generative AI, the promising future of world models and AI agents, the transformative potential of AI in various sectors and much more.
Updated Dec 2023

Guest
Craig S. Smith

Craig S. Smith is an American journalist, former executive of The New York Times, and host of the podcast Eye on AI. Until January 2000, he wrote for The Wall Street Journal, most notably covering the rise of the religious movement Falun Gong in China. He has reported for the Times from more than 40 countries and has covered several conflicts, including the 2001 invasion of Afghanistan, the 2003 war in Iraq, and the 2006 Israeli-Lebanese war. He retired from the Times in 2018 and now writes about artificial intelligence for the Times and other publications. He was a special Government employee for the National Security Commission on Artificial Intelligence until the commission's end in October 2021.


Host
Richie Cotton

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.

Key Quotes

When you think about it, think about what a large language model is doing: it's got a string of text and it's predicting the next bit in that string. And it's amazing. But it doesn't understand the underlying reality that text describes. And a world model will, or does.

There's been a lot of talk about intelligence. And then there's been a lot of talk about superintelligence. I mean, Sam Altman made some comments to the FT about planning for GPT-5, this was before his ouster from OpenAI, and that generated a lot of talk about superintelligence coming next year or something. You know, this advancement towards superintelligence and AGI, I think, is going to happen, but I don't think we're on the right track to get there yet. And I think that it's far distant and people shouldn't be expecting it or worrying about it right now. The threats right now are things like agents that can spread disinformation cheaply. But superintelligence and AGI, to me, it's still decades away.

Key Takeaways

1

Understand the capabilities and risks of AI agents, especially in their ability to automate workflows and potentially spread disinformation, to leverage their benefits while mitigating ethical concerns.

2

Cultivate AI literacy and skills, not necessarily in coding or algorithm development, but in effectively using AI tools and understanding their implications in your field, to stay competitive in the rapidly evolving AI landscape.

3

Get to know world models, an emerging AI technology that learns directly from real-world inputs, offering a more accurate representation of reality, potentially revolutionizing fields like autonomous driving and robotics.


Transcript

Richie Cotton: Welcome to DataFramed. This is Richie. As we come to the end of 2023, the one thing everyone can agree on is that it's been an exciting time for AI. Today we're going to spend a little time reflecting on what's just happened, then think about what's coming next. AI itself is changing rapidly, of course, but beyond that, AI is changing how we interact with other software and with the world.

It's having an effect on careers and the economy and society in general. It feels important to understand how AI is going to have an effect on 2024. Our guest is Craig Smith, the host of the popular Eye on AI podcast. He previously spent more than three decades as a journalist and manager at the Wall Street Journal, the New York Times, and other publications, reporting from more than 40 countries.

Since he's seen a few news cycles, I'm keen to hear his take on where he thinks this current AI cycle is heading. Hi, Craig. Really great to have you on the show.

I think to begin with I want to talk about what's happened this last year. And I think really 2023 has just been a crazy breakout year for generative AI. So, what do you feel has been the biggest impact so far?

Craig Smith: It hitting the public consciousness. I assume you, like me, have been involved in AI, tracking AI, for a number of years. And I've been talking to my family and my kids and telling them, pay attention to this, pay attention to this. And everyone kind of rolls their eyes, but finally it hit.

I had a conversation with Aidan Gomez a while back. He's a co-founder of Cohere, one of the big LLM companies, and he was on the team at Google that wrote the transformer algorithm that is at the core of all of these generative AI models. And he said that, yeah, they came up with this algorithm, and it did these amazing things, and then nothing happened, and they couldn't believe it.

They were thinking, you know, when are people going to realize what we've done? And it took OpenAI and Microsoft, frankly, to invest the money in compute and scale for it to catch the public's imagination. So, yeah, I think it hitting the public's imagination was the biggest thing. And certainly GPT-4.

The scale of GPT-4 was a big step forward. And then all of the other large language model companies piling in: Anthropic, and Cohere, and Facebook, or Meta, with Llama, the open-source model, and everything that's happened. For me, the public awareness was the biggest thing.

Richie Cotton: Absolutely. I've definitely noticed a change from a few years ago. You'd say, oh yeah, I'm doing some work with AI, and people gave blank stares, but now it's like, oh, that's a thing.

Craig Smith: Yeah. Everybody, yeah. Everybody is paying attention.

Richie Cotton: Absolutely. And do you have any personal favorite use cases of the technology?

Craig Smith: Well, personally, because I write a lot, I use it for research. I use Bing, frankly, or the browser version of ChatGPT, because it just speeds up your research. You know, you can ask a question in natural language, it'll come back with a footnoted answer, and then you can use those answers.

It just saves a lot of time in research. And then I also use it for summarization, because I can block and copy different articles and ask for detailed bullets. It just speeds all of that up. I've had a few instances early on where I trusted it too much. I wrote a piece for Forbes once.

This was last February or something, and I realized after I'd actually posted it on Forbes that it had hallucinated. Fortunately it was only up for a couple of minutes. But yeah, for me, it's really the time-saving aspect. I'm not using it to write or do more creative things, although when you sent me some questions, you asked about image generation.

I don't use image generation much, but when I write for sites that ask me to provide imagery, I use Midjourney primarily to generate images, and that's a huge time saver. And frankly, if I were a graphic artist, that'd scare the crap out of me. I'm sure it's impacting their business.

Richie Cotton: Absolutely. Yeah, a lot of impacts, and I'd love to get into the effects on the economy and jobs and things later on. But yeah, I do like that even simple tasks, like being able to summarize documents, are tremendously time-saving and important.

Craig Smith: And for that, I don't know how much the listeners have played around with these models, but one of the ones that I've played with or used has the largest context window, meaning you can upload the largest block of text, so I use that a lot. I mean, ChatGPT is still constrained on the amount of text you can upload for it to summarize. But all of those issues will be solved in time.

Richie Cotton: Yeah. And so those sorts of larger context windows are pretty useful. Actually, maybe we can get into this in a bit more detail. I think perhaps these large language models have been the biggest story of generative AI so far in 2023. Do you think that's going to continue to be the case, and how do you think they're going to change in 2024?

Craig Smith: I've been paying a lot of attention to world models, which are a different kind of model that doesn't depend on language for learning. And I think world models are going to be a huge new direction in AI and generative AI. I mean, they're still generative. And then I also think about agents, AI agents, which people have been playing around with since GPT-3 first appeared.

There was something called AutoGPT that appeared on GitHub, where someone had built a wrapper around GPT that would then take actions. But now OpenAI has come out with agents, and I think that's really going to be the next direction with large language models. So you can tell a large language model, or plug a large language model into an AI agent, and it will then take action.

And eventually you will be able to automate entire workflows end to end. Right now, you ask a large language model a question, you get an answer, and then you have to figure out what to do with that answer. But the agents, I think, are really going to change the economy globally.

Richie Cotton: Absolutely. So, two important technologies there. Maybe we can start with world models. It's actually something I'm not very familiar with, so can you talk through what these are used for?

Craig Smith: Yeah, so, Yann LeCun has been focused on world models for a while. Now, I'm not a practitioner, so I'm not going to pretend to understand everything about them. But basically, a large language model predicts the next token in a series, and that token is either a word or part of a word. It's drawing on patterns that it recognizes in the language it's been trained on, but all of its knowledge, and this goes to the debate about intelligence, is contained in human language, in text, which, amazing as that is, contains a lot of false information and a lot of old information. And so when the model is going to predict the next token or the next word, it's doing it based on the probability of what that word will be.

And it does an incredible job. I mean, it comes up with very sophisticated, intelligent-sounding responses. It understands satire and poetry and, to a certain extent, computer code. But it's based on probabilities. And I think that in the public imagination, that's confused a lot of people, because it looks like intelligence.

And I know that there are people a lot smarter than me who argue it is some form of intelligence. But the problem is, whatever the highest probability is, is what it's going to come up with, which is why it will hallucinate. If there isn't an obvious next word, it'll come up with something, and that may not be factually accurate in the real world, because it doesn't have a grounding in reality. It's grounded in the text that it has been trained on, which is one step removed from reality.
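To make that concrete, here is a minimal, illustrative Python sketch of what "predicting the next token from probabilities" means. The vocabulary and the probabilities are invented for illustration; a real model computes them with a neural network conditioned on the whole preceding text.

    import random

    def toy_next_token_probs(context):
        # Invented, hard-coded distribution standing in for a neural network's output.
        if context.endswith("The capital of France is"):
            return {"Paris": 0.90, "Lyon": 0.05, "beautiful": 0.05}
        return {"the": 0.4, "a": 0.3, "and": 0.3}

    def generate(context, steps=1):
        for _ in range(steps):
            probs = toy_next_token_probs(context)
            # Sample the next token in proportion to its probability.
            # Greedy decoding would instead take max(probs, key=probs.get).
            token = random.choices(list(probs), weights=list(probs.values()))[0]
            context += " " + token
        return context

    print(generate("The capital of France is"))
    # Usually "Paris", sometimes "Lyon" or "beautiful": the model returns whatever
    # the probabilities favour, with no grounding in whether it is true.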

World models train directly on input from the real world, without language, and the most immediate way to do that is through video. Yann LeCun has made tremendous advances in this. The model looks at a scene, and rather than predicting at the pixel level what's going to happen next in that scene, or understanding that scene at the pixel level as it progresses in a video, it encodes a representation of the scene in a higher-dimensional space, and then it operates in that representation.

So if you're looking at a scene of a roadway, and this is being used by a company called Wayve, which has its own world model and has done amazing things, it'll look at a street scene, and when it encodes that into a higher-level representation, all of the detail gets lost. It doesn't matter what color. Well, in some cases it might matter what color, but all the fine-grained detail at the pixel level is gone. It's just a representation that allows the model to make predictions about what will happen, or understand, for example, the laws of physics, whereas in a large language model, its understanding of the laws of physics comes from descriptions of the laws of physics in text.

In a world model, it learns the laws of physics by witnessing how the physical world operates, and that's a very powerful and much more fundamental representation of the world than you get through language, a much more grounded representation. Anyone who's interested in this should first of all pay attention to Yann LeCun, who has developed architectures for this, and look at what Wayve has done.

We've all seen generative videos from large models that are essentially language models; they're predicting the next pixel as opposed to the next token. In one of those large visual models, it's like an acid trip, everything is blurring in and out. But take a look at what Wayve has done in generating street scenes. Wayve, with its world model, which is called GAIA-1, generates video that is very stable, very realistic. It's really remarkable. So Wayve is using this as the primary model in its self-driving car system, but there are all kinds of applications. So I think that's really promising. I think that there will be models that are much more grounded in physical reality.

And I think that in the long run, you're going to blend world models and language models, because it's a lot like how humans learn. A child learns through observing the physical world, and then once it understands the physical world, it starts learning higher-level concepts through language.

So I think world models are a really exciting area of research.
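As a rough illustration of the contrast Craig draws, here is a toy Python sketch: instead of predicting every pixel of the next video frame, a world model encodes the scene into a compact state and predicts how that state evolves. The scene, the encoder, and the constant-velocity "dynamics" are invented for illustration; systems like the ones Yann LeCun describes, or Wayve's GAIA-1, learn both the representation and the dynamics from video.

    def encode(frame):
        # Pretend encoder: keep only what matters for prediction (a car's position
        # and speed), discarding the fine-grained pixel detail.
        return {"car_x": frame["car_x"], "speed": frame["speed"]}

    def predict_next_state(state, dt=1.0):
        # A learned dynamics model would go here; this toy assumes constant velocity,
        # a crude stand-in for having absorbed the laws of physics from observation.
        return {"car_x": state["car_x"] + state["speed"] * dt, "speed": state["speed"]}

    frame = {"car_x": 12.0, "speed": 1.5, "pixels": "[millions of values omitted]"}
    state = encode(frame)
    print(predict_next_state(state))  # {'car_x': 13.5, 'speed': 1.5}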

Richie Cotton: That's absolutely fascinating. And I can certainly see how, any time you're doing anything spatial, you're going to want some kind of underlying physical model going on there. So yeah, it could be huge for any kind of manufacturing or engineering, or even, I guess, predicting things that are happening in sports games.

So that's world models. Definitely something to look out for. The other thing you mentioned before was AI agents. So yeah, talk me through how these are going to have an impact.

Craig Smith: Yeah. When GPT-3 first came out, there was a lot of anxiety, and there still is a lot of anxiety, about what these models are capable of and the bad things they could be used to do, and one of the concerns was what you could do with a large language model.

ChatGPT had a plugin to Zapier, or Zapier had a plugin and integration with ChatGPT, I can't remember which direction it went. But Zapier is an agent; it can execute actions. So if you want it to send an email, it can send the email. If you want it to post something on social media, it can post something on social media.

ChatGPT alone, or one of the big LLMs, can't do that. They don't act.

People started engineering these agent wrappers, in effect, like AutoGPT: you would query or prompt the LLM, and then the agent would take the output from the LLM and do whatever you were asking it to do. And there was, and still is, a lot of concern that agents will enable bad actors to do terrible things.

I mean, I had a call recently with a company called NewsGuard, which tracks misinformation and disinformation on the Internet, and with these agents you will no longer need a person to prompt the LLM, come up with some text, and then block and copy and paste it into a website. These agents can do all of that.

You can set up thousands of agents, and they could be generating disinformation and posting it on social media, all autonomously. But there are other amazing things that agents can do, good things. And your thermostat is an agent.

You set the thermostat, and when the temperature drops below a certain level, the thermostat turns the heat on. That's agency. And with the power of a large language model, you will have agents that will be able to handle entire workflows. When I was at the New York Times, they had this crazy system for processing expenses, and it was such a drag.

I would wait, say, six months, and then I would spend literally a week typing in these little boxes, and the thing was glitchy and you would lose your work. Agents will be able to do all of that with just a simple set of instructions. So I think these agents, tied to LLMs, will have a tremendous impact on the economy,

and that's starting to happen.
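Here is a minimal Python sketch of the agent pattern Craig describes: a wrapper asks the language model what to do, then executes the action itself, which the model on its own cannot. The llm() function and the two tools are hypothetical stand-ins, not any real API.

    def llm(prompt):
        # Hypothetical stand-in for a call to a hosted language model that returns
        # a structured decision about which tool to use and with what arguments.
        return {"tool": "send_email",
                "args": {"to": "finance@example.com", "body": "Expense report attached."}}

    TOOLS = {
        "send_email": lambda to, body: print(f"(pretend) emailing {to}: {body}"),
        "post_update": lambda text: print(f"(pretend) posting: {text}"),
    }

    def run_agent(task):
        decision = llm(f"Choose a tool and arguments for this task: {task}")
        # The agent, not the model, performs the action the model chose.
        TOOLS[decision["tool"]](**decision["args"])

    run_agent("Submit my expense report and notify finance")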

Richie Cotton: Lots to cover there. And I have to say, I feel your pain about expenses. One of the great things about AI is that, oh yeah, it can automatically pull out numbers from a photo or a receipt or something. That's, that's brilliant. But I loved what you said about a heating system, just your standard house thermostat being AI, really.

And yeah, it's a good point that in the broadest sense, you don't need a computer to do AI. It's just got to be something that takes data and makes decisions. So one thing you also mentioned was misinformation. I suppose this is probably something very dear to your heart as a journalist.

Can you maybe talk me through what sort of misinformation you think is going to happen and how it's going to play out in the next year?

Craig Smith: Well, as we all know, it's been in the public space for a long time and it's created a lot of trouble, certainly in the American political system. And we saw disinformation really having an impact during the last presidential election cycle. We're coming into a new one, and it's just going to be that much more powerful.

I mean, right now it's having an impact on the public conversation around the Israel-Hamas conflict, not so much right now in the United States, but NewsGuard was showing me an example of an article that was written by a large language model about Bibi Netanyahu's psychiatrist committing suicide.

This psychiatrist doesn't exist, but the article was extremely convincing. They created a website, Global News Outlook or some generic name like that, and posted the article on that website with a lot of other stuff. The article claimed that Netanyahu's psychiatrist was so despondent over what they called Netanyahu's waterfall of lies that he committed suicide.

I think the average educated person would recognize that the site is not reliable and the story is far-fetched. But that story was picked up by Iranian state TV and run through several news cycles, and that kind of thing is dangerous. And when you have agents operating, the volume of that kind of disinformation is just going to explode.

And the question is whether there are systems in place that can chase all this, or educate the public on how to recognize reliable information. So I think disinformation is something that has certainly had an impact on public discourse in the United States and elsewhere in the world.

But I think what we've seen is child's play compared to what we're going to face in the next 10 years.

Richie Cotton: Absolutely. And that speaks to an interesting point: okay, you've got a dodgy website putting out misinformation, but once it gets picked up by a more legitimate outlet, like a national news station, then some legitimacy is given to it.

Craig Smith: Yeah, that's right. That's right. Yeah.

Richie Cotton: And so I suppose we've talked about the idea that hallucinations, these LLMs making things up, are a problem, but it sounds like some people actually want to be able to make things up.

So I don't know. Do you have a solution? Do you think the problems with hallucinations are going to get better or worse in the next year?

Craig Smith: I'm skeptical that you're going to be able to tame large language models. The current strategy is through something called reinforcement learning with human feedback, and now there's a lot of research into reinforcement learning with AI feedback, so you don't need humans in the loop to train these models.

A large language model is like me. I'm an old guy. I've read a tremendous amount of stuff, and if you ask me what happened at some birthday party for my mother 30 years ago, you know, I'll kind of remember, or maybe I won't, but I'll tell you something. And a lot of it won't be accurate.

If you were able to roll back the tape 30 years and look at a video of that birthday party, I would have a lot of the details wrong. I may have who was there wrong. I may have what was served wrong, or all of those details. I may even have the whole event wrong, or the date wrong.

Large language models are like that. The information is in there, but it's only giving a response based on probabilities, and if it doesn't have a good solution according to the probabilities, it just makes something up, and, as we all know, it does so with a very authoritative voice.

So, the way that all of these large language model companies have been trying to correct this is that they have literally armies of people, at various pay scales depending on how sophisticated the responses they're working on are. I've seen these platforms; some of them are more detailed than others, and some of them are really just a thumbs up and a thumbs down. You see these where a lot of users of LLMs are participating: you'll see, is this a good answer or a bad answer? And in these systems there's an interface where the language model gets a reward if it's a good answer and a punishment, not literally, but in computer code, if it's a bad answer, through people tapping good or bad or choosing the best answer among several.

The hope is that the model will learn to behave better, to give more accurate answers. But if you talk to the researchers behind this, they don't know that that's what's going to happen. I mean, these models are so complex you can't trace the neural patterns in the network. You just hope. And they say that, yes, the models are getting better with this kind of training.
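A very small Python sketch of the feedback signal Craig describes, under the big simplifying assumption that we just keep a running score per answer; real RLHF trains a separate reward model from such comparisons and then fine-tunes the language model with reinforcement learning.

    # Thumbs up (+1) and thumbs down (-1) from human raters; the data is invented.
    ratings = [
        ("The Eiffel Tower is in Paris.", +1),
        ("The Eiffel Tower is in Berlin.", -1),
        ("The Eiffel Tower is in Paris.", +1),
    ]

    scores = {}
    for answer, reward in ratings:
        scores[answer] = scores.get(answer, 0) + reward

    # Training would then push the model toward the kind of answer that scores highest.
    print(max(scores, key=scores.get), scores)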

To me, that just seems incredibly inefficient, and it seems unreasonable to expect that you'll eventually get this model to understand that it's not supposed to make things up. The better way to stop hallucinations, which is being used by virtually every enterprise system that requires precision, is to build a vector database that you load with ground truth or trusted information that's been vectorized, which turns every sentence into a string of numbers.

That's what a vector is. And then the model queries the vector database for the answer. It may supplement it, and it certainly then acts as the language generator, but its information is coming from this trusted source within the vector database.

And that's how it's being addressed today.
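Here is a minimal Python sketch of that retrieval idea: trusted text is turned into vectors, the question is matched against them, and the best match is handed to the model as the source it must answer from. Real systems use learned embedding models and a proper vector database; a bag-of-words vector and a plain list stand in for both here, and the documents are invented.

    import math
    from collections import Counter

    def embed(text):
        # Toy vectorization: count words. Real systems use a learned embedding model.
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(count * b.get(word, 0) for word, count in a.items())
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    knowledge_base = [
        "Travel expenses must be submitted within 30 days of the trip.",
        "The annual report is published every March.",
    ]
    index = [(embed(doc), doc) for doc in knowledge_base]

    def retrieve(question):
        q = embed(question)
        return max(index, key=lambda item: cosine(q, item[0]))[1]

    question = "When do I need to submit travel expenses?"
    source = retrieve(question)
    prompt = f"Answer using only this source:\n{source}\n\nQuestion: {question}"
    print(prompt)  # this grounded prompt is what would then be sent to the LLM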

Richie Cotton: That's an interesting idea: in order to stop hallucinations, you need this second technology, these vector databases, and I fully agree with this, by the way. I'm wondering, at the moment, having to learn an extra technology seems like an impediment to adoption by the wider public. Do you think that's going to change in the next year? Do you think it's going to become easier to make use of factual information in documents?

Craig Smith: Yeah, certainly not everybody can build a vector database. I think the public is just going to have to learn. I mean, there's certainly Bing Chat, for example, which is tied to search and gives footnotes. The public is going to have to learn that it's fine to ask ChatGPT to write a limerick or a poem to your girlfriend or something.

But if you're dealing in factual information, you need to use one of the systems that's tied to search and is going to give you footnoted sources. And then you need to go and check the sources. You can't trust that the LLM is actually giving you accurate information even though it cites the sources.

And I've come across that many times, where Bing Chat has given me the answer, I see the sources, but then there's something in the answer that doesn't look right, and I'll look it up, and it's not accurate. So the public just has to understand that these are tools.

They're not brains that you can trust. As I said, it's like talking to an elderly person. They have a lot of knowledge, but they've confused a lot and forgotten a lot. You just can't rely on that.

Richie Cotton: Absolutely. I'm starting to think your analogy, that as you get older you become more and more like one of these large language models, is pretty accurate. I guess the other thing is that your context window tends to shorten a bit as well. You start forgetting things. Do you want to tell me a bit more about these context windows, the amount of memory that a large language model has, and how that is actually going to change?

How is that going to affect things?

Craig Smith: Yeah, so this is interesting, too. The context window, I mean, certainly they can make it larger, and they will, and they can deepen the memory that these models have of the conversation that has gone on during a session. But the real issue, and this is why, as much talk as there is about enterprise adoption, and as many applications as are hitting your inbox, there's a constraint on how much throughput you can have in one of these applications. It goes all the way back to silicon starts in the foundry. You bake a big piece of silicon and then you slice it into these thin slices, those are called starts, and then you etch them with circuitry, and how much money you've spent on your lithography machine determines how finely the circuits are etched and how much you can squeeze onto a chip. Then the starts are cut into chips, and the chips are packaged, but there just aren't enough starts out there.

There aren't enough foundries, certainly, doing etching at the fine grain that's required for these advanced chips, which affects Nvidia, who's the primary provider right now, and any of these other new chip makers. The chip makers are not actually making the chips; they're designing the chips, and then they send them to a foundry like Taiwan Semiconductor Manufacturing Company, which cuts the silicon and etches it. There just aren't enough foundries to handle all of the orders from people like Nvidia.

So, Nvidia has a limited supply of chips, and everybody wants them, so they have to parcel them out according to their relationships with various buyers. And for somebody like OpenAI, who's operating GPT-4, they're constrained by that, by how many GPUs they have. So when you type in a prompt, there are a certain number of tokens, is what they're called, but basically you can think of it as how many words, that get sent to GPT-4.

Well, GPT-4 only has so many GPUs. I mean, it certainly has a lot of GPUs, but it doesn't have enough to handle all the queries. And so it limits you. That's called a rate limit. It limits the number of tokens that you can process per minute, and it limits the number of queries you can hit it with in a minute.

And so with these big enterprise applications, a lot of companies are testing these in small pilot systems, but when they try and scale them to the enterprise, it gets too slow. There is this bottleneck that you can't get through. And that also limits the context window that is available to public users.

Because that context window, again, those are tokens that are having to go through this pipe to hit the large language model. So that will get solved. There are a lot of things being done, a lot of research: not every word you type needs to be sent to the LLM, and there are different ways of speeding up inference on the LLM side. So there's a lot of work being done to try and ease that bottleneck. But ultimately, it's going to require more silicon starts and more foundries. And it takes years to build a foundry. I mean, there are new foundries being built today, but it'll take 5 to 10 years before there are foundries producing enough chips to ease this bottleneck.

So that is a constraint that's going to be there for a while.
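To show what those rate limits mean for an application in practice, here is a small Python sketch of client-side pacing against a requests-per-minute and tokens-per-minute quota. The limits, the word-count tokenizer, and the print standing in for the API call are all illustrative assumptions, not any provider's actual numbers.

    import time

    REQUESTS_PER_MIN = 60       # illustrative quota
    TOKENS_PER_MIN = 10_000     # illustrative quota

    def send_batch(prompts):
        used_requests, used_tokens, window_start = 0, 0, time.time()
        for prompt in prompts:
            tokens = len(prompt.split())  # crude stand-in for a real tokenizer
            if used_requests + 1 > REQUESTS_PER_MIN or used_tokens + tokens > TOKENS_PER_MIN:
                # Wait out the rest of the one-minute window, then reset the counters.
                time.sleep(max(0.0, 60 - (time.time() - window_start)))
                used_requests, used_tokens, window_start = 0, 0, time.time()
            used_requests += 1
            used_tokens += tokens
            print(f"sending a {tokens}-token prompt")  # stand-in for the real API call

    send_batch(["Summarize this quarterly report ...", "Draft a reply to this email ..."])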

Richie Cotton: That's interesting, that computational cost is really a bottleneck at the moment, and that it comes from the lack of even just GPU chips.

Craig Smith: That's right. And electricity and the power required. That's very expensive. Yeah,

Richie Cotton: Okay, so that means there's a big economic incentive to have less computationally intensive models. And would that also result in a drop in price? Are we going to see a sort of race to the bottom in terms of pricing of these large language models?

Craig Smith: I don't think so, because of these constraints, and that's a very good point. Because the alternative, when I'm talking about this pipe, I'm talking about the API, the application, what is API? Application...

Richie Cotton: Programming interface. Yeah.

Craig Smith: Interface? I was going to say protocol, but yeah. You can build your own

LLM. And then you're only constrained by the number of GPUs you can buy, but that's very costly. So I think certainly in time there will be cost competition that will make this cheaper. I don't think that's going to happen very quickly, within the next year or two. I think these constraints are going to be there for a while.

Richie Cotton: Related to this idea of the changing economics of generative AI, it's also going to affect a lot of jobs, a lot of industries. Do you have a sense of which jobs are going to be most affected over the next year or two?

Craig Smith: First of all, there have been these crazy projections that half of all white-collar jobs are going to go away, or that you don't need CEOs, that it can all be done by AI. Any time there's a new technology, there are all these crazy projections that turn out to be wildly wrong. I do think it'll affect employment.

And this is an old saw that people have been using all year now: your job isn't going to be replaced by AI, it'll be replaced by somebody who knows how to use AI. So I think it's really important. And this is something I feel strongly about in early education and high school and tertiary education, where people worry, oh, students aren't going to do homework.

They'll just let ChatGPT do it. I mean, there will be strategies for recognizing that; I think most teachers can already recognize a generated essay. But people need to understand how to use this stuff, just as my kids do. Even today, I consider myself pretty tech savvy, but they'll see me trying to do something and they'll just be frustrated and go in and bing, bing, bing, bing, bing, and it's done.

The coming generations are going to have to, and they will, become comfortable and fluent in using all of these AI tools that are coming on stream, and the people that don't do that, or the companies that don't do that, are going to fall behind and eventually fall by the wayside, very much as happened with the Internet.

Early adopters, people who understood it, people who saw opportunities in how to use it, have advanced ahead of other people.

And it's not something that I have spent a lot of time researching, but on the image generation side, certainly there are a lot of artists, graphic artists, who are using image generation.

If you go on Midjourney, and anyone who's interested should try Midjourney, it's fascinating because you can see what people are generating streaming by. Most of that is being done by professionals who are refining images for their work. But as I said, there are various platforms that I write for where I'm responsible for providing an image, which is always a challenge with articles about AI.

And one of my pet peeves is that AI and robotics are two different fields. They are combined in some forms, in some places, but a robot is not an AI. And every article I see has got a robotic finger or a robot doing something. So when I'm coming up with images for articles, rather than going into a database and getting an image of a robot, or a Matrix-like screen of digitized code or something, I generate an image, and you can generate amazing images on your own. And if I were a graphic artist, I haven't spoken to anybody about how it's impacting their business, but I would guess that it's having a huge impact, because suddenly you don't need an artist to generate that image for you.

Richie Cotton: I have many questions for you around AI literacy, but since you mentioned that you don't like using robots as a representation of AI in images, do you have an alternative? Like, how would you represent AI in an image?

Craig Smith: Well, that's why I like the generative images, because they don't represent the AI. They come up with some surrealist imagery or Dadaist imagery that is evocative. But I mean, how do you represent AI? You can't take a picture of a large language model.

You can take a picture of the logo. You can take a picture of the web interface. So, yeah. But robotics is also an interesting field related to AI, because AI agents and world models in particular are now being built into robotic AI brains, the controllers of robots.

And one of the things that has limited the application of robots is their inability to deal with unstructured environments. Most robots are an arm fixed to a platform, doing things. And most of those robots you see, welding robots in an automobile factory, are controlled with very precise mathematical models, so if the weld joint is positioned correctly, the robot is going to weld exactly that point, but if the weld joint is positioned incorrectly, the robot will weld in the wrong place.

So AI now, with computer vision, can recognize where the joint is and take over control from the mathematical model to place the weld where it's supposed to be. And that kind of application is now being extended into all kinds of robotic applications, and eventually will lead to robots that can operate in the wild, so to speak, which we haven't seen yet.

And then the other side of robotics is that it's a hardware problem. We've all seen, certainly, Atlas at Boston Dynamics doing flips and stuff; there's a guy with a joystick off camera that you don't see, controlling that, and you also don't see how many times Atlas falls. New models, world models, AI agents, are going to solve that problem.

But it's the actuators and the joints and all of those fine-grained movement sensors that have yet to catch up. So robotics is really a hardware engineering problem.
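A toy Python sketch of the correction Craig describes: the robot's fixed program says where the weld should go, a vision system measures where the joint actually is, and the target is shifted by the difference. The coordinates and the detect_joint() measurement are invented for illustration.

    programmed_target = (100.0, 250.0)  # millimetres, from the fixed mathematical model

    def detect_joint():
        # Stand-in for a computer-vision measurement of the joint's real position.
        return (102.5, 249.0)

    measured = detect_joint()
    offset = (measured[0] - programmed_target[0], measured[1] - programmed_target[1])
    corrected = (programmed_target[0] + offset[0], programmed_target[1] + offset[1])
    print(f"weld at {corrected} instead of {programmed_target}")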

Richie Cotton: Absolutely. That's a fascinating use case, the idea that these robot arms can be guided by computer vision to improve the quality of manufacturing. I'd like to go back to what we were talking about previously, the idea of AI literacy, and how there are going to be some impacts on your job, but if you have some AI skills, it's going to help you out.

What sort of AI skills do you think you need in order to remain competitive in this sort of job environment?

Craig Smith: Yeah. And this is, you know, I have kids who are early in the job market. You don't need to write code. You don't need to understand algorithms or the finer points of training models; you need to understand how to use tools. And right now we're in the midst of this kind of combinatorial explosion of tools.

So it's hard to know which ones you should learn how to use. But I think it's important, in the same way that websites in the early days had all sorts of different formats but coalesced around common conventions, drop-down menus and so on: you can look at a new web interface and you may not understand what it does, but you understand how the buttons work, or generally how the flow of a website works.

And I think these models, or not models, I mean tools, are already becoming standardized; even though they do different things, the interfaces are becoming familiar to the people using them. And so the image generation models are tools that people should understand how to use.

There are all kinds of agents coming on stream that will be productized, and people will be able to learn how to use them. And I think that's important. I can't give you a list of tools, but anyone who's interested in any particular field should look at what's coming out,

what people are talking about, and try them.

Richie Cotton: Yeah, I think at this point it's just impossible to keep up with all the tools; there are thousands at this point, but I like the general idea. Yeah, maybe play around with some of these image generation models, play around with some agents, and see where you get to. Beyond the tools themselves, are there any things you think are important to have skills in? Like, do you need to understand anything around privacy or use cases or things like that?

Craig Smith: The ethics and the privacy, to me, those are really issues that have to be dealt with at a corporate level. I certainly think people using AI tools should generally be aware of how their data is being used, and how it may be misused, in the same way that it took a while for people on the Internet to understand that they're giving their data to systems that are not necessarily protecting the privacy of their data.

So I think generally people need to be aware of that. On the ethics side, people need to be aware, but again, this is more at a corporate level, of how these systems can be abused or how they can be biased and unfair. The obvious example is the systems that banks and mortgage companies use to decide who gets a loan, and those systems have been notoriously biased in ways that are not obvious.

But that's more of a corporate policy issue. A company that's deploying these models really needs to understand the biases, and be wary of small companies that have cobbled together a tool that is maybe cheaper than one from a larger company that has the resources to address those things.

And then if you're using generative AI text language models, just understand the hallucination problem and be really careful. And you really have to be careful. I've been caught several times, and fortunately nothing has gone out into the public sphere with my name on it, but I've realized that, yeah, I really have to be careful.

Richie Cotton: Absolutely. I do think these ideas, understand bias, understand hallucinations, keep a healthy cynicism because just because the AI said it doesn't mean it's true, are very important. Okay, so we've talked a lot about the predictions that you think are going to happen.

Have you heard any predictions where you think, no, this is rubbish, I don't think this is going to happen at all in the next year?

Craig Smith: As I said, with large language models there's been a lot of talk about intelligence, and then there's been a lot of talk about superintelligence. I mean, Sam Altman made some comments to the FT about planning for GPT-5, this was before his ouster from OpenAI, and that generated a lot of talk about superintelligence coming next year or something. This advancement towards superintelligence and AGI, I think, is going to happen, but I don't think we're on the right track to get there yet. And I think that it's far distant, and people shouldn't be expecting it or worrying about it right now.

The threats right now are things like agents that can spread disinformation cheaply. But superintelligence and AGI, to me, are still decades away.

Richie Cotton: Okay, so that's interesting: AGI is not something we need to worry about in the near future. But maybe just to finish up, what are you most excited about right now? What's your number one excitement?

Craig Smith: World models. That really fascinates me, and I think we're going to see a lot more coming out of that research over the next year or two. Because again, when you think about it, think about what a large language model is doing: it's got a string of text, and it's predicting the next bit in that string.

And it's amazing that it can do it. But it doesn't understand the underlying reality that that text describes. And a world model will, or does. There's a lot of work that has to be done on them and a lot of money that has to be invested in scaling them. But to me, that is a direction that personally I just think is much more fundamental and exciting for building true intelligence.

And then you put a language model on top of it.

Richie Cotton: Okay. World models, definitely something to look out for then. Thank you so much for your time, Craig. It was great having you on the show.
