The Power of Vector Databases and Semantic Search with Elan Dekel, VP of Product at Pinecone

Richie and Elan explore LLMs, vector databases and the best use cases for them, semantic search, the tech stack for AI applications, emerging roles within the AI space, the future of vector databases and AI, and much more.
Updated Mar 2024

Guest
Elan Dekel

Elan Dekel is the VP of Product at Pinecone, where he oversees the development of the Pinecone vector database. He was previously Product Lead for Core Data Serving at Google, where he led teams working on the indexing systems to serve data for Google search, YouTube search, and Google Maps. Before that, he was Founder and CEO of Medico, which was acquired by Everyday Health.


Host
Richie Cotton

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.

Key Quotes

What companies like Pinecone and the foundation model companies like OpenAI are doing are the big building blocks that enable innovation and they enable using machine learning in the real world and in conjunction with huge data sets of existing knowledge or new information, new knowledge, such as applying it to video streams or audio. So what we're doing is putting in place the infrastructure. And putting in place the infrastructure is hard, requires specialized skill sets, but taking these building blocks and building very powerful, and very useful applications for end users is a very different thing. There's huge opportunity for that.

As a company, you need to have a long-term vision for what you're trying to achieve with AI and keep focused on that, rather than focusing on the latest model to be released. A lot of new things pop up. Many of them are interesting. Many of them are just not that relevant. And the most important thing is that you focus on what you're building, and do a really good job at that.

In many cases, you can swap out the model for the latest, greatest thing at a later stage. So in many cases, you're not building your core product around the specifics of one model, you can keep updating them as they get better and better.

Key Takeaways

1. Vector databases are crucial for handling complex AI tasks, such as semantic search and image recognition, by efficiently working with vector embeddings to understand and process the semantic meaning of data.

2. Semantic search enabled by vector databases represents a significant advancement over traditional text-based searches, allowing for more nuanced understanding and retrieval of information based on the meaning behind queries.

3. Businesses can leverage vector databases and LLMs for a wide range of applications, from enhancing customer support with chatbots to improving internal knowledge access and decision-making processes.

Transcript

Richie Cotton: Welcome to DataFramed. This is Richie. One of the big problems with generative AI tools is that they sometimes hallucinate. That is, they make things up. Vector databases like Pinecone, along with a technique called Retrieval Augmented Generation, are being promoted as the solution to this. Today we're going to look into how vector databases can be used in software like chatbots and enterprise search, along with the latest developments in the space.

Our guest is Elan Dekel. He's the VP of product at Pinecone. Since he's literally in charge of building the vector database, I'm excited that we get to go straight to the source to find out what's new. Elan also has a long history in this area. He was previously a product lead at Google, where he led teams working on the indexing systems to serve data for Google search, YouTube search, and Google Maps.

Let's hear what Elan has to say.

Hi, Elan, great to have you on the show.

Elan Dekel: Nice to meet you, and thanks for having me.

Richie Cotton: I'd like to start with a problem, actually. So one of the big criticisms of large language models is that they sometimes make things up or hallucinate. So when might you need to worry about this problem?

Elan Dekel: One way to think about it is that a large language model is a combination of a reasoning engine, like a brain, and a bunch of data, a bunch of knowledge that it was trained on and is sort of stored in the model. And you run into problems when you try to ask it a question that was not in the data that it was trained on.

And in that case, it's not going to tell you, hey, I don't know the answer. Because the model was trained and the weights are set, it's perfectly happy to answer the question, and it'll generate an answer perfectly confidently. And that's what we call hallucinations.

Because these answers are generally totally made up. They sound very realistic, and the model states them very confidently. So it looks true, but in many cases it's totally made up.

Richie Cotton: Okay. And vector databases like Pinecone have been touted as one of the solutions to this problem of hallucinations. So can you just tell me what a vector database is and how it's different from a standard SQL database?

Elan Dekel: So, a vector database is a database that's designed to work with vector embeddings. A vector embedding is basically a long list of numbers. That's a vector, and these numbers are typically generated by machine learning models. And these numbers represent the semantic meaning of whatever object it is that you fed into the model.

So it could be a bit of text, and the model will convert that to a vector. A different model might take an image or a video frame and do the same thing. So basically you end up with a whole bunch of these vectors, and you need a way to search over them. And a typical database, a regular SQL database, is not designed to do that.

So that's where vector databases come in.
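
To make that concrete, here's a minimal sketch in Python, with made-up three-dimensional vectors standing in for real model output (real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

# Toy embeddings: each one is just a list of numbers. The values here are
# invented for illustration; a real embedding model would produce them.
dog    = np.array([0.9, 0.1, 0.3])
canine = np.array([0.8, 0.2, 0.3])
pizza  = np.array([0.1, 0.9, 0.7])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """How closely two vectors point in the same direction, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(dog, canine))  # high: related meanings
print(cosine_similarity(dog, pizza))   # low: unrelated meanings
```

A vector database's job is to run this kind of similarity comparison efficiently over millions or billions of stored vectors, not just three.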

Richie Cotton: Okay, so once you've converted all the text or whatever into numbers, it's something that machine learning can deal with. It's just math, rather than having to deal with text, I suppose. Okay, so can you talk me through what the most common use cases of vector databases are?

Elan Dekel: Yeah, there are many. You can think of it this way: in this new machine learning era, essentially all information at some point is going to be represented as these vectors, because that's how you apply machine learning to data. If in the past you had databases with lots of tables with text and things like that in them, in the future databases will all have vectors.

Not necessarily replacing the regular SQL databases; in many cases they'll work alongside them. But for many use cases that existing databases simply aren't designed for, vector databases will be used by themselves.

Richie Cotton: Okay, and are there particular business applications that are particularly popular at the moment?

Elan Dekel: So, yeah, again, there are many. Obviously, in the last year since ChatGPT was launched, many people, as you mentioned, are using it for what's called retrieval augmented generation, which is where you augment the knowledge in the large language model with knowledge that's retrieved from the vector database.

So that's a very popular one. People are using vector databases to create things like chatbots that know how to answer questions based on proprietary datasets, say a company's support information, or internal knowledge that only exists within a company, that ChatGPT or other models didn't have access to when they were being trained.

Image recognition is another one, image search, recommendation systems, tons and tons of things. So you can think of vector databases as a low-level primitive that essentially every application in this machine learning age will be using in the near future.

Richie Cotton: These all seem like incredibly important use cases. Things like chatbots are something more or less every company needs. I'd like to go into a bit more detail on some of these use cases, so maybe we'll start with the simplest thing that you mentioned, which is search. Now, we've had search capabilities for documents and websites and the whole internet for decades.

So, how is semantic search different?

Elan Dekel: So, search engines started by essentially searching over text strings. You type in, say, a few characters, like Peter, and it'll search through all the text in its database to find that exact sequence of characters. That was the initial type of search.

Obviously, it's problematic. Say you've misspelled the name, or say you're talking about a concept. Say you're talking about a dog, and the article that you're searching over refers to a canine or something like that. It won't find any of that. So that's where we come to something called semantic search, which is where the search engine knows the meaning of what you're searching for, and it knows how to do things like handle misspellings.

Those are much more sophisticated search engines, and companies like Google spent thousands and thousands of programmer years making their search engines aware of the semantic information in the document to deliver high-quality search. So the great thing about using vector embeddings and large language models for search is that you can get almost to the quality of Google search just by feeding the text through a model and doing vector search.

You don't need thousands and thousands of programmer years of work to build your own search engine.
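
As an illustration, here's a hedged sketch of semantic search, assuming the open source sentence-transformers library and its all-MiniLM-L6-v2 model; any embedding model would work the same way:

```python
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")  # small off-the-shelf embedding model

documents = [
    "Canines make loyal companions and need daily exercise.",
    "The stock market closed higher on Friday.",
]
query = "dog"

# Embed the documents and the query into the same vector space.
doc_vecs = model.encode(documents, normalize_embeddings=True)
query_vec = model.encode(query, normalize_embeddings=True)

# With normalized vectors, a dot product is cosine similarity.
scores = doc_vecs @ query_vec
print(documents[int(np.argmax(scores))])  # finds the canine sentence without a shared keyword
```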

Richie Cotton: Yeah, it certainly seems incredibly time consuming trying to work out which words are synonymous or nearly synonymous with other words. And to have that done for you automatically has got to be a huge productivity boost.

Elan Dekel: Exactly. And it's very, very complex. Again, companies like Google and Microsoft have thousands of engineers making that work really well. And in a sense, you can think of these large language models as allowing people, in one fell swoop, to essentially reproduce that level of quality, if not better, in many cases.

Richie Cotton: That's pretty fantastic. You also mentioned image search before. That seems like a very cool idea, but I'm not quite sure when you would use it. So, what are the business applications of image search?

Elan Dekel: So there are different types of use cases; there are two types. Image search is what we call a classification problem. So we have classification, and then we have extreme classification. Classification would be like, hey, this thing here is a hamburger; this is a picture of whatever.

Say you're building a website where you can search over food or things like that, and people are uploading pictures of their dinner. So you can figure out what that dish is. Extreme classification, on the other hand, is where you're actually trying to find an individual.

So, for example, if it's people, you can do face recognition. If you have millions of faces in your database and you're building, say, a system to verify your identity, or some sort of security system, and you take a picture of yourself, it can actually find an exact match of that face.

Classification would say, hey, this is a face, or this is a person. Extreme classification would say, oh, this is Peter, this is Elan, this is John. Both of those can be supported by these techniques and by vector databases, and again, there are many, many different use cases.
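
A rough sketch of the distinction: extreme classification boils down to nearest-neighbor search over labeled vectors, with a threshold deciding whether any match is close enough. The embeddings and threshold below are hypothetical stand-ins for what a face-embedding model would produce:

```python
import numpy as np

# Hypothetical face embeddings, one per known person, produced upstream
# by some face-embedding model (not shown here).
known_faces = {
    "Peter": np.array([0.9, 0.2, 0.1]),
    "Elan":  np.array([0.1, 0.8, 0.4]),
}

def identify(face_vec: np.ndarray, threshold: float = 0.9) -> str:
    """Extreme classification: the nearest labeled vector wins, if close enough."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    name, score = max(((n, cos(face_vec, v)) for n, v in known_faces.items()),
                      key=lambda t: t[1])
    # Below the threshold, we only claim "a face", not a specific identity.
    return name if score >= threshold else "unknown person"

print(identify(np.array([0.12, 0.79, 0.41])))  # nearest to Elan's vector -> "Elan"
```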

Richie Cotton: Yeah, I can certainly see how there are different levels of difficulty there. Just asking, is this a person in the video, is much easier than, is this Elan in the video.

Elan Dekel: Exactly. So, for example, people have used Pinecone for security systems. Think of a company that has thousands of security cameras with video feeds, and you want to find situations that are unexpected. Like, ooh, a person is now going through that doorway in my data center, which people don't usually go through; I'd like to get an alert.

So that's the kind of thing that you can build using a vector database.

Richie Cotton: So we talked about text, we talked about images. Are there other types of data that you can also include in your vector database?

Elan Dekel: Yeah, totally. I mean, again, anything that you can represent as a vector embedding is fair game. Just one example: there are systems at YouTube and other companies that deal with copyrighted content. So when you upload a video, say somebody's uploading a video that they shot at home, these systems know how to detect the background music and check whether it matches a piece of copyrighted music from a library of copyrighted songs. And then they can decide what to do with it.

So that's an example of representing the music as a vector and then matching it against a database of vectors representing different types of music.

Richie Cotton: Okay. So it seems like you can have more or less any data type in there, and audio seems to be an important one as well. Now, you mentioned a technique called Retrieval Augmented Generation before. Can you tell me a little bit more about what that involves?

Elan Dekel: When you query, or work with, a large language model to build something like a chatbot, you typically generate your query by providing it with some context about your question. So you can say, what's an example, like, how do I upsert data into Pinecone?

One way of doing it is to just go to ChatGPT and enter that as a query. And ChatGPT may or may not give you the right answer. It may provide a hallucinated answer that it just made up. So to get rid of this hallucination, you use something called Retrieval Augmented Generation, which means you use the generative capabilities of the model, but you augment it with information that you get via retrieval.

Retrieval is the process of querying some kind of information retrieval system. In this case, it would be a vector database. Say we had a database full of Pinecone's support information. We'd ask, how do I upsert data into Pinecone? We would query Pinecone's vector database, and we'd find the appropriate support documents.

We'd retrieve the text from them, and then we would generate context for the query. So you have that same query that you're asking ChatGPT, how do I upsert data into Pinecone? But then you add to it: please use the information that follows. Then you paste in the text that we've retrieved by querying the vector database.

So you basically have your query, and then you have the context which follows it. And this whole package you send to ChatGPT, or your large language model of choice. And it will know how to understand the content and the context and the query, and provide a really well-written response that is now accurate, because it's based on the knowledge that you've just fed it.

So this whole process is called retrieval augmented generation.
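
Here's a minimal sketch of that flow, assuming the OpenAI and Pinecone Python clients; the index name, the text metadata field, and the model choices are illustrative assumptions rather than a prescribed setup:

```python
from openai import OpenAI      # pip install openai
from pinecone import Pinecone  # pip install pinecone

client = OpenAI()                               # assumes OPENAI_API_KEY is set
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
index = pc.Index("support-docs")                # hypothetical index of support pages

question = "How do I upsert data into Pinecone?"

# 1. Retrieval: embed the question and fetch the closest support chunks.
q_vec = client.embeddings.create(model="text-embedding-3-small",
                                 input=question).data[0].embedding
results = index.query(vector=q_vec, top_k=3, include_metadata=True)
context = "\n\n".join(m["metadata"]["text"] for m in results["matches"])

# 2. Generation: the original query plus the retrieved context goes to the LLM.
prompt = f"{question}\n\nPlease use the information that follows:\n{context}"
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(answer.choices[0].message.content)
```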

Richie Cotton: Okay, so just to make sure I've understood this: the case you were talking about before, with search, is just going to directly return entries that are in your vector database. And in this case, with Retrieval Augmented Generation, it's combining those results with some kind of prompt to a large language model, to have a more natural language interface and get nicer results.

Elan Dekel: So the model will be able to understand the text that you've given it. It understands your question, it finds the knowledge in the context you provided, and then it constructs a well-reasoned response. In many cases, the context you give it could be two or three pages of text, which may include random gibberish.

So it actually goes through and finds the precise piece of information, reformulates it into a response, and then provides the response. It's quite remarkable that it works, but it works very, very well.

Richie Cotton: It's such an incredible technique, and it sounds like this is leading towards being able to build better chatbots. So can you maybe talk me through how Retrieval Augmented Generation works with chatbots?

Elan Dekel: For many chatbots today, I think some of the first use cases have been customer support. It's helpful because instead of waiting for a support person, you can get an instant answer from the chatbot. And in many cases, companies are also hesitant to let customers talk directly to a person because it's expensive, increases their support costs, et cetera.

With a high-quality retrieval augmented generation based chatbot, you can actually provide instant and very high-quality responses in many cases. So many companies are finding that it works quite well, customers are quite happy, and it reduces their costs and things like that.

There are many other use cases for this, by the way. For example, LLMs are amazing at reviewing legal documents and things like that. So there are many use cases that companies are using this for.

Richie Cotton: I'd love to hear more about those. In particular, have you heard any success stories from companies using this technique?

Elan Dekel: Yeah, there are many. I can't talk about all of them; I don't have approval to mention many of our customers. But, for example, Notion is a customer that uses Pinecone to allow people to ask questions about all the material that they've put into their Notion workspace.

For example, here at Pinecone, we use Notion for all our internal documentation, including our benefits and healthcare information, as well as just general internal corporate documentation. So you can go there and say, hey, when are the corporate holidays for Pinecone? And it'll find the appropriate page, pull out the information, and respond, you know, like ChatGPT does. So it's super, super useful.

Richie Cotton: That's incredibly important. I'm always amazed at how hard it is to find information about what's going on in your own company sometimes. So solving that would be an amazing win for a lot of places. Are there any other particular use cases? You mentioned the legal case. I'd love to hear a bit more about that.

What happens there?

Elan Dekel: So actually, for legal and business use cases in general, people are finding that using vector databases for semantic search, combined with the reasoning strengths of a large language model, is a very, very powerful combination. Some of our biggest customers are from the legal tech space.

For example, when you have a case, you have thousands of pages of documentation that you have to wade through to find the exact right paragraph that's relevant to your case. So they're using it for things like that. You can essentially ask questions about all the documentation that's been provided.

I'm not a lawyer, but apparently it works quite well for this use case.

Richie Cotton: Okay. So we've established that for a lot of these generative AI use cases, you need a large language model and a vector database. Is there anything else in the tech stack?

Elan Dekel: So clearly there are many pieces, but an important one is getting this information from the raw data, which could be a bunch of PDF files, or webpages, or even an existing database. Getting that data into the vector database to be used in this way can be a bit of work.

So different solutions have sprung up. Frameworks of different kinds are available, open source and non open source. But the problem is basically the same. You have a bunch of documentation, and you need to extract the text from it. You then have to go through a process called chunking.

Chunking is basically this: you have a whole bunch of text, and a document might be a hundred pages long. Typically one vector can represent, depending on the model that you use, a certain amount of text in a way that you can retrieve from it well. So, for example, for this legal documentation, you would typically want to chunk it up into a number of paragraphs, maybe a thousand words, plus or minus, something like that, and then create a vector that represents that amount of text.

So a large document could have several tens or hundreds of vectors that represent parts of that document. Anyway, there's a process of extracting all that text, chunking it up, and creating the vector embeddings. By the way, we should discuss for a second: to create the embeddings, you have to use something called an embedding model. It's similar to a large language model, but a bit different, because it's trained to output the embedding, which is again the stream of numbers, instead of responding in natural language. So it takes this input chunk of text and outputs the vector, and then you store those vectors in the database.

And once you've done all of this, you have your index and you're ready to go. The next step above that is to build an application that knows how to connect the vector database with the large language model. It essentially does some orchestration: it gets the query from the user, analyzes the query, and potentially even uses a large language model to break up the query. Say you've asked a relatively complex question; it might decide to subdivide that question into several sub-questions. Then for each sub-question, it'll query the vector database, get the responses, and construct this context, which it then feeds into the model for the response.

So, at a very, very high level, that's what this Retrieval Augmented Generation framework looks like.
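
As a rough illustration of the extract-chunk-embed-upsert pipeline just described, here's a hedged Python sketch; the index name, embedding model, and chunk sizes are assumptions you would tune for your own data:

```python
from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
index = pc.Index("legal-docs")  # hypothetical index

def chunk(text: str, max_words: int = 1000, overlap: int = 100) -> list[str]:
    """Split a long document into overlapping chunks of roughly max_words words."""
    words = text.split()
    step = max_words - overlap
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), step)]

def ingest(doc_id: str, text: str) -> None:
    """Chunk a document, embed each chunk, and upsert the vectors with metadata."""
    vectors = []
    for i, piece in enumerate(chunk(text)):
        emb = client.embeddings.create(model="text-embedding-3-small",
                                       input=piece).data[0].embedding
        # Keep the original text as metadata so retrieval can return it later.
        vectors.append((f"{doc_id}-{i}", emb, {"text": piece}))
    index.upsert(vectors=vectors)
```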

Richie Cotton: Okay. So it sounds like it can get pretty complicated, quite sophisticated, and maybe a little bit fiddly in places. And certainly that's my experience of playing around with these technologies: once you start trying to split your documents up into chunks, you're not quite sure how big the chunks should be.

And, I don't want to say tedious, but it can be quite difficult to get the right answer. I'm just wondering, are there any technologies to help make it easier to build chatbots using these techniques?

Elan Dekel: Totally. And in fact, when we first tried to do it internally, we found that it was quite challenging to get a high-quality answer. It turns out that you need to figure out exactly the right size of chunks. You have to understand exactly how to generate the query and the prompts. And yeah, there's a bunch of fiddly aspects, as you said.

So what we did is we built an eval system, which allows us to evaluate the quality of the results and try it out with different types of data, and that allows us to tune these different parameters. That's what we built, and we actually released a package that does this. It's called Canopy. It's available on GitHub, and it provides some tooling which makes all of this a lot simpler. And of course, there are many companies that have built solutions for this, some paid solutions and other open source frameworks, like LlamaIndex and LangChain, that are, again, open source and quite powerful.

So there's a lot of excitement, a lot of innovation happening in this space right now.
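
The kind of eval loop Elan describes might look something like this sketch; recall@k is one common retrieval metric, and the retrieval function and evaluation set here are hypothetical stand-ins for your own pipeline:

```python
from typing import Callable

def recall_at_k(
    eval_set: list[tuple[str, str]],            # (question, id of the relevant chunk)
    retrieve: Callable[[str, int], list[str]],  # your retrieval function
    k: int = 3,
) -> float:
    """Fraction of questions whose relevant chunk appears in the top-k results."""
    hits = sum(relevant_id in retrieve(question, k)
               for question, relevant_id in eval_set)
    return hits / len(eval_set)

# Hypothetical usage: rebuild the index at each candidate chunk size, then compare.
# for size in (250, 500, 1000, 2000):
#     index = build_index(chunk_size=size)  # hypothetical helper
#     print(size, recall_at_k(eval_set, lambda q, k: query_index(index, q, k)))
```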

Richie Cotton: It does seem like that's going to make it a lot easier to build these things quickly and get much more reliable systems. This idea of tuning parameters reminds me of the concept from machine learning called hyperparameter tuning, where you don't have to worry about setting the hyperparameters yourself: you get it to run automatically, try a few different values, and it automatically picks the best ones.

So that's excellent. Okay, the other tricky thing when you're building with this is cost. I've been speaking to a few Chief Data Officers, and they've gone from last year, when everyone was excited to build prototypes, to saying, well, actually, once you put things in production, it's just eye-wateringly expensive.

So, I guess, at what point should organizations start worrying about the cost of generative AI?

Elan Dekel: There are probably different aspects to cost here. One is based simply on the size of the dataset you're working with. Do you have a really huge dataset, with hundreds of millions or billions of documents, that then gets converted into billions or tens of billions of vectors? It can get expensive to generate the vectors and then store them in a vector database and query them. So that's one element. And by the way, generating the embeddings can also be expensive, because you're typically paying OpenAI or some other company to generate the embeddings for you.

So, depending on the size of your dataset, there's one set of costs. And then there's the query side. If you have a use case that's high QPS, basically, you have millions of people querying your system constantly, then you have to stand up a system that can support that query rate. And you're also paying on the inference side, using ChatGPT to generate the responses, and paying an embedding model to embed the query, and all of that.

So costs are based either on your performance needs, the number of queries per second, or on the size of your dataset. There are two aspects, at a high level.

Richie Cotton: Okay. So it sounds like there's a lot of different areas where you're going to be paying for things in this case. So, do you have any advice for organizations who want to reduce the cost of working with gen AI?

Elan Dekel: There are different techniques here. On the large-data side, we've launched a new system called Pinecone Serverless, which is designed specifically for companies that have very, very large datasets. We can store the vectors using object storage, blob storage like Amazon S3, which makes it very, very cheap to store the dataset.

For most organizations, the workload pattern is that they have a large dataset but relatively small compute requirements, relatively low QPS. So essentially what we've done with Pinecone Serverless is separate storage from compute. You can buy the amount of storage you need, and we're providing very, very low-cost storage. And then you get the amount of compute that you need specifically for your use case. We found that most organizations with that workload pattern of large data and low QPS will see a huge reduction in cost, up to 50x compared to existing vector databases.

Some customers have a very high-performance use case, like building a recommendation system. They may have a very, very small dataset, hundreds of thousands or a small number of millions of vectors, but very, very high QPS. That requires a different solution, which we have in our existing pod-based platform.

The other way you can reduce costs is by looking at which models you're using, both the embedding model and the large language model that essentially answers your questions. We found that there are open source models that perform very well in terms of quality and can be much cheaper to run than calling OpenAI, for example. And in fact, we found that the combination of a slightly lower-quality model with retrieval augmented generation can provide a very high-quality result. So that's another interesting way that people can reduce costs if needed.
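
For reference, creating a serverless index with the Pinecone Python client looks roughly like this; the index name, dimension, cloud, and region below are illustrative assumptions:

```python
from pinecone import Pinecone, ServerlessSpec  # pip install pinecone

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")

# Serverless separates storage from compute: vectors live in low-cost object
# storage, and you pay for queries rather than for always-on capacity.
pc.create_index(
    name="support-docs",  # hypothetical index name
    dimension=1536,       # must match the embedding model you use
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
```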

Richie Cotton: Okay, so your first point, about some customers having very low compute requirements compared to their document size, I can certainly see how something like trying to search a corporate intranet might be like that. It's got lots and lots of documents, but people aren't searching it all the time, whereas a recommendation engine, where everyone's trying to find out which product they should buy, is going to have much higher compute requirements. Okay. The other point you made was about using open source models. I've had a look on the Hugging Face platform, and they've got hundreds and hundreds of different models.

So are there any open source models that you think are particularly popular with your customers?

Elan Dekel: I think the Llama models; people also use the Falcon models. They actually publish a table where you can see a kind of leaderboard of the newest models and their quality rankings. So honestly, that's what I would look at,

Richie Cotton: Okay.

Elan Dekel: instead of just naming some off the top of my head.

Richie Cotton: All right. I'm sure every quarter there's going to be a new leader.

Elan Dekel: Totally. This changes all the time, and there's so much innovation in this space. There's an interesting company called Voyage that's creating vertical-specific models, like an embedding model just for legal or finance, things like that, which is super interesting as well.

Richie Cotton: On that note, do you have any advice on how to keep up? Because there are new models coming out all the time, new techniques, new frameworks. It seems like everything changes every few months. So how do you deal with that level of constant change?

Elan Dekel: Oh gosh. It's a good question. I think that's the challenge for everybody working in this space. First of all, as a company, you need to have a long-term vision for what you're trying to achieve and keep focused on that. A lot of new things pop up. Some are interesting; many of them are just not that relevant. And the most important thing is that you focus on what you're building and do a really good job with that. In many cases, you can swap out the model for the latest, greatest thing at a later stage. In many cases, you're not building your core product around the specifics of one model, so you can keep updating them as they get better and better.

Richie Cotton: Okay. So maybe you don't need to jump on every new technology that appears and start swapping things out every few weeks, but you should make sure that you're building stuff where you can swap out the model, or whatever other component, later. All right, I think this leads nicely on to talking about skills. A lot of these AI applications are kind of odd because they require both software skills and AI skills.

So, are you seeing any new roles appear for the creation of these AI applications?

Elan Dekel: It's hard to tell. Right now, we're a database company. We build infrastructure, so we hire engineers that are familiar with very complex distributed systems and know how to design for high performance, things like that. Our customers are the ones building the chatbots and things like that on top of us.

We build frameworks in between us and that level. So as far as I can tell, right now it's a combination of deep software engineering and understanding of machine learning. I understand that there are new roles like prompt engineers popping up, but we operate at a lower level than all of that.

Richie Cotton: Okay, so it seems like there might be a sort of boom in these lower-level infrastructure jobs, and then also maybe at the machine learning level as well. And for everyone else who isn't a developer and isn't into infrastructure, are there any other opportunities in this area that you're seeing?

Elan Dekel: A hundred percent. What companies like Pinecone and the foundation model companies like OpenAI are doing are the big building blocks that enable innovation, and they enable using machine learning in the real world, in conjunction with huge datasets of existing knowledge or new information, new knowledge, like applying it to video streams or audio or whatever. So what we're doing is putting in place the infrastructure. And putting in place the infrastructure is hard and requires specialized skill sets. But taking these building blocks and building very powerful and very useful applications for end users is a very different thing.

And I think there's huge opportunity for that, in essentially every vertical out there. When I say vertical, I mean things like the automotive industry, the accounting industry, the legal industry, finance, pharmaceuticals, et cetera. All of those are going to have thousands and thousands of use cases that are all built using the same infrastructure.

So if I were somebody looking to get into the machine learning field, or at least to use machine learning to innovate in traditional industries, I would learn how to use, on the one hand, these core building blocks that are built by companies like Pinecone and OpenAI and Cohere and others, but then also deeply understand a specific use case or a specific industry.

And once you understand that industry, you'll be able to figure out areas where you can innovate and optimize some process, or allow people to do things that they simply couldn't do before because of the scale of the data or the complexity of the task, things like that.

So there's, I would say, almost an endless list of opportunities out there. Again, it's about learning how these technologies work. And it's a lot easier to use Pinecone and to use OpenAI than it is to actually build those things. But once you have these building blocks, it's fairly manageable and straightforward, even for small teams of one or two people, to build very, very powerful solutions that can make a substantial impact on a company or on an industry.

Richie Cotton: I do love that this has become much more accessible. Like you mentioned, AI has just got a lot easier to use in the last couple of years, and I think it's continuing that way. Because of this, it feels like there are a lot of people without a technical background who have suddenly become interested in AI.

So what do you think are the most important things that everybody needs to know about AI and about vector databases?

Elan Dekel: I think understanding, at a conceptual level, what they do, how they work, how they interact with data, how you prepare data to be used in the models, and how to wire them together. And again, wiring them together is much easier than it was in the past. There are many solutions and products that help you do that.

Again, I think it's really down to understanding the problem that you're trying to solve. We see these every day. Like, somebody's trying to help an insurance company take images of accidents and figure out the damage that was caused: what were the specific types of damage, and how much does it cost to fix? That's just one example. Or you're an oil company that is drilling, and as you're drilling, you're uncovering all kinds of seashells. So you're building a database of literally millions, hundreds of millions, of these samples of things that you found in the seabed.

And you want to be able to search over them to gain insights about what you're drilling into. Again, these are use cases I never would have imagined, but as soon as you start digging into any industry, you'll find a huge number of them.

And these are, again, huge datasets. So it's definitely a big data problem, and they need machine learning and techniques like vector embeddings to solve it.

Richie Cotton: I have to say, I've also never thought about the use case of having a giant database of seashells. But yeah, you're right that once you start thinking about all the different industry use cases, there are so many options.

Elan Dekel: And the thing is, if you're an entrepreneur, or just a data scientist or a software engineer, you'll never uncover these just by reading or browsing the internet or whatever. You have to actually get out there and interact with these businesses, talk to them and say, hey, what are the types of things that you're trying to achieve?

Like, if you had this magic machine learning thing, what would you like to do with it? And they'll probably come up with ideas, and you can help them achieve that. Because I think they're also struggling from a lack of machine learning expertise. Again, this stuff is all new, and it's also new to them.

Richie Cotton: It is a tricky thing: there are so many possible things that you could do with this, it's sometimes hard to know where to begin. I was wondering whether you've seen any common patterns across your customers, or just from talking to people. Like, what's a good, simple first project?

Elan Dekel: The text-based RAG use cases, I think, are the simplest and most common. Example projects could be what I mentioned, like applying that to the internal knowledge in your company to allow people to essentially chat with their knowledge base and ask questions like, hey, how does health insurance work in my company, things like that.

That's relatively straightforward conceptually. It's not entirely trivial to put in place, but it's conceptually straightforward, and there are patterns and frameworks. And if you think about it, literally every company in the world will implement this in the next 10 years. Like, literally every single one. And probably 1 percent of them have done it now. So it's a huge, huge opportunity. Similarly for things like tech support, as I mentioned: any company that has customers would love to reduce their tech support costs and allow their customers to get a high-quality answer immediately, instead of having to wait over the weekend.

Those are just two examples. If you want to get started in the space, they're easy to start experimenting with, and I'm sure you'll find customers who need them immediately.

Richie Cotton: Yeah, certainly the support chatbot seems to be a huge thing. I can certainly see how having a bot that's programmed to be friendly all the time is going to be a good thing. Okay. So, do you have any other advice on how to get started?

Elan Dekel: Start building, start hacking on stuff. Download some of these open source frameworks. We have a free tier; just start experimenting. We have a Coursera course with Andrew Ng that you can learn about vector databases with. And then start learning about an industry, find some interesting use case, and start building.

And you'll probably surprise yourself with how quickly you can get something usable, something useful.

Richie Cotton: I love the idea that you should just start building. And I would be remiss not to mention that you can also learn about Pinecone on DataCamp from one of Pinecone's own developer advocates, James Briggs. So that's also a possibility.

Elan Dekel: Beautiful. Yes, love James. And I'm so pleased that he's worked with you.

Richie Cotton: Yes, excellent. All right. So, just to wrap up, what are you most excited about in the world of vector databases and AI?

Elan Dekel: The way we look at it is, if you look at all the data in the world today, only about 10 percent of that data, and I'm talking about digital data, is currently stored in a database. The other 90 percent of the data out there, all the unstructured data, all the videos, even emails and things like that, is currently not in a database, and is exceedingly hard to search over and use.

So this is where vector databases come in. The reason that data isn't in a database is because you can't really query over images or emails in a useful way in a SQL database. That's why you need search engines. But search engines today aren't great at these things either, so vector databases are another leap forward. We're excited about the fact that there's a huge opportunity here to use machine learning with huge amounts of data that are currently essentially not accessible, not usable.

So we think there are going to be tons of interesting things to build and to innovate with.

Richie Cotton: It's fantastic stuff. I love the idea that absolutely everything is data now, and you can actually work with it, search it, and calculate on it. Excellent. All right. Thank you so much for your time, Elan.

Elan Dekel: Thank you, Richie. It was great talking to you.
