Chatbots, Conversational Software & Data Science
Hugo is a data scientist, educator, writer and podcaster at DataCamp. His main interests are promoting data & AI literacy, helping to spread data skills through organizations and society, and doing amateur stand-up comedy in NYC.
Transcript
Hugo: Hi there Alan and welcome to DataFramed.
Alan: Hey, it's great to be here.
Hugo: It's great to have you on the show. I want to jump in and discuss your OG Medium post from April 2016, which really got the ball rolling for everything you're working on now. You opened this post with the following statement: "We don't know how to build conversational software yet." I feel like you meant "we" as a society and community of tool builders. Conversational software includes chatbots. I'm wondering now, two-odd years later, is it still the case that we don't know how to build these types of things yet?
Alan: I would say probably yes, I still agree with that statement, but we definitely made some progress. It's still very much early days for building great natural language interfaces for computers.
Progress of Conversational Software
Hugo: Tell me a bit about the progress you've seen in the past two years.
Alan: Maybe a bit of context as well on where that blog post came from. We built a few Slack bots, because Slack was one of the first platforms that opened up for this. Alex, my co-founder, and I had ... Actually we had a couple of chatbots that companies were paying for. They were both around making data science more accessible, turning natural language queries into SQL and then running those queries on a database. Anyway, we had some experience building these things and looked around and thought, "Wow. There really aren't any great developer tools for how to build conversational software." Then Facebook announced that the Messenger platform was opening up.
Hugo: What do you mean by the Facebook Messenger platform opening up? Opening up in order to have conversational software in it?
Alan: Exactly. Letting developers build chatbots and that anyone with a Facebook account could then chat to. We thought, "Well, we have now quite a bit of experience. We know it’s really hard to do this well and these are not going to be great experiences for people." I think that is actually pretty much how it panned out. There's a lot of hard work involved in getting things working well even in a narrow domain. Doing something with just kind of open ended conversation is definitely nowhere in sight.
Alan: I think we've made some good progress in terms of libraries and tools that people can use that make it easier to build something that does work, even though we've definitely not cracked this problem by any means.
What are the pressing things we haven't figured out yet?
Hugo: What are the most pressing things that we haven't cracked with respect to building conversational software?
Alan: There are a number of things. The two things you need for building conversational software ... The first is understanding messages that people are sending to you. That's called NLU, or natural language understanding. What that usually means is classifying a short piece of text as belonging to one of N intents. Those are class labels for things like a greeting or saying good bye or asking for some specific things that your chatbot can do.
Hugo: Like looking for a hotel or something like that?
Alan: Exactly. Like, "I'm looking for a hotel." Then, the other part is pulling out some structured data. Those are usually called entities. I'm looking for a hotel in San Francisco, and then knowing that San Francisco is what you want to use in your query. That's NLU. I think it's fine to call it that as long as you remember it's absolutely not true. The computer doesn't understand anything. It's just able to classify text into one of these buckets and then pull out some entities.
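To make that concrete, here is a minimal, hypothetical sketch of the two NLU pieces Alan describes, intent classification and entity extraction, using scikit-learn and spaCy. The intents, example phrases, and pipeline choices are invented for illustration; this is not how Rasa NLU is implemented.

```python
# A minimal sketch of the NLU step described above: classify a message into
# one of N intents, then pull out structured entities. The intents, example
# phrases, and pipeline here are invented and are not Rasa's implementation.
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

examples = [
    ("hello there", "greet"),
    ("hi, how are you", "greet"),
    ("bye for now", "goodbye"),
    ("see you later", "goodbye"),
    ("I'm looking for a hotel in San Francisco", "search_hotel"),
    ("find me a hotel in Berlin", "search_hotel"),
]
texts, intents = zip(*examples)

# Intent classifier: bag-of-words features plus a linear model.
intent_clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
intent_clf.fit(texts, intents)

# Entity extraction: reuse spaCy's pretrained named-entity recognizer
# to find place names (GPE = geopolitical entity).
nlp = spacy.load("en_core_web_sm")

def parse(message):
    intent = intent_clf.predict([message])[0]
    entities = [(ent.text, ent.label_) for ent in nlp(message).ents]
    return {"intent": intent, "entities": entities}

print(parse("I'm looking for a hotel in San Francisco"))
# e.g. {'intent': 'search_hotel', 'entities': [('San Francisco', 'GPE')]}
```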
Hugo: Yeah. This is a large concern in general with terms like artificial intelligence, that kind of anthropomorphism involved in nomenclature and the way we name stuff. It's dangerous, especially when it permeates common language.
Alan: Definitely. Definitely. I think there are lots of complications and limitations to even that simple model, where the intent of a short message is a single universal label: this always means this, and you can only represent its meaning with a single one-hot vector. That's obviously a very limited model of understanding what people can express. It's a good starting point and lets you build on top of that, but of course, it's not ... When we think about the future, I think that's not how we're going to represent the meaning of short messages.
Hugo: We have NLU as being a huge challenge. What was the other one you were going to speak to?
Alan: Yeah, the other one then is dialogue management. If somebody says "yes" in the middle of the conversation, the way you respond, of course, depends on the context. It depends on what happened before. It's not enough to just map each of these intents or each of the outputs of your NLU system to the same action. You need to always build up some kind of statefulness.
Alan: Then, the question is how do you handle that in a way that doesn't break and is actually maintainable? That's a big part of what we've been working on for the last couple of years. That's actually, I think, the biggest part that was really missing back in 2016: a reasonable way of dealing with that complexity, because the way that people were doing it, basically manually writing a lot of rules, just doesn't work and it doesn't scale. It causes a lot of headaches.
Hugo: Right. I like this idea of scaling. Because, as you say, writing a bunch of rules is, I suppose, the most bare-bones, naïve way to think about writing conversational software. I imagine you can have a set of nested "if-elses" to try and deal with everything, or a subset of everything. Then, as soon as a new use case comes up, you need to nest them even further. This is something which definitely does not scale.
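As a rough illustration of the rule-based approach being discussed, here is a toy, hand-rolled dialogue handler. All intents, states, and replies are invented; the point is only that every new situation adds another branch, and the branches start to interact.

```python
# A toy, hand-written dialogue handler of the kind described above. Every
# (state, intent) combination needs its own branch; each new feature
# multiplies the cases. Names and replies are purely illustrative.
def respond(intent, state):
    if state == "start":
        if intent == "greet":
            return "Hi! Do you want to check your balance or transfer money?", "menu"
        return "Sorry, I didn't understand your request.", "start"
    elif state == "menu":
        if intent == "check_balance":
            return "Your balance is 42.00.", "start"
        elif intent == "transfer_money":
            return "Who would you like to send money to?", "awaiting_recipient"
        return "Sorry, I didn't understand your request.", "menu"
    elif state == "awaiting_recipient":
        # ...and so on: every follow-up question, correction, or digression
        # ("what address do you have on file for me?") needs yet another branch.
        return "Okay, how much would you like to send?", "awaiting_amount"
    return "Sorry, I didn't understand your request.", state
```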
Alan: Right. Exactly. If you nest them deep enough, then it counts as deep learning.
Hugo: That's hilarious.
Alan: In general, you have ... I've tried this plenty of times. Even building a relatively simple bot ... It was a banking chatbot that we just built for fun to see, "How badly does this work if you really just do it with rules?" You could do a couple of things like check your balance and transfer some cash to people and everything. I think it came out at over 500 rules just for doing this simple stuff. Of course, then when you want to update something, or something goes wrong when you add a new rule, it clashes with the old ones. Then, you go, "Oh, no." Then, you have to go and try to reason about all these rules and figure out why it is that something broke or that something clashed.
Alan: There's an asymmetry there, which is interesting, because in the middle of a conversation ... The kind of cliché example by now is: What do we want? Chatbots! When do we want them? Sorry, I didn't understand your request.
About Alan and Rasa
Hugo: Okay. That's provided a great teaser into a lot of the through lines that we're going to talk about with respect to NLU, scaling, a lot of different use cases of these types of chatbots and conversational software. Before all that, I'd like to find out a bit about you and Rasa. Maybe you can tell me a bit about yourself, what you do, and what Rasa does and what Rasa is.
Alan: Yeah, Rasa is two different things. Rasa is a company. It's a start-up. It's also a pair of open source libraries. There's Rasa NLU, which does language understanding, so parsing short messages. Then, there's Rasa Core, which does dialogue management. I'm a maintainer of both of those libraries. The aim of them is really to expand chatbots and conversational software beyond answering simple questions, FAQ style, one input, one output kind of thing ... Turning it into a real conversation and building that in a way that scales.
Alan: Where I see us as a company, and also where we come from, was around 2016, when we said, "Okay, nobody knows how to do this." Actually, there's a lot of research on how to use machine learning to overcome some of these problems. There are a lot of great papers written on it, but there weren't any libraries that developers could use to actually implement those ideas. Where we see our role is really in that big gap between arXiv and GitHub, let's say. Something that's actually well maintained, has lots of tests, has people responding to issues, has support, gets updated regularly, and ... We do a lot of applied research in this field and we publish papers. We work together with universities, but it's always very strictly applied. Then, the primary output is always to put out some new code that people can use that does something better that they couldn't do before.
Alan: One recent example was we completely changed how we do intent classification. We shipped a new model, which threw all the old assumptions out the window and just said, "Okay, now we're going to learn word embeddings in a supervised way for this task." That lets us do things like building up hierarchical representations of meaning, understanding that a message can contain multiple meanings, because sometimes people just say multiple things. We do a lot of that. Then, the primary output is a piece of code people can use. Then, if we write up a paper, that's a nice component.
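Alan doesn't spell out the model here, but the general idea of learning word embeddings in a supervised way for intent classification can be sketched roughly as follows. This is a simplified illustration in Keras, not Rasa's actual architecture; the data, layer sizes, and training settings are invented.

```python
# Rough sketch: instead of relying on pretrained word vectors, learn an
# embedding layer jointly with the intent classifier, supervised by the
# intent labels. Purely illustrative, not Rasa's implementation.
import numpy as np
import tensorflow as tf

texts = np.array(["hello there", "bye for now", "find me a hotel in berlin"])
labels = np.array([0, 1, 2])  # greet, goodbye, search_hotel

vectorizer = tf.keras.layers.TextVectorization(output_sequence_length=8)
vectorizer.adapt(texts)
x = vectorizer(texts)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=vectorizer.vocabulary_size(),
                              output_dim=16),  # embeddings learned for this task
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x, labels, epochs=50, verbose=0)
```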
Hugo: This speaks to the open source side of Rasa. You said that Rasa is two things?
Alan: Yeah, so the other side is a company. The first year that we operated, we basically did a bunch of consulting work on top of the open source. That was really great, because building stuff with it yourself keeps you really honest about its limitations and helps you understand your customers. Then, after we did that for a year, which was obviously very nice and bootstrapped the company, we said, "Actually, we think there's a scalable product we can build here as an enterprise version." We talked to a lot of these big companies that we'd been consulting for. There was a clear need for an enterprise package with more features and a different product. We thought, "Okay, that's something we want to take a bet on." For the last six months now, we've stopped all the consulting work. We raised some venture capital and really just went full on building out the enterprise version.
Alan: Then, still all the machine learning stuff goes into the open source. A big part of what we believe in is you can't build up a competitive advantage by having secret implementations of algorithms lurking around. The machine learning stuff needs to be open and needs to be tweakable. People need to be able to play with it. There's just so much nonsense around in the AI space that it's better to just say, "No fairy dust. There's no magic. It's just stats. Go look at the code. All the machine learning's open source. You can see exactly what it does. You can tweak it for your own purposes."
Hugo: Yeah. I'm sure all the different interplays between your company and your open source development are really exciting. For example, correct me if I'm wrong, but you've recently hired two machine learning researchers.
Alan: Yeah. We're only 10 full-time people, but two of those are full-time on ML research. The measure that we really care about is how quickly can we take a new idea, like this little trick actually works, and then put it in production. The great thing about having the open source community is that whenever we have a new idea or something that kind of works, there are just thousands of people who are just ready to check out the master branch and see what it does on their data sets. That feedback loop is really awesome. You can't do that if you build things that are closed source product. You don't get that kind of insight.
Hugo: That's really awesome. I suppose we should say also that it's a Python library, right?
Alan: It is. It's written in Python, but we obviously also work a lot with large companies. Our main enterprise customers are Fortune 500. They're, let's say, mostly not on Python. They mostly write in .NET or Java or C#. Then, there's a large chatbot developer community that uses JavaScript. What we did was make sure that the libraries, even though they're written in Python, can be used without writing any Python. You can consume everything over an HTTP API. That means that you don't have to be running a Python stack yourself to use these libraries. You can spin them up in Docker containers, use them, and deploy them to production without having to actually write Python yourself.
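As a hedged illustration of what consuming the libraries over HTTP can look like, here is a sketch using Python's requests library (shown in Python only for brevity; any language with an HTTP client works the same way). The host, port, endpoint, and payload are placeholders and depend on the Rasa version and how the server is configured.

```python
# Illustrative only: the endpoint and payload below are placeholders and
# will differ depending on the Rasa version and server configuration.
import requests

NLU_SERVER = "http://localhost:5000"  # assumed locally running NLU server

response = requests.post(
    f"{NLU_SERVER}/parse",
    json={"q": "I'm looking for a hotel in San Francisco"},
)
print(response.json())  # parsed intent and entities, returned as JSON
```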
Hugo: That's really cool. I may be putting the cart before the horse, or the chatbot before the company, to extend the analogy into absurdity, but there is a fantastic DataCamp course that you created and I facilitated last year on building conversational software with Python using Rasa. There's also, for those people who know a bit about the Python data science landscape, a lovely interplay between Rasa, scikit-learn, and spaCy, for example.
Alan: Exactly. There's no point in reinventing the wheel. There are a lot of really great libraries out there for doing different things ... For example, in Rasa Core, which is the dialogue manager, you can plug in different back ends to implement your model in and actually do the machine learning part. One way you can think about Rasa Core is that it does all the hard work to get a conversation into the kind of X-Y pair format that you think about when you think about machine learning. Then, you can plug in whatever classifier you like. Of course, we have some good ones implemented already. You can implement your stuff in TensorFlow or Keras. Actually, I think the Keras API was a big inspiration for that. Just saying, "Okay, can we abstract over all the things that aren't important for understanding the problem and really just present the API that makes sense to you?" Then, we don't need to build our own autodiff library, because why should we? You can use TensorFlow for that. That also runs on GPUs, et cetera.
Alan: Similarly for doing things like part-of-speech tagging. There's a lot you can do with spaCy, and it makes more sense to build on top of libraries like that. You can choose different back ends you want to use to implement some of the lower-level functionality for both Rasa NLU and Rasa Core.
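For readers who haven't used spaCy, the part-of-speech tagging Alan mentions looks roughly like this, assuming the small English model has been downloaded:

```python
# Part-of-speech tagging with spaCy, one of the lower-level NLP tasks that
# a library like Rasa can build on rather than reimplement.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes: python -m spacy download en_core_web_sm
doc = nlp("I'm looking for a hotel in San Francisco")
for token in doc:
    print(token.text, token.pos_)
```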
Define Chatbot
Hugo: That's fantastic. In a second I want to think about what types of use cases, which verticals and industries, you see most interested in conversational software. Before that, I just want to step back a bit. What's a good working definition of a chatbot or conversational software? Are they the same thing? That's a relatively ill-formed question, but maybe you can speak to that.
Alan: Yeah. I'd say that the things that we're interested in are not just strictly chatbots. Some people would call some of these things virtual assistants. I think that voice is extremely important. Anything where you interact with the computer through natural language is something that we're interested in. I would call all of that conversational software.
Hugo: That's different to just a chatbot, right?
Alan: I think chatbots mean different things. In some groups of people, a chatbot strictly means something that does chit chat, can't actually do anything for you, and is just there having an interesting conversation. That's actually less what we're interested in. Small talk is not something that we really focus on. I'm more interested in purposeful conversations that actually do something. Then, there's another kind of definition that says a chatbot lives on these mobile messaging apps, right? Facebook Messenger or Slack or WhatsApp or Telegram. Any kind of application that lives inside one of those apps, whether it's chat-based or button-based or shows you a little web view, all of those are also chatbots. It's ill-defined, as most terms are.
Hugo: This may be a relatively naïve question, but I think naïve questions can give us a lot of insight. Why should we care about chatbots or conversational software?
Alan: I think it's a great question. I think one of the main reasons I'm really excited about it is that it makes computers usable by people who aren't experts. That can be on a very banal level or it can be on a more sophisticated level. For example, my grandfather never used a computer in his life. He knew what the word "internet" was, but he never knew what it meant and he never had an experience of it. Then, I think about my parents, who are to some degree computer-literate, but also certainly not the same as my generation. Then, I think of how quickly technology is progressing.
Alan: If we all believe that tech goes exponentially fast, then if my grandfather was behind, imagine how far behind we're going to be when we're old. I think a big step we need to take is, rather than requiring computer literacy, how does the computer adapt to deal with how we already think about things? Sometimes it's for really simple stuff. Just asking a simple question and getting an answer, rather than having to open up a document and find where it is or something. Then, sometimes it's for more sophisticated things. I always like to think, how many Google searches do you think there are every year for, "How do I do X in software Y?" Then, imagine that you didn't have to Google that. You just had to say to the software, "How do I do X?"
Alan: Photoshop is something I know is very powerful. I have no idea how to use it, but I could express some of the simple things I'd like to get done. "Add a blue border around this image" or something. Why can't I just engage with computers by saying what I want and just having it happen? I think it's powerful in a lot of different ways, especially for making tech more inclusive. It's really important.
Hugo: No, that's great. Because that also speaks to a very personal, individual power. Not the type of power when you're working with corporations or businesses achieving business goals, but really on the ground empowering.
Alan: Yeah. Yeah, absolutely.
What industries are interested in conversational software?
Hugo: Which verticals and industries are currently most interested in using conversational software?
Alan: My sample is definitely biased, because I work on Rasa. The companies that we know about, and that approach us, are the ones that find this important enough that they want to invest their own engineers into building it out. They realize it's strategically important and want to build up these capabilities. I'm sure there are other industries where it's really relevant, but the considerations are different. They don't necessarily want to run the tech or own the tech in-house. Where we see the strongest pull is from financial services, so insurance and banking, and another one is automotive. I think that makes a lot of sense. I remember when I got the Amazon Echo. I had it for about six months. I thought it was cool. It was fun, but it wasn't a game-changer.
Alan: Then, I went to a conference in the south of Spain. I rented a car and I had to drive three hours from the airport. I remember just driving on the motorway in a foreign country with my phone clipped to the dashboard. I was tapping on it to change the map and to change the song on Spotify. I thought, "This is so dangerous. Why can't I just talk to the car and say, 'Play this song' the same way I was used to doing with the Echo at home?" I think automotive is a really valid use case, especially for voice interactions.
Hugo: For sure. That's very clear where the value can be delivered. The example you just gave was a paradigm, I think. How about in insurance and banking? What kind of value can conversational software deliver there?
Alan: I think it's on different levels. If you think about what your main interface to your bank is, to the data that you have with the bank, how do you engage with it? Some banks have really nice mobile apps and you can just do everything there. One other interface that a lot of people fall back to is just going to the branch and talking to a human. Why do you do that? Because you have questions, right? You generally can't ask a question of an app. You can do the things that were suggested to you, that the developers decided to implement, but you can't say the things that you don't know or ask for the things that haven't been implemented. Asking for advice, understanding things a bit more in depth. Those are all things where there's a lot of value to be added, especially around insurance, where the actual product that you buy, the policy, can be very complicated and hard to understand. Talking to a human is also tedious. You have to take time out of your day to engage with them.
Alan: If you have a 24/7 agent that you can chat with, that can answer most of your questions, that's actually pretty powerful.
Hugo: Having some sort of intelligent conversational software for banking in that sense, for questions and FAQs and that type of stuff, would stop me from becoming infuriated when I call the bank. You end up in one of those graphs that directs you downwards, and you just keep shouting "representative" like 15 times, or pressing 0 for whatever, because it makes no sense. I also have the added challenge, maybe you can speak to this, that I'm an Australian living in the United States most of the year round. I have to put on a fake American accent to be understood by the automated telephone system of Bank of America.
Alan: I can definitely understand that. I've seen a lot of that also. My girlfriend's from Scotland. Her accent can be misunderstood both by humans and by software. The same goes for understanding non-native English speakers. There have been some great successes in understanding speech, but it's certainly not a solved problem or a commodity by any means.
Alan: It's not something where we try and compete. You've got to focus on something, and it's not something that we do. It's certainly a really interesting problem.
SMS Chatbots
Hugo: Hey, I remember you telling me some time ago about an insurance chatbot, but it wasn't voice. It was an SMS chatbot. Maybe you could tell us a bit about these types of use cases.
Alan: Yeah. That, actually, was a really interesting one. It's a large insurance company in Switzerland. A really big company, 160 years old. They wanted to do something to engage people whose house insurance policies were about to expire. These are five-year policies and they run out. They have a large customer base. Because employing people in Switzerland is very expensive, it literally wasn't worth their time to have an agent call up all these people and ask if they want to renew their policy. What they did was build a chatbot, actually with Rasa, and it went out over SMS and engaged with these people. It said, "Hey, your policy is due to expire. Is your living situation still the same?" It would ask them questions. Has anything changed? Maybe you moved. Maybe you got a dog. Maybe you got a more expensive car. Or whatever it is. If the person wanted to renew, and it had collected all the data it needed, it would say, "Okay, here's your quote. Is that cool?" If they agreed to it, it would actually finalize the policy, and the policy would be in the post to them within a couple of days.
Alan: It's really end-to-end automated. That's also what I mean by conversations that really do something. It's nice to answer questions, but it's a lot more powerful ... You can say, "Okay, it's now done." It's on the way. I think one of the interesting things is it challenged a lot of assumptions around what chatbots are and what they're useful for. People think about it firstly as a customer service thing and saving costs, which is certainly relevant. If you can actually increase revenue, that's really compelling. This was actively reaching out to these people over SMS. It was a 30% conversion rate, which is really astonishing in terms of getting people to buy a new five-year policy.
Alan: The other thing is that people think about chatbots and they think about Generation Z and messaging apps, when actually the first person to buy an insurance policy over this chatbot was a 55-year-old Swiss lady. That's all really interesting. I think it also speaks to the fact that it does make tech more accessible for a larger group of people. You can just speak to these systems. You let your customers speak to you the way they think about the problem, rather than forcing them into your paradigm, which is the software that you've built for them to engage with your company.
What industry isn't using conversational software to its potential?
Hugo: Exactly. You've spoken to some really interesting use cases in insurance, banking, the automotive industry. Are there any other industries that aren't as interested in conversational software as you think they should be?
Alan: That's a really good question. I am not sure I have a good answer to it. I have personal frustrations with things like telecom companies, getting your internet set up in your flat and all that kind of stuff, where it could certainly be a lot more useful if you could have 24/7 automated support that you chat to and get things done. I'm not sure I can think, off the top of my head, of examples of industries that are really neglecting this. We actually see a really big rise in companies reaching out telling us that they're using Rasa, sometimes asking about the enterprise version and sometimes just with questions. It's actually much more diverse than the use cases I listed, but definitely the dominant ones are things like financial services and automotive.
Loops
Hugo: Okay. Great. As you mentioned your frustrations with calling up telcos, for example, I actually remembered government, which I think is probably a great example.
Alan: Oh, yeah.
Hugo: The IRS and the tax system. I actually ended up ... I mentioned my frustration with those telephone trees that you get sent down, like, "Press 1 for whatever". I ended up, somehow, when I called the IRS several years ago, on a loop. I thought maybe you wouldn't be allowed loops in these graphs, but I ended up on a ... I was about to swear. I ended up on a loop in one of these conversations. I ended up hanging up. There's definitely room for a lot more aspects of conversational software in these types of places.
Alan: Yeah. I think the experience of someone who's in a phone tree is, "How do I most quickly get to speak to a human?" It's not, "How do I most quickly get the thing done?" Just because we have such low expectations of these things ... Because they're mostly just pre-qualifying what your problem is to send you to the correct person, or so that when the person talks to you, they already know half the information that they need and don't have to collect it. It's really optimization on their part. It's not optimization from the point of view of you as a customer getting your problem solved quickly.
How to get started in conversational software.
Hugo: For sure. I think this has provided a lot of insight for our listeners into conversational software, where it's being used, the ins and outs. I'm sure a lot of them are eager to hear about where they can get started. I think both technical and non-technical listeners alike would be interested in hearing about how they can get started with conversational software. Maybe you can give a few pointers?
Alan: Yeah, definitely. Partly because there was this big boom when Facebook Messenger opened up, there are actually some really nice online tools where you can just point and click and build a little prototype of a chatbot. Dialogflow is one example. It's owned by Google. Chatfuel is another. You don't really have to write any code. You can design a prototype of your chatbot. You don't have to set up a server or anything. You can very quickly get something you can try out and just get a feel for how something like that would work.
Alan: Then, if you want to go beyond that, there are lots of good resources on understanding the tech behind it, and on building something you can really maintain and scale. There's, of course, the DataCamp course that we created, which really covers the fundamentals. What are you really doing? What is really going on? Demystifying this concept. If you've never worked in NLP or anything, it seems really bizarre and almost impossible. How do you build computer programs that understand language? It seems so inconceivably hard. We go into a lot of the fundamentals there.
Alan: Then, if you want to build more advanced things, you can either build everything from the ground up, which is always an interesting approach, especially as a learning experience. Or you can use something like Rasa, an open source library, where a lot of the heavy lifting is done for you. You can then get started, build something out, then really tweak parameters to get better performance on your data set. Give it to real people and iterate. I'd say the best way, if you want to tinker, is to build a chatbot first. If you want to actually solve a problem and actually understand what people do, then the best way is not to build a chatbot, but to pretend to be a chatbot yourself. Just say that it's a bot and ask people to talk to it. Then, answer them yourself.
Alan: If people don't like the experience with your brain behind it, then they're definitely not going to like the experience with an artificial machine learning system behind it. It's a great way to validate if what you're doing actually makes sense and really get some inspiration for all the things you didn't think about.
Alan: One mantra that we have, and it's really one of the principles that we use to build our products, is that real conversations are more important than hypothetical ones. I'm less interested in giving people tools to design hypothetical situations and think about all possible conversations that people could have. It's more important to look at the real conversations that people do have and learn from them. That's what Rasa Core is all about: learning to have conversations from real data.
Hugo: Absolutely. There's so much in there. I want to recap a couple of the takeaways. The first takeaway for me, of course, is take this DataCamp course. I do think it provides a wonderful introduction. In the first chapter, you get to build a chatbot which is based around one of the early chatbots, the ELIZA chatbot, which is incredible. You do a bunch of natural language understanding. Then, you get to build a virtual assistant, a personal assistant that helps you plan trips, which is really cool. I think you've put a lot of insight into this course about everything we've been talking about.
Hugo: The second takeaway, I think, is that you need to think about what the purpose of your chatbot is. For example, a lot of people, I think, have a misconception that chatbots need to sound like humans. The question is, if you've got a virtual assistant that's going to help you plan a trip, do you care whether it sounds like a human at all? Or do you just want it to do the job you want it to do, right?
Alan: Yeah, I do have to agree with you. I do think there's a big, important topic as well around design: writing good copy, being empathetic, doing things like active listening. That's a whole set of problems kind of orthogonal to what we're really tackling. It is important. But of course, yeah, the question is actually: do you solve the problem? Do you actually do something for people?
Alan: I end up using, at least a couple of times a month, some product that I just completely hate, but I have to use it, because it's the only thing that gets the job done that I need. I think also in start-ups, if you think about product-market fit, you almost want to put barriers in people's way. You almost want the first version of your product to be really crappy, maybe even intentionally so, just to see if people persevere, because it really solves a problem for them.
Misconceptions of Conversational Software
Hugo: What are the most common misconceptions surrounding conversational software that you want to correct?
Alan: I think the first one is really artificial intelligence, maybe even artificial general intelligence. That you need to wait for DeepMind or OpenAI to solve these grand problems, and then you'll be able to download a brain from the internet that's going to magically solve all your problems. You can build stuff now with the techniques that exist. You've just got to put in a bit of work. You don't need a degree in statistics or computer science or anything like that to build these things. You can definitely self-teach. You can build something that's really useful and adds a lot of value with what's out there now. You shouldn't even try, I think, to build a do-everything kind of system.
Alan: I think another one is that only Facebook, Google, Amazon, and Apple have the tech to build conversational AI. It's just not true. We see that a lot. There are also some academic papers published where they compare Rasa to some of these closed-source tools from Google and Microsoft, and it actually does very well, because there's no fairy dust. I don't want to pitch Rasa too much, but the way Microsoft and IBM think about this problem is: we have magical algorithms in the cloud; upload your data here and we will turn it into gold. I think that's just nonsense. You can build a lot of great things, even better things, with open source tools, because you have full control. You can tweak things. You can customize things for your use case. It's definitely not something that's only in the domain of the big tech companies.
Alan: Then, another one that we already spoke to a little bit is this generational bias, that it's only something that's fun and light and for Millennials. It's actually something for the Boomers and the older generations as well. It's a really important piece of tech now.
Rasa
Hugo: One thing you mentioned was this idea of product market fit. I want to zoom in on where Rasa Core came from and how it emerged. I want to approach it from, I suppose, the idea that we discussed at the start of this conversation about needing to have conversational software that scales. Maybe you could tell us about the landscape before Rasa Core and then how Rasa Core emerged in relation to the idea of scaling this software.
Alan: Yeah, definitely. Before Rasa Core came out, there were some nice cloud APIs for doing the NLU part. You send it a sentence and it sends you back its interpretation of what that means: what's the intent, what entities are in there. Then, the question is, what do you do in response to that? You have things like yes and no, and of course the response to those depends on what happened before. What do you do? You write some rules. You write out a rule for, "Okay, if the last thing we asked was this and the person says yes, then proceed down this direction. If the person says no, then go down this path." That's great. Then, you say, "Okay, we now need to maintain that state over multiple turns in a conversation." You don't just build point A to point B. You build a state machine.
Alan: You say, "Okay, this person is currently in this state." Now, they've said this. I've moved into this state, where they're now ready to make a purchase or they're asking about this topic or whatever. That, of course, works to an extent. Then, you deploy this and you give it to people. You ask them a yes or no question. One example is, "Should we just send that to your home address?" You've maybe built a branch for yes and you've built a branch for no. Then, the person responds, "Oh, what home address do you have on record for me?" There's always an edge case that you haven't thought of. Which is fine. Of course, all software has edge cases. Then the question is how do you deal with that? If that's an important thing that lots of people say, you need to answer that to them. You can either then add another nested if statement in your logic, or you can add another state to your state machine, which then is explaining some deeper information. Then, probably you can get into that state from multiple different other states. You have one use states, but then you have order N squared ways of getting in and out, because you have all the other states.
Alan: That very, very quickly becomes unmanageable. You have this bag of rules about how people navigate this state space. Then, whenever you want to change something, it clashes with old rules or breaks something else. The thing is that, mid-conversation, it's absolutely trivial to know if a chatbot said the wrong thing. You can give that to a four-year-old and they'll say, "No, that's nonsense. That doesn't make sense." But it's really hard to then figure out from that big state machine why that happened, why it went wrong. Reversing the logic.
Alan: We said, "Okay, why don't we do it completely differently?" We don't build out that state machine at all and we just say, "Okay, here's a conversation that went wrong." Say what should have happened instead. Add that to your training data. Then, train the machine learning model that learns how to have these conversations. Rather than having a fixed, solid representation of this state machine, you learn a continuous fuzzy representation. A continuous vector that represents that state. You learn that such that you can do these conversations that you've had, you've seen. Then, you can measure how well it can generalize some of these patterns.
Alan: There's a recent paper that we just finished and will make public very soon, where we actually study how well you can take these general patterns, like answering a clarification question such as, "Oh, what address do you have on file for me?", and reuse them in different contexts and even different domains. That's what we've always wanted to do with Rasa Core: say, "Throw away your state machine. Don't try to anticipate and write rules for every possible conversation." It's basically impossible to build a flow chart and reason about every conversation that everybody could have, because it's just a combinatorially big space. Don't do that. Just have real conversations and learn from them. That's where Rasa Core came from.
Alan: There was lots of research on doing machine learning-based dialogue management. We certainly didn't invent that. But we had to do things a little bit differently from how the academic world was doing it. They were doing a lot of work on reinforcement learning, and there were some technical reasons why that didn't make sense for people getting started with Rasa Core. The short version is basically that you have to implement a reward function, and that's not a trivial thing to do. We said, "Okay, we don't go for reinforcement learning. We do everything with supervised learning." We let people build up these conversations interactively by talking to the system. That's what we've been doing with Rasa Core: taking some of the ideas from the literature, the ones that we think are most applicable.
Alan: Then, straying from the literature where we think it makes sense to. Then, building a library that lets people who don't have a PhD in statistical dialogue systems actually build machine learning-based dialogue, and not have to build these unmaintainable state machines.
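The approach Alan describes, learning a dialogue policy from example conversations with supervised learning rather than from hand-written state machines or a reward function, can be caricatured in a few lines. This sketch only illustrates the idea; the features, intents, actions, and model are invented and are not Rasa Core's actual featurization or architecture.

```python
# Caricature of machine-learning-based dialogue management: turn example
# conversations into (state, next_action) pairs and fit a classifier.
# Everything here is invented for illustration.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each training example: a crude dialogue state (previous bot action plus the
# latest user intent) and the bot action that should come next. A real system
# would also track slots and more of the conversation history.
training_turns = [
    ({"prev_action": "none", "intent": "greet"}, "utter_greet"),
    ({"prev_action": "utter_greet", "intent": "search_hotel"}, "ask_city"),
    ({"prev_action": "ask_city", "intent": "inform_city"}, "utter_quote"),
    ({"prev_action": "ask_city", "intent": "ask_clarification"}, "utter_explain"),
    ({"prev_action": "utter_greet", "intent": "goodbye"}, "utter_goodbye"),
]
states, next_actions = zip(*training_turns)

policy = make_pipeline(DictVectorizer(), LogisticRegression())
policy.fit(states, list(next_actions))

# When a conversation goes wrong, you add the corrected turns to the training
# data and retrain, instead of editing a tangle of hand-written rules.
print(policy.predict([{"prev_action": "ask_city", "intent": "inform_city"}]))
```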
What is the Future of Conversational Software?
Hugo: That's great. This movement from state machines to the importance of real conversations and machine learning, implementations of machine learning algorithms which learn from more and more conversations, more training data, real data, is incredible. So what happens next? What does the future of conversational software look like, thinking about it in these terms?
Alan: That's really interesting. I think for developers, it's a really cool time to be in this. We engage a lot with our community. That's probably my favorite thing about open source: we have literally thousands and thousands of developers who use our software. We have over 100 contributors to the code base, who are just people using it for their own purposes and contributing back. There's a real sense of excitement. People are building and inventing new things. I kind of think about it like being a web developer in the 90s. If you build websites now, or you have friends in web development, it's well understood what you have to do. None of that's really been invented yet for conversational AI.
Alan: I mean, some of the kind of early versions of it are there. As a developer, this is a green field. I get to invent a lot of stuff. That's really fun. There's still lots of challenges. It's not, by any means, a solved problem. That's why we invest so heavily in research and breaking through the limitations of what we see currently in our own libraries and then what people do in academia. Pushing beyond that and shipping it into the libraries.
Alan: On the consumer side, the future is lots of little magical moments where something just works. I had a great one literally yesterday. I was on Google Analytics, which is not something I look at very often, maybe once every few months, because I was working on our documentation. I wanted to know how many people view our documentation on a mobile device. I'm not an expert in Google Analytics, so I would have had to go and build some filter. They have this cool feature where you can just ask a question. I literally just typed into the box, "How many of our users at nlu.rasa.com, our documentation, are on a mobile client?" It just answered, "6%." I thought, "Amazing. I didn't have to do anything and it worked the first time."
Alan: Those magical moments are really cool, and especially fun because that was one of the first use cases we worked on in early 2016 that we actually had some paying customers for. To see that deployed at scale in the wild was really exciting. I think there are a lot of really magical moments that we'll all experience over the coming years, but what we won't have anytime soon is the do-everything magical assistant that replaces a human butler or something.
What are other challenges facing conversational software?
Hugo: Yeah. It's great to hear your experience as a user as well, as someone who develops a lot of these things also. You spoke to the future challenge, or current or future challenge, of scaling to multiple use cases. Are there any other big challenges facing conversational software development?
Alan: Yes, I do think the biggest one is, "Okay, you have something which works in a narrow domain. How do you extend it to more domains?" We recently saw this demo of Google Duplex. What was most impressive was primarily the text-to-speech, the speech synthesis, but it was also a very nicely functioning dialogue system that can handle quite a bit of complexity. They're very open in their post about the software: it only works because it's very limited. It works on restaurants and hairdresser appointments. Then, the question is, "Okay, how do you build the next 100 use cases?" If you look at Amazon, I think there are 6,000 engineers working on Alexa. That's a big effort.
Alan: Then, the other part is really not just the technical challenge of building it, but also the conceptual challenge for programmers of building software where the core logic is learned from data. That's very different from calling an image-recognition API, where you send it a picture, it tells you that there's an apple in the picture, and then you have statements like, "If apple, say apple" or something. The way the conversations go is learned from real conversations that people have had and that you've checked and annotated and fixed and learned from.
Alan: Managing that training data becomes a way of programming. A lot of product management needs to be reinvented there, in terms of how you actually do that. That's really interesting, and obviously something that we spend a lot of time thinking about. I think we have some cool things in the pipeline as well. I'll have some great things to show there in the future.
Call to Action
Hugo: Great. I look forward to hearing about those future developments. Alan, my last question is: do you have a final call to action for all our listeners out there?
Alan: Yeah. All of Rasa is open source, so go check it out. Rasa.com is the website. If you search "Rasa Core" or "Rasa NLU", Google should show you the documentation and the GitHub repos. Build something. Try it out. Let us know how it goes.
Alan: I'm really curious. People always come up with infinitely creative use cases. Yeah, if you have any problems, let us know.
Hugo: Fantastic. Alan, it's been an absolute pleasure having you on the show.