
The Unreasonable Effectiveness of AI in Software Development with Eran Yahav, CTO of Tabnine

Richie and Eran explore AI's role in software development, the balance of AI assistance and manual coding, the impact of genAI on code review and documentation, the future of AI-driven workflows, and much more.
Jan 6, 2025

Eran Yahav's photo
Guest
Eran Yahav
LinkedIn

Eran Yahav is an associate professor at the Computer Science Department at the Technion – Israel Institute of Technology and co-founder and CTO of Tabnine (formerly Codota). Prior to that, he was a research staff member at the IBM T.J. Watson Research Center in New York (2004-2010). He received his Ph.D. from Tel Aviv University (2005) and his B.Sc. from the Technion in 1996. His research interests include program analysis, program synthesis, and program verification. Eran is a recipient of the prestigious Alon Fellowship for Outstanding Young Researchers, the Andre Deloro Career Advancement Chair in Engineering, the 2020 Robin Milner Young Researcher Award, and the ERC Consolidator Grant, as well as multiple best paper awards at various conferences.


Richie Cotton's photo
Host
Richie Cotton

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.

Key Quotes

The world is split into two camps: one that thinks that software development as a whole is going to be in natural language, and programming languages are a thing of the past, right? Why would you need to speak this foreign language of the machine? It seems like a waste of time. The machine should speak my language and we'll be done with it. So that's one camp. There's another camp that says these languages exist for a reason. And the reason is to reduce ambiguity and be precise about what is the expected behavior without running the program.

AI can help you accelerate all activities across the software development lifecycle: code generation, documentation generation, test generation, code review, and deployment to production. On all stages of the SDLC, you already see AI assistants providing significant value and significant acceleration.

Key Takeaways

1

Use AI for coding tasks where it’s easier to explain and verify results than to do the task manually. Focus on well-defined problems where you can quickly assess the quality of the output.

2

AI speeds up coding but can slow down reviews. Set clear standards and robust processes for both AI and human reviewers to ensure quality and alignment with company guidelines.

3

Start using AI in low-risk, high-impact tasks like new projects, test generation, or documentation. This builds trust and showcases value before scaling to critical production tasks.

Links From The Show

Transcript

Richie Cotton: Hi, Eran. Welcome to the show.

Eran Yahav: Hey, thank you for having me.

Richie Cotton: Brilliant. We've talked about software development and using AI for it quite a lot on the show, so I think a lot of our listeners are aware that AI works really well for developing software and writing code a lot of the time, but sometimes it doesn't. So, a big question to begin with: when can you use AI systems, and when do you want to handwrite code?

Eran Yahav: Yeah, I'd like to talk about it in terms of the fundamental theorem of GenAI, which is not that fundamental, but I'll lay it out anyway. It says that GenAI makes sense when the cost of specification plus the cost of consuming the result is much, much smaller than the cost of doing the work manually.

So it really makes sense to use any of this if the cost of explaining what you want, plus the cost of consuming back the result, is much smaller than doing it yourself. This is maybe too abstract, but that's the nature of fundamental theorems. Let me make it slightly more concrete here.

If you can make a really quick snap judgment that what you got is what you wanted, then it makes sense. In certain domains this is easier than in others. Let me give you a concrete example. Let's say I'm writing my React application: I can see with my own eyes whether what I asked for is the UI I wanted,

so I can make the snap judgment that I got what I wanted.

On the other hand, if I'm writing, let's say, a very complicated distributed systems synchronization protocol, then it's going to be pretty hard for me to determine that what I got is really what I wanted. And therefore, I'm maybe better off not delegating this task to the AI assistant or the AI agent.

Richie Cotton: Yeah, I like that. So, if it takes you a really long time to write your prompt, then that's like a big specification cost. If the output is hard to interpret or hard to work out whether it's correct or not, then that's a big sort of consumption cost and you want those to be as small as possible, otherwise it's not worth it.

Eran Yahav: Yeah. And it's not really about GenAI in a sense. It's about delegation, right? I'm trying to delegate the task, to offshore a task to another entity, whether human or silicon. So I'm kind of offshoring to silicon, and it makes sense to offshore to silicon when the cost of defining what I want, and of making sure that what I got is sensible, is low.
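Yahav's rule of thumb lends itself to a tiny sketch. Everything here is illustrative: the cost numbers are invented, and `margin` is an assumed stand-in for "much, much smaller":

```python
def should_delegate_to_ai(spec_cost: float, review_cost: float,
                          manual_cost: float, margin: float = 2.0) -> bool:
    """Delegate only when explaining the task plus checking the result
    is much cheaper than doing it by hand. Costs share one unit
    (e.g. minutes); `margin` stands in for "much, much smaller"."""
    return margin * (spec_cost + review_cost) < manual_cost

# A React UI tweak: quick to describe, instantly verifiable by eye.
print(should_delegate_to_ai(spec_cost=2, review_cost=1, manual_cost=30))      # True
# A distributed synchronization protocol: hard to specify, harder to verify.
print(should_delegate_to_ai(spec_cost=60, review_cost=120, manual_cost=180))  # False
```

The same comparison covers Yahav's offshoring analogy: it is the cost of the handoff, not who does the work, that decides.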

Richie Cotton: Absolutely. So, maybe it's worth providing some context here. Generative AI is just one tool among many to help software developers write code. Can you talk about what the competition looks like? What are the other tools that people are using in order to automatically write code?

Eran Yahav: Oh, there is a bunch of AI-based solutions for generating code. I think it really depends on the segment. Are you an enterprise user who is trying to write code in an environment in which things are highly specific and incremental? In which case you will use something like Tabnine, which is more aware of the environment you work in. Or maybe you're writing a smaller application, maybe you're not even in an enterprise setting, maybe you're a hobbyist or an independent student, in which case you maybe use something that is more, let's say, prompt based or natural language based, at the lower end of the complexity of the environment in which you're operating.

But I think at this point there is an assistant for every layer in the space, right? And probably more to come, because, to be honest, the promise of GenAI is huge, and one of the areas in which this promise is already realized

is software development, right? This is almost one of the most prominent areas in which you see the value of GenAI every day.

Richie Cotton: Okay, it's nice that there is a tool for every different group of users, depending on whether you're an individual or an enterprise and what industry you're in. Let's talk about what you can actually do with this stuff. Is it just about having AI write code for you, or are there other things you can do with generative AI here?

Eran Yahav: AI can help you accelerate all activities across the software development lifecycle. So it is about code generation, documentation generation, test generation, code review, and deployment to production. On all stages of the SDLC, you already see AI systems providing significant value and significant acceleration. Just on documentation generation, for example, something that developers really don't like to do, you can see a huge uplift, and also an uplift in the actual documentation that is present in your project at the end, because previously people would just not write it. But now that AI gets you from zero to one, people are refining it. So one of the effects you're seeing is that maybe the AI did not get it all the way there, but once the developer has a draft, they go in and refine it.

So you get high quality documentation from the collaboration of the AI writing the draft and the human refining it. And I think that's true across the entire SDLC. With test generation, maybe the AI did not actually hit the full subtlety of the test that you wanted, but it provided enough scaffolding, enough of the stuff around there, for you to come in, do the last mile,

and get the full value out of the test.

Richie Cotton: This is interesting, because I think if you become a software developer, you enjoy writing code, so having AI write your code feels like, well, I was enjoying doing that. But some of these other things you mentioned, code review, documentation, testing, these are things where no one took the job because they really wanted to write tests and documentation. So having AI work on those things seems like a good idea.

Eran Yahav: But I think also, in terms of writing code, there's a lot of code that people don't like to write. There's a lot of boilerplate, a lot of repetitive code, a lot of convoluted code and complicated APIs that people generally do not enjoy. So having AI automate the mundane stuff and the boring things, that's huge.

Letting software engineers focus on the creative aspects of the work is really what GenAI is making possible.

Richie Cotton: Yeah, I suppose if you're writing a registration page, like login code for a web app, you've done this a thousand times; doing it a thousand and one times is not going to be that much more interesting. So, yeah, maybe outsource that repetitive code stuff. But I would like to talk about some of these non-code-generation aspects.

So, talk me through code review. What's the current workflow, and how does it change once you've got generative AI involved?

Eran Yahav: First of all, it's important to understand that the bottleneck right now for really getting the full benefit of GenAI is code review. What we are seeing right now is that generation has already accelerated using GenAI, so everybody's generating more and more code, but review has not accelerated to the same speed yet.

So what you're effectively getting is like a distributed denial of service attack on your reviewers, because all the developers are using AI to generate, say, just for exaggeration, 10x more code, but the reviewer is not reviewing code at 10x speed. So if you want to get the full potential of GenAI across the SDLC, we really need to solve the problem of accelerating the review itself, or of making sure that the code we generate is of high quality to begin with. Otherwise it is really a load on the reviewer, which is also your senior developer,

effectively hammering your senior resources to review AI generated code. So that's one of the things we are focusing on in Tabnine: really solving this problem of the review. And the workflow is interesting, because the question of what is good quality code is not a platonic question, right?

It does not live in some academic sphere of just the clean code book or something like that. It varies between organizations, it varies even between teams, because part of it is very, very specific to how we do things around here. Part of it is using the existing, say, microservices, the existing code base, the existing know-how, standards, and best practices.

So really the reason that experts are doing the reviews right now is that they have all this tribal knowledge in their heads, typically, right? And they are reflecting that as new code comes in to be reviewed. And what we're doing in Tabnine is replicating that specificity of the review by learning organization-specific rules, and also allowing the human experts to enter organization-specific rules and have Tabnine review the code according to these specific rules

and specific standards. So you get the AI code review to be specific to the org, aware of the existing code base, putting in comments, but also suggesting fixes automatically, right? So you get the full spectrum of the human expert reviewer. The experience that you want to replicate is the expert

code reviewer who has been a developer in the organization for a decade. That's the level of expertise you want to reflect in the AI code review: knows all the code in the org, knows the best practices, knows why we stopped doing certain things and why we do it the new way, and knows how to guide you to do things the right way, not just complain all the time without remediation.

Richie Cotton: That seems smart that it has contextual knowledge about all the code that's been written before. But at the same time, that also terrifies me because it's like, the terrible code you wrote like two years ago, somehow the AI is learning from that. And yeah, it's going to review things based on that.

Okay, so the idea is that you've got AI that knows everything about the projects you're working on and gives you good feedback. I'm wondering whether it changes the workflow, because my sense is that code reviews normally happen towards the end of the cycle: a software developer builds something, then hands it over to someone else and says, take a look at this.

If I'm writing a document, I can get AI feedback on what I'm writing as I go. Things like Grammarly will give me feedback in real time. Is there a chance we can move towards real-time code review as you're writing stuff?

Eran Yahav: Absolutely. That's part of what we're doing: shifting that code review aspect left into the IDE. The problems are not always ones that you'd like to flag as you type the code, because it's much more contextual than, say, Grammarly. Otherwise you'd be complaining all the time about code that has not been written yet, and that's quite annoying.

We tried that; it's a bit too intrusive. But you do want some checkpoints at which you say, okay, I've seen enough, let me tell you that you're wrong, right? You definitely want to do that in the IDE and not in the pull request, because the IDE is my private space and the pull request is a public space. So I'd rather get your comments privately, right? Especially if I got this in the pull request, I would go, hey, Tabnine, if you knew this all along, why didn't you tell me earlier? Why'd you wait until I pushed this? Tell me earlier. So we are definitely working on shifting that left into the IDE.

Absolutely.

Richie Cotton: That is cool that it comes sooner, but it's also interesting that you can't have that sort of instant feedback sometimes because some of the code you write relies on stuff you haven't written yet.

Eran Yahav: It's a user experience question, really. I think some people would like it more aggressive; some people would like it to wait a bit and see the full thing before it complains. But yeah.

Richie Cotton: And related to this is the idea of having documentation written for you. Now, there have been semi-automated documentation tools for a long time, but I'm hoping that generative AI gives us better tools for this. Talk me through how it works, and how it's going to change people's approach to documenting code.

Eran Yahav: First off, it's really easy, right? You can click a button and you get docs which are quite descriptive. Personally, I'm in the camp that says documentation should say why something is the way it is, not what the code does. The code already says what the code does; I don't need more of that. But as I said, I think the real workflow is that you get a version one draft from the AI, and then, once it exists, people say, okay, let me add that small note here that actually explains the important bit of why I did it this way. Otherwise they would just not add that bit.

So I think that changes things by forcing the existence of documentation, and then humans lean in to make that documentation more meaningful for the future developer who will come in a year and will not know why something was done that way. So I think that's the default that is happening in reality.
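The why-versus-what distinction can be made concrete in a docstring. The function, its constants, and the gateway-timeout rationale below are all invented for illustration:

```python
def backoff_delay(attempt: int) -> float:
    """Return the delay in seconds before the next retry.

    A what-style doc would just restate the code: "returns 2**attempt,
    capped at 30". The why-style note below is the part worth keeping:

    Why: we cap at 30s because the (hypothetical) upstream gateway drops
    idle connections after 35s, so waiting longer only adds latency.
    """
    return min(2.0 ** attempt, 30.0)

print(backoff_delay(3))   # 8.0
print(backoff_delay(10))  # 30.0
```

An AI draft typically supplies the what-style sentence; the human's small "why" note is the refinement Yahav describes.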

Interestingly, by the way, we're seeing a lot of success with test generation, which is another thing that people generally don't like to do. And again, with a click of a button you get a test plan that you can then implement with another click, and you can revise the plan and manipulate what's going to be generated, so you control the process.

It's not like the AI generates whatever tests come to mind right there. You direct the AI on what is the test plan that you'd like to implement. I think that is also quite successful. And another flow that is interesting is generating test cases as part of the pull request. So you push code, but in the background Tabnine says, hey, you know, what you pushed did not have sufficient test coverage.

How about I give you, like, 10 additional tests? You don't have to worry about them. They all pass, and they increase your code coverage from 40 percent to 80 percent. Take this additional PR if you'd like, and let's move on. That's another aspect.
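The PR-time test-generation flow described above might produce scaffolding like the following. The function under test and the tests are both invented; real assistant output would of course vary:

```python
def parse_port(value: str) -> int:
    """Hypothetical function under test: parse a TCP port string."""
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port

# The kind of scaffolding an assistant typically drafts: the happy path,
# the boundaries, and the failure modes a bored human often skips.
def test_happy_path():
    assert parse_port("8080") == 8080

def test_boundaries():
    assert parse_port("1") == 1
    assert parse_port("65535") == 65535

def test_rejects_out_of_range():
    for bad in ("0", "65536", "-1"):
        try:
            parse_port(bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")

for t in (test_happy_path, test_boundaries, test_rejects_out_of_range):
    t()
print("all generated tests pass")
```

The human's "last mile" is then adding the subtle domain-specific case the draft missed, rather than writing the boilerplate from scratch.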

Richie Cotton: I like that it can create a sort of test plan and then code the test to sort of fill things in. Are we just talking boilerplate tests here though, or can it do more sophisticated things as well?

Eran Yahav: It's doing pretty sophisticated things. Again, it's not a replacement for the human writing the super-sophisticated test that will cover the extra corner case. But I've seen tests that covered cases that I missed, so it's smarter than me, maybe, for some cases. So that was good enough for me.

Others may be smarter. Yeah.

Richie Cotton: , I guess, as a human, it's very easy to get bored writing tests and then your test coverage is low just because you, , couldn't be bothered to write the rest of the tests and the AI doesn't have that problem.

Eran Yahav: Yeah. First of all, I'm not a huge fan of measuring code coverage. There are many academic works on whether that's meaningful or not. But definitely getting some tests, and getting reasonable test coverage, is a desirable property, and definitely better than having zero tests.

Richie Cotton: Because of the probabilistic nature of large language models, you do have some randomness in the tests that are created. Do you have to start writing tests to test your tests, like some sort of meta-test situation?

Eran Yahav: That's a pretty interesting idea. I have not done that or seen that, but it's a pretty cool idea.

Richie Cotton: Okay, maybe no one cares about testing that much; they just want to manually check their own tests. Alright, are there any other, I don't want to say boring, but less fun bits of software development that AI can help out with?

Eran Yahav: Again, I think what we're seeing is that any task you can think of has assistance from AI, and soon is going to have an agentic workflow that does the majority of the work for you. The bottleneck for all this is, A, what is the quality of, say, the code that is being generated, or refactored, or manufactured; it doesn't matter, right?

It's not just about generating code from scratch. It's also about making changes to your code base: refactoring, modernizing, migrating between languages. All those tasks are tasks in which we're seeing increasing value and increasing productivity from AI, but really the bottleneck is: what is the quality of the things that are getting generated?

If I'm generating 10x more code and I do not provide any quality guarantees on what I'm generating, the fundamental question is, am I generating an asset or am I generating technical debt? And this is a very specific question as well, because it could be very nice code that just doesn't use the microservice you should be using inside the org. So, platonically, if you look at the code, it's great, but it's just not doing the right thing. And so the specificity of what it means to have good quality code for my setting is really important here. For me to trust the AI, the AI has to understand my organization very deeply. And I also need some way to control it.

Because what you said is really important as well. Let's say I have this legacy code, and the AI in Tabnine understands that the way we do things around here is the legacy way, because that's what it has seen. But I really want to stop creating new legacy, right? I want to somehow say, from this point on, this is not the way in which we connect to Kafka.

We do it a different way, with a different microservice. So now I, the human, the expert, the architect, need a way to tell the AI in Tabnine: starting from November 1st, we're doing it this way, right? You need the phase shift. So it means, A, the AI has to know everything about your org, so it has some implicit understanding of what is good, but also, B, there has to be a way for you, the architect, the expert, to say explicitly what is good, and enforce that moving forward.

Otherwise, you will never be able to escape the Jupiter-level gravity of legacy. You'd be trapped in legacy forever.
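A rough sketch of what such an explicit, organization-specific rule could look like as a simple diff check. This is not Tabnine's actual rule format; the pattern, the cut-over date, and `messaging_client` are all invented for illustration:

```python
import re
from datetime import date

# Hypothetical rule: after a cut-over date, constructing a raw
# KafkaProducer counts as new legacy; new code should go through an
# (assumed) internal messaging client instead.
LEGACY_PATTERN = re.compile(r"\bKafkaProducer\s*\(")
CUTOVER = date(2024, 11, 1)

def review_diff(added_lines, commit_date):
    """Return (line number, comment) pairs for lines creating new legacy."""
    if commit_date < CUTOVER:
        return []  # the rule only applies moving forward: the phase shift
    return [
        (lineno, "Use messaging_client.publish() instead of a raw "
                 "KafkaProducer; see the team's messaging guide.")
        for lineno, line in added_lines
        if LEGACY_PATTERN.search(line)
    ]

print(review_diff([(12, "producer = KafkaProducer(brokers)")], date(2025, 1, 6)))
```

The date gate is the "starting from November 1st" phase shift: existing code is left alone, while new code is steered away from the legacy pattern.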

Richie Cotton: I think transforming your code, like moving to newer languages, makes a lot of sense. I know there are a lot of big enterprises that still have code running on COBOL from, like, 1970. So yeah, that transformation seems like a great idea. And I love the idea of getting rid of legacy code.

So, talk me through, how does it work? I mean, I presume generative AI is good for translation between human languages, right? Does it also work with computer languages?

Eran Yahav: Yeah, translation between languages is really an easy use case. Even translating from COBOL to Java or something like that is doable. I think what people don't realize is that porting an application from COBOL to Java is not about translating procedure by procedure. It is really about re-architecting the application,

because the concepts behind a Java application and a COBOL application are different. And maybe as part of this shift you want to transition from the monolith to microservices; maybe there is a lot of stuff you'd like to do as part of this migration beyond just translating from running on a COBOL runtime to running on the JVM. That is not what you're trying to accomplish, right?

You're trying to get a modern application that is maintainable moving forward. And that's much more than translating procedure to procedure. I think that's the misconception people have. So AI can help you re-architect as well, but I think we still need humans to decide what the desired end result of migrating the COBOL application is.

Once you decide on the overall structure, AI can help you do the micro steps of migrating procedure by procedure and things like that. That works wonderfully well with Tabnine and the other systems.

Richie Cotton: Okay, so it's interesting that you've got high-level architectural decisions, and then lower-level work on what the code is doing at the individual procedure level. And actually, this leads nicely to one of the most hyped ideas of the last couple of months, which is chain-of-thought reasoning.

So the idea is you've got generative AI that can complete multiple steps at once. Can you talk me through how chain of thought is going to change all this AI developer tooling?

Eran Yahav: It's not about chain of thought specifically, or tree of thought, or graph of thought. That doesn't matter. The general structure is all about agentic workflows: as you said, completing multiple steps, potentially with backtracking, potentially invoking tools, both to read and write information, like sensing and transforming or manipulating.

We're already seeing a lot of success in that. So we can have workflows that go from a Jira ticket to a full implementation, right? All the steps. I've done this cute demo in Tabnine where I start from a ticket: implement the back end of the application, implement the front end of the application, connect them, implement the database, and all that happens as a single agentic flow.

You don't have to worry about anything, right? The AI does everything for you end to end. And there are other demos on the internet of similar things, whether it's BabyAGI or OpenHands or whatever; there are a bunch of those. And these agentic flows, provided with the right tools and the right abstractions, can really go a long way and take you end to end, whether it's implementing a Jira ticket, debugging an application, or translating an application.

All those things are totally doable, with the caveat that I started with, which is: how do you guarantee the quality of what you're getting? And maybe that links back to your initial question of when it makes sense to use GenAI and when it doesn't. So one aspect is really how easy it is to specify and how easy it is to consume the result.

But maybe another aspect is: is this maintainable code that I'd like to live with for the next five to ten years? Or is this just, you know, a side project, a POC or something like that? In which case, I don't mind doing the end-to-end flow without worrying about how any of it is really implemented.

So again, in the offshoring analogy, I offshore this entire application because it's a POC and I just want to see that I got it right. And then maybe I either manually refine the components that were created, or I rewrite it, or I do it with some coaching rules like Tabnine has, which enforce some standard and some quality on whatever is being generated.
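The ticket-to-application flow can be caricatured as a plan-execute-verify loop. Every function below is a stub; a real agent would call a model for planning and real build and test tools for execution and verification:

```python
# A caricature of the Jira-ticket-to-application agentic flow: plan,
# execute each step with a tool, verify, and stop (or, in a real agent,
# backtrack) on failure. All step implementations are stubs.

def plan(ticket: str) -> list[str]:
    # A real planner would derive these steps from the ticket text.
    return ["implement backend", "implement frontend",
            "connect them", "set up database"]

def execute(step: str) -> str:
    return f"done: {step}"  # stub for an LLM + tool invocation

def verify(result: str) -> bool:
    return result.startswith("done")  # stub for tests / quality gates

def run_agent(ticket: str) -> list[str]:
    log = []
    for step in plan(ticket):
        result = execute(step)
        if not verify(result):
            # A real agent would backtrack or re-plan here.
            raise RuntimeError(f"step failed: {step}")
        log.append(result)
    return log

print(run_agent("JIRA-123: build the demo app"))
```

The `verify` stub is where the quality caveat lives: without real checks at that point, the loop happily accumulates technical debt end to end.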

Richie Cotton: That's interesting, that we're going from chain of thought to agents, and actually they're two sides of the same idea. It's just being able to complete multi-step tasks without having to have much human involvement.

Eran Yahav: Yeah, chain of thought is generally one, let's call it, planning strategy for an agent, so it's a special case. Yeah.

Richie Cotton: We've been talking for 25 minutes and we still haven't mentioned data once, which is giving me withdrawal symptoms. So, I'm curious. We talked about the software development workflow, where code is basically everything. I think for a lot of data use cases, you've got much simpler, shorter data projects.

I'm just wondering, for data analysts and data scientists who are always doing these short projects, how are these trends in generative AI changing things?

Eran Yahav: I think data science is the perfect application for GenAI, because the code is really not the artifact that I'm after, right? The code is a means to an end: to get the analysis that I want, to get the visualization or the projection of the data that I want. And I couldn't care less about the code as an artifact, for the most part.

It's different if I'm building some framework, but if I'm analyzing something, the code is mostly going to be discarded at the end of that small project, as you call it. So I think it's a perfect match. It's also a perfect match for all sorts of visualization projects, because, again, I can make a snap judgment: oh yeah, this is the histogram that I wanted, or this is the curve that I wanted, or this is the analysis that I wanted.

I can look at it and say, yes, this is great, or I can very easily manipulate it with a prompt in natural language, because, again, specifications are short, results are easy to understand, and code is mostly short-lived. A perfect match on all three accounts. And also it's a lot of Python, which is a language that most of the models are really strong at,

because model builders are a bit like researchers, so they try things in Python as well. So there are, I think, very strong guarantees on Python being a first-class citizen of the models in terms of ability to generate, accuracy, et cetera.

Richie Cotton: I do like this framework you have for deciding whether or not generative AI is going to be useful: can you make that snap judgment? It seems like visualization works really well here, because you can look at a picture and say, does this make sense or not, pretty quickly.

Eran Yahav: It works really well also in complicated use cases that you've done before, because you know, I wrote this code somewhere, sometime, or something similar, and I look at the generated code and say, yeah, that's exactly what I want, move on.

So it's actually interesting that, somewhat counterintuitively, for experts GenAI may be even more of a force multiplier, in the sense that I can make the snap judgment on a lot of things much faster than a junior would be able to.

Richie Cotton: Ah, so this is very fascinating because I think one of the sort of perennial questions over the last couple of years has been like, , do I need to learn to code if AI can write my code for me? And you're saying that actually, if you've got those skills already to be able to make that snap judgment about whether or not the code works well, that means AI is actually even more beneficial for you.

Eran Yahav: Yeah, I think that's true from what we're seeing. And I think, again, in the foreseeable future, not for all software, but some software is still going to be the combination of humans and AI. And the ability to read code really well is very important: read code well and understand what's going on quickly.

Because a lot of it is going to be generated by assistants or agents, and you, the human, will have to step in. So it's like the combination of self-driving cars and human-driven cars: you need to know how to operate in the mixed environment.

Richie Cotton: Absolutely. So, talk me through: suppose you're an early-stage developer or data scientist, or anyone who has to write a bit of code in their job. Is it worth pushing hard on becoming an expert in code, or do you want to work alongside the AI tools? What would your career strategy be?

Eran Yahav: Yeah, I know career strategies are a big ask, but, again, I think it's inevitable, right? The IDE is a tool of the trade. Some people swear by Emacs or by Vim, but they still have extensions that help them do code completion or other things that give them additional productivity, and AI systems are an inevitable part of the stack.

So if you haven't adopted them yet, I think you should. Whether you should learn programming so you can be an expert in the nuances of Python or not, I don't know. I think a lot of data science, at least, will increasingly move to a higher level of abstraction. Again, I don't know how long it will take, and definitely not for all cases, but I've seen models or systems do pretty amazing things in data science from natural language alone.

And I think this will just increase, because it's a sweet spot for this technology. You can make the snap judgment, it's easy to specify, and the code is generally short-lived. So this is really the perfect match.

Richie Cotton: You mentioned moving to higher levels of abstraction. Can you just talk me through what that might mean?

Eran Yahav: Yeah. I think generally the world is split into two camps. One thinks that software development as a whole is going to be in natural language, and programming languages are a thing of the past, because why do you need to speak this foreign language of the machine? It seems like a waste of time.

The machine should speak my language and we'll be done with it. So that's one camp. There's another camp that says these languages exist for a reason, and the reason is to reduce ambiguity and be precise about what the expected behavior is without running the program. Again, if it's a visualization or a UI, you can run it and pretty much see what happens.

You don't need to be super accurate or specific about the code itself; you can predict what's going to happen by running it. But let's say it's something more complicated: how are you, the human, going to predict how this thing is going to behave? You need some language that describes it in a non-ambiguous way.

And if it's not going to be English, because English is ambiguous, you need something more structured. There have been past attempts at some formal English, structured English, but in the end they all deteriorate into some form of, let's call it, Python, Ruby, whatever: something that does have structure, so that you can pretty accurately reason about what's going to happen when you run it.
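To make the ambiguity argument concrete, here is a small illustrative sketch (the example is ours, not Eran's; the data and field names are invented). The English request "sort the users by date" leaves open which date field and which direction, while each line of code commits to exactly one reading:

```python
# Illustrative only: the English spec "sort the users by date" is
# ambiguous (which date field? ascending or descending?). Code
# removes that ambiguity by committing to one precise behavior.

users = [
    {"name": "Ana", "signup_date": "2024-03-01", "last_seen": "2024-06-10"},
    {"name": "Ben", "signup_date": "2023-11-15", "last_seen": "2024-06-12"},
]

# One precise reading: newest signups first, by signup_date.
by_signup = sorted(users, key=lambda u: u["signup_date"], reverse=True)

# A different, equally valid reading of the same English:
# by last_seen, oldest first.
by_last_seen = sorted(users, key=lambda u: u["last_seen"])

print([u["name"] for u in by_signup])     # ['Ana', 'Ben']
print([u["name"] for u in by_last_seen])  # ['Ana', 'Ben']
```

Both orderings satisfy the English sentence; only the code pins down which one you actually get, which is the precision Eran's second camp is defending.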

Alright, so camp one says English is the way to go, don't worry about anything else. Camp two says I need some symbolic representation that is non-ambiguous and accurate. And interestingly, both camps are right, because programming is not a single thing, right? You don't program your React application like you program your nuclear reactor.

So the level of rigor required at these two ends of the spectrum is quite different, and I suspect that, therefore, the languages used to develop them will be different as well. So, yeah, I think both camps are right. Where it lands is probably the usual hype curve: first we're like, yeah, everything is done in English.

Then: oh, actually, we don't know how to debug this, or maintain it, or control it, and every time I change one word in my application, I get a completely different result. Because it's not a continuous, incremental translation: if you change the English and regenerate the app, something else entirely may happen.

So the space of mappings from natural language to application is not a nicely continuous one, where you change one word and get a tiny change; a tiny change in the prompt might cause a radical change in the result. So we're probably going to start with "everything is English," then "oh, actually, we were wrong."

We need something more structured, and the truth is somewhere in the middle, also depending on the domain, as I said: is it a UI app or a distributed database?

Richie Cotton: Yeah, certainly. I mean, you gave the example of a nuclear power plant. If you've got nuclear power plant critical control shutdown code, you don't want that to be kind of, "well, it's about right, I think the AI-generated code was okay." That needs to be absolutely spot on.

Eran Yahav: I think many years ago, I had a conversation with someone who was working on verifying Airbus software. And he said, listen, there is a very fundamental, simple question: how much are you willing to pay for a line of code? If you're willing to pay 30 euros, then you get Airbus-level code. If you're willing to pay three cents per line of code, then do whatever needs to be done.

It's an economic question in the end, really, because the level of trust, or the level of quality, that you build into the system is ultimately a question of cost as well.

Richie Cotton: Absolutely. Yeah, I guess the consequences of things going wrong for most code are less severe than a plane crashing or a nuclear power plant exploding. So, yeah, you can afford to do it a little bit cheaper. On that subject, I'd like to talk a bit about how you go about adopting all these AI tools in the enterprise.

So, suppose you're a head of department, whether head of data or head of software engineering, and you say, okay, we're going all in on AI.

Eran Yahav: I think what I would recommend and what we're seeing are different things. We're seeing people being hesitant because of the quality aspects. In code, in software engineering, we're already past the dog playing the piano. We're past the magic of "oh, it can write code."

We're now into the question of: does it generate quality code that I can trust? And so people are hesitant to say, take the entire department, give them Tabnine, and let's roll, because they don't know what the implications will be. And as I said, the pressure on the reviewers is definitely one of the concerns.

This is going to increase the pressure on our code reviewers, because they have to review the AI-generated code all the time, and be even more careful, because they know the AI can generate things that are maybe not exactly what we want. And so what we're seeing is people taking some part of the department, letting them try it, and then we see very quickly some extension once they prove the value, right?

It happens very quickly, because this thing is magic, right? Once you start using it, you say, oh wait, I can do so much more; it's a no-brainer. And I think as a department head, if I look at this, I say, absolutely, bring it to me today. But tell me how the code review is going to work as well.

How am I not going to overwork my reviewers? Tell me, can you be explicit about what standards Tabnine maintains? Can you show me the list of rules that are being enforced, so I can say, oh yeah, these are the rules, or, wait, there are two rules missing, or, these rules I don't care about at all?

And all of the above can happen. But I think GenAI for software development is such a no-brainer that the return is immediate. At this point, I'm guessing most of our listeners are already using GenAI. I don't think they need me to tell them to use it, right?

It's been a while now; everybody recognizes the value. I think people also recognize the potential risks and why you need code review. The value is inevitable. It's just coming.

Richie Cotton: I like the idea that a lot of people know what they're doing; it's just going to happen. So, in terms of areas to focus on, though: do you start with code generation? Do you start with documentation to get the hang of it? Do you do low-risk projects first, or high-value projects first?

What's the order of getting it implemented?

Eran Yahav: Depending on the project, I would definitely start with new projects, because the amount of code to be generated is kind of the promise, right? I can move faster. Legacy work tends to involve more subtle, more incremental changes, so you need to read much more code than you need to write, and it's just harder to be really successful there. Possible, but harder. Test generation is definitely one place to start, because test code does not run in production. Maybe I'm taking the risk that the tests will not cover exactly what I wanted, or the tests will run longer because I have more of them.

Maybe it's not as helpful as I wanted, but the risk is close to zero, right? I'm not losing anything, really; I'm not jeopardizing my production system by junior developers accepting code from AI without understanding the implications. So test generation is definitely a place to start. And code review is another place where you can get a lot of value, because it reviews both code written by AI and code written by humans, and holds them to the same standards.

Which I think is also very important.
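As a concrete illustration of the low-risk entry point Eran describes, here is a hypothetical sketch of the kind of unit tests an AI assistant might generate for an existing helper (the helper function, its name, and the tests are invented for illustration, not output from Tabnine):

```python
# Hypothetical example: an existing helper plus the style of
# edge-case tests an AI assistant might generate for it. Test
# code like this never runs in production, so accepting it
# carries little risk even if it is imperfect.

def normalize_email(raw: str) -> str:
    """Lowercase an email address and strip surrounding whitespace."""
    return raw.strip().lower()

# AI-generated-style tests: short, assert-based, edge-case focused.
def test_strips_whitespace_and_lowercases():
    assert normalize_email("  Alice@Example.COM ") == "alice@example.com"

def test_is_idempotent():
    once = normalize_email("Bob@example.com")
    assert normalize_email(once) == once

if __name__ == "__main__":
    test_strips_whitespace_and_lowercases()
    test_is_idempotent()
    print("all tests passed")
```

Even if such generated tests miss some cases or add runtime to the suite, the downside is bounded, which is exactly why Eran flags test generation as a safe first step.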

Richie Cotton: All right. So, work on new projects where you're starting fresh, and also the places where things aren't going into production directly. So, yeah, testing and code review. I like it.

Eran Yahav: I would go directly to production; I trust this well enough. I'm just trying to cater to people who want gradual adoption and are risk-averse.

Richie Cotton: a big scope, are like, gung ho, I'll use all the latest tech for everything, and people are a bit more risk averse. I guess we're going back to the nuclear power plant example. All right, super. Okay, so just to wrap up, do you have any final advice for people wanting to make use of these AI software tools?

Eran Yahav: Resistance is futile. It's coming; it's inevitable. Start using it if you're not already using it.

Richie Cotton: All right. Yeah, I love the Borg response there. Very nice. Okay, thank you for your time, Eran.
