[Radar Recap] The Future of Programming: Accelerating Coding Workflows with LLMs
Jordan is co-founder and chief duck-herder at MotherDuck, a startup building a serverless analytics platform based on DuckDB. He spent a decade working on Google BigQuery as a founding engineer, book author, engineering leader, and product leader. More recently, as SingleStore's Chief Product Officer, Jordan helped them build a cloud-native SaaS business. Jordan has also worked at Microsoft Research, on the Windows kernel team, and at a handful of star-crossed startups. His biggest claim to fame is predicting World Cup matches using machine learning with a better record than Paul the Octopus.
Michele is one of the foremost experts on developing large language models to write code. He has been working with transformer models since 2018, and his team has developed Replit ModelFarm for creating generative AI applications, and the Replit Code Repair 7B LLM for fixing broken code. Michele also works as an AI advisor at Coatue. He was previously an AI advisor to Bessemer Venture Partners, the Head of Applied Research at Google, the Head of Applied Research at X, and a research scientist in AI at Stanford University.
Filip is the passionate developer behind several of DataCamp's most popular Python, SQL, and R courses. Currently, Filip leads the development of DataLab, DataCamp's AI-enabled data notebook. Filip holds degrees in Electrical Engineering and Artificial Intelligence.
Key Quotes
Even when we accomplish the mythical AGI, which we talk about, at least by most definitions, as the point when AI is going to be comparable in terms of intelligence with a human, no human out there can be trusted to put their hands in a code base and make big changes, right? We always want them to go through a code review process. So no matter how long it's going to take us to accomplish AGI, we will still need someone in the loop to make sure that what the AGI creates is actually correct and code reviewed.
We're seeing a new class of citizen developers: people that have a need to create software, but they have not been trained formally to be software developers. In the last year we've been seeing a lot more people attempting to create software. Before, these people were not able to build even a simple web application. Now, the same people are building startups that can go very far just by stitching together LLM outputs and asking an LLM to debug and fix their code. That's probably what I'm most excited about.
Key Takeaways
AI tools are not only making seasoned developers more productive but are also enabling non-traditional developers to write code. Leverage these tools to broaden your team’s skill set and include members who can contribute to software development without formal training.
To maintain code quality, ensure that human oversight remains integral to your workflow. AI can generate and suggest code, but rigorous human review is essential to catch potential errors and security issues and to ensure the code aligns with your organization's standards.
Be aware that AI-generated code can introduce security vulnerabilities. Develop protocols for reviewing AI-suggested code to identify and mitigate these risks, and consider deploying models on-premises to maintain data privacy and control.
Transcript
Filip Schouwenaars
Hello. Hello.
00:00:02
Filip Schouwenaars
Welcome, all.
00:00:04
Filip Schouwenaars
This is Filip from DataCamp calling in, and I'm very excited to be your host for this session about, surprise, AI, hence the name Radar AI.
00:00:15
Filip Schouwenaars
And it's obviously a super interesting topic, right?
00:00:19
Filip Schouwenaars
Large language models are changing the nature of programming. AI can now write, explain, and fix code, and it's getting better at it every week.
00:00:28
Filip Schouwenaars
That means that for anyone who's writing code as part of their job, whether they're software developers or data scientists, their role will inevitably change.
00:00:37
Filip Schouwenaars
And today we're going to touch on the current state of AI-assisted coding, how to roll it out in the enterprise, as well as explore what new workflows and developments are in sight.
00:00:47
Filip Schouwenaars
I'm excited to do this with two experts hailing from some of the world's most influential and innovative tech companies.
00:00:53
Filip Schouwenaars
Our first guest here is Jordan Tigani, CEO and co-founder at the analytics platform MotherDuck...
00:01:22
Filip Schouwenaars
Welcome, Jordan. I have to say MotherDuck is a great name and also a pretty amazing platform. We actually built a MotherDuck integration into DataLab, the DataCamp data notebook that I'm building, or helping build, here at DataCamp.
00:01:36
Jordan Tigani
Alright. Thank you.
00:01:36
Filip Schouwenaars
Welcome.
00:01:38
Filip Schouwenaars
Then joining Jordan is Michele Catasta, VP of AI at Replit, an AI-powered software development and deployment platform with millions of users. He has been doing research on transformer models since 2018, so well before it was super hip, and he worked on teams that created PaLM and PaLM 2 while at Google. At Replit, his team recently launched Code Repair, a 7-billion-parameter LLM for fixing broken code.
00:02:05
Filip Schouwenaars
Michele also works as an AI advisor at Coatue, and in the past he held positions as Head of Applied Research at Google and research scientist in AI at Stanford University. Hey, Michele, thanks so much for joining us.
00:02:18
Filip Schouwenaars
All right, great. We only have 40 minutes and we also want to leave room for some Q&A at the end, so I suggest we get to it and hear what you have to say about some of the topics we'll be touching on today.
00:02:30
Filip Schouwenaars
I want to start by setting some background context on LLM-powered workflows, and maybe a quick one to start for you, Jordan.
00:02:38
Filip Schouwenaars
Which teams or roles have you seen make use of AI-assisted coding, and how have you seen them use it?
00:02:45
Jordan Tigani
So, I think...
00:02:48
Jordan Tigani
I was really surprised, actually, how quickly people started picking up LLMs when ChatGPT and GPT-4 came out. It already started changing how analysts are doing their jobs. We were talking to people, and they said they're spending a lot of their time cutting and pasting between the query
00:03:09
Jordan Tigani
UI and ChatGPT, and I think it really has started to impact workloads. Obviously, developers all over the place are using this
00:03:28
Jordan Tigani
for helping them write code better, and data scientists as well. I think the other thing is that there's a lot of excitement, a lot of people starting to use some of these technologies, and I think people then pretty quickly run into roadblocks in terms of how good things actually are. It's really good to get a demo and to
00:03:55
Filip Schouwenaars
Yeah, yeah. Yeah.
00:03:56
Jordan Tigani
see, hey, this is sort of exciting. But then in terms of actually being able to replace, end to end, how you write your queries, I think that's still a ways off.
00:04:14
Filip Schouwenaars
Yeah, that's something we'll also touch on, by the way.
00:04:16
Jordan Tigani
But in terms of being able
00:04:19
Jordan Tigani
to stay in the flow, yeah, I think it's really helpful so far: you can write what you think it might be, and then the AI can fix it for you, which certainly saves time going back and forth to the docs.
00:04:33
Filip Schouwenaars
Yeah, it's pretty transformational. You already touched upon it: you have analysts using AI-assisted coding workflows in their IDEs of choice, just like traditional programmers building software. Michele, in your experience, how is this LLM-powered coding experience different for traditional software programming versus more data-focused or analytical work?
00:05:03
Michele Catasta
I would say what's different at this point is not necessarily, or not exclusively, making developers more productive; it's also how much we're expanding the developer base.
00:05:15
Michele Catasta
So apart from every single profile Jordan mentioned, those are the classic people that were used to having influence on software development, we're seeing at Replit a new class of citizen developers: people that have a need to create software but have not been trained formally to be software developers. And especially in the last year plus, since ChatGPT and the integration of large language models in IDEs such as Replit, we're seeing a lot more people attempting to create software.
00:05:47
Michele Catasta
And the way in which I measure models improving at this point is not so much how often people accept what they complete or what they generate; it's rather their success rate in creating what they had in mind. So let's teleport ourselves to 18 months ago, when ChatGPT came out.
00:06:09
Michele Catasta
People tried to copy-paste code into Replit, and oftentimes they were not able to build even a simple, say, web application. As of today, that is no longer the case. People who have never been software developers are building startups, and they can go very far just by stitching together outputs of LLMs and asking them to debug, asking them to basically fix their code. That's probably what I'm most excited about. The good people are even better today because of LLMs, but the people who never wrote software are now becoming software developers, and to me that's what gets...
00:06:29
Filip Schouwenaars
Yeah, yeah.
00:06:41
Filip Schouwenaars
It's pretty cool how AI is kind of taking the scare off of code. People were before like, hey, this code thing is not for me, and then all of a sudden, with this tool next to them, it's becoming something achievable for people that are not necessarily formally trained in it, for sure.
00:06:58
Michele Catasta
Exactly, right? Yeah.
00:07:01
Filip Schouwenaars
Cool.
00:07:02
Filip Schouwenaars
Can you somehow quantify the productivity boost one gets from AI assistance? Is that something that your companies have tried to quantify, tried to measure? Is that something we should do in the first place?
00:07:16
Filip Schouwenaars
Jordan, maybe?
00:07:17
Michele Catasta
you
00:07:19
Jordan Tigani
Sure. We haven't tried to quantify it, and I think there are some people that really like to use AI and some people that don't. We did release this feature we call fix-it in our UI, where you can write the SQL as you think it's going to be, and then, if it's wrong, it'll fix it. I think one of the really cool things about that is it does change how you approach coding, because usually it's like, oh, I don't remember the syntax for differencing two timestamps, and so you go out to the documentation, and it's like, okay, you use INTERVAL and intervals in quotes and all this stuff. But with AI you can actually just write what you think it's going to be, and then, if it's wrong, it'll fix it for you, and so it just gives you a lot more confidence. And we do track the success rate of
00:08:19
Jordan Tigani
how good we are at fixing up what people are writing, and I think we're at about an 80% rate on actually fixing your code.
00:08:30
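To make that concrete, here is a small illustrative sketch of the kind of correction Jordan describes, written for this recap rather than taken from MotherDuck's actual fix-it feature; it contrasts a timestamp-differencing query typed from memory with a corrected version using DuckDB's date_diff function:

```python
# Illustrative sketch only: a query written from memory versus a corrected
# version. Requires the duckdb Python package; the timestamps are made up.
import duckdb

# What a user might type from memory: subtracting timestamps yields an INTERVAL,
# which is often not the number they actually wanted.
guess = """
SELECT TIMESTAMP '2024-03-15 12:00:00' - TIMESTAMP '2024-01-01 08:30:00' AS diff
"""

# A correction an assistant might suggest: date_diff() returns the difference
# in the requested unit directly.
fixed = """
SELECT date_diff('hour',
                 TIMESTAMP '2024-01-01 08:30:00',
                 TIMESTAMP '2024-03-15 12:00:00') AS hours_between
"""

print(duckdb.sql(guess))   # prints an INTERVAL value
print(duckdb.sql(fixed))   # prints the number of whole-hour boundaries crossed
```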
Filip Schouwenaars
Yeah.
00:08:38
Jordan Tigani
But in terms of how much more productive that makes an engineer or an analyst, I think that's a lot harder to measure, especially if you look at
00:08:44
Filip Schouwenaars
Of course.
00:08:47
Jordan Tigani
real engineering tasks, not just LeetCode-style
00:08:53
Filip Schouwenaars
Yeah, yeah. Yeah. Yeah.
00:08:55
Jordan Tigani
solving of small problems.
00:08:56
Filip Schouwenaars
Yeah.
00:09:13
Michele Catasta
They do, many of them. And conversely, I think it's important to mention that many absolutely don't see any advantage, or they even try to stay away from AI as much as possible, and I respect that. I think those of us who were trained to write code many, many years ago all have our own idiosyncrasies, and we like to develop software in our own way. I think what is more important to focus on is the new generation, the people who are learning now and are AI-first, first-class citizens. They're going to be writing software in a completely different way, and I see them at hackathons, I see how quickly they can build stuff in 24 hours, and to me that's mind-blowing. So I wouldn't be able to assign a multiplier to that, but it's definitely something that five years ago I would not even have imagined. That being said, we also have some companies that are using Replit, and they are measuring their productivity improvements very carefully, and we see that
00:09:57
Filip Schouwenaars
Sure.
00:10:11
Michele Catasta
in the software development life cycle there are a lot of different steps. The step where you actually write the code, that speeds up, say, 20% easily. But I would say,
00:10:22
Filip Schouwenaars
Oh.
00:10:26
Michele Catasta
now that AI is going to be imbued across every single product feature, it's going to be very hard to measure exactly how much time it's saving. Is it something you use to search for code, is it helping you to do Q&A, is it something you need to debug your builds? At a certain point, I think we're going to stop thinking about how much time it saves us and we're going to consider it table stakes. That's where I think in the future we're going to
00:10:47
Filip Schouwenaars
Yeah, yeah. Yeah.
00:10:49
Filip Schouwenaars
Yeah, very interesting. I think this is a great segue into the next section of the talk, if you will, which is more about the quality and potentially also privacy concerns around deploying LLMs for elevating your coding workflows. Jordan, you already touched upon it in the beginning: sometimes it's not working out as expected, and it's not yet performing well for all tasks. You also mentioned the fix-it feature in the MotherDuck UI is right in roughly 80% of cases, give or take.
00:11:21
Filip Schouwenaars
In terms of coding assistants, are there specific use cases where you can say, actually, here AI is extremely strong, but in these fields it's still a challenge to have it perform well? Michele?
00:11:42
Michele Catasta
I think it especially depends on the programming language you're targeting. From personal experience, we see exceptional performance with relatively basic Python and JavaScript code, which do represent the majority of the open-source code that you can find on GitHub, so it's not a surprise that these models are so powerful there. Conversely, when it comes to, say, low-level systems development, that's where the LLMs probably don't shine as much as they do on web applications. That being said, I think ultimately what determines the quality of the AI plus human in the loop is how lazy the human becomes because of what AI generates. If you become complacent, because you just take for granted that the output of the LLM is going to give you exactly what you want, eventually you're going to become a worse software developer. But with the combination of a human who's doing basically code reviews and AI generating the bulk of the code, I think
00:11:60
Filip Schouwenaars
Yeah.
00:12:42
Michele Catasta
we're going to see the quality improving regardless. Think about it: before, you needed two humans, one to write code and the other one to review it, at the very least. And now you have the coupling of an AI plus a human being as code writer and code reviewer. And I think, if there is a will to have high-quality code, there is always going to be high-quality code.
00:13:02
Filip Schouwenaars
Yeah, I see. Anything you want to add to that, Jordan?
00:13:08
Jordan Tigani
I think on the data side, the stakes are very high in using LLMs and allowing basically a machine to generate the thing you're using to make business decisions. It's like, hey, which of my sales reps are doing well, and which ones are going to have to worry about their jobs? You want to make sure that you're actually measuring that correctly. Or, how's my revenue doing?
00:13:41
Jordan Tigani
Should I open up a new plant or store in the Southeast region? If the LLM doesn't exactly know what that is, or gets something wrong, you can make really spectacularly bad decisions in subtle ways. And in data, a lot of the time the things that you're dealing with are very hard to define. Even "what's my headcount" is a surprisingly hard question to answer. Do you count interns? Okay, if you're counting how many seats we need in the office, or how many machines, then absolutely count interns. But if you're counting from a budgetary perspective, that's going to be a very different thing. And revenue, similarly: we have all sorts of discounts and ramps and all sorts of
00:14:37
Filip Schouwenaars
Sure.
00:14:38
Jordan Tigani
things that make those questions hard, and there's a lot of institutional knowledge about these kinds of things. So I think where it can work well is if you have a relatively constrained environment. Pre-LLMs, ThoughtSpot was a company that basically enabled you to ask and answer questions of your data, but the way they were able to do that is you had to very carefully manicure and curate your data set and your metadata.
00:14:46
Filip Schouwenaars
Of course.
00:15:10
Jordan Tigani
LLMs are giving you the ability to
00:15:13
Jordan Tigani
be a little bit more loosey-goosey with some of that stuff, but I think there's just so much business logic and knowledge in people's heads that
00:15:28
Jordan Tigani
an LLM is really never going to be able to do those things. So I think you have to be pretty careful about what you turn loose on an LLM. But as mentioned, the code-reviewer-type setup, where you basically have a human and an AI operating together to make each other better, I think there's amazing potential in that.
00:15:51
Filip Schouwenaars
Yeah, definitely. What I get from what you're saying is that as you go more high-level, as you ask an open-ended business question, there are a ton of assumptions in there, a ton of hidden context, and if you're not being very specific, you'll have to give the LLM that context. And I'm guessing, actually believing, that in the next couple of years there will be more advanced techniques for internalizing all of that
00:16:14
Filip Schouwenaars
organizational context and all of the semantics: what does this column or this table mean, what's our definition of revenue, etc. But right now it's still a bit scary to just ask the question, hey, where should we open up an office, because a lot of the assumptions stay hidden in the system, for sure. Continuing on the quality question: earlier this year a white paper got released, and it said that code longevity, so basically how long a line of code survives in the code base before it's rewritten,
00:16:45
Filip Schouwenaars
is going down. So actually the quality of code in code bases has been decreasing since LLMs were unleashed on the world. What are your thoughts on this, Michele? You mentioned before, the AI is writing, I'm reviewing. Are we getting lazy, complacent, just committing without thinking?
00:17:05
Michele Catasta
I believe that, and for a good reason: writing most software is not a pleasant endeavor. Most of the software you write is not intellectually stimulating. So when AI helps you write most of the boilerplate and the glue code, of course you start to become complacent, and that's why we were talking about code reviewing, Jordan and I, because it's so important to make sure that the human is actually in the loop and not lazily copy-pasting the code. That being said, I read that paper when it came out, and one of the confounding factors that we need to keep in mind is the fact that
00:17:43
Michele Catasta
we are capable of writing much more code than before. So the time it takes to replace a line of code today is, of course, much shorter than it was in the past.
00:17:52
Filip Schouwenaars
Of course. Yes.
00:18:12
Filip Schouwenaars
I see.
00:18:16
Filip Schouwenaars
Yeah, I didn't think about it that way. It's true. Any tips or tricks you can share with us to not become complacent, to not just go, yeah, this looks good? Is there a way to guard against that? I have to say, I'm constantly using AI-assisted coding, and it's easy to get carried away.
00:18:37
Michele Catasta
I think it's as simple as making sure that code reviews are actually happening and they're not sloppy. At that point, you have to imagine that there is an AI working in tandem with a human, and then there is another person on the team who's actually reviewing the code. And especially for teams that are heavily relying on AI, what they say is: we know that 80% of that PR came from an LLM, so at this point most of the burden is on the reviewer rather than the person who pushed the PR. As long as the entire team gets into that mindset and we give even more importance to reviews, then I think we're good. Of course, this is a mindset shift that is not going to happen overnight.
00:19:17
Filip Schouwenaars
Yeah.
00:19:21
Michele Catasta
Code reviews have been around officially for at least a couple of decades across large companies. We're not going to change how people operate in 18 months, but I think we're getting there, especially thanks to these studies that have been coming out in the past few months.
00:19:26
Filip Schouwenaars
Of course.
00:19:36
Filip Schouwenaars
Great. Um.
00:19:38
Jordan Tigani
If I can just add one thing on that: I think sometimes when you're making a code change, you're adding a feature, and there's a function that exists in the codebase that is almost right, but not quite what you want. An LLM can basically copy that function with the changes that you need, and it's very easy to say yes, I'm just going to take this thing with a bunch of copy-paste and slightly changed code. The proper way to do it is actually to change that original function so that it can do the thing you actually want it to do. That way, in the future, when there are bug fixes and so on, you just have to change it in one place, rather than having the slightly modified version all over the codebase. At the limit, you get lots of ghost copies of the same thing, and you can be less careful. And I think, as Michele mentioned, having a huge
00:20:60
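Jordan's point about ghost copies is easier to see with a toy example. Below is a minimal sketch, with hypothetical functions and a made-up CSV schema, of the pasted near-copy versus the single parameterized helper:

```python
# Hypothetical example of the duplication Jordan warns about.
import csv

# Tempting but duplicated: an assistant clones the existing helper with one tweak.
def load_orders(path):
    with open(path) as f:
        return list(csv.DictReader(f))

def load_orders_skip_cancelled(path):
    # near-copy of load_orders: every future bug fix now has to land in two places
    with open(path) as f:
        return [r for r in csv.DictReader(f) if r["status"] != "cancelled"]

# Better: extend the original function so one implementation serves both callers.
def load_orders_v2(path, include_cancelled=True):
    with open(path) as f:
        rows = list(csv.DictReader(f))
    if include_cancelled:
        return rows
    return [r for r in rows if r["status"] != "cancelled"]
```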
Filip Schouwenaars
Yeah, sure. From your experience at Replit deploying AI, and from your experience at MotherDuck making AI available, what have you seen as the reaction of larger organizations that are leveraging your tools? What has their reaction been
00:21:17
Filip Schouwenaars
to adopting these tools? Are they completely fine with it, like, hey, flip the switch, enable it, and let's roll? What considerations are you seeing these enterprises take, and what considerations should other enterprises take, before they can adopt AI-assisted coding workflows in a compliant, safe way? How do you roll it out? Basically, what do you need to put in place, who gets access first, what does
00:21:45
Filip Schouwenaars
that look like?
00:21:47
Filip Schouwenaars
Michele, maybe?
00:21:48
Michele Catasta
Should I start? Yes. So one of the main concerns, I believe, for enterprises is licensing and copyright. As we know, LLMs tend to memorize code. It's very important that they are not trained on any code that is not permissively licensed, or, if that is the case, you at least need a system in place that tells you whether a certain excerpt of code that the model is generating is copied verbatim from, say, a GPL repository or anything for which you need to show attribution. So that's probably one of the top-of-mind concerns that enterprises have. The other ones are, I would say, more similar to what we have been discussing in the past 15 minutes, as in: what happens to our developers, what happens to the quality of the code that gets generated?
00:22:38
Michele Catasta
Are they going to become complacent? Shall we disable AI when we're interviewing them, etc., etc.? And last but not least, and honestly I don't have a
00:22:50
Michele Catasta
silver bullet for this problem, we get questions about security, as in: LLMs can generate code that might introduce security bugs. How do we make sure, how do we avoid that? As of today there is not a strong solution to that, apart from code reviews, which once again need to play a role in the picture.
00:23:11
Filip Schouwenaars
Yeah, and do you also see a concern in the direction of: I'm potentially sharing my entire code base with an LLM, and I'm concerned about it spitting out my code to someone else using this LLM?
00:23:27
Michele Catasta
That is likely the only one for which I think the industry has a good enough solution at this point. There are companies that allow you to deploy models on-prem, and the open-source world is flourishing, especially in the last year. So if you are willing and courageous enough to pick one of the open-source models, deploy it yourself, and build the application integration to make that happen, by all means you're free to do so, and that means that none of your data is sent over the wire to any other party. So compared to the security issue I was covering before, for which we need a proper technical solution that I don't think is within grasp in the short term, making sure
00:24:11
Michele Catasta
that your data doesn't leak is likely feasible.
00:24:11
Filip Schouwenaars
Yeah, yeah. Yeah.
00:24:12
Michele Catasta
It comes with shortcomings, of course, as in you're not going to be able to use frontier models from, say, OpenAI or Anthropic, but at least you can decide to willingly do so.
00:24:18
Filip Schouwenaars
Yeah, yeah. Yeah.
00:24:22
Filip Schouwenaars
Is that something that you did, Jordan, at MotherDuck, deploying LLMs on-prem or specifically for enterprises?
00:24:29
Jordan Tigani
We're not deploying anything on-prem or enterprise-specific, but there's an interesting thing about
00:24:35
Jordan Tigani
data, or SQL, as opposed to more generic code, which is that
00:24:44
Jordan Tigani
actually, organization-specific solutions can be better than broad solutions. If you're building an LLM for code, you kind of want it to look at as much code as it can, and then it understands how code works, and it can be really good at coming up with, hey, I want a function that does X, Y, and Z, or connects these two systems. The broader it can look, the better. But actually, a lot of the time in an organization, people write the same SQL over and over again, and that includes the business logic, that includes knowing which table you need to do X, Y, and Z. So by narrowing the search to a particular organization's queries that they've run before, you can actually do a better job than you could if you were looking across companies. And I think this is where we can have an advantage, by building SQL-
00:25:47
Jordan Tigani
specific solutions and rolling those out. And then you get better privacy, and people can be much more comfortable with how their data is being used and how their queries are being used.
00:25:56
Filip Schouwenaars
And you're doing that at MotherDuck, building a SQL-specific LLM?
00:26:01
Jordan Tigani
We are, actually. We worked with a company called Numbers Station to build a SQL-specific, DuckDB-specific LLM. We are not currently doing anything organization-specific, but we're starting to work in that direction, basically using RAG and using some
00:26:23
Filip Schouwenaars
It's like, what does it matter what the definition of revenue is out there in the open? We have our definition of revenue, and that's the one you should follow. It's kind of that, in short.
00:26:34
Jordan Tigani
Yeah, exactly.
00:26:36
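Here is a rough sketch of the retrieval-augmented direction Jordan hints at, grounding SQL generation in an organization's own query history; the query log, the scoring function, and the prompt format are all hypothetical placeholders:

```python
# Hypothetical sketch: retrieve an organization's past queries to ground a
# SQL-generation prompt in its own conventions and business logic.
past_queries = [
    "SELECT date_trunc('month', closed_at) AS month, SUM(amount) AS revenue "
    "FROM deals WHERE stage = 'won' GROUP BY 1",
    "SELECT rep_id, COUNT(*) AS deals_won FROM deals WHERE stage = 'won' GROUP BY rep_id",
]

def retrieve_similar(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank past queries by naive keyword overlap with the user's question."""
    terms = set(question.lower().split())
    return sorted(corpus, key=lambda q: -len(terms & set(q.lower().split())))[:k]

question = "How is revenue trending by month?"
examples = "\n".join(retrieve_similar(question, past_queries))
prompt = (
    "You translate questions into SQL for our warehouse.\n"
    f"Queries the team has run before:\n{examples}\n"
    f"Question: {question}\nSQL:"
)
print(prompt)  # this prompt would then go to whichever SQL-capable LLM is in use
```

A production setup would typically swap the keyword overlap for embedding similarity over query logs and schema metadata, but the shape of the idea is the same.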
Filip Schouwenaars
All right, cool. So that was the part on quality and also on rolling it out. I actually hadn't thought about the security aspect, what AI can introduce in terms of security bugs and who to blame if things go wrong. I was hoping that in the last couple of minutes we could basically pull out our crystal ball, make some predictions, and see what the future has in store for us. Of course it's impossible to say, but I'd love to get your thoughts from your respective perspectives.
00:27:44
Michele Catasta
I'm going to go with a contrarian take. I don't think we should be blindly trusting anyone who puts their hands in a code base, and I mean, that's how the largest companies have managed to scale very complicated architectures and make them reliable. Jordan had this experience at BigQuery: I don't think that software happened just because they were a bunch of amazing software developers. It's the process that brings you all the way there, to making it reliable, that makes the difference. So, long story short,
00:28:19
Michele Catasta
first of all, even when we accomplish the mythical AGI, if you think about it, we talk about AGI, at least by most definitions, as when AI is going to be comparable in terms of intelligence with a human. No human out there can be trusted to put their hands in a code base and make big changes, right? We always want them to go through a code review process. So no matter how long it's going to take us to accomplish AGI, we will still need someone in the loop to make sure that what the AGI creates is actually correct and code reviewed. So we're probably talking about an even longer timeline until we accomplish what people call ASI, a superintelligence, and when that is the case, this will probably be the least of our concerns. I think society will be so different that
00:29:11
Filip Schouwenaars
Yeah, yeah. Yeah. Yeah. Yeah.
00:29:12
Michele Catasta
yeah.
00:29:14
Filip Schouwenaars
I see I see.
00:29:14
Michele Catasta
So yeah, that's my contrarian take. I think it's interesting to build agents and autopilot-like systems that can help you go end to end, but to make sure that they align with what the human wants, we still need the double check coming from
00:29:34
Filip Schouwenaars
Yeah, yeah, or the double check could come from another AI at some point.
00:29:42
Filip Schouwenaars
All right.
00:29:43
Filip Schouwenaars
So, Jordan, suppose that at MotherDuck you continue developing SQL-specific LLMs,
00:29:53
Filip Schouwenaars
adding more AI capabilities into how people write code, write SQL queries, how they answer questions.
00:29:60
Filip Schouwenaars
How would you say the life of a typical data team member will change if they go all in on this AI-assisted world?
00:30:10
Filip Schouwenaars
Does their daily life look different? Michele already mentioned it's probably going to be more reviewing and less coding. Are there other aspects that will be different?
00:30:11
Jordan Tigani
I think first of all.
00:30:23
Jordan Tigani
Well, one thing, along the lines of what Michele was talking about before, about being able to bring more people to coding: I think we can bring more people to data and to being able to solve problems with data. For a long time there's been this holy grail of self-service analytics, where a business user, somebody who's comfortable in Excel, can basically say, hey, I want to ask some questions of my data, or understand my data, without having to know SQL or deeply know how the data model works and how everything works. I think we will start to see more people being able to get things done. But I also think that in order to do that, we're going to need essentially another level of abstraction, which is: we have the physical,
00:31:23
Jordan Tigani
basically the physical data that's in our databases, and our data models and our data schemas, and I think we're going to have to have really good definitions of those and a higher-level data modeling language that our data people are going to define, one that's going to include the business logic of what revenue means, so everybody can
00:31:47
Jordan Tigani
be talking about the same things. Then the LLMs will be able to understand that, and that's also going to allow us to start to make real business decisions based on it. So I do think the systems are going to have to look different, because the systems are going to know how to talk to this data model, and then we're going to have to make sure that we curate that model. So I do think those things are going to be different. On the other hand, I do think that
00:32:17
Filip Schouwenaars
Yeah.
00:32:21
Jordan Tigani
giving data people superpowers and making them able to write a lot more, and better, SQL and better analytics is also going to be
00:32:28
Filip Schouwenaars
Yeah.
00:32:37
Jordan Tigani
transformational for how they do their jobs.
00:32:39
Filip Schouwenaars
The ability of AI to make things more accessible, as Michele already mentioned in the beginning, that's super exciting. What it does bring is the need for a larger population of folks to be data literate, so to say: to be able to
00:32:54
Filip Schouwenaars
discern a trend from a plot, or to be able to say, ah, this is a correlation, that's actually a correlation, I'm not making this up. DataCamp has great courses on that, obviously, but I do think that's becoming more of a general skill that people will need to master. Maybe one
00:33:12
Filip Schouwenaars
more for Michele, as you're basically building, at Replit, an IDE used by millions of folks. The IDEs that we know today, the very common ones like Replit or Visual Studio Code, already existed before this entire AI hype blew up. Right now I'm seeing these AI functionalities come in as extensions to these IDEs, but it's not that the core IDE experience is super different from what it was before. As AI becomes more advanced, are you expecting that the core user experience, how we interact with programs, how we interact with files, how we write programs, is also going to
00:33:49
Filip Schouwenaars
completely change because of AI?
00:33:53
Michele Catasta
I do think so. When we make choices about what to surface in our product, or even when we want to decide among two or three different ways of doing one thing, which one should we surface, which one should we highlight, at this point it's usually always an AI-driven feature. So even if this change doesn't happen overnight, I think there is truly a paradigm shift happening. There are some companies that are taking a more radical approach, because they're starting from the ground up.
00:34:23
Michele Catasta
The first one that comes to my mind is Cursor, where they're really trying to build an AI-first IDE. As you mentioned, Replit has been around for many years, and people are used to a certain familiar feeling with the IDE, so we can't really disrupt it from the ground up overnight. That being said, we are building, as we speak, features where you enter the product directly with AI: you start to chat, you tell it basically what you want to do, and then it guides you through the process. And more and more, that's going to turn into an agentic workflow where we try to build the application for you, and then only the last mile is taken care of by the human, where they can do the last touch-ups. So regardless of whether you are starting from scratch or you are a product that already exists, I see them all progressing and becoming more AI-centric.
00:34:26
Filip Schouwenaars
Yeah.
00:35:16
Michele Catasta
That being said, I have to bring in my hot take, as always.
00:35:17
Filip Schouwenaars
Yeah, super exciting.
00:35:21
Michele Catasta
I don't want a lot of the goodies of the IDEs that we built in the last decades to disappear immediately. There is a lot to gain from static analysis and all the amazing toolchains that we've been building in the past few decades. So I hope that there's always going to be access to those very low-level tools that seasoned developers use; there is a reason why we put so much effort into building those, and that technology
00:35:45
Filip Schouwenaars
Of course, we don't have to throw the baby out with the bathwater, as the Dutch saying goes; not sure if there's an English one. All right, cool. We almost have to get ready for Q&A, but I do want to understand from you, you already mentioned it, that we'll have to learn to review the code more carefully, we'll have to be able to vet what the AI is doing.
00:35:47
Michele Catasta
So I don't want that to disappear.
00:35:48
Michele Catasta
Yeah.
00:35:50
Michele Catasta
Exactly exactly.
00:36:07
Filip Schouwenaars
But suppose I'm new to data. Let's make it specific: landing a career in data and starting to work in data teams, or at least trying to get insights from data.
00:36:18
Filip Schouwenaars
I want to get into that field.
00:36:20
Filip Schouwenaars
I know that AI is there. How should I get started? How can I build skills
00:36:25
Filip Schouwenaars
that are relevant? Should I still learn the basics of Python, or is that something I can just leave to AI? Should I immediately go for more advanced, high-level topics? Should I no longer care about learning to read Python and SQL? What suggestions do you have for people wanting to break into the field in a world where AI is becoming table stakes? Jordan?
00:36:48
Jordan Tigani
I would just say find a problem that you're interested in solving, dig into it, and use the tools that are available to help, whether AI-related tools or Python or SQL. It's always helpful to start with curiosity about something and then
00:37:09
Jordan Tigani
take a problem-solving approach, versus watching a bunch of YouTube videos or reading books and blogs. So, is there something that
00:37:22
Jordan Tigani
is going to help you in your day-to-day job, or something you've always wanted to understand, like
00:37:27
Jordan Tigani
football scores? You start poking at it and use that as a framework. I realize that's not an AI-specific answer, but I think that's generally the best way of learning things.
00:37:41
Filip Schouwenaars
So it's still valid regardless of AI, I see. Michele, do you have thoughts on this?
00:37:51
Michele Catasta
I think it's important to learn some of the skills from the ground up, so I totally agree with Jordan.
00:37:56
Michele Catasta
You need to learn how to feel the data; it's a very weird expression that I heard from many data scientists. If you just blindly trust the LLM creating pandas code for you, and you don't know how to vet the results that that code generates, you're not becoming a data person. So there is a certain level of training that you want to do, regardless of AI, in order to become proficient and to train that muscle of how to critique the results that are being generated. I don't think we get rid of that, with or without AI.
00:38:32
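Michele's point about vetting generated analysis code can be made concrete with a tiny example; the DataFrame, column names, and checks below are made up for illustration:

```python
# Tiny illustration of "feeling the data" instead of trusting generated code blindly.
import pandas as pd

deals = pd.DataFrame({
    "rep": ["ana", "ben", "ana", "ben", "ana"],
    "amount": [1000, 2500, None, 1200, 800],
    "stage": ["won", "won", "won", "lost", "won"],
})

# Code an assistant might produce for "revenue per rep":
revenue = deals[deals["stage"] == "won"].groupby("rep")["amount"].sum()

# Checks a careful reviewer still runs by hand:
print(deals["amount"].isna().sum())                                        # missing amounts silently dropped by sum()?
print(deals.loc[deals["stage"] == "won", "amount"].sum() == revenue.sum()) # do the totals agree?
print(revenue.sort_values(ascending=False))                                # does the ranking pass a gut check?
```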
Filip Schouwenaars
Yeah. I sometimes bring up the analogy that just because we have calculators doesn't mean we're no longer teaching people to do 7 times 8 in primary school, right? It's about getting a feel for how large a number is; once you have it, of course, use the calculator, but that intuition around numbers and mathematics is something we definitely still need to train. All right, great, super insightful session. I learned a ton of stuff. It's very exciting to see how Replit is adopting AI and being at the frontier of
00:38:43
Michele Catasta
exactly
00:38:56
Michele Catasta
100%
00:39:06
Filip Schouwenaars
a new kind of user experience powered by AI. And on the SQL side, I'm super excited to see what SQL-specific LLMs, and infusing the context of the organization, are going to bring to data professionals. I was hoping in the last five minutes we could still do a quick bit of Q&A, and I actually found a question here that I'm going to show on stage, because it's something we talked about before starting the session, just among the three of us: have AI coding assistants changed hiring practices, and if so, how? Are technical interviews still prevalent in the era of AI coding assistants? What do you want to see from candidates these days?
00:39:42
Filip Schouwenaars
Jordan?
00:39:44
Jordan Tigani
Yeah, if I could jump in and just say that we used to use Replit for our coding interviews, and, as I was saying before, we had to stop using it because the AI assistant got so good that it would just autocomplete the questions that we asked. And then I guess there's the broader question: okay, well, if something can be autocompleted, are you really testing skills that are important anymore? To use the calculator example, are we just testing that someone knows how to multiply, or that someone actually knows how to
00:40:42
Filip Schouwenaars
Yeah, I see. Michele, do you want to shine your light on that?
00:40:48
Michele Catasta
Yeah. So first of all, Jordan, we're going to allow you to disable AI for interviews; that feature is coming. Glad to hear that you were using Replit for interviews. That being said, given that I try to hire people who are passionate about building these systems, I can't really ask them never to use AI during the interview process. So my tactic has been the following: I have a certain number of interviews that are old school, so whiteboard or pen-and-paper style, where they can't use any level of assistance, and then maybe one or two of them allow you to use any tool, with the following caveat. I ask them to screen share, so that the interviewer can actually see how they use any AI tooling out there. I want to see how productive they are, what their mindset is, how capable they are in using those tools. And at the same time, the bar also rises substantially, because if you have a powerful LLM in the loop, what I expect you to build in 45 minutes
00:41:47
Michele Catasta
is much more than what you could build in isolation, without any assistance. So it's probably a double-edged sword, but I 100% agree with Jordan that we should be focusing more and more on interviewing for real-world skills. And if this is effectively what developers are becoming, the intersection of their own knowledge and AI, then this is what we should learn how to test. I don't think any of us has the perfect solution, but that's why we are trying this
00:42:18
Filip Schouwenaars
Yeah, I see. Thanks.
00:42:19
Michele Catasta
hybrid interview process right now.
00:42:23
Filip Schouwenaars
Top three AI coding assistants that you currently see performing well: is there currently a gold standard? I'm guessing you have a bit of a subjective take on this, but with lots of assistants out there, is there one single best that people should definitely check out, according to you?
00:42:51
Filip Schouwenaars
Jordan
00:42:54
Jordan Tigani
I mean, for coding, obviously I think GitHub Copilot is the one everybody knows; it's the OG. But on the data side, I would say that Hex does an amazing job of incorporating AI into everything, not just writing SQL but also building visualizations. They've done an awesome job.
00:43:17
Filip Schouwenaars
Yeah. Michele?
00:43:22
Michele Catasta
Yeah, I will go with GitHub Copilot for sure. They've been pioneering the field, so they definitely deserve a spot in the top three. I mentioned Cursor before; they're taking a very radical approach and they innovate a lot, so definitely worth keeping an eye on and testing. And I can't leave Replit out of the top three, of course, given it's my daily job and passion, and we have a different take on this. I think what's cool about what we're building is
00:43:43
Filip Schouwenaars
Of course, of course.
00:43:51
Michele Catasta
helping you across the entire journey, end to end, from idea to deploying software in production. That's why I find it very compelling: it's AI not only to create code but across the whole software development life cycle.
00:43:56
Filip Schouwenaars
Yeah.
00:44:04
Filip Schouwenaars
Yeah, and I guess it's also not only about who has the best model; I think it's also about who knows how to most smartly integrate it into their product and make it a very seamless workflow. People say a lot of these tools are mere wrappers around OpenAI, but actually a lot of the magic is happening in building that wrapper around OpenAI as well. So, definitely a super exciting field to keep checking out. All right, let's wrap up here. Thanks so much for the session, I learned a ton, and I hope the audience enjoyed it as well. Thanks again for joining us.
00:44:38
Filip Schouwenaars
And let's all keep watching this field of AI-assisted coding. I'm very excited for what the future has in store.
00:44:46
Filip Schouwenaars
Thanks so much. See you.
00:44:49
Filip Schouwenaars
Bye.