Unlocking the Power of Data Science in the Cloud
Solongo is the former Global Head of Customer Success at Exasol. Solongo is skilled in strategy, business development, program management, leadership, strategic partnerships, and management.
John Knieriemen is the Regional Business Lead for North America at Exasol, the market-leading high-performance analytics database. Prior to joining Exasol, he served as Vice President and General Manager at Teradata during an 11-year tenure with the company. John is responsible for strategically scaling Exasol’s North America business presence across industries and expanding the organization’s partner network.
Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.
Key Quotes
Open up your thinking in your mind, right? Because now that you're in this new world, the barriers, in my experience, become lower, right? So, so if you, if you have a barrier, you've got a roadblock. If, if you're not meeting the price that you need, the performance that you need, don't stop with it's okay. Good enough is good enough. I'd say, open up your mind and say, there's, there's a better way.
When thinking about migrating to the cloud, start with a POC and really look at what is the highest pain point, what's the most critical and start small and take a graduated approach and start with just the storage layer and play around with the microservices and see where it goes. And it's also important to focus on the quick wins while building up that long-term strategy.
Key Takeaways
Embrace flexibility—Moving analytics to the cloud offers flexibility and cost savings, allowing organizations to respond effectively to workload fluctuations and reduce the need for over-provisioning.
Identify your triggers for moving to the cloud. Potential triggers include the inability to meet business needs, cost considerations, or executive mandates. These triggers can guide the decision-making process and ensure alignment with your organizational goals.
Measure the success of your migration through the initial objectives of the project, such as cost savings, scalability, or data availability, ensuring alignment with the organization's strategic priorities.
Transcript
Richie Cotton: Welcome to DataFramed. This is Richie. If you're a data scientist, the infrastructure for your analyses may not be something you regularly pay attention to. But where your company keeps your data, and your analysis tools, is going to have an impact on your ability to work efficiently. So, it's worth having some awareness of best practices.
In particular, one of the biggest productivity boosts you can get is from moving analytics from local machines and on premise servers to cloud platforms. The catch is that migrating your data infrastructure is a big project that involves a lot of people and moving parts. Since so much of your business will be impacted, it's something you need to plan carefully to get it right.
Today, I have two experts from high-performance cloud database company Exasol to help you think about making this transition. John Knieriemen is the General Manager for North America and Solongo Guzman is the Global Head of Customer Success. Both of them have extensive experience in helping organizations make the transition to doing analytics in the cloud.
I'm looking forward to hearing their opinions on what to do and what not to do while running a cloud transition project. Let's dive in and hear their advice. Hi there, John and Solongo. Great to have you on the show.
John Knieriemen: Hi, Richie. Thanks
Solongo Guzman: Thanks for having us.
Richie Cotton: So I'd like to kick off with a little bit of motivation. So can you just tell me ...
John Knieriemen: Yeah. So again, thank you for having us, Richie. This is a really exciting discussion for me, and I'm really looking forward to our time together here. You move your analytics to the cloud primarily for the flexibility the cloud allows. In the prior model, the on-prem model, a lot of times you don't have the flexibility you need from a data analyst perspective or from a data science perspective.
You're subject to creating an on-prem infrastructure that requires regular updates internally. There's process, and you don't get the flexibility that you need. And sometimes the cost is not what you need as well. Some organizations have decided that they want to get out of the business of managing a platform.
They want to be more strategic in the way they interact with technology and choose technology. So that's really the high-level pitch for moving analytics to the cloud.
Richie Cotton: And since there are a lot of bits in the analytics stack, there are a lot of things you might want to move to the cloud. So can you just talk us through all the different parts that you might want to move to the cloud?
Solongo Guzman: Absolutely. So in terms of layers, you can think about your analytics layer as the storage layer starting out. And then there's the integration ingestion layer. There's the processing layer and your analysis layer. So the easiest way to approach this is starting with your storage layer. So cold object store is relatively cheap in the cloud.
So that can be your storage layer. That can be Amazon S3, Azure Blob, or Google Cloud Storage. So that would be a great place to start. And then you connect that with your enterprise data analytics warehouse or database. And then you have the composable services that you can start playing around with.
Richie Cotton: Okay, so you're suggesting you just start like, just, you got a load of files lying about, shift them to the cloud first, and then worry about the analytics later. That's interesting. Okay so, I guess, before we decide that moving everything to the cloud is a brilliant idea, what are the alternatives to this?
John Knieriemen: Yeah, so there's basically an alternative to have either a full on-prem infrastructure, a hybrid infrastructure, part on prem and part in the cloud, or all-in cloud. So those are really the three different scenarios. The benefit of all-in cloud, as you look at what I said earlier around the flexibility and the ability to design your architecture in the way you want to design it, is that it's really optimal.
I sat down with a chief data officer a few months ago, and she basically told me, she said, I want my architecture to be like a Lego set. I said, what does that mean? She said, I basically want it to have those nice holes and fittings that fit together nicely. Then I can design it the way I want to design it.
So that's really what a lot of organizations are looking for, and the cloud really enables quite a bit of that. Some organizations, if you go to the other side, have a need to be fully on prem, whether it's regulatory concerns around security or having the ability to control their whole destiny end to end.
Some organizations are doing that today. And then I actually see a growing number of organizations in the middle, in a hybrid, right? Part on prem, part in the cloud, and they're making decisions for those environment deployments based on what they're trying to achieve.
Richie Cotton: And so, if you are going to push things into the cloud, it seems like maybe the reason you'd like to keep things on premises is going to be worries about privacy and security and things like that. So how do you address those if you're going to move to the cloud?
John Knieriemen: So at a high level, when you move into a public cloud provider in particular, their whole business is centered around security, particularly the three major cloud providers. If they don't have tight security for both public and private sector, they really won't have a business. So those organizations have really excruciatingly tight security that they follow.
And then all of the technologies that fit within that environment are naturally as secure. That's the security perspective when you're all-in cloud, doing the analysis on the public cloud provider side. And then fully on prem, you have to do all that yourself, right? You do all that vetting yourself and work with a software provider and/or your security team within your organization to figure that out.
Richie Cotton: Okay, so it sounds like a lot of this is about how much do you want to do yourself versus how much do you want to outsource to other platforms and tools. Okay. So are there any economic factors that drive this sort of migration of analytics to the cloud?
Solongo Guzman: Cost savings. I think it boils down to upfront expenditures. You're able to forgo the process of provisioning a physical infrastructure, which, as we've seen, can be a very long process in a large enterprise. It starts with infrastructure planning and architecture design, which can take anywhere between three and four months.
There's an internal certification process. Typically, we see this in financial services, so there's the infra design pattern documentation, and that can take anywhere between five and six months. And then there's the actual physical hardware delivery and the installation of the different environments, dev, test, prod, and the data centers.
So by the time the business has access to the environment, it can take as long as 12 months. So with the cloud, you're forgoing that. And with the pay-as-you-go model, you pay for only the resources you consume. So that eliminates the need for you to provision the resources up front. And it's also increasing scalability and agility.
So what that means is you're able to respond effectively to fluctuations of workload. So you're able to reduce the need for overprovisioning and you're able to scale much faster. There's increased operational efficiency. So there's overall reduced maintenance and downtime. The economics are incredibly attractive for going to the cloud.
Richie Cotton: I think saving money and doing things faster, that's a pretty strong argument for making the move. So what's the initial trigger in general? So what's the point where companies realize, okay, we need to start doing this migration?
John Knieriemen: Yeah, I would say there are several triggers. First and foremost, it's are you able to meet your business needs and objectives in the way that you've designed what you've designed? Right? So you think about the data scientists and the data analysts. Are they able to be the most efficient in their role, to be able to succeed fast, fail fast, be able to get the performance they need, to be able to access the data they need across the organization?
If that is a concern, that would be a trigger, right? To look at what type of issue it is. Is it a deployment model issue? Is it a process issue? Right? It's not always where the data sits or how you put it. Sometimes it's just a matter of process and how the organization is structured, who has access to data and how they access it.
I would also say that, to Solongo's point, cost could be a driver, right? When you look at the cost, sometimes on prem you have to design for the worst-case scenario. So you're designing this unilateral architecture that is the worst-case scenario, versus being able to, to her point, use the consumption-based model for just exactly what you need.
And the third one is probably the most obvious. Sometimes there's a mandate from a C-level executive that says we shall go to the cloud, and that gets everybody rallied in that direction.
Richie Cotton: I certainly think those economic drivers are pretty relevant at the moment. There's a lot of companies trying to cut some costs, but on that last point I have a feeling a lot of our listeners might be the people who are like having to implement this sort of stuff. Suppose your CTO comes to you and says, okay, we need to move all our analytics to the cloud.
Where do you start?
Solongo Guzman: Where we see this is really use case oriented. So, the market has changed quite a bit. Generally, a decade ago, we saw CTO- and CIO-driven, top-down decision making. But now you see this mass consumerization of data and analytics. So what that means is the business and the users are really creating these use cases.
It's really use case oriented. So if they have a need to go to the cloud, they need to do it quickly. They want to modernize their solution as opposed to doing a more lift and shift. Because if you have silos, if you have issues on premise, why would you want to migrate that onto the cloud? So you want to take a much more holistic approach.
And start out with maybe the highest pain point, and say, what is the use case that needs to go to the cloud, and start with that. And then maybe adopt a proof of concept and start there, because the cultural aspect might be one of the biggest barriers of this migration.
Richie Cotton: That's really interesting. I guess it's true in every company where they're like, oh, I hate the structure of our data. We've got data silos everywhere, and there's probably like one magic company where they don't have that problem. But the idea is that just changing your infrastructure gives you the opportunity to rebuild things from scratch and maybe do things better.
All right. So you mentioned starting with high-impact projects. So I'm wondering what the trade-off is here between starting with high-impact projects versus starting with low-risk projects. Like, what's the sort of thinking behind how you decide what to move first?
John Knieriemen: Yeah, I would say every organization is different. When you look at what I mentioned earlier around the projects that are creating the most pain, I think those are probably the most obvious, right? So the situations where you're physically hindered in your role as an analyst or a data scientist, those would be the potential ones, to Solongo's point, that you may move right now because there's not a lot to lose in that.
The risk is very low on that one. There's probably only upside to that situation. There are maybe other ones where, if you can use the analogy, if it isn't broken, don't fix it. Maybe those you would look at and put further out on the roadmap. So it's being able to look at the ones that are creating the most pain and the ones that are running okay, and then systematically migrating.
The ones that are running okay, you might even make the decision to do those as well, because there's an economic advantage to having everything in a cohesive environment.
Richie Cotton: I think hopefully people are sold on the idea that it's a good idea to move things. Who needs to be involved in this process?
Solongo Guzman: It's everybody. Especially now, you know, Gartner has done a phenomenal job with their frameworks.
So the simple answer is that the business and tech teams both need to be involved, as well as the executive decision-makers. A cloud migration is a strategic decision. So you need to have, obviously, the executive buy-in, but in order to move the organization up the maturity model, you need the entire organization on board.
That means the business teams, the tech teams and the leadership.
Richie Cotton: Okay. So that's a lot of people. So by the time you got like, the users and anyone who's involved in data and anyone who's involved in IT and leadership, how do you manage like all these different teams working together? Because I imagine they're going to have different viewpoints on what to do.
John Knieriemen: The way you manage it is really bringing the teams closer together. So, in the traditional model, there is an IT perspective on the types of technologies that would solve a business challenge. There should already have been that natural relationship between the business and IT.
But at the end of the day, the IT group would normally say, I can or can't do this, depending on the technology in the stack that I'm aware of. In this other model, with more flexibility comes more responsibility for the data analysts and data scientists, because now they're in a position where the operational side is mostly covered, right?
Because you're in an environment that, as we talked about earlier, is secure and works well together and is a marketplace of tools and solutions. So at that point, the data analysts and the data scientists become more powerful, able to make those types of decisions. But the IT group still being informed is extremely key, right?
So take any organization where decisions are happening in silos. I've seen both: where IT is designing and building, assuming they will come, and where the business side is making a lot of decisions. When the business side is making a lot of decisions, the risk is that there's a lack of governance and potentially a high cost, right? Because they're deploying at will and swiping their credit card, and things can get out of control in all different areas.
So there's still a requirement to have, to Solongo's point, people very tightly engaged across the organization. And then, from an executive perspective, that they're aware of the opportunities and the risks and that they give full support to the organization as well.
Richie Cotton: It seems like communication is sort of the big key here. So do you have any tips on how to communicate across teams or how to manage these big processes?
Solongo Guzman: So Centers of Excellence have worked really well, especially in big, complex organizations. Bringing a committee together and having various members from the stakeholder group works well. But we're also seeing it happen at a grassroots level. For example, at a big customer, we have advanced analytical capability now in the cloud, and the users have never had this kind of access before.
So now the question has turned into who has this data and when can I use it? And by default that process has allowed closer collaboration of the different teams.
Richie Cotton: That sounds pretty cool. And so maybe I'd like to hear a bit more about some of these success stories that you've had with your customers. So can you tell me an example of when a customer has been through this migration process and then they've had a big win from it?
John Knieriemen: We had a large healthcare company that we work with, a US-based healthcare company, who made the decision to move to the cloud. The decision was based on the fact that they wanted more flexibility, they wanted more agility, and they had a theory about cost savings. So they ended up taking their current on-prem data and moving it into an object store in the cloud provider that they were using.
That created a layer by which they can then make decisions, to the Lego analogy, about how they plug the Legos in, how they plug the engines in, to optimize that stack. And I'd say it's still a journey. What I applaud them for is they've been flexible along the journey.
And that's one thing I'll stress: the first step of the journey is critical to get the foundation built. But then you need to be flexible as you move through the journey, and the cloud allows that flexibility. So they've taken a constant look at what engine is best for what job and what types of AI/ML tools are best for the teams.
And it's been a very collaborative group, and we've been honored to be able to work with them, too, around performance and some of the things that we provide. But, to applaud them again, they've been very flexible along the way and along the path.
And back to your question about organization, they've created a structure by which they have a center of excellence. They have regular meetings and interactions across the business and the IT groups, but then also executives, and they work as a very cohesive unit.
Richie Cotton: So it sounds like the big win for them has just been increased access to data. Are there any productivity benefits that go along with that?
John Knieriemen: Yeah, so when they started this journey, the data analysts and data scientists were complaining because sometimes they could only ask three or four questions of their data in a day, right? So with one- to two-hour queries, things were just getting out of control. They'd go grab a cup of coffee, and by the time they came back, the query was maybe still running.
They were ineffective and inefficient because they couldn't succeed or fail fast enough to be productive within the organization. So when they moved into this cloud environment, they were able to select the right tools to be able to optimize. And now the data analysts and data scientists are far more efficient, able to ask far more questions of the data for a lower cost and to leverage those tools in the consumption-based approach.
Richie Cotton: Oh, so you mentioned this was a healthcare provider, and I have to say, at DataCamp, because we're a tech startup, of course we do everything in the cloud. But I guess it sounds like a lot of the benefit from this migration is going to be for older companies and perhaps larger organizations.
And would you say that's an accurate description of your customer base? Like who cares about this the most?
John Knieriemen: No, I would say it's actually particularly interesting for mid-tier companies, because mid-tier companies don't necessarily have the massive amount of operational resources to support this in the way that they need to. This particular healthcare company was a large one that decided, we don't want to be in the business of doing this, even though we could.
We don't want to be in the business of managing an operational platform. But for mid tier organizations, we met with one last week where we had one guy who was basically running the whole shop, and he was a solo guy running the whole shop. And so in that situation, Cloud makes a ton of sense because you're able to leverage all of the operational support and the whole marketplace of tools and solutions you get when you can't necessarily afford to hire a whole team.
Solongo Guzman: It's also a lot of Fortune 100 companies, and the power is going back to the business users. So instead of relying on IT, we have teams at two of the large telcos that run their own infrastructure. So they own it, they run it, and a lot of the operational overhead is outsourced so they can be freed up to do more strategic work.
Richie Cotton: With a large cost project like this migration to cloud, you're gonna want to see some kind of returns and you want to know how to measure those. So can you talk about like, what's the measurement for success of this?
Solongo Guzman: Absolutely. I think first and foremost it's, why did you set out to do the project from the get-go? Was it cost savings? Was it scalability? Was it access to compute resources? So I think that's the initial you 80 if you will. But the ultimate success, what we're seeing with our customers, is data availability and accessibility.
So you have a single source of truth, right? And with improved governance and improved security, there is sort of a data flywheel happening where more users have access to the right data that they need at the time that they need it. Now, the business is able to effectively harness more value out of their data.
So it's going back to climbing up the maturity model. And using data as a strategic driver for the business.
Richie Cotton: Okay, so you really have to define some goals up front, before you actually embark on this, if you want to be able to accurately measure how well you're doing at the end of it. Makes a lot of sense. On the flip side, are there any things that can go wrong? Like, what do you see as the most common problems?
John Knieriemen: I'd say that what can go wrong is maybe the reverse of what we said could go right. It's when you don't go in eyes wide open, when you haven't done your due diligence to map out what you believe the business opportunities are, where the gaps are, and how you can optimize for performance.
And some people go in with blinders on, thinking that there's a utopia, and they come to find out that the lack of planning means they're in the same place as when they started. So I would stress doing proper planning, making sure that you understand what the gaps are and what the trade-offs are, and making the right decision for your business.
And every business will be different. Some will be full on cloud, some will be full on prem, some will be a hybrid, depending on your model and the market you're in and all that. So I would say that planning is extremely important, and being realistic about when you're going to have those quick wins and where you're going to build the foundation is also important.
Richie Cotton: It always disappoints me when planning is the thing that's really important. I want to dive right in, but I've got to be patient. All right. So, planning seems to be a very important thing. Having some goals seems to be an important thing. Are there any other tips you have for making sure that a migration project is a success?
Solongo Guzman: Yeah, I think it's arming yourself with as much knowledge as possible. Now that the responsibility shifts back to the users, you really have to know a lot about your cloud platform of choice. You've got to know how the tools and the interoperability work. Because, going back to the risks, you can never eliminate human error.
Cloud might be one of the safest places because of the robustness of the security features and the data encryption at rest and in transit. But there are incidents of not provisioning things properly. So knowing as much as you can about this incredibly powerful capability will be the best way to mitigate any risks.
John Knieriemen: Yeah, I would also add that it's important to design that center of excellence we talked about earlier as an educational center of excellence. Where I see some organizations struggling as well is fitting a square peg into a round hole from the tool perspective. Because some tools are designed for performance, some are designed for ultra flexibility, some are designed for just low cost, right?
So, if you use the tools in a way that they're not designed for, you could escalate your costs and hurt your ability to get the performance and the data that you need. Being able to design that data in a foundational way, so that you can access the data you need across the organization without putting up silos or locking the data down for users, is important. And creating that center of excellence, having kind of a shared network where people can say, hey, I used this tool for this and it worked well, and share that across the organization, is best practice.
Richie Cotton: So you mentioned this term center of excellence a few times, so I'm curious as to what this involves. Is this like a dedicated team who's just dedicated to this one project, or what's the deal here?
Solongo Guzman: It's all of it. So it's a team, it's a process. It's a mechanism in place that allows you to make this otherwise incredibly complex project happen. So we're seeing cloud centers of excellence. We have our own database center of excellence at certain customer sites, where it allows people from other parts of the business and other departments to come together and share knowledge on an ongoing basis.
So there's transparency. When there's transparency, there's better control.
Richie Cotton: Okay, that sounds like a useful thing, just having everything together in one place. I presume it makes communication easier if everyone's not spread out around the globe and in different teams with different needs. So we talked a lot about the general migration process.
Maybe let's focus on databases since, of course, at Exasol, you're primarily a database company. I think that's a fair characterization. So, I guess the big names in cloud databases are things like BigQuery and Redshift and Snowflake. So how does Exasol compare to those platforms?
John Knieriemen: Yes, I'll start with the business challenge again, which we discussed earlier. So there are three different balancing acts, if you can think of it that way: the ability to have choice within the database you select, the ability to optimize for cost, and the ability to optimize for price-performance.
So those are the three balancing metrics you see with databases. We call ourselves a no-compromise solution. So we're able to properly balance that. We've got a very strong foundation around an MPP-based architecture that can scale concurrent users and scale data seamlessly.
But then we have the in-memory technology that helps it operate very efficiently and very, very rapidly, particularly at the BI layer, right, when you need to have second or sub-second response times within queries. And then we have some things that are virtualized. So we have in our database the ability to reach into different data sources natively, structured and unstructured, to be able to pull that data in so that the data analysts and data scientists don't have to worry about different systems and bringing systems together and moving data around.
You can natively pull that in. And then we have things that are just native within our solution that make it easy to operationally manage. It automatically creates the primary index and those sorts of things. So, we're excited to be able to offer the market a no-compromise solution when you look at those balancing measures.
Richie Cotton: So it sounded like high performance was a big part of your pitch there. You mentioned like the sub second query times and things. Is it just about chasing like faster query execution or are there other parts to improving the performance of people working with data?
Solongo Guzman: At the mechanical level, it's two things, Richie. It's faster ingestion and faster query times. But going back to what we said, we're really focused on use cases and outcomes. So faster queries and faster ingestion mean a lot of things for our customers. Typically, when we're working with large enterprises, they're coming to us with a specific purpose and a specific outcome they want to achieve.
So they want to increase revenue through strategic pricing and customer acquisition, or they want to reduce costs through reducing operational expenses, or reduce risk and meet compliance. So we are really focused on delivering, you know, what that use case requires.
So it's much, much more encompassing, and it's wrapped around faster ingestion and faster query times.
Richie Cotton: So, I quite like this idea of, like, focusing on the use case and just getting beyond like, just can you crunch SQL really quickly? Can you maybe give an example of one of these use cases where you've had to tailor what the database is doing to meet that need?
Solongo Guzman: Yeah, absolutely. We have a telco customer that has a really cool use case. They're doing network optimization. So what they do is they have different data analysts and modeling teams. They generate models that make recommendations to improve 5G configuration plans and designs, and that also includes infrastructure acquisitions, which can be really expensive for a telco.
So they're able to predict, using models, what type of investments would yield the highest ROI and what type of specific cell tower configurations, and they're able to track the progress of that using third-party external crowdsourced data. So it's a full circle of not just providing the recommendations, but also seeing the progress and tracking the progress so they can improve their models and continue being that vital part of business decision making.
Richie Cotton: That's actually a really interesting use case, because it sounds like, if they're using external data sources, you've got a data engineering component for bringing everything into the database, and then it sounds like there's some kind of optimization or maybe machine learning on the other end of it as well, like some fairly sophisticated analytics.
So how does the database fit into that kind of broader technology pipeline then?
John Knieriemen: If you look at the pipeline, right, you have the ingesting of data. The database is really the horsepower of the processing part of it. So whether you're running an AI model, a report, a BI tool, or another solution, the horsepower usually comes from the database itself, right?
So the database is really make-or-break in a lot of cases for being able to offer the performance that you need across the infrastructure. Now, I won't minimize other things, like being able to properly prepare your data, and sometimes the ingest process is broken. So it's not that the database is the magic silver bullet.
It never is, right? It's the end-to-end process. But the database is sometimes a key bottleneck for organizations if it's not utilized correctly and you're not using the right technology, again, for the right job.
Solongo Guzman: That's a really good answer, John. So we typically see three deployment patterns. There are three main ways that data consumers can get to the database. You have the traditional data engineers and DBAs using SQL and accessing the database, like Exasol, directly. Or you have the data analysts going through a BI reporting tool like Tableau or Power BI.
And then you have the data scientists going through DataRobot or SageMaker to access the database. But nonetheless, it is the compute engine. It's the heart of the operation. So when you have BI users complaining about how slow the dashboard is, it's actually the underlying database.
So a lot of the credit goes to the front end, but what's actually happening is what's underneath the hood.
Richie Cotton: Okay. Yeah, certainly. I've seen that where you've got a really clunky dashboard and you're like I'm not sure what's going on there. And maybe it's like, it's the processing upstream from the dashboard that's actually the problem. So, oh, John, you mentioned bottlenecks. And I'm curious to hear about, are there any more examples of like analytic problems where it really benefits from that more powerful database engine in the middle?
John Knieriemen: Yeah, so pretty much any industry that needs to make decisions in a quick enough period of time to actually make or save money is critical. And I'll use a couple of examples. One's from retail, right? So you have a call center. You've got people that are taking calls, and a lot of call centers are selling centers in some cases, right?
Where they're able to not just support the customer, but also to sell or upsell the customer. So in those types of situations, we work with a retailer around being able to do that in near real time, so that as the person is sitting there with the individual on the other side of the line, they're running not just analytics, they're running predictive analytics.
So, to Solongo's point, they've gone up the maturity curve where they're saying, this person's calling, and based on all of the people, crunching in real time, and the market and the day and whether it's a holiday or not, they're able to make a prediction about what you should be discussing and presenting to that customer, right?
So that's ultra critical. You've got to have an engine, a database, that's able to do that. Another example is we're working with a high-tech company around sustainability. So basically, for all of their physical locations, they're attaching Exasol to be able to monitor energy consumption and usage in near real time to be able to make those adjustments.
And again, it's not just descriptive analytics. It's being able to predict, based on these types of energy consumption behaviors, "We're going to go and adjust and optimize for sustainability."
Richie Cotton: That's actually pretty cool, because I guess when I think of databases, I'm thinking of just doing really simple business analysis, crunching numbers, and once you start getting to doing predictions, a bit of machine learning, then that's where you break out Python or R.
But actually, from what you're saying, it sounds like the database part of that is still really important, because you've got to get the numbers processed in the first place before you can start trying to do any of that. Okay, cool. I guess for people getting started with this, there's going to be a lot of organizations who are just trying to figure out cloud migration now.
And so you mentioned having greater access to data was one of the big benefits, I think, for your health care example. How do you think about data governance more generally at Exasol?
Solongo Guzman: I think through the process of migrating your analytics to the cloud, you have better metadata management, better cataloging, and you're improving governance. We have seen that with a large customer that had traditionally been on premises, and they had siloed data sets in their own business units.
So there was a dispute on who had the source of truth. They effectively have a data lake now on Exasol in AWS, so it's a single source of truth. The governance has just improved. And the business has access to already cleansed, wrangled data that the data scientists and modelers can just run with.
So they've seen improved governance. They've seen improved security by having a central repository.
Richie Cotton: I'd like to talk a little bit about skills. If you put your analytics stack in the cloud, does that change the skills that your data analysts and your data scientists need?
John Knieriemen: I'd say absolutely. To the point made earlier, with much flexibility comes responsibility, particularly for the analyst, the data scientist, the business user, as we call them, to be more in tune with not the depths of a technology, but how a technology functions and how it should be used. Because essentially it's now on their shoulders in a lot of cases to make that selection, to pick the right tool for the right job, to not fit that square peg into that round hole.
So they need to be more in tune with the types of technologies, again, not the full architecture understanding, but enough to know when to deploy which, again, Lego on top of that Lego set for the particular thing that they're trying to achieve.
Richie Cotton: I kind of like that. It's like, I don't need to understand the architecture. I just want to play with Lego all day.
John Knieriemen: There you go. I have three young boys at home, so it resonates with me. I step on them on a daily basis, unfortunately, but it resonates with me.
Richie Cotton: Nice. Okay. And so if you're working on problems where you do need high performance, so you mentioned some examples like working in retail, you need to get the answer in almost real time. So, if you do have that sort of need for really fast analytics, what sort of skills do you need to get there?
Solongo Guzman: Again, it boils down to the database, because the front-end tool is only as powerful as the underlying database. So we see this very commonly in the market. Tableau, for example, is a popular dashboarding and analytics BI tool, and you're able to work only with subsets and extracts, whereas with Exasol, you're running analytics on a live data set through a live connection.
So that really empowers the data scientists and analysts to do more analytics, richer analytics, on bigger data sets and fresher data sets. So, really, when you're playing around with a powerful engine and powerful tools, especially in the cloud, where you have a plethora of microservices, you have to know how these tools work together and really understand the power of this, because the flip side of going with the pay-as-you-go model is that if you don't know how to properly use it, the cost overrun can really be a surprise.
And you don't want to get that surprise bill at the end of the month. So you have to know how to provision them. You have to know how the tools work together.
Richie Cotton: I was talking to a data engineer recently and he was just like, "Oh, no, I left this database running for 10 days. I didn't really remember it." And he said, "Yeah, just got the bill for that." So certainly remember to turn off services you aren't using. Very good cloud advice. So I think, to summarize your point, you were basically saying that if you want high performance stuff, you just need to buy the right tools first that are going to deliver that performance and then learn how to use those tools properly.
And that seems to be the order, rather than thinking about skills first and then figuring out the tools after you've got those skills. Does that make sense? Is that what you meant?
John Knieriemen: It does. And I would also add that having the ability to work with IT is important as well. So having those sorts of people interactions, being able to work well together with the team, is important because, again, the tool doesn't solve everything. It's the whole process. And those data scientists and data analysts that have regularly scheduled interactions with IT and view them as a business partner, those are the ones that are most successful.
Richie Cotton: And I think one thing that's sort of come up a few times here is that you need to know a little bit about what the database is doing and what all the other tools are doing, even as a data analyst or data scientist who's actually using these things. So are there any particular things that you think people need to know about the underlying tools and infrastructure?
John Knieriemen: Yeah, I would say that with some technologies, as a data analyst or data scientist, you can hit SELECT * and it's not going to crash the database. With others, it will basically totally crash and you'll be in a bad state. And so it's about understanding enough about the types of ways that you would interact with the tool.
The things that are okay, the things that are not okay, understanding those boundaries. Some databases have good workload management capabilities, to be able to give one data analyst prioritization because they're running something critical for the CFO, while the others are going to take a lower priority. Some tools have that, some don't.
So being able to understand that well enough to be able to go back to the IT group again and say, "Hey, I really need priority this week because I'm running this particular thing," that's really important.
Richie Cotton: And just to wrap up, do either of you have any final advice for organizations wanting to migrate their analytics stack to the cloud?
Solongo Guzman: I'd say start with a POC and really look at what is the highest pain point, what's the most critical, and start small and take a graduated approach. Start with just the storage layer, play around with the microservices, and see where it goes. And it's also important to focus on the quick wins while building out that long-term strategy.
Richie Cotton: Oh yeah, proof of concept and getting a quick win seems very useful, especially if you've got impatient managers wanting to see results. And John, how about yourself? Any last advice?
John Knieriemen: I agree with what Solongo said. I would also say open up your thinking in your mind, right? Because now that you're in this new world, the barriers, in my experience, become lower. So if you have a barrier, you've got a roadblock, if you're not meeting the price that you need, the performance that you need, don't stop with "It's okay, good enough is good enough."
I'd say open up your mind and say, "There's a better way." There's a lot of tools out there. There's a lot of things we can do, and it's really exciting when organizations open up their thinking and say, "We can achieve what we need to achieve."
Richie Cotton: Fantastic. And with that, thank you, Solongo. Thank you, John. It's been great having you on the show.
Solongo Guzman: Thank you, Richie. Thanks for having us.
John Knieriemen: Great. Thanks so much.