The Role of Mature Data and AI in Accurate Generative AI

Author: Beverly Wright

In this podcast episode, Dr. Beverly Wright, Vice President of Data Science & AI at Wavicle Data Solutions, and Aparajit Agarwal, Enterprise Data & AI Architect at Kao Corporation, explore why mature data and AI ecosystems are critical for success in generative AI initiatives. They discuss how gen AI’s potential can only be fully realized when built upon a mature ecosystem capable of handling complex data to deliver consistent business results. Whether you’re in the AI space or simply a tech enthusiast, tune in to learn how to realize the full potential of gen AI.

Speaker details:

Dr. Beverly Wright, Vice President – Data Science & AI at Wavicle Data Solutions

Aparajit Agarwal, Enterprise Data & AI Architect at Kao Corporation

Watch the full podcast here or keep scrolling to read a transcript of the discussion between Beverly and Aparajit:

Beverly: Hello, I’m Dr. Beverly Wright, and welcome to TAG Data Talk. With us today, we have Aparajit Agarwal, and he’s talking about the criticality of mature AI for accurate generative AI. Very interesting title. Welcome to TAG Data Talk.

Aparajit: Thank you for having me.

Beverly: Let’s start off with a little background. Tell us a little about yourself. Why are you so cool?

Aparajit: Because I refuse to be flustered by that question. I have a pretty long history; I’ve been doing this for about three and a half decades. I started out as a structural engineer and moved into IT. I’ve done a lot of other things: programming, sales, business process reengineering, creating a music album, and running a restaurant. I’ve also run my own company in the analytics space, creating software for statistical analysis, predictive analytics, and model building for the energy commodity industry. I recently designed a multi-tenanted SaaS software platform for global vendor billing for a company that then got acquired by BlackLine.

Beverly: Wow, you’re a real Renaissance man, getting into arts and technical and everything in between. We’re talking about how important it is to have mature AI before you can just jump right into generative AI. What does that mean when we talk about mature AI, like analytics maturity as it relates to generative AI?

Aparajit: One source of confusion is that, for many companies, AI stands for analytics and insights as well as artificial intelligence. Mature AI relates to data maturity and a good data management program. Ultimately, it’s all about the business, your core product, and whether you’re making money or improving your position in the market by using gen AI or AI of any sort.

The widespread issue we’re seeing in the industry is that people want to do gen AI, and they want it to be customer-facing, and that’s where the danger comes in. If you really want to get the most bang for your buck, there is AI that has been around for much longer than gen AI that a lot of people don’t traditionally think about, and it’s machine learning. The ability to do predictive analytics, supplement that with gen AI, and use it internally, not externally, is where the money is at.

Beverly: You talked about using AI for predictive analytics to boost your work and how businesses at the core should focus on solving problems to add value. That’s going to shock a lot of our listeners because they think that data and technology are at the core. But you’re saying business is what’s really at the core providing that value. Could you tell us about predictive analytics and how to infuse those with AI to get that value for the business?

Aparajit: Think about gen AI as being a layer on top of everything else. You have your data, standard dashboarding, analytics, and other things. What you want to add to that is machine learning and predictive analytics and using AI to do that. A lot of people have groups of resources that are doing predictive analytics for tasks like forecasting. But applying data science to that piece results in a lot of returns.

For example, let’s take a company that’s manufacturing something. Are you leveraging time series data and macroeconomic indicators, such as unemployment rates and crude oil prices, to monitor market trends? Are you correlating this data with your sales and cost of goods over the same period, applying data science techniques to identify opportunities for cost reduction? Additionally, are you incorporating market sentiment analysis to forecast raw material needs? For instance, by predicting the demand generated by an upcoming marketing campaign, you could strategically purchase raw materials at a lower price today, knowing their future cost. With the right inventory management in place, this approach can optimize your procurement strategy and maximize profitability. When you do that and supplement it with gen AI, then the return on the dollar is significant.

Beverly: In this scenario, you’re talking about data maturity being kind of the crawl. And then the walking being like how to get more AI into your predictive analytics to solve problems. And then running is sort of the gen AI cutting edge. But that is not what people want to do.

Aparajit: No. That is not what people want to do.

Beverly: They just want to do some generative AI, and you’re saying no.

Aparajit: Call it the frequent flyer syndrome or the FOMO syndrome. Everybody is worried that somebody else is going to get ahead and eat their lunch. So, you want to do it. Everybody in the industry is facing a lot of top-down pressure from executive leadership, as well as the business, to do something with AI.

Beverly: Even the boards, right? They just want to do something with AI, and they don’t even care what it is. It’s going to increase our price. So, you’re saying that’s cool, but no.

Aparajit: Gartner predicts that 30 percent of all POCs and pilots in the gen AI space will get shut down in 2025, and that’s a significant number. Those shutdowns are going to come from not being prepared with data, poor data quality, and poor data management practices. If your data engineering is not absolutely spot on, it’s the old “garbage in, garbage out” adage.

Beverly: Isn’t that funny that we’re talking about that right now?

Aparajit: Yeah, I know. We’re still talking about it 30 years later, and it’s the same thing. The other thing is having the appropriate risk controls. A lot of people don’t realize that gen AI carries some significant risks. You have risks around prompt engineering and prompt injections, IP laws, how the gen AI models were trained, what toxic information might be stored in them that could come out at the wrong time, and whether the model is hallucinating and you know how to stop that.

Beverly: And are you feeding things that are out there for other people to access?

Aparajit: Exactly. I mean, very recently, there was an issue where Air Canada got sued because the gen AI-based chatbot that they released told one of their customers that it was okay to go ahead and do what they needed to do, and they could get a refund later. When the customer reached out to Air Canada, they couldn’t get it. So, they sued, and the court held them liable.

Beverly: Because their chatbot told them that.

Aparajit: Exactly, their chatbot told them that because it was hallucinating. There were no measures to say that you will only stick with this information. So, there’s supplemental information that can be fed via prompt, and then there’s the information on which the model was trained. If you don’t know how all of that was done, and if you’ve not done a significant amount of testing, and you put something like that in the marketplace, you risk loss of trust and reputation.

Beverly: What if that happens at large, and you have to offer refunds to everyone because of the chatbot, and now you’re in big trouble? What if people start learning how to ask questions or say things in a particular way to influence the way the chatbot responds? If you have a programming mindset, you’ll know how to make this work.

Aparajit: Exactly. Then it also requires continuous monitoring. Plus, the prompts that are being input into your gen AI are another whole set of data. I mean, we talk about big data and you’re generating all this data. Those prompts are data that you then need to analyze to see if there are holes being poked.

Beverly: The whole idea of prompts is brand new data. We started with numbers and moved to images and sound and sort of implying sentiment but not quite mindreading yet. Now we’re using prompts, which is new data.

Aparajit: Exactly. That’s the whole “risk” part of it. The other part of it is costs. It may seem like a small cost, but things multiply. You have hundreds of users asking 1,000 questions, 1,000 prompts, over 10 different use cases. Now suddenly you have got 1,000,000 prompts or more. You might think that your token cost is cheap, but every one of those prompts consumes tokens. Then there’s the problem of not being able to drive actual business value and measure it. For example, are you able to say, “Because I did this, I was able to add so much to my profit or reduce costs”?
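The multiplication Aparajit describes can be sketched in a few lines. This is only an illustration: the figure of 100 users is one reading of "hundreds," and the tokens-per-prompt and price values are hypothetical placeholders, not any vendor's actual pricing.

```python
# All numeric inputs below are illustrative assumptions, not vendor pricing.

def total_prompts(users: int, prompts_per_user: int, use_cases: int) -> int:
    """Prompts sent when every user asks the same number per use case."""
    return users * prompts_per_user * use_cases


def estimated_cost(prompts: int, tokens_per_prompt: int,
                   usd_per_million_tokens: float) -> float:
    """Rough spend: total tokens consumed, priced per million tokens."""
    total_tokens = prompts * tokens_per_prompt
    return total_tokens / 1_000_000 * usd_per_million_tokens


# 100 users x 1,000 prompts x 10 use cases, as in the conversation.
prompts = total_prompts(users=100, prompts_per_user=1_000, use_cases=10)
print(f"prompts: {prompts:,}")  # prompts: 1,000,000

# Hypothetical: 500 tokens per prompt at $2 per million tokens.
cost = estimated_cost(prompts, tokens_per_prompt=500, usd_per_million_tokens=2.0)
print(f"estimated cost: ${cost:,.2f}")
```

The point of the sketch is that the prompt count and the token count are different quantities: each prompt consumes many tokens, so the bill scales with both.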

Beverly: Or you’re just trying to get technical for the sake of it. So how is this happening? What is the driving force that is making us forget that we need to have good data, engineering, and all these things and then come to the table and start dipping our toe in there. Why are we getting so much pressure? And what do we do about it?

Aparajit: So, your question is less about data and more about human behavior. It’s like my kids go to school, and somebody has got Nike shoes on, and they want Nike shoes. But the driving force is different. For my child, it might be to look cool. But the driving force for a CIO or CEO or CDO might be that I don’t want somebody telling me that somebody came along, and they did this, and look at what they’re doing, and they’re taking away market share.

Beverly: But what you’re saying is not just about market share and revenue. You’re saying there’s big potential risk, like lawsuits, losing your company, bad reputation, and loss of faith in humanity. If it were a healthcare company, people might lose faith. So, you’re saying it’s not just about looking cool, making money, being the hero of the company, or using gen AI. But you’re saying there are reasons to be careful.

Aparajit: The guardrails are critical if you do anything new. There’s a criticality involved in what kind of measures you put in place to curtail these bad situations. With gen AI, I’m not saying gen AI is worse or better than any other technology that’s come along in the past, it’s just that it’s new and scary. A lot of people are doing a lot of wonderful things with gen AI, and there’s a lot of buzz in the marketplace. In a survey comparing last year to this year, companies were asked how many were exploring gen AI, running active pilots, or had it in production. Last year, 70 percent were exploring, 15 percent were piloting, and 4 percent were in production.

Beverly: Where is the source?

Aparajit: The source is Gartner. And then the remainder did not even get involved. This year, the exploring dropped from 70 to 42.

Beverly: Wow, that’s huge.

Aparajit: But guess why. The piloting went from 15 to 45. So literally there’s been a 36 percent jump year over year in people getting heavily involved.

Beverly: What’s the third tier?

Aparajit: The third tier is in production. Four percent last year versus 10 percent this year.

Beverly: So a six percent increase?

Aparajit: It’s more than double.
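The exchange above turns on the difference between a percentage-point change and a relative change. A minimal sketch using only the survey figures quoted in the conversation is below. Treating the "36 percent jump" as the combined piloting-plus-production share is an inference that happens to match the numbers; the transcript does not confirm that reading.

```python
# Survey shares (percent of companies) as quoted in the conversation.
last_year = {"exploring": 70, "piloting": 15, "production": 4}
this_year = {"exploring": 42, "piloting": 45, "production": 10}


def point_change(before: float, after: float) -> float:
    """Difference in percentage points."""
    return after - before


def growth_ratio(before: float, after: float) -> float:
    """How many times larger the new share is than the old one."""
    return after / before


# Production: 4% -> 10% is six points, but 2.5 times the earlier share.
prod_points = point_change(last_year["production"], this_year["production"])
prod_ratio = growth_ratio(last_year["production"], this_year["production"])
print(prod_points, prod_ratio)  # 6 2.5

# Assumed reading: "heavily involved" = piloting + production combined.
active_last = last_year["piloting"] + last_year["production"]
active_this = this_year["piloting"] + this_year["production"]
active_points = point_change(active_last, active_this)
print(active_last, active_this, active_points)  # 19 55 36
```

Both speakers are right in their own terms: production grew by six percentage points, and that is also more than double the earlier share.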

Beverly: If you had to give one final piece of advice, would you ask people to wait or to focus their attention on the right places? Because everybody’s trying to come up with use cases for gen AI without thinking about the data.

Aparajit: It’s a question of how deep your pockets are because there are certain critical things that you cannot overlook. But that doesn’t mean that you just focus over there and leave gen AI completely alone, because gen AI does have its place. And where it finds its place in today’s world is internal productivity.

I’ve prepared a few things over here that I can talk about. Text generation, knowledge search, translations, and speech to text. If you have a multinational company, and you’re creating marketing materials, you can translate from one language to another with the click of a button. Some of these processes are coming down in terms of the intensity of the work involved, generating decent results, and increasing productivity. So, you do want to explore that.

Beverly: So, you’re saying to wait for the external-facing gen AI and focus attention on data and traditional data science?

Aparajit: Traditional data science in the space of machine learning and predictive analytics to give you actual hard bottom-dollar results should be the top priority of any company.

Beverly: If you are going to delve into it beyond POC, make it internal. Make it about internally improving the way you operate.

Aparajit: Exactly. Among the best practices, the best thing you can do is use the standard matrix: low, medium, and high in terms of business value against low, medium, and high in terms of feasibility. You put whatever you want to do in that matrix and see where it lands. If it’s low business value and low feasibility, don’t do it.
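The matrix Aparajit mentions can be sketched as a small triage function. Only the low-value, low-feasibility rule comes from the conversation. The other outcome labels, the `Level` and `triage` names, and the example use cases with their ratings are hypothetical and chosen purely for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def triage(business_value: Level, feasibility: Level) -> str:
    """Place a proposed use case in the value/feasibility matrix."""
    # Rule stated in the conversation: low value and low feasibility -> skip.
    if business_value == Level.LOW and feasibility == Level.LOW:
        return "skip"
    # Assumption for illustration: the opposite corner goes first.
    if business_value == Level.HIGH and feasibility == Level.HIGH:
        return "prioritize"
    # Assumption for illustration: everything else needs a closer look.
    return "evaluate further"


# Hypothetical use cases with made-up ratings.
candidates = {
    "internal knowledge search": (Level.HIGH, Level.HIGH),
    "customer-facing chatbot": (Level.HIGH, Level.LOW),
    "novelty slogan generator": (Level.LOW, Level.LOW),
}

for name, (value, feasibility) in candidates.items():
    print(f"{name}: {triage(value, feasibility)}")
```

In practice the ratings themselves are the hard part; the function only makes the decision rule explicit once a team has agreed on them.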

Beverly: What final piece of advice would you give to somebody who’s really trying to understand this whole thing? That even in the age of AI, you must work on data and modeling before jumping into generative AI?

Aparajit: The funny thing is that gen AI can supplement your machine learning and predictive analytics when you don’t have enough data. I’m flipping the conversation here a little bit. Generating synthetic data using gen AI can supplement predictive analytics if you’re using it internally, know what you’re doing, get the right people watching it, and make sure it’s generating what you want.

There’s the whole circle, and I’m going to use a very mundane example: let’s just take a report. First, you don’t want 3,000 reports out there, you don’t want to spend the money on building them, and you don’t want to use up the resources to do it. If you have a solid analytics governance program and you have a report, it’s crucial to ask: “What am I using this report for? What business decision did I make using this report, and did it generate revenue or lead to improvement?” If the impact can’t be measured, it’s time to modify the report or analytics to enable better decision-making next time. This continuous loop of improvement should also be applied in the gen AI space.

Beverly: If I understand you correctly, you’re suggesting that getting involved in the big picture and aware of industry trends can prevent your company from falling into the 30 percent that shut down projects before they even begin. By engaging with the community, learning from others’ experiences, and understanding the challenges they face, you can avoid being overwhelmed by internal pressures and ensure you’re better prepared for success.

Aparajit: In a new and fast-moving space, it’s natural to worry about whether you’re making the right decisions. Finding the right resources is key, and communities like this one are invaluable. Gartner, Google, the internet, and ChatGPT are also valuable.

Beverly: Yeah, there you go. Love it. Well, thank you again, Aparajit, for talking to us about the criticality of mature AI for accurate gen AI.

Aparajit: Thank you.

The potential of gen AI is immense, but its success is deeply rooted in the maturity of the underlying data systems. Ensuring that foundational AI technologies are well-developed and reliable is essential for achieving accurate and effective gen AI outcomes. By prioritizing the refinement of core AI systems, businesses can harness the full power of gen AI, driving innovation while maintaining trust, precision, and alignment with their strategic objectives.

Explore the full catalog of TAG Data Talk conversations here: TAG Data Talk with Dr. Beverly Wright – TAG Online.