Crafting a Data Strategy to Support AI in Healthcare

Author: Sue Pittacora

As organizations dive deep into utilizing the capabilities of data, AI, and advanced analytics, developing a robust data strategy is crucial.


In a recent CDO Magazine interview, Sue Pittacora, Chief Strategy Officer at Wavicle Data Solutions, and Himanshu Arora, Chief Data and Analytics Officer at Blue Cross Blue Shield of Massachusetts, sat down for an in-depth discussion of data strategy and advanced analytics, with a focus on AI applications in the healthcare industry.


Continuing from the first part of the interview, this conversation with Sue and Himanshu dives into how to craft a data strategy aligned with business goals, navigate cultural nuances, and implement best practices in advanced analytics and AI.


Watch the full interview here or scroll down for a detailed transcript of Himanshu’s insights.



Sue: Hello and welcome to the CDO Magazine interview series. I am Sue Pittacora with Wavicle Data Solutions, and I’m delighted to be joined here today by Himanshu Arora, Chief Data and Analytics Officer of Blue Cross Blue Shield of Massachusetts. Welcome, Himanshu.   


Himanshu: Thank you, Sue. It’s good to be here and good to see you. 


Sue: Thank you. So we all know we need clean data, and we need a strategy for how we’re going to use data, and AI, and advanced analytics. 


How are you thinking about your data strategy in relation to AI and advanced analytics use cases? 

Himanshu: There’s a difference between the data strategy approach for, let’s say, the AI category of solutions and the approach for traditional, even advanced, analytics and data science solutions. I think there’s sort of a step change. In data science and advanced analytics, which is primarily machine learning driven, the focus has been on getting to clean data, getting to data that is cohesive and can be collated together, and doing, for example, large observation versus small observation, synthetic data, data enrichment, internal and external sources. That has been a lot of the focus.


And rightly so, because in the models, let’s say in the algorithms that are built out today, there is the opportunity to then take those outputs and apply a different lens, let’s say a lens of bias, the need for which is becoming increasingly pronounced.  So, you have a step opportunity there. 


With AI, and particularly with generative solutions, one of the things that we need to do differently is, instead of looking to de-bias the model outputs, to de-bias the data inputs to the model, knowing that at some stage these models are going to be running more independently. It’s not imminent, but it is coming down the line. We all understand that. These models are going to be running more independently than the traditional advanced analytics solutions of today do.


They’re also in line, but there are multiple steps, interventions built in there in the automation that is coming our way on top of generative AI, for example. That aspect becomes even more critical.   


So then the question becomes, “how do you do that? How do you look to de-bias your data?”   


That’s tough. And that’s where, frankly, there is a lot of manual work involved. We are looking for more and more automated solutions to do that, and it’s becoming a bit more of a challenge. In order for an AI-driven system to really be effective from, let’s say, an external, customer- or member-facing perspective, it has to encapsulate not just the context in which the question is being asked, but also the context of the individual who’s asking the question or seeking something. This translates forward to having a much fuller understanding of an individual, which then translates forward to having much more data about an individual than only their health care data.


So, now you’re looking to de-bias not just the inherent observations or data that comes through, for example, a claim or an enrollment, but also other data that members provide to us and expect us to know about them. For example, preferences. Preferences in terms of how members wish to engage with us. Are they okay with us calling them? Do they prefer to engage primarily digitally? What times of the day? And again, that has a direct correlation to what a member’s socioeconomic status might be, or at least an imputed correlation to whether a member is available at certain points in time. When it comes to care and care management, what sites of care, in-person or virtual, should a member have the option to go to? And that starts to speak to benefit design.


The complexity isn’t new in the traditional data that we have. It is now adding all aspects of data around a member and de-biasing the input into the model itself. If we don’t do that, then de-biasing the output would take away from what we are trying to do with these AI solutions in the first place. So you must do it on the way in. That’s the biggest difference about the challenge that we’re solving for now.  
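As a rough illustration of what checking data "on the way in" could look like, the sketch below compares how often each group appears in a model's input data against a reference share. This is only one simple check and is not the approach described in the interview; the function name `representation_gaps`, the field `contact_preference`, and the reference shares are all hypothetical.

```python
from collections import Counter


def representation_gaps(records, group_field, reference_shares):
    """Compare each group's share of the input data with a reference share.

    Assumes `records` is non-empty. A positive gap means the group is
    over-represented in the input data; a negative gap means it is
    under-represented.
    """
    counts = Counter(r[group_field] for r in records)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


# Hypothetical input data: 8 members prefer digital contact, 2 prefer phone.
records = (
    [{"contact_preference": "digital"}] * 8
    + [{"contact_preference": "phone"}] * 2
)

# Hypothetical reference shares for the population the model should serve.
gaps = representation_gaps(
    records, "contact_preference", {"digital": 0.5, "phone": 0.5}
)

for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A gap report like this only flags skew in the inputs. Deciding whether and how to rebalance the data remains a judgment call that depends on the use case.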


Sue: That’s intriguing. This idea of de-biasing the data as it comes in is an intriguing concept. I think it has a lot more complexity than a traditional data strategy, as you just explained. There’s a lot of work ahead of us.


How do you ensure your data strategy for advanced analytics meets the diverse needs of your organization? How does culture play a role? 

Himanshu: That’s a very good question. I’m going to start with the generative components of AI. Just consider the mass adoption. If you remember, a year ago ChatGPT got out of the gate, getting to, what, 100 million users faster than any other technology had at the time.


It meant that more of our employees, customers, and members have had some form of exposure to this capability than to previous capabilities. Even web, even mobile, even Web 2.0. There is more general awareness that these things exist.


Now, in one way that actually makes the internal cultural adoption of this easier. Quite frankly, what I’ve experienced, and this has been very positive, is much more of a thirst and demand from our internal teams to say, “when can we start playing with what we hear people talk about in social media, in people’s lives, what we experienced through other channels, what people experience in other industries, even?” It’s been great from that perspective.  


I think there’s also, from a cultural perspective, the sense that this is the first time we are at the forefront of encapsulating knowledge work into some level of automation. What does that mean from an individual’s knowledge path perspective? Do I have to learn differently? Do I therefore have to do my tasks differently? What are the things we are aiming for over time? The aim is to take away much more of the repetitive manual tasks of today, which consume a lot of time and energy on the part of our frontline associates.


The benefit inquiry is a good example of this. It still gives the member the information that they’re looking for, complete and in context, so that there’s no compromise there, but it then allows our associates to spend quality time really handholding a member who’s in a challenging or difficult circumstance all the way, soup to nuts.


So, it’s activating a different part of our learning. It needs to activate a different part of how we learn the new skill sets in our jobs and roles. I think that’s another part of the culture impact, which is still early, but I expect or anticipate that will play out over time.  


The last piece of this is when we take a step back and say, “okay, how does this work with our mission as an entity and as a company?”  


I think our mission is to show up for everyone as if they are the only one. So, that aspect of personalization is quite exciting and enticing both internally and externally. It does help propel the culture in that direction naturally.   


There are a lot of natural tailwinds behind this. What we need to figure out relatively quickly is enabling the “how” so that our internal teams can get to it, and we can start to feel more confident in exposing these capabilities externally. We will also have the proof points to be able to say things are meaningfully better, not just where we applied the AI, but also where we chose not to apply the AI so that it remains much more of a human experience.


Sue: So, Himanshu, let’s go back to something you touched on earlier. Let’s revisit this whole idea of data security and privacy. That’s on the minds of every organization, but of course, even more so for a healthcare organization.


How do you consider data security and privacy, particularly within healthcare organizations?

Himanshu: Data security –  I’m using that term more liberally than just the traditional definition of data security – is an increasingly big concern because there is an increasing reliance on data flowing through different entities and systems and flowing through pipes, which can then make it susceptible to breach, to attacks, to different bad actors with bad intentions. I think that’s the piece where we are figuring out the balance between how much data versus what value we can get out of that data. 


I think that’s another space that is ripe for us to apply AI capabilities to. There are already good applications in terms of threat detection and applying capabilities that way. But how do you apply those capabilities both preemptively, keeping data from leaving the organization when you identify concerns around it, and in the recovery? Oftentimes, and this has been true for us as well, we’ve sent data to an entity that has ended up in a breach of some sort. So how can we use AI to accelerate recovery from it? That includes the risk assessment of what the breach was about, who the affected members are, how we reach out to them, and working with the affected entity to start to mitigate the impact of what’s happened.


Another complexity of this is that the security aspect covers cases where something negative is happening with data through an explicit action. But much more hidden behind the scenes, if you’re not careful about it, is how data is being misused, either intentionally or even unintentionally, in these data-driven AI models, algorithms, etc., that we’re talking about.


We are being very prescriptive about how, where, when, in what context, and for what use our data can be used, not just within the organization, but when we send the data out as well. What are the stipulations within which it has to remain and be governed?


And now, you need audit capabilities on top of that. Well, that can be extremely labor intensive. How do you automate that? Security used to be about what’s within my confines and how do I protect my castle.  It’s much more than that now, and it’s creating new opportunities and new challenges.   


Sue: That’s great. I hadn’t thought of it like that, Himanshu. I think broadening that spectrum of what we’re looking at and the way we’re thinking about it and using AI in recovery, potentially, when necessary, is brilliant. We talked about data strategy as it relates to AI and advanced analytics and how that’s different from traditional data strategy. But we still have the core issues of data quality and data accuracy.


How do you handle data quality, accuracy, and bias in internal data when developing a data strategy for AI and advanced analytics? 

Himanshu: This is a very live conversation for us internally, and we’re working with some key partners on this front today. We talked about how different entities are collaborating even more in the interest of delivering, for example in healthcare, the highest health outcome for the lowest unit cost, with the best experience, in an equitable way.


But that means different categories of data are now coming together more than before. Clinical data, which is captured in a clinical setting, is now being combined with claims data, pharmaceutical data, and lab data. Each of these data sets comes with its own data quality issues and challenges to begin with.


No data set is 100% devoid of data quality challenges. In spite of all remediations, it’s sometimes a whack-a-mole, and you have to keep investing in making data quality better. So, the intersections of how these different data sets come together are an area where the focus of data quality is shifting. It is moving beyond how I keep my claims, enrollment, and member services data at its highest quality to how I now take what I have, combine it with these other data sets that are coming in or with some other point of synthesis, and make data quality work across all of them.


A specific example of this is that it’s good to have good claims information about a member. But for us to really understand them from a 360-degree perspective, be able to define, design, and implement targeted care management programs around them, and be able to assist them before health risks present, we need clinical data. But even combining clinical data, even combining belly buttons – so, a patient for a provider and a member for us – making sure that it’s the same member involves some complexity. Then you start to overlay this claim with what episodes of care does it encapsulate and how do I correlate this prescription that was prescribed here with a claim that I see here on the medical side with a pharmacy claim that I see here.  Where do I park these lab observations on a timeline to be able to develop a comprehensive picture about the member?  
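To make the matching and timeline problem above more concrete, here is a minimal sketch, assuming exact matching on a normalized name and date of birth. Real record linkage usually needs fuzzier, multi-field logic, and nothing here reflects how any particular organization does it. The `Person` type, the ID formats, and the event tuples are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    source_id: str      # hypothetical ID from the provider or payer system
    first_name: str
    last_name: str
    birth_date: date


def match_key(p: Person) -> tuple:
    """Build a normalized key; real matching usually needs fuzzier logic."""
    return (p.first_name.strip().lower(), p.last_name.strip().lower(), p.birth_date)


def link_patients_to_members(patients, members):
    """Return {patient source_id: member source_id} for exact key matches."""
    members_by_key = {match_key(m): m.source_id for m in members}
    return {
        p.source_id: members_by_key[match_key(p)]
        for p in patients
        if match_key(p) in members_by_key
    }


def build_timeline(events):
    """Sort (date, kind, description) tuples into one chronological view."""
    return sorted(events, key=lambda e: e[0])


# The provider's "patient" and the payer's "member" differ only in formatting.
patients = [Person("PAT-1", " Ana ", "Silva", date(1980, 5, 17))]
members = [
    Person("MBR-9", "ana", "SILVA", date(1980, 5, 17)),
    Person("MBR-4", "Ben", "Okoro", date(1975, 1, 2)),
]
links = link_patients_to_members(patients, members)

# Hypothetical events from three separate sources, placed on one timeline.
timeline = build_timeline([
    (date(2024, 3, 9), "pharmacy_claim", "prescription filled"),
    (date(2024, 3, 1), "medical_claim", "office visit"),
    (date(2024, 3, 4), "lab", "lab result recorded"),
])
```

Exact-key matching is the simplest possible choice and will miss typos, name changes, and missing birth dates, which is part of why this kind of linkage involves the complexity described above.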


So, the data quality concerns are moving out of the databases and into real-world processes that people are experiencing. Therefore, data quality can no longer just be a discipline for the back end. It is a discipline that has to be applied in line with the running process. So, it’s not that we need less data quality. That’s always going to be an issue or a concern, whether it’s missing observations, mistyped information, information that got dropped during transition from one entity to another, missing values, etc.


None of those concerns are magically going to go away. But we can’t handle them at the database level alone; we have to elevate that and handle them at the process level as well.
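One way to picture data quality applied "in line with the running process" is a check that runs on each record as it arrives, rather than a cleanup job run later against the database. The following is a minimal sketch under that assumption. The field names and the quarantine approach are hypothetical, not something described in the interview.

```python
def check_record(record, required_fields):
    """Return a list of data quality issues found in one incoming record."""
    issues = []
    for field in required_fields:
        value = record.get(field)
        # Treat absent keys, None, and blank strings as missing values.
        if value is None or (isinstance(value, str) and not value.strip()):
            issues.append(f"missing value: {field}")
    return issues


def process_stream(records, required_fields):
    """Split records into accepted and quarantined while the process runs."""
    accepted, quarantined = [], []
    for record in records:
        issues = check_record(record, required_fields)
        if issues:
            quarantined.append((record, issues))
        else:
            accepted.append(record)
    return accepted, quarantined


# Hypothetical incoming claim records, two of them incomplete.
incoming = [
    {"member_id": "M-1", "claim_id": "C-10", "service_date": "2024-03-01"},
    {"member_id": "", "claim_id": "C-11", "service_date": "2024-03-02"},
    {"member_id": "M-3", "claim_id": "C-12"},
]
accepted, quarantined = process_stream(
    incoming, ["member_id", "claim_id", "service_date"]
)

for record, issues in quarantined:
    print(record["claim_id"], issues)
```

Quarantining a record with its list of issues, instead of silently dropping or fixing it, keeps the problem visible to whoever owns the upstream process.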


Sue: Himanshu, thank you so much. You shared so many insights today, from thinking about game-changing AI innovations and looking at an internal framework to de-biasing data and using AI for recovery from data breaches. Really a lot of rich insights. Thank you so much for your time today. I greatly appreciate it.


To all, please visit for additional interviews. Thank you.   


This concludes our two-part interview with Himanshu. For further insights into data strategy and AI, explore our resources here.


Ready to get started on your AI journey? Get in touch with Wavicle’s experts.