Healthcare Data Readiness for AI: Why Programs Stall Before the Model
Why healthcare AI stalls at the data layer, and how readiness across completeness, consistency, connectivity, and compliance gets pilots to production.
Featuring Ratnadeep Bhattacharjee on The Signal Room
Ask a health system why its AI pilot never reached production, and the answer will almost always point to the model. The vendor promised magic and failed to deliver. Accuracy slipped. Clinicians refused to trust the outputs. These answers are common, and most of them trace back to an earlier problem: the model was asked to answer a question using data that had never been brought together to support that answer.
This was the connecting thread of a Signal Room conversation with Ratnadeep Bhattacharjee, whose work centers on preparing healthcare organizations to use AI. His framing was blunt: the real AI challenges lie not in the model or the algorithms, but in the data. Data sits in silos across EHRs, claims systems, and spreadsheets. Until an organization fixes that, AI will remain mostly hype rather than a useful tool. At Hutchins Data Strategy Consultants, this is the condition we encounter most often when a health system asks why its dashboards and models never earned the trust of their intended users.
Readiness Is Measured Against a Use Case, Not in the Abstract
AI has many facets, and readiness is hard to assess without a specific definition of what it is for. So "are we ready for AI" is the wrong question. A health system is ready, or not, for a particular AI model serving a specific decision. The way to find out is to evaluate the data that the use case would actually consume.
Bhattacharjee framed readiness along four dimensions, and the key benefit of this framing is that it incorporates the perspectives of technologists and non-technologists alike.
First is completeness. When implementing a model, do you even have the data it requires? A care-gap analytics model that uses claims data without any clinical notes is, as he described it, already flying half blind. Having lots of data is not the same as having the right data for the question at hand.
Second is consistency. Once the necessary data is identified, is it structured and standardized? He gave the example of organizations with twenty different ways of recording a single blood pressure, and addresses that do not match across systems. Inconsistency leads to unstable predictions, and it often goes unnoticed until a model trained on the messy data is used in the real world.
Third is connectivity. A lab result lives in one system, the prescription in another, the social and economic context in a third. Unless those dots connect, no model can assemble the longitudinal view of a patient that actually drives impact. This is where the most damage occurs, since a fragmented record appears complete when viewed in isolation in any given system.
Fourth is compliance. Data integration has to work within the constraints of privacy and healthcare regulation, including HIPAA and CMS requirements, and alongside interoperability standards such as HL7 FHIR. If the tactics used to streamline integration do not adapt to those constraints, complications surface at the most inopportune time.
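The first two dimensions lend themselves to simple, automatable checks. The sketch below is illustrative only, not a method described in the conversation: the field names in REQUIRED_FIELDS, the record layout (plain dictionaries of strings), and the two blood-pressure notations the pattern accepts are all assumptions made for the example. It measures completeness as the share of records that carry each field the use case needs, and it normalizes differently written blood-pressure readings into one structure.

```python
import re

# Hypothetical use-case definition: fields a care-gap model is assumed to consume.
REQUIRED_FIELDS = ("member_id", "claim_codes", "clinical_note", "blood_pressure")

# Accepts notations such as "120/80" or "120 over 80 mmHg" (illustrative, not exhaustive).
_BP_PATTERN = re.compile(
    r"^\s*(\d{2,3})\s*(?:/|over)\s*(\d{2,3})\s*(?:mmHg)?\s*$", re.IGNORECASE
)


def completeness(records, required=REQUIRED_FIELDS):
    """Return, per required field, the share of records holding a non-empty value."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in required}
    return {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / total
        for field in required
    }


def normalize_bp(raw):
    """Parse a blood-pressure string into (systolic, diastolic); None if unrecognized."""
    match = _BP_PATTERN.match(raw or "")
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = [
        {"member_id": "M1", "claim_codes": "G0101", "clinical_note": "",
         "blood_pressure": "120/80"},
        {"member_id": "M2", "claim_codes": "99213", "clinical_note": "Screening done",
         "blood_pressure": "118 over 76 mmHg"},
    ]
    print(completeness(sample))
    print([normalize_bp(r["blood_pressure"]) for r in sample])
```

Readings that normalize_bp cannot parse come back as None rather than a guess, so the count of None values is itself a rough consistency signal for that field.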
The Damage Is Connectivity, Not the Algorithm
The most instructive part of the conversation was an example Bhattacharjee gave involving a health plan serving Medicaid populations. The plan wanted to use predictive models to address gaps in care. When his team dug deeper, they found that the claims data and the EHR feeds were not connected. Members were flagged as non-compliant with screenings not because the screenings had not been completed, but because they took place in community clinics whose data never reached the claims system.
The solution was not a predictive model. By fixing the connectivity and consistency issues between these data sources, the health plan improved its quality-measure reporting without touching a single AI model. This is the story behind most organizations' failed AI attempts. What the organization identified as a failure of the algorithm was a data integration problem, and solving it delivered immediate value before any algorithm was deployed.
This also reframes what the work of addressing bias in these models involves. A model developed using only claims data will always carry the blind spots inherent to claims data: it lacks the clinical context and the social factors that shape how a patient engages with care. Fairness in any model starts there. The first thing to address is the model's data.
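The health plan example can be sketched as a reconciliation between two sources. This is a hedged illustration, not the team's actual method: the function name, the identifier formats, and the clinic_to_member crosswalk (a mapping from clinic patient IDs to plan member IDs) are hypothetical. It compares the care gaps visible from claims alone with the gaps that remain once linked clinic screenings are counted, and it surfaces clinic records that could not be linked to any member.

```python
def reconcile_screening_gaps(eligible_members, claims_screened,
                             clinic_screened, clinic_to_member):
    """Compare care gaps seen in claims alone with gaps after linking clinic data.

    eligible_members: member IDs due for the screening
    claims_screened:  member IDs with a screening in claims
    clinic_screened:  clinic patient IDs with a screening in the clinic feed
    clinic_to_member: hypothetical crosswalk, clinic patient ID -> member ID
    """
    # Screenings from the clinic feed that can be tied to a plan member.
    linked = {clinic_to_member[pid] for pid in clinic_screened
              if pid in clinic_to_member}
    # Clinic records with no crosswalk entry: a connectivity problem in itself.
    unlinked = sorted(pid for pid in clinic_screened
                      if pid not in clinic_to_member)

    gaps_claims_only = set(eligible_members) - set(claims_screened)
    gaps_reconciled = gaps_claims_only - linked

    return {
        "gaps_claims_only": sorted(gaps_claims_only),
        "gaps_reconciled": sorted(gaps_reconciled),
        "closed_by_clinic_data": sorted(gaps_claims_only & linked),
        "unlinked_clinic_ids": unlinked,
    }


if __name__ == "__main__":
    report = reconcile_screening_gaps(
        eligible_members={"M1", "M2", "M3", "M4"},
        claims_screened={"M1"},
        clinic_screened={"C10", "C11", "C99"},
        clinic_to_member={"C10": "M2", "C11": "M3"},
    )
    print(report)
```

In this toy data, members who look non-compliant from claims alone drop out of the gap list once the clinic feed is joined, which mirrors the point of the example: the apparent gap was a missing connection, not missing care.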
Readiness Is a Capability, Not a Clean-Up
The most common error is to treat data readiness as a destination: the data has been cleaned, it has been integrated, and it is time to move to the next agenda item, the AI. But data is not static. Standards change, labs are modernized, feeds are adapted, and new systems are added. Everything shifts, and even a model validated yesterday can be out of date today.
Bhattacharjee's answer was to think of data readiness as a capability, not a project with a deadline: continuous data reconciliation and oversight, aimed at keeping models reliable and their integrity intact. This mindset is what separates systems that reach production from those that stall after prolonged pilots. The former invest in the integrity and reliability of their foundational systems, while the latter find themselves rebuilding for every new demand or use case.
The ability to withstand scrutiny is a good test of intent. Build your AI systems on the assumption that they will be audited in the near future, not only by regulators but also by the patients whose data is used to train them. This mindset requires transparency and detailed documentation, and it requires involving people at the design stage, since an audit would expect that as well.
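One way to picture continuous reconciliation is a recurring check that compares each new batch from a feed against a trusted baseline. The sketch below is an assumption-laden illustration rather than a prescribed approach: the function name, the record layout, and the max_null_increase threshold are hypothetical. It flags a field for review when its share of missing values rises noticeably or when values appear that the baseline never contained, such as a lab that starts reporting in a new unit.

```python
def feed_drift(baseline, current, field, max_null_increase=0.05):
    """Compare one field between a baseline batch and a current batch of records.

    Each batch is a list of dictionaries. The threshold is an illustrative default.
    """
    def null_rate(batch):
        # Share of records where the field is missing or empty.
        if not batch:
            return 0.0
        return sum(1 for r in batch if r.get(field) in (None, "")) / len(batch)

    def values(batch):
        # Distinct non-empty values observed for the field.
        return {r[field] for r in batch if r.get(field) not in (None, "")}

    null_delta = null_rate(current) - null_rate(baseline)
    new_values = values(current) - values(baseline)
    return {
        "null_rate_increase": round(null_delta, 4),
        "new_values": sorted(new_values),
        "needs_review": null_delta > max_null_increase or bool(new_values),
    }


if __name__ == "__main__":
    last_month = [{"unit": "mg/dL"}, {"unit": "mg/dL"}]
    this_month = [{"unit": "mg/dL"}, {"unit": "mmol/L"}, {"unit": ""}, {"unit": None}]
    print(feed_drift(last_month, this_month, "unit"))
```

A check like this does not fix anything by itself. It only makes a change in the feed visible early enough for someone to reconcile it before a downstream model silently consumes it.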
Make AI Everyone's Conversation
There is a cultural aspect that can make or break readiness work. When AI is confined to a data-science team, everyone else in the organization becomes a train passenger, watching the work happen without taking part. Yet care coordinators, claims processors, and patient advocates are the ones who experience the bottlenecks every day. Leaving them out not only hurts adoption, it also removes the ability to assess readiness from the perspective of those bottlenecks.
This is what makes the readiness question as much of a leadership question as a technology question. The organizations that listen, that include, and that empower across functions are the ones that succeed. They make data quality a collaborative effort, as opposed to a back-office task.
How Hutchins Approaches Data Readiness
Our work begins where the model conversation usually skips ahead: an unembellished assessment of whether your data can actually carry the use cases you have in mind. We examine source-system fidelity, the consistency of definitions, the connectivity between systems that need to join into a single patient view, and the compliance guardrails that govern how any of it can be integrated. We help organizations stand up the ongoing capability — not the one-time clean-up — that keeps that foundation trustworthy as it changes.
Readiness is not separate from data governance or from the clinical data platform that gives integrated data a home. It is the same body of work seen from the angle of a specific AI ambition, and it is the difference between a pilot that earns trust and one that confirms every reason to distrust AI.
These themes run throughout The Signal Room podcast, where practitioners working at the data layer of healthcare AI describe what readiness takes in practice.
Have a data or AI challenge like this?
A 30-minute call is enough to tell whether we're the right fit.
Frequently asked questions
What is healthcare data readiness?
The state in which clinical and operational data is complete, consistent, connected, and compliant enough to support a specific AI use case — assessed against that use case, not in the abstract.
Why do most healthcare AI initiatives stall?
Not because of the model. They stall because the underlying data is siloed across EHRs, claims systems, and spreadsheets, so the model is built on a partial and inconsistent picture of the patient.
How do you assess whether data is ready for AI?
Check four things against the intended use case: completeness (do you hold the data the model needs?), consistency (is it standardized across systems?), connectivity (do the sources join into one patient view?), and compliance (HIPAA, CMS, FHIR).
Is data readiness a one-time project?
No. Data changes every day as systems, codes, and feeds change. Readiness is an ongoing capability — continuous reconciliation and monitoring — not a clean-up you do once before a pilot.