Healthcare Data Privacy and AI: Designing Trust In From the Start
Why privacy is a core requirement for healthcare AI: privacy-by-design, data minimization, and the early controls that let teams move fast safely.
Featuring Andre Samokish on The Signal Room
In the Signal Room episode that inspired this article, Andre Samokish, a privacy and AI governance expert, opened bluntly: “If you collect garbage, you will have garbage.” He was talking about data, but the point applies more broadly. In healthcare, the data that is collected, often without a clear justification, is the same data that AI will learn from, reveal, and rationalize to the organization. Privacy is not a checklist item to address after an AI project has been planned and built. It is a determining factor in whether the project should go ahead at all.
In most contexts, and particularly in healthcare, AI will almost always process personally identifiable information. The first question likely to be asked about a new service or an AI-enhanced tool is “what will happen to the personal information that is input to the system?” Samokish argues that privacy and AI governance have converged into a single field of practice. This article focuses on privacy, the responsibilities that come with processing personal data, and especially the implications of AI. AI governance is addressed only where the two cannot be separated.
Privacy Is the Condition for Moving Fast, Not the Brake on It
Privacy is often thought to slow teams down, when in fact the opposite is true. Samokish’s point is that privacy controls are what allow a business function to keep running, not what stops it. If privacy controls are embedded in the design early, the team avoids disruption at launch, because it does not discover a privacy gap that has to be closed at the last minute.
His example is intentionally simple: a pop-up window informing users that they are communicating with an AI. Such a control is easy to build in during development and very difficult to add after deployment. The development phase is therefore the least disruptive and least costly time to address privacy controls. Retrofitting them once the product is deployed is not cost-effective.
Data Minimization Changes Healthcare Collection Paradigms
For years, the data collection mentality was to hoard and store everything and figure out the valuable pieces later. Samokish explains that data minimization forces anyone collecting data to think differently. You have to state a clear purpose for each data element you collect, explain how it will be used, and obtain consent from the data subjects.
This is a shift for healthcare, particularly with the evolving use of AI. More data does not automatically make a better model, but it does create more risk. Samokish provides a framework for evaluating data tools. Why is the tool needed? What will the tool do with the data? Is the data use necessary, or can it be avoided? These questions are much easier to answer while the idea is still being formed than after the data is already in motion.
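The purpose, explanation, and consent rule described above can be sketched as a simple gate in front of data collection. This is a minimal illustration, not a method from the episode: the `DataElement` schema, the field names, and the sample record are all hypothetical, and a real program would pull purposes and consent from its own governance tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One field a form or AI tool wants to collect (hypothetical schema)."""
    name: str
    purpose: str = ""       # why the element is collected
    usage_notice: str = ""  # plain-language explanation shown to the data subject


def justified_elements(elements, consented_names):
    """Split element names into those that may be collected and those that may not.

    An element is kept only if it has a stated purpose, a usage notice,
    and the data subject has consented to it.
    """
    keep, drop = [], []
    for el in elements:
        ok = (
            bool(el.purpose.strip())
            and bool(el.usage_notice.strip())
            and el.name in consented_names
        )
        (keep if ok else drop).append(el.name)
    return keep, drop


def minimize(record, keep):
    """Return a copy of the record containing only the approved fields."""
    return {k: v for k, v in record.items() if k in keep}


# Illustrative data only.
elements = [
    DataElement("date_of_birth", "dose calculation", "Used to calculate age-based dosing."),
    DataElement("home_address"),  # no purpose documented, so it is dropped
    DataElement("email", "appointment reminders", "Used to send reminders."),
]
keep, drop = justified_elements(elements, consented_names={"date_of_birth", "home_address"})
record = {"date_of_birth": "1980-01-01", "home_address": "1 Main St", "email": "a@example.org"}

print(keep)                    # ['date_of_birth']
print(drop)                    # ['home_address', 'email']
print(minimize(record, keep))  # {'date_of_birth': '1980-01-01'}
```

In this sketch, `home_address` is dropped because nobody documented why it is needed, and `email` is dropped because the subject did not consent to it, even though its purpose is documented. The gate runs before any data reaches storage or a model, so unjustified fields are never collected in the first place.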
Transparency Is Something You Publish, Not Just Promise
When talking about the ethical use of AI, Samokish put the most emphasis on being honest with everyone whose data you collect, whether they provide it voluntarily or not. He believes that even where the law does not require it, you should disclose on your website that an AI component exists, the logic behind it, the certifications you hold, and how you protect data. All of this should be written so that visitors to the website can easily understand it.
Samokish acknowledges that this kind of documentation requires different levels of translation for different audiences. He argues that technical employees should help build the documentation rather than simply being handed a document to read. Website visitors, customers, and employees deserve the same amount of effort. It is the job of your attorneys to speak the language of regulations. A common practice is to speak with them first and then translate their answers into something the rest of the organization can understand and use. If your disclosures are not written so that most of your employees can understand them, you are not being transparent.
Talk to Legal Before You Buy, Not After
One warning is aimed at organizations that buy AI rather than build it. The common belief is that a purchased service is inherently low risk. Samokish contends that the real risk resides in the use case. Some regulatory frameworks explicitly prohibit certain AI use cases or classify them as high-risk. Because a vendor's tool can be applied to a very risky use case, how the tool is marketed says little about the risk the organization takes on.
For risk reduction, Samokish recommends a somewhat boring, early-stage step. Before making a purchase, organizations should review the lists of high-risk and prohibited use cases, consult with legal, and establish how they would actually be allowed to use the tool in question. If the problem the organization is trying to solve is a high enough priority, those discussions may lead it to choose a different vendor or clarify which constraints to implement first. Many constraints, such as certification of a vendor's tool or registration in a particular system, take time to put in place. A vendor's assurance that everything is fine is a starting point to verify, not a conclusion to accept.
The Intersection of Privacy and AI Literacy: An Educational Concern
Samokish asserts that privacy and AI literacy are intrinsically linked. He divides AI literacy into three categories. The first is the technical side: a system's capabilities, how it functions, and the metrics that describe its performance. The second is the legal side: what one may and may not do, and which regulations do or do not apply. The third is the ethical side: the reasons for using the data and the ethical concerns surrounding it.
Retention is what makes this an educational issue rather than a documentation issue. He states that privacy training has often been limited to onboarding or an annual session. Such training was heavily security-focused, data privacy was treated as a minor topic, and as a result most people forgot what was taught. He proposes something completely different: short sessions of around fifteen minutes, each reinforcing one core topic, held as often as once a week. Such training should be engaging and focused. Because AI models are continually retrained, the traditional checklist approach built for slow-moving processes is no longer effective. Understanding has to keep pace with changes in the systems. For this reason, continuous education is a privacy control in its own right.
Build It as if You Will Be Audited Tomorrow
There are two operational points here. The first is tooling. Samokish, who is OneTrust certified, treats automation much as he treats AI. It helps most where you are doing the same task over and over, and it frees you and your team to spend time on judgment. Unlike spreadsheets, a single source of truth for vendors and risks, combined with reports and dashboards, gives a company an overview of its overall privacy posture. The trade-off is real: the tool will require heavy customization, because no two companies have the same workflows.
The second is reusability: work done for either privacy governance or AI governance can be reused to build the other. An organization can take many of the same controls, the same governance tool, the same collaboration patterns, and the same risk-assessment questions, and apply them to AI governance without starting from square one. Not all of the skills needed to extend governance into new territory are technical. Many are non-technical, such as managing projects, teaching, and giving constructive feedback. These are the skills he sees as most clearly required, and also the most easily learned.
How Hutchins Approaches Healthcare Data Privacy and AI
At Hutchins Data Strategy Consultants, we treat privacy as a design input to AI work rather than a clearance step at the end. That means helping organizations decide why each data element is collected and on what consent, building disclosure and explainability into products while they are still in the idea stage, and reusing the controls a privacy program already has so AI governance does not start from zero. This work sits alongside healthcare data governance and the broader practice of responsible AI in healthcare — related, but aimed specifically at the personal-data obligations that attach the moment AI touches a patient record. These themes run throughout The Signal Room podcast, where practitioners describe what privacy and governance take in real deployments.
Frequently asked questions
Why is data privacy a core requirement for healthcare AI?
Because AI systems run on data, and most of the data healthcare organizations hold is personal data. The moment someone touches a service or enters information, their first concern is what happens to it. Privacy is what earns the trust that lets the AI be used at all.
What is privacy by design in an AI project?
Building the controls, disclosures, and consent into a product early, while it is still in the idea stage, rather than bolting them on before launch. Done upfront, these controls let a team proceed with confidence instead of stopping to retrofit later.
What is data minimization and why does it matter for AI?
Data minimization means you do not collect data just in case. You need a defined reason for every data element, and you need to explain to customers how it will be used and obtain their consent. For AI it matters because models trained on data gathered without purpose carry exposure the organization never accounted for.
How do privacy controls and AI governance relate?
They overlap heavily because both touch data. An organization with privacy and data governance already in place can transfer many of its controls, assessment questions, and tooling to AI governance rather than building it from nothing.