How Probabilistic Drafting Is Reshaping Clinical Decisions
AI now writes the clinical note first, and the clinician edits and signs it. Why probabilistic drafting shifts authorship while liability stays put, the difference between loud hallucinations and silent omission drift, and how to keep the human author in charge.
First published in The AI Health Pulse. Also on LinkedIn.
Clinical notes are increasingly written first by machines. An AI tool listens to the visit or reads the chart to help a clinician write a draft, which the clinician then edits and signs. The AI tool produces fluently drafted notes. However, since the tool is probabilistic, it drafts the notes based on a prediction of the next words, rather than a fact-based account of what really took place. Smooth drafts do not eliminate the risk that the tool generates a note that contains a significant error.
An analysis of drafts revealed two primary issues. First, hallucinations, or statements for which there is no support from the visit, are problematic. Many drafts omit important visit-related details. Errors in note drafts are especially concerning if the omissions are in the plan for follow-up and treatment. For example, a draft may appear accurate and polished, but may be wrong in the exact place that a reader, such as a clinician, will depend on it the most.
Something important changed when drafting moved to the machine, yet almost no one named it. For generations, the clinician was the author of the note. They decided what to record and what it meant. Now a probabilistic system writes the first version and the clinician becomes an editor of a text they did not compose. The work shifted from creation to review.
What did not change is the liability. The signature at the bottom of the note means the clinician is signing the document believing the record is correct. That draft did not come from a model to lessen the standard. Responsibility remained in the same place and the clinician is liable for the words a system created under circumstances that make verifying those words more difficult than the words the clinician wrote.
The loud failure and the quiet failure
The two ways a probabilistic draft fails should be considered separately since they do not have the same level of visibility. A hallucination (loud failure) is when the draft contains an assertion that contradicts an event. If a clinician is paying attention to the draft, the assertion can be removed since a false statement contradicts the memory of the visit.
An omission (quiet failure) is more dangerous since nothing is there to alert the user. The draft contains a statement that is supposed to contain an assertion, and there is nothing to alert the user of an error. The draft remains clear, and it is only a failure to communicate that should be noticeable. A clinician who is under time pressure and reviewing what should be a clear draft has nothing to respond to, since the error is an absence.
Why the silent drift is the real threat
Perhaps the most painful aspect of drift is that it is nearly invisible as it slowly decays a system. One note that lacks a detail is likely to be overlooked. The erosion of a system over time goes undetected, and it happens every time there is a failure of the system that is not visible to anyone. It is even more unfortunate that people learn to adapt to the failure of a system. As the system continues to decay, people learn to expect less from the system.
The true cost of drift is always deferred and reveals itself at the most inconvenient time. A clinician has a patient presented to them with a lack of detail and is satisfied with the lack of detail in the presentation. Finally, a reviewer is also satisfied with a lack of detail. The drift has become the new normal, and the quality of the system is so poor that it is no longer evident to anyone that there is a lack of detail.
Bringing Drafts Under Control
We should not eliminate the tools, as they provide great time savings and reduce a big burden. The goal is to keep the human element of authorship and to keep drafts as drafts. This means that, as time allows, we should check the draft against the source and/or the transcript to see how frequently the draft diverges. Ultimately, we should not assume that the draft diverges infrequently.
Sampling should focus most on the plan in the draft. Errors in the plan are more harmful than other types of errors. Then, the organization can establish some sense of the limits of the draft, and decide to take action when the limits are reached. There should be one person, as opposed to a dispersed group of committees, responsible for the accuracy of AI drafted documents. This way, when the document starts to drift, there is a person responsible for keeping it on track. None of this is new or innovative. Quality control has been applied to a step that became automated, and before anyone had built controls for this process.
Keep the Author in the Chair
Drafting may have inherent variability, but clinicians will benefit from it, as it nets some efficiency during an often-worn, energy-draining task. In these instances, clinicians will get more of their time back. The variability, or the drafting tool itself, is not the problem. The problem lies with an organization deciding that the drafting tool does not need time with a supervisor.
Clinicians will be the authors if the organization manages the drafting variability well. The supervisors will be the authors if the organization decides variability is the standard. The supervisors will be authors of well-drafted, but fictional records.
Christopher Hutchins Founder and CEO, Hutchins Data Strategy Consultants
One signal a week. No noise.
Join healthcare leaders reading The AI Health Pulse every Monday.
Facing a challenge like this in your own system?
See how we approach healthcare AI consulting and data and analytics strategy, or book a call.