Policy Forum
Jul 2007

Accountability via Chart Audits

Abraham P. Schwab, PhD
Virtual Mentor. 2007;9(7):503-507. doi: 10.1001/virtualmentor.2007.9.7.pfor1-0707.


In this short article, I will make a case for physician accountability via audit that moves away from some traditional conceptions of accountability. On my view, to hold some individual or organization accountable is to attempt to identify areas where improved performance is needed. I will start with some reasonable targets for medical practice and then discuss some strategies that can improve our aim. Of course, to identify the best strategies, we will have to take account of the current state of medicine.

My conception of accountability will not require an individual or organization to answer for past actions, and it will not threaten physicians with punitive or disciplinary responses for failure to act. My decision to move away from conceptions of accountability that do so is informed by social and cognitive psychology, which has shown that requiring justification or threatening punishment does not always mold behavior; indeed, in many cases, it entrenches confidence that what was done was correct [1].


Waiting patiently in an exam room, I always want the same things: accurate information and reliable predictions. Post-EKG, I want to hear that I don't have any clogged arteries—but only if I don't have any clogged arteries. When I'm told about my treatment options for taking care of my clogged arteries, I want the effectiveness and the odds of complications from an angioplasty or a stent placement to be reliable. If I'm going to get angioplasty, I want to be told how likely it is to keep me from having a heart attack. If I'm going to get a bare metal stent (BMS), I want to be told the chance of restenosis, and if I'm going to get a drug-eluting stent (DES), I want to be told that it requires longer antiplatelet therapy and includes a higher risk of late stent thrombosis. And again, I want all of these predictions to be reliable. Moreover, I want this same accuracy and reliability every time I'm in the doctor's office whether for a prescription, surgery, or the use of a medical device.

Obviously I am an involved patient decision maker, and I recognize that other patients may not want all this information and all these predictions. Many patients don't want to be bothered with the details of the decision-making process and trust their internist or cardiologist or pediatrician to make good decisions for them. These patients also want the physician to be able to make a reliable prediction about the chances of restenosis and late stent thrombosis, even if she's not sharing the information with them. What I'm trying to say is that accurate information and reliable predictions are cornerstones of patient autonomy and professional judgment. At present, however, the structure of medical research and medical bookkeeping hampers the accuracy of information and the reliability of predictions. Specifically, the failure to adequately audit patient medical records regarding treatment effectiveness and particular physicians' skills and judgment limits the reliability and accuracy of claims made in the medical encounter. Electronic medical records (EMRs), although not necessary, would be useful to address this oversight.

The Limits of Research

Randomized clinical trials (RCTs) are the gold standard for producing reliable predictions about treatment success. Several factors contribute to this status: (1) RCTs include large numbers of research subjects that (2) are followed for a specified time with constrained treatment alternatives, and (3) have built-in audits—the data is gathered specifically so that it can be analyzed. These features of RCTs give their conclusions more reliability than the judgments of any individual practitioner. Auditing multiple RCTs produces more reliable conclusions because it includes an even larger patient pool. Hence, meta-analyses of RCTs are more reliable than individual clinical trials. And yet, reliance on RCTs as the foundational unit of medical research limits the reliability, complexity, and speed of the conclusions. These limits arise from the prospective nature and the limited patient pool.

RCTs have a prospective nature because someone must decide on a hypothesis first. Only after a hypothesis has been produced is information gathered. Also, even though the size of the subject (patient) population in an RCT gives it an advantage over an individual physician's judgment, it is still limited by the funds available for the study, the exclusion criteria of the protocol, and the size of the patient population at that time.

Some moves have already been made to increase the size of the subject pool and the value of the information gathered. Take, for example, the PREMIER registry [2]. Starting in January 2003, this registry followed 2,500 patients from 19 states for one year after they suffered heart attacks. After the registry concluded in June 2004, analysis of the data illustrated that the 500 patients who had drug-eluting stents placed after their heart attacks were at greater risk for late stent thrombosis than had been previously thought. It appears that this risk can be attenuated through longer antiplatelet therapy, but this conclusion remains tentative. What makes this conclusion so exciting is that there was no need to design a study to look specifically for late stent thrombosis. Without this registry, who knows how many patients would have suffered significant harm from late stent thrombosis that neither patient nor physician could have reasonably known was a risk. In addition, the information collected by the registry can also be used to evaluate or produce other hypotheses—the information was not gathered solely to test a single hypothesis.

Still, registries are a slower method for producing accurate information and reliable predictions than an audit of electronic medical records (EMRs) would be. As with a registry, a researcher with a database of EMRs could access existing information, and his or her use of the database would not preclude others from doing so simultaneously. The widespread use of EMRs would provide an important opportunity to audit, and thus hold accountable, the medical establishment. Imagine if the conclusions of medical research were not limited by the number of research subjects but only by the accuracy of our statistical analysis. Rather than taking 1-1/2 years to follow registered patients, the researchers involved in the PREMIER registry could have simply looked back over existing records. Moreover, the use of EMRs would do more than increase the speed with which these conclusions could be reached; it would also increase the complexity of the conclusions that could be drawn. If all patient records were stored electronically, information could be gathered to identify a number of conclusions about the risk of late stent thrombosis from DESs (e.g., which other patient factors—age, complicating diseases, etc.—increase or decrease this risk).
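
As a rough illustration of the kind of retrospective question described above, the following is a minimal sketch, not a description of how any real EMR system or the PREMIER analysis works. It assumes records are already available as simple dictionaries; the field names (`stent`, `age_group`, `late_thrombosis`), the function `event_rate_by`, and the handful of records are all hypothetical and invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical, made-up records for illustration only; not real patient data.
records = [
    {"stent": "DES", "age_group": "under_65", "late_thrombosis": False},
    {"stent": "DES", "age_group": "under_65", "late_thrombosis": False},
    {"stent": "DES", "age_group": "65_plus", "late_thrombosis": True},
    {"stent": "DES", "age_group": "65_plus", "late_thrombosis": False},
    {"stent": "BMS", "age_group": "65_plus", "late_thrombosis": False},
]


def event_rate_by(records, key, stent_type="DES", outcome="late_thrombosis"):
    """Return the fraction of patients with the outcome, grouped by `key`.

    Only records for the given stent type are counted. This is a plain
    descriptive tally; it does no adjustment for confounding factors.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [events, total]
    for record in records:
        if record["stent"] != stent_type:
            continue
        counts[record[key]][1] += 1
        if record[outcome]:
            counts[record[key]][0] += 1
    return {group: events / total for group, (events, total) in counts.items()}


rates = event_rate_by(records, "age_group")
print(rates)
```

Because the tally runs over records that already exist, the same data could be regrouped by any other recorded factor (for example, a complicating disease) without enrolling or following a new cohort. A real audit would still need careful statistical adjustment, which this sketch leaves out.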

The Limits of Skill and Judgment

EMRs will also be a valuable tool in auditing individual physicians' practices (though paper records could be used). Let's begin with the assumption that no physician is perfect and that the competence for particular tasks, measured across groups of physicians, is uneven. Some are better at some things while others are better at other things. The nature of these uneven skills is most easily imagined in areas requiring technical skill, like surgery and the use of medical devices. For example, there are a number of ways a cardiologist can increase (or decrease) the risk of complications from stent placement. If his estimates of stent length or diameter are inaccurate, patients are more likely to suffer complications. Hence, when I'm making a judgment about a therapy involving stent placement by my cardiologist, it would be very helpful for me (and the cardiologist) to know that the average rate of restenosis for BMS is 25 percent, but that for this particular cardiologist, the rate is somewhat higher or lower. Patients are in a better position to make good decisions about particular treatments if they have a grip on the physician's specific skill set. Physicians are in a better position to make good decisions about the best way to care for patients and their needs for continuing education if they are aware of their skill set.
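
To make the comparison above concrete, here is a minimal sketch of how one cardiologist's restenosis rate might be set against the 25 percent average mentioned in the text. It is an illustration under stated assumptions, not a recommended audit method: the function names (`binomial_upper_tail`, `audit_physician`) and the counts (10 restenoses in 20 BMS placements) are hypothetical, and the exact binomial tail is just one simple way to ask whether a difference could plausibly be chance. It ignores differences in patient mix, which a real audit would have to address.

```python
from math import comb

# Average BMS restenosis rate cited in the text (25 percent).
BENCHMARK_RESTENOSIS_RATE = 0.25


def binomial_upper_tail(events, cases, rate):
    """Probability of at least `events` in `cases` if the true rate is `rate`."""
    return sum(
        comb(cases, i) * rate**i * (1 - rate) ** (cases - i)
        for i in range(events, cases + 1)
    )


def audit_physician(events, cases, benchmark=BENCHMARK_RESTENOSIS_RATE):
    """Summarize one physician's observed rate against a benchmark rate."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return {
        "observed_rate": events / cases,
        "benchmark": benchmark,
        # How likely a count this high would be if the physician's true
        # rate simply equaled the benchmark.
        "chance_of_at_least_this_many": binomial_upper_tail(events, cases, benchmark),
    }


# Hypothetical counts, invented for illustration only.
result = audit_physician(events=10, cases=20)
print(result)
```

A small tail probability suggests the gap is unlikely to be chance alone, which fits the purpose argued for here: identifying an area where improvement or continuing education may be needed, not assigning blame.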

Physicians could also be audited regarding their judgment. Just as technical skill is distributed unevenly across individuals, so is excellence in judgment. By evaluating a physician's record, specific areas where his or her judgment is lacking can be identified and addressed. Cognitive and social psychology have reached robust conclusions illustrating that judgment can be easily biased [3-9], but which physicians are subject to which biases at what frequency cannot be known a priori; it must be evaluated through the analysis of practice.

A Thin Line

Whether or not they involve the use of EMRs, audits that analyze technical skill and judgment will blur the line between medical practice and medical research. Every clinical encounter produces information that can be used to produce broad conclusions (e.g., the risk of late stent thrombosis from DES) and narrow descriptions (e.g., this physician's skill at diagnosing pneumonia). Should patients be offered the opportunity to opt out of such studies? Should they be included only if they opt in? Does this kind of research require institutional review board (IRB) approval? These are difficult practical questions that need sorting out, but they are not insurmountable. The case for analyzing physicians' technical skill and judgment and producing more robust conclusions is a strong one, and concerns about patients as research subjects are minimal in this case.

There is less reason to be concerned about this type of research than about other types because these analyses avoid the conflict produced by RCTs. RCTs aim to produce generalizable knowledge through strict adherence to a protocol of unknown effectiveness. This necessarily compromises individualized (and so presumably, optimal) patient care. Audits do not require physicians to adhere to strict protocols, but allow them to practice the best medicine they can. After they have cared for their patients, the research begins. Hence, there is no conflict between a particular physician's judgment about what is best for a patient and a research protocol.

There is also some precedent for limiting patient autonomy regarding participation in this research. In a parallel case, demanding patients' participation in medical education is grounded in the need to maintain a pool of trained physicians. Only if patients continue to be involved in medical education will this pool continue. Along these same lines, Rosamond Rhodes has endorsed a legal requirement that all individuals participate in medical research once every 10 years [10]. Importantly, the research she is discussing is research that is or is similar to RCTs. In both of these cases, patients are at risk for harm. When patients are involved in the medical education process, they risk harms through mistakes made because the physicians are less qualified. When patients become research subjects in RCTs, their care can be compromised by rigid protocols. Both sets of risks are absent in the process of accountability through auditing that I am suggesting.

The serious risk introduced by auditing is the potential loss of confidentiality—larger numbers of individuals will see patient records. It is worth noting, however, that the use of records (electronic or otherwise) that require confidentiality is already widespread. Consider, for example, the fact that prescription information is available at all Walgreens pharmacies, regardless of which Walgreens pharmacy originally filled the prescription (disclosure: my spouse works for Walgreens). Given this widespread use of electronic records in RCTs, nonmedical government programs, and businesses in general, it seems that maintaining confidentiality is simply a question of resource allocation.

Finally, some may be concerned that physician-specific information will simply shuttle the poorest and most vulnerable patients to the worst physicians. Two responses allay this worry: (1) this happens already, so this is not a valid criticism of the recommendations here, but of the system on the whole, and (2) even though it may seem comforting to have everyone play the lottery (and not know how good or bad any particular physician is), the best means to improve health care generally is to improve the care provided by every individual medical practitioner. The most effective way to do this is through audits that identify areas in need of improvement.


  1. Lerner JS, Tetlock PE. Accounting for the effects of accountability. Psychol Bull. 1999;125(2):255-275.
  2. Spertus JA, Kettelkamp R, Vance C, et al. Prevalence, predictors, and outcomes of premature discontinuation of thienopyridine therapy after drug-eluting stent placement: results from the PREMIER registry. Circulation. 2006;113(24):2803-2809.
  3. Tversky A, Kahneman D. Judgment under uncertainty: heuristics and biases. Science. 1974;185(4157):1124-1131.
  4. Kahneman D, Tversky A. Prospect theory: an analysis of decision under risk. Econometrica. 1979;47(2):263-292.
  5. Koehler DJ. Explanation, imagination, and confidence in judgment. Psychol Bull. 1991;110(3):499-519.
  6. Kühberger A. The influence of framing on risky decisions: a meta-analysis. Organ Behav Hum Decis Process. 1998;75(1):23-55.
  7. Lerner JS, Keltner D. Beyond valence: toward a model of emotion-specific influences on judgment and choice. Cognition and Emotion. 2000;14(4):473-493.
  8. Shafir E, LeBoeuf RA. Rationality. Ann Rev Psychol. 2002;53:491-517.
  9. Salovey P, Williams-Piehota P. Field experiments in social psychology. Am Behav Sci. 2004;47(5):488-505.
  10. Rhodes R. Rethinking research ethics. Am J Bioeth. 2005;5(1):7-28.






This article was improved through discussion at the 4th International Conference on Ethical Issues in Biomedical Engineering sponsored by SUNY Downstate Medical Center and Polytechnic University. This work was supported in part by a grant from The City University of New York PSC-CUNY Research Award Program.

The viewpoints expressed in this article are those of the author(s) and do not necessarily reflect the views and policies of the AMA.