Calibrating Reliance: How Clinicians Use AI-Generated Explanations in Diagnostic Decisions
We report on a mixed-methods study examining how confidence intervals embedded in AI-generated explanations shift diagnostic decisions in clinical radiology. Across 84 radiologists at two hospitals, we find that the dominant failure mode under ambiguity is not over-reliance but systematic under-reliance: clinicians discount well-calibrated AI outputs when explanation length exceeds working-memory thresholds. We articulate three design principles for reliance calibration: temporal pacing, evidence framing, and contextual disagreement surfacing. We discuss implications for clinical AI deployment and explanation interface design.
Why HCAI · Classification Reasoning
This paper satisfies several strong-relevance criteria simultaneously: the study object is real human interaction with an AI decision-support system; it presents a mixed-methods empirical study with 84 clinical users; it focuses on core research questions including trust calibration, explanation understanding, and reliance behavior; and it derives interpretable UX-oriented design principles. The abstract does not skew toward algorithm benchmarking and covers several dimensions of the work: the user experiment, the AI system type, and the evaluation metrics (trust, reliance, cognitive load). Across the 7 scoring dimensions, the paper receives a score of 94.