CRAJ Spring 2025

Spring 2025 (Volume 35, Number 1)

AI-Powered Documentation in Rheumatology: Evaluating the Benefits and Limitations of Ambient Scribes

By Ramandip Singh, MD, FRCPC

CLINICAL DOCUMENTATION CHALLENGES
Clinical documentation in the electronic health record (EHR) era presents substantial challenges, impacting healthcare efficiency and the quality of patient care. The shift from paper-based records to digital systems has increased administrative burdens, diverting clinicians' focus from patient interactions. Ambient scribes, AI-powered tools that passively listen to clinician-patient conversations and generate real-time clinical notes, offer a promising solution to these challenges. Over the past year, I have implemented and evaluated ambient scribes in my rheumatology practice, examining their impact across multiple domains. Studies assessing their effectiveness have focused on key parameters, including accuracy, patient satisfaction, clinician satisfaction, documentation time, and privacy compliance. While these tools hold significant promise, their performance varies across these metrics.

EVALUATING AMBIENT SCRIBES: KEY PERFORMANCE DOMAINS

Accuracy
The ambient scribes I tested produced generally reliable notes, particularly for structured follow-up visits involving conditions like inflammatory arthritis and polymyalgia rheumatica. Even in cases with well-defined but systemically involved diseases such as vasculitis, the AI generated high-quality notes.

However, when extensive symptomatology was present alongside an unclear diagnosis, the ambient scribe often included unnecessary information (“note bloat”) or omitted critical details. These notes required careful review and manual refinement to ensure accuracy and clinical relevance.

Some challenges previously identified in earlier digital scribe implementations—such as lack of standardization and difficulty adapting to nuanced clinical language—have been partially addressed in newer systems through customizable note templates and improved natural language models.

Documentation Time
Ambient scribes generally reduce documentation time by minimizing the need for manual note-taking and streamlining clinical workflows.

In a large-scale study, physicians using ambient scribes saved approximately one hour per day, based on over 300,000 patient encounters during a 10-week period (Trivedi et al., 2024). Celi et al. noted a 20.4% reduction in time spent on notes per appointment and a 30.0% decrease in after-hours work per day.

However, efficiency gains may be limited in more complex encounters, where AI-generated notes often require extensive review, editing, and correction to ensure clinical accuracy and relevance. This underscores the need for continued improvement in contextual accuracy and adaptability.

Clinician Satisfaction
Reducing the need for constant typing or jotting of notes has allowed for more natural and engaging conversations with my patients. Several studies have demonstrated non-time-related benefits of ambient scribes, including a significant reduction in clinician burnout (Misurac et al., 2024). Kane et al. (2024) also observed reduced task load and cognitive burden among clinicians, suggesting that ambient scribes may alleviate the mental demands of documentation.

These improvements suggest that clinician satisfaction is supported not just by saved time, but by ambient scribes' ability to reduce documentation-related stress and improve day-to-day work experience.

Patient Satisfaction
While I have not formally measured patient satisfaction in my own practice, my clinical impressions align with published findings: patients appear more engaged, and interactions feel less disrupted by documentation tasks. A study evaluating patient experiences with ambient scribes found that 71% of patients reported spending more time speaking with their physician, and 81% observed that their physician spent less time looking at the computer screen compared to previous visits (Tierney et al., 2024).

Notably, all patients stated that the ambient scribe either had no effect or enhanced their visit, and all reported feeling neutral to very comfortable about an AI tool being used during their visit.

Privacy Compliance
Ambient scribe vendors operating in Canada—such as Scribeberry and Heidi—comply with federal and provincial health privacy regulations, and their servers are physically located within Canada, ensuring data residency. However, data residency is not the same as data sovereignty. If a vendor uses U.S.-based infrastructure or is owned by a U.S.-domiciled company, Canadian data may still be subject to the U.S. CLOUD Act, which allows U.S. authorities to access data held by American companies—even if stored abroad.

While this concern extends beyond ambient scribes alone, it highlights the need for a clear national framework to address data sovereignty and the legal reach of foreign jurisdictions. Oversight of cloud infrastructure and vendor ownership will be key to protecting the privacy of Canadian health information.

LOOKING AHEAD: OPTIMIZING AI SCRIBES IN CLINICAL PRACTICE

Establishing a Robust Evaluation Framework
While traditional metrics like accuracy, documentation time, and user satisfaction offer useful insights, additional measures—such as organization, internal consistency, completeness, coherence, relevance, efficiency, and error rate—could provide a more comprehensive evaluation of AI-generated clinical documentation.

Several tools exist for structured assessment, including PDQI-9 (Physician Documentation Quality Instrument) and DeepScore (a proprietary quality metric developed by DeepScribe, an ambient scribe vendor, for internal quality review).

Although PDQI-9 effectively evaluates documentation quality across attributes like clarity, completeness, and organization, it does not assess real-world physician satisfaction or the usability challenges related to electronic documentation systems (Stetson et al., 2008). Currently, there is no universal standard for evaluating ambient scribes, and many vendor-developed metrics remain proprietary. Establishing a transparent, standardized evaluation framework will be essential to ensure these tools meaningfully enhance clinical workflows.
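To make the idea of structured assessment concrete, the following is a minimal sketch of how PDQI-9 ratings might be tallied when comparing notes. It assumes the commonly described form of the instrument: nine attributes (up-to-date, accurate, thorough, useful, organized, comprehensible, succinct, synthesized, internally consistent), each rated from 1 to 5, for a total of 9 to 45. The class name, function name, and all ratings shown are hypothetical and for illustration only; they are not drawn from any study or vendor tool.

```python
from dataclasses import dataclass, fields
from statistics import mean


@dataclass(frozen=True)
class Pdqi9Rating:
    """One reviewer's rating of one note (hypothetical structure).

    Assumption: each attribute is scored from 1 (worst) to 5 (best).
    """
    up_to_date: int
    accurate: int
    thorough: int
    useful: int
    organized: int
    comprehensible: int
    succinct: int
    synthesized: int
    internally_consistent: int

    def __post_init__(self) -> None:
        # Reject any score outside the assumed 1-5 range.
        for f in fields(self):
            score = getattr(self, f.name)
            if not 1 <= score <= 5:
                raise ValueError(f"{f.name} must be between 1 and 5, got {score}")

    def total(self) -> int:
        # Sum of the nine attribute scores (range 9 to 45).
        return sum(getattr(self, f.name) for f in fields(self))


def mean_total(ratings: list[Pdqi9Rating]) -> float:
    """Average total score across several rated notes."""
    return mean(r.total() for r in ratings)


# Hypothetical ratings: an unedited AI draft versus the clinician-edited note.
draft = Pdqi9Rating(4, 4, 3, 4, 3, 4, 2, 3, 4)
edited = Pdqi9Rating(4, 5, 4, 4, 4, 4, 4, 4, 5)
print(draft.total(), edited.total())
```

Comparing totals for the unedited draft and the edited note is one simple way to quantify how much manual refinement a scribe's output needed, although a single total can hide which attribute (for example, succinctness in a bloated note) drove the difference.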

Enhancing EHR Integration
Although current ambient scribes function primarily as passive documentation tools, their utility could be significantly expanded through deeper EHR integration and context-aware enhancements. Ideally, these systems would access relevant lab results, medication histories, and previous visit summaries in real time—reducing manual data retrieval and supporting clinical decision-making. While some ambient scribes offer limited EHR integration, a fully interoperable solution capable of retrieving and querying patient data would maximize efficiency.

In addition to accessing prior data, context-aware note optimization could enable AI to organize information more coherently, resulting in notes that better reflect clinical reasoning and narrative flow.

The Need for Adaptive Healthcare Leadership
Healthcare organizations must take an active role in implementing and evaluating ambient scribe technologies. Their success will depend on continuous assessment of effectiveness, safety, and usability—alongside close collaboration between clinicians, technology vendors, and regulatory bodies to meet specialty-specific needs.

Project Athena, a national initiative defining the next-generation Canadian rheumatology informatics platform, highlights the importance of institutional leadership. A coordinated framework that supports ongoing evaluation and engagement will be key to ensuring these tools integrate seamlessly into clinical practice.

Conclusion
Ambient scribes represent a significant step forward in medical documentation, helping to reduce administrative burdens and improve patient interactions. My experience aligns with published research—these tools enhance workflow efficiency and reduce cognitive load, but their effectiveness depends on accuracy, integration, and adaptability. Future success will hinge on continuous refinement, collaboration, and standardized evaluation. As AI-powered documentation tools evolve, their value in clinical practice will depend on thoughtful implementation and stronger integration with existing healthcare systems.

Ramandip Singh, MD, FRCPC
Winnipeg, Manitoba

Reading List

1. van Buchem MM, Boosman H, Bauer MP, et al. The digital scribe in clinical practice: a scoping review and research agenda. NPJ Digit Med. 2021 Mar 26;4(1):57.

2. Tierney AA, Gayre GG, Hoberman B, et al. Ambient Artificial Intelligence Scribes to Alleviate the Burden of Clinical Documentation. NEJM Catal Innov Care Deliv 2024;5(3). DOI: 10.1056/CAT.23.0404

3. Stetson PD, Morrison FP, Bakken S, et al. Preliminary development of the physician documentation quality instrument. J Am Med Inform Assoc 2008;15:534–41. (Physician Documentation Quality Instrument-9 [PDQI-9]: a validated tool for assessing the quality of clinical documentation.)

4. Oleson J. DeepScore: A Comprehensive Approach to Measuring Quality in AI-Generated Clinical Documentation. Available at https://arxiv.org/html/2409.16307v1. Accessed February 24, 2025.

5. Canadian Rheumatology Association. Project Athena Update: AI Scribes. CRAJ Winter 2024;34(4):6.