Use of a Large Language Model to Assess Clinical Acuity of Adults in the Emergency Department is a research paper published in JAMA Network Open (2024). On theSindex it has a DataRank of 0.694. It has been cited 101 times.
Importance: The introduction of large language models (LLMs), such as Generative Pre-trained Transformer 4 (GPT-4; OpenAI), has generated significant interest in health care, yet studies evaluating their performance in a clinical setting are lacking. Determination of clinical acuity, a measure of a patient's illness severity and level of required medical attention, is one of the foundational elements of medical reasoning in emergency medicine.

Objective: To determine whether an LLM can accurately assess clinical acuity in the emergency department (ED).

Design, setting, and participants: This cross-sectional study identified all adult ED visits from January 1, 2012, to January 17, 2023, at the University of California, San Francisco, with a documented Emergency Severity Index (ESI) acuity level (immediate, emergent, urgent, less urgent, or nonurgent) and with a corresponding ED physician note. A sample of 10 000 pairs of ED visits with nonequivalent ESI scores, balanced for each of the 10 possible pairs of 5 ESI scores, was selected at random.

Exposure: The potential of the LLM to classify acuity levels of patients in the ED based on the ESI across 10 000 patient pairs. Using deidentified clinical text, the LLM was queried to identify the patient with a higher-acuity presentation within each pair based on the patients' clinical history. An earlier LLM was queried to allow comparison with this model.

Main outcomes and measures: Accuracy score was calculated to evaluate the performance of both LLMs across the 10 000-pair sample. A 500-pair subsample was manually classified by a physician reviewer to compare performance between the LLMs and human classification.

Results: From a total of 251 401 adult ED visits, a balanced sample of 10 000 patient pairs was created wherein each pair comprised patients with disparate ESI acuity scores. Across this sample, the LLM correctly inferred the patient with higher acuity for 8940 of 10 000 pairs (accuracy, 0.89 [95% CI, 0.89-0.90]). Performance of the comparator LLM (accuracy, 0.84 [95% CI, 0.83-0.84]) was below that of its successor. Among the 500-pair subsample that was also manually classified, LLM performance (accuracy, 0.88 [95% CI, 0.86-0.91]) was comparable with that of the physician reviewer (accuracy, 0.86 [95% CI, 0.83-0.89]).

Conclusions and relevance: In this cross-sectional study of 10 000 pairs of ED visits, the LLM accurately identified the patient with higher acuity when given pairs of presenting histories extracted from patients' first ED documentation. These findings suggest that the integration of an LLM into ED workflows could enhance triage processes while maintaining triage quality and warrants further investigation.
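The headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below enumerates the unordered pairs of 5 ESI levels and computes the accuracy for 8940 correct out of 10 000 with a normal-approximation 95% interval. The abstract does not state how its confidence intervals were derived, so the interval method here is an assumption, and the function names are illustrative only.

```python
from itertools import combinations
from math import sqrt

# ESI levels 1-5: immediate, emergent, urgent, less urgent, nonurgent.
ESI_LEVELS = [1, 2, 3, 4, 5]


def esi_pairs(levels=ESI_LEVELS):
    """Return every unordered pair of distinct ESI levels."""
    return list(combinations(levels, 2))


def accuracy_with_ci(correct, total, z=1.96):
    """Accuracy with a normal-approximation (Wald) interval.

    Assumption: the paper's CI method is not given in the abstract;
    the Wald interval is used here only as a plausibility check.
    """
    p = correct / total
    half_width = z * sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


pairs = esi_pairs()
acc, lo, hi = accuracy_with_ci(8940, 10_000)
print(len(pairs), round(acc, 3), round(lo, 2), round(hi, 2))
```

Under this assumption the interval rounds to 0.89-0.90, which agrees with the reported value, and the pair count confirms the "10 possible pairs" in the study design.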
FAIR checklist signals are shown for context only and do not affect DataRank scoring.
Base Score Contribution: 0.694 (from this paper's citation signal)
Citation Network Contribution: 0 (citation network not refreshed for this result)
This paper's DataRank is currently driven only by its base citation score.
DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 100% comes from its base citations and 0% from the citation network.
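The percentage split can be reproduced from the two contributions listed for this paper. The following is a minimal sketch that assumes the two contributions simply add up to the DataRank; the actual blending formula is not given on this page, and `contribution_shares` is a hypothetical helper.

```python
def contribution_shares(base, network):
    """Return (base_share, network_share) as fractions of the total.

    Assumption: DataRank is treated as base + network. The real
    blending formula is not documented in this text.
    """
    total = base + network
    if total == 0:
        return 0.0, 0.0
    return base / total, network / total


# Values shown for this paper: base 0.694, network 0.
base_share, network_share = contribution_shares(0.694, 0.0)
print(f"{base_share:.0%} base, {network_share:.0%} network")
```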
Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
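The sort-then-cap step can be sketched locally, without calling OpenAlex. The example below assumes each citer is a dict with `id` and `cited_by_count` keys (the field names OpenAlex uses on work records); the cap value, the sample records, and the tie-break on `id` are illustrative assumptions, not the site's documented behavior.

```python
def top_citers(citers, cap):
    """Keep the `cap` most-cited citers in a deterministic order.

    Sorting by descending cited_by_count mirrors `cited_by_count:desc`.
    The secondary sort on `id` is an assumption added here so that ties
    resolve the same way on every run.
    """
    ranked = sorted(citers, key=lambda w: (-w["cited_by_count"], w["id"]))
    return ranked[:cap]


# Hypothetical citing works, for illustration only.
sample = [
    {"id": "W1", "cited_by_count": 5},
    {"id": "W2", "cited_by_count": 40},
    {"id": "W3", "cited_by_count": 40},
    {"id": "W4", "cited_by_count": 12},
]

kept = top_citers(sample, cap=3)
print([w["id"] for w in kept])
```

Because the ordering is total, the same citers survive the cap regardless of the order in which records arrive, which is what makes a capped score repeatable.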