Computer Science - Artificial Intelligence; Computer Science - Computation and Language
Cancer stage classification is important for making treatment and care
management plans for oncology patients. Information on staging is often
included in unstructured form in clinical, pathology, radiology and other
free-text reports in the electronic health record system, so parsing and
extracting it requires extensive work. To facilitate the extraction of this
information, previous NLP approaches have relied on labeled training datasets,
which are labor-intensive to prepare. In this study, we demonstrate that without any
labeled training data, open-source clinical large language models (LLMs) can
extract pathologic tumor-node-metastasis (pTNM) staging information from
real-world pathology reports. Our experiments compare these LLMs with a
BERT-based model fine-tuned on labeled data. Our findings suggest that while
LLMs still exhibit subpar performance on Tumor (T) classification, with
appropriate prompting strategies they can achieve comparable performance on
Metastasis (M) classification and improved performance on Node (N)
classification.
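As an illustration of what zero-shot staging extraction can look like, the following is a minimal sketch and not the method used in the study. The label sets, the prompt wording, and the function names `build_prompt` and `parse_answer` are all assumptions made for this example; the abstract does not specify the prompts, the models, or the label granularity. The sketch only builds a prompt and parses a model's free-text answer, and leaves the model call itself out.

```python
import re

# Hypothetical label sets (an assumption for illustration only).
LABELS = {
    "T": ["T1", "T2", "T3", "T4"],
    "N": ["N0", "N1", "N2", "N3"],
    "M": ["M0", "M1"],
}


def build_prompt(report: str, category: str) -> str:
    """Build a zero-shot prompt asking for one pTNM category (T, N, or M)."""
    options = ", ".join(LABELS[category])
    return (
        "You are given a pathology report.\n"
        f"Report:\n{report}\n\n"
        f"Question: What is the pathologic {category} stage? "
        f"Answer with exactly one of: {options}.\n"
        "Answer:"
    )


def parse_answer(generated: str, category: str):
    """Return the first valid label found in the model output, or None."""
    # Match any allowed label that is not followed by another digit.
    pattern = "(" + "|".join(LABELS[category]) + r")(?!\d)"
    match = re.search(pattern, generated)
    return match.group(1) if match else None
```

In use, the string returned by `build_prompt` would be sent to whichever LLM is being evaluated, and `parse_answer` would map its generated text back to a label; outputs with no recognizable label come back as `None` and can be counted as errors.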
Details
Title
Classifying Cancer Stage with Open-Source Clinical Large Language Models