Comparison of proprietary and fine-tuned large language models for multi-label classification of billing codes from radiology reports
March 14, 2026
Objective:
To automate the classification of GOÄ billing codes from unstructured radiology reports using a fine-tuned large language model and to compare its performance with proprietary commercial systems, such as [insert names].
Key Findings:
- The fine-tuned 4-billion-parameter LLM demonstrated competitive accuracy in classifying GOÄ codes, achieving a [insert specific percentage] improvement over traditional manual coding methods.
- Locally deployable open-source models were evaluated as a privacy-compliant alternative to cloud-based proprietary models.
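Assigning GOÄ codes is a multi-label task: a single report can warrant zero, one, or several codes at once. A minimal sketch of that decision step is shown below, assuming a classifier head that outputs one probability per candidate code; the code list and threshold are illustrative, not taken from the study.

```python
# Hypothetical candidate GOÄ codes; the real label set is study-specific.
GOA_CODES = ["5700", "5715", "5731", "5733"]

def select_codes(probs, threshold=0.5):
    """Return every code whose predicted probability reaches the threshold.

    Unlike single-label argmax, a multi-label decision can assign
    zero, one, or several billing codes to the same report.
    """
    return [code for code, p in zip(GOA_CODES, probs) if p >= threshold]

# Example: a report predicted to warrant two billing codes.
print(select_codes([0.91, 0.12, 0.78, 0.05]))  # → ['5700', '5731']
```

The per-code threshold, rather than a single argmax, is what distinguishes the multi-label setting evaluated here from ordinary single-label classification.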
Interpretation:
The study suggests that fine-tuned LLMs can effectively automate billing code classification, reducing human error and improving the efficiency of medical billing for healthcare providers.
Limitations:
- The dataset exhibited class imbalance, which may affect model performance and generalizability.
- The study focused solely on German-language radiology reports, limiting applicability to other languages and billing systems.
Conclusion:
Automating billing code classification with LLMs can enhance billing accuracy and efficiency in healthcare, addressing key shortcomings of current manual processes.