eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Applying Medical Language Models to Medical Image Analysis

Abstract

Medical image analysis powered by deep learning computer vision models has advanced significantly over the past decade. Deep learning models have demonstrated remarkable capabilities in a wide range of tasks, including medical image classification, detection, and segmentation. However, the limited availability of annotations remains a persistent challenge: annotating medical images requires specialized professional knowledge, which makes the process costly. This dissertation aims to reduce the reliance on medical image annotations by directly leveraging medical reports, which usually accompany the corresponding medical images and are readily available. It investigates the application of vision-language models, including large vision-language models, to medical image analysis. Existing vision-language models are modified and applied to three critical tasks: disease diagnosis, disease segmentation, and medical report generation. The main contributions are: (1) proposing two prompting strategies that improve the accuracy of disease diagnosis through visual question answering with large vision-language models; (2) introducing a disease segmentation model that uses medical reports as weak supervision; and (3) evaluating medical large vision-language models for hallucination in generated reports across multiple complex diseases, and applying existing techniques to mitigate diagnostic errors in the generated reports.
