MedAI Vision Language Models & Fine-Tuning KnowAda

Video: http://youtube.com/watch?v=6Dc8Tny4agE

Smaller VLMs hallucinate. A new countermeasure: Knowledge-Adapted Fine-Tuning (KnowAda), a novel approach to mitigating hallucinations in vision-language models (VLMs) when they generate dense image captions. Reducing hallucinations in multimodal models through new adaptive training.

• Traditional fine-tuning methods often leave smaller-scale VLMs (up to 7B parameters) struggling to balance descriptiveness with factual accuracy, especially on visually complex datasets.
• KnowAda addresses this by probing the VLM's knowledge with generated visual questions to identify areas of uncertainty (classified using a difficulty threshold T). Captions are then automatically adapted to exclude unreliable details while preserving rich descriptions.
• Decomposed NLI (DNLI) is a proposition-based evaluation framework that decomposes captions into atomic claims and evaluates each claim for entailment, contradiction, or neutrality against ground-truth descriptions. These innovations, validated across diverse datasets and models, demonstrate a significant reduction in hallucination rates without sacrificing descriptiveness.
• Key findings include the consistent superiority of KnowAda in achieving high descriptiveness precision and reduced contradiction rates compared to baseline methods such as naive caption simplification or trimming. DNLI also correlates strongly with human evaluations, which makes it a robust alternative to traditional metrics and gives nuanced insight into model performance.
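The probing, caption adaptation, and DNLI scoring described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in this description: difficulty is taken to be the fraction of sampled VLM answers judged incorrect, each question is linked to one caption sentence, and the adaptation step is replaced by a simple sentence filter rather than a model-based rewrite. The names `ProbeQuestion`, `difficulty`, `adapt_caption`, and `dnli_rates` are hypothetical, and the NLI labels are assumed to come from an external model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ProbeQuestion:
    text: str            # generated visual question
    sentence_index: int  # caption sentence it was derived from (assumed bookkeeping)
    correct: list        # one bool per sampled VLM answer: was it judged correct?


def difficulty(q: ProbeQuestion) -> float:
    """Assumed difficulty measure: fraction of sampled answers that were wrong."""
    return 1.0 - sum(q.correct) / len(q.correct)


def adapt_caption(sentences, questions, threshold):
    """Drop every sentence tied to a question whose difficulty exceeds the threshold.

    This filter stands in for the caption rewriting step; it is a simplification.
    """
    unknown = {q.sentence_index for q in questions if difficulty(q) > threshold}
    return [s for i, s in enumerate(sentences) if i not in unknown]


def dnli_rates(labels):
    """Share of atomic claims per NLI label (labels assumed to be precomputed)."""
    counts = Counter(labels)
    return {k: counts[k] / len(labels)
            for k in ("entailment", "contradiction", "neutral")}


# Hypothetical example data, invented purely for illustration.
sentences = [
    "A doctor reviews a chest X-ray.",
    "The monitor shows a heart rate of 72.",
    "A stethoscope hangs on the wall.",
]
questions = [
    ProbeQuestion("What is the doctor reviewing?", 0, [True, True, True, False]),
    ProbeQuestion("What heart rate is shown?", 1, [False, False, True, False]),
    ProbeQuestion("What hangs on the wall?", 2, [True, True, False, False]),
]

adapted = adapt_caption(sentences, questions, threshold=0.5)
rates = dnli_rates(["entailment", "entailment", "contradiction", "neutral"])
```

With a threshold of 0.5, only the heart-rate sentence (difficulty 0.75) is removed; lowering the threshold removes more detail, so T trades descriptiveness against the risk of training on details the model cannot perceive.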
• All rights with the authors: "Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions", https://arxiv.org/pdf/2411.09018

00:00 VLM 8B in MedAI
00:50 Identify the problem
02:17 Insights by Johns Hopkins Univ
06:40 Fine-Tuning depends on Pre-Training
08:57 Fine-Tune Medical VLMs
11:14 Knowledge Adapt (KnowAda)
15:05 Limit of medical VLMs
19:03 New Solutions
20:30 We need new Pre-Trained Models
24:28 Delay of new VLMs (OpenAI)

#medical #airesearch #medai #educational
