Tag - Multimodal - 小熊的小站

06-17

Structural Graph Probing of Vision-Language Models

06-11

Q-MLLM: Vector Quantization for Robust Multimodal Large Language Model Security

06-01

Hallucination as Exploit: Evidence-Carrying Multimodal Agents

04-17

Vision Transformers Need More Than Registers

04-15

Dynamic Multimodal Activation Steering for Hallucination Mitigation in Large Vision-Language Models

04-13

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

04-11

Hallucination-Aware Intermediate Representation Edit in Large Vision-Language Models

04-10

Beyond the Global Scores: Fine-Grained Token Grounding as a Robust Detector of LVLM Hallucinations

03-30

SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination

03-22

Locate-then-Sparsify: Attribution Guided Sparse Strategy for Visual Hallucination Mitigation