79 articles in total
2025
Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding
JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
SmartSight: Mitigating Hallucination in Video LLMs via Temporal Attention Collapse Without Compromising Video Understanding
Investigating Spatial Attention Bias in Vision-Language Models
T5Gemma2
Steer LLM Latents for Hallucination Detection
Agentic Context Engineering: Dynamic Context Evolution for Self-Improving Language Models
Neural Message-Passing on Attention Graphs for Hallucination Detection
Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph
AVG-LLaVA: An Efficient Large Multimodal Model with Adaptive Visual Granularity