ReAct: Synergizing Reasoning and Acting in Language Models (ICLR 2023) Work from Shunyu Yao's internship at Google. 2026-02-08 #deep-learning #large-models #agent
RAGLens: Toward Faithful Retrieval-Augmented Generation with Sparse Autoencoders (ICLR 2026) 2026-02-08 #deep-learning #large-models #RAG
Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics (arXiv 2026) Unifies full fine-tuning, LoRA, and activation intervention. A great paper! 2026-02-06 #deep-learning #large-models
LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment (arXiv 2026) Large Language Model Vector Alignment (LLM-VA) 2026-02-06 #deep-learning #large-models
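AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint (ICLR 2026) When I read AlphaEdit and Nullu earlier, I had the idea of combining the two. I did not expect that someone had already done it. 2026-02-06 #deep-learning #large-models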
FRAUDAR: Bounding Graph Fraud in the Face of Camouflage (KDD 2016) The second-place entry in the Mercor competition used this paper. 2026-02-05 #deep-learning #graph-neural-networks
DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models (ResponsibleFM @ NeurIPS 2025) 2026-02-03 #deep-learning #large-models
One-shot Optimized Steering Vector for Hallucination Mitigation for VLMs (arXiv 2026) 2026-02-02 #deep-learning #large-models