Tags - Deep Learning - 小熊的小站

02-08

RAGLens: Toward Faithful Retrieval-Augmented Generation with Sparse Autoencoders

02-06

Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics

02-06

LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment

02-06

AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint

02-05

FRAUDAR: Bounding Graph Fraud in the Face of Camouflage

02-03

DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models

02-02

One-shot Optimized Steering Vector for Hallucination Mitigation for VLMs

01-15

RelayLLM: Efficient Reasoning via Collaborative Decoding

01-14

mHC: Manifold-Constrained Hyper-Connections

01-10

Text-to-LoRA: Instant Transformer Adaption