Tags - Large Models - 小熊的小站

02-06

Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics

02-06

LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment

02-06

AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint

02-03

DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models

02-02

One-shot Optimized Steering Vector for Hallucination Mitigation for VLMs

01-15

RelayLLM: Efficient Reasoning via Collaborative Decoding

01-10

Text-to-LoRA: Instant Transformer Adaption

12-31

A Unified Definition of Hallucination, Or: It's the World Model, Stupid

12-28

Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding

12-28

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe