Microsoft researchers have developed On-Policy Context Distillation (OPCD), a training method that permanently embeds ...
The AI industry is witnessing a transformative trend: the use of distillation to make AI models smaller and cheaper. This shift, spearheaded by companies like DeepSeek and OpenAI, is reshaping the AI ...
Anthropic accused three Chinese artificial intelligence companies, DeepSeek, Moonshot, and MiniMax, of engaging in coordinated distillation campaigns, alleging they illicitly used Claude to extract some of the model's capabilities ...
The Chinese AI company DeepSeek released a chatbot earlier this year called R1, which drew a huge amount of attention. Most of it focused on the fact that a relatively small and unknown company said ... (The original version of this story appeared in Quanta Magazine.)
Anthropic said it is investing heavily in defences designed to make distillation attacks harder to execute and easier to identify.
LLMs tend to lose prior skills when fine-tuned for new tasks. A new self-distillation approach aims to reduce regression and simplify model management.
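None of the items above spell out what distillation actually computes. The core idea, common to all of these stories, is training a smaller "student" model to match a larger "teacher" model's output distribution rather than raw labels, typically by minimizing a KL divergence between temperature-softened softmax outputs. A minimal sketch of that loss (function names and the example logits are illustrative, not from any of the systems mentioned):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    This is the classic knowledge-distillation objective: the student
    is penalized for diverging from the teacher's full distribution,
    not just its top prediction.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))

# A student that matches the teacher exactly incurs zero loss;
# a student that disagrees incurs a positive loss.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))
print(distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)
```

The self-distillation variant mentioned above differs mainly in where the teacher comes from (a frozen copy of the model itself rather than a separate larger model), but the matching objective has the same shape.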