Nvidia's new Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 at half the size

venturebeat.com

11 points by hochmartinez 7 days ago

Havoc 7 days ago

Half the size is not a great metric when comparing a dense model against a MoE.

dyl000 6 days ago

Llama has an order of magnitude higher compute requirement than DeepSeek.
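
A rough back-of-the-envelope sketch of the dense-vs-MoE point in this thread. It assumes the publicly reported parameter counts (253B dense for Nemotron Ultra; 671B total with about 37B activated per token for DeepSeek R1) and the common approximation of about 2 forward-pass FLOPs per active parameter per token. The constant and function names are illustrative, and the figures are not taken from the comments themselves.

```python
# Publicly reported parameter counts, in billions (assumptions, not from the thread).
NEMOTRON_ULTRA_TOTAL = 253.0  # dense: every parameter is used for every token
DEEPSEEK_R1_TOTAL = 671.0     # MoE: total parameters that must be stored
DEEPSEEK_R1_ACTIVE = 37.0     # MoE: parameters activated per token


def forward_flops_per_token(active_params_billions: float) -> float:
    """Approximate forward-pass FLOPs per token as 2 * active parameters."""
    return 2.0 * active_params_billions * 1e9


# "Size" compares total parameters, which tracks memory footprint.
size_ratio = NEMOTRON_ULTRA_TOTAL / DEEPSEEK_R1_TOTAL

# Compute compares active parameters, which tracks per-token cost.
compute_ratio = (
    forward_flops_per_token(NEMOTRON_ULTRA_TOTAL)
    / forward_flops_per_token(DEEPSEEK_R1_ACTIVE)
)

print(f"size ratio (dense / MoE total): {size_ratio:.2f}")
print(f"per-token compute ratio (dense / MoE active): {compute_ratio:.1f}x")
```

Under these assumptions the dense model is well under half the size in memory but needs roughly 7x the compute per token, so "order of magnitude" is in the right ballpark if slightly generous. The estimate ignores attention cost over long contexts, routing overhead, and memory-bandwidth effects.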