What are the speeds (tokens per second) of Cerebras models?

Last updated: June 30, 2025

  • llama3.1-8b: ~2200 tok/sec

  • llama-3.3-70b: ~2100 tok/sec

  • llama-4-scout-17b-16e-instruct: ~2600 tok/sec

  • qwen-3-32b: ~2100 tok/sec

  • deepseek-r1-distill-llama-70b: ~1700 tok/sec
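
To put these numbers in practical terms, the sketch below converts a tokens-per-second figure into a rough generation time for a response of a given length. It is only an illustration: the speeds are the approximate values from the list above, the function name `estimate_generation_seconds` is made up for this example, and the estimate ignores network latency, queueing, and time to first token, so real requests will likely take somewhat longer.

```python
# Approximate speeds copied from the list above (tokens per second).
# These are rounded figures, not guarantees.
APPROX_TOKENS_PER_SEC = {
    "llama3.1-8b": 2200,
    "llama-3.3-70b": 2100,
    "llama-4-scout-17b-16e-instruct": 2600,
    "qwen-3-32b": 2100,
    "deepseek-r1-distill-llama-70b": 1700,
}


def estimate_generation_seconds(model: str, output_tokens: int) -> float:
    """Rough time to generate `output_tokens` at the listed speed.

    Hypothetical helper for illustration only: it divides the token count
    by the approximate throughput and ignores latency and queueing.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / APPROX_TOKENS_PER_SEC[model]


if __name__ == "__main__":
    # Print the estimated time for a 1,000-token response on each model.
    for model in APPROX_TOKENS_PER_SEC:
        seconds = estimate_generation_seconds(model, 1000)
        print(f"{model}: ~{seconds:.2f} s per 1,000 tokens")
```

At these speeds, a 1,000-token response works out to roughly half a second or less of pure generation time on every listed model, so in practice network round-trip time may account for a noticeable share of the total.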