Which API capabilities are supported?

Last updated: June 30, 2025

All models:

  • Streaming

  • Structured outputs

  • Tool calling

  • Temperature, top P, log probabilities

Some models:

  • Parallel tool calling

  • Multi-turn tool calling

  • Tool calling with structured outputs

  • Streaming with structured outputs

  • Streaming with tool calling

Here is a breakdown of the capabilities that each model does not support:

  • llama3.1-8b

    • Parallel tool calling

  • llama-3.3-70b

    • Tool calling with structured outputs

    • Multi-turn tool calling

  • llama-4-scout-17b-16e-instruct

    • Parallel tool calling

  • qwen-3-32b

    • Streaming with structured outputs

    • Parallel tool calling

    • Streaming with tool calling

  • deepseek-r1-distill-llama-70b

    • Streaming with structured outputs

    • Parallel tool calling
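
The per-model breakdown above can be encoded as a small lookup table so that client code can check a capability before sending a request. The following is only a minimal sketch: the names UNSUPPORTED, normalize, and supports are hypothetical and not part of any SDK, and the table reflects this page as of its last-updated date.

```python
# Capabilities each model does NOT support, transcribed from the breakdown
# above. UNSUPPORTED, normalize, and supports are hypothetical helper names.
UNSUPPORTED = {
    "llama3.1-8b": {"parallel tool calling"},
    "llama-3.3-70b": {
        "tool calling with structured outputs",
        "multi-turn tool calling",
    },
    "llama-4-scout-17b-16e-instruct": {"parallel tool calling"},
    "qwen-3-32b": {
        "streaming with structured outputs",
        "parallel tool calling",
        "streaming with tool calling",
    },
    "deepseek-r1-distill-llama-70b": {
        "streaming with structured outputs",
        "parallel tool calling",
    },
}


def normalize(capability: str) -> str:
    """Lowercase a capability name and unify its spelling variants."""
    text = " ".join(capability.lower().split())
    text = text.replace("w/", "with")
    return text.replace("multi turn", "multi-turn")


def supports(model: str, capability: str) -> bool:
    """Return False only if the breakdown lists the capability as a limitation.

    Assumption: any capability not listed for a model is treated as supported,
    so a misspelled capability name also returns True.
    """
    if model not in UNSUPPORTED:
        raise KeyError(f"unknown model: {model}")
    return normalize(capability) not in UNSUPPORTED[model]
```

For example, supports("qwen-3-32b", "Streaming w/ Tool Calling") returns False, while supports("llama3.1-8b", "Streaming") returns True. Keeping the table as data rather than scattered conditionals means only one place needs editing when the list of limitations changes.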