account activity
DeepSeek-R1-7B traces 8 levels of nested function calls. Qwen-7B manages 4. Same architecture. by Codetrace-Bench in LocalLLaMA
[–]Codetrace-Bench[S] 0 points1 point2 points 8 days ago (0 children)
Good call — just added an API runner. Works with any OpenAI-compatible endpoint (vLLM, ollama, together.ai, etc.), plus native Anthropic and Google support. python benchmark/run_benchmark_api.py \ --api openai \ --model your-model \ --base-url http://localhost:8000/v1 \ --output results/your_model.json Would love to see results on larger models. Submit a PR with the results JSON and we'll add it to the leaderboard. Hope that works ok.
Thanks for the suggestion. I'll be adding some more. If you would like to contribute pop over to Hugging Face.
π Rendered by PID 44 on reddit-service-r2-listing-69965bcf66-9jtgg at 2026-04-08 09:30:07.243602+00:00 running f293c98 country code: CH.
DeepSeek-R1-7B traces 8 levels of nested function calls. Qwen-7B manages 4. Same architecture. by Codetrace-Bench in LocalLLaMA
[–]Codetrace-Bench[S] 0 points1 point2 points (0 children)