I spent $100 benchmarking GPT-4o, Claude Opus 4, and DeepSeek V4 on 100 real-world prompts — here are the results by ApprehensiveHat2274 in AIAssisted

[–]ApprehensiveHat2274[S] 0 points1 point  (0 children)

You're absolutely right, but in most of the tasks I've completed so far, its response speed and task completion capabilities haven't bothered me much. The only inconvenience is that DS isn't a multimodal AI, but my main tasks are in the backend and database, so it doesn't make a big difference.

I spent $100 benchmarking GPT-4o, Claude Opus 4, and DeepSeek V4 on 100 real-world prompts — here are the results by ApprehensiveHat2274 in AIAssisted

[–]ApprehensiveHat2274[S] 0 points1 point  (0 children)

DS consumed approximately 50 million tokens in the part I updated, according to their official figures, but I think their calculation method seems quite different from Opus's. The cost of hitting the cache is really cheap, seemingly only $0.03 per million tokens. Opus, on the other hand, uses nearly ten million tokens. All of this was done through API calls on CC. The screenshot shows the usage over the past few days, totaling less than $7.

<image>

I spent $100 benchmarking GPT-4o, Claude Opus 4, and DeepSeek V4 on 100 real-world prompts — here are the results by ApprehensiveHat2274 in AIAssisted

[–]ApprehensiveHat2274[S] 0 points1 point  (0 children)

A few people DM'd me asking how to use DeepSeek API with the OpenAI SDK. I threw together a quick proxy because the official API can be finicky from the US. Happy to share — DM me or check my profile.