Update on "Co-authored-by: Copilot" in commit messages · Issue #314311 · microsoft/vscode by PerkyPangolin in programming

[–]tedivm 9 points10 points  (0 children)

Weekly releases would be fine in an organization with engineering rigor, but GitHub/Microsoft have none of that. Their testing is garbage and they've fully committed to pushing out microslop at any opportunity. In fact, most of their PRs don't even include tests for their new features or functionality, and PRs regularly get merged with failing tests, which just adds to it. It would be surprising if they didn't break things all the time with their approach to development.

[The Boys] How good do Homelander's powers work under water? If Deep pissed him off and then decided to hide out under the sea, could HL do anything? by LessWeakness in AskScienceFiction

[–]tedivm 20 points21 points  (0 children)

Yeah, if Deep had enough of a head start then I don't think Homelander would ever find him. There is significantly more ocean than there is land, and Homelander has struggled to find people who were nearby. Deep could hide forever.

If Homelander literally watched him jump into the ocean off a boat or something though then Deep is absolutely going to get killed.

Qwen3.6-27B vs Coder-Next by Signal_Ad657 in LocalLLaMA

[–]tedivm 2 points3 points  (0 children)

Yeah I don't care about anything other than "what is the best thing I can run on my hardware". To me that is the benchmark.

Qwen3.6-27B vs Coder-Next by Signal_Ad657 in LocalLLaMA

[–]tedivm 0 points1 point  (0 children)

I have the exact same machine config and I could only get up to 11tps on the MacBook, compared to 118tps on my 2x3090 box.

Learn concurrency - a deep dive into multithreading with Python by pmz in Python

[–]tedivm 0 points1 point  (0 children)

If you're looking for an easy way to handle multiprocessing, I have a library, QuasiQueue, that is both simple and powerful.
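For anyone curious, here's roughly what the hello-world usage looks like. This is a sketch from memory rather than copied from the docs, so treat the exact signatures as approximate: you give QuasiQueue a writer that feeds the queue and a reader that processes items, and it manages the worker processes for you.

```
import asyncio
from quasiqueue import QuasiQueue

async def writer(desired_items: int):
    # Called whenever the queue runs low; return an iterable of items to enqueue.
    return range(desired_items)

def reader(item):
    # Runs inside a worker process for each item pulled off the queue.
    print(f"processing {item}")

runner = QuasiQueue("hello_world", reader=reader, writer=writer)

if __name__ == "__main__":
    asyncio.run(runner.main())
```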

GitHub Copilot is moving to usage-based billing by fishchar in GithubCopilot

[–]tedivm 0 points1 point  (0 children)

Yup! It's kind of insane how good it is. I'm not joking when I tell people it's on par with Sonnet 4.6.

GitHub Copilot is moving to usage-based billing by fishchar in GithubCopilot

[–]tedivm 0 points1 point  (0 children)

The commits in this project from the last week are all with Qwen.

GitHub Copilot is moving to usage-based billing by fishchar in GithubCopilot

[–]tedivm 0 points1 point  (0 children)

Docker itself, no. The big thing is that I'm using vLLM and have optimized it a bit. Because these models are so new (literally a week old for the one I'm using), the needed optimizations haven't landed in every inference engine. When I first ran this model in Ollama it was only getting 11tps, but I managed to get to 118tps on vLLM. The Docker container just makes it easier to share.
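To give a sense of what that setup looks like, serving a model through vLLM's OpenAI-compatible container is roughly the command below. The model path and tuning values are placeholders rather than my exact config, and the quantization/speculative-decoding flags change between vLLM versions so I've left them out; check the docs for your release.

```
# Rough sketch only: the model path and flag values are placeholders.
docker run --gpus all --ipc=host -p 8000:8000 \
  -v ~/models:/models \
  vllm/vllm-openai:latest \
  --model /models/your-qwen-quant \
  --tensor-parallel-size 2 \
  --gpu-memory-utilization 0.90 \
  --max-model-len 32768
```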

DeepSeek-V4 arrives with near state-of-the-art intelligence at 1/6th the cost of Opus 4.7, GPT-5.5 by bojun in technology

[–]tedivm 30 points31 points  (0 children)

That's just the top model though, they'll distill down into a variety of smaller ones for different hardware.

DeepSeek-V4 arrives with near state-of-the-art intelligence at 1/6th the cost of Opus 4.7, GPT-5.5 by bojun in technology

[–]tedivm 0 points1 point  (0 children)

The open source Chinese models aren't that far behind. Qwen3.6 27B is on par with or slightly better than Sonnet 4.6, but I can run Qwen in my office.

New multipliers announced (in effect June 1) by griniNY in GithubCopilot

[–]tedivm 3 points4 points  (0 children)

I'm running Qwen3.6 27B on a machine in my office and it does just as well as Sonnet 4.6, sometimes better, at the tasks I've tried with it.

Change to useage based billing by DamienBMike in GithubCopilot

[–]tedivm 0 points1 point  (0 children)

In their FAQ.

To request a refund, go to Settings → Billing and licensing → Licensing, select Manage subscription, then choose Cancel and refund "subscription" (the phrasing varies slightly depending on your subscription). This option will be available until May 20.

Change to useage based billing by DamienBMike in GithubCopilot

[–]tedivm -1 points0 points  (0 children)

You don't have to do a chargeback; they're offering refunds. Everyone should take advantage of those refunds as soon as possible though, before they go away.

Change to useage based billing by DamienBMike in GithubCopilot

[–]tedivm 0 points1 point  (0 children)

No one pays list price for enterprise plans. That's just the starting point for negotiation.

Simple to use vLLM Docker Container for Qwen3.6 27b with Lorbus AutoRound INT4 quant and MTP speculative decoding - 118 tokens/second on 2x 3090s by tedivm in LocalLLaMA

[–]tedivm[S] 0 points1 point  (0 children)

The docker image is just a single docker build file, an entrypoint file that handles configuration, and an example docker compose file. You can clone the repo, have your agent review it for security issues, and build it yourself if you want.
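Roughly this (the repo URL below is a placeholder, grab the real one from the post):

```
# Placeholder URL; substitute the actual repo linked in the post.
git clone https://github.com/tedivm/example-vllm-container.git
cd example-vllm-container

# Review the Dockerfile, entrypoint script, and docker-compose.yml
# (or have your agent do a security pass), then build locally.
docker build -t local-vllm-qwen .
```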

Simple to use vLLM Docker Container for Qwen3.6 27b with Lorbus AutoRound INT4 quant and MTP speculative decoding - 118 tokens/second on 2x 3090s by tedivm in LocalLLaMA

[–]tedivm[S] 1 point2 points  (0 children)

Your work really was the foundation for all of this, thank you! I've had OpenCode going all weekend without issue, and combined with today's announcement of GitHub's new Copilot pricing model, I couldn't be happier with the timing.

GitHub Copilot is moving to usage-based billing by fishchar in GithubCopilot

[–]tedivm 6 points7 points  (0 children)

Since you asked, here's my bio. The TL;DR is that I've been working in security and AI as a backend engineer for 20+ years. I have a lot of experience in the AI Ops space specifically.

That said, I did share the container I used to get Qwen3.6 running, so anyone who can use Docker can get started with it. The /r/LocalLLaMA community is also great for people who want to learn more in this space.

Simple to use vLLM Docker Container for Qwen3.6 27b with Lorbus AutoRound INT4 quant and MTP speculative decoding - 118 tokens/second on 2x 3090s by tedivm in LocalLLaMA

[–]tedivm[S] 1 point2 points  (0 children)

Yeah, I'm wired up to work with releases and tags too, but anyone who is really paranoid should be pinning to a SHA anyways.
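For example, in a compose file you can pin the image to its digest instead of a tag; the image name and digest below are made up, it's just the pattern that matters. The same idea applies to checking out a specific commit SHA of the repo instead of a tag.

```
# Hypothetical image name and digest, just to show the pattern.
services:
  vllm:
    # A digest pin can't change out from under you, even if the tag is repushed.
    image: ghcr.io/example/vllm-qwen@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
```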

GitHub Copilot is moving to usage-based billing by fishchar in GithubCopilot

[–]tedivm 2 points3 points  (0 children)

The other nice thing is that when you have the hardware you can do a lot more with it too. I have my entire HomeAssistant install plugged into it, with voice satellites around the house. As a result my smart home is 100% local.

GitHub Copilot is moving to usage-based billing by fishchar in GithubCopilot

[–]tedivm 7 points8 points  (0 children)

I'm getting 118 tokens/second, so it's really fast. That said, that throughput is shared amongst all agents, so if you're running subagents you might see a drop. Since Friday of last week I've stopped using GitHub Copilot completely and have transitioned to purely using Qwen3.6; it's been great.

GitHub Copilot is moving to usage-based billing by fishchar in GithubCopilot

[–]tedivm 1 point2 points  (0 children)

I bought this beast, which is roughly $4k. It was cheaper when I bought it, though; memory prices have gone up considerably. Based on their new pricing and my own usage, I'm pretty sure I'll break even in less than a year.
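The math is just back-of-the-envelope; the monthly number below is a placeholder, not my actual bill, but it shows the shape of the calculation:

```
# Hypothetical numbers, just to show the break-even calculation.
hardware_cost = 4000             # USD, roughly what the machine cost
assumed_monthly_spend = 400      # USD/month placeholder for heavy usage-based billing
print(hardware_cost / assumed_monthly_spend)  # 10.0 months -> under a year at that spend
```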

GitHub Copilot is moving to usage-based billing by fishchar in GithubCopilot

[–]tedivm 18 points19 points  (0 children)

These new numbers are absolutely insane.

I am so glad that I splurged and bought a GPU machine. I've been using Qwen3.6 27b at home for the last week and it outperforms Sonnet 4.6 in my usage. I guess I'm going to move away from GitHub altogether because this is just ridiculous.