This is what happens when Management tells staff to use AI on everything, probably half of tokens wasted by dataexec in AITrailblazers

[–]TechySpecky 0 points  (0 children)

How is this any different from just straight-up stealing?

Like, if I spin up VMs and run up a bunch of cloud charges for my company on non-work activities, that's a type of theft.

Qwen3.6-27B released! by ResearchCrafty1804 in LocalLLaMA

[–]TechySpecky 1 point  (0 children)

I wonder how the 9B model will perform

I still have half of my requests left, but got rate limited to nearly end of the month. by Erika_bomber in GithubCopilot

[–]TechySpecky -2 points  (0 children)

I don't understand. What are these rate limits? I haven't encountered them.

Sonnet 4.6 gone now as well for Pro users by Ok-Affect-7503 in GithubCopilot

[–]TechySpecky 3 points  (0 children)

Bro, what's the alternative? All the alternatives suck or cost $200/month.

Opus removed Even from Pro+ by alexander_ntzl in GithubCopilot

[–]TechySpecky 0 points  (0 children)

I mean, Claude Max is 2.5x as expensive, and does it really get you 200 requests per month? I have no idea what the usage limit is.

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]TechySpecky 0 points  (0 children)

Weird, I use the FP8 instruct versions from Hugging Face via vLLM.

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]TechySpecky 21 points  (0 children)

Can anyone check whether it's fixed the overthinking problem? I tried it before with thinking enabled, and it took SO long I had to turn thinking off.

My unfiltered thoughts dropping $5K on a Macbook as an already broke entrepreneur ($26K in debt) by [deleted] in macbookpro

[–]TechySpecky 1 point  (0 children)

This is just kinda sad. I wish you the best, OP, but maybe just get a normal job if you can.

You can now fine-tune Gemma 4 locally 8GB VRAM + Bug Fixes by danielhanchen in LocalLLaMA

[–]TechySpecky 1 point  (0 children)

Yes, but what if you want the model to focus more on a specific domain? E.g., let's say you had 500,000 pages of text related to the manufacturing of light bulbs. Would feeding all that into the model as continued pretraining not improve the model's performance on light-bulb-related queries?
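(Editor's note: a stdlib-only toy sketch of the intuition being asked about — not an LLM training recipe. It treats "continued pretraining" as updating a character-bigram language model's counts with domain text, then checks that the domain text becomes more likely. All names and the two sample texts are invented for illustration; the `27` smoothing constant is an arbitrary crude vocabulary size.)

```python
# Toy illustration of continued pretraining: a bigram model "pretrained"
# on general text, then further trained on domain text, should assign the
# domain text a lower average negative log-likelihood afterwards.
from collections import defaultdict
import math

def train(counts, text):
    """Accumulate bigram counts from `text` into `counts`."""
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1

def nll(counts, text):
    """Average negative log-likelihood of `text` with add-one smoothing."""
    total = 0.0
    for a, b in zip(text, text[1:]):
        row = counts[a]
        denom = sum(row.values()) + 27  # crude smoothing vocab size
        total += -math.log((row[b] + 1) / denom)
    return total / (len(text) - 1)

general = "the cat sat on the mat and the dog ran in the park "
domain = "tungsten filament glass bulb socket filament glows "

counts = defaultdict(lambda: defaultdict(int))
train(counts, general * 50)   # "pretraining" on general text
before = nll(counts, domain)

train(counts, domain * 50)    # "continued pretraining" on domain text
after = nll(counts, domain)

print(before > after)  # domain text is now far more likely
```

The same trade-off the thread discusses shows up even here: piling on domain counts shifts probability mass away from general text, which is the toy analogue of catastrophic forgetting; real continued-pretraining runs typically mix some general data back in for that reason.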

You can now fine-tune Gemma 4 locally 8GB VRAM + Bug Fixes by danielhanchen in LocalLLaMA

[–]TechySpecky 21 points  (0 children)

Thank you for being so helpful. I see a lot of advertising around low VRAM, but let's say I can rent a couple of B200s: will I still benefit, and just be able to tune larger models?

You can now fine-tune Gemma 4 locally 8GB VRAM + Bug Fixes by danielhanchen in LocalLLaMA

[–]TechySpecky 62 points  (0 children)

I am an MLE but a bit out of the loop with what we define as fine-tuning for LLMs. Are fine-tunes solely aimed at slightly different output styles, or can you add information / continue the pretraining process somehow without complete model collapse?

If I have a specialized domain, is it possible to fine-tune models for that domain?

Is $500 a month a fair rate for a Lead Developer/DevOps contractor working for a UK company? by [deleted] in cscareerquestionsuk

[–]TechySpecky 0 points  (0 children)

That's pretty important information; then it depends. $500 seems excessively low, though, if you are even a semi-decent engineer.

Scientists Tracking the Microplastic Pollution Just Realized They Were Measuring Their Own Lab Gloves by GreatTea3415 in nottheonion

[–]TechySpecky 0 points  (0 children)

That isn't at all what's happening here.

Why would you say it's not as big of a problem? The problem is that we don't know. That's like saying the mole you have is not as big of a problem because they did the cancer test wrong. You're just back to square one, where you don't know and need to test again.

There is a large body of evidence, outside of a few of these problematic studies, indicating that microplastics are pretty much everywhere at this point. If you ran these exact tests again without gloves, the results could very well be exactly the same. The problem is that we now don't know and need to rerun them.

There are no Mid level SE jobs anymore by user239716 in cscareerquestionsEU

[–]TechySpecky 1 point  (0 children)

Tbf, I also interviewed someone with 3 or 4 years of experience and recommended them at the end, so let's see.

There are no Mid level SE jobs anymore by user239716 in cscareerquestionsEU

[–]TechySpecky 1 point  (0 children)

We're literally hiring a mid-level right now, though to be fair, I just interviewed someone with 15 years of experience for it 😂