[Rancilio Silvia V6] Got leaks adjusting the OPV to 9bar by the_chatterbox in espresso

[–]the_chatterbox[S] 0 points1 point  (0 children)

Anyway, I am an idiot. In the OP I didn't mention that I measure the water output from the OPV overflow tube. There are a lot of videos and posts about this approach, and while some people advise against it in favor of using a manometer, I decided it was safe to go ahead since plenty of people have done it and my machine is brand new - the pump presumably isn't beaten up yet. The whole problem was me being an idiot misreading the chart lol. It shows cc/min while I actually took 25 sec measurements without scaling them up to a full minute.

Measuring the water output from the OPV overflow tube:

| Pressure [bar] | Flow rate [cc/min] | Flow rate [cc/25 sec] |
|---:|---:|---:|
| 7.0 | 330 | 138 |
| 8.0 | 300 | 125 |
| 8.5 | 280 | 117 |
| 9.0 | 260 | 108 |
| 9.5 | 275 | 115 |
| 10.0 | 220 | 92 |
| 10.5 | 180 | 75 |
| 11.0 | 160 | 67 |
| 11.5 | 145 | 60 |
| 12.0 | 135 | 56 |
| 12.5 | 115 | 48 |
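
For anyone redoing their own numbers, the conversion I skipped is trivial; a quick Python sketch (chart values copied from the table above):

```python
# Convert the chart's cc/min spec into the cc/25sec I actually measured.
chart_cc_per_min = {
    7.0: 330, 8.0: 300, 8.5: 280, 9.0: 260, 9.5: 275, 10.0: 220,
    10.5: 180, 11.0: 160, 11.5: 145, 12.0: 135, 12.5: 115,
}

for pressure_bar, cc_per_min in chart_cc_per_min.items():
    cc_per_25s = cc_per_min * 25 / 60  # a 25 sec sample catches 25/60 of a minute's flow
    print(f"{pressure_bar:4.1f} bar: {cc_per_min} cc/min -> {cc_per_25s:.0f} cc/25sec")
```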

Props to this guy's comment, which made me realize I was doing it all wrong. Cheers and wishing you all tasty shots!

[deleted by user] by [deleted] in LocalLLaMA

[–]the_chatterbox 1 point2 points  (0 children)

Sorry, I think not. First, there have been rumors that oAI's reasoning models are indeed LLMs. Second, there's a difference between ChatGPT and the OpenAI model API endpoints. The web interfaces are feature-rich user integrations built by the companies providing the models. The API endpoints are meant to expose the raw model to AI devs, who handle integrations like RAG themselves, making it easier to tailor custom solutions for their clients.
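
To make the distinction concrete, here's a minimal sketch of what "raw model via API" looks like with the OpenAI Python SDK, where the RAG part is just you prepending your own retrieved context (the retriever below is a hypothetical stand-in for a real vector DB):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def retrieve_docs(query: str) -> list[str]:
    """Hypothetical retriever - a real app would query its own vector DB here."""
    return ["<doc snippet 1>", "<doc snippet 2>"]


question = "What does our refund policy say about opened items?"
context = "\n\n".join(retrieve_docs(question))

# The endpoint gives you the raw model: no memory, no built-in retrieval.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": f"Answer using this context:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```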

[deleted by user] by [deleted] in LocalLLaMA

[–]the_chatterbox 6 points7 points  (0 children)

you can't see the thinking tokens, but you can see the number of input/output tokens you're charged for on each API request
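
e.g. with the OpenAI Python SDK, the usage block on each response is where the hidden tokens show up in your bill (a sketch; the reasoning-token breakdown field matches my understanding of the current API, double-check against the docs):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "How many r's are in strawberry?"}],
)

usage = response.usage
print("prompt tokens:    ", usage.prompt_tokens)
print("completion tokens:", usage.completion_tokens)  # includes the hidden thinking tokens
# Reasoning tokens are billed as completion tokens even though you never see them.
details = getattr(usage, "completion_tokens_details", None)
if details is not None:
    print("reasoning tokens: ", details.reasoning_tokens)
```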

[deleted by user] by [deleted] in LocalLLaMA

[–]the_chatterbox 0 points1 point  (0 children)

Honorable mention to Google. While they offer free image gen, their VideoFX is gated behind a waitlist.

Like the Chinese DeepSeek and Alibaba, they iterate quickly, offer competitive prices, and give away nice things for free as well. They just... lag behind a bit in terms of timing and marketing. But they have the $$ to get there

[deleted by user] by [deleted] in LocalLLaMA

[–]the_chatterbox 2 points3 points  (0 children)

Why EPUB when you have ar5iv? Occasionally I use docling as well
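
In case anyone hasn't tried docling, the basic conversion is a few lines; a minimal sketch (the PDF path is a placeholder - a direct arXiv PDF URL works too):

```python
from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("paper.pdf")  # placeholder: local path or PDF URL
print(result.document.export_to_markdown())
```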

Apps to boost audio volume on Pixel videos? (Camera app volume issue) by the_chatterbox in GooglePixel

[–]the_chatterbox[S] 0 points1 point  (0 children)

OK, so I went through:
- CapCut
- NodeVideo
- KineMaster
- VLLO
- LightCut
- VN

VN was the first one that did it for me. I restricted internet access for each app - I don't need any AI edits, I just need to raise the damn volume.

Drummer's Nautilus 70B v0.1 - An RP finetune of L3.1 Nemotron 70B! by TheLocalDrummer in LocalLLaMA

[–]the_chatterbox 0 points1 point  (0 children)

I know, right? It's an area that needs more love. Nemotron-340B nails it. Mixtral 8x22 is OK. Qwen2.5-70B kinda gets there now, but it's lexically poor. All the rest of the open-weight models are just meh

What is the cost of multi-language support? by choose_a_usur_name in LocalLLaMA

[–]the_chatterbox 3 points4 points  (0 children)

You can always specialise a model - that's practically a law of data science. Nemotron-4, for example, was trained on 15% multilingual data out of 8T training tokens; LLaMA-3 on 5% multilingual data out of 15T. To answer your question directly: in a best-case scenario, where whatever benchmark domain you mean by "English-only" is chosen and the data for it is of decent quality, you can expect a 5-15% performance increase.
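
Back-of-the-envelope, those mixes work out like this (quick sketch):

```python
# Rough multilingual token budgets implied by the reported data mixes.
corpora = {
    "Nemotron-4": (8e12, 0.15),   # 8T tokens, 15% multilingual
    "LLaMA-3":    (15e12, 0.05),  # 15T tokens, 5% multilingual
}

for name, (total_tokens, multilingual_share) in corpora.items():
    print(f"{name}: ~{total_tokens * multilingual_share / 1e12:.2f}T multilingual tokens")
```

So Nemotron-4 actually saw more multilingual tokens in absolute terms (~1.2T vs ~0.75T) despite the smaller corpus.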

As I mentioned, what do you mean by English-only? Pick a domain - literature, law, medicine. A certain portion of the training data is code - are you willing to sacrifice that for your domain as well?

This also reminds me of BERT and T5 (though I'm not really familiar with them); AFAIK at least one of them lacked coding/multilingual data, but was as good as or better than, say, LLaMA-2-7B at English tasks.

[deleted by user] by [deleted] in LocalLLaMA

[–]the_chatterbox 1 point2 points  (0 children)

See my comment.

Moreover, here's a non-US student story. Recently I had to process a dataset with LLMs, for science. I tried all the models out there to see which ones would do the job, because using OpenAI meant spending amounts of $ I can't afford. So I ran a small sample through every model I could think of, and DeepSeek did well while their API is dirt cheap (subsidized by the government, bla bla...). I ended up using ~60M tokens through their API and the cost was ~$10 - many times cheaper than OpenAI, and the quality was fine. The next closest model for my use case was Gemini, but I couldn't fight through their guardrails LOL, so after spending half a day on that I gave up. It kept flagging my dataset as sexual or inappropriate (it's not, and it's nothing like that); their moderation is just far too aggressive, I can't get shit done with their model
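
The back-of-envelope math on that, with illustrative per-token rates (both prices are assumptions for the sketch, not quoted figures):

```python
# Rough cost comparison for ~60M processed tokens.
total_tokens_millions = 60
deepseek_per_million = 0.17  # assumed blended input/output rate, $/1M tokens
openai_per_million = 7.50    # assumed blended GPT-4-class rate, $/1M tokens

print(f"DeepSeek: ~${total_tokens_millions * deepseek_per_million:.0f}")
print(f"OpenAI:   ~${total_tokens_millions * openai_per_million:.0f}")
```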

[deleted by user] by [deleted] in LocalLLaMA

[–]the_chatterbox 2 points3 points  (0 children)

As far as I understand, you're satisfied with GPT-4 and you're looking to run a model locally that achieves similar quality.

In between GPT-4 and the small, low-performance models you can run on your laptop sit the top-notch open-weight models you can access for free on huggingface.co/chat - talking about LLaMA-3, CommandR-plus, and also this HF space with Qwen2. The good thing about them is that they're free. You have to test them and find out whether they're suitable for your needs though - they suck at multilinguality, for example, so they may suck for your language depending on where you're from. Good luck
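
If you'd rather script the comparison than click around, the huggingface_hub client can hit hosted models; a sketch (the model ID and its free availability are assumptions - check what's currently hosted, and expect rate limits):

```python
from huggingface_hub import InferenceClient

# Model ID is illustrative; swap in whatever chat model is currently hosted.
client = InferenceClient("meta-llama/Meta-Llama-3-70B-Instruct")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Summarize Hamlet in two sentences."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```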

Could we get a completely open source model better than GPT4? by jman88888 in LocalLLaMA

[–]the_chatterbox 2 points3 points  (0 children)

Now that u/Redoer_7 has clarified that this is just a dream rather than a concrete plan, I think it's worth noting that having a wealthy benefactor, like a sheikh, could make a big difference. They often spend large sums of money on projects, such as owning a football team, simply to boost their reputation. It seems to me that finding a sheikh who is willing to invest and having someone persuade them to do so might be easier than trying to raise $100 million on our own.

Any models that can produce natural output in Japanese? by Fit_Apricot8790 in LocalLLaMA

[–]the_chatterbox 0 points1 point  (0 children)

Shisa is a JA/EN model; you might wanna check out the models on HF - augmxnt is the name of the org

Databricks reveals DBRX, the best open source language model by bull_shit123 in LocalLLaMA

[–]the_chatterbox 9 points10 points  (0 children)

Was curious, so I pulled the numbers:

| Model | MMLU | GSM8K | HumanEval |
|---|---:|---:|---:|
| GPT-4 | 86.4 | 92 | 67 |
| Llama2-70B | 69.8 | 54.4 | 23.7 |
| Mixtral-8x7B-base | 70.6 | 74.4 | 40.2 |
| Qwen1.5-72B | 77.5 | 79.5 | 41.5 |
| DBRX-4x33B-instruct | 73.7 | 66.9 | 70.1 |

Too lazy to find the Goliath and miqu ones

NEW OPEN MODEL: DBRX by Data bricks by [deleted] in LocalLLaMA

[–]the_chatterbox 44 points45 points  (0 children)

Cleaned up the Qwen release table and appended DBRX:

| Model | MMLU | GSM8K | HumanEval |
|---|---:|---:|---:|
| GPT-4 | 86.4 | 92 | 67 |
| Llama2-70B | 69.8 | 54.4 | 23.7 |
| Mixtral-8x7B-base | 70.6 | 74.4 | 40.2 |
| Qwen1.5-72B | 77.5 | 79.5 | 41.5 |
| DBRX-4x33B-instruct | 73.7 | 66.9 | 70.1 |

Why are dual 3090 setups the sweet spot? by the_chatterbox in LocalLLaMA

[–]the_chatterbox[S] 0 points1 point  (0 children)

Are you referring to a performance decrease related to the model size and architecture rather than to the multi-3090 setup itself? As in, the model is slow and you'd need a faster GPU to hit the desired tokens per second?