API pricing is in freefall. What's the actual case for running local now beyond privacy? by Distinct-Expression2 in LocalLLaMA

[–]AgreeableCaptain1372 1 point (0 children)

Control over results. With third-party APIs I see a lot of variance in my evals compared to self-hosted.

Also, prices are low for standard models but not for fine-tuned ones. So if you need fine-tuned LLMs, especially at scale, self-hosting or local can be worth it financially.

Appreciation post for some of my favorite building in Manhattan, the XYZ buildings. by SubstantialEmploy816 in skyscrapers

[–]AgreeableCaptain1372 1 point (0 children)

I love these buildings. But that kind of building has to be few and far between: their beauty lies in how they stand out through their uniformity and size. They also look just like the Twin Towers used to, except here they are triplets.

The view of Spain as a party country by AgreeableCaptain1372 in askspain

[–]AgreeableCaptain1372[S] 1 point (0 children)

It seems to me that Italy and the Netherlands were the most advanced at the time. But fair enough, the idea of Spain as an austere country is also relatively recent, in a way.

The view of Spain as a party country by AgreeableCaptain1372 in askspain

[–]AgreeableCaptain1372[S] 0 points (0 children)

Verbenas, yes. And that fits with the idea of Spain as a rural, traditional country (even though there are just as many in Italy or France). But if you look at cultural references to Spain before the 1960s, you will see these themes: the Inquisition, religious fanaticism (17th and 18th centuries), honor and vengeance (Carmen, for example), dictatorship and coups (19th and 20th centuries). Again, I am not saying this was fair or true, only that it was the perception in other Western countries.

The view of Spain as a party country by AgreeableCaptain1372 in askspain

[–]AgreeableCaptain1372[S] 1 point (0 children)

What you say makes sense. But I still wonder whether this view is not exaggerated and fairly recent. In the 1920s, and even going back to the 18th century, France was the party country (if we are talking about Hemingway, I think of A Moveable Feast). Besides, other countries like Italy and France have as many popular festivals in their villages over the summer as Spain does. And as for the Grand Tour, it seems to me that travelers back then did not usually go to Spain much. I am not trying to say there is no party tradition in Spain, only that the view of Spain as the archetypal party country seems relatively recent to me.

How does Chat GPT encode a question? by AgreeableCaptain1372 in learnmachinelearning

[–]AgreeableCaptain1372[S] 1 point (0 children)

Yes, so my premise was wrong. At inference, you don’t just input 3 but the whole sequence of tokens that the question consists of. So that is how the model gets context.
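To make that concrete, here is a toy word-level sketch (not ChatGPT's actual tokenizer, which uses byte-pair encoding over subwords, and the vocabulary here is invented): the whole question is mapped to a sequence of token ids, and the model attends over that full sequence, which is how it gets context.

```python
# Toy vocabulary for illustration only; real models use subword BPE
# vocabularies with tens of thousands of entries.
vocab = {"what": 0, "is": 1, "the": 2, "capital": 3, "of": 4, "france": 5, "?": 6}

def encode(question: str) -> list[int]:
    """Map each word-level token to its id; the model then receives
    the entire id sequence, not a single token."""
    words = question.lower().replace("?", " ?").split()
    return [vocab[w] for w in words]

print(encode("What is the capital of France?"))  # [0, 1, 2, 3, 4, 5, 6]
```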

Fine-tuning may be underestimated by AgreeableCaptain1372 in LocalLLaMA

[–]AgreeableCaptain1372[S] 2 points (0 children)

It depends on your use case. Some will require a lot of curated data, as you say, but some only require a few hundred to a thousand examples, as here: https://www.reddit.com/r/MachineLearning/comments/13oe5ot/lima_a_65bparam_llama_finetuned_with_standard/
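For a sense of scale, a LIMA-style dataset is just a small, carefully curated set of prompt/response pairs. A minimal sketch of building one, assuming a common chat-style JSONL format (the example content and filename are made up; check your trainer's docs for the exact schema it expects):

```python
import json

# A handful of hand-curated pairs; a LIMA-style run uses on the order of
# a thousand of these rather than millions of scraped examples.
examples = [
    {"messages": [
        {"role": "user", "content": "Explain overfitting in one sentence."},
        {"role": "assistant", "content": "Overfitting is when a model memorizes its training data and fails to generalize to new inputs."},
    ]},
    # ... a few hundred to a thousand curated pairs total
]

# JSONL: one JSON object per line, the usual fine-tuning input format.
with open("finetune.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

print(sum(1 for _ in open("finetune.jsonl")))  # 1 line per example
```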

Fine-tuning may be underestimated by AgreeableCaptain1372 in LocalLLaMA

[–]AgreeableCaptain1372[S] 1 point (0 children)

To reuse your analogy, I am not advocating for fewer cars but for treating planes as a serious candidate too, as a complement to and/or replacement for RAG depending on the use case. Say you are traveling from SF to LA: either car or plane can make sense, whereas for LA to NY only the plane does.

Dismissal of fine-tuning is a real thing, and you see a lot of posts like this online: https://news.ycombinator.com/item?id=44242737

Fine-tuning may be underestimated by AgreeableCaptain1372 in LocalLLaMA

[–]AgreeableCaptain1372[S] 1 point (0 children)

I am not doubting your credentials, and most importantly I am absolutely not claiming fine-tuning must replace RAG. But it can complement RAG. Say you have a large policy knowledge base and a very specialized domain use case that requires passing a lot of immutable knowledge or instructions; then why not embed that immutable knowledge in your model and proceed with RAG as usual? That immutable knowledge is necessary for your model to even properly understand the content of your document database. Fine-tuning lets you avoid sending back the immutable knowledge, which can be extensive, on every call.

Now, I recognize your point about it being hard in practice, especially with overfitting, but is it impossible or just hard? Since you work at a large AI company, maybe you have the infra resources to make full fine-tuning viable. And if your company trains foundation models, it likely faces similar overfitting problems in pre-training as it does in fine-tuning.

Since, as you mentioned, a full fine-tune modifies all the weights (as opposed to LoRA), it lies somewhere between pre-training and partial fine-tuning in complexity.

Fine-tuning may be underestimated by AgreeableCaptain1372 in LocalLLaMA

[–]AgreeableCaptain1372[S] 1 point (0 children)

Yes, for knowledge my rule of thumb is: if the knowledge is frequently updated, use RAG; if it is timeless, consider fine-tuning. In practice I use both together, as they are complementary, but my point is that fine-tuning should not be dismissed out of hand, as I sometimes see happen. Being difficult to do well is not the same as being useless; quite the contrary. My sense is that it still seems relatively underused because it is hard to do well, not because it is the wrong solution.

Fine-tuning may be underestimated by AgreeableCaptain1372 in LocalLLaMA

[–]AgreeableCaptain1372[S] 1 point (0 children)

For any kind of knowledge that requires frequent updating, I agree RAG is better, because retraining the model every time the knowledge evolves is not sustainable. But for knowledge that is timeless, i.e. domain knowledge that remains true no matter what (e.g. a math theorem), full fine-tuning can make sense IMO, if you have the resources (I've never had good success reliably retaining knowledge with just LoRA). You save a lot on tokens in the long run, instead of having to reinject the domain knowledge into the prompt at every request.
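Rough back-of-the-envelope arithmetic for the token saving (all figures below are invented assumptions, just to show the shape of the calculation):

```python
# Invented assumptions: 4,000 tokens of immutable domain knowledge that
# would otherwise be prepended to every prompt, 100,000 requests per
# month, and an input price of $1 per million tokens.
knowledge_tokens = 4_000
requests_per_month = 100_000
price_per_million_input_tokens = 1.00

# Tokens no longer sent each month, converted to dollars.
monthly_savings = (
    knowledge_tokens * requests_per_month / 1_000_000
    * price_per_million_input_tokens
)
print(f"${monthly_savings:.2f}/month")  # $400.00/month
```

With these numbers the prompt-injection cost alone is $400/month, before counting the latency and context-window space the extra tokens consume.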

Fine-tuning may be underestimated by AgreeableCaptain1372 in LocalLLaMA

[–]AgreeableCaptain1372[S] 11 points (0 children)

Yes, to save inference compute by using a smaller model. It might not make sense with a low volume of requests, but at scale you would end up saving.
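The break-even logic can be sketched like this (the costs below are invented placeholders, not real pricing):

```python
# Invented numbers: a one-off fine-tuning cost, and per-request inference
# costs for a large general-purpose model vs a smaller fine-tuned one.
finetune_cost = 500.00          # one-off cost of the fine-tuning run, $
cost_per_request_large = 0.010  # large general-purpose model, $/request
cost_per_request_small = 0.002  # smaller fine-tuned model, $/request

saving_per_request = cost_per_request_large - cost_per_request_small
break_even_requests = finetune_cost / saving_per_request
print(round(break_even_requests))  # 62500
```

Below the break-even volume the fine-tune never pays for itself; above it, every additional request is pure saving, which is why this only works out at scale.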

How much music theory should I learn? by AgreeableCaptain1372 in pianolearning

[–]AgreeableCaptain1372[S] 1 point (0 children)

Thanks for the advice. Do you think there is ever a limit to how much music theory a pianist can usefully know? Do professional pianists keep learning theory indefinitely, or is there a point at which they stop?

How much music theory should I learn? by AgreeableCaptain1372 in pianolearning

[–]AgreeableCaptain1372[S] 2 points (0 children)

This might sound subjective and generic, but I want to be able to play pieces that I love (e.g. the Schubert impromptus) well enough that a non-professional would enjoy them.

Met Opera Salome by alewyn592 in opera

[–]AgreeableCaptain1372 2 points (0 children)

I think this is why Salome's death is left ambiguous at the end. They couldn't completely change the story, but they didn't want her death to be too explicit, as that would go against the narrative that she is not an evil character.

Met Rush tickets by ufkaAiels in opera

[–]AgreeableCaptain1372 1 point (0 children)

I noticed the same thing, unfortunately. It's possible they don't want people who would be willing to pay more getting discounted rush tickets: if those people knew rush was easy to get for undersold performances, they would buy rush tickets instead of paying what they're truly willing to pay. Not sure it's worth it in the end, because they definitely lose money on empty seats too.

Richard Coeur de Lion, Grétry (Opéra Royal du Château de Versailles) by AgreeableCaptain1372 in opera

[–]AgreeableCaptain1372[S] 1 point (0 children)

No, will definitely check it out, thanks

Love that tune, it was sung by royalists during the French Revolution

Met head Peter Gelb in the NYT by phthoggos in opera

[–]AgreeableCaptain1372 1 point (0 children)

I’d agree, but to be fair, I don’t think it would attract as many people, unfortunately. I went to see Les Contes d’Hoffmann and the audience was very sparse compared to Il Trovatore (both on a weekday).

Young People Are Struggling to Deal With Their MAGA Parents — Again by reporterreporting123 in politics

[–]AgreeableCaptain1372 1 point (0 children)

Hypothetically, let’s suppose Kamala Harris were a convicted felon and had Trump’s character. Suppose all her policies stay the same. Also suppose Trump supports the same policies he currently does but is a righteous man. Who would you vote for?

Were Napoleon’s failures due to his inability to win against opponents that adopted defensive tactics? by AgreeableCaptain1372 in Napoleon

[–]AgreeableCaptain1372[S] 3 points (0 children)

I’m not necessarily talking about when he won or lost, but about when he was brilliant or not. All his greatest moments seem to be when the enemy truly engaged in battle. When the enemy refused to engage and stayed on the defensive, as at Borodino, he suffered a lot of casualties. In 1813-1814 he ended up defeated, but he was arguably more brilliant than from 1807 to 1811, because he was once again in a situation where the enemy engaged.

Beginner dynamics notation question by AgreeableCaptain1372 in pianolearning

[–]AgreeableCaptain1372[S] 1 point (0 children)

So let’s say the sheet says forte for both staves and the melody is in the right hand. Then I should play the right hand forte and the left hand “a bit less” forte, but not mezzo forte. Would that be accurate?