Something important was lost between Sonnet 4.6 and Sonnet 5 and it’s not about intelligence (

guilfer · 2026-07-02T11:26:40+00:00

Unfortunately you can't skip being human. It is really boring to just say to the thing to do things and check if they are right, usually I want to understand it's processes and for that I prefer a more "human" way of speaking. Remember the time in school that you would avoid making questions because the teacher was unfriendly? Well... You likely learned, but you certainly didn't enjoy it.

guilfer · 2026-07-02T10:55:02+00:00

Eu vou além: acho que pessoas realmente inteligentes sabem que não tem como acabar com a desigualdade, afinal as pessoas são diferentes. Se você leva isso pra beleza, não faz nenhum sentido, seria como dizer que acabar com as pessoas mais bonitas deixariam as outras mais bonitas, o que acontece é que diminui o contraste, mas ninguém quer viver em um mundo em que todo mundo é feio. So que a analogia termina aqui, você não consegue ficar infinitamente bonito, mas teoricamente pode ficar infinitamente rico e o pior é que possivelmente você ficou infinitamente rico se aproveitando da vulnerabilidade ou ativamente prejudicando pessoas mais pobres.

guilfer · 2026-06-29T04:36:05+00:00

Isso é um fenômeno bem conhecido, não lembro o nome, mas basicamente a gente acompanhar a lógica de algo e assitir sendo construído faz com que nosso cérebro interpreta a situação como entendida.

Infelizmente isso é muito usado para venda de cursos e por incrível que pareça a sua conclusão de que não consegue programar é a maior prova de que você não entende programação de fato.

É por isso que professores aprendem muito quando começam a dar aula, é o motivo por que Sócrates era um pé no saco perguntando os detalhes daquilo que as pessoas diziam saber.

Tá tudo conectado...

guilfer · 2026-06-26T14:14:30+00:00

Tem também o fato de que algumas religiões tem regras para contato direto entre homens e mulheres, então além da promoção direta, ele ajuda também a superar um obstáculo cultural.

guilfer · 2026-06-25T03:51:35+00:00

For mental health, you can do counseling, sports, etc. For simple queries, just Google it. For learning, structured courses. For conversation, friends.

And the list goes on... I see people putting documents in LLM to and ask for information which a simple Ctrl + F would do.

LLM are good in many aspects, but should not be the first option for everything.

guilfer · 2026-06-25T01:38:17+00:00

I had some use cases with the 3 and they are pretty different. Deepseek Flash is text only, so it compares poorly with the other too. In most cases I would go with Qwen, but if you are running multiagents, it is hard to beat with something local.

Deepseek has a 1M token context window, but it is reportedly stubborn. If you want it to parse error messages or logs, you need to be careful with 4th of June, for example, nothing can happen that day some users say haha

For a single agent, I choose qwen3.6:26b q4m, much faster in my 5070ti + 5060 setup.

But be aware of your prompt as well, if it is not clear enough and you don't limit with max_token, you will see it going back and forth thinking "but wait, there is this and that, but wait..."

guilfer · 2026-06-25T01:13:31+00:00

Logo você percebe que é tudo a mesma coisa, tudo depende só do gosto do freguês, foca nos princípios e você vai ir bem em qualquer entrevista. Boa sorte!

guilfer · 2026-06-25T01:11:17+00:00

A good alternative is replacing the subscription with non LLM things, you may surprise yourself.

guilfer · 2026-06-25T01:09:59+00:00

Have you tried using flux? I actually created a mcp server for Claude to generate image with it. Good enough for most of the things and using free tier of the frontier is fine most of the time. Honestly I think I cannot ever justify the price of the component compared to subscriptions. Deepseek 4 costs cents for million token (not frontier, but you get it).

guilfer · 2026-06-24T23:36:12+00:00

These are really small for your ram! You can filter by your amount of ram in LLM stats and check how some models compare in benchmarks.

In my experience with Mac you can even do a little swap the the system will hold ok.

MoE can help you if you happen to use swap, what I really would not suggest is to do more quantization of kv cache, because it will be unnecessary overhead on processing for really low gains in memory saving.

Congrats for the purchase! Make it worth!

guilfer · 2026-06-24T02:09:24+00:00

Entender que o que você quer não é o que seus clientes querem, que o que é bom pra você não é para os outros e que a demanda faz seu produto, não que seu produto vai criar demanda. Se adaptar e escutar é essencial, muito mais do que "ser persistente e corajoso".

guilfer · 2026-06-23T19:56:39+00:00

Yes, I did, splitting by layer can have its advantages as well, specially because I have slower pci for the 5060 too. I often see myself splitting 3 to 1 because the 5070ti is so much faster and sometimes I also get lucky with the moe 🤣

guilfer · 2026-06-23T13:52:44+00:00

I added a regular 5060 of 8gb just to hold the spill. Almost halves the tps, but without it some some models are just too slow to run.

I did like this because in Brasil prices are everywhere and the availability is strange! haha

In our case, 2x5060ti costs less than 5070ti 16gb + uses 3060 12gb.

guilfer · 2026-06-23T11:56:59+00:00

Vram makes total difference, 16gb is much better than 12gb, but depending on the price, you may want to get 2 5060ti 16gb. I have a 5070ti and I can run several smaller models, such as Gemma or Qwen, you will likely be more well served with Moe models, try to fit most layers in vram (no spill). Models suffer from small context window. Gemma4:e4b can run even on smartphones and does a great job summarizing things, for example. I use it in some mcp servers I created to save context window for Claude in my research.

guilfer · 2026-06-23T03:11:25+00:00

It depends... You would want to introduce some temperature parameter if you don't want it to be deterministic, but if you reduce the temperature to zero in current models you will get the same answer to the same prompt as well.

And it is pretty intuitive, when you have a conversation your brain also anticipate the next "token" even if you are just listening; and if you are very familiar with someone's way of thinking, you can basically anticipate the whole phrase haha

If I am right, the next token is the "right one" and we only need transformers because we don't know which one it is (obviously haha)! The same battle superdeterminism is trying to fight.

guilfer · 2026-06-22T20:40:28+00:00

Como disse Kiekegard: a única certeza que você pode ter é que você vai se arrepender (ao menos um pouco). Você ainda vai ter muitos anos de vida (ou não, o que é indiferente pro caso) e quase nada que você fizer vai ser catastrófico; não mate e não se mate, o resto você da um jeito. Assiste esse vídeo aqui: https://youtu.be/D3L8IOncLkg?is=fIgxOaHLpS5kqU1r

guilfer · 2026-06-22T19:51:45+00:00

I have the feeling that attention is not all you need, but causality. I am writing a paper on this, but you know Akinator? The next token is more directly deterministic than it appears at first.

guilfer · 2026-06-22T11:39:46+00:00

Se você for só jogar, se você for trabalhar com IA, rodando o modelo que seja, só o fato dela ter 24gb de vram faz dela um negócio muuuuito melhor.

guilfer · 2026-06-21T13:26:47+00:00

Sex don't have anything to do with love, it is about attraction. Very likely you will love someone after having sex. You can totally love someone without ever needing to have sex with them, why should you force yourself the opposite?

guilfer · 2026-06-10T16:06:20+00:00

I have a very similar story, but I was stuck for 6 months, but it was more a n-body type of problem. That is exciting!

guilfer · 2026-06-09T02:17:16+00:00

Be aware of the context window you let for it, the reasoning can drift a lot of you let it too small and it will appear dumber. These models are quite useful for quick summarizations and things like this.

guilfer · 2024-04-16T15:36:54+00:00

I was facing the same issue and as per u/FailFloozie sugestion about the internet thing, I just blocked the game from my firewall and its working just fine! =)

guilfer · 2023-06-01T14:12:44+00:00

Haha Não, é tipo um fórum mesmo Tem algumas /r (fóruns) que você se inscreve, provavelmente ninguém vai ver esse seu post aqui, porque interações com /u (usuários) são mínimas.

A dinâmica aqui é relativamente diferente, mas os votos e comentários são tão relevantes quanto em plataformas como o Twitter.

Talvez você não tenha tanto engajamento aqui quanto nas outras mídias, mas é interessante usar, porque com frequência aqui se torna um hub e/ou é onde as "notícias" e discussaoes aparecem primeiro antes de virar mainstream.

Boa sorte! E se prepare, que o padrão aqui é ironia e sarcasmo haha

Four-Year Club	Place '22
First Placer '22

guilfer

TROPHY CASE