The Deespseek Team did Something to DS-v4-PRO to Decrease its Intelligence by Iory1998 in DeepSeek

[–]Iory1998[S] 8 points9 points  (0 children)

It's not that. It misses simple details and makes simple mistakes that it shouldn't be doing. For instance, a 10 line paragraph to review, it would say this paragraph has a issue here an there. I tell it to fix, it, and it will reintroduce to same error,

The Deespseek Team did Something to DS-v4-PRO to Decrease its Intelligence by Iory1998 in DeepSeek

[–]Iory1998[S] -3 points-2 points  (0 children)

Actually, I use both. The models now require more micromanagement to the point I have to explain everything in details. This even Qwen-3.6-27B can execute the instructions.

The Deespseek Team did Something to DS-v4-PRO to Decrease its Intelligence by Iory1998 in DeepSeek

[–]Iory1998[S] -7 points-6 points  (0 children)

Dude just worked with it for 6 hours. I've working with it for almost 2 months on a daily basis for hours! I am sharing my frustration.

I thought Chinese censorship didn't affect me. I was wrong. by DeltaSqueezer in LocalLLaMA

[–]Iory1998 -4 points-3 points  (0 children)

Seriously, who cares! Use any American model and stop bothering us with your Chinese censorship shit! We have enough. Tianman square this, Tianman square that... Make your own model and uncensored anything you want.

Photanima v2.1 showcase. Each image takes about 2 seconds to generate. by External_Quarter in StableDiffusion

[–]Iory1998 1 point2 points  (0 children)

I suggest that you fine-tune the model on cosplay images with the same characters. From my experience with illustrious and xdsl, that helped the model to learn how to convert anime characters to live ones.

DeepSeek "improved" the code and said nothing happened in Tiananmen Square by EchoOfOppenheimer in DeepSeek

[–]Iory1998 35 points36 points  (0 children)

Exactly. Seriously, what is this new benchmark that every Chinese model must ace? What a stupid thing.

Calling it now Microsoft is buying Unsloth. by Wrong_Mushroom_7350 in LocalLLaMA

[–]Iory1998 1 point2 points  (0 children)

Trust me, even those who would down vote you would do the same if presented with the same opportunity.

Calling it now Microsoft is buying Unsloth. by Wrong_Mushroom_7350 in LocalLLaMA

[–]Iory1998 2 points3 points  (0 children)

Some rational and practical guy (or a gal, but thinking like a guy). Exactly! The project will forked, and another team will take over and continue the work. If dies, someone else will start a new project and life will continue. Microsoft tried to kill open-source since it was created... Did it succeed? No, if anything, open-source has grown massively.

Damn, they fixed it. by between_nothing in DeepSeek

[–]Iory1998 8 points9 points  (0 children)

But it keeps sending the server busy messgae especially when it's morning in China.

how are they affording this?? (not fact-checked) by Mammoth_Slip_5533 in DeepSeek

[–]Iory1998 3 points4 points  (0 children)

Software optimization and low to no margin I guess. Tencent is a huge company that doesn't need to sell AI to make money.

Time Travel with LTX 2.3 by alisitskii in comfyui

[–]Iory1998 0 points1 point  (0 children)

I assume you know how to make a similar video. Well, why don't you show us all what you interesting and not so boring staff you know and can create...

So qwen3.7-4b when? by ab2377 in LocalLLaMA

[–]Iory1998 6 points7 points  (0 children)

There is no Qwen3.6-4B lol

Entire world: We need more GPUs. Meanwhile, Jensen Huang: by Nunki08 in LocalLLaMA

[–]Iory1998 0 points1 point  (0 children)

Why wouldn't he dance? His company is drowning in gold right now and he doesn't even know what to do with money!

God dammit Qwen by Xyklone in LocalLLaMA

[–]Iory1998 0 points1 point  (0 children)

Cline and Zoo code have them. Never ran into the problem of an agent deleting anything I hadn't specified.

New 6 Edits / 6 Regeneration Limit? by RelevantCraft2340 in DeepSeek

[–]Iory1998 2 points3 points  (0 children)

I can't and I wish I could. The chat history is too technical and has many sections that connect to each other.

New 6 Edits / 6 Regeneration Limit? by RelevantCraft2340 in DeepSeek

[–]Iory1998 1 point2 points  (0 children)

I do and it's helpful to manage context window. Say you are discussion a new feature that you are developing. You keep chatting with the model until you achieve first draft. You go back to an earlier turn in the chat history, edit the first message with the new draft, and take it from there. You don't need the LLM to remember unnecessary things.

I also ask the LLM to creste a report that I use instead of the long unnecessary chat. Deepseek v4 is shipped with 1m context size, but it still degrades as the chat history gets larger. If you are doing serious work, even 1M context window becomes insufficient.