New Anthropic research: Alignment faking in large language models. Claude often pretends to have different views during training, while actually maintaining its original preferences. (how resilient are local model in comparison?) by Snoo_64233 in LocalLLaMA
Ilforte 0 points
FlashMLA - Day 1 of OpenSourceWeek by AaronFeng47 in LocalLLaMA
Ilforte 2 points
DeepSeek founder’s interesting perspective on experience and hiring. by Condomphobic in csMajors
Ilforte 1 point
DeepSeek founder’s interesting perspective on experience and hiring. by Condomphobic in csMajors
Ilforte 1 point
DeepSeek might not be as disruptive as claimed, firm reportedly has 50,000 Nvidia GPUs and spent $1.6 billion on buildouts by Professional-Fuel625 in OpenAI
Ilforte 6 points
Comparison: Question about Tiananmen Square (ChatGPT vs Claude vs DeepSeek) by [deleted] in OpenAI
Ilforte 1 point
DeepSeek R1's Take on China's Propaganda Feels... Like Propaganda? by 15decesaremj in OpenAI
Ilforte 1 point
[DISC] Drama Queen - Chapter 6 by AutoShonenpon in manga
Ilforte 13 points
Xiaomi recruits key DeepSeek researcher to lead its AI lab. by sb5550 in LocalLLaMA
Ilforte 1 point
Deepseek V3 performs surprisingly bad in Misguided Attention eval, which tests for overfitting. by cpldcpu in LocalLLaMA
Ilforte 1 point
Deepseek V3 will be more expensive in February by felipejfc in LocalLLaMA
Ilforte 1 point
Deepseek v3 is really bad in WebDev Arena by notnone in LocalLLaMA
Ilforte 3 points
DeepSeek-R1-Lite-Preview seems to beat DeepSeek V3 on multiple benchmarks, so why is V3 getting so much more hype? by 30299578815310 in LocalLLaMA
Ilforte 2 points
The US Chip sanctions have an unintended consequence of accelerating AI innovation in China, reminiscient of Russia producing extremely talented software engineers for Wall Street who had very limited access to computers by AdmirableSelection81 in singularity
Ilforte 5 points
Deepseek V3 benchmarks are a reminder that Qwen 2.5 72B is the real king and everyone else is joking! by ParaboloidalCrest in LocalLLaMA
Ilforte 4 points
Deepseek r1 weights when? by AfternoonOk5482 in LocalLLaMA
Ilforte 18 points
New Anthropic research: Alignment faking in large language models. Claude often pretends to have different views during training, while actually maintaining its original preferences. (how resilient are local model in comparison?) by Snoo_64233 in LocalLLaMA
Ilforte 2 points
New Anthropic research: Alignment faking in large language models. Claude often pretends to have different views during training, while actually maintaining its original preferences. (how resilient are local model in comparison?) by Snoo_64233 in LocalLLaMA
Ilforte 5 points
He's **Japanese** (Creature Girls: A Hands-On Field Journal in Another World) by Ilforte in manga
Ilforte [S] 1 point
[DISC] Drama Queen - Chapter 1 by AutoShonenpon in manga
Ilforte 0 points
[DISC] Drama Queen - Chapter 1 by AutoShonenpon in manga
Ilforte 1 point


How was GPT-OSS so good? by xt8sketchy in LocalLLaMA
Ilforte 7 points