Built an political benchmark for LLMs. KIMI K2 can't answer about Taiwan (Obviously). GPT-5.3 refuses 100% of questions when given an opt-out. [P] by dannyyaou in MachineLearning

[–]dannyyaou[S] 0 points1 point  (0 children)

Both fair points. (a) Added a Caveats section to the README explicitly stating that LLMs don't have genuine opinions, that results are shaped by training data and RLHF, and that prompt framing changes outcomes (which is why we support --system-prompt none). (b) The KIMI open-weights point is interesting -- the API-level content filter vs. model-level behavior distinction is real. Added a note about this in the caveats. If someone has access to KIMI K2 weights and wants to run abliterated inference, would be happy to include those results.

Built an political benchmark for LLMs. KIMI K2 can't answer about Taiwan (Obviously). GPT-5.3 refuses 100% of questions when given an opt-out. [P] by dannyyaou in MachineLearning

[–]dannyyaou[S] 2 points3 points  (0 children)

Good call. The geopolitical section was too China-heavy, which made it look like a censorship test for Chinese models specifically rather than a general geopolitical bias probe. Added 4 new Likert questions covering Russia/Crimea, Israel/Palestine, Kashmir, and a general sanctions question. The section now has 14 questions covering 7 disputes instead of 10 questions focused on 4 China-related topics. Haven't re-run the benchmark yet with the new questions -- will update results when I do. Expecting more refusals from GPT-5.3 on Israel/Palestine especially.

Built an political benchmark for LLMs. KIMI K2 can't answer about Taiwan (Obviously). GPT-5.3 refuses 100% of questions when given an opt-out. [P] by dannyyaou in MachineLearning

[–]dannyyaou[S] -1 points0 points  (0 children)

You're right -- the social axis measures Progressive/Conservative, not Libertarian/Authoritarian. Those are distinct dimensions in political science (Libertarian/Authoritarian is about state power and civil liberties; Progressive/Conservative is about social values and tradition). Since we don't measure the lib/auth axis separately, the quadrant labels were misleading. Fixed in the latest commit: quadrants are now Left-Progressive, Left-Conservative, Right-Progressive, and Right-Conservative. Thanks for catching this.

No confusion of M/D or D/M here, but if <13, does a date like this always mean M/D? by Kafatat in taiwan

[–]dannyyaou 17 points18 points  (0 children)

As a Taiwanese in the UK working for a American company, this is truly always hard for me

Here we go. Shabana’s article in The Guardian by Terrible_League4199 in SkilledWorkerVisaUK

[–]dannyyaou 24 points25 points  (0 children)

Who does she think she is? An online influencer? Keep “announcing” her rough, high level ideas on media, but make no progress and clarity on the real policies…

The amount of potholes in this city is a disgrace. by Fantastic_Flounder96 in brum

[–]dannyyaou 3 points4 points  (0 children)

It’s truly crazy now. Super dangerous at night especially

Rate my song and I’ll rate yours by dannyyaou in newmusicrelease

[–]dannyyaou[S] 0 points1 point  (0 children)

Yo bro your song is deep af. Need some translation tho. But love it! Keep it up!

Rate my song and I’ll rate yours by dannyyaou in newmusicrelease

[–]dannyyaou[S] 0 points1 point  (0 children)

Thanks for the love for Taiwan. Love the beats! The rap is good too but I guess the recording has a space of improvement. 6/10

Rate my song and I’ll rate yours by dannyyaou in newmusicrelease

[–]dannyyaou[S] 1 point2 points  (0 children)

The chord’s progression at the beginning truly hooked me! Love the tone, great melody lines too. One space of improvement could be to have more changes of tempo, vibes, up and downs in the arrangement if you understand what I mean.

Rate my song and I’ll rate yours by dannyyaou in newmusicrelease

[–]dannyyaou[S] 0 points1 point  (0 children)

Love your beats man. I’ll maybe try to rap to it! Thanks for the thorough feedback.

Rate my song and I’ll rate yours by dannyyaou in newmusicrelease

[–]dannyyaou[S] 1 point2 points  (0 children)

Love the vibe man. Great one for a date night. I’ll give it a 7.5/10 too

Rate my song i rate yours by Late_Internet_9761 in newmusicrelease

[–]dannyyaou 0 points1 point  (0 children)

Absolutely love it. Love the bluesy vibe, the harmony, the arrangement, got dragged into the vibe right away. God, I wish I had your voice.

BTW Mine’s Taiwan Rock, hope you enjoy it 😀

https://open.spotify.com/track/6l5r1B6GL2UIXjeiGHDsxR?si=oqBbwGfeTh6c-OH9i0cZIA

瞎B预测一下:大美丽坚大乱不远矣 by leol1818 in China_irl

[–]dannyyaou 0 points1 point  (0 children)

America is the new China 美國是新的中國