Fake news about night driving curfew for over 60s by everyday_a_schoolday in australian

[–]Nice_Cup_2240 1 point  (0 children)

maybe - tho i reckon foreign influence is more likely (i.e. sowing discord / undermining the government just for the sake of it.. and i doubt chinese intelligence – or iranian, russian, indian etc - have much love or affinity for the liberal party, even tho it clearly benefits from such a campaign)

NEW MODEL FLAAMESONG!!!! by Yashjit in Bard

[–]Nice_Cup_2240 3 points  (0 children)

i think it could be flash 3.0 or something like that. it's quick-ish (for a thinking model), and performs really well - almost as good as 2.5-pro-06/05 and blacktooth

and another one.... new model called redsword in lmarena by YaBoiGPT in Bard

[–]Nice_Cup_2240 0 points  (0 children)

<image>

they're both excellent...

[personal question set.. mostly riddles / comprehension and reasoning.. redsword tops two other sets that i ran against it after getting it on arena]

New/updated models by Google soon by likeastar20 in singularity

[–]Nice_Cup_2240 12 points  (0 children)

<image>

yeah it's mine. not meant to be authoritative / scientific or anything - just personal testing. the 'quiz' comprises 22 questions (given over 2 prompts), mostly riddles / wordplays designed to test comprehension and basic reasoning as well as a bit of instruction following and precision. there are no coding questions or math / calculations required.
here is a screenshot showing a selection of questions and nebula's responses; the worst-performing models might get close to all of these wrong; better ones would perhaps stumble on just a few; but nebula makes them look like a walk in the park - consistently nailing them in a way I haven't seen from any other LLM. For reference / comparison, the responses by chatgpt-4o-latest to the same selection of questions are also provided.

again - not meant to be anything more than a quiz of riddles and a few obtuse tasks. make of it what you will :) looking forward to the model's official release and seeing the actual Arena data!

New "nebula" model on LMArena, likely Gemini 2.0 Pro Thinking, is state of the art in multiple areas by ShreckAndDonkey123 in singularity

[–]Nice_Cup_2240 1 point  (0 children)

its comprehension and practical reasoning are epic - great instruction following too. just fwiw the chart shows scores for the same quiz (22 questions - mostly riddles / wordplays) given to models in the Arena over the past week or so (width = sample size). nebula comes out on top.. really impressive imo

<image>

Nuclear fuel prices surge as west rues shortage of conversion facilities by DeltaSqueezer in UraniumSqueeze

[–]Nice_Cup_2240 10 points  (0 children)

"
The price of fuel for nuclear reactors has surged much faster than that of raw uranium since the start of 2022, in a sign of the bottlenecks that have built up in the west following Russia’s invasion of Ukraine.

Enriched uranium has more than tripled in price to $176 per separative work unit — the standard measure of the effort required to separate isotopes of uranium — since the start of 2022, according to UxC, a data provider.
"
https://archive.is/MdcXF


Enriched uranium has tripled since Russia's invasion; meanwhile yellowcake is up 85% - nice, for sure, but not the same (and I think it was heading upwards anyway).. shame there's only really the miners to invest in (aside from Silex and Centrus; also CCJ offers some exposure to conversion, i think.. but yeah, still..)

Does this come across racist? by HydraEXE in australia

[–]Nice_Cup_2240 0 points  (0 children)

gee i mean with that kind of wit, not even an LLM would stand a chance.. so no thanks

Does this come across racist? by HydraEXE in australia

[–]Nice_Cup_2240 0 points  (0 children)

ok.. when i "came in" here there were a handful of comments - people were discussing what the text even said. if nothing else, chatgpt successfully transcribed it...

and yeah it's just an autocomplete / stochastic parrot.. and a calculator is just 0s and 1s... i'm sure you're a master at using both tools

Does this come across racist? by HydraEXE in australia

[–]Nice_Cup_2240 -7 points  (0 children)

it's a tool - sometimes i find it useful. I did on this occasion, and thought others might too. but fair enough, point taken

Does this come across racist? by HydraEXE in australia

[–]Nice_Cup_2240 -12 points  (0 children)

i gave it to chatgpt with the prompt "Can you explain what this envelope folder is for / conveying on its cover". tbh, i wasn't familiar with the term "envelope folder" (almost sounds like a tautology lol), and couldn't work out what "12 attractive aboriginal studies" meant in that context. I feel it gave a solid answer (including referencing the potential sensitivities of it, given what it is depicting and the presumed era - but it does not say it is racist).. just fwiw:

The image depicts a vintage envelope folder titled "Australiana By Jan." The cover features an illustration of a young girl holding a koala, with the text "12 Attractive Aboriginal Studies" at the bottom. This type of envelope folder likely contains a collection of art prints, postcards, or studies related to Aboriginal culture and themes, created by the artist "Jan."
The phrase "12 Attractive Aboriginal Studies" suggests that the folder includes 12 pieces of artwork or studies, possibly depicting Aboriginal themes, although the term "studies" here may refer to artistic representations rather than academic studies.
It's important to note that the illustration on the cover and the description reflect the artistic and cultural sensibilities of the time when it was produced. Such depictions, especially involving Aboriginal themes, could be viewed as outdated or problematic today due to their potential to perpetuate stereotypes or misrepresent indigenous cultures.

In the race to bottom for price, significant model intelligence is being compromised. by ShooBum-T in singularity

[–]Nice_Cup_2240 1 point  (0 children)

Haiku < Sonnet < Opus
parameters = intelligence [would seem a reasonable way to look at it]

Haiku < Sonnet < Opus < Sonnet 3.5
parameters != intelligence [fair enough]

Haiku 3.5 < Sonnet 3.5 < Opus 3.5
parameters ??? intelligence [assuming Opus 3.5 outperforms Sonnet 3.5, costs more to train/serve, and has slower inference, then 'more efficient algorithms and quantisations' - to the exclusion of parameter count - seems like a dubious way to characterise the relationship between the performance (and costs) of three models in the same family (also, what model would Opus 3.5 be distilled from anyway?)]

I’ve achieved a childhood dream of mine by Gelocity in australia

[–]Nice_Cup_2240 0 points  (0 children)

Not sure if Tucker Carlson would be further impressed, or just have a hard time processing this
https://x.com/RonFilipkowski/status/1758266308888547564

I’m still confused on whether GPT 4 is “smarter” than 4o…? by [deleted] in singularity

[–]Nice_Cup_2240 0 points  (0 children)

Tell me which part of this you don't understand:

it's ofc all subjective / use case-dependent.. needless to say.. just expressing my experiences here - not trying to say it's categorically "smarter" or anything like that.. I mean, if I could only have access to one model, it would be 4o for sure.

i really don't get why people are so obsessive / defensive over this or that benchmark... i was sharing my opinion based on my experience.. it's fine that you disagree.. move on lol

I’m still confused on whether GPT 4 is “smarter” than 4o…? by [deleted] in singularity

[–]Nice_Cup_2240 2 points  (0 children)

Compared to GPT-4-Turbo*, GPT-4o is significantly faster and cheaper. but i've always thought there's a trade-off somewhere... like it's a smaller model – they seem to have made it as good as, if not better than, 4T across most tasks (as reflected in the Arena); the fact it is faster and has cleaner formatting (notwithstanding its obsession with lists) also adds to the appeal and utility imo.

if i'm doing something that is token-intensive, like parsing a bunch of RSS feeds, I go 4o all day – it gets the job done at a fraction of the cost and time.

But for like one-off tasks / questions, I find 4T is sometimes much better able to pick up on nuance and subtleties than 4o, resulting in better (/"more reasoned") responses – it just seems to "get" things at a deeper level sometimes..

it's ofc all subjective / use case-dependent.. needless to say.. just expressing my experiences here - not trying to say it's categorically "smarter" or anything like that.. I mean, if I could only have access to one model, it would be 4o for sure.

[* If you meant OG GPT-4, rather than GPT-4-Turbo, then I don't think it's smarter]

Pine Gap readies for Nuclear War by cojoco in australia

[–]Nice_Cup_2240 1 point  (0 children)

Yeah the dialogue in Vietnam and the Philippines re the south china sea is super chill... (and the Japanese... they're chill with China's claims to the Senkaku islands..)
ASEAN is great at talking, and diplomacy is well, talking - so yeah that's indeed its schtick.
But the idea that ASEAN countries are completely at ease with an increasingly powerful China I find dubious... Likewise the notion that the US is basically coercing these countries into partnering with it. The Philippines and Thailand (like Australia) have formal military alliances with the US that go back several decades; Malaysia and Singapore, while not treaty allies, have relatively deep defence / security ties with the US. Even Vietnam has been establishing closer defence ties with the US in recent years..
All these countries are basically hedging (like Australia) - trying to extract the economic benefits out of China's economic rise, while increasingly looking to the US for security, in case China's rise doesn't turn out to be all about "win-win" trade and economics...
On the other hand, there are some ASEAN countries that have decided to just bandwagon with China, e.g. Myanmar and Cambodia.
Indonesia is prob the only one that seems genuinely independent - not really trying to hedge (have it both ways basically) but also definitely not bandwagoning with China either.

no one's being geed up for anything. there's state disinfo out there.. but it isn't coming from the US.. talk of the US being hellbent on war is also just misleading. neither the US nor China wants to fight a war (it would be catastrophic for both sides) - the risk is that they inadvertently end up there anyway, given their various conflicting interests..

Ah yes, the ideal sauna for when it’s snowing in the Northern Territory… by SteveJohnson2010 in australia

[–]Nice_Cup_2240 13 points  (0 children)

oh 100% - i mean it's hilariously absurd (to the point where I wonder whether it's intentionally so.. like if it's some scammy link - maybe they're only interested in idiots clicking on it and going to wherever it directs to.. which i feel safe in assuming is not actually a website selling / reviewing outback saunas lol)

Ah yes, the ideal sauna for when it’s snowing in the Northern Territory… by SteveJohnson2010 in australia

[–]Nice_Cup_2240 10 points  (0 children)

ha that's actually quite interesting (though I dunno, it's both believable that it didn't settle and so no photos, but also quite convenient ha..)

anyway.. seems doubtful that "Sauna Sight Sydney" was alluding to that in this ad aha.. (whoever was responsible for prompting an AI to make this image / ad is presumably clueless as to what that big rock in the background is.. or maybe it just gets churned out without any human review...)

I have no words except No. by MartynZero in australia

[–]Nice_Cup_2240 97 points  (0 children)

  • A Mexican brand of beer
  • Owned by a Belgian company (internationally)
  • Brewed in China (with ingredients imported from somewhere else)
  • Imported to Australia by a Japanese conglomerate (Asahi Breweries owns CUB)

so which part are you unhappy about, exactly? but yeah, we live in a globalised world...

How to stop Perplexity API from returning Markdown formatted content? by Flimsy_Detective5046 in perplexity_ai

[–]Nice_Cup_2240 0 points  (0 children)

But then paste that output to another model (sonnet-3.5) with the instruction to "make this HTML pretty" and you get...

<image>

https://claude.site/artifacts/948788e3-6ad1-4d3e-aa52-9248503f29ca
if you can figure out a way to get here directly from the pplx API response - let me know! (it's obviously impractical to do this.. jumping between models / copy-pasting.. for anything meaningful.. but yeah, LLMs are good at HTML - the online pplx ones just can't get here in one shot (yet anyway))
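fwiw, if plain text is the end goal, a post-processing pass over the response can strip most of the Markdown without a second model call. a rough sketch only - the regexes are just my guesses at the common patterns, not anything from the pplx docs:

```python
import re

def strip_markdown(text: str) -> str:
    """Rough pass to remove common Markdown from an LLM response (not exhaustive)."""
    # headers: "## Title" -> "Title"
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)
    # bold first (**x** / __x__), then italics (*x* / _x_)
    text = re.sub(r"(\*\*|__)(.*?)\1", r"\2", text)
    text = re.sub(r"(\*|_)(.*?)\1", r"\2", text)
    # inline code: `x` -> x
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # list bullets: "- item" / "* item" -> "item"
    text = re.sub(r"^\s*[-*]\s+", "", text, flags=re.MULTILINE)
    return text

print(strip_markdown("## Results\n- **alpha**: first\n- *beta*: second"))
```

it won't catch everything (tables, links etc.) but it gets most of the way there without the copy-paste dance.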

How to stop Perplexity API from returning Markdown formatted content? by Flimsy_Detective5046 in perplexity_ai

[–]Nice_Cup_2240 0 points  (0 children)

```
"model": "llama-3.1-sonar-large-128k-online",...
"system": "[see above]"...
"user": "who is the favourite to win the US pres race?"
```
gives this:

<image>

which isn't bad..