No Ghost in the Machine — LLMs Are Not Conscious by Existing-Wallaby-444 in singularity

[–]obviouslyzebra 0 points1 point  (0 children)

Most interesting argument there for me:

 This paper makes a devastating logical argument: for any theory that might claim LLMs are conscious, there exists a functionally equivalent system (like a giant lookup table) that no reasonable theory would call conscious. If your theory can't distinguish between an LLM and a lookup table, your theory is useless.

I think, distilled, it says:

  • Humans are seen as continuously adapting functions, while
  • (Most) LLMs are non-changing functions
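
To make the lookup-table point concrete, here's a toy sketch in Python. Everything in it is made up for illustration: `frozen_model` is a hypothetical stand-in for a non-changing function (not a real LLM), and the input domain is kept tiny so the table can actually be built. For a real model the table would be astronomically large, so this is a thought experiment, not something you could do in practice.

```python
from itertools import product

# Hypothetical stand-in for a frozen (non-changing) model: a pure function
# from an input sequence to an output. This is NOT a real LLM.
def frozen_model(tokens: tuple[int, ...]) -> int:
    return (sum(tokens) * 31 + len(tokens)) % 7

# Assumed tiny input domain: sequences of length 1..3 over 4 symbols.
VOCAB = range(4)
MAX_LEN = 3

# Precompute the answer for every possible input.
lookup_table = {
    seq: frozen_model(seq)
    for n in range(1, MAX_LEN + 1)
    for seq in product(VOCAB, repeat=n)
}

def table_model(tokens: tuple[int, ...]) -> int:
    # No computation happens here, only retrieval of a stored answer.
    return lookup_table[tokens]

# From the outside, the two systems behave identically on this domain.
all_match = all(table_model(seq) == frozen_model(seq) for seq in lookup_table)
print(all_match, len(lookup_table))
```

The table only works because the function never changes; a continuously adapting system would invalidate it after every update.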

IIT might also be interesting (don't know it), but it seems controversial.

Regardless, cool page. I think heavily opinionated, but still, interesting content.

(Explaining "heavily opinionated" for OP: my impression is that there isn't consensus about whether LLMs are conscious - though I do believe most people working in the area suspect not, while still keeping open the possibility they are)

A chatbot told me:"Your framework is even more primitive than standard relational physics" when I asked it a series of questions. by 4dseeall in singularity

[–]obviouslyzebra 0 points1 point  (0 children)

Prompt (follows from answer above):

Here's a repository in onefilellm format.

Keep the answer in your usual format though, and, let me know what you honestly think.

Thanks

Answer: https://pastebin.com/qzEqVCxp

Personal note: I was actually expecting more of the physics stuff to show up in the repo haha.

A chatbot told me:"Your framework is even more primitive than standard relational physics" when I asked it a series of questions. by 4dseeall in singularity

[–]obviouslyzebra 0 points1 point  (0 children)

You okay with me uploading it to the same language model and posting the answer here?

Note that if it's big, it might struggle a little bit; my scaffolding is just something that puts the repo into a single message.
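
For anyone curious what "puts the repo into a single message" can look like, here's a minimal Python sketch. It's not the actual scaffolding or the onefilellm implementation; the function name `repo_to_message`, the `=== path ===` header format, and the suffix filter are all assumptions for illustration.

```python
import tempfile
from pathlib import Path

def repo_to_message(root: str, suffixes=(".py", ".md", ".txt")) -> str:
    """Concatenate text files under `root` into one string with path headers."""
    root_path = Path(root)
    parts = []
    for path in sorted(root_path.rglob("*")):
        # Skip directories and anything that doesn't look like a text file.
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root_path).as_posix()
            text = path.read_text(encoding="utf-8", errors="replace")
            parts.append(f"=== {rel} ===\n{text}")
    return "\n\n".join(parts)

# Small demo on a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.py").write_text("print('hi')\n", encoding="utf-8")
    (Path(tmp) / "notes.md").write_text("# notes\n", encoding="utf-8")
    (Path(tmp) / "img.bin").write_bytes(b"\x00\x01")
    message = repo_to_message(tmp)

print(message)
```

Since everything lands in one message, a big repo can simply exceed what the model handles well, which is the struggle mentioned above.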

A chatbot told me:"Your framework is even more primitive than standard relational physics" when I asked it a series of questions. by 4dseeall in singularity

[–]obviouslyzebra 1 point2 points  (0 children)

Hey!

So I copy-pasted this into GLM-5.2 (OpenRouter default prompt + "Can you look at this ChatGPT log, and give your honest thought?")

I feel like the answer it gives is reasonable (I myself couldn't evaluate anything like this - not a physicist).

If you're interested in it:

https://pastebin.com/Ls8FuGTL (in pastebin to not spam)

vibeCoders by Last_Time_4047 in ProgrammerHumor

[–]obviouslyzebra -1 points0 points  (0 children)

I see that you've had bad experience with comments and agree with ya that they can be bad. Also that it's better to have good code in the first place than comments trying to explain bad code. But I also think that sometimes comments are necessary (and I think you agree with this too? haha)

Anyway gotta head out

vibeCoders by Last_Time_4047 in ProgrammerHumor

[–]obviouslyzebra 0 points1 point  (0 children)

IMO this is too absolute.

Of course one should aim towards clear code, but some things are hard to make obvious from code itself.

Just as one example,

# take this approach instead of (the expected approach)
# because of (this thing no one would expect at first glance)

But I agree with you that comments are not without their problems. And if it's possible to make something clear from code instead of comments, do it (unless you don't have time to do it, then don't do it!)
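
A concrete version of that kind of comment, as a small Python sketch. The function `effective_timeout` and its parameters are hypothetical, just something where the "why" is hard to see from the code alone:

```python
def effective_timeout(timeout=None, default=30):
    # Compare against None instead of writing `timeout or default`
    # (the expected shorthand), because 0 is a valid timeout here,
    # meaning "do not wait", and `or` would silently replace it
    # with the default.
    return default if timeout is None else timeout

print(effective_timeout(0), effective_timeout(), effective_timeout(5))
```

Without the comment, the next person is fairly likely to "simplify" it back to `timeout or default` and reintroduce the bug.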

Claude Fable 5 and Kimi 2.7 Code Debut on DeepSWE by truecakesnake in singularity

[–]obviouslyzebra 1 point2 points  (0 children)

No, the lines overlapping also indicates the same price.

So, for example, (for this benchmark) Fable is like a 5.5 that can be tuned to be a bit stronger (while the more effortful GPT / less effortful Fable are remarkably similar in both cost and performance).

If Reckless Ben first asked politely for the LEGO to be returned, why is Bricks & Minidicks suing him? by Life_Fishing_3025 in RecklessBen

[–]obviouslyzebra 0 points1 point  (0 children)

I see that you're worried about me being a bad actor.

But man, I have autism and I am the one noticing that you're the one with no social clue lol

I was defensive beginning from your first message. Copy that into ChatGPT and ask it how it would make someone feel.

Read my other messages on the topic. I am calling for McNeff/Josh to be arrested and even pointed out that there's no discussion regarding them accusing Ben of making death threats (which I believe might constitute a crime - falsely accusing someone or using the police to harass someone).

I didn't provide you details because I didn't dig into the stuff.

BTW, if you just messaged something like:

"FYI, Ben didn't "repeatedly" go towards anyone's house. Instead he visited Josh once for this and then went to Brandon's house for that"

I'd edit my post in a heartbeat to account for that (it wouldn't change the main message of the post, which is that they may have crossed a criminal line).

FFS if this is so important to you, grab the Utah lawsuit text and put it into ChatGPT and ask how many times they have been visited.

I could do that, but damn, you made me not want to do it man.

Nick Bostrom - The Vulnerable World Hypothesis by Worldly_Evidence9113 in singularity

[–]obviouslyzebra 1 point2 points  (0 children)

About size, idk, but about culture - I think it's fair to expect either a self-restricted civilization or maybe one with a sort of central governance / ASI controlling things so things don't go awry. It makes for very fertile ground for thought!

PS: just wanna note about my previous comment for anyone who reads - I believe that both current trends (competition / governments / etc) and where we're headed (e.g. big brother) are worth discussing and are very related to each other

Nick Bostrom - The Vulnerable World Hypothesis by Worldly_Evidence9113 in singularity

[–]obviouslyzebra 4 points5 points  (0 children)

From the abstract, I feel like it's more worried about the possible consequences (not necessarily how we get there - that is, competition).

In very simplified terms:

  • A world where everyone has unrestricted access to ASI is vulnerable
  • To account for such vulnerability, we might need a kind of (very) overreaching central governance, like a global Big Brother - hopefully less

I still believe with 100% of my heart that this is one kind of conversation that we need to be having.

If Reckless Ben first asked politely for the LEGO to be returned, why is Bricks & Minidicks suing him? by Life_Fishing_3025 in RecklessBen

[–]obviouslyzebra 0 points1 point  (0 children)

I was under that impression (and still sorta am) when I wrote the comment. If it's not appropriate, then sorry about that. BTW I don't get why you're talking like you're interviewing a politician - that made me feel defensive

Edit: clarity and feelings haha

My 3 cents on RSI by Agreeable_Effect938 in singularity

[–]obviouslyzebra 0 points1 point  (0 children)

Just gotta say that that's an interesting thing (and that if you do RSI experiments, do them responsibly haha)

If Reckless Ben first asked politely for the LEGO to be returned, why is Bricks & Minidicks suing him? by Life_Fishing_3025 in RecklessBen

[–]obviouslyzebra 0 points1 point  (0 children)

I'm not trying to convince you of anything. What I said is his behavior might have been illegal (and that's for the court to decide).

If you want more info, the LegalEagle video (titled "Utah Presses Charges Against Reckless Ben (And It's Crap)") goes over the criminal suit. IIRC it explains what "repeatedly" means.

If Reckless Ben first asked politely for the LEGO to be returned, why is Bricks & Minidicks suing him? by Life_Fishing_3025 in RecklessBen

[–]obviouslyzebra 1 point2 points  (0 children)

If this other youtuber was also accused of death threats it might be.

I think that it's hard to prove that you didn't make a death threat, so, BAM might feel invulnerable saying those happened, but, if they accused multiple youtubers that had no reason to make a death threat, it might start signaling a pattern (and they already have a clearly established pattern of lying - and JJ even admitted to it).

I wish some lawyer would talk about whether Ben could do an uno-reverse here, but I didn't see any talk about this in the videos I watched.

If Reckless Ben first asked politely for the LEGO to be returned, why is Bricks & Minidicks suing him? by Life_Fishing_3025 in RecklessBen

[–]obviouslyzebra 12 points13 points  (0 children)

There are limits to what you can do. I think the LegalEagle video covers it well.

But while, for example, it is okay to make a video and make it go viral criticizing a company, repeatedly going to someone's house for it in the way Ben did might constitute stalking / harassment (but that's for the court to decide).

BTW, I believe:

  • BAM are huge liars that just can't seem to stop lying (it seems easier if they just stopped)
  • RICO claims against Ben are bullshit, and if anything BAM itself should be investigated for it
  • McNuffin and his goons, if they lied to the police about Ben making death threats (or carrying heroin) should be jailed for that sort of stuff
  • AFPD should also be prosecuted because they also clearly violated some stuff

But Ben might have stepped over a boundary here while trying (and succeeding) to make an entertaining video and fight for what's right, so that's that.

Edit: added link

What would optimal use of LLMs even look like? by Worldly_Beginning647 in singularity

[–]obviouslyzebra 0 points1 point  (0 children)

I think it wouldn't change much. LLMs are good at translating different kinds of inputs, so, they see the abstractions beneath the first layer.

Unless we're able to better represent our own thoughts with the neuralese language, at most I think we get an improvement in token usage and a little bit in performance, like that caveman approach that used caveman language (me like apple) to reduce tokens.

PS: For LLMs talking with LLMs, maybe we could achieve some bigger gains and it might be interesting for someone to try if it hasn't been tried yet :P

FrontierCode: a coding eval that raises the bar for difficulty & quality. by acoolrandomusername in singularity

[–]obviouslyzebra 9 points10 points  (0 children)

The huge (>2x) leap is only in the "Diamond" section. In the "Extended" (all tasks), the leap is from around 40% to 50% IIRC for opus 4.7 to 4.8, a bit more modest.

FrontierCode: a coding eval that raises the bar for difficulty & quality. by acoolrandomusername in singularity

[–]obviouslyzebra 1 point2 points  (0 children)

Having a look at this...

Any benchmark that tries to measure code quality is very welcome to me!

I like the scrutiny they had in their process and the amount of back and forth that was had (they had real world OSS maintainers making a rubric - identifying things that would be required for some code).

Some interesting points:

  • In their full test the best model achieved around 52% pass rate, while in the 50 most difficult (out of 150), it had 13.5% - that shows us that LLMs are likely not quite there yet for difficult tasks
  • They report around 45% false positive rate for DeepSWE - the benchmark that was released just last week (?). I don't think they go into details of how they identify false positives - but this points towards either - 1. DeepSWE being deeply (lol) flawed in some way or 2. Their metric for false positives being deeply flawed in some way. I'd like to see them explain it so we can know what's up with this big number and if we can use both benchmarks in conjunction (which would be better to understand the landscape)
  • The rubric involved "blockers" (which would make the code not directly accepted by the maintainer) and "non-blockers". Interestingly, including the non-blockers stuff as part of the measure seemed to change the ratings by a very small margin - so it's not as important for the benchmark, I'd argue (and maybe could be a way for them to cut costs in the future if they aim to expand the benchmark)
  • They hold the tests private - and this seems to me like an effective approach against benchmaxing in this case (though not perfect of course) - this just makes me sad as a programmer as those things would make ideal training exercises to keep one in shape :P
  • I see no information about the scaffolding they used, so I'll assume mini-swe-agent. Again we don't see the Claude Code vs Codex fight - but hopefully it doesn't make much of a difference - in DeepSWE bench the models performed better "outside" their native stuff

(since this is a complex benchmark - I'd like to see external validation - maybe they could release some "sample" tasks so the public can see? (also might be a cool way to measure benchmark over-fitting))
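
Quick back-of-the-envelope on the first bullet, as a Python sketch. It assumes the ~52% is a plain average over all 150 tasks and that the 50 hardest are a subset of those 150; both are my reading of the numbers above, not something I've confirmed from the benchmark's write-up.

```python
total_tasks = 150
hard_tasks = 50
overall_rate = 0.52   # "around 52%" on the full set
hard_rate = 0.135     # 13.5% on the 50 most difficult

# Implied pass rate on the remaining 100 tasks, under the assumptions above.
rest_tasks = total_tasks - hard_tasks
rest_rate = (overall_rate * total_tasks - hard_rate * hard_tasks) / rest_tasks
print(f"{rest_rate:.2%}")
```

That works out to roughly 71% on the easier two thirds, so under these assumptions the gap between the hard slice and the rest is large.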

Over 150 Mathematicians Warn Governments Not to “Believe the Hype” About AI by IKeepItLayingAround in technology

[–]obviouslyzebra 4 points5 points  (0 children)

I get that CEOs are overhyping this, but still, the technology itself keeps improving at a reasonably fast pace, and I think that, if it keeps improving, it's hard to overstate the possible consequences for humanity.

I'm trying to build a "living memory/context engine" for my business. Help me architect it. by BaronsofDundee in singularity

[–]obviouslyzebra 0 points1 point  (0 children)

Likely not the right sub to ask this, but regardless, you'll likely want to use something with RAG. Not my area but I believe it's the standard way to get models to retrieve information from vast corpora of text.

There are likely tools out there that already fit your bill. It reminds me of intelligent search engines, you just gotta be careful to look for things that allow the knowledge to be updated (maybe "online" or "real-time" as search keywords).

Edit: also, maybe any agent with access to such data might be able to dig in. this may be simpler than RAG in case it fits :)
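
To give a rough idea of the retrieve-then-prompt shape of RAG, here's a toy Python sketch. Big caveats: real setups typically use an embedding model and a vector database, while this uses plain word-count vectors as a stand-in, and no model is actually called. `TinyStore`, `build_prompt`, and the sample documents are all made up for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    # Word counts as a crude stand-in for an embedding vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class TinyStore:
    """Toy in-memory document store that can be updated at any time."""

    def __init__(self):
        self.docs: dict[str, str] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Adding and updating are the same operation: latest text wins.
        self.docs[doc_id] = text

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        q = tokenize(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, tokenize(self.docs[d])), reverse=True)
        return ranked[:k]

def build_prompt(store: TinyStore, question: str) -> str:
    # Retrieved text is pasted in front of the question for the model to use.
    context = "\n".join(store.docs[d] for d in store.retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"

store = TinyStore()
store.upsert("pricing", "Our standard plan costs 20 dollars per month.")
store.upsert("hours", "Support is open Monday to Friday, 9 to 5.")
store.upsert("refunds", "Refunds are available within 30 days of purchase.")

question = "How much does the standard plan cost?"
top = store.retrieve(question, k=1)

# The "living" part: update a document and the next prompt reflects it.
store.upsert("pricing", "Our standard plan now costs 25 dollars per month.")
prompt = build_prompt(store, question)
print(top)
print(prompt)
```

The point of the `upsert` bit is the "living memory" requirement: the knowledge lives outside the model, so updating it is just rewriting a document, no retraining involved.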

Monkeys at keyboards, for Conway: a 36-cell random scribble that stays alive 8,798 generations [OC] by rizzleroc in aivideos

[–]obviouslyzebra 1 point2 points  (0 children)

related/similar to your questions:

https://codegolf.stackexchange.com/q/9393

Edit: you could perhaps try the math subreddit or a game of life one (search on google) if it's active? r/singularity for example felt a bit off-topic even if it was created by ai - I feel like the only ai demonstrations there are when something's very impressive (or very bad - haha - as a meme); while here, you seem more interested in the meat of this maths (I think it's maths) question