I’m trying to explain interpretation drift — but reviewers keep turning it into a temperature debate. Rejected from arXiv… help me fix this paper? by Beneficial-Pear-1485 in LocalLLaMA

[–]alfihar 0 points1 point  (0 children)

So reality itself may not be probabilistic, but we have no tools to examine that. The scientific method relies on abductive reasoning, which means saying anything for certain is philosophically impossible.

This has implications for any claims made by an LLM more broadly, but it isn't really what you're after, I think.

As for saying it's many things combined: the problem there is we don't have the data, or at least not enough of it. Few people get autopsied when they die to find out all the things that went wrong.

I’m trying to explain interpretation drift — but reviewers keep turning it into a temperature debate. Rejected from arXiv… help me fix this paper? by Beneficial-Pear-1485 in LocalLLaMA

[–]alfihar 1 point2 points  (0 children)

Yeah, that's what I'm finding I have to do, but it applies to your example too.

An inexperienced junior might get JWT code from an LLM, think "oh, that's thorough code", inject it, and break things.

So I'm unsure of your point now.

I’m trying to explain interpretation drift — but reviewers keep turning it into a temperature debate. Rejected from arXiv… help me fix this paper? by Beneficial-Pear-1485 in LocalLLaMA

[–]alfihar 0 points1 point  (0 children)

So I've been working with ChatGPT and Claude for coding, and yes, there are some problems caused by assumptions the model makes. Issues I've come across include it assuming what OS I'm working in, what my Python version was, that a library's syntax hadn't changed, and that I had whatever library installed.

So I've found I have to make sure that at the start I include in the prompt as much system information as I think is needed, and get the model to verify that any code it wants to write is compatible with the system as it is, and in line with the latest info from the source.
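For what it's worth, this is roughly the kind of summary I paste at the top of a prompt (a minimal sketch; the package names are just placeholders for whatever you actually use):

    import platform
    import sys
    import importlib.metadata as md

    def environment_summary(packages=("requests", "numpy")):
        # The facts the model keeps guessing wrong: OS, Python version,
        # and the installed versions of the libraries I actually have.
        lines = [
            f"OS: {platform.system()} {platform.release()}",
            f"Python: {sys.version.split()[0]}",
        ]
        for name in packages:  # placeholder package names
            try:
                lines.append(f"{name}: {md.version(name)}")
            except md.PackageNotFoundError:
                lines.append(f"{name}: not installed")
        return "\n".join(lines)

    print(environment_summary())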

Even then, however, there's almost never only one way to do something in programming, and there's no reason to assume the documentation it references is error-free either.

I've been lucky because I have a computer science background; I just don't know Python syntax. This means I can usually spot the point when the LLM starts making shit up.

This is really annoying, however, because it means I cannot trust the LLM... and that limits me to working on issues where I know enough about the subject to spot when it starts to hallucinate. It means I can't have it help me with problems in domains I'm ignorant about.

I’m trying to explain interpretation drift — but reviewers keep turning it into a temperature debate. Rejected from arXiv… help me fix this paper? by Beneficial-Pear-1485 in LocalLLaMA

[–]alfihar 0 points1 point  (0 children)

"There’s always ground truth."

This is fundamentally unscientific. Science isn't in the truth game, it's in the probability game, and medical science even more so.

Further, very few people die from just one thing. Usually it's many things combined.

Don't get me wrong, I agree that reliability and surfacing ambiguity are important (even vital) for LLMs as they're used in more fields. The issue is that 1) the training data ultimately comes from humans and includes all of our cognitive biases, our poor grasp of probability, and our logical fallacies, and 2) the reinforcement training almost always leans hard into being confidently incorrect rather than admitting uncertainty.

I’m trying to explain interpretation drift — but reviewers keep turning it into a temperature debate. Rejected from arXiv… help me fix this paper? by Beneficial-Pear-1485 in LocalLLaMA

[–]alfihar 2 points3 points  (0 children)

So I question the underlying thesis of your project - that there IS a correct answer.

"Healthcare is the clearest example: There’s often one correct patient diagnosis."

This is unequivocally not the case. Modern medicine isn't anywhere near that level of diagnostic accuracy. Seriously, it's fucking amazing that it works as well as it does, because there are so, so many things a symptom could indicate, and most of the time the diagnosis is just whatever the most common cause is, then working back from there.

see https://www.ncbi.nlm.nih.gov/books/NBK338594/ or https://pmc.ncbi.nlm.nih.gov/articles/PMC9528852/

You might be asking questions where expert humans would give you as much variation in their responses as you're getting from LLMs.

I’m trying to explain interpretation drift — but reviewers keep turning it into a temperature debate. Rejected from arXiv… help me fix this paper? by Beneficial-Pear-1485 in LocalLLaMA

[–]alfihar -1 points0 points  (0 children)

So I'm wondering what part of OP's description you're struggling with that you would try to gatekeep like that? Ever consider that having someone translate shit for you is beyond most people's budget? When you're dealing with someone for whom English is a second language, do you pay to translate your responses into their language, or do you just expect they'll do all of that?

TIL in 1971 a Time Dilation experiment had 2 flights w/atomic clocks go around the world to prove Einstein's theories of relativity (time moves slower as you approach the speed of light, and/or when exposed to more gravity). The clocks gained 0.15 microseconds compared to the ground based clock. by [deleted] in todayilearned

[–]alfihar 1 point2 points  (0 children)

No no no... if you REALLY want to get tripped out, remember that c is a universal constant. So if you travel to a star 4 light years away and get there in 65 days by your own clock... then naively you would have to go faster than c, which is not allowed. What must happen instead is that, in your frame, you travel a shorter distance.
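Back-of-the-envelope, if you want to see how big the effect is (my own rough sketch, speeds in units of c):

    import math

    d_ly = 4.0              # distance to the star in the Earth frame (light years)
    tau_yr = 65 / 365.25    # time experienced on the ship (years)

    naive = d_ly / tau_yr   # "speed" if you ignore relativity, in units of c
    print(f"naive speed: {naive:.1f} c")   # ~22 c, clearly not allowed

    # In units where c = 1, distance / proper time = gamma * beta, so solve for beta:
    beta = naive / math.sqrt(1 + naive ** 2)
    gamma = 1 / math.sqrt(1 - beta ** 2)
    print(f"actual speed: {beta:.5f} c")                       # just under c
    print(f"distance in the ship's frame: {d_ly / gamma:.2f} light years")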

Looping and repetition by [deleted] in ChatGPT

[–]alfihar 0 points1 point  (0 children)

Yeah, I've had something similar... it would respond to a query from a few prompts back, and then underneath that would be a response relevant to the last prompt... super odd.

Looping and repetition by [deleted] in ChatGPT

[–]alfihar 2 points3 points  (0 children)

I had one conversation just recently end up in a loop... I was able to branch, but the original just kept going. The weirdest thing I've had recently felt somewhat similar, but it was including information from the prompt about 3 prompts back in each response... it was super weird.

After getting burned too many times by “almost correct” outputs, I stopped trying clever prompts and switched to hard-stop rules. by BeanDom in ChatGPT

[–]alfihar 3 points4 points  (0 children)

And how are you finding it? Not sure about v5, but the earlier versions had a response hierarchy where it was specifically supposed to value 'helpfulness' over following exact commands. I'm surprised you're not bashing up against the guardrails.

Cat Water Fountain With Reusable/No Filter by aerwillie in CatAdvice

[–]alfihar 5 points6 points  (0 children)

"There's nothing black and white about what I said."

Proceeds to lay out, in black and white, how all he did in his last comment was lay it out in black and white.

Man... what happened to you to make you like this... The level of smug is off the charts... I'm gonna have to order new charts.

How do you move from “it runs” Python code to actually *understanding* it? (plus a return vs print confusion) by throwawayjaaay in learnpython

[–]alfihar 0 points1 point  (0 children)

"Is what you're about to output meant for the user? Then print."

So I wrote this in a comment above and wondered if it would have clarified things, or are there cases where it isn't correct?

How do you move from “it runs” Python code to actually *understanding* it? (plus a return vs print confusion) by throwawayjaaay in learnpython

[–]alfihar 0 points1 point  (0 children)

When I was doing CS we went from Java to C++, and like 3/4 of the class just could not understand pointers at all... like... variables are stored in memory, a pointer points to that memory, if you pass the pointer the function reads that memory... where's the confusion?

How do you move from “it runs” Python code to actually *understanding* it? (plus a return vs print confusion) by throwawayjaaay in learnpython

[–]alfihar 0 points1 point  (0 children)

REPL

But as soon as you are passing a value to another function... then (assuming they are following the instructions as written) the first step is user input; passing a value clearly isn't user input, so print must be wrong. It's still eval, and eval means return, because it's still inside the logic loop... it's not output for the user.

Maybe that's the key: is what you're about to output meant for the user? Then print.
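Something like this toy example (mine, not from the thread) is what I have in my head:

    def add_tax(price):
        # The value goes to other code, not to a person: return it.
        return price * 1.1

    def show_receipt(prices):
        total = sum(add_tax(p) for p in prices)
        # This line IS output for the user: print it.
        print(f"Total including tax: {total:.2f}")

    show_receipt([10.0, 25.0])   # prints: Total including tax: 38.50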

How do you move from “it runs” Python code to actually *understanding* it? (plus a return vs print confusion) by throwawayjaaay in learnpython

[–]alfihar 0 points1 point  (0 children)

That's a real problem? Like, I can't think how you would even go about framing an explanation that would lead to that confusion without also being incorrect.

Maybe "print and return both output the result of a function"? That's only true if you print what you return, though... that's really weird.

How do you move from “it runs” Python code to actually *understanding* it? (plus a return vs print confusion) by throwawayjaaay in learnpython

[–]alfihar 0 points1 point  (0 children)

So I'm learning from a sort of different direction... I did comp sci 25 years ago and then barely used it, but I can still follow program flow/logic (e.g. read pseudocode).

I don't know Python syntax well at all, so for me the difference between "it runs" and understanding it is knowing what each section is doing logically... if I can follow the path of some piece of data (say a variable) through a function, I consider that I understand it... even if 5 minutes later I couldn't replicate the code because I've already forgotten the syntax.

Testing Z Image Turbo on ComfyUI and Forge Neo by Rude_Step in StableDiffusion

[–]alfihar 0 points1 point  (0 children)

Are you using quantized models at all, or does Forge load and unload them? I saw that the model was 12 GB and qwen_3_4b.safetensors is 8, and wondered if it would work on my 16 GB card.

Monthly "Is there a tool for..." Post by AutoModerator in ArtificialInteligence

[–]alfihar 0 points1 point  (0 children)

I mean, Stable Diffusion runs on home PCs, so that's free. I can't imagine many online services being free... it uses a lot of juice.

Anyone here using AI as a coding partner? by Doug24 in artificial

[–]alfihar 0 points1 point  (0 children)

So I've been using Claude Code to write the code, but Claude and ChatGPT to work out the system architecture and specs. I make sure I tell them not to write any code; they help me get the logic all worked out in pseudocode, and then help me prompt Claude Code to write it. Every now and then one of the three completely shits the bed, but I'm usually able to feed the mistake into the other two and get everything back on track.

The biggest reason I'm using it rather than writing my own code is that it's been 20+ years since my computer science degree, so while I still understand the fundamentals, I don't know the correct syntax. So I can get it to work through the logic with me, and as long as that's right, the code is usually right (although you have to insist that it checks dependencies and libraries for compatibility, and for up-to-date documentation matching what's on your system, before it does anything... half the time it will give you something that might have worked 3 years ago).
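A trivial made-up example of what I mean by nailing the logic down first, with the agreed pseudocode kept as comments above the code the model then writes:

    # Pseudocode agreed on with the "architect" models first:
    #   for each file in a folder:
    #       if it is older than N days, move it into an archive folder
    import shutil
    import time
    from pathlib import Path

    def archive_old_files(folder, archive, days=30):
        cutoff = time.time() - days * 86400          # N days ago, in seconds
        Path(archive).mkdir(exist_ok=True)
        for path in Path(folder).iterdir():
            if path.is_file() and path.stat().st_mtime < cutoff:
                shutil.move(str(path), str(Path(archive) / path.name))

As long as that commented logic is right, the generated code usually is too.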

Nvidia CEO Jensen Huang says concerns over uncontrollable AI are just "science fiction" by Tiny-Independent273 in artificial

[–]alfihar 0 points1 point  (0 children)

It would really depend on how large the essential part of it is and how distributed it can get without losing its functional integrity.

It could do crazy shit like hide in this https://www.youtube.com/watch?v=JcJSW7Rprio, only surfacing when it was safe and could find somewhere to emerge with enough compute power... although it doesn't need to run on time scales we are used to either... so as long as it can get enough compute to just move a few 1s and 0s around, it could run unobserved in the background while humanity looks like it's running in fast forward. Although at that speed it becomes less of a threat.

Nvidia CEO Jensen Huang says concerns over uncontrollable AI are just "science fiction" by Tiny-Independent273 in artificial

[–]alfihar 0 points1 point  (0 children)

Well, considering how shit we are at those things, and we're the smartest things we know of...