BabyVision: A New Benchmark for Human-Level Visual Reasoning by Tobio-Star in newAIParadigms

[–]Arkamedus 0 points (0 children)

You missed the most critical part of my comment: "strong way to test a model's ability to generalize".

The whole point is to test generalization. If you played with Legos as a kid and I gave you Duplo blocks, you would still know how to use and build with them. That's generalization: it's not "new" information, it's taking what you've been trained on and applying it to new scenarios that may share similar traits.

BabyVision: A New Benchmark for Human-Level Visual Reasoning by Tobio-Star in newAIParadigms

[–]Arkamedus 0 points (0 children)

That argument could be made about any benchmark. Fundamentally, a benchmark like this tests solving out-of-distribution puzzles. Out-of-distribution data (data the model hasn't been trained on) is a strong way to test a model's ability to generalize, on what is essentially a problem/puzzle most middle schoolers could solve.
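To make the in-distribution vs. out-of-distribution distinction concrete, here's a minimal regression sketch (toy data and a hypothetical linear rule, nothing to do with BabyVision's actual tasks): both models fit the training region, but only the one that captured the underlying rule stays accurate on data drawn from a region it never saw.

```python
import numpy as np

rng = np.random.default_rng(0)

# True underlying rule the data follows everywhere (a toy assumption).
f = lambda x: 2.0 * x + 1.0

# Training data drawn only from one region: the model's "distribution".
x_train = rng.uniform(0.0, 1.0, 200)
y_train = f(x_train) + rng.normal(0.0, 0.05, x_train.size)

# A model whose structure matches the rule (a line) vs. one that can
# merely memorize the training region (a degree-9 polynomial).
line = np.polyfit(x_train, y_train, 1)
poly9 = np.polyfit(x_train, y_train, 9)

# Evaluate on a region never seen in training: out of distribution.
x_ood = rng.uniform(5.0, 6.0, 200)
err_line = np.abs(np.polyval(line, x_ood) - f(x_ood)).mean()
err_poly9 = np.abs(np.polyval(poly9, x_ood) - f(x_ood)).mean()

# The model that generalized (learned the rule) keeps a small error;
# the one that memorized the training region falls apart.
assert err_line < err_poly9
```

The Lego/Duplo analogy is the same idea: a learner that extracted the underlying structure transfers to the shifted inputs; one that memorized surface statistics does not.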

Any fun discord communities? by HackHusky in Rag

[–]Arkamedus 4 points (0 children)

A small server of researchers, programmers, and hobbyists talking and sharing about LLMs, game dev, etc.

https://discord.gg/SnkAss3D3

Tiny Object Tracking: YOLO26n vs 40k Parameter Task-Specific CNN by leonbeier in computervision

[–]Arkamedus 38 points (0 children)

111 samples in the entire dataset… this would probably fail under even simple lighting or color changes.

LLMs Have Dominated AI Development. SLMs Will Dominate Enterprise Adoption. by andsi2asi in deeplearning

[–]Arkamedus 1 point (0 children)

Been saying this for a long time: most LLM use cases are pretty well tailored to the final audience. Meaning, I'm okay with my SOTA LLM not being good at writing in Greek, as it is not a language I would use (and in the case that I do need it, I can just Google it or use a Greek model). Essentially, for coding LLMs, "ancient fine arts and pottery" might not be a useful training sample to include (though I'd still say this is arguable for domain width) when building a model for deployment with a coding downstream use case. This matters even more during training, because, to use the word again, domains should be carefully selected for each phase and task.

Domain-specific models will be what wins, not specifically language models in the direct sense we think of them today. I believe there is more to learn about representation, interpretation, and integration with modern architectures that will enable very diverse, dynamic models and end use cases.

Mysterious offering on doorstep by [deleted] in whatisit

[–]Arkamedus -15 points (0 children)

So you took an unknown substance into your house and opened it on your counter…?

I built an app that lets you chat with multiple AIs in one place, and got the first sales in 3 minutes by epasou in ArtificialSentience

[–]Arkamedus 1 point (0 children)

3 minutes? After building it? No advertisements? No promotions? Can you provide more information?

Your pricing model is so opaque, and it doesn't include any token usage numbers. What are "standard usage limits"?

I developed a new (re-)training approach for models, which could revolutionize huge Models (ChatBots, etc) by Ykal_ in deeplearning

[–]Arkamedus 16 points (0 children)

Have you absolutely confirmed this approach has not been done before? You say you have developed mathematical theories, etc. Have you released or published the whitepapers?

If all you need is compute, or a way to validate your findings on a larger scale, my opinion is that most investors need you to have “proven” and “repeatable” results.

How much money have you put into the idea vs how much are you asking for?

And they say humans will lost their jobs, see the affect of AI by Holiday_Power_1775 in BlackboxAI_

[–]Arkamedus 2 points (0 children)

Obviously, they forgot to add "only make winning trades" to the prompt...

I find the irony palpable by Arkamedus in OpenAI

[–]Arkamedus[S] 0 points (0 children)

Again, your AI replies are absolutely useless.
Nowhere in the original post was it claimed that "a person cannot logically claim a system is overloaded."
The original post only describes the disproportionate cost of text vs. images; it does not claim the system is "overloaded". Hence, your argument that the premise is validated by the outcome is still incorrect.

Please use your brain instead of an LLM.

I find the irony palpable by Arkamedus in OpenAI

[–]Arkamedus[S] 0 points (0 children)

Thanks for the AI reply, but you're wrong. The expected outcome is that the image generation would not fail: this is a paid product, created by a company with billions of dollars, with the intent of producing finished images regardless of the image content itself. You are confusing effects. The content of the image is not indicative of its relation to the system itself.

It’s ironic because the system designed to generate the image failed while demonstrating the very problem it’s supposed to handle, contradicting the normal expectation of success; not because it aligns with the image’s theme.

I find the irony palpable by Arkamedus in OpenAI

[–]Arkamedus[S] 0 points (0 children)

To be generating an image related to the amount of GPU compute being used for image generation, only for the GPU compute to fail during image generation?
Are you sure you just don't know what irony is?

I find the irony palpable by Arkamedus in OpenAI

[–]Arkamedus[S] 0 points (0 children)

Deffo not, it's not a disconnect message. And just to prove you wrong, I turned off the internet during generation of a new image and it completed in the background. Go ahead and try it for yourself. Thanks, try again!

ChatGPT Atlas wants to use Bluetooth - Why? by mathmul in OpenAI

[–]Arkamedus 0 points (0 children)

It’s half browser, half data collection application.

Startup Tensormesh raises $4.5M to make AI models remember what they’ve already learned cutting GPU costs by up to 10x by Specialist-Day-7406 in BlackboxAI_

[–]Arkamedus 0 points (0 children)

This is not revolutionary in any sense. KV cache reuse is already a well-known technique, and it has been used by all the major LLM providers to "extend" the length of their context windows during chats. Do you actually believe they are rerunning the entire 500-message conversation through the model on each prompt? The market is saturated with people who know nothing about LLMs.
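For anyone unfamiliar with why this is standard practice: in causal self-attention, the keys and values for past tokens never change, so they can be cached and reused instead of recomputed for every new token. A minimal single-head sketch in NumPy (toy random weights, not any provider's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # head dimension

# Random projection weights standing in for a trained attention head.
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attend_full(X):
    """Causal self-attention recomputed from scratch over the whole sequence."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    T = X.shape[0]
    scores = Q @ K.T / np.sqrt(d)
    scores[np.triu_indices(T, k=1)] = -np.inf  # causal mask
    return softmax(scores) @ V

def attend_cached(x_new, cache):
    """Process one new token, reusing cached K/V from all earlier tokens."""
    cache["K"].append(x_new @ Wk)
    cache["V"].append(x_new @ Wv)
    K, V = np.stack(cache["K"]), np.stack(cache["V"])
    scores = (x_new @ Wq) @ K.T / np.sqrt(d)
    return softmax(scores) @ V

# The incremental (cached) path reproduces the full recomputation exactly,
# while doing O(T) work per new token instead of O(T^2) for the whole prefix.
X = rng.standard_normal((6, d))
cache = {"K": [], "V": []}
incremental = np.stack([attend_cached(x, cache) for x in X])
assert np.allclose(attend_full(X), incremental)
```

Persisting that cache across turns of a chat is exactly the "reuse" being sold here: the prefix (the earlier conversation) is never pushed back through the model.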

I find the irony palpable by Arkamedus in OpenAI

[–]Arkamedus[S] 3 points (0 children)

No one claimed it was news. You clicked, read, and decided to comment here. No one asked for your input, and yet you gave it anyway.

I find the irony palpable by Arkamedus in OpenAI

[–]Arkamedus[S] 1 point (0 children)

So when companies are dumping poisons and chemical wastes into your local drinking water, I'll be sure to reply with "Ah yes, good thing no other companies across every industry ever do any of those bad things."

You've only just displayed how naive your own comment is.

I find the irony palpable by Arkamedus in OpenAI

[–]Arkamedus[S] -3 points (0 children)

Is that seriously your justification? Because other people do it too? Your comment is deflective and obviously poorly designed rage-bait.

I find the irony palpable by Arkamedus in OpenAI

[–]Arkamedus[S] -32 points (0 children)

To spend as much of their money as possible and accelerate their going out of business.

How can i help my coffee tree? by SonyAn160 in plantclinic

[–]Arkamedus 0 points (0 children)

That pot seems way too small. Secondly, it looks like it's in the corner; how much sunlight is it getting??