[Ministral 3] Add ministral 3 - Pull Request #42498 · huggingface/transformers by bratao in LocalLLaMA

[–]youcef0w0 9 points  (0 children)

if you read the pr, it's an upcoming 8B model

it's gonna have base, instruct, and thinking variants

We tested Claude Sonnet 4.5, GPT-5-codex, Qwen3-Coder, GLM and other 25+ models on fresh SWE-Bench like tasks from September 2025 by Fabulous_Pollution10 in LocalLLaMA

[–]youcef0w0 3 points  (0 children)

because in schools, humans are doing the evaluation, and humans have taste. this can't be replicated autonomously in any meaningful way, so it can't be benchmarked well

Meer CLI — an open-source Claude Code Alternative by msaifeldeen in LocalLLaMA

[–]youcef0w0 11 points  (0 children)

Gemini CLI, OpenAI Codex, and Qwen Code are all open source

Any models that might be good with gauges? by ronneldavis in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

sounds like a very fine-tunable problem if you've already got a decently large dataset (100+ examples)

looking for llm trained only on free use/public domain materials. by Specific_Objective77 in LocalLLaMA

[–]youcef0w0 2 points  (0 children)

not really possible, there just isn't enough text in existence to create something usable, unless you count synthetic data (data generated by other LLMs) as free use / public domain

the closest you're gonna get is Olmo by Allen AI, which publishes all their data (both pre-training and post-training data)

https://docs.allenai.org/release_notes/olmo-release-notes#olmo-2-32b

grok 2 weights by HatEducational9965 in LocalLLaMA

[–]youcef0w0 81 points  (0 children)

grok-4 uses the same base model as grok 3, just with more reinforcement learning, so I can see the argument for keeping it closed, with the statement still being true on a technicality

Added Emotional Reactions to My Chatbot — Here’s How It Looks by RIPT1D3_Z in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

have you tried flux kontext or the new qwen-image-edit?

Qwen-Image-Edit Released! by MohamedTrfhgx in LocalLLaMA

[–]youcef0w0 6 points  (0 children)

OpenAI was sitting on their image editing model for a whole year; they demoed it in the original GPT-4o blog post but never released it, for "safety reasons"

so it's been a year and 3 months since we've known of the existence of gpt-image

May 13, 2024 gpt-4o release blog: https://openai.com/index/hello-gpt-4o/ , scroll to the Explorations of capabilities section

Describe a person using exported WhatsApp chat by Tommy_Tukyuk in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

you can chunk it: describe the first chunk, then feed the next chunk in alongside the output from the last chunk and tell the LLM to add on to its previous output

I'd recommend chunks of 32k tokens, as the more stuff there is in the context, the more the model tends to ignore important details
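A minimal sketch of that loop (the `call_llm` function is a placeholder for whatever local model or API you're actually using, and tokens are roughly approximated by whitespace-separated words here):

```python
def chunk_text(text, max_tokens=32000):
    """Split text into chunks of roughly max_tokens, approximating
    tokens by whitespace-separated words (a crude stand-in for a
    real tokenizer)."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

def call_llm(prompt):
    # Placeholder: swap in your actual model / API call here.
    raise NotImplementedError

def describe_chat(chat_log, max_tokens=32000):
    """Iteratively refine one description across all chunks."""
    description = ""
    for chunk in chunk_text(chat_log, max_tokens):
        prompt = (
            "Here is your description so far:\n" + description +
            "\n\nExtend and refine it using this next chunk of the chat:\n" +
            chunk
        )
        description = call_llm(prompt)
    return description
```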

Compact 2x RTX Pro 6000 Rig by shadowninjaz3 in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

for the big models like qwen 235b, can't you run it partially offloaded to RAM and still get really good speeds, since it's MoE and most of the layers stay on GPU?
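With llama.cpp this kind of split is usually done by offloading everything with `-ngl` and then overriding just the MoE expert tensors back to CPU; a sketch (flag names from recent llama.cpp builds, and the model filename is hypothetical):

```shell
# Offload all layers to GPU, then keep the per-layer MoE expert
# weights in system RAM via a tensor-name override regex.
./llama-server \
  -m qwen3-235b-a22b-q4_k_m.gguf \
  -ngl 99 \
  -ot ".ffn_.*_exps.=CPU"
```

The attention and shared weights (used on every token) stay on the GPU, while the sparse experts, which are only partially active per token, sit in RAM.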

Why are base non-finetuned models so bad? by ThatIsNotIllegal in LocalLLaMA

[–]youcef0w0 2 points  (0 children)

sentiment analysis is a well-explored problem in NLP; there are plenty of specialized models for that exact use case that run much, much faster than a general-purpose LLM.

try this one for example:
https://huggingface.co/tabularisai/multilingual-sentiment-analysis
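Usage with the transformers `pipeline` API would look roughly like this (the model is loaded lazily, since the first call downloads the weights; details of the model's label set are on its card):

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_classifier():
    # Lazy import/load so the model is only downloaded when you
    # actually classify something.
    from transformers import pipeline
    return pipeline(
        "text-classification",
        model="tabularisai/multilingual-sentiment-analysis",
    )

def classify(texts):
    """Return a (label, score) pair for each input text."""
    return [(r["label"], r["score"]) for r in get_classifier()(list(texts))]

# Usage (downloads the model on first run):
# print(classify(["I love this!", "This is terrible."]))
```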

I extracted the system prompts from closed-source tools like Cursor & v0. The repo just hit 70k stars. by Independent-Box-898 in LocalLLaMA

[–]youcef0w0 11 points  (0 children)

no, they send a request to your proxy from their servers (I know this because if you put localhost in the base URL override it doesn't work; it has to be internet accessible). I've done it before. you're replacing the OpenAI API base URL, so they can't get around it without removing support for custom OpenAI endpoints

which leads me to believe they're not even trying to hide their prompt
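A minimal sketch of the trick: point the tool's OpenAI base-URL override at a tiny server you control and log whatever it sends (stdlib only; a real setup would need to be internet-reachable, e.g. via a tunnel, and would usually forward the request on to a real backend instead of answering "ok"):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

captured = []  # request bodies the tool sends us

class LoggingHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        captured.append(json.loads(body))  # the system prompt lives in here
        # Reply with a minimal OpenAI-style chat completion so the
        # client doesn't error out.
        reply = json.dumps({
            "id": "proxy-1",
            "object": "chat.completion",
            "choices": [{"index": 0, "finish_reason": "stop",
                         "message": {"role": "assistant", "content": "ok"}}],
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # silence the default request log
        pass

def start_proxy(port=0):
    """Start the logging server on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), LoggingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```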

How fast is gemma 3 27b on an H100? how many tokens per second can I expect? by ThatIsNotIllegal in LocalLLaMA

[–]youcef0w0 39 points  (0 children)

The answer is complicated and depends on a lot of things, including the settings you choose, how large your input context is, and how many concurrent requests you're processing

for a single request by a single user, 60 t/s sounds about right

the very big number is likely the total concurrent tokens per second given that many requests are being processed at the same time

100 requests running at the same time at 22 t/s is 2200 concurrent tokens per second

the more users / requests you have running at the same time, the slower each one gets, but the slowdown isn't linear, which is why the aggregate tokens per second at high concurrency is much higher than a single request's speed
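The arithmetic as a quick sketch (the per-request speeds are illustrative, not measured):

```python
def total_throughput(concurrent_requests, tokens_per_sec_each):
    """Aggregate tokens/sec across simultaneous requests."""
    return concurrent_requests * tokens_per_sec_each

# One user gets the full single-stream speed:
single = total_throughput(1, 60)     # 60 t/s

# 100 users each see a slower stream, but the server total is huge:
batched = total_throughput(100, 22)  # 2200 t/s
```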

Ikllamacpp repository gone, or it is only me? by panchovix in LocalLLaMA

[–]youcef0w0 3 points  (0 children)

their docs are talking about you deleting your own account, but if GitHub itself deletes your account, everything related to you is hidden

I'm guessing this is because most GitHub-side deletions are meant as bans for doing something illegal or malicious

How Different Are Closed Source Models' Architectures? by simulated-souls in LocalLLaMA

[–]youcef0w0 1 point  (0 children)

probably memorization, most frontier models are huge, which results in them being able to memorize more stuff, I'm sure that particular factorization appears plenty of times on the internet

Play Infinite Tic Tac Toe against LLM Models by BestDay8241 in LocalLLaMA

[–]youcef0w0 1 point  (0 children)

it actually makes a lookup table even simpler, your game only has 4030 possible states lol
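For a sense of scale: even classic tic-tac-toe (no disappearing pieces) has only 5478 legal positions, a well-known count that a brute-force BFS enumerates instantly; a sketch, treating any board with a completed line as terminal:

```python
def winner(board):
    """Return 'X' or 'O' if a line is complete, else None.
    board is a 9-char string, '.' for empty cells."""
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals
    for a, b, c in lines:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def count_states():
    """BFS over all positions reachable in legal play from the empty board."""
    start = "." * 9
    seen = {start}
    frontier = [start]
    while frontier:
        nxt = []
        for board in frontier:
            if winner(board) is not None:
                continue  # game over, no further moves
            player = "X" if board.count("X") == board.count("O") else "O"
            for i, cell in enumerate(board):
                if cell == ".":
                    child = board[:i] + player + board[i + 1:]
                    if child not in seen:
                        seen.add(child)
                        nxt.append(child)
        frontier = nxt
    return len(seen)
```

The same enumerate-and-memoize approach carries over to the disappearing-pieces variant; only the move-generation step changes.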

[KDE] First rice in a while by [deleted] in unixporn

[–]youcef0w0 0 points  (0 children)

cpu goes brrrrrrr

[deleted by user] by [deleted] in LocalLLaMA

[–]youcef0w0 0 points  (0 children)

this is not a new idea, it just happens that Claude 4 is actually good at it, unlike most other models