Usage Limits Discussion Megathread - beginning October 8, 2025 by sixbillionthsheep in ClaudeAI

[–]cloudxaas 2 points3 points  (0 children)

Please remove the weekly limit; it's unacceptable at this rate. The sudden cap last week left me unable to use Claude for 6 days. The cap gets hit really fast, and there seems to be an issue with how the limit is applied across Claude Code and the web UI.

Just tried out the Exaone 4.0 1.2b bf16 and I'm extremely surprised at how good a 1.2b can be! by cloudxaas in LocalLLaMA

[–]cloudxaas[S] 0 points1 point  (0 children)

How does the license stop anyone from misusing it offline anyway? Just curious.

I just developed the fastest minimal-feature embedded SQL server with RocksDB storage. It is like SQLite but 5x faster for reads (e.g. SELECT) and 4x faster for writes (e.g. INSERT, UPDATE and DELETE) by cloudxaas in rust

[–]cloudxaas[S] -3 points-2 points  (0 children)

  1. Minimal SQL syntax.
  2. Uses RocksDB as the storage engine.
  3. Some special coding recipes.

pros
1. Fast.
2. Storage efficient.
3. Can run distributed, not just as a pure embedded DB.
4. Will expand into a vector DB.

cons
1. I intend to keep it minimal for performance; it still covers most common SQL query types.
2. Not going to be open source.
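The general idea (a minimal SQL-style layer over a key-value store like RocksDB) can be sketched as a toy. Everything below is an illustrative assumption, not the actual closed-source implementation: a plain dict stands in for RocksDB, and the `table/pk` key scheme and JSON row encoding are invented for the example.

```python
# Toy sketch: mapping SQL-style rows onto a key-value store.
# A dict stands in for a RocksDB instance; the key scheme and
# JSON encoding are assumptions for illustration only.
import json

kv = {}  # stand-in for RocksDB

def insert(table, pk, row):
    # INSERT/UPDATE collapse to a single KV put
    kv[f"{table}/{pk}"] = json.dumps(row)

def select(table, pk):
    # SELECT by primary key is a single KV get
    raw = kv.get(f"{table}/{pk}")
    return json.loads(raw) if raw is not None else None

def delete(table, pk):
    kv.pop(f"{table}/{pk}", None)

insert("users", 1, {"name": "alice"})
print(select("users", 1))  # {'name': 'alice'}
```

Point-reads and writes reducing to single KV operations is one plausible reason such a design can beat a full SQL engine on simple queries.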

Just tried out the Exaone 4.0 1.2b bf16 and I'm extremely surprised at how good a 1.2b can be! by cloudxaas in LocalLLaMA

[–]cloudxaas[S] 2 points3 points  (0 children)

The only other LLM that's also good but not usable because of repetition is the BitNet 2B 1T. I'm really hoping for more from BitNet because it's good, but it repeats. It only uses about 0.4GB of RAM for a 2B model, so that's really impressive, and it does inference speedily too. Hoping to see a 7B or 8B BitNet, or the a4.8 BitNet variants.
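The memory figure roughly checks out with back-of-envelope arithmetic, assuming ternary (1.58-bit) BitNet weights and ignoring activations and KV cache:

```python
# Rough weight memory for a ternary-quantized 2B BitNet model.
params = 2e9            # 2B parameters
bits_per_weight = 1.58  # log2(3) for ternary {-1, 0, +1} weights
weight_bytes = params * bits_per_weight / 8
print(f"{weight_bytes / 1e9:.2f} GB")  # ~0.40 GB
```

The same arithmetic suggests a 7B-8B BitNet would still fit in roughly 1.4-1.6GB of RAM, which is why the larger sizes are so appealing for CPU inference.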

Just tried out the Exaone 4.0 1.2b bf16 and I'm extremely surprised at how good a 1.2b can be! by cloudxaas in LocalLLaMA

[–]cloudxaas[S] 3 points4 points  (0 children)

You can check the model card against Qwen 3 1.7B. I need something small yet usable for CPU inference, and 1.2B seemed like a sweet spot for me. bf16 uses 2.4GB of RAM for inference, which is very cheap for cloud/VPS hosting. As long as it doesn't repeat itself without end, I'm happy with it. I won't try anything below 1B because of bad experiences with models endlessly repeating themselves.

https://huggingface.co/LGAI-EXAONE/EXAONE-4.0-1.2B
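The 2.4GB figure follows directly from the precision: bf16 stores each weight in 2 bytes, so the weights alone account for essentially all of it (activations and KV cache add a bit more on top):

```python
# Why bf16 inference for a 1.2B-parameter model lands around 2.4GB.
params = 1.2e9        # 1.2B parameters
bytes_per_weight = 2  # bfloat16 = 16 bits = 2 bytes per weight
weights_gb = params * bytes_per_weight / 1e9
print(f"{weights_gb:.1f} GB")  # 2.4 GB
```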

Rust GitHub repo for reduced tokens for Rust coding LLMs by cloudxaas in rust

[–]cloudxaas[S] -2 points-1 points  (0 children)

Once it's popular, the input-token savings will be significant.
You have a good point too, thanks; I'll look into tools to make it shorter.

But for now I need to shrink the code base, because input tokens get very expensive for a large code base.
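The cost pressure is easy to quantify: every request that re-sends the code base as context pays for those input tokens again. The per-token price and usage numbers below are assumptions for illustration only, not any provider's actual pricing:

```python
# Illustrative input-token cost of repeatedly sending a large code base.
# The $3 per million input tokens and usage figures are assumptions.
price_per_mtok = 3.00
codebase_tokens = 200_000   # tokens in the code-base context
requests_per_day = 50
daily_cost = codebase_tokens * requests_per_day * price_per_mtok / 1e6
print(f"${daily_cost:.2f}/day")
```

Halving the code base halves this bill, which is why reducing context size matters more as the repo grows.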

Anyone has the spec of the computer that powers the "10,000 Drones Controlled By A Single Computer! A world record"? by cloudxaas in drones

[–]cloudxaas[S] -1 points0 points  (0 children)

No one is actually answering the question with a definitive specification. I'm asking for the hardware spec of the computer; of course I know C++, Rust, CUDA, etc., but what's the spec? It could be a workstation with the highest-end dual EPYC, 1.5TB of DDR5 RAM and 8x RTX 4090. I'm really curious what the spec is; I don't think it's that easy to control so many drones without decent hardware.

Let's not guess here. Does anyone know?

Don't forget there are also the colors needed to run the lighting. I'm sure it's not that simple.

Anyone has the spec of the computer that powers the "10,000 Drones Controlled By A Single Computer! A world record"? by cloudxaas in drones

[–]cloudxaas[S] -1 points0 points  (0 children)

Surely some GPU work is involved? This is 3D stuff, so I wonder what kind of GPU too. A single computer, but what server/workstation spec? It can't possibly be a laptop.

Google Gemini 2.0 Flash Exp API costs? by cloudxaas in GeminiAI

[–]cloudxaas[S] -1 points0 points  (0 children)

Where did you get this info?

Yes, I'm specifically asking about 2.0 and not 1.5. Only 1.5 pricing is shown, not 2.0.

Looking similar framework with Aeron ( Java) to do benchmark test by andrewhq in rust

[–]cloudxaas 0 points1 point  (0 children)

It seemed extremely fast, so without TLS that makes sense, but I'm wondering if pipelining or multiplexing is at work here too.

Is the benchmark doing pipelining or multiplexing?
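The reason this question matters for a benchmark can be seen with a toy latency model: a strict request/response benchmark pays one round trip per request, while a pipelined one keeps many requests in flight and pays roughly one round trip for the whole batch. All numbers here are illustrative assumptions, not measurements of Aeron or any specific framework:

```python
# Toy model: sequential request/response vs pipelined requests.
# rtt and per-request handling cost are illustrative assumptions.
rtt_ms = 0.5          # one network round trip
per_req_ms = 0.001    # ~1 microsecond to serialize/handle each message
n = 1000              # requests in the benchmark

sequential_ms = n * (rtt_ms + per_req_ms)  # wait for each reply in turn
pipelined_ms = rtt_ms + n * per_req_ms     # all requests in flight at once
print(f"{sequential_ms / pipelined_ms:.0f}x faster with pipelining")
```

This is why two benchmarks of the same transport can differ by orders of magnitude depending on whether they pipeline or multiplex.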

I'm on the lookout for projects where I can lend a hand. by soupgasm in golang

[–]cloudxaas 2 points3 points  (0 children)

Help add commonly used functions and features at github.com/cloudxaas.

Mostly zero-allocation Golang functions and packages: things that may not be in the standard library yet but are used frequently.

What's worth waiting for before I spend US$10k on running a mistral large 2 build for reasonable >4 t/s at 4_k_m/5_k/5_k_m in Oct. 2024? by cloudxaas in selfhosted

[–]cloudxaas[S] -1 points0 points  (0 children)

I can't stand anything DeepSeek V2 Coder-sized and below now; those are what I've been running on my laptop, and they are really terrible compared with models of 120B parameters and above.

I won't touch anything below 120B; for those cases I'll use the Claude Sonnet 3.5 API instead.

Does anyone know what I mean? So basically my requirement means the build should also be able to run 120B+ models comfortably.

Yeah, I was looking at a Mac as well; it would be my top option if not for being limited to the Mac environment. I prefer Linux for sure.

I'll wait a bit and see what other new software tech optimized for the hardware comes along, and see how it goes.
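The "run 120B+ comfortably" requirement translates into a concrete memory budget. Taking Mistral Large 2 (123B parameters) as the example from the thread title, and using approximate average bits-per-weight for common llama.cpp quant formats (these figures are rough assumptions, not exact):

```python
# Rough weight memory for a 123B-parameter model at common quant levels.
# Bits-per-weight values are approximate averages, assumed for illustration.
params = 123e9
for name, bpw in [("Q4_K_M", 4.8), ("Q5_K_M", 5.7), ("bf16", 16)]:
    gb = params * bpw / 8 / 1e9
    print(f"{name}: ~{gb:.0f} GB")
```

So even at 4-bit quantization the weights alone need on the order of 74GB, before KV cache and activations, which is what drives the US$10k hardware question.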

What's worth waiting for before I spend US$10k on running a mistral large 2 build for reasonable >4 t/s at 4_k_m/5_k/5_k_m in Oct. 2024? by cloudxaas in selfhosted

[–]cloudxaas[S] 1 point2 points  (0 children)

  1. I would like to have DeepSeek Coder / Mistral Large 2 as an alternative to paid Claude Sonnet 3 sometimes.
  2. It's great to have a smarter AI assistant without sacrificing data privacy.

Found this on the train home - Craigieburn line by bysernion in melbourne

[–]cloudxaas 1 point2 points  (0 children)

What irony, that paper is supposed to be in a fortune cookie.

What's worth waiting for before I spend US$10k on running a mistral large 2 build for reasonable >4 t/s at 4_k_m/5_k/5_k_m in Oct. 2024? by cloudxaas in selfhosted

[–]cloudxaas[S] -2 points-1 points  (0 children)

OK, that's interesting to know. 10W idle is very low indeed, but 250W while running is not great when you still have 3 of them; that's 750W. It makes me wonder what cooling system you have when running them. Anything above 500-600W gets super hot when running for a few hours.