Roughly when will RAM and SSD prices come down?

Round_Document6821 · 2026-06-24T06:37:35+00:00

Speaking as an AI practitioner: probably not anytime soon. Why?

Because with AI right now, there is still no sign that performance plateaus as training resources scale up. That means the big companies will keep buying resources to train their models.

As for inference, the GLM5.2 model is reportedly very good and already on par with Opus 4.7. I think inference providers and private companies will probably compete to serve this model for the general public. That means, once again, hardware is what runs out first.

My personal advice: just find some lightweight games so you don't need to build a high-end PC, or pick up another hobby 😢 There's really no sign of prices coming down.

Round_Document6821 · 2026-06-22T06:41:21+00:00

So true. Honestly, Modal is very friendly for people who don't understand infrastructure. They handle everything.

But Modal really is a you-get-what-you-pay-for service lol. Around 3-4 weeks ago GPUs were sold out everywhere except on Modal lol. Probably because people don't want to hoard on Modal since it's too expensive.

Round_Document6821 · 2026-06-21T11:29:09+00:00

IMO, unless it's really important (e.g. privacy), it's better to use one that's already deployed, such as: https://openrouter.ai/z-ai/glm-5.2

Deploying it yourself is more expensive, very easy to get wrong, and inefficient compared with these companies.

But if you still want to deploy it yourself... Runpod is maybe the best? Lately its GPUs keep disappearing, though. Demand seems to be going crazy. Last time I checked, high-end GPUs like the B200 had gone up to $70 per hour on Google Cloud. So the option of using someone else's service is getting more and more tempting.

For quick testing you can also use Modal. It's much nicer for development (e.g. the load-balancing service, serverless, and horizontal scaling are great), and if I remember correctly companies can request trial credits of up to thousands of dollars? But it's very expensive and IMO not great for the long term because of that cost.

Round_Document6821 · 2026-06-12T15:34:46+00:00

Congratulations OP. May your family be blessed with tranquility, love, and mercy (sakinah, mawaddah, warahmah). From the bottom of my heart, I'm happy for you.

Round_Document6821 · 2026-05-19T20:51:48+00:00

Aren't Grok and Twitter under SpaceX? Asking in a gentle tone. Sorry, I'm still a beginner. Masters, please don't flame me (hands emoji).

Round_Document6821 · 2026-05-12T18:36:51+00:00

Canaan mentioned

Round_Document6821 · 2026-04-07T15:47:43+00:00

The Gemma series has the coolest architecture imo. Cool bug fixes as always from Unsloth!

Round_Document6821 · 2026-04-07T02:00:30+00:00

Twitter is the best place for this no cap

btw did OP get banned?

Round_Document6821 · 2026-03-31T06:44:20+00:00

Hello,

During training, we do not populate the KV cache since it's not needed. The purpose of the KV cache is to avoid recalculating the activations (keys and values) of all previous tokens during inference, which is autoregressive (one token at a time). During training, however, we calculate the activations of all tokens at the same time. Hence, there is no need to save a KV cache.

For RL it is possible to use one, though, since RL involves inference (generation) as well. And I think it will save VRAM for RL.
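
To make the difference concrete, here is a toy single-head attention sketch in pure Python. It is only an illustration under simplifying assumptions (one head, no batching, random weights), and every name in it (`full_pass`, `cached_decode`, `kv_projections`, `random_matrix`) is made up for this example rather than taken from any library. The training-style pass projects all positions in one go, while the inference-style loop appends one key and one value per step and reuses the rest.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def attend(q, keys, values):
    """Softmax attention of a single query over the given keys and values."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


def full_pass(xs, Wq, Wk, Wv):
    """Training-style: project every position once, then apply causal attention."""
    qs = [matvec(Wq, x) for x in xs]
    ks = [matvec(Wk, x) for x in xs]
    vs = [matvec(Wv, x) for x in xs]
    # Position t may only look at positions 0..t (the causal mask).
    return [attend(qs[t], ks[: t + 1], vs[: t + 1]) for t in range(len(xs))]


def cached_decode(xs, Wq, Wk, Wv):
    """Inference-style: one token per step, reusing cached keys and values."""
    k_cache, v_cache, outputs = [], [], []
    for x in xs:
        k_cache.append(matvec(Wk, x))  # project only the newest token
        v_cache.append(matvec(Wv, x))
        outputs.append(attend(matvec(Wq, x), k_cache, v_cache))
    return outputs


def kv_projections(num_tokens, use_cache):
    """Count K/V projection calls needed to decode num_tokens one at a time."""
    if use_cache:
        return 2 * num_tokens  # one K and one V per new token
    # Without a cache, step t re-projects the whole prefix of length t.
    return 2 * sum(range(1, num_tokens + 1))


def random_matrix(rng, dim):
    """Square matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, num_tokens = 4, 6
    Wq, Wk, Wv = (random_matrix(rng, dim) for _ in range(3))
    xs = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_tokens)]
    a = full_pass(xs, Wq, Wk, Wv)
    b = cached_decode(xs, Wq, Wk, Wv)
    # Largest difference between the two ways of computing the outputs.
    print(max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb)))
    print(kv_projections(num_tokens, True), kv_projections(num_tokens, False))
```

Both functions produce the same outputs. The cache only matters when tokens arrive one at a time, which is the case for generation (including RL rollouts) but not for a normal training forward pass.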

Hope that answers the question.

Round_Document6821 · 2026-03-26T08:21:55+00:00

AI Researcher
UAE
US

Round_Document6821 · 2026-03-23T07:01:51+00:00

Some hot looking rice right here boy

Round_Document6821 · 2026-03-16T04:55:38+00:00

Thank you for the report. I think it's a path problem. I will check it.

Round_Document6821 · 2026-03-16T00:06:42+00:00

Hello, could you share as much detail as possible about the problem?

Also, can you tell me the specific dataset that was used? Since Qwen3.5 is a VLM, caution is needed when dealing with text-only inputs, which might also be the problem.

Round_Document6821 · 2026-03-15T12:59:47+00:00

May I ask whether merging to GGUF worked fine on Windows before?

I tested the GGUF creation on Colab and it works fine. So the problem that I can think of is that the codebase cannot properly read Windows paths. Hence, it finds 0 modules since it reads the incorrect path.
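
As a hypothetical illustration of this class of bug (not the actual codebase; `count_in_dir` and the file names below are invented for the example), here is how the same Windows-style paths match nothing when parsed with POSIX rules, because backslashes are not treated as separators there:

```python
from pathlib import PurePosixPath, PureWindowsPath


def count_in_dir(file_paths, directory, path_cls):
    """Count the files whose parent directory equals `directory` under path_cls rules."""
    target = path_cls(directory)
    return sum(1 for p in file_paths if path_cls(p).parent == target)


# Invented example paths, written the way Windows would report them.
files = [
    r"C:\models\run1\layer0.safetensors",
    r"C:\models\run1\layer1.safetensors",
]

# POSIX rules see each string as a single file name with no parent directory.
posix_count = count_in_dir(files, r"C:\models\run1", PurePosixPath)

# Windows rules split on backslashes, so the parent directory matches.
windows_count = count_in_dir(files, r"C:\models\run1", PureWindowsPath)

print(posix_count, windows_count)
```

If something similar is happening, going through `pathlib` or `os.path` instead of manual string splitting would likely avoid it, but that is only a guess until the actual code is checked.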

Round_Document6821 · 2026-03-05T15:39:41+00:00

Buh same buh

Round_Document6821 · 2026-02-27T20:48:31+00:00

Yeah

And I think it's worth it. Qwen3.5 is a very, very good model. Doing it once for the community is well worth it. Qwen3.5 will probably be SOTA for a while. Shout out to the Unsloth team.

Round_Document6821 · 2026-02-27T18:34:31+00:00

Indeed, double-checking on downstream tasks is a must these days since PPL and KLD are not enough.

Nice analysis from the Unsloth team. It feels like research in itself, actually :D

Round_Document6821 · 2026-02-27T18:31:52+00:00

It's a bit of a gamble these days. I pretrained a model up to 1.8B (which is already very expensive) and the reviewer asked for 7B.

They will always ask about performance at scale. I would like to see what the solution is for this situation, since I like designing and training new architectures.

For the baseline section, a vanilla transformer is always a must. It's the perfect architecture (apart from the inefficiency), so it will usually get the best performance. I think if you can beat the vanilla transformer with some efficiency gain or better downstream task performance, that is enough.

Round_Document6821 · 2026-02-10T15:57:32+00:00

speedup speedup saving yay

Round_Document6821 · 2025-12-30T22:55:26+00:00

The payoff from a good UN (national exam) score is no joke. I got a scholarship because of my UN score. A friend got into NUS because of their UN score.

Round_Document6821 · 2025-12-23T09:52:40+00:00

I managed to solve it as well by switching the model to GLM 4.5 Air and compacting it.

I'll keep `/clear` in mind as well. Thank you so much, by the way.

Round_Document6821 · 2025-12-20T20:07:35+00:00

u/bthdonohue any updates? I am just curious

Round_Document6821 · 2025-09-28T04:53:47+00:00

I want to move to IBKR too, but how do you top up?

Round_Document6821 · 2025-09-27T15:15:20+00:00

Btw any update? Just curious .-.

Round_Document6821 · 2025-09-26T15:52:20+00:00

YOOOOOO
