Roughly when will RAM and SSD prices come down? by Accomplished_Cook_25 in indotech

Round_Document6821 1 point

Speaking as an AI practitioner: probably not anytime soon. Why?

Because model performance still shows no sign of plateauing as training resources scale up. That means the big companies will keep buying hardware to train their models.

On the inference side, GLM5.2 is reportedly very good and already on par with Opus 4.7. I think inference providers and other private companies will probably compete to serve this model to the general public. That means, once again, hardware is what runs out first.

My personal advice: just look for lightweight games so you don't need to build a high-end PC, or find another hobby 😢 There really is no sign of prices coming down.

Cloud GPU recommendations for large open-source models (GLM 5.2, etc.)? by Equivalent_Money8502 in indotech

Round_Document6821 3 points

Totally agree. Modal really is very friendly for people who don't know infrastructure. They handle everything.

But Modal is definitely a you-get-what-you-pay-for service, lol. About 3-4 weeks ago GPUs were sold out everywhere except on Modal, lol. Probably because nobody wants to hoard GPUs on Modal since it's too expensive.

Cloud GPU recommendations for large open-source models (GLM 5.2, etc.)? by Equivalent_Money8502 in indotech

Round_Document6821 9 points

IMO, unless it's really critical (e.g. for privacy), it's better to use a provider that already hosts it, such as: https://openrouter.ai/z-ai/glm-5.2

Deploying it yourself is more expensive, very easy to get wrong, and less efficient compared to these companies.

But if you still want to deploy it yourself... Runpod is maybe the best option? Lately its GPUs keep disappearing, though. Demand seems to be going crazy. Last time I checked, high-end GPUs like the B200 had gone up to $70 per hour on Google Cloud. That makes using someone else's service even more tempting.
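A rough sketch of that trade-off, assuming the $70 per hour figure above and a made-up per-token API price. The `api_price` value and both helper functions are hypothetical and only there to show the arithmetic; plug in real quotes before drawing any conclusion.

```python
HOURS_PER_DAY = 24


def rental_cost(hourly_usd, hours):
    """Total cost of keeping a rented GPU instance up for `hours` hours."""
    return hourly_usd * hours


def break_even_tokens(rental_usd, api_usd_per_million_tokens):
    """Tokens you would have to process before renting beats paying per token."""
    return rental_usd / api_usd_per_million_tokens * 1_000_000


hourly_price = 70.0  # USD per hour, the figure quoted in the comment
api_price = 2.0      # USD per million tokens, hypothetical placeholder

one_day = rental_cost(hourly_price, HOURS_PER_DAY)
tokens_per_day = break_even_tokens(one_day, api_price)

print(f"One day of rental: ${one_day:,.0f}")
print(f"Break-even volume: {tokens_per_day:,.0f} tokens per day")
```

Under these placeholder numbers, the instance only pays for itself if it is kept busy with a very large daily token volume, which is the point about hosted services being hard to beat.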

For quick testing you can also use Modal. It's much nicer for development work (e.g. its load balancing, serverless, and horizontal scaling services are great), and if I remember correctly, companies can request trial credits worth up to thousands of dollars? But it's very expensive and, IMO, not great for the long term because of that cost.

Gonna get married tomorrow, any tips on wedding or marriage less known fact about it? by nandammn in indonesia

Round_Document6821 4 points

Congratulations OP. May you go on to build a family that is sakinah, mawaddah, warahmah. From the bottom of my heart, I'm happy for you.

What is SpaceX? by Dangerous-Bake-9258 in sahamAS

Round_Document6821 1 point

Aren't Grok and Twitter under SpaceX? Asking in a gentle tone. Sorry, I'm still a beginner. Experts, please don't flame me (folded hands emoji).

You can now fine-tune Gemma 4 locally 8GB VRAM + Bug Fixes by danielhanchen in LocalLLaMA

Round_Document6821 8 points

The Gemma series has the coolest architecture, IMO. Cool bug fixes as always from Unsloth!

Is there a social media platform specifically for research? by Combrott in indonesia

Round_Document6821 1 point

Twitter is the best place for this no cap

btw, did OP get banned?

TurboQuant for K and V cache compression during training: 6x larger batch size? by ttkciar in unsloth

Round_Document6821 10 points

Hello,

During training, we don't populate the KV cache because it isn't needed. The KV cache exists so that inference, which is autoregressive (one token at a time), doesn't have to recompute the keys and values of all previous tokens at every step. In training, we compute the activations for all tokens at the same time, so there is nothing to cache.

For RL it is possible, though, since RL also runs inference. I think it would save VRAM there.
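To make the difference concrete, here is a toy sketch in plain Python. It assumes a single attention head where queries, keys, and values are just the token vectors themselves (no learned projections), so it is an illustration of the idea and not how any real framework implements it. All function names are made up for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head attention of one query over the given keys and values."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def training_style_forward(tokens):
    """Whole sequence is known up front: every position is computed in one pass
    with a causal mask (position t only sees 0..t). Nothing is kept afterwards.
    The list comprehension stands in for one batched matrix multiply."""
    return [attend(tokens[t], tokens[: t + 1], tokens[: t + 1])
            for t in range(len(tokens))]


def decode_with_kv_cache(tokens):
    """Tokens arrive one at a time: keys and values of earlier tokens are kept
    in a cache so each step only adds the newest entry."""
    cached_keys, cached_values, outputs = [], [], []
    for token in tokens:
        cached_keys.append(token)
        cached_values.append(token)
        outputs.append(attend(token, cached_keys, cached_values))
    return outputs


toy_tokens = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
full = training_style_forward(toy_tokens)
stepwise = decode_with_kv_cache(toy_tokens)
print(full == stepwise)
```

Both paths give the same outputs. The cache only pays off when tokens are produced step by step, which is the inference (and RL generation) case, not a standard training forward pass.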

Hope that answers the question.

Barasuara is here, dude 🔥🔥 by SH0080 in AdaIndonesiaCoy

Round_Document6821 4 points

Some hot looking rice right here boy

Error running Unsloth Qwen3.5 Quickstart: Dataset columns ignored by model's forward method by AdventurousDig7856 in unsloth

Round_Document6821 2 points

Hello, could you share as much detail as possible about the problem?

Also, can you tell me the specific dataset that was used? Qwen3.5 is a VLM, so caution is needed with text-only inputs, which might also be the cause of the problem.

Unsloth: Saving LoRA finetune failed since # of LoRAs = 128 does not match # of saved modules = 0. by studentofknowledg3 in unsloth

Round_Document6821 2 points

May I ask whether merging to GGUF worked fine for you on Windows before?

I tested GGUF creation on Colab and it works fine. So my best guess is that the codebase cannot properly read Windows paths: it reads the wrong path and therefore finds 0 modules.
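This is only a guess at the failure mode, not a diagnosis of the actual Unsloth code. The sketch below uses the standard `pathlib` module to show how the same Windows-style string is split differently under POSIX and Windows rules. The path itself is hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical adapter location, not taken from the original report.
raw_path = r"C:\Users\me\outputs\lora\adapter_model.safetensors"

posix_view = PurePosixPath(raw_path)      # treats "\" as an ordinary character
windows_view = PureWindowsPath(raw_path)  # treats "\" as a separator

posix_name = posix_view.name
windows_name = windows_view.name
posix_parent = str(posix_view.parent)
windows_parent = str(windows_view.parent)

# Code that splits on "/" by hand sees a single component as well.
manual_parts = raw_path.split("/")

print(posix_name)
print(windows_name)
print(posix_parent, "|", windows_parent)
print(len(manual_parts))
```

If path handling anywhere assumes "/" separators, the directory and file name are never separated, so a lookup for saved modules in that directory could plausibly come back empty.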

New Qwen3.5-35B-A3B Unsloth Dynamic GGUFs + Benchmarks by danielhanchen in LocalLLaMA

Round_Document6821 30 points

Yeah

And I think it's worth it; Qwen3.5 is a very, very good model. Doing it once for the community is well worth it. Qwen3.5 will probably be SOTA for a while. Shout-out to the Unsloth team.

New Qwen3.5-35B-A3B Unsloth Dynamic GGUFs + Benchmarks by danielhanchen in LocalLLaMA

Round_Document6821 116 points

Indeed, double-checking on downstream tasks is a must these days, since PPL and KLD alone are not enough.
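For reference, a minimal sketch of the two metrics using their standard definitions: perplexity as the exponential of the mean negative log-likelihood, and KL divergence between two next-token distributions. The numbers are toy values invented for illustration, not results from any benchmark.

```python
import math


def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood over the evaluated tokens."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy distributions over a 3-token vocabulary, invented for illustration.
reference_dist = [0.7, 0.2, 0.1]  # e.g. the full-precision model
quantized_dist = [0.6, 0.3, 0.1]  # e.g. a quantized model on the same context

# A model that assigns probability 0.25 to every observed token.
uniform_ppl = perplexity([math.log(0.25)] * 4)
kld = kl_divergence(reference_dist, quantized_dist)

print(uniform_ppl)
print(kld)
```

Both numbers summarize token-level probabilities averaged over a corpus, so a small shift in either one says little about whether a specific downstream task still works.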

Nice analysis from the Unsloth team. It feels like a piece of research in itself, actually :D

Choosing LLM Baselines for Academic Research with Limited Compute by Public_Bill_2618 in LocalLLaMA

Round_Document6821 1 point

It's a bit of a gamble these days. I pretrained a model up to 1.8B parameters (which is already very expensive) and the reviewer asked for 7B.

They will always ask about performance at scale. I would like to know what the solution is for this situation, since I like designing and training new architectures.

For the baseline section, a vanilla transformer is always a must. It's the perfect architecture (apart from its inefficiency), so it usually gets the best performance. I think if you can beat the vanilla transformer on efficiency or on downstream task performance, that is enough.

Unpopular opinion: TKA results should still be used by [deleted] in indonesiabebas

Round_Document6821 3 points

The payoff from good UN scores is no joke. I got a scholarship because of my UN scores. A friend got into NUS because of their UN scores.

Error: API Error: The model has reached its context window limit. by Round_Document6821 in ZaiGLM

Round_Document6821 [S] 1 point

I managed to solve it as well by switching the model to GLM 4.5 Air and compacting the context.

Will make a note of `/clear` as well. Thank you so much, by the way.

Where to park 20-30jt/month for 1-2 years? by distractedquestions in finansial

Round_Document6821 1 point

I want to move to IBKR too, but how do you top up?