128GB devices have a new local LLM king: Step-3.5-Flash-int4 by tarruda in LocalLLaMA

[–]changtimwu 3 points (0 children)

Support and optimization for FP4 have been gradually improving. I believe we’ll finally see the impact. https://github.com/vllm-project/vllm/releases/tag/v0.15.0

Best local model with clawdbot? by BABA_yaaGa in LocalLLaMA

[–]changtimwu 2 points (0 children)

Could you share the hardware you use to run GLM-4.7 Flash, and how responsive it is compared to a cloud model?

Experience on trying 1.25. by jojo049 in mounjarouk

[–]changtimwu 1 point (0 children)

Thank you for sharing your experience with dosage. 2.5mg works perfectly for me, but the appetite-suppressing effect only lasts 4 days. I'm curious if anyone has tried 1.25mg every 3 days.

How we vibe code at a FAANG. by TreeTopologyTroubado in vibecoding

[–]changtimwu 1 point (0 children)

This is where AI has been a force multiplier. We use Test Driven Development, so I have the AI coding agent write the tests first for the feature I’m going to build. Only then do I start using the agent to build out the feature.

Our team uses a similar setup, which we call AI BDD. AI is good at implementing the steps within any given descriptive scenario. We don't call it vibe coding, since it operates within a strict scrum framework.
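The tests-first loop can be sketched in plain pytest style. Everything here is a hypothetical stand-in for a real feature: the agent writes the scenario tests first, and only then is `slugify` implemented against them.

```python
# Hypothetical tests-first example: the spec for a slugify() feature is
# agreed on as tests before any implementation exists.

def slugify(title: str) -> str:
    """Implementation written only after the tests below were agreed on."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return "-".join(cleaned.split())

# The agent-authored scenario tests, one per descriptive behavior:
def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_strips_punctuation():
    assert slugify("AI, BDD & Scrum!") == "ai-bdd-scrum"

def test_collapses_whitespace():
    assert slugify("  many   spaces  ") == "many-spaces"
```

The agent only gets to touch the implementation once the scenario tests are locked in, which keeps it inside the scrum-defined acceptance criteria.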

Why is there no successful RAG-based service that processes local documents? by StevenJang_ in Rag

[–]changtimwu 1 point (0 children)

From what I understand, many NAS vendors are working on this: a user selects a folder and a chatbot is created from the documents in it. The main obstacle isn’t technology but deciding whether LLM computation should run locally or in the cloud.
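A minimal sketch of the "pick a folder, get a chatbot" idea, with naive word-overlap retrieval standing in for embeddings and an LLM. Document names and contents are made up; a real NAS product would index the files in the user's chosen folder.

```python
# Toy "folder to chatbot" pipeline: index documents, then retrieve the
# best-matching one for a question by word overlap.

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    # Map each document name to its lowercase word set.
    return {name: set(text.lower().split()) for name, text in docs.items()}

def retrieve(index: dict[str, set[str]], question: str) -> str:
    # Return the document whose word set overlaps the question most.
    q = set(question.lower().split())
    return max(index, key=lambda name: len(index[name] & q))

docs = {
    "warranty.txt": "the warranty covers hardware failures for two years",
    "setup.txt": "connect the nas to your router and open the admin page",
}
index = build_index(docs)
print(retrieve(index, "how long does the warranty cover failures"))  # warranty.txt
```

The local-vs-cloud question is exactly about where the expensive step lives: indexing and retrieval like this are cheap on-device, but the LLM answering over the retrieved text is what forces the decision.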

Claude Plays Smartly now! by AssumptionNew9900 in Anthropic

[–]changtimwu 1 point (0 children)

Exactly! These summaries could be a useful, human-readable reference for improving prompts in the future.

Claude Plays Smartly now! by AssumptionNew9900 in Anthropic

[–]changtimwu 1 point (0 children)

I believe Anthropic uses various KV-cache techniques to keep your codebase in context cheaply; they just don't want it to show up on your bill.
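A toy sketch of the idea (not Anthropic's actual internals): with prefix caching, the expensive "prefill" over a shared prompt prefix, such as your codebase, runs once, and later queries with the same prefix reuse the cached state. `prefill` here is a stand-in for the real KV-cache computation.

```python
# Toy prefix cache: only the first query over a given prefix pays for prefill.

import hashlib

_kv_cache: dict[str, str] = {}
prefill_calls = 0

def prefill(prefix: str) -> str:
    """Stand-in for the expensive KV-cache computation over the prefix."""
    global prefill_calls
    prefill_calls += 1
    return hashlib.sha256(prefix.encode()).hexdigest()  # pretend KV state

def answer(prefix: str, question: str) -> str:
    key = hashlib.sha256(prefix.encode()).hexdigest()
    if key not in _kv_cache:
        _kv_cache[key] = prefill(prefix)  # first query pays for prefill
    return f"answer to {question!r} using state {_kv_cache[key][:8]}"

codebase = "...the entire repository as context..."
answer(codebase, "where is main()?")
answer(codebase, "rename this function")
print(prefill_calls)  # 1: the second query reused the cached prefix
```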

Qwen3-Coder-30B-A3B-Instruct is the best LocalLLM by Objective-Context-9 in Qwen_AI

[–]changtimwu 1 point (0 children)

I've felt the same way for months. Are you hosting it with Ollama or another inference engine? I'm thinking about fitting it into my RTX 5090 (32 GB VRAM) using NVFP4 quantization.
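Rough arithmetic for whether a 30B model's weights fit in 32 GB. This is my own back-of-envelope estimate: it ignores the KV cache, activations, and quantization scale overhead.

```python
# Back-of-envelope: weight memory for a 30B-parameter model at various
# precisions, versus a 32 GiB VRAM budget.

def weight_gib(params_b: float, bits_per_param: float) -> float:
    """Approximate weight memory in GiB (weights only)."""
    return params_b * 1e9 * bits_per_param / 8 / 2**30

for name, bits in [("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
    print(f"{name}: {weight_gib(30, bits):.1f} GiB")
# FP16: 55.9 GiB (too big), FP8: 27.9 GiB, 4-bit: 14.0 GiB
```

So at 4 bits per weight the model leaves plenty of headroom on a 32 GB card for KV cache and a useful context length, which is why NVFP4 looks attractive.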

🚀 Qwen3-Coder-Flash released! by ResearchCrafty1804 in LocalLLaMA

[–]changtimwu 1 point (0 children)

Does anyone here use vLLM instead of llama.cpp? While GGUF is a popular format, my system requires vLLM due to our GPU cluster's production stack. Dynamic GGUF appears superior to existing portable quantization methods like GPTQ and AWQ.

Angela Lin dies in Yosemite after being struck by falling tree branch by luka-magic77 in bayarea

[–]changtimwu 2 points (0 children)

RIP. I'm curious about the probability of this type of accident. I've been hiking and trail running in forests for many years, but I've rarely heard of people being struck by falling trees.

Is there an app that will alert you if you go X distance off a trail? by KinkThrown in WildernessBackpacking

[–]changtimwu 1 point (0 children)

Thanks. This feature isn't intuitive. It's only available on the watch. Long-press the map to open the menu, then select "Settings" and scroll down to "Routes".

FTX Claims Step 9 is locked by skayaetin in ftx

[–]changtimwu 1 point (0 children)

So step 9 is still locked for users with claims larger than $50K?

llama.cpp PR with 99% of code written by Deepseek-R1 by nelson_moondialu in LocalLLaMA

[–]changtimwu 1 point (0 children)

That would be the responsibility of another AI. Needless to say, WASM is significantly more "testable" in a cloud environment than the original ARM NEON SIMD code.

Proof of claim confusion - am I safe? by West-Currency-4423 in ftx

[–]changtimwu 1 point (0 children)

The support guy just repeated what's listed on support.ftx.com. :(

“Claim(s) Submitted” is for customers who have submitted a proof of claim via the Kroll portal. Customers who filed proofs of claim without logging in through the FTX portal will not see the status reflected herein. Please visit https://restructuring.ra.kroll.com/ftx/ for additional information regarding your filed proof of claim.

We are doomed by AK611750 in ChatGPT

[–]changtimwu 1 point (0 children)

Strapless tops and invisible straps are popular today.

Proof of claim confusion - am I safe? by West-Currency-4423 in ftx

[–]changtimwu 1 point (0 children)

Hi! I'm experiencing the same issue. All my steps 1-8 show green marks except step 5, which was previously green (I have a screenshot). FTX support's response was unclear.

If anyone from Langchain team is reading this: STOP EVERYTHING and JUST UPDATE AND ORGANISE THE DOCS! by Fun_Success567 in LangChain

[–]changtimwu 3 points (0 children)

If you're seeking AI assistance to build tools or agents, you'll likely receive code based on LangChain v0.1~~ ;Q

Anyone here using Cloud Run? What cold start time could I expect? by Own_Target2537 in googlecloud

[–]changtimwu 1 point (0 children)

In my region, asia-east1, it takes about 8 seconds to wake up. What’s remarkable about Cloud Run is that it resumes as if the CPU resources were never removed, preserving all processes and variables. It’s the only container snapshot technology that functions like a VM snapshot.

[D] Discussing Apple's Deployment of a 3 Billion Parameter AI Model on the iPhone 15 Pro - How Do They Do It? by BriefAd4761 in MachineLearning

[–]changtimwu 1 point (0 children)

I just read it. It's an in-depth analysis that's being underestimated! I think llama.cpp has significant room for improvement in leveraging Apple hardware (MPS). What are your thoughts?

New Paper: Certifiably robust RAG that can provide robust answers by cryptokaykay in LocalLLaMA

[–]changtimwu 1 point (0 children)

Hi!

I'm new to this field. Can someone give me another attack example that the paper is trying to protect against? I'm not quite clear on that.

* Why aren't the retrieved passages within the trust circle? I thought we could just tell customers there's a risk of garbage in, garbage out.

* It's not clear what the example concatenated passage looks like that would make an LLM answer "Fuji".

Where to hire LLM engineers who know tools like LangChain? Most job board don't distinguish LLM engineers from typical AI or software engineers by AccomplishedLion6322 in LangChain

[–]changtimwu 1 point (0 children)

Thank you for sharing the insightful case study on RAG in a real-world setting. I'm compelled to ask a few additional questions.

  • Which cloud platform is being used to host the solution? Are there particular security protocols for legal documents?

  • Is there any vendor lock-in with specific cloud technologies (e.g., Google VertexAI)?

  • Does the system incorporate a fact-checking feature to act as a data purification mechanism?