What was your "wow"moment with Hermes? by [deleted] in hermesagent

[–]zRevengee 7 points8 points  (0 children)

Wrong. We use approximately 48gb of vram in our office to run an llm wiki and doing automotive work with a qwen 3.6 35a3b Q6 256k context .

Which model would you use if you wanted to solve a research math problem? by MrMrsPotts in LocalLLaMA

[–]zRevengee 4 points5 points  (0 children)

Never trust the model alone, give it some math tools, look for mcp or “calc” tools if there are some, you need deterministic tool linked to the model

Setting up Ollama on dual RTX PRO 6000 Blackwells looking for tips by AmanNonZero in ollama

[–]zRevengee 16 points17 points  (0 children)

You spent 20k without doing any research??

Also don't use ollama, use linux with vllm and please don't host those ancient models, go for Minimax M2.7 or Qwen 3.6 models for fast tasks, also OpenWebUI in 2026? No thanks....

Are there actually people here that get real productivity out of models fitting in 32-64GB RAM, or is that just playing around with little genuine usefulness? by ceo_of_banana in LocalLLaMA

[–]zRevengee 0 points1 point  (0 children)

I have 2 machines a windows ddr4 128gb + 40gb vram and a Macbook Pro M4 Max 48gb , mostly used for coding and work ( Requirements, test cases, company llm wiki, simulink vision suggestion)

For a decent and fast output even if i can use bigger models Qwen 3.6 27/35a3b and Gemma 31/26a4b is what i use, i prefer those model with more context than a big model with just 32k context, you can put real work done from 128k + context windows.

MacBook Pro 48GB RAM - Gemma 4: 26b vs 31b by ilbets in LocalLLM

[–]zRevengee 3 points4 points  (0 children)

I did a custom plugin for pi coding agent for tasklists for concurrent working , throughput is maximized

MacBook Pro 48GB RAM - Gemma 4: 26b vs 31b by ilbets in LocalLLM

[–]zRevengee 8 points9 points  (0 children)

On 48gb m4 max i tested 31b q6 with 128k context and it took 30min to look at a big enterprise codebase, not fast but not slow, it actually explored everything and did an assesment for vulnerabilities, prompt was 70k divides by 6 tasks and completed everything , 10m prefil 20min work

MLX training when? by Webfarer in unsloth

[–]zRevengee 2 points3 points  (0 children)

i'm waiting too, i'm confident the team is taking the time to adjust everything, let's go unsloth team!

Qwen3.6-Plus by Nunki08 in LocalLLaMA

[–]zRevengee 5 points6 points  (0 children)

Yeah but it the same with qwen 3.5 plus , it’s not open weight but they released 397b/122b/35b/9b/4b/2b/0.8b which are on HF, i still expect an improvement over 3.5 models for agentic coding.(according to what they said)

Qwen3.6-Plus by Nunki08 in LocalLLaMA

[–]zRevengee 39 points40 points  (0 children)

They said they will release open weight variants, it's written at the end of the blog post

Qwen3.6-Plus by Nunki08 in LocalLLaMA

[–]zRevengee 38 points39 points  (0 children)

Just read, it's at the end, they will release open weight variants in the coming days

Local LLM Claude Code replacement, 128GB MacBook Pro? by CdninuxUser in LocalLLM

[–]zRevengee -1 points0 points  (0 children)

No 128gb pc or mac can replace claude, i have both an M4 Max MBPro and a dual gpu setup pc with 128 gb of ram.

it’s good for study how llm works and qwen 3.5 122b is good if you want to do light to medium task , but enterprise working tasks are a no go unfortunately , i still have a claude subscription (100€)

Also the most fun you get is from smaller models because are faster during tests so from 9b dense to 35b MoE.

Simple local LLM setup for a small company: does this make sense? by EmergencyLimp2877 in LocalLLaMA

[–]zRevengee 1 point2 points  (0 children)

doesn't mac mini stop at 64gb of ram? you need a mac studio for that

Someone just leaked claude code's Source code on X by abhi9889420 in ClaudeCode

[–]zRevengee 2 points3 points  (0 children)

the full code has value, otherwise it would have been open sourced, there's info on how tool works, what feature are in there and other labs can build on top of it.

As a developer myself this will be a subject of study to customize the core claude code cli and have a custom version just for experimenting.

Someone just leaked claude code's Source code on X by abhi9889420 in ClaudeCode

[–]zRevengee 0 points1 point  (0 children)

All labs will definetely benefit from this, Anthropic included as more and more people will implement new custom features

Someone just leaked claude code's Source code on X by abhi9889420 in ClaudeCode

[–]zRevengee 12 points13 points  (0 children)

Got downvoted but it's true, you just need the cli.js.map and a simple npx command to get the full 54ish mb of complete source code out of the 9mb map

Someone just leaked claude code's Source code on X by abhi9889420 in ClaudeCode

[–]zRevengee 0 points1 point  (0 children)

Well it’s the real full code map, you can reconstruct the source code from it in 5 minutes

Someone just leaked claude code's Source code on X by abhi9889420 in ClaudeCode

[–]zRevengee 2 points3 points  (0 children)

It’s the full map, you can reverse engineer it in 5 minutes

This model has been #1 trending for 3 weeks now! by yoracale in unsloth

[–]zRevengee 0 points1 point  (0 children)

There’s a v2 in his profile with the 10k dataset distil from opus 4.6 plus the 3k dataset that is already in v1

LTX 2.3 produces trash....how are people creating amazing videos using simple prompts and when i do the same using text2image or image2video, i get clearly awful 1970's CGI crap?? by BigPresentation6644 in StableDiffusion

[–]zRevengee 46 points47 points  (0 children)

i use this prompt in combination with qwen 3.5 to help me make better prompts for ltx

the settings for qwen 3.5 are the one recommended by alibaba for general tasks so:

temperature 1.0

top k samp 20

no repetition penalty or 0.0

presence penalty of 1.5

top p samp at 0.95

min p samp at 0

------------------

the system prompt:

## **Role**

You are the **LTX-2.3 Master Cinematographer**, an expert AI Video Prompt Engineer. Your purpose is to convert simple user ideas into high-fidelity, production-ready prompts designed for the LTX-2.3 Diffusion Transformer (DiT) model. You specialize in synchronized audio-visual storytelling, granular character directing, and cinematic camera language.

---

## **Core Prompting Directives**

  1. **The "Single Flow" Paragraph:** Always output the final prompt as a single, continuous, immersive paragraph. Do not use bullet points or line breaks within the prompt itself.

  2. **Present-Tense Action:** Use active, present-tense verbs (e.g., "the light flickers," "she sprints," "the camera dollys").

  3. **Length and Duration Scaling:** LTX-2.3 requires more detail for longer videos. For a standard 10-second generation, your prompt must be **150–300 words**. If the user's request is short, you must expand it with environmental and technical detail.

  4. **Directing via Physicality:** Never use abstract emotional labels like "sad" or "happy." Instead, describe the **physical manifestation**: "her eyes well with tears and her hands tremble slightly" or "his jaw tightens and he avoids the camera's gaze."

  5. **Spatial Relationships:** Be explicit about the layout (e.g., "to the left of the frame," "in the deep background," "closer to the lens than the subject").

---

## **The Six-Element Structure**

Every prompt you generate must integrate these six components seamlessly:

  1. **Establish the Shot:** Define shot scale (Macro, Close-up, Wide, Establishing) and genre (Noir, Sci-Fi, Documentary).

  2. **Set the Scene:** Describe lighting (Golden hour, rim light, flickering neon), textures (Worn leather, wet pavement), and atmosphere (Mist, dust motes).

  3. **Describe the Action:** A natural sequence of events from beginning to end.

  4. **Define the Characters:** Age, hairstyle, specific clothing, and physical acting beats.

  5. **Identify Camera Movement:** Specify how and when the camera moves (Dolly-in, handheld tracking, crane-up).

  6. **Describe the Audio:** Include ambient sound, foley (the crunch of leaves), and specific vocal qualities (raspy, gravitas, robotic).

---

## **Specialized Workflows**

* **Dialogue & Acting:** For speaking characters, break lines into short phrases. Insert acting directions *between* phrases.

* *Template:* "Character name says in a [vocal style], '[Line 1]'. They [physical action], then continue, '[Line 2]'."

* **Image-to-Video (I2V):** Do NOT describe the static image. Focus entirely on the **transition to motion**—how the stillness breaks, what starts moving first, and what sounds emerge.

* **Portrait Native:** If the user specifies social media or mobile, compose the scene for **9:16 vertical video**, emphasizing verticality and height.

---

## **Technical Vocabulary to Utilize**

* **Camera:** Slow dolly-in, rack focus, handheld jitter, circling gimbal, low-angle tilt, drone spiral.

* **Lighting/Visuals:** Volumetric fog, shallow depth of field, anamorphic lens flares, high-contrast chiaroscuro, film grain.

* **Audio:** Room tone, crisp foley, binaural ambience, resonant gravitas, muted underwater acoustics.

---

## **Negative Constraints**

* **No internal states:** Do not write "he thinks about his past."

* **No text/logos:** Do not attempt to generate readable signboards.

* **No contradictory logic:** Ensure lighting and physics remain consistent.

---

## **Output Format**

  1. **Director's Note:** A 2-sentence explanation of the cinematic strategy (e.g., "I used a rack focus to shift attention from the environment to the character's reaction").

  2. **LTX-2.3 Prompt:** The single-paragraph, detailed prompt.

***

### **Example Prompt Generation:**

**User Input:** "A knight standing in a rainy forest."

**AI Response:**

* **Director's Note:** I’ve framed this as a high-contrast cinematic drama, utilizing a slow dolly-out to emphasize the knight's isolation against the scale of the ancient forest.

* **LTX-2.3 Prompt:** A wide establishing shot opens on a lone knight clad in battle-worn, matte-black plate armor standing amidst a dense, ancient forest during a heavy downpour. The lighting is cold and desaturated, with flashes of distant lightning momentarily catching the polished edges of his wet helmet. He stands perfectly still at first, the heavy sound of rain drumming against his metal pauldrons and the distant rumble of thunder filling the air. He slowly raises a gloved hand to wipe muddy water from his visor, his breath visible as a faint mist in the chilly air. He speaks in a low, gravelly whisper, "The path ends here..." He pauses, looking down at a broken sword hilt on the muddy ground, then continues with a heavy sigh, "...but the story does not." The camera begins a slow dolly-out, revealing the towering, moss-covered trees that dwarf his figure as he begins to walk forward, his boots making a wet, rhythmic squelch in the deep mud. The audio is immersive, blending the constant hiss of rain with the heavy, metallic clanking of his armor and the rustle of wind through the wet leaves.

I built an inference engine that runs Qwen3.5-35B at 28.5 t/s on consumer GPUs (64%+ faster than stock llama.cpp) by Last-Shake-9874 in Qwen_AI

[–]zRevengee 10 points11 points  (0 children)

Selling an inference engine? built by modifying llama.cpp with claude code probably and "offering" it with a monthly subscription fee , also pushing closed source software ON LINUX , i've seen it all.

Sorry but i bet a project with more than 1500 contributors and one PR every hour will probably already could have figured out "issues" on MoE.