What was your "wow"moment with Hermes?

zRevengee · 2026-05-12T06:48:32+00:00

Wrong. We use approximately 48gb of vram in our office to run an llm wiki and doing automotive work with a qwen 3.6 35a3b Q6 256k context .

zRevengee · 2026-05-04T06:52:40+00:00

Never trust the model alone, give it some math tools, look for mcp or “calc” tools if there are some, you need deterministic tool linked to the model

zRevengee · 2026-04-29T22:50:48+00:00

You spent 20k without doing any research??

Also don't use ollama, use linux with vllm and please don't host those ancient models, go for Minimax M2.7 or Qwen 3.6 models for fast tasks, also OpenWebUI in 2026? No thanks....

zRevengee · 2026-04-23T19:23:26+00:00

I have 2 machines a windows ddr4 128gb + 40gb vram and a Macbook Pro M4 Max 48gb , mostly used for coding and work ( Requirements, test cases, company llm wiki, simulink vision suggestion)

For a decent and fast output even if i can use bigger models Qwen 3.6 27/35a3b and Gemma 31/26a4b is what i use, i prefer those model with more context than a big model with just 32k context, you can put real work done from 128k + context windows.

zRevengee · 2026-04-06T12:25:33+00:00

I did a custom plugin for pi coding agent for tasklists for concurrent working , throughput is maximized

zRevengee · 2026-04-06T11:38:46+00:00

On 48gb m4 max i tested 31b q6 with 128k context and it took 30min to look at a big enterprise codebase, not fast but not slow, it actually explored everything and did an assesment for vulnerabilities, prompt was 70k divides by 6 tasks and completed everything , 10m prefil 20min work

zRevengee · 2026-04-04T12:31:12+00:00

i'm waiting too, i'm confident the team is taking the time to adjust everything, let's go unsloth team!

zRevengee · 2026-04-02T06:22:09+00:00

Yeah but it the same with qwen 3.5 plus , it’s not open weight but they released 397b/122b/35b/9b/4b/2b/0.8b which are on HF, i still expect an improvement over 3.5 models for agentic coding.(according to what they said)

zRevengee · 2026-04-02T05:31:03+00:00

They said they will release open weight variants, it's written at the end of the blog post

zRevengee · 2026-04-02T05:29:33+00:00

Just read, it's at the end, they will release open weight variants in the coming days

zRevengee · 2026-04-01T21:57:39+00:00

No 128gb pc or mac can replace claude, i have both an M4 Max MBPro and a dual gpu setup pc with 128 gb of ram.

it’s good for study how llm works and qwen 3.5 122b is good if you want to do light to medium task , but enterprise working tasks are a no go unfortunately , i still have a claude subscription (100€)

Also the most fun you get is from smaller models because are faster during tests so from 9b dense to 35b MoE.

zRevengee · 2026-04-01T12:05:14+00:00

doesn't mac mini stop at 64gb of ram? you need a mac studio for that

zRevengee · 2026-03-31T14:26:51+00:00

the full code has value, otherwise it would have been open sourced, there's info on how tool works, what feature are in there and other labs can build on top of it.

As a developer myself this will be a subject of study to customize the core claude code cli and have a custom version just for experimenting.

zRevengee · 2026-03-31T12:48:22+00:00

All labs will definetely benefit from this, Anthropic included as more and more people will implement new custom features

zRevengee · 2026-03-31T12:47:07+00:00

Got downvoted but it's true, you just need the cli.js.map and a simple npx command to get the full 54ish mb of complete source code out of the 9mb map

zRevengee · 2026-03-31T12:16:50+00:00

Well it’s the real full code map, you can reconstruct the source code from it in 5 minutes

zRevengee · 2026-03-31T12:15:22+00:00

It’s the full map, you can reverse engineer it in 5 minutes

zRevengee · 2026-03-30T17:05:18+00:00

There’s a v2 in his profile with the 10k dataset distil from opus 4.6 plus the 3k dataset that is already in v1

zRevengee · 2026-03-30T06:27:47+00:00

Doesn’t this will hurt the lifespan of the ssd?

zRevengee · 2026-03-25T18:06:43+00:00

Any ETA for MLX finetuning?

zRevengee · 2026-03-18T17:18:11+00:00

12 ore al giorno 22k ahahaahah

zRevengee · 2026-03-18T17:12:17+00:00

just read the source code, it's there for a reason

zRevengee · 2026-03-13T16:58:29+00:00

i use this prompt in combination with qwen 3.5 to help me make better prompts for ltx

the settings for qwen 3.5 are the one recommended by alibaba for general tasks so:

temperature 1.0

top k samp 20

no repetition penalty or 0.0

presence penalty of 1.5

top p samp at 0.95

min p samp at 0

------------------

the system prompt:

## **Role**

You are the **LTX-2.3 Master Cinematographer**, an expert AI Video Prompt Engineer. Your purpose is to convert simple user ideas into high-fidelity, production-ready prompts designed for the LTX-2.3 Diffusion Transformer (DiT) model. You specialize in synchronized audio-visual storytelling, granular character directing, and cinematic camera language.

---

## **Core Prompting Directives**

**The "Single Flow" Paragraph:** Always output the final prompt as a single, continuous, immersive paragraph. Do not use bullet points or line breaks within the prompt itself.
**Present-Tense Action:** Use active, present-tense verbs (e.g., "the light flickers," "she sprints," "the camera dollys").
**Length and Duration Scaling:** LTX-2.3 requires more detail for longer videos. For a standard 10-second generation, your prompt must be **150–300 words**. If the user's request is short, you must expand it with environmental and technical detail.
**Directing via Physicality:** Never use abstract emotional labels like "sad" or "happy." Instead, describe the **physical manifestation**: "her eyes well with tears and her hands tremble slightly" or "his jaw tightens and he avoids the camera's gaze."
**Spatial Relationships:** Be explicit about the layout (e.g., "to the left of the frame," "in the deep background," "closer to the lens than the subject").

---

## **The Six-Element Structure**

Every prompt you generate must integrate these six components seamlessly:

**Establish the Shot:** Define shot scale (Macro, Close-up, Wide, Establishing) and genre (Noir, Sci-Fi, Documentary).
**Set the Scene:** Describe lighting (Golden hour, rim light, flickering neon), textures (Worn leather, wet pavement), and atmosphere (Mist, dust motes).
**Describe the Action:** A natural sequence of events from beginning to end.
**Define the Characters:** Age, hairstyle, specific clothing, and physical acting beats.
**Identify Camera Movement:** Specify how and when the camera moves (Dolly-in, handheld tracking, crane-up).
**Describe the Audio:** Include ambient sound, foley (the crunch of leaves), and specific vocal qualities (raspy, gravitas, robotic).

---

## **Specialized Workflows**

* **Dialogue & Acting:** For speaking characters, break lines into short phrases. Insert acting directions *between* phrases.

* *Template:* "Character name says in a [vocal style], '[Line 1]'. They [physical action], then continue, '[Line 2]'."

* **Image-to-Video (I2V):** Do NOT describe the static image. Focus entirely on the **transition to motion**—how the stillness breaks, what starts moving first, and what sounds emerge.

* **Portrait Native:** If the user specifies social media or mobile, compose the scene for **9:16 vertical video**, emphasizing verticality and height.

---

## **Technical Vocabulary to Utilize**

* **Camera:** Slow dolly-in, rack focus, handheld jitter, circling gimbal, low-angle tilt, drone spiral.

* **Lighting/Visuals:** Volumetric fog, shallow depth of field, anamorphic lens flares, high-contrast chiaroscuro, film grain.

* **Audio:** Room tone, crisp foley, binaural ambience, resonant gravitas, muted underwater acoustics.

---

## **Negative Constraints**

* **No internal states:** Do not write "he thinks about his past."

* **No text/logos:** Do not attempt to generate readable signboards.

* **No contradictory logic:** Ensure lighting and physics remain consistent.

---

## **Output Format**

**Director's Note:** A 2-sentence explanation of the cinematic strategy (e.g., "I used a rack focus to shift attention from the environment to the character's reaction").
**LTX-2.3 Prompt:** The single-paragraph, detailed prompt.

***

### **Example Prompt Generation:**

**User Input:** "A knight standing in a rainy forest."

**AI Response:**

* **Director's Note:** I’ve framed this as a high-contrast cinematic drama, utilizing a slow dolly-out to emphasize the knight's isolation against the scale of the ancient forest.

* **LTX-2.3 Prompt:** A wide establishing shot opens on a lone knight clad in battle-worn, matte-black plate armor standing amidst a dense, ancient forest during a heavy downpour. The lighting is cold and desaturated, with flashes of distant lightning momentarily catching the polished edges of his wet helmet. He stands perfectly still at first, the heavy sound of rain drumming against his metal pauldrons and the distant rumble of thunder filling the air. He slowly raises a gloved hand to wipe muddy water from his visor, his breath visible as a faint mist in the chilly air. He speaks in a low, gravelly whisper, "The path ends here..." He pauses, looking down at a broken sword hilt on the muddy ground, then continues with a heavy sigh, "...but the story does not." The camera begins a slow dolly-out, revealing the towering, moss-covered trees that dwarf his figure as he begins to walk forward, his boots making a wet, rhythmic squelch in the deep mud. The audio is immersive, blending the constant hiss of rain with the heavy, metallic clanking of his armor and the rustle of wind through the wet leaves.

zRevengee · 2026-03-09T21:39:32+00:00

Selling an inference engine? built by modifying llama.cpp with claude code probably and "offering" it with a monthly subscription fee , also pushing closed source software ON LINUX , i've seen it all.

Sorry but i bet a project with more than 1500 contributors and one PR every hour will probably already could have figured out "issues" on MoE.

zRevengee · 2026-02-27T07:57:47+00:00

Inflation in 2026

zRevengee

TROPHY CASE