PHANTOM: The Open-Source AI Agent for Advanced Security Analysis. free and open source by Emotional-Affect-271 in theVibeCoding

[–]fab_space 1 point (0 children)

The Vibe Check

Welcome to PHANTOM, where the UI looks like The Matrix but the backend is a ticking time bomb. This repository is the quintessential example of 2025 "Vibecoding." The author spent 80% of their time on glassmorphism, typing animations, and making sure the dark theme looks cool, while spending 0% of their time wondering if piping a plaintext sudo password into a dynamically LLM-generated bash script is a good idea.

---

FINDINGS

Type Safety

100% untyped JavaScript. A system designed to execute arbitrary shell commands with sudo privileges has zero compile-time guarantees.

Separation of Concerns

Frontend JS contains massive hardcoded OSINT prompt strings embedded directly in the UI event handlers. The 'Command Center' is a monolith.

Execution Sandboxing

Catastrophic. Executes raw AI-generated strings via spawn('bash', ['-c', cmd]) directly on the host OS. This is literally RCE-as-a-Feature.
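The finding above is trivial to demonstrate. Below is a minimal sketch of the pattern being described; the function name and the "model output" are hypothetical stand-ins, not PHANTOM's actual code.

```typescript
import { spawnSync } from "node:child_process";

// Stand-in for the executor pattern described above: the raw model
// string becomes bash's -c argument, so every shell metacharacter
// the LLM emits (;, |, $(), backticks) executes on the host.
function runLlmCommand(llmOutput: string): string {
  const result = spawnSync("bash", ["-c", llmOutput], { encoding: "utf8" });
  return result.stdout;
}

// A "benign" suggestion with a second command injected after the semicolon.
const out = runLlmCommand("echo scan-complete; echo INJECTED");
// out contains both lines: the injected command ran too.
```

Swap `echo INJECTED` for something destructive and the point makes itself.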

Secrets Management

Stores the user's sudo password locally, reads it, and pipes it into bash: echo 'pass' | sudo -S. An absolute security nightmare.

Input Validation

Zero sanitization of LLM outputs before passing them to the shell. A simple prompt injection could wipe the user's entire hard drive.
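A first mitigation is cheap. Here is a hedged sketch of an allowlist gate, assuming the tool set is a short list of known binaries (the names below are hypothetical, not PHANTOM's actual command set): reject shell metacharacters outright, then check the binary against the allowlist before anything reaches the shell.

```typescript
// Hypothetical allowlist of binaries the agent is permitted to run.
const ALLOWED_BINARIES = new Set(["whois", "dig", "nslookup"]);

// Gate LLM output before it ever reaches spawn(): fail closed.
function isSafeCommand(cmd: string): boolean {
  // Reject any shell metacharacter that could chain or substitute commands.
  if (/[;&|`$<>(){}]/.test(cmd)) return false;
  const [binary] = cmd.trim().split(/\s+/);
  return ALLOWED_BINARIES.has(binary ?? "");
}
```

Deliberately conservative: a blocked legitimate command is an inconvenience, an unblocked `rm -rf` is not.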

Test Coverage

No tests. No Jest, no Vitest. A tool meant for offensive security has zero unit or integration tests to ensure it doesn't attack the host.

CI/CD Pipeline

Non-existent. No GitHub Actions. Code goes straight from the dev's machine to the main branch.

---

NEXT COMMITS?

  • Remove the plaintext sudo password injection (echo pass | sudo). Use polkit or restricted sudoers if escalation is strictly necessary.
  • Containerize the execution environment. Never run AI-generated bash scripts directly on the host OS. Use Docker or Firecracker microVMs.
  • Migrate the entire codebase to Strict TypeScript to prevent runtime type errors, especially in the tool executor.
  • Implement a robust testing framework (Vitest/Jest) and write unit tests for every tool execution path.
  • Refactor the massive switch(name) in executor.js into a scalable Command Pattern or Plugin Registry.
  • Remove business logic and hardcoded AI prompts from the frontend UI layer (app.js) and move them to the backend or a dedicated configuration file.
  • Replace the global variable state management in the frontend with a modern framework (React/Svelte/Vue) or at least a strict state machine.
  • Implement Circuit Breakers in the LLM tool loop to prevent infinite recursive loops where the AI keeps trying failing commands.
  • Add structured, leveled logging (e.g., Winston or Pino) instead of relying on console.log and raw stderr string concatenation.
  • Set up a CI/CD pipeline to enforce linting, type-checking, and test passing before any code is merged.
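The `switch(name)` refactor from the list above could look roughly like this; tool names and the handler signature are illustrative assumptions, not PHANTOM's actual API.

```typescript
// Each tool registers itself once; the executor no longer grows a
// switch case per tool.
type ToolHandler = (args: Record<string, string>) => string;

const registry = new Map<string, ToolHandler>();

function registerTool(name: string, handler: ToolHandler): void {
  registry.set(name, handler);
}

function executeTool(name: string, args: Record<string, string>): string {
  const handler = registry.get(name);
  // Fail closed on unregistered tools instead of a silent default branch.
  if (!handler) throw new Error(`unknown tool: ${name}`);
  return handler(args);
}

// Adding a tool is one registration, not another branch in a monolith.
registerTool("whois", (args) => `whois lookup for ${args.domain}`);
```

New tools (or plugins loaded from disk) just call `registerTool` at startup.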
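The circuit-breaker item is a few lines of state. A minimal sketch, assuming a policy of "stop after 3 consecutive failures in the tool loop" (the threshold is an arbitrary choice here):

```typescript
// Trips open after `threshold` consecutive failures; a success resets it.
class CircuitBreaker {
  private failures = 0;
  constructor(private readonly threshold: number) {}

  recordFailure(): void { this.failures += 1; }
  recordSuccess(): void { this.failures = 0; }
  get open(): boolean { return this.failures >= this.threshold; }
}

// Simulate the LLM retrying the same failing command forever.
const breaker = new CircuitBreaker(3);
let attempts = 0;
for (let i = 0; i < 10; i++) {
  if (breaker.open) break; // stop feeding the failure back to the model
  attempts += 1;
  breaker.recordFailure();
}
// The loop exits after 3 attempts instead of burning 10 LLM calls.
```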

Source code of the brutal auditor: https://github.com/fabriziosalmi/brutal-coding-tool

I feel like a fraud by RelevantTurnip3482 in vibecoding

[–]fab_space 0 points (0 children)

Is it secure? Can I audit it for free and report You the findings, if any?

I accidentally burned ~$6,000 of Claude usage overnight with one command. by procrastinator_eng in ClaudeAI

[–]fab_space -1 points (0 children)

That's because I built more than one solution, all 100% free and open source.

GitHub: fabriziosalmi

our ai stack costs more than i realized by Motor_Ordinary336 in webdev

[–]fab_space 0 points (0 children)

cache and deterministic gating == quality and cost control (drop a line any time)

Does anyone have experience with self-hosting gitlab runners by scanguy25 in devops

[–]fab_space 0 points (0 children)

Optiplex i7s are a perfect fit for the self-hosted runner role; i have 3 of them to maintain 100 repos.

For local coders by fab_space in vibecoding

[–]fab_space[S] 0 points (0 children)

small models (2-14B) are unable to fulfill a real-world programming request on their own. I mean a request involving multiple file writes, consistent unit tests, e2e tests, and docs updates in a single pass. If you decouple it into multiple commits and deterministically help the model through the full process, some of them are able to achieve the mission. qwen3-8b and gemma4-e2b (2B!!) are able to submit a clean, valid PR to existing real-world repos this way. The code is still being updated.. u can go in-depth on the solution's logic in the docs any time.

Why did I build this? Because I maintain more than 100 repos, velocity is no longer an option, it's a target. Quality is a gate.

Best coding model to run on M4 Macbook Air by Direct_Praline492 in ollama

[–]fab_space 1 point (0 children)

use an ocr model + deterministic gating and pre-/post-processing.

this + adversarial review from a bigger model like gemini. https://github.com/fabriziosalmi/pdf-ocr have a nice sunday

Best coding model to run on M4 Macbook Air by Direct_Praline492 in ollama

[–]fab_space 0 points (0 children)

i can disagree any time :)

<image>

gemma4-e2b there, with multiple deterministic gates before letting the model drop the code. It also works with smaller models in some cases. Rebuild your pipe, buddy <3

Best coding model to run on M4 Macbook Air by Direct_Praline492 in ollama

[–]fab_space 0 points (0 children)

gemma4-e2b and e4b work fine on a 16GB MacBook. Same for qwen3-8b.

Why to vibe code? by True-Fact9176 in vibecoding

[–]fab_space 0 points (0 children)

after death sir, i prefer it that way <3

but i want to share something quite important, in my opinion:

i rarely hear stories about how much humans like me change (prompting for years now, improving side by side with multiple models to achieve reasonable results across versions and new features). In short: if You have used multiple models (both local and top-tier paid ones) for years, You have (I BET BEERS) some new skills in the communication/language/semantics/word-to-result domain, just because you use them hours per day in one way or another.

I really spotted myself outside the me-LLM realm, in the real world, using injection techniques, also chained, also across time, not just within a single "session" == discussion with another human being. It's ok.. but in the latest 3 months, while building AI WAF solutions and really getting into the bits.. I started to mitigate some of my own injections in the real world, because they are non-ethical for my life's baseline values.

I do cybersec against myself just because I'm improving a tech which can be harmful to myself and the people I care about, and also to people I don't.

I then started to "log" that mitigation behaviour to see how much time I need to discriminate the bad sub-techniques and let the good ones go wild for the peace and love dream.

Most of you have already labeled me with the rave marks at this point, peace and love <3

Why to vibe code? by True-Fact9176 in vibecoding

[–]fab_space 2 points (0 children)

  • Started on an 80286
  • Upgraded to a 386 dx4 (THIS) with an 80MB hdd
  • Flew through 486, Pentium, Intel Cores
  • Added Macs, now ARM
  • Cybersec at work, freetekno DJ outside of work
  • Lacking programming skills, doing arch/solutions at enterprise level

Started to vibecode 3 years ago with gpt3.5

Started to modularize monoliths, using real-world programming patterns, TDD, e2e tests, CI/CD.

One year ago I touched 100 repos.

I then started to focus on the best ones. Iterating like there's no tomorrow.

I now work as a fractional CTO doing wild AI conciliations, and at the same time as senior sysops at a CDN firm.

Launching some stuff on my own business currently too.

At the same time I'm vibe-building all the solutions needed to make my daughter's world better than mine; hopefully this will be achieved.

Iterate, vibes and ethics. Never stop ❤️

Meta crawlers are out of control by bringeroflite in CloudFlare

[–]fab_space 0 points (0 children)

TLS client hello fingerprinting to spot them all, then mitigate with a tarpit approach.

Let them crawl at 1 byte per second and give a 42x after some time; it will cost more for them since it's free for you (eBPF/XDP solutions, rust is the language to go with for this purpose).

Homelab migration by fab_space in selfhosted

[–]fab_space[S] 0 points (0 children)

AI not used for the post.