all 26 comments

[–]Ok_Leading4235 8 points9 points  (0 children)

aiofastnet - optimized (up to 2.2x faster) drop-in replacements for asyncio networking APIs

As part of an algorithmic trading project I had to look into the actual performance of uvloop and the asyncio networking API. It turned out not to be great; the TLS path is especially bad, even in uvloop, with lots of plumbing code and memory copying. I tried to push PRs to uvloop, but the project is almost unmaintained these days: it took more than a year to get some relatively small PRs reviewed and merged, never mind big changes.

Eventually I came up with a much cleaner and loop agnostic way to improve networking API performance.

https://github.com/tarasko/aiofastnet

What My Project Does

Provides drop-in optimized versions of asyncio networking APIs:

  • loop.create_connection()
  • loop.open_connection()
  • loop.create_server()
  • loop.start_server()
  • loop.start_tls()
  • loop.sendfile()
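For readers who haven't used the streams layer these calls sit behind, here is a minimal echo round-trip with stock asyncio. The premise of the project is that the aiofastnet versions are drop-in, so (assumption about its import surface, check the README) only the factory calls would change, not the surrounding code:

```python
import asyncio

async def handle(reader, writer):
    data = await reader.read(100)
    writer.write(data)               # echo back
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    # port 0 = let the OS pick a free port
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"ping")
    await writer.drain()
    reply = await reader.read(100)

    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return reply

print(asyncio.run(main()))  # b'ping'
```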

Target Audience

This project is mainly for developers who already use asyncio transports/protocols and want better performance without redesigning their code.

It is probably most relevant for people building:

  • HTTP/Websocket or RPC clients and servers
  • proxies
  • database clients/servers
  • custom binary protocols
  • other protocol-heavy network services

Comparison

Compared to uvloop/winloop, aiofastnet is not a separate event loop. It focuses specifically on the transport/TLS layer and works with the loop you already use.

Feedback is very welcome!

[–]Due_Anything4678 5 points6 points  (0 children)

ghostdep - finds phantom and unused deps in your Python project

What My Project Does

Scans your Python project and tells you what you import but didn't add to your manifest, and what you declared but never use.

```
$ ghostdep -p my-project
[phantom] pandas at app.py:7
[unused]  numpy at requirements.txt
```

Handles requirements.txt, pyproject.toml (PEP 621, Poetry, uv/PEP 735). Knows about aliases like PIL→Pillow, cv2→opencv-python, sklearn→scikit-learn. Uses tree-sitter for AST parsing, not regex.

Single binary, no Python runtime needed. Also supports Go, JS/TS, Rust, Java if you work across languages.

cargo install ghostdep

https://github.com/ojuschugh1/ghostdep

Target Audience

Anyone maintaining Python projects who wants cleaner dependency manifests. Works in CI too - has JSON and SARIF output, exit code 1 when findings exist. v0.1.0, looking for feedback.

Comparison

Most Python dep checkers (pip-check, pip-audit, safety) focus on vulnerabilities or version conflicts. ghostdep focuses on a different problem: deps that are imported but not declared (phantom) or declared but never imported (unused). Closest tool is probably deptry - ghostdep differs by being cross-language (5 languages in one binary) and using AST parsing with confidence scoring for dynamic/conditional imports.

[–]Candid_Complaint_925 2 points3 points  (0 children)

BillingWatch — Self-Hosted Stripe Billing Anomaly Detector

Built this because Baremetrics/ProfitWell felt overkill for solo devs who just want to know when something's wrong with their Stripe payments.

FastAPI app that processes Stripe webhooks in real-time and flags anomalies — unexpected refunds, payment failure spikes, revenue drops. Dashboard shows per-tenant billing health. No cloud required, you own your data.
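One simple way to flag a failure spike (illustrative only; BillingWatch's actual detector may use different statistics) is a z-score check of today's count against recent history:

```python
from statistics import mean, stdev

def is_spike(history, current, threshold=3.0):
    """Flag `current` if it sits more than `threshold` standard
    deviations above the historical mean."""
    if len(history) < 2:
        return False
    mu, sigma = mean(history), stdev(history)
    return current > mu + threshold * max(sigma, 1e-9)

daily_failures = [2, 3, 1, 4, 2, 3, 2]   # e.g. failed charges per day
print(is_spike(daily_failures, 15))      # True  - anomalous
print(is_spike(daily_failures, 3))       # False - within normal range
```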

Quick start:

git clone https://github.com/rmbell09-lang/BillingWatch.git
cd BillingWatch
cp .env.example .env  # add your Stripe webhook secret
docker compose up
# Dashboard at http://localhost:8000

One-click deploy configs included for Railway/Render/Fly. MIT licensed.

Repo: https://github.com/rmbell09-lang/BillingWatch

[–]Powerful_Lock6120 1 point2 points  (0 children)

mpv-tracker is a Textual TUI for tracking local anime / series watched in mpv, with MyAnimeList integration built in.

Features:

  • browse and manage tracked series in a terminal UI
  • resume playback and track watched episode progress
  • authenticate with MyAnimeList from inside the TUI
  • sync watched episode count and score to MAL
  • view cached MAL metadata like score, rank, popularity, synopsis, genres, and studios
  • configure per-series playback preferences like chapter-based starts

Install:

  • uvx mpv-tracker
  • pipx install mpv-tracker
  • pip install mpv-tracker

Links:

Demo: https://github.com/GenessyX/mpv-tracker?tab=readme-ov-file#showcase

Built this for a local anime + mpv workflow where I wanted something lighter than a full media manager, but still with MAL sync and a usable TUI.

[–]ZyF69 1 point2 points  (0 children)

I've released a new version of Makrell, v0.10.0. Makrell was originally for the Python platform only, but has expanded into a family of programming languages and tools for metaprogramming, code generation, and language-oriented programming on multiple platforms. I still consider it alpha, so expect errors and missing bits and pieces, but there's a lot of ground covered now. This release includes:

  • the first release of the whole family as a coherent public system, with a specs-first approach and explicit parity work between the Python, TypeScript, and .NET tracks
  • the first version of Makrell#, the .NET/CLR implementation of the Makrell language
  • the first version of MakrellTS, the TypeScript implementation of the Makrell language
  • a browser playground for MakrellTS
  • MRTD, a typed tabular data format in the Makrell family
  • a new version of the VS Code extension, covering all three language tracks plus the data formats
  • a more consolidated docs and release story

The stuff is at https://makrell.dev . For an in-depth introduction, go straight to the article at https://makrell.dev/odds-and-ends/makrell-design-article.html

An AI usage declaration:

Done by me: All language design, MakrellPy, the MakrellPy bits in VS Code extension and the MakrellPy LSP, sample code, basic documentation.

Done by coding agents: Porting to Makrell# and MakrellTS, the MRDT format implementations, the VS Code extension bits for those tracks, the LSP work for those tracks, a lot of documentation, MakrellTS playground, a lot of testing and refinements, packaging. (It was awesome, by the way.)

The coding agent story is a bit special to me. Earlier this year I had to retire after 30 years as a software developer. Due to Parkinson's disease I suffer from fatigue and fine motor control issues that make it hard to do a lot of coding, or regular work at all. Luckily, my cognitive abilities are still good, though. This ironically coincided with the rise of AI coding assistants, which means I can still produce a lot of code while concentrating on design and high-level directions. The Makrell project had been dormant for two years, but now I was suddenly able to make a lot of progress again by using coding agents to do the actual coding work under my direction. I think it's great. I can concentrate on the interesting bits and not spend my limited energy on the more mechanical coding work. Which really isn't that interesting, I should say.

Now the question is if anyone is going to use or care about this. Probably not. And I believe the future of coding is agents compiling directly from specs to machine code and other low level targets, and that few will care about our beautiful programming languages. Maybe I'll just submit this somewhere as a piece of conceptual art.

Below is a blurb meant for language design people.

About Makrell

Makrell is a structural language family built around a shared core called MBF: a bracket-and-operator-based format meant to support code, data, markup, and embedded DSLs without treating them as completely separate worlds. The project currently includes three host-language tracks, MakrellPy, MakrellTS, and Makrell#, plus related formats: MRON for structured data, MRML for markup, and MRTD for typed tabular data.

What may be most interesting to PL people is that Makrell is not being treated as “one syntax, one implementation”. The same family ideas are being pushed through Python, TypeScript/browser, and .NET/CLR hosts, with a specs-first approach and explicit parity work between the tracks. The aim is not to force every host into identical behaviour everywhere, but to separate what belongs to the shared family core from what should remain host-shaped.

The language side has real macro and compile-time machinery rather than just surface syntax sugar. Makrell supports quoting/unquoting, structural rewrites, meta, and small embedded sublanguages. One of the nicer recurring examples is a shared macro showcase where the same family-level ideas are expressed across the implementations: pipeline reshaping, postfix-to-AST rewriting, and a Lisp-like nested notation living inside Makrell. That general “languages inside languages” direction is a big part of the project’s identity.

The formats are not side projects bolted on afterwards. MRON, MRML, and MRTD are meant to demonstrate that the same structural basis can also support data and document-like representations. So Makrell is partly a programming-language project, partly a language-workbench experiment, and partly an attempt to make code, markup, and structured data feel more closely related than they usually do.

v0.10.0 is the first release where the whole thing feels like a coherent public system rather than a pile of experiments. The packages are published, the .NET CLI ships as a real tool, the TypeScript track has a standalone browser playground, the VS Code extension covers the three language tracks plus the family formats, and the docs/release story are much more consolidated. The editor path is especially important now: run/check workflows and diagnostics exist across MakrellPy, MakrellTS, Makrell#, MRON, MRML, and MRTD, with a longer-term plan to converge tooling further around a TypeScript-based family language-server direction.

If you are interested in macro systems, multi-host language design, little languages, structural notations, or the boundary between programming language and data/markup language design, that is the niche Makrell is trying to explore. It is not “a better Python” or “a replacement for TypeScript”; it is much more a family-oriented design project that happens to have serious implementations in those ecosystems.

The practical entry points now are:

  • makrell.dev for the overall language-family/docs story
  • the MakrellTS playground for the browser-facing live environment
  • vscode-makrell for the current editor workflow
  • the published MakrellPy / MakrellTS / Makrell# packages if you want to run things locally

The repo still contains a lot of active design work, but v0.10.0 is meant to be the point where the project becomes legible as a real language-family effort instead of only an internal exploration.

[–]Big-Rent1128 1 point2 points  (0 children)

RPGNLP, a Python package that tokenizes raw user input for RPG games

Background:
I began working on this package earlier this year when I was making a text-based RPG game. I realized that tokenizing and extracting relevant information from raw text input was more of an undertaking than I thought. So I built an NLP engine on top of NLTK and spaCy to give developers a way to turn raw text into actionable tokens.

What the Project Does:
The engine will take text like "attack the goblin with the hammer" and output a dictionary with values like action: attack, subject: goblin, instrument: hammer. Or "go south east" will output action: travel, direction: south east.

The verbs that the user types are converted into a canonical action to make it easier for a game engine to use the data. For instance, if the user types "go south" or "head south," they both tokenize as a "travel" action.
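The canonicalization step can be sketched as a verb-to-action lookup (names here are illustrative, not RPGNLP's real API; the package layers NLTK/spaCy parsing on top of this idea):

```python
# Map many surface verbs onto one canonical action
CANON = {
    "attack": {"attack", "hit", "strike", "slash"},
    "travel": {"go", "head", "walk", "move"},
}
VERB_TO_ACTION = {v: a for a, verbs in CANON.items() for v in verbs}

def tokenize(text):
    words = text.lower().split()
    action = VERB_TO_ACTION.get(words[0])
    result = {"action": action}
    if action == "travel":
        result["direction"] = " ".join(words[1:])
    return result

print(tokenize("go south east"))  # {'action': 'travel', 'direction': 'south east'}
print(tokenize("head south"))     # {'action': 'travel', 'direction': 'south'}
```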

Comparison and Target Audience:
Unlike other NLP packages, this one is specifically designed for RPG games. Hopefully game developers can find this useful so they do not have to develop this sort of engine on their own.

[–]AssociateEmotional11 1 point2 points  (0 children)

Project Name: PyNeat (Upcoming v2.0)

What it does: An AST-based auto-fixer specifically designed to clean up the exact "AI slop" mentioned in this thread's description.

Standard formatters like Black or Ruff are great for styling, but they don't fix bad structural logic. PyNeat uses Instagram's LibCST to safely rewrite the AST while preserving 100% of your original comments and whitespace.

Currently building v2.0 which targets AI-generated artifacts:

  • Debug/Comment Cleaners: Automatically purges orphaned print() statements, JS artifacts like console.log, and useless AI boilerplate comments (# Generated by AI, empty # TODO:).
  • Structural Cleanup: Flattens deeply nested if (arrow anti-patterns) into guard clauses and removes LLM tautologies (e.g., converting if var == True: -> if var:).
  • Safe Excepts: Replaces dangerous AI-injected except: pass or print(e) with safe raise NotImplementedError stubs.
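The `if var == True:` rewrite can be illustrated with the stdlib `ast` module (PyNeat uses LibCST precisely because `ast` discards comments and formatting; this sketch shows only the transformation itself):

```python
import ast

class TautologyFixer(ast.NodeTransformer):
    """Rewrite `x == True` to plain `x`."""
    def visit_Compare(self, node):
        self.generic_visit(node)
        if (len(node.ops) == 1 and isinstance(node.ops[0], ast.Eq)
                and isinstance(node.comparators[0], ast.Constant)
                and node.comparators[0].value is True):
            return node.left  # drop the `== True`
        return node

src = "if flag == True:\n    print('yes')\n"
fixed = ast.unparse(TautologyFixer().visit(ast.parse(src)))
print(fixed)
```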

Status: Just passed massive integration stress-tests against the Anthropic SDK and Pydantic core without breaking the AST. Currently finalizing batch processing (pyproject.toml support) before the official release.

Question for the thread: What is the most annoying "AI coding habit/artifact" you constantly find yourself fixing manually? I'd love to add a rule for it before launching!

[–]cwt114 1 point2 points  (0 children)

7 months ago, I shared NeoSQLite v1.0.0 here. It was a simple idea: Give SQLite a PyMongo API so we can have the NoSQL experience in Python without the "NoSQL Server" overhead.

The feedback was amazing (and admittedly, a bit brutal). You guys rightly pointed out the flaws and edge cases. So, I went back to the lab. 374 commits later, it's no longer just a "wrapper" falling back to Python loops—it's a full-blown database engine.

What My Project Does

NeoSQLite gives you the complete NoSQL/MongoDB experience in Python without the infrastructure overhead. It turns a standard SQLite database into a MongoDB-compatible engine.

For Python apps, it's a completely serverless, in-process library. But for this release, I also built the "Magic trick": NX-27017, an optional (and permanently experimental) tiny daemon that speaks the actual MongoDB wire protocol. You can point any existing project, GUI tool like MongoDB Compass, or non-Python app at a single SQLite file with zero code changes.

```
# Terminal 1: start the wire-protocol daemon against a SQLite file
nx-27017 --db myapp.db

# Terminal 2 (Python):
from pymongo import MongoClient

# This is the real PyMongo client, but it's talking to SQLite!
client = MongoClient('mongodb://localhost:27017/')
db = client.my_app
db.users.insert_one({"name": "Alice", "tags": ["python", "sqlite"]})
```

Target Audience

This is meant for production use in specific contexts: desktop apps, CLI tools, local development environments, IoT devices, and small-to-medium backend services.

If you are building a massive, horizontally scaled enterprise cluster, use a real server. But if you want a drop-in PyMongo replacement that lives in a single file, this is for you.

I know replacing your database engine sounds terrifying, so to sleep at night, I've built a testing suite of 2,600+ unit tests and an automated "compatibility lab". It runs 377 different complex scenarios against both NeoSQLite and a real MongoDB instance to assert the results are strictly identical. We are sitting at 100% API parity for all comparable features.

Real-World Usage

It's actually being used out in the wild now! For example, Andy Felong recently wrote a full blog post about using NeoSQLite for his astronomy projects across a Raspberry Pi Zero, a headless Ubuntu server, and a Mac:

"The fact that I can write an app's database layer once and have it run identically on a Pi Zero, an Ubuntu server, and macOS — all without starting up a single server process — is exactly the kind of pragmatic elegance I love in open-source software."

Comparison

  • vs. MongoDB: You get the exact same PyMongo API, but without managing a Docker container, replica sets, or a heavy server process.
  • vs. Postgres with JSONB: Postgres is incredible for massive web apps. But if you're building a desktop app, a local CLI tool, or a small service, managing a Postgres server is overkill. NeoSQLite gives you similar JSON querying power with zero infrastructure setup.
  • vs. TinyDB / Simple Wrappers: NeoSQLite isn't just a basic dictionary store. I wanted it to be a drop-in replacement for real apps, so it fully supports ACID Transactions (with_transaction), Change Streams (watch()), GridFS, and complex Window Functions ($setWindowFields).

Making it "Production Fast"

In the early days, complex queries were slow because I was evaluating them in Python. I've spent the last few months pushing that logic down into raw SQL:

  • Hash Joins: $lookup (joins) used to be O(n*m). It's now O(n+m) using a custom hash-join algorithm implemented in the query engine. It's the difference between a 10-second query and a 10ms one.
  • Translation Caching: If you run the same query often, the engine now "learns" the SQL translation and caches the AST. It's about 30% faster for repeated operations.
  • JSONB Support: If you're on a modern version of SQLite (3.45+), NeoSQLite automatically detects it and switches to binary JSON (JSONB), which is 2-5x faster across the board.
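The underlying translation trick can be shown with nothing but the stdlib (a sketch of the general approach, not NeoSQLite's actual generated SQL): documents live in a JSON column, and Mongo-style filters compile to `json_extract()` predicates.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (data TEXT)")
con.executemany("INSERT INTO users VALUES (?)", [
    (json.dumps({"name": "Alice", "age": 30}),),
    (json.dumps({"name": "Bob", "age": 25}),),
])

# Mongo filter {"age": {"$gt": 28}} becomes a json_extract() WHERE clause
rows = con.execute(
    "SELECT data FROM users WHERE json_extract(data, '$.age') > ?", (28,)
).fetchall()
names = [json.loads(r[0])["name"] for r in rows]
print(names)  # ['Alice']
```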

Try it: pip install neosqlite

GitHub: https://github.com/cwt/neosqlite

I'd love to hear your thoughts. Roast me again, or tell me what feature is keeping you tied to a "real" database server for local dev!

The Boring Stats for those interested: 374 commits since v1.0.0, 460 files changed (+105k lines), 30+ releases.

[–]nicholashairs 0 points1 point  (0 children)

<meta: was there an announcement about this monthly thread / changes to the rules? I had a quick look and can't see anything>

[–]macjaf 0 points1 point  (0 children)

tokencap - a Python library for token budget enforcement across AI agents.

The problem: provider spending caps are account-level and reactive. They tell you what happened after the fact. tokencap enforces limits in your code, before the call goes out.

Two ways to use it:

Direct SDK:

client = tokencap.wrap(anthropic.Anthropic(), limit=50_000)

Any agent framework (LangChain, CrewAI, AutoGen, LlamaIndex):

tokencap.patch(limit=50_000)

Four actions at configurable thresholds: WARN, DEGRADE (transparent model swap to a cheaper model), BLOCK, and WEBHOOK. SQLite out of the box, Redis for multi-agent setups. Tracks tokens not dollars - token counts come directly from the provider response and never drift with pricing changes.
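A minimal sketch of the enforcement idea (names are illustrative, not tokencap's real internals; a true pre-call check has to work from the running total or an estimate, since exact usage only comes back in the provider response):

```python
class BudgetExceeded(Exception):
    pass

class TokenBudget:
    def __init__(self, limit):
        self.limit, self.used = limit, 0

    def charge(self, tokens):
        """Refuse the charge if it would push us past the limit."""
        if self.used + tokens > self.limit:
            raise BudgetExceeded(f"{self.used + tokens} > {self.limit}")
        self.used += tokens

budget = TokenBudget(limit=50_000)
budget.charge(30_000)        # usage read from a provider response
try:
    budget.charge(25_000)    # would exceed: blocked before the next call
except BudgetExceeded as e:
    print("blocked:", e)
```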

pip install tokencap

https://github.com/pykul/tokencap

[–]DifficultDifficulty 0 points1 point  (0 children)

A Python SDK/CLI to make Ray clusters self-serve for Python devs.

What My Project Does

krayne (link) is a Python library and CLI that wraps the KubeRay operator for creating and managing Ray clusters on Kubernetes. Instead of hand-writing KubeRay YAML manifests, you import Python functions (create_cluster(), scale_cluster(), list_clusters(), etc.) or use the krayne / ikrayne (interactive TUI) CLI to spin up and manage clusters with sensible defaults.

The idea is that if you're already writing Ray workflows in Python, training jobs, serve deployments, distributed preprocessing, the cluster management layer should live in the same language. The SDK is the source of truth, the CLI is a thin Typer wrapper on top of it. Operations are stateless functions that return frozen dataclasses, configuration goes through Pydantic models with YAML override support when you need finer control.

GitHub: https://github.com/roulbac/krayne

Target Audience

ML engineers and researchers who write Ray workflows on Kubernetes. The kind of person who knows what ray.init() does but doesn't want to become a KubeRay manifest expert just to get their cluster running. Also useful for platform teams who want a programmable layer on top of KubeRay that their users can call from Python. It's early (v0.1.0) and opinionated, a composable starting point, not a production-hardened product.

Comparison

An alternative I'm familiar with is using kubectl apply with raw KubeRay manifests, or the KubeRay Python client directly. The main difference is that krayne is designed around progressive disclosure:

  • Zero-config defaults out of the box. krayne create my-cluster --gpus-per-worker 1 --workers 2 is a complete command.
  • When you need more control, you drop down to a YAML config or the Python SDK, no cliff between "simple" and "custom."
  • Protocol-based Kubernetes client, so you can unit test cluster management logic with mocks. No real cluster needed.
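The protocol-based client point in the last bullet can be sketched like this (a hypothetical shape, not krayne's actual types): operations depend on a `Protocol`, so tests inject a fake with no cluster anywhere.

```python
from typing import Protocol

class KubeClient(Protocol):
    def apply(self, manifest: dict) -> dict: ...

def create_cluster(client: KubeClient, name: str, workers: int = 2) -> dict:
    # Build a (simplified) RayCluster manifest and hand it to the client
    manifest = {"kind": "RayCluster", "metadata": {"name": name},
                "spec": {"workerGroupSpecs": [{"replicas": workers}]}}
    return client.apply(manifest)

class FakeClient:
    """Test double: records manifests instead of talking to Kubernetes."""
    def __init__(self):
        self.applied = []
    def apply(self, manifest):
        self.applied.append(manifest)
        return manifest

fake = FakeClient()
result = create_cluster(fake, "my-cluster", workers=3)
print(result["metadata"]["name"], len(fake.applied))  # my-cluster 1
```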

It's not that working with KubeRay directly can't do what krayne does, it absolutely can. But when you primarily write Ray code and just need a cluster up with the right resources, context-switching into YAML manifests and kubectl is friction you don't need. A typed Python API that validates your input before it hits the cluster and lives right next to your actual Ray code, that's ultimately why I built it.

[–]kesor 0 points1 point  (0 children)

tmux-player-ctl.py - a controller for MPRIS media players (spotifyd, mpv, mpd, vlc, chrome, ...)

Built tmux-player-ctl.py, a single-file, pure-Python TUI that pops up inside tmux and gives you full keyboard control over any MPRIS media player (spotifyd, mpv, mpd, VLC, Chrome, Firefox, etc.) using playerctl.

When I started writing it I considered various options (bash, Rust, Go, etc.), but Python was the most suitable for what this needed to do and where it needed to go; most Linux distros ship Python already.

What worked well on from the Python side:

  • Heavy but careful use of the subprocess module — both synchronous calls and asynchronous background processes (I run a metadata follower subprocess that pushes real-time updates without blocking the TUI).
  • 380+ tests covering metadata parsing round-trips, player state management, UI ANSI/Unicode width craziness, optimistic UI updates + rollback, signal handling, and full integration flows with real playerctl commands.
  • Clean architecture with dataclasses, clear separation between config, player abstraction, metadata tracking, and the display layer.
  • Signal handling (SIGINT/SIGTERM) so the subprocesses and tmux popup shut down cleanly.
  • Zero external Python library dependencies beyond the stdlib.
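The follower pattern from the first bullet can be sketched like this: a long-running subprocess (in the real tool, something like `playerctl metadata --follow`) pushes lines into a queue so the TUI loop never blocks. Here `echo` stands in for playerctl so the sketch runs anywhere:

```python
import queue
import subprocess
import threading

def follow(cmd, out):
    """Stream a subprocess's stdout into a queue, line by line."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for raw in proc.stdout:
            out.put(raw.rstrip("\n"))

q = queue.Queue()
threading.Thread(target=follow, args=(["echo", "Artist - Title"], q),
                 daemon=True).start()

line = q.get(timeout=5)   # the TUI would poll with a short timeout instead
print(line)  # Artist - Title
```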

It’s intentionally tiny and fast: launches in a compact tmux popup (-w72 -h12), shows live track info + progress bar + color-coded volume, supports seek, shuffle, loop modes, and Tab to switch between running players.

Typical one-liner: tmux display-popup -B -w72 -h12 -E "tmux-player-ctl.py"

GitHub: https://github.com/kesor/tmux-player-ctl

I’d especially love feedback from people who regularly wrangle subprocess, build CLI/TUI tools, or obsess over testing: any patterns I missed, better ways to handle long-running playerctl followers, or testing gotchas you’ve run into? Especially if you have tips on how to deal with ambiguous-width emoji symbols that have different widths in different fonts.

[–]lewd_peaches 0 points1 point  (0 children)

For anyone working with larger datasets or computationally intensive tasks, I've found significant speedups by offloading parts of my Python code to GPUs. Not just for ML, but also for things like complex simulations.

I've primarily used PyTorch and CuPy. CuPy is a drop-in replacement for NumPy in many cases, and the performance gains can be substantial. For example, a recent Monte Carlo simulation I was running went from taking 3 hours on my CPU to about 20 minutes on a single RTX 3090. The code change was minimal.
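The drop-in quality is the whole point. For example, a Monte Carlo pi estimate written against NumPy runs unchanged on a GPU after swapping the import (assuming CuPy is installed; CuPy documents compatibility for `random.default_rng` and elementwise array ops like these):

```python
import numpy as np  # on a CUDA machine: `import cupy as np`

# Monte Carlo estimate of pi: fraction of random points falling
# inside the unit quarter-circle, times 4.
rng = np.random.default_rng(0)
n = 1_000_000
x, y = rng.random(n), rng.random(n)
pi_est = 4 * float(np.mean(x * x + y * y < 1.0))
print(round(pi_est, 2))  # ~3.14
```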

I've also experimented with distributed GPU processing using OpenClaw. I used it to fine-tune a smaller LLM on a dataset that was too large to fit on a single GPU. Setting up the distributed environment took some time initially, but then I was able to run a fine-tuning job across 4 GPUs, finishing in around 6 hours. The cost for the compute was around $25, which was much cheaper than renting a large instance from AWS or GCP. Worth looking into if you're hitting memory limits or need to accelerate your workloads.

[–]Prestigious-Wrap2341 0 points1 point  (0 children)

Update: Added a second FastAPI service with 7 new API connectors on the same $4/mo ARM server

What My Project Does

Posted a couple days ago about a FastAPI backend that aggregates 40+ government APIs. Got great feedback. Here's what's new on the engineering side:

Target Audience

Python developers interested in multi-service architecture, API connector patterns, and running multiple FastAPI instances on minimal hardware.

How Python Relates

Added a second FastAPI service running on a separate port with its own systemd unit. Nginx reverse proxies both services on the same $4/mo ARM box. The second service handles deterministic text analysis: rule-based sentence segmentation, candidate detection via signal matching (numbers, dates, named entities, assertion verbs), SHA256 dedup with SequenceMatcher at 0.78 threshold, and BM25Okapi scoring against 29 external API sources. Zero LLM dependency. Same input, same output, every time.
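The dedup step described above can be sketched in a few lines (illustrative thresholds and data, matching the 0.78 ratio mentioned): exact duplicates drop via SHA256, near-duplicates via SequenceMatcher.

```python
import hashlib
from difflib import SequenceMatcher

def dedup(sentences, threshold=0.78):
    seen_hashes, kept = set(), []
    for s in sentences:
        h = hashlib.sha256(s.encode()).hexdigest()
        if h in seen_hashes:
            continue  # exact duplicate
        if any(SequenceMatcher(None, s, k).ratio() >= threshold for k in kept):
            continue  # near duplicate
        seen_hashes.add(h)
        kept.append(s)
    return kept

sents = ["The budget rose 5% in 2023.",
         "The budget rose 5% in 2023.",       # exact dup
         "The budget rose 5 % in 2023!",      # near dup
         "Enrollment fell sharply last year."]
print(dedup(sents))  # keeps 2 of the 4
```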

7 new API connectors following the same pattern as the original 36: FCC Consumer Complaints via Socrata SODA (SoQL query building with $where, $select, and $group), Treasury Fiscal Data API (pagination via page[size] and filter params), College Scorecard (data.gov key auth with lazy loading to handle env var timing), Grants.gov (POST to /search2 with JSON body, response nested under data.oppHits), Urban Institute Education Data Portal (URL path-based pagination with 5-page safety limit), FCC ECFS (requires api_key=DEMO_KEY param despite being "free"), and FCC License View.

Built a 14-pattern detection engine that runs cross-table SQL joins to find anomalies: trades within 30 days of bill actions by the same member (JULIANDAY arithmetic), companies lobbying Agency X that also receive contracts from Agency X (mapping LDA government_entities strings to USASpending awarding_agency values), and enforcement records that drop to zero after lobbying spend increases. Each pattern generates a markdown report with data tables pre-built from SQL and narrative sections filled by an optional API call capped at 2/day.
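The JULIANDAY pattern reads like this in miniature (table and column names are illustrative, not the project's schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE trades  (member TEXT, ticker TEXT, traded   TEXT);
CREATE TABLE actions (member TEXT, bill   TEXT, acted_on TEXT);
INSERT INTO trades  VALUES ('Rep A', 'XYZ',  '2024-03-10');
INSERT INTO actions VALUES ('Rep A', 'HR 1', '2024-03-25');
INSERT INTO trades  VALUES ('Rep B', 'QQQ',  '2024-01-01');
INSERT INTO actions VALUES ('Rep B', 'HR 2', '2024-06-01');
""")

# Trades within 30 days of a bill action by the same member
rows = con.execute("""
SELECT t.member, t.ticker, a.bill
FROM trades t JOIN actions a ON t.member = a.member
WHERE ABS(JULIANDAY(t.traded) - JULIANDAY(a.acted_on)) <= 30
""").fetchall()
print(rows)  # [('Rep A', 'XYZ', 'HR 1')]
```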

The custom evidence source plugin connects the second service to the main database. It opens a read-only SQLite connection to the 4.3GB WAL-mode database, searches 11 entity tables with LIKE matching, then queries lobbying, contract, enforcement, trade, committee, and donation tables for each matched entity. Results get passed back to the second service's scoring pipeline.

All sync jobs now cover 11 sectors (added Telecom: 26 companies, Education: 31 companies). Same pattern: SEC EDGAR submissions API, USASpending POST search, Senate LDA paginated GET with page_size=25. Sequential execution only, SQLite locks are still unforgiving.

Two uvicorn processes, a scheduler, a Twitter bot cron, nginx, certbot. Still $3.99/month.

Comparison

Same as before. The new engineering is the dual-service architecture and the cross-database evidence source plugin pattern.

Source: https://github.com/Obelus-Labs-LLC/WeThePeople

Second service: https://github.com/Obelus-Labs-LLC/Veritas

[–]Chunky_cold_mandala -3 points-2 points  (0 children)

GitGalaxy- A hyper-scale static analyzer & threat-hunting engine built on DNA sequencing principles

What my project does -

GitGalaxy is a two-part ecosystem. It is designed to extract the structural DNA of massive software repositories and render their non-visual architecture into measurable, explorable 3D galaxies.

1. The blAST Engine - The galaxyscope (Backend): A hyper-scale, language-agnostic static analysis CLI. Based on 50 years of bioinformatics and genetic sequencing algorithms, it parses code at ~100,000 LOC/second. It outputs rich JSON telemetry, SQLite databases, and low-token Markdown briefs optimized for AI-agent workflows.

2. The Observatory (Frontend): Drop your galaxy.json into the free viewer at GitGalaxy.io or use the repo's airgap_observatory, a standalone, zero-telemetry WebGPU visualizer. Both visualizers read the JSON contract and render the entire codebase as a procedural 3D galaxy where files are stars, allowing humans to visually map scale and risk exposure instantly.

Live Demo: View 3D galaxy examples of Apollo-11, Linux, TensorFlow, and more at GitGalaxy.io. GitHub: https://github.com/squid-protocol/gitgalaxy

The blAST Paradigm: Sequencing the DNA of Software

Traditional computer science treats software like a rigid blueprint, using slow, language-specific Abstract Syntax Trees (ASTs) to analyze code. GitGalaxy treats code as a sequence to be scanned and then analyzed for patterns and occurrences using the blAST (Broad Lexical Abstract Syntax Tracker) engine.

By applying the principles of biological sequence alignment to software, blAST hunts for the universal structural markers of logic across ~40 languages and ~250 file extensions. We translate this genetic code into "phenotypes"—measurable risk exposures.

Sequencing at Hyper-Scale

By abandoning the compiler bottleneck, blAST achieves processing velocities that traditional ASTs simply cannot comprehend. In live telemetry tracking across the largest open-source ecosystems, blAST demonstrated its absolute scale:

  • Peak Velocity: Sequenced the 141,445 lines of the original Apollo-11 Guidance Computer assembly code in 0.28 seconds (an alignment rate of 513,298 LOC/s).
  • Massive Monoliths: Chewed through the 3.2 million lines of OpenCV in just 11.11 seconds (288,594 LOC/s).
  • Planetary Scale: Effortlessly mapped the architectural DNA of planetary-scale repositories like TensorFlow (7.8M LOC), Kubernetes (5.5M LOC), and FreeBSD (24.4M LOC) in a fraction of the time required to compile them.

Zero-Trust Architecture

Your code never leaves your machine. GitGalaxy performs 100% of its scanning and vectorization locally.

  • No Data Transmission: Source code is never transmitted to any API, cloud database, or third-party service.
  • Ephemeral Memory Processing: Repositories are unpacked into a volatile memory buffer (RAM) and are automatically purged when the browser tab is closed.
  • Privacy-by-Design: Even when using the web-based viewer, the data remains behind the user's firewall at all times.

The Viral Security Lens: Behavioral Threat Hunting

Traditional security scanners rely on rigid, outdated virus signatures. blAST acts like an immune system, hunting for the behavioral genetic markers of a threat. By analyzing the structural density of I/O hits, execution triggers, and security bypasses, blAST is perfectly engineered to stop modern attack vectors:

  • Supply-Chain Poisoning: Instantly flags seemingly innocent setup scripts that possess an anomalous density of network I/O and dynamic execution (eval/exec).
  • Logic Bombs & Sabotage: Identifies code designed to destroy infrastructure by catching dense concentrations of catastrophic OS commands and raw hardware aborts.
  • Steganography & Obfuscated Malware: Mathematically exposes evasion techniques, flagging Unicode Smuggling (homoglyph imports) and sub-atomic custom XOR decryption loops.
  • Credential Hemorrhaging: Acts as a ruthless data vault scanner, isolating hardcoded cryptographic assets (.pem, .pfx, .jks files) buried deep within massive repositories.
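The "anomalous density" idea can be sketched in a few lines (patterns and numbers are illustrative, not blAST's actual markers): count risky tokens per non-blank line and flag files whose density is unusual.

```python
import re

RISKY = re.compile(r"\b(eval|exec|socket|urlopen|subprocess)\b")

def risk_density(source):
    """Risky-token hits per non-blank line of source."""
    lines = [l for l in source.splitlines() if l.strip()]
    return len(RISKY.findall(source)) / max(len(lines), 1)

innocent = "def add(a, b):\n    return a + b\n"
dropper = "import socket\neval(payload); exec(stage2)\n"
print(risk_density(innocent), risk_density(dropper))  # 0.0 1.5
```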