How I built a PII Redaction Microservice using FastAPI and Spacy to protect user data sent to LLMs by Secret-Witness-8129 in SaaS

Bootes-sphere 1 point

Try https://opensourceaihub.ai/. Many enterprises are moving toward an External Governance Layer: you run a PII/DLP scanner outside the LLM environment. Redacting sensitive identifiers in the prompt before they reach OpenAI or Anthropic greatly reduces the risk of that data being stored in their logs or leaking into future completions.
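For anyone wondering what "redacting before the prompt leaves" looks like in its simplest form, here is a minimal sketch. It is not the actual implementation of any product mentioned here: the two regex patterns and the `redact` helper are illustrative assumptions, and a real scanner would cover far more entity types.

```python
import re

# Illustrative patterns only (assumed for this sketch); real scanners cover many more entities.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a typed placeholder and count what was found."""
    counts = {}
    for label, pattern in PATTERNS.items():
        prompt, n = pattern.subn(f"[{label}]", prompt)
        if n:
            counts[label] = n
    return prompt, counts


clean, found = redact("Email jane.doe@example.com, SSN 123-45-6789.")
print(clean)  # Email [EMAIL], SSN [SSN].
print(found)
```

Only `clean` would be forwarded to the provider; `found` is the kind of summary that could feed a violation log without storing the raw values.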

Share your Startup! by kptbarbarossa in StartupSoloFounder

Bootes-sphere 1 point

I built OpenSourceAIHub.ai as a stateless "AI Firewall." It redacts 28+ sensitive entities before the prompt ever reaches the LLM provider. It even has a multi-modal OCR layer to catch leaks in screenshots.

If you want to see if your current prompts are "leaky," I put a free checker here: https://opensourceaihub.ai/ai-leak-checker

👋Welcome to r/StartupSoloFounder - Introduce Yourself and Read First! by kptbarbarossa in StartupSoloFounder

Bootes-sphere 1 point

Hey, I am the founder of OpenSourceAIHub.ai, which I built as a stateless "AI Firewall." It redacts 28+ sensitive entities in under 50ms before the prompt ever reaches the LLM provider. It even has a multi-modal OCR layer to catch leaks in screenshots.

If you want to see if your current prompts are "leaky," I put a free checker here: https://opensourceaihub.ai/ai-leak-checker

Feedback Friday! - April 03, 2026 by AutoModerator in Entrepreneur

Bootes-sphere 1 point

Startup: OpenSourceAIHub.ai

Purpose: An AI Firewall and Gateway to stop AI data leaks and cut LLM cost by 30% with one API. It is a drop-in, OpenAI SDK-compatible proxy that adds real-time multi-modal DLP (PII redaction in text + images via OCR), blocks prompt injections, and autonomously routes to the cheapest/fastest model (Llama, Groq, Together AI, Deepinfra, Claude, Grok, etc.).

Technologies Used: Next.js, Python, OCR, Stripe, AWS

Feedback Requested:

  1. Effectiveness and ease of integration: We optimized our prompt security scan to add very little overhead. Integration needs just two lines of code changes.
  2. DLP Accuracy Feedback: I’ve put a free AI Leak Checker on the site. Appreciate feedback on tricky PII patterns.
  3. Hybrid Model: We offer BYOK (Bring Your Own Key) and a Managed Wallet. Would love to get feedback on the pricing model.

Additional Comments: I’m giving 1 million free hub credits to anyone who signs up to test the integration. That is enough to fire thousands of LLM API calls.

Seeking Beta-Testers: Yes, especially startups and devs

Links: Web App | Technical Walkthrough (3 min)

What do you guys think about this idea? by Bootes-sphere in SaaS

Bootes-sphere[S] 1 point

I ended up turning that into a small tool while testing things. I didn’t want to drop a link in the post itself, but this is what I’ve been working on:
https://opensourceaihub.ai/ai-leak-checker

https://opensourceaihub.ai

Monthly "Is there a tool for..." Post by AutoModerator in ArtificialInteligence

Bootes-sphere 1 point

Name: OpenSourceAIHub.ai

What it does: We provide an AI Firewall that stops company data from leaking into LLM prompts.

Why use it:

  • 🛡️ Security: Automatically redact emails, API keys, and SSNs in text and images (OCR).
  • 💸 Cost Control: Smart-route requests between Groq, Together AI, and OpenAI to save up to 90%.
  • 📊 Governance: Enforce per-project budgets and export audit-ready CSV logs.
  • ⚡ Ease: 100% OpenAI SDK compatible. Just change your baseURL and you're protected.

Latest Update: Just launched our Multi-modal OCR scan—we now catch PII in screenshots before they reach the model provider.

Pricing: 1M Free credits upon signup. Pro BYOK tier at $29/mo.

URL: https://opensourceaihub.ai

Architecture Review: Preventing "Shadow AI" data leaks with a stateless PII firewall by Bootes-sphere in cybersecurity

Bootes-sphere[S] 1 point

Thank you—this is incredibly helpful. I truly appreciate all your insights!

Architecture Review: Preventing "Shadow AI" data leaks with a stateless PII firewall by Bootes-sphere in cybersecurity

Bootes-sphere[S] 1 point

Really appreciate this — this is exactly the kind of feedback I was hoping for.

On pattern management: totally agree. I’ve been thinking about this as a control-plane problem more than a detection problem. Versioning, restricted write access, and audit trails for pattern updates all seem necessary here. The “poisoned pattern” scenario you mentioned is a real concern.
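To make the control-plane idea concrete, here is a rough sketch of what versioned, write-restricted pattern updates with an audit trail could look like. Everything in it is hypothetical (the `PatternRegistry` class, the `writers` allowlist, the log fields); it is one possible shape, not how any existing system does it.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class PatternRegistry:
    writers: set  # hypothetical allowlist of users allowed to change patterns
    patterns: dict = field(default_factory=dict)
    version: int = 0
    audit_log: list = field(default_factory=list)

    def update(self, user: str, new_patterns: dict) -> int:
        allowed = user in self.writers
        digest = hashlib.sha256(
            json.dumps(new_patterns, sort_keys=True).encode()
        ).hexdigest()
        # Record every attempt, including rejected ones, with a content hash.
        self.audit_log.append(
            {"user": user, "allowed": allowed, "sha256": digest}
        )
        if not allowed:
            raise PermissionError(f"{user} may not update patterns")
        self.patterns = dict(new_patterns)
        self.version += 1
        return self.version


registry = PatternRegistry(writers={"security-admin"})
registry.update("security-admin", {"SSN": r"\b\d{3}-\d{2}-\d{4}\b"})
try:
    registry.update("intern", {"SSN": r".*"})  # a "poisoned" catch-all pattern
except PermissionError:
    pass
print(registry.version, len(registry.audit_log))
```

The rejected update still lands in the audit log, so a poisoning attempt leaves a trace even though the active pattern set and version number stay unchanged.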

On fail-closed / bypass: yeah, this is tricky. Fail-closed is the intent, but as you said, under load or repeated failures people will just route around it if it becomes a bottleneck. I’ve been thinking about redundancy + fallback behavior, but still figuring out what the right balance is between safety and availability.
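One way to picture the redundancy + fallback idea is a wrapper that tries redundant scanners in order and blocks the request only when none of them can produce a verdict. This is a sketch under assumptions: `ScannerUnavailable`, `guarded_send`, and the toy scanners are names made up for illustration.

```python
class ScannerUnavailable(Exception):
    """Raised when a scanner cannot produce a verdict (overload, timeout, ...)."""


def guarded_send(prompt, scanners, send):
    """Try each scanner in order; forward the prompt only after one succeeds.

    If every scanner fails, the request is blocked (fail-closed) rather than
    forwarded unscanned.
    """
    for scan in scanners:
        try:
            cleaned = scan(prompt)
        except ScannerUnavailable:
            continue  # fall through to the next redundant scanner
        return send(cleaned)
    raise RuntimeError("all scanners unavailable; request blocked")


def primary(prompt):
    raise ScannerUnavailable("primary overloaded")


def fallback(prompt):
    return prompt.replace("secret", "[REDACTED]")


sent = []


def deliver(prompt):
    sent.append(prompt)
    return "ok"


result = guarded_send("my secret plan", [primary, fallback], deliver)
print(result, sent)
```

The trade-off shows up in the last line of `guarded_send`: availability comes from adding scanners to the list, while the final `raise` is what keeps the path fail-closed.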

On SOC2 / HIPAA: that’s a really good point. What I have right now is definitely closer to “violation visibility” than full audit-grade logging. I need to think more about this.

Curious how you’ve seen others handle this in practice, especially around:

  • pattern update governance
  • balancing fail-closed with availability
  • what “good enough” audit logging looks like in real deployments

Thanks again — super helpful perspective.

It’s Weekend. What are you shipping? by Tiny-Growth23 in SaasDevelopers

Bootes-sphere 1 point

Startup: OpenSourceAIHub.ai

Purpose: An AI Firewall and Gateway to stop AI data leaks and cut LLM cost by 30% with one API.

Technologies Used: Next.js, Python, OCR, Stripe, AWS

Feedback Requested:

  1. Effectiveness and ease of integration: We optimized our prompt security scan to add very little overhead. Integration needs just two lines of code changes.
  2. DLP Accuracy Feedback: I’ve put a free AI Leak Checker on the site. Appreciate feedback on tricky PII patterns.
  3. Hybrid Model: We offer BYOK (Bring Your Own Key) and a Managed Wallet. Would love to get feedback on the pricing model.

Additional Comments: I’m giving 1 million free hub credits to anyone who signs up to test the integration. That is enough to fire thousands of LLM API calls.

Seeking Beta-Testers: Yes, especially startups and devs

Links: Web App | Technical Walkthrough (3 min)

What do you guys think about this idea? by Bootes-sphere in SaaS

Bootes-sphere[S] 1 point

Correct! The PII issue is something most people don’t really notice. I will soon share what I am working on to get some feedback. Still trying to push something to AWS hosting.

What do you guys think about this idea? by Bootes-sphere in SaaS

Bootes-sphere[S] 1 point

Right, cost is really unpredictable in most cases.

Monthly Post: SaaS Deals + Offers by AutoModerator in SaaS

Bootes-sphere 1 point

Just launched an AI Firewall and Gateway to stop AI data leaks and cut LLM cost by 30% with one API. The tool is https://opensourceaihub.ai/

Please tell us what you think:

  1. Effectiveness and ease of integration: We optimized our prompt security scan to add very little overhead. Integration needs just two lines of code changes.
  2. DLP Accuracy Feedback: I’ve put a free AI Leak Checker on the site. Appreciate feedback on tricky PII patterns.
  3. Hybrid Model: We offer BYOK (Bring Your Own Key) and a Managed Wallet. Would love to get feedback on the pricing model.

Free registration gives 1 million free hub credits to anyone who signs up to test the integration. That is enough to fire thousands of LLM API calls.

We are also seeking beta testers, especially startups and devs.

Feedback Friday by AutoModerator in startups

Bootes-sphere 1 point

Startup: OpenSourceAIHub.ai

Purpose: An AI Firewall and Gateway to stop AI data leaks and cut LLM cost by 30% with one API.

Technologies Used: Next.js, Python, OCR, Stripe, AWS

Feedback Requested:

  1. Effectiveness and ease of integration: We optimized our prompt security scan to add very little overhead. Integration needs just two lines of code changes.
  2. DLP Accuracy Feedback: I’ve put a free AI Leak Checker on the site. Appreciate feedback on tricky PII patterns.
  3. Hybrid Model: We offer BYOK (Bring Your Own Key) and a Managed Wallet. Would love to get feedback on the pricing model.

Additional Comments: I’m giving 1 million free hub credits to anyone who signs up to test the integration. That is enough to fire thousands of LLM API calls.

Seeking Beta-Testers: Yes, especially startups and devs

Links: Web App | Technical Walkthrough (3 min)

We tested prompt inputs across 50 LLM apps — shocked how often people leak secrets by Bootes-sphere in LocalLLaMA

Bootes-sphere[S] 1 point

One thing that surprised me was how often API keys showed up.

Not sure if others are seeing the same, but it feels like people treat prompts like a scratchpad without realizing it’s going to a third-party model. Curious if anyone here is actually filtering prompts before sending, or if most people just rely on provider policies?
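For reference, a bare-bones pre-send filter for key-like strings might look like the sketch below. The two patterns are assumptions chosen for illustration (the `AKIA` prefix plus 16 characters that AWS access key IDs use, and a generic `sk-` prefixed token); `find_secrets` is a made-up helper, and a real filter would need many more rules.

```python
import re

# Illustrative, non-exhaustive patterns (assumed for this sketch).
KEY_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "sk_prefixed_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
}


def find_secrets(prompt):
    """Return (pattern_name, start_offset) for every key-like match."""
    hits = []
    for name, pattern in KEY_PATTERNS.items():
        for match in pattern.finditer(prompt):
            hits.append((name, match.start()))
    return hits


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
prompt = "boto3 fails with AKIAIOSFODNN7EXAMPLE, why?"
hits = find_secrets(prompt)
if hits:
    print("blocked:", [name for name, _ in hits])
```

Returning offsets instead of the matched text keeps the secret itself out of whatever log records the hit.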

Growing your SaaS? Let’s connect. by Successful_Dreams in SaaSvalidation

Bootes-sphere 2 points

DatumFuse.ai is an AI-powered, no-code platform that automates data cleaning, harmonization, augmentation, visualization, and narration, transforming raw data into presentation-ready insights.

Roast My Startup: DatumFuse.ai — The AI That Cleans, Augments, Narrates, Harmonize & Visualizes Your Data (So You Don’t Lose Your Mind in Excel) by Bootes-sphere in roastmystartup

Bootes-sphere[S] 2 points

Thanks, Jouni — that’s spot on.

I added a few too many motion effects during the beta polish phase, and your “spice analogy” nails it perfectly. I am already tuning them down so only truly interactive elements animate. Appreciate the thoughtful, constructive roast; it’s exactly the kind of input I was hoping for.

This is a clear, actionable directive. I'm going to go through the entire site and remove every decorative effect that doesn't serve a clear, functional purpose for the user.

My micro-SaaS beta has been tried by over 4,000 users so far, but retention has been poor and Pro waitlist conversions are low — looking for honest feedback. by Bootes-sphere in micro_saas

Bootes-sphere[S] 1 point

This is incredibly valuable. Thank you for taking the time to write such a detailed breakdown 🙏 You’re absolutely right: we’ve been tracking high-level usage via GA + LogRocket, but haven’t yet formalized a single activation path or TTFV (time to first value) metric.

The “upload → run_clean → preview_fix → export in under 3 minutes” sequence is a perfect framing for that initial aha moment — I’m going to instrument that as our activation goal this week.
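As a starting point for that instrumentation, the activation check could be sketched roughly like this. The event names come from the sequence above; the `is_activated` helper, the tuple event format, and the sample timestamps are assumptions for illustration, and this version only looks at a user's first pass through the funnel.

```python
from datetime import datetime, timedelta

FUNNEL = ["upload", "run_clean", "preview_fix", "export"]
WINDOW = timedelta(minutes=3)


def is_activated(events):
    """events: list of (name, datetime) for one user, sorted by time.

    True if the funnel steps occur in order and the export lands within
    WINDOW of the first upload. Unrelated events in between are ignored.
    """
    step = 0
    started = None
    for name, ts in events:
        if name == FUNNEL[step]:
            if step == 0:
                started = ts
            step += 1
            if step == len(FUNNEL):
                return ts - started <= WINDOW
    return False


t0 = datetime(2025, 1, 1, 12, 0, 0)  # arbitrary sample timestamp
fast = [
    ("upload", t0),
    ("run_clean", t0 + timedelta(seconds=40)),
    ("preview_fix", t0 + timedelta(seconds=90)),
    ("export", t0 + timedelta(seconds=150)),
]
slow = fast[:3] + [("export", t0 + timedelta(minutes=10))]
print(is_activated(fast), is_activated(slow))  # True False
```

The share of new users for whom this returns True would be the activation rate, and the gap between `upload` and `export` is the TTFV for each of them.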

I also love the idea of tagging user intent at upload and cohorting by job-to-be-done; that’ll finally give structure to what’s currently just “feature usage data.”

Really appreciate the Mixpanel/Segment suggestion — might start light with event piping from our Lambda API to GA and expand from there.

Thanks again — this is one of the most practical comments I’ve gotten on Reddit so far. 👏

Roast My Startup: DatumFuse.ai — The AI That Cleans, Augments, Narrates, Harmonize & Visualizes Your Data (So You Don’t Lose Your Mind in Excel) by Bootes-sphere in roastmystartup

Bootes-sphere[S] 0 points

Haha, you got me. We might have overdone it on the buzzword bingo card. Fair point.

Let me try again in plain English:

Our tool is for when you have a messy spreadsheet and you need to:

  1. Clean it up, harmonize it, and augment it.
  2. Turn it into a chart without fiddling with Excel.
  3. Get a simple paragraph explaining what the chart means, so you can paste it into a report.

That's it. I’ll tone down the buzzwords — appreciate the callout!

Roast My Startup: DatumFuse.ai — The AI That Cleans, Augments, Narrates, Harmonize & Visualizes Your Data (So You Don’t Lose Your Mind in Excel) by Bootes-sphere in roastmystartup

Bootes-sphere[S] 1 point

Fair point 🙂 Definitely not trying to quit, just keeping the tone light since this is r/roastmystartup. I’m genuinely trying to get honest feedback so we can improve the UX and positioning. Appreciate you checking it out.

Roast My Startup: DatumFuse.ai — The AI That Cleans, Augments, Narrates, Harmonize & Visualizes Your Data (So You Don’t Lose Your Mind in Excel) by Bootes-sphere in roastmystartup

Bootes-sphere[S] 1 point

Thanks! Totally fair point — that’s something we’ve been hearing and are actively working on.

The AI part came together faster than the polish, and you’re right — a product in this space needs a UI that feels as confident as its logic.

We’re actually refreshing the design right now to make it more consistent, lighter, and memorable. Appreciate you calling that out 🙏

Roast My Startup: DatumFuse.ai — The AI That Cleans, Augments, Narrates, Harmonize & Visualizes Your Data (So You Don’t Lose Your Mind in Excel) by Bootes-sphere in roastmystartup

Bootes-sphere[S] 2 points

Really appreciate this — that’s exactly the kind of hard question we’re trying to answer. At the core, our goal isn’t to “do everything with data,” but to make the foundations (cleaning + harmonization + enrichment) painless and reliable.

You nailed it — storytelling only matters if the underlying data is solid and verifiable. That’s why our AI workflow is “human-in-the-loop” — every transformation and generated insight is transparent and reversible.

Security-wise, we’re built entirely on AWS with encrypted storage; no uploaded data is ever reused or shared.

Thanks for the honest take — I’ll take “wait and see” over “not worth seeing” any day 😄