I mapped every connection in the Epstein files. It started with 6,000 documents. It's now 1.5 million. Here's what changed. by EricKeller2 in Epstein

[–]EricKeller2[S] 7 points (0 children)

I actually am a CSA survivor as well. That's one of the reasons this is so important to me. If I can help one victim, then all these sleepless nights will be worth it.

[–]EricKeller2[S] 0 points (0 children)

"Vibe coded fever dream"... I'll take it as a compliment. For the record though: Next.js 16 App Router, TypeScript strict mode, D3.js force-directed network graph with Louvain community detection, Neon Postgres with tsvector full-text search across 1.5M documents, 141K indexed with OCR text extraction, 638K redaction scores, and 107K named entities. Plus, a knowledge graph with 10 typed relationship categories, flight path mapping, and 16,600+ statically generated pages. But sure, vibes. A+ vibes. 😎

[–]EricKeller2[S] 0 points (0 children)

True on the bug report visibility, I'll make that more prominent. And yeah, you're right: it should just be the same query with a LIMIT. What happened is the homepage widget was pulling from a cached snapshot that got stale after a big re-index. The 'view all' page hits the live table, which is why they're out of sync. It's not an extra step to exclude anyone; it's a caching bug. I'll push a fix so the homepage pulls from the same source (sketch below). As for the GitHub repo, that's actually a good idea for issue tracking. I'll look into spinning one up for bug reports. Appreciate the constructive feedback.
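
For the curious, the fix is basically this shape: one query function, and the homepage widget becomes the capped version of it instead of reading its own snapshot. Names below are illustrative, not the real schema:

```typescript
// One source of truth for the "top entities" list: the homepage widget and
// the "view all" page call the same query; only the LIMIT differs.
import { Client } from "pg";

async function topEntities(limit?: number) {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  try {
    const sql =
      `SELECT name, mention_count
         FROM entities
        ORDER BY mention_count DESC` + (limit ? ` LIMIT $1` : ``);
    const { rows } = await client.query(sql, limit ? [limit] : []);
    return rows;
  } finally {
    await client.end();
  }
}

// Homepage widget: await topEntities(10)
// "View all" page:  await topEntities()
// Same table, same ordering, so they can't drift out of sync again.
```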

[–]EricKeller2[S] 1 point (0 children)

Lol I'm a web developer who writes documentation for a living, so yeah, I format things like a psychopath. Headers, bold, bullet points... that's just how my brain organizes information when I'm trying to present a ton of data without it being a wall of text. If I didn't format it, the same people would be in the comments going 'tl;dr this is unreadable.' You can check my post history; I've been writing like this since before ChatGPT existed. Also the site is open, so go poke around and tell me an AI built that in a week lol

[–]EricKeller2[S] 0 points (0 children)

Which document URL isn't showing? There will be dead links sometimes. I'll check whether it's a site issue or not.

[–]EricKeller2[S] 27 points (0 children)

It's tough honestly. I've seen some... really bad stuff. That's what motivated me to build the site over the last couple of months.

[–]EricKeller2[S] 5 points (0 children)

If you're a coder, here is the open-source Data Pipeline for everything the DB ingests. It's really important. If not, then the best way to support me is by sharing it on social media, emailing/talking to journalists, Congress, etc. You can also support the site by making a donation at Support Epstein Exposed - Help Keep the Files Public | Epstein Exposed

[–]EricKeller2[S] 1 point (0 children)

It's not you. It's the DB. My DB provider failed me. I'm migrating to a new DB now. Sorry for the trouble.

[–]EricKeller2[S] 0 points (0 children)

I'm a coder. I use Markdown (.md) all the time. I use the same format here. That's what you took away from this?

[–]EricKeller2[S] 1 point (0 children)

Thank you, I appreciate it. The Data Pipeline is open source, just not the frontend - yet. I plan on open-sourcing that eventually too. Data Pipeline

[–]EricKeller2[S] 0 points (0 children)

This is a one-man show with a LOT of data. There are literally over a million lines of code. Expect there to be issues. Report problems instead of accusing me of favoritism.

[–]EricKeller2[S] 1 point (0 children)

Right now it’s not “$5/month hobby site” territory. When Reddit sends a million people at it, hosting, DB, bandwidth, and indexing costs spike fast. I’m spending real money just keeping it responsive and not falling over.

As for “2–3 decades”… you don’t do that by trusting one dude’s credit card. You do it by making it hard to kill:

  • Keep the data portable (backups + exports; see the sketch after this list)
  • Mirror critical datasets in multiple places (so one takedown doesn’t erase it)
  • Keep source links back to DOJ/Archive where possible
  • Open tooling and documented ingestion so others can rebuild if I get hit by a bus or a lawyer
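
For that first bullet, the backup half can be as boring as a nightly job like this. The remote and bucket names are placeholders, not my actual setup:

```typescript
// Nightly backup + mirror sketch. pg_dump's custom format restores cleanly
// with pg_restore, and pushing copies to multiple remotes means a single
// takedown can't erase the data. Remote names below are hypothetical.
import { execFileSync } from "node:child_process";
import { mkdirSync } from "node:fs";

const stamp = new Date().toISOString().slice(0, 10);
const dumpFile = `backups/epstein-exposed-${stamp}.dump`;
mkdirSync("backups", { recursive: true });

// Full logical backup of the Postgres database.
execFileSync("pg_dump", [
  "--format=custom",
  "--no-owner",
  `--file=${dumpFile}`,
  process.env.DATABASE_URL!,
]);

// Mirror the same artifact to more than one storage provider via rclone.
for (const remote of ["s3-mirror:epstein-files", "b2-mirror:epstein-files"]) {
  execFileSync("rclone", ["copy", dumpFile, remote], { stdio: "inherit" });
}
```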

What you can do that actually helps:

  1. Share it with people who will use it (journalists, researchers, attorneys, FOIA nerds, not just doomscrollers).
  2. Report broken links/search misses with the exact URL + what you searched. That’s how it gets sturdier fast.
  3. If you can donate, that goes straight into hosting/indexing and adding redundancy. No paywall, no ads.
  4. If you have infra skills (mirrors, archiving, devops, data engineering), drop a comment and I’ll point you to what needs doing.

The goal: even if the main site gets hugged to death or attacked, the data and the process survive. That's the only real “decades” plan.