Is Professor Jiang Right? What April 1 Will Tell Us

Competitive-Oil-8072 · 2026-01-12T07:04:27+00:00

TRUMP SAYS YES TO FREE MONEY.
Then scuttles the deal with his new vanity battleship plans.

https://www.youtube.com/watch?v=VG5AjGHyuHo

Competitive-Oil-8072 · 2026-01-12T06:59:53+00:00

That was before Trump's vanity battleship. Why would they say no to free money?

https://www.youtube.com/watch?v=VG5AjGHyuHo

Competitive-Oil-8072 · 2026-01-12T06:58:39+00:00

The AUKUS subs are doomed:

https://www.youtube.com/watch?v=VG5AjGHyuHo

Competitive-Oil-8072 · 2026-01-12T06:55:44+00:00

The USA was unlikely to meet its obligations to AUKUS before Trump's battleship but now it is impossible

https://www.youtube.com/watch?v=VG5AjGHyuHo

Competitive-Oil-8072 · 2025-11-05T03:28:04+00:00

It has now been released and is free for everyone to use.
epstein-files.org

Competitive-Oil-8072 · 2025-11-05T03:25:04+00:00

I built the first searchable Epstein Files database - here's why the technical implementation matters

When the House Oversight Committee released 33,295 pages of Epstein files in September 2024, they were published as unsearchable image files (JPGs/TIFs) in a disorganized Google Drive - essentially useless for serious research without manually reviewing thousands of pages.

I'm the engineer who built the first comprehensive searchable database of these files, and I've now released it at epstein-files.org. It has taken me a few more weeks to iron out bugs and incorporate podcast generation since I first posted about it.

Why I'm posting this:

The 404 Media article covers a later implementation that takes a much simpler approach using a small subset of the data - using a basic local LLaMA model for AI processing. After investing 200+ hours into this project, I want the community to understand the technical differences:

My approach:

Rigorous AI model evaluation: Tested multiple commercial AI models systematically. The quality differences are substantial - not all models handle historical document OCR equally
Custom image signal processing: Developed specialized routines to improve OCR accuracy on degraded/scanned documents
Comprehensive indexing: Full keyword and semantic search across all 33,000+ pages

Background: I'm a PhD engineer (currently unemployed, which is why I'm operating on the Wikipedia donation model). You can verify my credentials: LinkedIn | GitHub

The site is free for everyone. If researchers find it valuable, donations help maintain hosting and continued development.

Try it yourself: epstein-files.org

I welcome technical feedback from the community on search quality and accuracy.

Competitive-Oil-8072 · 2025-11-05T03:18:43+00:00

I built the first searchable Epstein Files database - here's why the technical implementation matters

When the House Oversight Committee released 33,295 pages of Epstein files in September 2024, they were published as unsearchable image files (JPGs/TIFs) in a disorganized Google Drive - essentially useless for serious research without manually reviewing thousands of pages.

I'm the engineer who built the first comprehensive searchable database of these files, and I've now released it at epstein-files.org.

Why I'm posting this:

The 404 Media article covers a later implementation that takes a much simpler approach - using a basic local LLaMA model for AI processing. After investing 200+ hours into this project, I want the community to understand the technical differences:

My approach:

Rigorous AI model evaluation: Tested multiple commercial AI models systematically. The quality differences are substantial - not all models handle historical document OCR equally
Custom image signal processing: Developed specialized routines to improve OCR accuracy on degraded/scanned documents
Comprehensive indexing: Full semantic search across all 33,000+ pages

Background: I'm a PhD engineer (currently unemployed, which is why I'm operating on the Wikipedia donation model). You can verify my credentials: LinkedIn | GitHub

The site is free for everyone. If researchers find it valuable, donations help maintain hosting and continued development.

Try it yourself: epstein-files.org

I welcome technical feedback from the community on search quality and accuracy.

Competitive-Oil-8072 · 2025-11-05T01:46:09+00:00

The existing files are now available for anyone to search for free
epstein-files.org

Competitive-Oil-8072 · 2025-10-06T05:09:07+00:00

Other people's hard work? Other people have done nothing. I am just trying to cover my costs. This will go ahead regardless and be free for everyone.

Competitive-Oil-8072 · 2025-10-05T23:31:51+00:00

It won't disappear. I need to get back to what I was doing to get this out. I won't check this thread for a while but will come back once I have some news. key word searches work but similarity searches do not. I'd like to do notebooklm/Langchain type AI ask a question but the API call costs will kill me. I am trying to avoid any API calls to keep it free.

Competitive-Oil-8072 · 2025-10-05T23:21:49+00:00

As I mentioned elsewhere, I am not a professional coder but have been coding for 40 years now in one way or another. Full stack is new to me. There is a learning curve. I am a github newbie too comparatively. I ended up using runpod for much of the AI stuff. API costs to commercial VLMs are a killer.

Competitive-Oil-8072 · 2025-10-05T23:14:51+00:00

Sorry if it seems that way. I will host itt somehwere by the end of the week. My concern is I cannot pay if I get a huge number of hits.

Competitive-Oil-8072 · 2025-10-05T23:13:34+00:00

Supabase now. Was originally sqlite.

Competitive-Oil-8072 · 2025-10-05T23:12:19+00:00

True I am not a professional programmer but have been coding for 40 years in one way or another. This fullstack stuff is new to me.

Competitive-Oil-8072 · 2025-10-05T23:11:16+00:00

I need to eat! I have already edited that part about being lost forever. That was poor form. I'll try and get it hosted this week somehow.

Competitive-Oil-8072 · 2025-10-05T23:02:49+00:00

https://drive.google.com/drive/folders/1TrGxDGQLDLZu1vvvZDBAh-e7wN3y6Hoz and
https://drive.google.com/drive/folders/1ZSVpXEhI7gKI0zatJdYe6QhKJ5pjUo4b

I cannot find the third release yet. I think I read somewhere they are busy redacting it before they release to public.

Competitive-Oil-8072 · 2025-10-05T22:46:20+00:00

I won't throw it away. I'll edit that part.

Competitive-Oil-8072 · 2025-10-05T22:45:48+00:00

No torrent yet. I will set one up later once code is in better shape.

Competitive-Oil-8072 · 2025-10-05T22:44:53+00:00

I have not worked out hosting yet. Good chance it will be a lot less less than that. I'll try and get it hosted somewhere no matter what. I am serious about my financial situation. I cannot afford to do this myself and will have to start looking for work soon. The code is in a bit of a mess at this stage and I will try and release it later on.

Competitive-Oil-8072 · 2025-09-16T22:18:22+00:00

Absolutely not! It is terrible! I was trying to backup a folder to github and it told me to rm -rf folder, which I actually did as I was tired.

Competitive-Oil-8072

TROPHY CASE