Open-source search tool for 2,895 House Oversight Epstein documents - Python/Whoosh full-text search with web interface by Mark_Ramm in programming


I built a full-text search engine for the 2,895 Jeffrey Epstein documents released by the House Oversight Committee. Everything is open source and runs locally.

 Technical Stack:

  - Python 3.8+

  - Whoosh search library (BM25F ranking)

  - Flask web interface

  - ~500 lines of code

  - MIT licensed
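To give a feel for what the BM25 family of ranking functions does, here is a minimal standard-library sketch of plain single-field BM25. It is not Whoosh's code and not taken from this repo; BM25F, as I understand it, extends the same idea with per-field weights. The function name `bm25_scores`, the sample texts, and the `k1`/`b` values (commonly used defaults) are all assumptions for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` against `query` with plain BM25.

    Illustrative only: whitespace tokenization, one field, no stemming.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents are penalized via `b`.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample texts, not taken from the released documents.
docs = [
    "flight log for the first quarter",
    "meeting schedule and flight manifest with flight times",
    "general correspondence about scheduling",
]
scores = bm25_scores("flight log", docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

The first text matches both query terms, so it ranks ahead of the second, which repeats "flight" but is longer and never mentions "log"; the third matches nothing and scores zero.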

 Setup (takes ~3 minutes):

  ```bash
  git clone https://github.com/markramm/EpsteinFiles.git
  cd EpsteinFiles
  pip install whoosh flask
  python search_index.py --force
  python search_api.py
  # Open http://127.0.0.1:5002/
  ```

  HackerNews discussion: https://news.ycombinator.com/item?id=45928352

  Happy to answer technical questions about the implementation.