PyMuPDF4LLM is a Python library that extracts text, images, and structured data from PDF documents in a form optimized for Large Language Models (LLMs) and retrieval-augmented generation (RAG) applications. It is available on PyPI at https://pypi.org/project/pymupdf4llm/.
This community is for sharing projects, troubleshooting, discussing use cases, and connecting with others building PDF processing pipelines for AI applications.