MinerU

Repo

65.3k

About MinerU

Turn any PDF into AI-ready data with MinerU. Struggling to feed complex documents into your AI models? MinerU is an open-source engine that converts messy PDFs, office documents, and scanned images into clean, machine-readable Markdown or JSON. It pairs OCR with vision-language models in a dual-engine approach to reconstruct complex layouts, convert tables to HTML, and render mathematical formulas as LaTeX. It is aimed at developers building RAG pipelines or agentic workflows, and it handles anything from a single file to thousands of pages. Open source: opendatalab/mineru
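As a rough illustration of the RAG use case mentioned above, the sketch below splits Markdown of the kind a document parser emits into one chunk per heading, leaving HTML tables and LaTeX blocks untouched inside each chunk. It does not call MinerU itself: `Chunk`, `chunk_markdown`, and the `sample` text are hypothetical names and data made up for this example, and heading-based splitting is only one of several possible chunking strategies.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str  # nearest Markdown heading above the text ("" if none)
    text: str     # body of the section, with surrounding whitespace stripped


# Matches ATX headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(markdown: str) -> list[Chunk]:
    """Split Markdown into one chunk per heading-delimited section."""
    chunks: list[Chunk] = []
    heading, body = "", []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # Close the previous section before starting a new one.
            text = "\n".join(body).strip()
            if text:
                chunks.append(Chunk(heading, text))
            heading, body = match.group(2).strip(), []
        else:
            body.append(line)
    # Flush the final section.
    text = "\n".join(body).strip()
    if text:
        chunks.append(Chunk(heading, text))
    return chunks


# Hypothetical parser output: prose, an HTML table, and a LaTeX formula.
sample = """# Results

Accuracy improved after tuning.

<table><tr><td>A</td><td>1</td></tr></table>

## Formula

$$E = mc^2$$
"""

chunks = chunk_markdown(sample)
for chunk in chunks:
    print(f"[{chunk.heading}] {len(chunk.text)} characters")
```

Each chunk keeps its heading, so a retriever can store the heading as metadata alongside the embedded text.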

Topics

pdf parser
data extraction
document processing
markdown converter
llm pipeline

Similar Repos

Open Source

LangChain

Framework for developing applications powered by language models

Free

Ecc

Supercharge Your AI Coding Workflow Today Are you tired of AI coding agents that lack focus and depth

Free

Hermes Agent

The AI Agent That Grows With You

Free

Skills

Supercharge Your AI Coding Workflow Are you tired of AI writing code that just does not fit your engineering standards
