OCR相关的笔记
OCR相关的知识整理:建议实际业务使用的时候,基地模型使用PaddleOCR,然后布局可以使用minerU,pdf类型的文本,可以使用PymuPDF工具,我们实际产品中就是这么用的,文本布局和表格部分一般需要自己训练优化,很难满足各类自己的应用场景。
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/docker/linux-docker.html
https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/quickstart.md 官方文档说明
https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/ppstructure/docs/quickstart.md : structure分析
docker pull paddlepaddle/paddle:2.6.0-gpu-cuda12.0-cudnn8.9-trt8.6
docker:nvidia-docker run --name paddle -it -v $PWD:/paddle registry.baidubce.com/paddlepaddle/paddle:2.5.2-gpu-cuda10.2-cudnn7.6-trt7.0 /bin/bash
数据准备,模型训练:
https://zhuanlan.zhihu.com/p/686402622
数据标注: paddleLabel
https://blog.csdn.net/qq_49627063/article/details/119134847
数据标注工具:
https://github.com/PFCCLab/PPOCRLabel  #标注工具
https://github.com/sohaib023/T-Truth # 表格识别表格标注工具,需要做转换
https://github.com/PaddleCV-SIG/PaddleLabel/blob/v1.0.0/doc/CN/install.md
https://aistudio.baidu.com/modelsdetail/18?modelId=18 官方文档信息
https://github.com/PaddlePaddle/PaddleOCR/blob/main/doc/doc_ch/table_recognition.md 表格识别
https://github.com/PaddlePaddle/PaddleOCR/blob/main/doc/doc_ch/dataset/table_datasets.md 表格数据集
https://github.com/PaddlePaddle/PaddleOCR/blob/main/applications  应用说明
https://gitee.com/paddlepaddle/PaddleOCR/blob/release/2.6/ppstructure/layout/README_ch.md 数据集连接
https://github.com/WenmuZhou/TableGeneration 表格数据生成
GPT使用
https://learn.microsoft.com/zh-cn/azure/ai-services/openai/how-to/gpt-with-vision?tabs=python%2Csystem-assigned%2Cresource
算法说明:
https://blog.csdn.net/shiwanghualuo/article/details/129132206
https://huggingface.co/datasets/juliozhao/DocSynth300K 数据集
========================
PyMuPDF相关
========================
https://github.com/pymupdf
https://github.com/pymupdf/RAG
https://pymupdf4llm.readthedocs.io/en/latest/   PyMuPDF4LLM
https://pymupdf.readthedocs.io/en/latest/rag.html#   rag_with llm
https://github.com/pymupdf/RAG 代码
https://pymupdf.readthedocs.io/en/latest/tutorial.html 文档
https://blog.csdn.net/shiwanghualuo/article/details/129132206 SLANet总结
tesseract-ocr
https://github.com/tesseract-ocr/tesseract
https://github.com/tesseract-ocr/tessdoc
IBM-OCR
https://github.com/DS4SD/docling
MinerU:
https://github.com/opendatalab/MinerU
layoutReader: https://github.com/ppaanngggg/layoutreader
DocLayout-YOLO+mesh-candidate_bestfit: https://github.com/opendatalab/DocLayout-YOLO/tree/main/mesh-candidate_bestfit
https://github.com/PaddlePaddle/PaddleX/blob/release/3.0-beta1/docs/module_usage/tutorials/ocr_modules/table_structure_recognition.md
RapidOCR:
https://github.com/RapidAI/RapidOCR
RapidTable:https://github.com/RapidAI/RapidTable
posted on 2025-02-19 15:35 Sanny.Liu-CV&&ML 阅读(81) 评论(0) 收藏 举报
 
                    
                 
                
            
         
 浙公网安备 33010602011771号
浙公网安备 33010602011771号