您好, 欢迎来到 !    登录 | 注册 | | 设为首页 | 收藏本站

使用python解析pdf

使用python解析pdf

用途PyPDF2

from PyPDF2 import PdfFileReader

with open('CT1-All.pdf', 'rb') as f:
    reader = PdfFileReader(f)
    contents = reader.getPage(0).extractText().split('\n')
    pass

当您打印时contents,它将看起来像这样(我在这里进行了修剪):

[u'Serial NoRoll NoNameCT1 Marks (50)111MA20026KARADI KALYANI212AR10029MUKESH K
MAR5', u'312MI31004DEEPAK KUMAR7', u'413AE10008FADKE PRASAD DIPAK27', u'513AE10
22RAHUL DUHAN37', u'613AE30005HIMANSHU PRABHAT26.5', u'713AE30019VISHAL KUMAR39
, u'813AG10014HEMANT17', u'913AG10028SHRESTH KR KRISHNA37.51013AG30009HITESH ME
RA33.5', u'1113AG30023RACHIT MADHUKAR40.5', u'1213AR10002ACHARY SUDHEER11', u'1
13AR10004AMAN ASHISH20.5', u'1413AR10008ANKUR44', u'1513AR10010CHUKKA SHALEM RA
U11.5', u'1613AR10012DIKKALA VIJAYA RAGHAVA20.5', u'1713AR10014HRISHABH AMRODIA
1', u'1813AR10016JAPNEET SINGH CHAHAL19.5', u'1913AR10018K VIGNESH42.5', u'2013
R10020KAARTIKEY DWIVEDI49.5', u'2113AR10024LAKSHMISRI KEERTI MANNEY49', u'2213A
10026MAJJI DINESH9.5', u'2313AR10028MOUNIKA BHUKYA17.5', u'2413AR10030PARAS PRA
python 2022/1/1 18:36:32 有246人围观

撰写回答


你尚未登录,登录后可以

和开发者交流问题的细节

关注并接收问题和回答的更新提醒

参与内容的编辑和改进,让解决方法与时俱进

请先登录

推荐问题


联系我
置顶