Welcome back. I want to look at PDF files, and I want to write a Python script that obtains information from these files: it returns the total number of times a search term appears in a document, and it also returns the page numbers where that search term appears.

In practice, this is a very frequent problem, one you encounter when you want to construct measures of good or bad corporate governance. As an example, I have here a financial statement published by Marks and Spencer, a retail company, in 2019.

Now, when you have a PDF document, of course you can collect the information manually. We could, for instance, look at the term "audit" and find the number of occurrences just by doing a manual search with Ctrl+F in the PDF document. That is maybe an acceptable process if you have a handful of PDF documents. However, if you have hundreds of them, maybe thousands
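To make the goal concrete, here is a minimal sketch of the kind of script described above: it counts how often a search term occurs and records the pages where it occurs. This is only an illustration under stated assumptions. The transcript does not say which PDF library is used, so the sketch assumes `pypdf` for text extraction, and the function names `count_term_by_page` and `extract_page_texts` are hypothetical. Matching is case-insensitive and matches substrings, so "audit" would also match "auditor".

```python
import re
from typing import List, Tuple


def count_term_by_page(page_texts: List[str], term: str) -> Tuple[int, List[int]]:
    """Return (total occurrences, 1-based page numbers containing the term).

    Hypothetical helper: matching is case-insensitive and substring-based.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    total = 0
    pages = []
    for page_number, text in enumerate(page_texts, start=1):
        hits = len(pattern.findall(text or ""))
        if hits:
            total += hits
            pages.append(page_number)
    return total, pages


def extract_page_texts(pdf_path: str) -> List[str]:
    """Read a PDF and return one text string per page.

    Assumption: the third-party `pypdf` package is installed. It is
    imported here so the counting logic above works without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


if __name__ == "__main__":
    # In-memory stand-in for three PDF pages, so the demo needs no file.
    sample_pages = [
        "The Audit Committee met four times.",
        "No relevant mention on this page.",
        "External audit and internal audit were reviewed.",
    ]
    total, pages = count_term_by_page(sample_pages, "audit")
    print(total, pages)  # prints: 3 [1, 3]
```

For a real document you would pass the output of `extract_page_texts("report.pdf")` (a placeholder file name) to `count_term_by_page`, and you could loop over a whole folder of files, which is where a script starts to pay off compared with searching by hand.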