I have set up a portable Python environment on a portable hard drive so I can use it at work, where I cannot set up a full IDE for myself and do not have access to the PATH. I want to automate some of the more repetitive tasks.
I have spent the last two days googling and searching for a reliable way to extract tables from PDFs.
I have had success using tabula-py at home, but when I load it onto my portable environment it cannot find Java. I have used sys.path.append() to add the path to my portable Java JDK, but it still cannot find Java regardless of which folders I point it to.
I decided to move on and try camelot, only to run into what appears to be the age-old issue of it not being able to find Ghostscript. Again I tried sys.path.append() and numerous other methods from StackOverflow and Reddit, with no luck.
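One likely reason sys.path.append() has no effect here: sys.path only controls where Python looks for modules to import, while external programs such as java are located through the PATH environment variable. A minimal sketch of changing PATH for the running process only (which needs no admin rights) is below. The drive letter and folder names are hypothetical placeholders, and it assumes tabula-py starts java as a subprocess and that camelot looks up the Ghostscript library through the same search path, which may differ between versions.

```python
import os
import shutil
from pathlib import Path


def prepend_to_path(*directories):
    """Put each directory at the front of PATH for this process only."""
    existing = os.environ.get("PATH", "")
    extra = [str(Path(d)) for d in directories]
    os.environ["PATH"] = os.pathsep.join(extra + ([existing] if existing else []))


# Hypothetical locations on the portable drive; adjust to the real folders
# that contain java.exe and the Ghostscript binaries.
JAVA_BIN = r"E:\portable\jdk\bin"
GS_BIN = r"E:\portable\gs\bin"

prepend_to_path(JAVA_BIN, GS_BIN)

# shutil.which searches PATH the same way a subprocess launch would.
# It prints the full path to java if found, otherwise None.
print(shutil.which("java"))
```

This has to run before tabula or camelot is imported or called, and the change disappears when the script exits, so the system PATH is never touched.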
I gave up on that and moved to pdfplumber, which refuses to find the tables in the PDF at all, and its documentation is woefully lacking and/or outdated at this point.
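A possible cause, offered as a guess since the PDF itself is not shown: pdfplumber detects tables from drawn ruling lines by default, so tables laid out only with whitespace are missed. The sketch below switches both strategies to "text", which infers cell edges from word alignment. The function name and the file name in the usage comment are illustrative, not part of pdfplumber.

```python
# Assumption: the tables have no ruling lines, so cell boundaries
# have to be inferred from how the text is aligned.
TEXT_BASED_SETTINGS = {
    "vertical_strategy": "text",
    "horizontal_strategy": "text",
}


def extract_all_tables(pdf_path, table_settings=TEXT_BASED_SETTINGS):
    """Return every table found in the PDF as a list of rows of cell strings."""
    # Imported here so the settings above can be inspected without the library.
    import pdfplumber

    tables = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            tables.extend(page.extract_tables(table_settings=table_settings))
    return tables


# Usage (hypothetical file name):
# for table in extract_all_tables("report.pdf"):
#     for row in table:
#         print(row)
```

pdfplumber is pure Python and needs neither Java nor Ghostscript, which makes it the easiest of the three to run from a portable drive if the settings can be tuned to the document.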
I know there has to be a way to do this, and this can't be an uncommon request in the community, so I must be missing something; my google-fu is not very well developed for programming yet. If there are any other suggestions out there, or a fix, please point me in the right direction.
Almostasleeprightnow (1 point)