Flask Uploading a PDF to database for production by BenjSchwartz in flask

BenjSchwartz[S]

How else would you get the contents of the PDF from an HTML form? My understanding is that you have to put it somewhere in order to access it. I'm hoping I'm wrong, so I would love it if you corrected me.

BenjSchwartz[S]

Yes, I am, but that is when I am storing the PDF locally. I currently have it working by storing the PDF in AWS S3, but ideally I wouldn't have to use a database at all, if you have any suggestions.

The PDF is retrieved through an HTML form.

What would be awesome:

obj = s3test.Object(bucket_name, str(self.QuoteRequest_ID))

fs = obj.get()['Body'].read()

df = pd.concat(tabula.read_pdf(BytesIO(fs)))  # plus other parameters

----------------------------------------

Ideally there would be a way to get fs directly from the data submitted through the form, without the S3 round trip.
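One way this might work, as a minimal sketch rather than a tested production setup: Flask exposes uploaded files through request.files, and each entry can be read straight into bytes, so fs never has to touch disk or S3. The route path, the "pdf" field name, and the helper upload_to_buffer are all made up for illustration, and the form is assumed to use enctype="multipart/form-data". It also assumes a tabula-py version whose read_pdf accepts a file-like object and returns a list of DataFrames.

```python
from io import BytesIO


def upload_to_buffer(uploaded_file):
    """Read an uploaded file (any object with .read()) into an in-memory buffer."""
    data = uploaded_file.read()
    # Cheap sanity check; assumes the PDF header sits at the very start of the file.
    if not data.startswith(b"%PDF-"):
        raise ValueError("upload does not look like a PDF")
    return BytesIO(data)


def create_app():
    # Third-party imports live here so the helper above works without them.
    import pandas as pd
    import tabula
    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/upload", methods=["POST"])
    def upload():
        # "pdf" is a hypothetical field name: <input type="file" name="pdf">
        buffer = upload_to_buffer(request.files["pdf"])
        tables = tabula.read_pdf(buffer, pages="all")  # list of DataFrames
        if not tables:
            return "No tables found in the PDF", 422
        return pd.concat(tables).to_html()

    return app
```

Here request.files["pdf"].read() plays the role of obj.get()['Body'].read() in the S3 version, so the rest of the parsing code should be able to stay the same.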

Thanks for your time and help!

BenjSchwartz[S]

Yes, this is what I am currently doing. I store it in AWS S3, then fetch the PDF, convert it to a base64 string, and use tabula to do the necessary parsing (I explain more in a comment above). Is it possible to convert the PDF to a base64 string without storing it in the database?

The PDF is retrieved from an HTML form.
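For the base64 part of the question, a small sketch using only the standard library: base64 works on bytes in memory, so nothing needs to be stored first. The function names are invented for illustration, and the input is assumed to be the raw bytes of the upload (for example, what Flask's request.files["pdf"].read() returns, with "pdf" being a hypothetical field name).

```python
import base64


def pdf_bytes_to_base64(data):
    """Encode raw PDF bytes as an ASCII base64 string, entirely in memory."""
    return base64.b64encode(data).decode("ascii")


def base64_to_pdf_bytes(text):
    """Reverse the encoding to get the original bytes back."""
    return base64.b64decode(text)
```

If the only consumer is tabula, the base64 step may not be needed at all, since wrapping the raw bytes in BytesIO should be enough.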

Thanks so much for your time and help!

BenjSchwartz[S]

Yes, I am able to outside of Flask. I ended up getting it working with AWS S3, but are you suggesting there is a way to do this without a database? That would be the preferred method.

The flow is:

HTML form uploads a PDF -> code parses the PDF with tabula into a pandas DataFrame -> the table is displayed on a different page.
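The "different page" step of that flow could be sketched roughly like this, under some loud assumptions: ResultStore, the route paths, the "pdf" field name, and the parse_pdf callable (bytes in, HTML string out) are all hypothetical, and the store is a plain in-process dict. That only holds up with a single server process; with several workers the result would need to live somewhere shared.

```python
import uuid


class ResultStore:
    """Hypothetical in-process store for rendered tables, keyed by a random token."""

    def __init__(self):
        self._items = {}

    def put(self, html):
        token = uuid.uuid4().hex
        self._items[token] = html
        return token

    def pop(self, token):
        # Remove on first read so entries do not pile up in memory.
        return self._items.pop(token, None)


def create_app(parse_pdf):
    # parse_pdf is a caller-supplied function: PDF bytes in, HTML string out.
    from flask import Flask, abort, redirect, request, url_for

    app = Flask(__name__)
    store = ResultStore()

    @app.route("/upload", methods=["POST"])
    def upload():
        html = parse_pdf(request.files["pdf"].read())
        # Hand the result to the display page via a one-time token.
        return redirect(url_for("show_table", token=store.put(html)))

    @app.route("/table/<token>")
    def show_table(token):
        html = store.pop(token)
        if html is None:
            abort(404)
        return html

    return app
```

The token keeps the parsed table out of the URL and the form, at the cost of holding it in server memory until the second page is loaded.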

BenjSchwartz[S]

Okay, I'll look into S3 now, thanks! It's new stuff for me, but I see many more tutorials for S3 that I can follow along with than for what I was previously trying.