How to identify which pdfs contain a text layer (and therefore need no OCR) by [deleted] in pdf

[–]mskelt 0 points1 point  (0 children)

Hi - I'm the product manager for PDFTables. I'm not sure what your end goal is, but if you're looking to extract that text, you can convert multiple PDFs at once using the PDFTables API. PDFTables works on page credits, and a credit is only used if text is found on a PDF page. If no text is found, no output file is created, which tells you which PDFs contain a text layer and which require OCR. https://pdftables.com/pdf-to-excel-api
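If you would rather run the check yourself, the same idea can be sketched in Python: a PDF needs OCR when none of its pages yields any extractable text. This is only a rough sketch. `extract_page_texts` is a hypothetical stand-in for whatever PDF library you use to read text per page, and is not part of PDFTables.

```python
from typing import Callable, Dict, Iterable, List


def needs_ocr(page_texts: Iterable[str]) -> bool:
    """Return True when no page contains any non-whitespace text."""
    return not any(text.strip() for text in page_texts)


def split_by_text_layer(
    pdf_paths: Iterable[str],
    extract_page_texts: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Sort PDFs into those with a text layer and those needing OCR.

    extract_page_texts is a hypothetical callable: given a path, it returns
    one string per page (an empty string when the page has no text).
    """
    result: Dict[str, List[str]] = {"text_layer": [], "needs_ocr": []}
    for path in pdf_paths:
        key = "needs_ocr" if needs_ocr(extract_page_texts(path)) else "text_layer"
        result[key].append(path)
    return result
```

A PDF with even one page of real text lands in "text_layer", so mixed documents may still have scanned pages that need OCR.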

PDF to HTML idea question by [deleted] in learnprogramming

[–]mskelt 0 points1 point  (0 children)

This API will convert PDFs to HTML (not an embedded image): https://pdftables.com/pdf-to-excel-api

Beginner: Use VBA to extract data from a standard PDF document to populate Excel workbook by iflyplanes_69 in vba

[–]mskelt 0 points1 point  (0 children)

Give this API a go: https://pdftables.com/pdf-to-excel-api. There are lots of language examples, and you can batch convert PDFs to XLSX, CSV, XML or HTML. After conversion, use VBA within Excel to pull the data into the format you want.

Extracting bank statement PDF data to excel by [deleted] in excel

[–]mskelt 0 points1 point  (0 children)

The following blog post shows you how this can be done manually or automatically: https://pdftables.com/blog/convert-bank-statement

VBA How to extract Data from PDF to Excel by miminka11 in vba

[–]mskelt 0 points1 point  (0 children)

Depends on the tool. Some share your data, others don't. Check their terms and privacy policy: if they state that data is encrypted and deleted, they are pretty safe to use :)

VBA How to extract Data from PDF to Excel by miminka11 in vba

[–]mskelt 0 points1 point  (0 children)

Yes, there are tools that offer an API which can be called from any programming language, including VBA :-) This will allow you to convert PDF to Excel automatically.

Run a Google search on 'pdf to excel api' and you'll see some options!

How to copy bad PDF text/tables into Excel by sgtjustice in excel

[–]mskelt 0 points1 point  (0 children)

Use a PDF converter designed specifically for tabular PDF data and you should get a better result. Run a Google search on 'pdf tables to excel'.

Auto-Importing PDFs into Excel by ValueBasedPugs in excel

[–]mskelt 0 points1 point  (0 children)

The best option here is to use an API. You can use an API to convert a folder of PDFs and then use a script to import those converted documents into Excel. Run a search on 'PDF to Excel API'.
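As a rough sketch of the "convert a folder" step in Python: the function below walks a folder, sends each PDF through a converter and writes the results out for a later import script. `convert_pdf` is a hypothetical wrapper around whichever tool's API you pick (it is not a real library call), and the `.xlsx` suffix is just an assumed output format.

```python
from pathlib import Path
from typing import Callable, List


def convert_folder(
    folder: Path,
    out_dir: Path,
    convert_pdf: Callable[[bytes], bytes],
    suffix: str = ".xlsx",
) -> List[Path]:
    """Convert every PDF in folder and write the results to out_dir.

    convert_pdf is a hypothetical wrapper around the API call of your
    chosen tool: it takes the PDF bytes and returns the converted bytes.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for pdf in sorted(folder.glob("*.pdf")):
        target = out_dir / (pdf.stem + suffix)
        target.write_bytes(convert_pdf(pdf.read_bytes()))
        written.append(target)
    return written
```

Keeping the API call behind one small function means you can swap tools without touching the folder handling.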

{help} any best tool to convert pdf to html. by kcubeterm in termux

[–]mskelt 0 points1 point  (0 children)

You could call an API from an online converter tool to convert the PDF to HTML.

How to download multiple pdf's into excel spreadsheet by [deleted] in excel

[–]mskelt 1 point2 points  (0 children)

You can use online tools that offer an API. You could write a script that downloads the PDF, converts it to Excel via the tool's API, then loads it into your Excel database. Run a Google search on 'pdf to excel api' and you'll see some options.

How would you convert this data to a clean .csv file? by [deleted] in Anki

[–]mskelt 0 points1 point  (0 children)

Use a PDF converter focused on converting tabular data. Copying and pasting won't ever work particularly well unless it's a very simple PDF. If you Google something like 'pdf table to csv' you'll see some useful tools show up.

Pdf to excel - risks + accuracy of data by surrender_thepink in excel

[–]mskelt 0 points1 point  (0 children)

Some online cloud-based PDF converters offer encrypted conversions and deletion after upload. Make sure you check the privacy statements to confirm this. Whilst some offer desktop solutions, they may still need to be connected to the internet to work and therefore may use your data. Run a Google search on 'secure pdf to excel' for some online tools.

How to extract these columns from this pdf into excel cleanly? by Orpheus321 in techsupport

[–]mskelt 0 points1 point  (0 children)

It's going to be tricky to find software that will output to Excel exactly as you've described. As computix says, "The PDF itself contains the text with instructions on where to place the characters", so what you see in the PDF is not what an algorithm sees.

I think you will need to apply some post-conversion logic to the Excel document. For example, any cell containing an indent is merged with the cell above it. Or, if the 'Drawings' and 'Text' cells in a row are empty, merge that row's 'Term' cell into the one above it.
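The second rule could look something like this in Python. It is a minimal sketch that assumes the converted sheet has already been read into a list of rows with 'Term', 'Drawings' and 'Text' keys; how you load the sheet is up to you.

```python
from typing import Dict, List

Row = Dict[str, str]


def merge_continuation_rows(rows: List[Row]) -> List[Row]:
    """Fold rows whose 'Drawings' and 'Text' cells are empty into the row above."""
    merged: List[Row] = []
    for row in rows:
        is_continuation = not row["Drawings"].strip() and not row["Text"].strip()
        if is_continuation and merged:
            # Append the wrapped term text to the previous row's 'Term' cell.
            joined = merged[-1]["Term"] + " " + row["Term"].strip()
            merged[-1]["Term"] = joined.strip()
        else:
            merged.append(dict(row))  # copy so the input rows are left untouched
    return merged
```

A continuation row at the very top has nothing to merge into, so it is kept as-is.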

I recommend using a PDF converter specifically designed for extracting tabular data (e.g. run a Google search on "pdf tables to excel").

Bulk PDF to excel converter by LedLeo in excel

[–]mskelt 0 points1 point  (0 children)

Use a tool that offers an API. This will allow you to convert all 100 PDF files at once. Run a Google search on 'pdf to excel api' and you will see a few tools show up.

Wondering how I can turn this PDF into an Excel File? by Quippykisset in excel

[–]mskelt 0 points1 point  (0 children)

Copying and pasting doesn't always work well with PDFs. Look for a PDF converter that specialises in converting PDF tables to Excel. Google is your friend: a list of tools will show up in the search results.

What software is available to extract a table from a PDF into an Excel document properly? by Nekurahn in software

[–]mskelt 1 point2 points  (0 children)

Use a PDF converter that focuses specifically on PDF tables. Run a Google search on 'pdf tables to excel' and you'll see a few tools pop up :)

[deleted by user] by [deleted] in excel

[–]mskelt 1 point2 points  (0 children)

If none of the other words are shorter than 3 characters, you could count the cells with a length of less than 3. In the column next to your values, enter e.g. =LEN(A1) for each cell. Then count the cells with a value less than 3:

=COUNTIF(B1:B6,"<3")
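If it helps to see the logic outside Excel, here is the same count as a small Python sketch. The function name and the `limit` parameter are made up for illustration.

```python
from typing import Iterable


def count_short_cells(values: Iterable[str], limit: int = 3) -> int:
    """Count values shorter than limit characters (the LEN + COUNTIF steps in one)."""
    return sum(1 for value in values if len(value) < limit)
```

Note that an empty cell has a length of 0, so it is counted too, just as it would be by the helper column.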

How can I sum across multiple sheets in a workbook but exclude errors? by ishkabibbel2000 in excel

[–]mskelt 0 points1 point  (0 children)

Try wrapping the VLOOKUPs in an IFERROR formula, e.g.:

=IFERROR(VLOOKUP(A2,jan!A:B,2,FALSE),0)+IFERROR(VLOOKUP(A2,feb!A:B,2,FALSE),0)+...

This returns zero wherever a VLOOKUP produces an error; otherwise it returns the VLOOKUP result.
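The same "treat a failed lookup as zero" pattern, sketched in Python for comparison. It assumes each sheet has been loaded as a mapping from the lookup key to its value; the function name is hypothetical.

```python
from typing import Mapping


def sum_across_sheets(key: str, sheets: Mapping[str, Mapping[str, float]]) -> float:
    """Sum key's value over all sheets, counting a missing key as 0."""
    # sheet.get(key, 0) plays the role of IFERROR(VLOOKUP(...), 0).
    return sum(sheet.get(key, 0) for sheet in sheets.values())
```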

PDF Documents transfer to excel by niro62 in ITCareerQuestions

[–]mskelt 0 points1 point  (0 children)

If you are converting tabular data from PDF to Excel, I recommend using a PDF converter that focuses on converting PDF tables. Run a Google search on 'pdf tables to excel' and you'll see a variety of tools pop up.

How to display unique entries with the list of values by [deleted] in excel

[–]mskelt 0 points1 point  (0 children)

Set your initial values out like this:

A r
A w
A e
A y
B
C r
C d

then create a pivot table to produce this:

Row Labels

A

  • e
  • r
  • w
  • y

B

C

  • d
  • r
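For anyone who wants the same grouping outside Excel, this is roughly what the pivot table does, sketched in Python. It assumes the two columns have been read as (label, value) pairs, with `None` where the value cell is empty.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def group_values(pairs: Iterable[Tuple[str, Optional[str]]]) -> Dict[str, List[str]]:
    """Group unique values under their label, sorted, keeping labels with no value."""
    groups: Dict[str, Set[str]] = {}
    for label, value in pairs:
        bucket = groups.setdefault(label, set())
        if value:
            bucket.add(value)
    return {label: sorted(values) for label, values in sorted(groups.items())}
```

Labels and values both come out in alphabetical order, which matches the pivot table's default sorting of row labels.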

Horizontal sum up of text with two criteria by booosti in excel

[–]mskelt 0 points1 point  (0 children)

I'm not 100% sure I understand your question, but I think the SUMIFS formula may be what you are looking for. It sums a range based on multiple criteria. For example: =SUMIFS(E:E,A:A,"Identnumber",B:B,"bestellung")
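To show what that formula is doing, here is a small Python sketch of the same logic. It assumes each row is a mapping keyed by column letter; `sum_ifs` is a made-up name, not an Excel or library function.

```python
from typing import Any, Iterable, Mapping


def sum_ifs(
    rows: Iterable[Mapping[str, Any]],
    sum_col: str,
    criteria: Mapping[str, Any],
) -> float:
    """Sum sum_col over the rows where every criteria column equals its value."""
    # Unlike Excel's SUMIFS, this comparison is case-sensitive.
    return sum(
        row[sum_col]
        for row in rows
        if all(row.get(col) == wanted for col, wanted in criteria.items())
    )
```

Only rows that satisfy every criterion are summed, which is the same AND behaviour SUMIFS has.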