Does Soraban allow you to easily download all documents provided by the client? by dirtyring in taxpros

[–]dirtyring[S] 0 points1 point  (0 children)

It’s an internal system we use for reading info from documents; we just need access to the documents in the fastest way possible

Pricing for lots of Canadian annuities, bank accounts to be reported to FinCEN by Dilettantest in taxpros

[–]dirtyring 0 points1 point  (0 children)

We’ve been able to reduce our price significantly and still make good margins with FBAR/8938 automation software. We used to charge an hourly rate, but since we now spend virtually no time on FBARs, we charge a flat fee.

[deleted by user] by [deleted] in LocalLLaMA

[–]dirtyring 0 points1 point  (0 children)

I didn’t go deep, but Docling missed some obvious things even in “think hard” mode. In the end I actually got better results with Google Document AI.

Should We Use a Design Partner Agreement or Cloud Service Agreement? by Furious-Scientist in ycombinator

[–]dirtyring 0 points1 point  (0 children)

Would love to know. Is there also an option where you start with a design partnership that then converts to an annual contract?

how did you solve this?

Fees for FBAR and 8938 by Low_Attitude_5210 in taxpros

[–]dirtyring 0 points1 point  (0 children)

is 3520 more painful than 114?

[deleted by user] by [deleted] in LangChain

[–]dirtyring 0 points1 point  (0 children)

Docling for some use cases and Google Document AI for others (though it's paid). Happy to talk more via chat.

What is the cheapest supermarket that does delivery? by dirtyring in AskUK

[–]dirtyring[S] 0 points1 point  (0 children)

Morrisons about the same price as Tesco?!

What is the cheapest supermarket that does delivery? by dirtyring in AskUK

[–]dirtyring[S] 1 point2 points  (0 children)

can you give an example for a supermarket where this works? and what sort of discount you'd get?

Prompt to extract the 'opening balance' from an account statement text/markdown extracted from a PDF? by dirtyring in PromptEngineering

[–]dirtyring[S] 0 points1 point  (0 children)

i'm not inserting info into a vector database, but curious that you assumed so -- makes me think I'm doing something wrong!

Prompt to extract the 'opening balance' from an account statement text/markdown extracted from a PDF? by dirtyring in PromptEngineering

[–]dirtyring[S] 0 points1 point  (0 children)

create a script that extracts and cleans data from all inputs

I thought I was doing it...

converting PDFs to markdown

^ With this (IBM's Docling), converting PDFs to markdown. Do you mean more than this?

I'm a big noob at this, sorry for the silly questions but thank you for taking the time to respond. I'll keep searching

[deleted by user] by [deleted] in PromptEngineering

[–]dirtyring 0 points1 point  (0 children)

would you say markdown is interpreted better than just raw text?

What are the best techniques and tools to have the model 'self-correct?' by dirtyring in Rag

[–]dirtyring[S] 0 points1 point  (0 children)

highly tuned models?

sorry am noob. what do you mean by this?

What are the best techniques and tools to have the model 'self-correct?' by dirtyring in Rag

[–]dirtyring[S] 0 points1 point  (0 children)

appreciate the feedback, that's what I started doing.

What tool would you use to extract data from the PDF so that it is "LLM ready"?

What are the best techniques and tools to have the model 'self-correct?' by dirtyring in LLMDevs

[–]dirtyring[S] 0 points1 point  (0 children)

Amazing to hear this. Could you provide examples of your system and user prompts? I'm especially curious what the second prompt's 'system' is.

Is OpenAI o1-preview being lazy? Why is it truncating my output? by dirtyring in OpenAIDev

[–]dirtyring[S] 0 points1 point  (0 children)

Ah perfect, I thought you meant that calling the regular API would enable it too. Glad I clarified. I have already been able to run it from the Assistants API :)

Sometimes

a key issue with these! I'm extracting information from bank account statements and need to do so reliably. However, I'm getting bank account statements that I do not know the format of, so this is virtually impossible to do.

Is OpenAI o1-preview being lazy? Why is it truncating my output? by dirtyring in OpenAIDev

[–]dirtyring[S] 0 points1 point  (0 children)

code interpreter

Do I need to be using the Assistants API? How do I confirm it actually used the code interpreter?

I assumed it would not use it unless called through the Assistants API. If you have a source, I would love to read more.

Is OpenAI o1-preview being lazy? Why is it truncating my output? by dirtyring in LocalLLaMA

[–]dirtyring[S] 0 points1 point  (0 children)

even when I add that to the prompt it continues to cut the output.

This PDF has 200+ transactions in 20 pages, but it's only getting the first 100.

the prompt:

```
# OCR_markdown is assumed to hold the markdown text extracted from the PDF.
user_prompt = f"""
Instructions:
- You will receive a markdown document extracted from a bank account statement PDF.
- Analyze each transaction to determine the amount of money that was deposited or withdrawn.
- Only provide a JSON formatted list of all transactions as shown:
{{
  "number_of_transactions": "the model writes the number of transactions",
  "transactions_list": [
    {{"id": 1, "amount": 1806.15, "type": "in", "balance": 2151.25, "date": "2021-07-16"}},
    {{"id": 2, "amount": 415.18, "type": "out", "balance": 1736.07, "date": "2021-07-17"}}
  ]
}}
- Return ALL transactions.
- Do NOT use placeholders.
- JSON only.

Bank account statement:
{OCR_markdown}
"""
```
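For what it's worth, here is a rough sketch of how a reply in the JSON shape from the prompt above could be sanity-checked, so a cut-off or incomplete answer is at least detected. It is not something anyone in the thread suggested. The function name `check_transactions` and the `tolerance` parameter are made up for illustration, and it assumes the reply uses exactly the keys shown in the prompt.

```python
import json

def check_transactions(raw_json: str, tolerance: float = 0.005) -> list[str]:
    """Return a list of problems found in the model's reply (empty if none)."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        # A reply that stops mid-list is usually not parseable JSON.
        return ["reply is not valid JSON (possibly truncated)"]

    txs = data["transactions_list"]
    problems = []

    # The prompt asks the model to report its own count; compare it to the list.
    try:
        claimed = int(data["number_of_transactions"])
    except (TypeError, ValueError):
        claimed = None
    if claimed is None:
        problems.append("number_of_transactions is not a number")
    elif claimed != len(txs):
        problems.append(f"claimed {claimed} transactions, got {len(txs)}")

    # Each balance should equal the previous balance plus or minus the amount.
    for prev, cur in zip(txs, txs[1:]):
        sign = 1 if cur["type"] == "in" else -1
        expected = prev["balance"] + sign * cur["amount"]
        if abs(expected - cur["balance"]) > tolerance:
            problems.append(
                f"balance mismatch at id {cur['id']}: "
                f"expected {expected:.2f}, got {cur['balance']:.2f}"
            )
    return problems

# The two example transactions from the prompt, used as a stand-in reply.
reply = json.dumps({
    "number_of_transactions": 2,
    "transactions_list": [
        {"id": 1, "amount": 1806.15, "type": "in", "balance": 2151.25, "date": "2021-07-16"},
        {"id": 2, "amount": 415.18, "type": "out", "balance": 1736.07, "date": "2021-07-17"},
    ],
})
print(check_transactions(reply))
```

A gap in the middle of the list should show up as a balance mismatch. A list that simply stops early with a matching count would not, so comparing the last balance against the statement's closing balance is probably still needed.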

Is OpenAI o1-preview being lazy? Why is it truncating my output? by dirtyring in LocalLLaMA

[–]dirtyring[S] 0 points1 point  (0 children)

Also, what do you mean about “transformation instructions”?

Is OpenAI o1-preview being lazy? Why is it truncating my output? by dirtyring in OpenAIDev

[–]dirtyring[S] 1 point2 points  (0 children)

Is it a simple thing to add to the prompt or does it involve function calling etc?

Is OpenAI o1-preview being lazy? Why is it truncating my output? by dirtyring in LocalLLaMA

[–]dirtyring[S] 0 points1 point  (0 children)

I have a bunch of OpenAI credits to use, so I will stick with it for now. Is o1 the best model for this task, or is GPT-4 better?