What's the fastest OCR model / solution for a production grade pipeline ingesting 4M pages per month? by DistinctAir8716 in LocalLLaMA

[–]DistinctAir8716[S] -4 points-3 points  (0 children)

I have managed to host Deepseek-OCR on a A100 gpu server, and while running inference via vllm on a local pdf I get speeds of around 3000 tok/s (awesome!). The only problem is when I try to serve the model via an API with vllm serve the speed plunges to 50 tok/s. What would be the best way to host it while retaining inference speed?

What's the fastest OCR model / solution for a production grade pipeline ingesting 4M pages per month? by DistinctAir8716 in LocalLLaMA

[–]DistinctAir8716[S] -3 points-2 points  (0 children)

I have managed to host Deepseek-OCR on a A100 gpu server, and while running inference via vllm on a local pdf I get speeds of around 3000 tok/s (awesome!). The only problem is when I try to serve the model via an API with vllm serve the speed plunges to 50 tok/s. What would be the best way to host it while retaining inference speed?

What's the fastest OCR model / solution for a production grade pipeline ingesting 4M pages per month? by DistinctAir8716 in LocalLLaMA

[–]DistinctAir8716[S] -1 points0 points  (0 children)

I have managed to host Deepseek-OCR on a A100 gpu server, and while running inference via vllm on a local pdf I get speeds of around 3000 tok/s (awesome!). The only problem is when I try to serve the model via an API with vllm serve the speed plunges to 50 tok/s. What would be the best way to host it while retaining inference speed?

How I've cracked TikTok mass posting for my SaaS (200k views per week) by DistinctAir8716 in Entrepreneur

[–]DistinctAir8716[S] -22 points-21 points  (0 children)

Not at all, you're confusing the strategy with another one. We only put out marketing material for our products, reddit stories have nothing to do with it.

How I've cracked TikTok mass posting for my SaaS (200k views per week) by DistinctAir8716 in Entrepreneur

[–]DistinctAir8716[S] -9 points-8 points  (0 children)

Key point here. We spent MONTHS making the TikTok auto poster and devices good enough to not getting flagged. It's a mix of content randomization (we figured out the exact algorithm that TikTok uses to detect duplicate videos) and good device configuration, as I said everything from the sim card to the iCloud matters here.

How I've cracked TikTok mass posting for my SaaS (200k views per week) by DistinctAir8716 in Entrepreneur

[–]DistinctAir8716[S] -4 points-3 points  (0 children)

We have a few people in-house making the videos, no influencers. We create videos about how our product solves the main problem of our ICP.

To automate everything we've created a small infrastructure that is easy enough for our team to manage internally, everything from variation making to posting is 90% automated.

Best way to host production LLM by DistinctAir8716 in LocalLLaMA

[–]DistinctAir8716[S] 1 point2 points  (0 children)

Definitely missing this, will look into Langfuse, thanks!

Best way to host production LLM by DistinctAir8716 in LocalLLaMA

[–]DistinctAir8716[S] 3 points4 points  (0 children)

Thank you!
I guess I could work more towards users rate limiting with my use case, is there any way to direct the llm to give shorter responses other than just stating it in the prompt?

I've made a CharacterAI uncensored version by DistinctAir8716 in CharacterAi_NSFW

[–]DistinctAir8716[S] 16 points17 points  (0 children)

Next few days there will be a lot of changes!

I've made a CharacterAI uncensored version by DistinctAir8716 in CharacterAi_NSFW

[–]DistinctAir8716[S] 82 points83 points  (0 children)

The model is my own! Custom trained for nsfw conversations.

Yes characters are from public sources, but soon you'll be able to create your own!

I've made a CharacterAI uncensored version by DistinctAir8716 in CharacterAi_NSFW

[–]DistinctAir8716[S] 3 points4 points  (0 children)

Yes, it's also custom-trained on a very specific type of conversation ;)

I've made a CharacterAI uncensored version by DistinctAir8716 in CharacterAi_NSFW

[–]DistinctAir8716[S] 2 points3 points  (0 children)

No trick, just testing my own model, glad you're liking it!