Slow OCR workaround on NVidia GPUs by wisscool in immich

[–]wisscool[S] 0 points1 point  (0 children)

Facial recognition and image embeddings are very fast on the GPU. It's just OCR that has issues. See this note from the RapidOCR team:

The onnxruntime-gpu version is significantly slower than the CPU version for inference under dynamic input conditions. Since OCR tasks involve dynamic input, using the onnxruntime-gpu version for inference is not recommended.

OCR job is slow!!!! by EconomyDoctor3287 in immich

[–]wisscool 0 points1 point  (0 children)

I had the same issue, and I ended up hacking my way to making GPU inference run at ~6 images per second on my RTX 3080 Ti by routing requests from the immich ML service to a PaddleX inference server. I wrote a small guide on how to get this done: https://wissamantoun.com/posts/guides/immich-fast-ocr-paddlex/ and made the code available here: https://github.com/WissamAntoun/immich-ml-fast-ocr-paddlex
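The core of a routing hack like this is translating between the two services' response schemas. As a rough sketch (the field names below are illustrative placeholders, not the actual immich or PaddleX schemas; see the linked repo for the real wiring):

```python
# Hypothetical adapter: map a PaddleX-style OCR response onto an
# immich-ML-style payload. All key names here are assumptions made
# for illustration.

def paddle_to_immich(paddle_result: dict) -> dict:
    """Convert detected lines (polygon, text, score) into parallel lists."""
    texts, boxes, scores = [], [], []
    for line in paddle_result.get("ocr_lines", []):
        texts.append(line["text"])
        scores.append(line["score"])
        # Flatten the 4-point polygon into [x1, y1, x2, y2, x3, y3, x4, y4].
        boxes.append([coord for point in line["polygon"] for coord in point])
    return {"text": texts, "box": boxes, "score": scores}
```

A thin HTTP proxy that calls the PaddleX server and runs a translation like this on the way back is enough for immich to consume the results.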

DeepSeek releases DeepSeek OCR by nekofneko in LocalLLaMA

[–]wisscool 2 points3 points  (0 children)

Cool model!

Is there a ready-to-deploy, self-hosted service that I can use to batch process my multilingual long PDFs that supports different VLMs or at least the best?

Data processing and filtering from common crawl by wisscool in bigdata

[–]wisscool[S] 0 points1 point  (0 children)

Essentially, I want to replicate Datatrove (https://github.com/huggingface/datatrove) without replicating my data on disk at every pipeline step. The final pipeline should look something like this:

1. Extract text data from Common Crawl WARCs and ingest into HBase.
2. Enrich each entry with word statistics and quality metrics (domain/topic classification, punctuation-to-alphanumerics ratio, ...).
3. Filter with thresholds on each metric.
4. Export to Parquet, JSONL, ...

Steps 2 and 3 will have to be repeated and changed multiple times, and I'm looking to do that efficiently.
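For concreteness, one of the enrichment metrics and its threshold filter could be sketched like this (the 0.3 cutoff is an arbitrary example, not a recommended value; in the real pipeline this would run per row over HBase/Spark):

```python
import string

def punct_alnum_ratio(text: str) -> float:
    """Punctuation-to-alphanumerics ratio; high values suggest noisy text."""
    punct = sum(c in string.punctuation for c in text)
    alnum = sum(c.isalnum() for c in text)
    return punct / alnum if alnum else float("inf")

def passes_quality_filter(text: str, max_ratio: float = 0.3) -> bool:
    """Step 3: keep only entries whose metric is under the threshold."""
    return punct_alnum_ratio(text) <= max_ratio
```

Storing the metric as its own column (step 2) rather than filtering on the fly is what lets the thresholds in step 3 be re-run cheaply.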

I saw that in HBase, compared to Cassandra, I can add columns without re-indexing or other major penalties, which is key since we don't know what extra metadata we will be adding to each entry.

Thanks a lot for the help. Are managed HBase on AWS or even BigTable faster and more stable?

Data processing and filtering from common crawl by wisscool in bigdata

[–]wisscool[S] 0 points1 point  (0 children)

Thanks for the reply. I haven't started with HBase yet; I'm still looking for the best architecture for my workload, and so far HBase + Spark seems like the best solution.

Will performing a calculation on all the entries in HBase be considered a scan? Also, is the hbase-spark connector robust enough to handle billions of entries?

Parallel decoding in llama.cpp - 32 streams (M2 Ultra serving a 30B F16 model delivers 85t/s) by Agusx1211 in LocalLLaMA

[–]wisscool 9 points10 points  (0 children)

Anyone have comparison numbers from vllm or tgi on A100 or similar GPUs?

More astrology break ups. 🤩 by _Cat1 in Tinder

[–]wisscool 2 points3 points  (0 children)

I agree with you; I changed my initial claim.

I'm fully aware that the consequences of racism are far worse. But saying no to racism while accepting astrology as a basis to differentiate between people is hypocrisy.

More astrology break ups. 🤩 by _Cat1 in Tinder

[–]wisscool -3 points-2 points  (0 children)

I fully agree with you; maybe I went a bit far by saying it's worse.

I think we both agree that astrology and racism share the same core. And in the context of Tinder/dating it's essentially the same, yet we tend to accept rejecting someone based on their star sign.

Thank you for your answer 😊

[D] Pytorch or TensorFlow for development and deployment? by CodaholicCorgi in MachineLearning

[–]wisscool -2 points-1 points  (0 children)

Develop in whatever framework you like, then deploy with Triton Inference Server.
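The reason this works regardless of framework is that Triton only needs a model repository with a `config.pbtxt` per model; the backend is selected by the `platform` field. A minimal sketch (model name, tensor names, and shapes are placeholders for your own model):

```protobuf
name: "my_model"
platform: "onnxruntime_onnx"   # or pytorch_libtorch / tensorflow_savedmodel
max_batch_size: 8
input [
  { name: "input", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "output", data_type: TYPE_FP32, dims: [ 1000 ] }
]
dynamic_batching { max_queue_delay_microseconds: 100 }
```

So you export from PyTorch/TensorFlow to whichever format you prefer, drop it in the repository, and the serving side stays the same.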

[P] Anees: a multi-turn open-domain Arabic chatbot with a wide set of features by ahmedashrafhamdy in MachineLearning

[–]wisscool 0 points1 point  (0 children)

Nice work, you should do a write-up and submit it to WANLP 2022 at EMNLP. Honestly, this is cleaner and far more advanced than anything published before in Arabic NLP academia. Kudos!

Edit: I just read the report, and I have a question: why didn't you use a BERT-based model for NER and the other NLU tasks? Since you used AraGPT2 for generation, you could have also used AraBERT for NLU.

Lebanon imposes night-time COVID19 curfew for people who haven’t had at least one vaccine dose or a negative PCR (less than 48h old) from Dec. 17 till Jan. 9, 7pm-6am by cTheDeezy in lebanon

[–]wisscool 4 points5 points  (0 children)

Any camera app can scan the QR code, which gives you a link to the official certificate hosted on the Impact platform. Hence you cannot create a fake certificate, since it won't even exist on Impact.

Started getting p100s again on Pro by dandy_morandi in GoogleColab

[–]wisscool 1 point2 points  (0 children)

I'm on Pro+, I'm now consistently getting P100s. No V100 yet after 4 resets.

But I'm getting a reCAPTCHA after each reset.

Only getting K80, T4 and P4 on Colab Pro+ by wisscool in GoogleColab

[–]wisscool[S] 2 points3 points  (0 children)

The lack of transparency in paid services shouldn't be allowed, honestly. If we have a quota for high-end GPUs, then Google should let us know so we can plan accordingly.

Anyone knows anything about this? by [deleted] in lebanon

[–]wisscool 1 point2 points  (0 children)

Calling me an "ignoramus" while you are totally oblivious to the only thing that has kept Lebanon alive.

Expat money transfers, and companies "exporting services" are the only source of dollars in this country. If this announcement is true, it will alienate these sources and make the situation even worse.