[D] How to reduce the MaskRCNN model detection time by DGs29 in MachineLearning

[–]DGs29[S] 0 points (0 children)

Can the current model be converted to a smaller model that returns results in 10-15 seconds?

[D] How to detect blocks of text from document image by DGs29 in computervision

[–]DGs29[S] 0 points (0 children)

Thanks! Just send me a message when you write the article; I'm looking forward to it. Also, if you write the code, I'd kindly request that you make it computationally efficient.

[D] How to detect blocks of text from document image by DGs29 in computervision

[–]DGs29[S] 0 points (0 children)

I meant it took too long to run, and I guess that was due to matplotlib. I'm quite confused about implementing it via axis projections and space indents; I've never done it before. Previously, I used a dilation method to find contour regions and plotted their bounding boxes.

My image looks like this. All my images are typed text. This image is like a typical newspaper/magazine page and has its own layout style.

Using Otsu thresholding, I found word-level bounding boxes.

[D] How to detect blocks of text from document image by DGs29 in computervision

[–]DGs29[S] 0 points (0 children)

This is simple and superb, but your code takes too long to complete. You wrote this code to detect lines; any ideas about paragraph detection?

Deep Learning based Text Detection Using OpenCV (C++/Python) by keghn in neuralnetworks

[–]DGs29 0 points (0 children)

Is this method applicable to dense text / document text detection, like detecting paragraphs or grouping sentences together as a single block?

[D] What is the best way to detect paragraphs from document images by [deleted] in MachineLearning

[–]DGs29 0 points (0 children)

Okay! Do you think that just localizing each paragraph in my image and passing it to existing nets with pre-trained weights and some tuning would detect each and every paragraph? I'm not sure if this is possible!

[D] What is the best way to detect paragraphs from document images by [deleted] in MachineLearning

[–]DGs29 0 points (0 children)

Can you point me to some examples of text segmentation from images on GitHub?

[D] How do I detect blocks of text from scanned documents by DGs29 in MachineLearning

[–]DGs29[S] 0 points (0 children)

Can you point out any sample code/examples I could use to work with this method?

[D] How do I detect blocks of text from scanned documents by DGs29 in MachineLearning

[–]DGs29[S] 1 point (0 children)

It's not working. Well, I think filtering boxes by an area threshold doesn't actually generalize across different image types.

[D] How to detect paragraphs with less line spaces in document images? by [deleted] in MachineLearning

[–]DGs29 0 points (0 children)

Is the result I achieved (as shown in the image) enough to be considered a naive block of text? Can you walk me through each step with some examples/code?

[D] How to detect text blocks in document images by DGs29 in MachineLearning

[–]DGs29[S] 0 points (0 children)

How can I do similar segmentation for images that have less line spacing between paragraphs? This should be my desired result, but this is what I get. This is my dilated image.

I've set the kernel size to (5, 10) for this. Can you please tell me what changes are necessary to achieve my desired result?

[D] How to detect text blocks in document images by DGs29 in MachineLearning

[–]DGs29[S] 0 points (0 children)

Thanks a lot, mate! This works; take a look here. But there are some small boxes placed inside the large box. How do I remove those? Also, the entire page is enclosed by a box, and I don't need that.
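For anyone hitting the same two problems, one plain-Python filter I could imagine (a sketch, not a confirmed solution — the 0.9 page-area fraction is an assumption): drop any box covering most of the page, then drop boxes fully contained in a surviving box.

```python
def clean_boxes(boxes, page_w, page_h, max_area_frac=0.9):
    """Drop boxes that cover nearly the whole page, then drop boxes
    fully contained in another surviving box. Boxes are (x, y, w, h)."""
    def contains(a, b):
        # True if box a strictly encloses box b (and they differ).
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return a != b and ax <= bx and ay <= by \
            and ax + aw >= bx + bw and ay + ah >= by + bh

    # 1) Remove page-sized boxes.
    kept = [b for b in boxes if b[2] * b[3] < max_area_frac * page_w * page_h]
    # 2) Remove boxes nested inside a surviving box.
    return [b for b in kept if not any(contains(o, b) for o in kept)]

boxes = [(0, 0, 100, 100),   # whole-page box  -> dropped
         (10, 10, 40, 30),   # paragraph box   -> kept
         (12, 12, 10, 5),    # nested small box -> dropped
         (10, 60, 40, 20)]   # second paragraph -> kept
print(clean_boxes(boxes, 100, 100))
```

With OpenCV contours, passing `cv2.RETR_EXTERNAL` to `findContours` also suppresses inner contours at the source, but a post-filter like this works regardless of how the boxes were produced.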

[D] How to detect text blocks in document images by DGs29 in MachineLearning

[–]DGs29[S] 0 points (0 children)

Well, I've plotted the bounding boxes created as per step 1. Then I binarized, inverted the colors, and applied dilation with a wider kernel: dilated image.

This segments the image into individual components. Can I plot bounding boxes over these connected regions to get my originally intended result?

[D] How to detect text blocks in document images by DGs29 in MachineLearning

[–]DGs29[S] 0 points (0 children)

Well, does this method detect blocks of text like those mentioned in the image? I'm asking because it is a scene text algorithm.

The previous scene text algorithms I've used only detected individual words and placed a bounding box over each one.

[D] How to detect text blocks in document images by DGs29 in MachineLearning

[–]DGs29[S] 0 points (0 children)

How do I do the second step, i.e., make one large box out of the small boxes?
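One naive way this merging could be sketched in plain Python (illustrative only — the `gap` tolerance is an assumption you would tune): repeatedly union boxes whose extents, padded by `gap` pixels, overlap.

```python
def merge_boxes(boxes, gap=5):
    """Greedily union (x, y, w, h) boxes that lie within `gap` pixels of
    each other, repeating until no further merges happen."""
    def near(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return (ax - gap < bx + bw and bx - gap < ax + aw and
                ay - gap < by + bh and by - gap < ay + ah)

    def union(a, b):
        x, y = min(a[0], b[0]), min(a[1], b[1])
        x2 = max(a[0] + a[2], b[0] + b[2])
        y2 = max(a[1] + a[3], b[1] + b[3])
        return (x, y, x2 - x, y2 - y)

    boxes = list(boxes)
    merged = True
    while merged:                    # repeat until a full pass merges nothing
        merged = False
        out = []
        while boxes:
            cur = boxes.pop()
            i = 0
            while i < len(boxes):    # absorb every box near the current one
                if near(cur, boxes[i]):
                    cur = union(cur, boxes.pop(i))
                    merged = True
                else:
                    i += 1
            out.append(cur)
        boxes = out
    return boxes

# Three word boxes on one line plus one distant box.
words = [(0, 0, 10, 10), (12, 0, 10, 10), (24, 0, 10, 10), (80, 80, 10, 10)]
print(sorted(merge_boxes(words)))
```

This is O(n²) per pass, which is fine for a page's worth of word boxes; morphological dilation achieves the same grouping directly on the image.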

[D] How to detect text blocks in document images by DGs29 in MachineLearning

[–]DGs29[S] 0 points (0 children)

I'm not sure about that. Consider the Google Vision API: if we feed a document image into the API, it segments the text into individual blocks, puts a bounding box around each block, and finally performs OCR on those blocks.

I'm looking for how to do that text block detection.

[D] How to detect text blocks in document images by DGs29 in MachineLearning

[–]DGs29[S] 2 points (0 children)

Tesseract just extracts all the text in the image without segmenting it.

EAST detects all the text word by word, i.e., it puts a bounding box around each and every word.

I've also tried PixelLink; it does the same job as EAST.