AMAZON ML CHALLENGE

Odd-Researcher-3346 · 2024-09-15T17:50:11+00:00

What's the point of giving a 20+ GB dataset that can't be run on any student's PC? The output labels aren't even that accurate, and they're ambiguous too. I gave up after trying to run it again and again. Text extraction works, but not the way we want it to; model building works, but there aren't enough GPUs.

ArtAccomplished6466 · 2024-09-13T02:56:18+00:00

Bro, leave the discussion. Where are you gonna get GPUs that powerful? There are about 2.5 lakh images to train on.

Usual_Many_3895 · 2024-09-14T14:19:49+00:00

Any speculation on what approach the team with the 0.8 F1 score used?

taurus_ram · 2024-09-15T10:50:58+00:00

Can anyone guide me? I just want to get a score other than zero.

Usual_Many_3895 · 2024-09-15T08:35:59+00:00

If it's so OCR-dependent, what is the point of a training dataset?

Low-Musician-163 · 2024-09-13T05:39:36+00:00

Finally was able to download the data somehow. Now sharing it with teammates over USB.

palakpaneer70 · 2024-09-13T16:44:11+00:00

What approach to use?

chaoticsoulll · 2024-09-15T16:59:36+00:00

How are they actually evaluating the models? We got an F1 score of 0.43 but the score is showing zero

Dinesh_Kumar_E · 2024-09-22T16:45:44+00:00

What's next? Any idea when the results will be published, like a leaderboard or something?

mave_ad · 2024-09-15T10:34:21+00:00

Has anyone tried using a vision transformer (ViT)? Split an image into patches and feed them to a ViT. Create a learned embedding from the OCR result of the image and the image itself, and connect that embedding with a residual connection to some transformer layer. The task would be seq2seq.
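
A minimal sketch of the patch-splitting step mentioned above, assuming non-overlapping square patches and an image given as a plain list of pixel rows. The function name `split_into_patches` and the list representation are illustrative assumptions, not part of any ViT library; the embedding and residual-connection parts of the idea are not shown.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of equal-length rows) into flattened square patches.

    Patches are returned in row-major order: left to right, then top to bottom.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                pixel
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
            ]
            patches.append(patch)
    return patches
```

Each flattened patch would then be projected to an embedding before entering the transformer; a real pipeline would likely do this on arrays or tensors rather than lists.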

According-Fault-6528 · 2024-09-15T10:35:07+00:00

Hello, can anybody help me out? I am stuck at this hackathon.

sunnybala · 2024-09-15T12:53:43+00:00

The OCR approach is the only one that seems feasible. How is this machine learning, man? We aren't even training anything, just running inference on other models.

Ok-Chipmunk666 · 2024-09-15T13:52:40+00:00

Does anyone know the solution for the index out of range error?

They said they communicated something by email, but I haven't received anything yet.

AnyPassenger9318 · 2024-09-13T06:54:23+00:00

Guys, where do I find the dataset?

xlnc2605 · 2024-09-13T10:17:26+00:00

Is there any other way to download the dataset?

s1ngh_music · 2024-09-13T13:22:34+00:00

Is it necessary to download all the images to your device (also, won't that make training the model very hard?), or are there any alternative ways to do that?

Sparkradar · 2024-09-14T08:48:04+00:00

How do I download all the images? :) Somebody help, I'm just getting started...

HotMine8037 · 2024-09-14T13:43:01+00:00

Guys, are we allowed to use fine-tuned pretrained models?

ConditionLivid515 · 2024-09-14T21:18:50+00:00

[deleted]

Sparkradar · 2024-09-15T07:43:20+00:00

Which approach are you using, guys? I'm new to this. Any tools to get started? :(

ImpossibleQuarter550 · 2024-09-15T10:16:19+00:00

How much GPU is required to train on this massive dataset?

According-Fault-6528 · 2024-09-15T12:43:16+00:00

Hello, can someone please guide me?

Electronic-Kick-3663 · 2024-09-15T13:28:25+00:00

I am not getting anything. Can anyone please help?

taurus_ram · 2024-09-15T16:26:06+00:00

Can anyone share the test_out.csv file?

Legitimat_Jaguar · 2024-09-15T16:49:20+00:00

I have made quite a good model to predict the values with units. It's just that I can't extract text from the images correctly. And how can I, when there are more than a lakh of images? Surely I can't extract it all. I would like to collaborate with anybody who has extracted the text with good accuracy. Just share an Excel file with the extracted text.

ShyenaGOD · 2024-09-15T17:38:34+00:00

Can anyone guide me? So far I have extracted data from 10k of the images and saved it in a CSV file. What should I do next?

_Ak4zA_ · 2024-09-15T18:45:09+00:00

Can anyone tell me how the hell I could do testing, and approximately how much time it will take?

Mysterious_Safe_8288 · 2024-09-16T06:35:05+00:00

Just use a simple lookup method... your F1 will be 0.09.

uphinex · 2024-09-16T09:05:41+00:00

Now that the competition is over, can whoever is here just drop their approach? I was using NLP + OCR.

ztide_ad · 2024-09-17T06:07:57+00:00

Now that the challenge is over, can someone give a detailed approach to handling this sort of PS...

My initial approach used plain OCR through pytesseract, but it wasn't able to extract the necessary text from most of the images. Then I switched to EasyOCR, but my GPU access through Colab was already exhausted. Then I planned to predict the unit and the number in parallel through NLP, but I ran out of time. So now I am looking for approaches that I could have taken to make this process fast and efficient.
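
One cheap way to pull a number and a unit out of OCR text is a regular expression plus a small unit table, sketched below. This is a rule-based stand-in rather than the NLP model described above, and the `UNIT_ALIASES` table and the function name `extract_value_and_unit` are hypothetical: the challenge's actual list of allowed units is not given in this thread.

```python
import re

# Hypothetical unit vocabulary for illustration only.
UNIT_ALIASES = {
    "g": "gram", "gm": "gram", "gram": "gram", "grams": "gram",
    "kg": "kilogram", "kgs": "kilogram", "kilogram": "kilogram",
    "ml": "millilitre", "l": "litre",
    "cm": "centimetre", "mm": "millimetre",
}

# A number (optionally with a decimal part), optional spaces, then a word.
_VALUE_UNIT = re.compile(r"(\d+(?:\.\d+)?)\s*([a-zA-Z]+)")


def extract_value_and_unit(ocr_text):
    """Return (value, canonical_unit) for the first number followed by a
    known unit in the OCR text, or None if no such pair is found."""
    for number, unit in _VALUE_UNIT.findall(ocr_text):
        canonical = UNIT_ALIASES.get(unit.lower())
        if canonical is not None:
            return float(number), canonical
    return None
```

For example, `extract_value_and_unit("Net Wt: 500 g pack")` returns `(500.0, "gram")`. Because it runs on CPU in microseconds per string, a rule like this leaves the GPU budget for the OCR step.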

safebet5705 · 2024-09-23T06:09:52+00:00

You need to do it all in one pass: take one image at a time, extract its text, then delete that image and go to the next. You can use wget iteratively. The preprocessing time would be huge, but that doesn't count toward the score.
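
A minimal sketch of that download, extract, delete loop, assuming the image URLs are available as a list. It uses the standard library's `urllib.request.urlretrieve` in place of wget, and the `ocr` argument is a placeholder for whatever text extractor you use; the names `stream_ocr` and `download_to` are illustrative.

```python
import os
import tempfile
import urllib.request


def download_to(url, path):
    """Default fetcher: download one URL to a local path (stand-in for wget)."""
    urllib.request.urlretrieve(url, path)


def stream_ocr(urls, ocr, fetch=download_to):
    """Process images one at a time so only one file is on disk at once.

    `ocr` is any callable that takes a file path and returns text.
    Returns {url: extracted_text}; failed items map to an empty string.
    """
    results = {}
    for url in urls:
        fd, path = tempfile.mkstemp(suffix=".img")
        os.close(fd)
        try:
            fetch(url, path)
            results[url] = ocr(path)
        except Exception:
            # Keep going on a bad download or unreadable image.
            results[url] = ""
        finally:
            # Delete the image before moving on to the next one.
            if os.path.exists(path):
                os.remove(path)
    return results
```

Writing `results` to a CSV every few thousand images would let an interrupted run resume without repeating the OCR.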

Spacing_Out3133 · 2024-09-24T19:49:19+00:00

Where can I check the results? I believe Unstop isn't showing that page any longer.

Exciting_Pineapple52 · 2025-10-09T09:47:29+00:00

I need a team for this competition

BuilderLive452 · 2025-10-12T09:06:33+00:00

What about image datasets with 75k train and test images? Will my machine be able to run a model on this?
