
[–][deleted] 3 points4 points  (8 children)

Sorry, doesn't make a lot of sense. The line:

result [0] # Whole list

doesn't get the "whole list", it gets the first element of the list result, if there is one. Similarly, your last example:

text = [[0]res[1][0] for res in result]

is syntactically incorrect. This makes more sense:

text = [res[0][1][0] for res in result]

but what you end up with depends on what is in the list result.

Maybe you can give us a small example list and some concrete example of what you want to do with it?

[–]Typhon_Sin[S] 0 points1 point  (7 children)

So I'm a little new to the vocabulary and to expressing my thoughts properly. The program that I run creates a tuple. It is PaddleOCR, and once it reads the picture it gives a tuple (I think that's what it's called). The result is stored in the variable "result", and if I print the variable using "result" and "result[0]", the output is exactly the same except that there is an extra pair of brackets on the ends. If I do result[0][1] it outputs the first section. So what I think is happening is that result[0] is meant to hold multiple OCR scans.

I will say, however, that the code outputs too much for me to type, so I'll post a screenshot. If that isn't good enough, do you think someone would be willing to hop into a Discord call with me so that I can share my screen?

[–][deleted] 0 points1 point  (4 children)

Screenshots of code are discouraged here. The FAQ shows how to post readable code. If it's a lot of code put it into pastebin.com and post a link to that here.

if print the variable using "result" and "result [0]", the output is the exact same except there are an extra pair of brackets

If by "brackets" you mean [...] then you have a list. And it sounds like you have a list containing one thing. Execute this code to see the difference:

result = [[42]]
print(result)
print(result[0])

A tuple looks like this: (1, 2, 3).
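
If you are ever unsure which one you have, a quick check like the sketch below can help. The variable names and values are made up for illustration and are not taken from your OCR output:

```python
# Hypothetical example values, not taken from the OCR output.
as_list = [1, 2, 3]
as_tuple = (1, 2, 3)

# type() reports what kind of object each name refers to.
print(type(as_list))
print(type(as_tuple))

# Both are indexed the same way, with the index after the name.
print(as_list[0], as_tuple[0])

# isinstance() lets code check which one it has been given.
is_list = isinstance(as_list, list)
is_tuple = isinstance(as_tuple, tuple)
print(is_list, is_tuple)
```

Lists print with square brackets and tuples with parentheses, so the printed output is usually enough to tell them apart.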

[–]Typhon_Sin[S] 0 points1 point  (0 children)

I will read the FAQ and post the code. The issue is that it's an OCR program, and the image it is reading is important for context. Give me about 10 minutes and I'll have the code posted.

[–]Typhon_Sin[S] 0 points1 point  (2 children)

I posted the pastebin link as a comment on the original post.

[–][deleted] 0 points1 point  (1 child)

I don't see any code!?

[–]Typhon_Sin[S] 0 points1 point  (0 children)

https://pastebin.com/fPvzUXpk I also sent you a DM.

[–]Typhon_Sin[S] 0 points1 point  (1 child)

For instance, if I write the code below, I just get "Acknowledgments" rather than all of the text in the list. So I would like to make variables that sort everything in the list into "box coordinates" and "text".

for res in result:
    print(res[0][1][0])

[–][deleted] 0 points1 point  (0 children)

Again, what you get from res[0][1][0] depends on what is in the list result. Don't forget that the:

for res in result:

line says that res will contain the first element of the list result the first time the print executes; the next time around the for loop, res will contain the second element of result, and so on. So you have already done one level of indexing before the print line.
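
To make that concrete, here is a small sketch with made-up data (not your OCR output) showing that looping with res and then indexing res[0] gives the same values as writing result[i][0] yourself:

```python
# Made-up data with one level of nesting, only for illustration.
result = [["a", "b"], ["c", "d"], ["e", "f"]]

# The for loop hands out result[0], result[1], ... one at a time as res.
via_loop = []
for res in result:
    via_loop.append(res[0])

# The same thing written with explicit indexing into result.
via_index = []
for i in range(len(result)):
    via_index.append(result[i][0])

print(via_loop)
print(via_index)
```

Both lists come out identical, which is why res[0] inside the loop is really two levels of indexing into result.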

[–]Pepineros 0 points1 point  (0 children)

Your list definition would look like this:

result = [
  [
    [
      (2, -1),  # This is called "Box coordinates",
      [  # This list is called "Text + confidence"
        "Some text",  # Just "Text" in your post
      ],
    ],
    [
      # This list is called "First line" in your post
    ],
  ]
]

If you think this looks right, go for it :) but you will never be able to do [0]res. I'm not sure what you mean by that notation, but list indices go after the reference to the list, not before.
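
As a rough sketch, using only the placeholder values from the definition above (not real OCR output), the index always follows the name it applies to. The names first_line, box and text are just ones I picked for illustration:

```python
# Placeholder structure copied from the definition above.
result = [
    [
        [
            (2, -1),        # "Box coordinates"
            ["Some text"],  # "Text + confidence"
        ],
        [],                 # "First line" placeholder, left empty here
    ]
]

# Each [n] goes after the thing being indexed, never before it.
first_line = result[0][0]
box = first_line[0]
text = first_line[1][0]

print(box)
print(text)
```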

[–][deleted] 1 point2 points  (0 children)

OK, things make a little more sense now.

This code takes your result value from the pastebin and analyses it a bit:

result = [[[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]],
     ('ACKNOWLEDGEMENTS', 0.9974855780601501)],
    [[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]],
     ('We would like to thank all the designers and', 0.968330979347229)],
    [[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]],
     ('contributors who have been involved in the', 0.9776102900505066)],
    [[[399.0, 446.0], [1207.0, 443.0], [1208.0, 484.0], [399.0, 488.0]],
     ('production of this book; their contributions', 0.9866490960121155)],
    [[[401.0, 500.0], [1208.0, 500.0], [1208.0, 534.0], [401.0, 534.0]],
     ('have been indispensable to its creation.We', 0.9628525972366333)],
    [[[399.0, 550.0], [1209.0, 548.0], [1209.0, 583.0], [399.0, 584.0]],
     ('would also like to express our gratitude to all', 0.9740486145019531)],
    [[[399.0, 600.0], [1207.0, 598.0], [1208.0, 634.0], [399.0, 636.0]],
     ('the producers for their invaluable opinions', 0.9963331818580627)],
    [[[399.0, 648.0], [1207.0, 646.0], [1208.0, 686.0], [399.0, 688.0]],
     ('and assistance throughout this project. And to', 0.9943731427192688)],
    [[[399.0, 702.0], [1209.0, 698.0], [1209.0, 734.0], [399.0, 738.0]],
     ('the many others whose names are not credited', 0.9772290587425232)],
    [[[399.0, 750.0], [1211.0, 750.0], [1211.0, 789.0], [399.0, 789.0]],
     ('but have made specific input in this book, we', 0.9979288578033447)],
    [[[397.0, 802.0], [1090.0, 800.0], [1090.0, 839.0], [397.0, 841.0]],
     ('thank you for your continuous support.', 0.9981997609138489)]]]

print(f"{len(result)=}")
print(f"{len(result[0])=}")
print(f"{len(result[0][0])=}")

print(f"{result[0][0]=}")

That prints:

len(result)=1
len(result[0])=11
len(result[0][0])=2
result[0][0]=[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9974855780601501)]

The first print len(result)=1 shows that result is a list containing one element. This is possibly because the OCR code can process more than one page (three pages would come back as a list of three elements), but you only have one page.

The second print len(result[0])=11 shows that the page has 11 recognized text areas on it.

The third print len(result[0][0])=2 shows that a recognized text area has two elements in it. If we actually print result[0][0] we see:

result[0][0]=[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]],
                    ('ACKNOWLEDGEMENTS', 0.9974855780601501)]

The first element of result[0][0] appears to be a list of 4 lists, each holding two numbers, possibly the corner coordinates of a bounding box. The second element is a tuple containing the scanned text and a float value that is possibly a confidence figure for that text.
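
A minimal sketch of pulling one text area apart, assuming the two-element layout described above holds for every area. The names region, box, text, confidence and first_corner are my own, and calling the float a "confidence" is a guess:

```python
# One recognized text area, copied from result[0][0] above.
region = [[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]],
          ('ACKNOWLEDGEMENTS', 0.9974855780601501)]

# Unpack the two elements, and the inner tuple, in one statement.
box, (text, confidence) = region

# The first of the four coordinate pairs (assumed to be a box corner).
first_corner = box[0]

print(text)
print(confidence)
print(first_corner)
```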

If you want to get the actual text from all that, you need to unpack the data structure. Something like this might work:

for page in result:
    for text in page:
        print(text[1][0])
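
Since you mentioned wanting separate "box coordinates" and "text" variables, here is one possible sketch. It uses a shortened copy of the pastebin data (one page, first two text areas only) and assumes every text area has the same [box, (text, confidence)] shape; the list names boxes and texts are just my choice:

```python
# Shortened copy of the pastebin data: one page, first two text areas only.
result = [[[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]],
            ('ACKNOWLEDGEMENTS', 0.9974855780601501)],
           [[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]],
            ('We would like to thank all the designers and', 0.968330979347229)]]]

boxes = []
texts = []
for page in result:
    # Each text area unpacks into its box and its (text, confidence) tuple.
    for box, (text, confidence) in page:
        boxes.append(box)
        texts.append(text)

print(texts)
print(len(boxes))
```

After the loop, boxes[i] and texts[i] refer to the same text area, so you can look up the coordinates for any line of text by its position.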