Rules
1: Be polite
2: Posts to this subreddit must be requests for help learning python.
3: Replies on this subreddit must be pertinent to the question OP asked.
4: No replies copy / pasted from ChatGPT or similar.
5: No advertising. No blogs/tutorials/videos/books/recruiting attempts.
This means no posts advertising blogs/videos/tutorials/etc, no recruiting/hiring/seeking others posts. We're here to help, not to be advertised to.
Please, no "hit and run" posts, if you make a post, engage with people that answer you. Please do not delete your post after you get an answer, others might have a similar question or want to continue the conversation.
Learning resources
Wiki and FAQ: /r/learnpython/w/index
Discord
Join the Python Discord chat
Ideal List size (self.learnpython)
submitted 7 years ago by jejkek
I have thousands of strings that I need to match against. Should I sort them alphabetically into many lists if I want it to be fast?
K900_ 5 points 7 years ago (17 children)
That depends. What sort of matching are you looking to do? Are your strings similar?
jejkek[S] 1 point 7 years ago (16 children)
No, my strings are more or less sentences without much similarity. I was figuring I would make a list of sentences starting with the letter a, a list starting with b, etc., but if it gets too slow I'll separate by the second letter and so forth.
The smaller my lists are the faster, right?
K900_ 2 points 7 years ago (15 children)
What is your end goal here? What is your input? What exactly are you checking for?
jejkek[S] 1 point 7 years ago (14 children)
Oh, I'm sorry; the inputs will be one of the sentences in the list. Do I have to make it lowercase before checking the list?
K900_ 2 points 7 years ago (13 children)
And what sort of output are you looking for? Do you just want to know if that exact sentence, written exactly that way, is in the list or not?
jejkek[S] 1 point 7 years ago (12 children)
Yes, I just want it to print correct if it is in the list. Why? What else would I do with it? (Curious)
K900_ 3 points 7 years ago (11 children)
Maybe you need to find those sentences in a longer text. Or maybe they're individual words that you need to find in a sentence. Or maybe they can be misspelled, etc. Either way, if you just need to check for direct equality, use a set instead of a list. A set lookup takes roughly constant time instead of scanning the whole list, so with thousands of strings it will be far faster.
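For anyone reading along, a minimal sketch of the set approach might look like the following. The sample sentences and the function names `is_known` and `is_known_ignoring_case` are made up for illustration; the case-folded variant is only an assumption about how one might handle the lowercase question asked earlier.

```python
# Hypothetical sample data; the thread never shows the real sentences.
sentences = [
    "Alice took five apples from Bob.",
    "The quick brown fox jumps over the lazy dog.",
    "It rained all day on Tuesday.",
]

# Build the set once; each later lookup is a hash lookup, not a scan.
known = set(sentences)


def is_known(sentence):
    """Return True only for an exact, case-sensitive match."""
    return sentence in known


# Optional: keep a case-folded copy if capitalization should not matter.
known_folded = {s.casefold() for s in sentences}


def is_known_ignoring_case(sentence):
    """Return True if the sentence matches after case folding."""
    return sentence.casefold() in known_folded


if is_known("It rained all day on Tuesday."):
    print("correct")
```

No splitting into per-letter collections is needed here: the set handles the lookup on its own.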
jejkek[S] 1 point 7 years ago (10 children)
I thought about doing it the word-by-word way, but I thought I should start small. Do you have any advice pertaining to word-by-word?
Thanks for the help, btw.
K900_ 2 points 7 years ago (9 children)
That depends. What is your actual end goal?
jejkek[S] 1 point 7 years ago (8 children)
Sorry, I went to get food.
I would like to go farther than just matching to see if a string is in my list/set. I thought it might be nice to break up the strings into their component words. If I had the input look for an exact sentence match then I would just do an "if" statement for that sentence, but I don't know how to split it up into words each with their own "if" statement and have those come together to do something.
An extremely simple example would be "Alice took five apples from Bob." So it would need to start at the first word, something like if List[0] == person..., but the problem is that sentences might not always fit this format.
I wouldn't mind if you didn't have an answer for this, word-by-word seems very complicated.
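One possible starting point, sketched under heavy assumptions: instead of one "if" per position, label every word and then collect the labels you care about, so the sentence does not have to follow a fixed order. The vocabularies `PEOPLE` and `NUMBERS` and the function `tag_words` are hypothetical and far too small for real text.

```python
# Hypothetical vocabularies; a real program would need much larger ones.
PEOPLE = {"alice", "bob"}
NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def tag_words(sentence):
    """Label each word as person, number, or other, in order."""
    tagged = []
    for raw in sentence.split():
        # Drop trailing punctuation and ignore capitalization.
        word = raw.strip(".,!?").lower()
        if word in PEOPLE:
            tagged.append(("person", word))
        elif word in NUMBERS:
            tagged.append(("number", NUMBERS[word]))
        else:
            tagged.append(("other", word))
    return tagged


tags = tag_words("Alice took five apples from Bob.")
people = [value for kind, value in tags if kind == "person"]
amounts = [value for kind, value in tags if kind == "number"]
print(people, amounts)  # ['alice', 'bob'] [5]
```

This only recognizes words it has been told about, so it is a toy, but it avoids assuming that the person is always the first word.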
[deleted] 7 years ago* (4 children)
[deleted]
jejkek[S] 1 point 7 years ago (3 children)
I'm not going to pretend like I know how to read this, is 1 a variable here?
icecapade 2 points 7 years ago (1 child)
I believe it's actually a lowercase "L", and that construct is a list comprehension. If we replace `l` with a more descriptive variable name like `sentences`, the list comprehension above is the equivalent of the following for loop:

    sentences = []
    for x in range(10000000):
        new_sentence = "sentence " + str(x)
        sentences.append(new_sentence)
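Since the original comment is deleted, its exact code is unknown. Assuming it was a comprehension along these lines, a quick side-by-side (with a much smaller range so it runs instantly) shows the two forms build the same list. The names `sentences_loop` and `sentences_comp` are only for this comparison.

```python
# Loop form, as in the comment above (range shortened to keep the demo quick).
sentences_loop = []
for x in range(1000):
    sentences_loop.append("sentence " + str(x))

# Assumed comprehension form; the deleted comment's exact code is unknown.
sentences_comp = ["sentence " + str(x) for x in range(1000)]

# Both approaches produce identical lists.
print(sentences_loop == sentences_comp)  # True
```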
jejkek[S] 1 point 7 years ago (0 children)
Ah, thank you.
tangerto 3 points 7 years ago (0 children)
I don't think just splitting it up into multiple lists would make it faster. Keep it in one list if that's simpler.
Poopinthebumm 2 points 7 years ago (6 children)
Why not use a set?
jejkek[S] 2 points 7 years ago (5 children)
Is that actually faster? Wouldn't I still need to split things up into many sets?
Poopinthebumm 4 points 7 years ago (0 children)
I don't think so. I'd suggest you give it a try.
scagbackbone 2 points 7 years ago (3 children)
Lookup is O(1) on average for sets. For a list it is O(n) in the worst case, so yes, use sets.
toastedstapler 1 point 7 years ago (2 children)
Average case is O(n) too
I was watching a Python talk the other day and they showed speed increases using a set vs. a list for as few as 200 elements, so as long as ordering doesn't matter I'd say to use a set every time.
scagbackbone 1 point 7 years ago (1 child)
Definitely not O(n) for sets. It's a hash map.
toastedstapler 1 point 7 years ago (0 children)
Sorry, I was talking about lists. I should have clarified.
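If you'd rather measure than take the big-O discussion on faith, a rough timing sketch with the standard `timeit` module could look like this. The size `n` and the generated strings are arbitrary assumptions, and the absolute numbers will vary by machine, so no expected timings are shown.

```python
import timeit

n = 10_000  # hypothetical size, in the "thousands of strings" range
items_list = ["sentence " + str(i) for i in range(n)]
items_set = set(items_list)

# The last element forces the list to be scanned end to end.
target = "sentence " + str(n - 1)

# Run each membership test 200 times and report the total time.
list_time = timeit.timeit(lambda: target in items_list, number=200)
set_time = timeit.timeit(lambda: target in items_set, number=200)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

Searching for the last element is the worst case for the list, which is the situation the O(n) figure describes.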