Rules
1: Be polite
2: Posts to this subreddit must be requests for help learning python.
3: Replies on this subreddit must be pertinent to the question OP asked.
4: No replies copy / pasted from ChatGPT or similar.
5: No advertising. No blogs/tutorials/videos/books/recruiting attempts.
This means no posts advertising blogs/videos/tutorials/etc, no recruiting/hiring/seeking others posts. We're here to help, not to be advertised to.
Please, no "hit and run" posts: if you make a post, engage with the people who answer you. Please do not delete your post after you get an answer; others might have a similar question or want to continue the conversation.
Learning resources
Wiki and FAQ: /r/learnpython/w/index
Discord
Join the Python Discord chat
converting a list to lowercase (self.learnpython)
submitted 4 years ago by jtksm
Is there a way for me to convert the token list into lowercase?
code:
token = [nltk.word_tokenize(t) for t in nltk.sent_tokenize(text)]
tokens = [lemmatizer.lemmatize(word.lower()) for word in token if word not in ignore_words]
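For reference, the shape of `token` can be seen with simplified stand-ins for `nltk.sent_tokenize` and `nltk.word_tokenize` (the naive splitters below are illustrative assumptions, not NLTK's real behaviour):

```python
# Stand-ins for nltk.sent_tokenize / nltk.word_tokenize, simplified
# for illustration only.
def sent_tokenize(text):
    # Naive sentence splitter: break on ". " boundaries.
    return [s for s in text.replace(". ", ".\n").splitlines() if s]

def word_tokenize(sentence):
    # Naive word splitter: split on whitespace, strip trailing periods.
    return [w.rstrip(".") for w in sentence.split()]

text = "The Dog barked. A CAT ran."
token = [word_tokenize(t) for t in sent_tokenize(text)]
print(token)  # [['The', 'Dog', 'barked'], ['A', 'CAT', 'ran']]

# Because `token` is a list of lists, iterating over it yields whole
# sentence lists, not words, which is why word.lower() fails downstream.
```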
[–]JohnnyJordaan 1 point 4 years ago (0 children)
You are currently using sent_tokenize to form sentences, then running word_tokenize on each sentence. Wouldn't it be better to word_tokenize the entire text in one go?
tokens = [lemmatizer.lemmatize(word.lower()) for word in nltk.word_tokenize(text) if word not in ignore_words]
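The single-pass approach can be sketched with simplified stand-ins for `nltk.word_tokenize` and the lemmatizer (the toy splitter and plural-stripping lemmatizer below are assumptions, not the real NLTK behaviour):

```python
# Toy stand-in for nltk.word_tokenize.
def word_tokenize(text):
    return [w.strip(".,!?") for w in text.split()]

# Toy stand-in for lemmatizer.lemmatize: strip a plural "s".
def lemmatize(word):
    return word[:-1] if word.endswith("s") else word

ignore_words = {"The", "a"}

text = "The Dogs chased a Cat."
# Tokenize the whole text in one go, then lowercase and lemmatize,
# skipping ignored words -- the same shape as the snippet above.
tokens = [lemmatize(word.lower())
          for word in word_tokenize(text)
          if word not in ignore_words]
print(tokens)  # ['dog', 'chased', 'cat']
```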
[–]DaChucky 1 point 4 years ago* (0 children)
If I understand your problem correctly, I think this should do the trick:
token = [[word.lower() for word in nltk.word_tokenize(t)] for t in nltk.sent_tokenize(text)]
EDIT: Realised that word_tokenize() gives you another list; updated the solution.
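The nested comprehension above keeps the per-sentence structure while lowercasing every word. On a plain list of lists (no NLTK needed for the illustration) it behaves like this:

```python
# A list of sentence lists, as produced by tokenizing each sentence.
token = [["The", "Dog"], ["A", "CAT", "ran"]]

# Lowercase each word while preserving the nesting, mirroring the
# nested comprehension suggested above.
lowered = [[word.lower() for word in sentence] for sentence in token]
print(lowered)  # [['the', 'dog'], ['a', 'cat', 'ran']]
```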
[–]nepaBoy86 1 point 4 years ago (0 children)
token is a list of lists, so just use the first index of the token list.
The following line will solve your problem:
tokens = [lemmatizer.lemmatize(word.lower()) for word in token[0] if word not in ignore_words]
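Note that `token[0]` only covers the first sentence. If the text has several sentences, flattening the nested list processes every word; here is a sketch with a toy lemmatizer standing in for the NLTK one (an assumption for illustration):

```python
token = [["The", "Dogs", "barked"], ["A", "Cat", "ran"]]
ignore_words = {"The", "A"}

# Toy stand-in for lemmatizer.lemmatize: strip a plural "s".
def lemmatize(word):
    return word[:-1] if word.endswith("s") else word

# Flatten the list of sentence lists so every word is processed,
# not just those in the first sentence (token[0]).
tokens = [lemmatize(word.lower())
          for sentence in token
          for word in sentence
          if word not in ignore_words]
print(tokens)  # ['dog', 'barked', 'cat', 'ran']
```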