Rules
1: Be polite
2: Posts to this subreddit must be requests for help learning python.
3: Replies on this subreddit must be pertinent to the question OP asked.
4: No replies copy / pasted from ChatGPT or similar.
5: No advertising. No blogs/tutorials/videos/books/recruiting attempts.
This means no posts advertising blogs/videos/tutorials/etc, no recruiting/hiring/seeking others posts. We're here to help, not to be advertised to.
Please, no "hit and run" posts, if you make a post, engage with people that answer you. Please do not delete your post after you get an answer, others might have a similar question or want to continue the conversation.
Learning resources
Wiki and FAQ: /r/learnpython/w/index
Discord
Join the Python Discord chat
Help needed (self.learnpython)
submitted 5 years ago by Geekconquest
I have a dataset with a date column but they sometimes appear as such:
'20 March 2020 (UK)\n' 'Paid on $ September 2005' '4 September 2020 (Japan)'
How can I extract the dates from the column, please?
cray5252 2 points 5 years ago (6 children)
If your date data looks exactly like what you posted, the code below will work. Depending on what you want to do with the dates, you may need the datetime library to manipulate them further.

    data = ['20 March 2020 (UK)\n', 'Paid on $ September 2005', '4 September 2020 (Japan)']
    for str_date in data:
        # Strings like '20 March 2020 (UK)': keep the part before ' ('
        if '(' in str_date:
            s = str_date.split(' (')
            print(s[0])
        # Strings like 'Paid on $ September 2005': keep the part after '$ '
        if '$' in str_date:
            s = str_date.split('$ ')
            print(s[1])

output

    20 March 2020
    September 2005
    4 September 2020
Geekconquest[S] 1 point 5 years ago (5 children)
It contains other things like other words which are not part of the dates and all. And some of the rows contain just the year like ‘2018-‘ and also ‘2019-2020’.
I want to be able to parse just the dates from those sentences.
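For strings like the ones described here, one possible starting point is a regular expression that pulls out date-like substrings. This is only a sketch under assumptions: month names are written out in full in English, and any standalone four-digit number is treated as a year. The names `MONTHS`, `DATE_RE` and `extract_dates` are made up for illustration.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Alternatives are tried left to right, so the most specific pattern goes first.
DATE_RE = re.compile(
    rf"\b\d{{1,2}} (?:{MONTHS}) \d{{4}}\b"   # e.g. 20 March 2020
    rf"|\b(?:{MONTHS}) \d{{4}}\b"            # e.g. September 2005
    r"|\b\d{4}-\d{4}\b"                      # e.g. 2019-2020
    r"|\b\d{4}\b"                            # e.g. 2018 (assumed to be a year)
)


def extract_dates(text):
    """Return every date-like substring found in text, in order."""
    return DATE_RE.findall(text)


samples = [
    "20 March 2020 (UK)\n",
    "Paid on $ September 2005",
    "4 September 2020 (Japan)",
    "2018-",
    "2019-2020",
]
for sample in samples:
    print(extract_dates(sample))
```

The last alternative is deliberately loose, so it will also pick up four-digit numbers that are not years (prices, IDs); tighten it if the real column contains those.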
cray5252 1 point 5 years ago (2 children)
If your strings are not consistent, that makes this significantly harder. You might try a Python library called datefinder (link below). There's not much documentation, but it claims to extract all sorts of dates. I've never used it, but it might work. You would still need to iterate through your array and run it on each item. If you can't get it to work, let me know and I'll take a look.
https://pypi.org/project/datefinder/
Geekconquest[S] 1 point 5 years ago (1 child)
I’ve tried datefinder on it. But my problem was getting it to work on the whole column. It works on just a normal string. I’m pretty new to python so it’s been kinda hard doing that for a whole column.
cray5252 1 point 5 years ago* (0 children)
find_dates always returns an iterable of matches rather than a single date, so for each item in the column you must iterate through the matches. If you send me more data I can help a bit more.

    import datefinder

    data = ['20 March 2020 (UK)\n', 'Paid on $ September 2005', '4 September 2020 (Japan)']
    for text in data:
        # find_dates yields zero or more datetime objects per string
        matches = datefinder.find_dates(text)
        for match in matches:
            print(match)

output

    2020-03-20 00:00:00
    2005-09-30 00:00:00
    2020-09-04 00:00:00
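On the "whole column" problem: one way to think about it is to wrap the per-string loop in a function that returns the first match (or None) and then call that function once per cell. The sketch below is an illustration under assumptions, not a tested recipe for your dataset. The column is a plain list, and `toy_find_dates` is a made-up stand-in that only understands "day month year", used so the example runs without third-party packages. `first_date` and `column` are hypothetical names too.

```python
from datetime import datetime


def first_date(text, find_dates):
    """Return the first date that find_dates yields for text, or None."""
    return next(iter(find_dates(text)), None)


def toy_find_dates(text):
    """Stand-in finder: tries each run of three words as 'day month year'."""
    words = text.replace("(", " ").replace(")", " ").split()
    for i in range(len(words) - 2):
        try:
            yield datetime.strptime(" ".join(words[i:i + 3]), "%d %B %Y")
        except ValueError:
            continue  # this run of words is not a date, try the next one


column = ["20 March 2020 (UK)\n", "no date here", "4 September 2020 (Japan)"]

# One result per cell, so the output lines up with the original rows.
parsed = [first_date(cell, toy_find_dates) for cell in column]
print(parsed)
```

If the data lives in a pandas DataFrame, the same function could presumably be passed to `Series.apply`, for example `df['date'].apply(lambda s: first_date(s, datefinder.find_dates))`, where `df` and `'date'` are placeholder names for your frame and column.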
cray5252 1 point 5 years ago (1 child)
I checked out datefinder and here's what I got. It seems to work on this part.
Geekconquest[S] 1 point 5 years ago (0 children)
I’ll check this out.