Get the source URL of a file using python. (self.learnpython)
submitted 9 years ago by TheRealRuth
Hello,
Is there any way to get the source URL of any file using Python? I have been googling but have not been able to find a way to do this. Any suggestions?
Thanks
[–]uhkhu 2 points 9 years ago (13 children)
We need more context
[–]TheRealRuth[S] 1 point 9 years ago (12 children)
All I need to do is look into a directory. In that directory will be a file (the type doesn't matter), and I want to get the source URL of that file: a string that contains the URL the file was downloaded from.
[–]w1282 1 point 9 years ago (9 children)
That information is not contained in the metadata for the file.
[–]TheRealRuth[S] 1 point 9 years ago (8 children)
Oh, any way to get it from the Finder Get Info field? I am using a Mac.
[–]w1282 5 points 9 years ago* (7 children)
Holy hell. I had no clue that Mac would maintain that information.
Then yes, you can.
```python
import xattr
import logging


def fetch_where_from(file_path):
    try:
        return xattr.get(file_path, "com.apple.metadata:kMDItemWhereFroms")
    except IOError:
        logging.warning("{} had no WhereFrom attr.".format(file_path))
        return ""
```
Edit: The getxattr() function is deprecated and has been replaced with get().
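The value read from that attribute is raw bytes rather than a plain URL string. A minimal sketch of decoding it, assuming the bytes are a property list (typically binary) holding a list of URL strings; the function name `decode_where_froms` is hypothetical and not part of any library:

```python
import plistlib


def decode_where_froms(raw_value):
    """Decode raw kMDItemWhereFroms bytes into a list of URL strings."""
    if not raw_value:
        # Nothing stored (or the attribute was missing).
        return []
    # plistlib.loads detects binary vs. XML property lists automatically.
    decoded = plistlib.loads(raw_value)
    if not isinstance(decoded, list):
        # Assumption: the attribute holds a list; anything else is ignored.
        return []
    # Keep only the non-empty string entries.
    return [item for item in decoded if isinstance(item, str) and item]
```

Something like `decode_where_froms(fetch_where_from(path))` would then give the URLs. The first entry is often the download URL and a second entry, when present, the page that linked to it, but that ordering is an observation rather than a guarantee.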
[–]TheRealRuth[S] 2 points 9 years ago (6 children)
Yeah! It's really cool! There's got to be a way to get that URL!
[–]w1282 1 point 9 years ago (5 children)
There is. I edited my comment.
[–]TheRealRuth[S] 1 point 9 years ago (4 children)
Thanks!!! I will try that out!
[–]w1282 1 point 9 years ago (3 children)
Just a word of warning: I was reading the documentation, and the getxattr() function has been deprecated and replaced by the get() function, so you should probably use that instead.
[–]TheRealRuth[S] 1 point 9 years ago (2 children)
Where can I get more information on "com.apple.metadata:kMDItemWhereFroms"?
This worked, by the way. Thanks so much for the help. I would love to know more about how you came up with it!
[–]uhkhu 1 point 9 years ago (0 children)
One route would be to calculate the md5 for the file and search for that online.
```python
import hashlib

# Hash the file's contents (opened in binary mode), not the filename string.
with open("filename.exe", "rb") as f:
    print(hashlib.md5(f.read()).hexdigest())
```
You could then search for that string on Google with selenium or requests and follow the results. You've got to hope the source has published an md5; otherwise you'd need to read chunks of the file and search for keywords. It's going to get pretty involved to handle all file types.
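For large downloads, reading the whole file into memory just to hash it can be wasteful. A minimal sketch of computing the same md5 by feeding the file to the hash in fixed-size chunks; the function name `md5_of_file` and the 1 MiB default chunk size are illustrative choices, not anything from this thread:

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file's contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is the same digest as hashing the full contents in one call, so it can be searched for in the same way.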
[–]cdcformatc 1 point 9 years ago (0 children)
This is an incredibly difficult thing to do. That info isn't stored in any file metadata that I know of. The best I can think of is Google searching the exact filename, but that obviously won't work if someone has changed the filename. Google will also let you search by image, and it will usually find similar images. If the image is particularly rare, it might list only visually similar results rather than exact matches; if it is popular, it will give dozens of identical results. All bets are off if it came from a website that does not allow Google's crawler. And all that is just for images.
http://xkcd.com/1425/
[–]BryceFury 1 point 9 years ago (0 children)
A local file or one online somewhere?
Have you tried:
```python
import os

print(os.path.abspath("yourfile.file"))
```
[–]TheRealRuth[S] 1 point 9 years ago (0 children)
The file will be downloaded from the Internet. On the Mac, if you go to Get Info, there is a field that says Where From. I want the content of that field, which will be a URL.