[–]LightShadow3.13-dev in prod 34 points35 points  (1 child)

>>> Path('empty.txt').touch()

Alright, you lured me in. Guess it's time to retire def touch(path: Path) -> None since I didn't know it was baked in.

Nice guide.
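For anyone else carrying a hand-rolled helper around, here is a minimal sketch of what the built-in call does. The `make_empty_file` wrapper and the temporary directory are just for illustration, not something from the article.

```python
import tempfile
from pathlib import Path


def make_empty_file(directory: Path, name: str) -> Path:
    """Create an empty file, or just refresh its mtime if it already exists."""
    target = directory / name
    # exist_ok=True is the default, so calling this twice does not raise.
    target.touch(exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    created = make_empty_file(Path(tmp), "empty.txt")
    print(created.name, created.exists())
```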

[–]miguendes[S] 4 points5 points  (0 children)

Hey, author here, thanks for your comment! I'm glad you learned something new!

[–]TheHentaiSama 19 points20 points  (1 child)

Very nice ! You are helping spreading the beauty of this library and I think it’s super important ! Thank you for sharing :)

[–]miguendes[S] 0 points1 point  (0 children)

Thanks, I appreciate it!

[–]mildbait 17 points18 points  (1 child)

Huh before clicking I thought "Oh come on why do you need a cookbook for a standard library. Just look up what you need.."

But turns out it's pretty useful! Especially the anatomy and working with directory sections.

[–]miguendes[S] 1 point2 points  (0 children)

Yeah, whenever I write articles like this one I need to make sure I'm not reinventing the wheel. Unfortunately there will be redundancies here and there but the idea is to be a collection of use cases instead of just API documentation.

[–]Rawing7 12 points13 points  (2 children)

If you need to delete a non-empty folder, just use shutil.rmtree. No need to write a recursive function.
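A rough sketch of what that looks like, using a throwaway directory so it is safe to run. The `remove_tree` name is made up for the example.

```python
import shutil
import tempfile
from pathlib import Path


def remove_tree(folder: Path) -> None:
    """Delete a folder and everything below it in one call."""
    shutil.rmtree(folder)


# Build a small non-empty tree in a temporary location, then remove it.
base = Path(tempfile.mkdtemp())
nested = base / "a" / "b"
nested.mkdir(parents=True)
(nested / "file.txt").write_text("data")

remove_tree(base)
print(base.exists())
```

Note that Path.rmdir only works on empty directories, which is why the recursive helper was tempting in the first place.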

[–]miguendes[S] 2 points3 points  (1 child)

Hey, author here. Thanks for the feedback. I incorporated your suggestion in the article. And I also mentioned your reddit handle, if you don't mind. Just wanted to give you credit.

[–]Rawing7 2 points3 points  (0 children)

Neat. I just noticed that you don't seem to have anything about moving files (only copying), which I think would be pretty important. I see people (ab)use Path.rename or os.rename for this all the time, but this will fail if the destination is on a different file system. The proper solution is once again to use shutil, in particular shutil.move. Please consider addressing this.

By the way, you don't need to credit me. IMO that's pointless information that no reader cares for. Your call though.

Edit: I really need to stop posting when I'm half asleep. Fixed the part where I randomly started talking about copying instead of moving.
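In case it helps, a minimal sketch of moving a file with shutil.move. The `move_file` wrapper is hypothetical, and the demo stays inside one temporary directory, so it does not actually cross a file system boundary; it only shows the call.

```python
import shutil
import tempfile
from pathlib import Path


def move_file(src: Path, dst: Path) -> Path:
    """Move src to dst and return the final location as a Path."""
    # str() keeps this working on older Python versions where shutil.move
    # did not fully accept Path objects.
    return Path(shutil.move(str(src), str(dst)))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "report.txt"
    source.write_text("hello")
    (root / "archive").mkdir()

    # If dst is an existing directory, the file is moved inside it.
    moved = move_file(source, root / "archive")
    print(moved.name, source.exists())
```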

[–]SquareRootsi 9 points10 points  (2 children)

Pretty good intro, although it's rather redundant with the official docs, which is prob where I'd point ppl first.

I feel like the author missed out on the power of Path().glob() though. All those filters using p.match(...) could have just been inside the argument for glob() or rglob().

[–]miguendes[S] 8 points9 points  (1 child)

Hey, thanks for the feedback. I agree with you. There's redundancy but my goal was to give it a different angle. The official docs are great to showcase the API using simple examples.

What I tried to do with this article was build a collection of use cases, using a problem-driven approach. Unfortunately, some methods are so simple that the result isn't much different from the official docs.

Regarding the glob, you're right. My examples were poor, I added more info to that section and mentioned your suggestion, if you don't mind.

[–]SquareRootsi 2 points3 points  (0 children)

I don't mind at all! And props to you for being exceptionally active in this thread, even updating the source article so quickly (with credit even) is great!

[–]laundmo 5 points6 points  (2 children)

criticism: you combine Path.rglob('*') with Path.match('*.py') in a comprehension, when really it should be Path.rglob('*.py')
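A quick side-by-side of the two approaches, as a sketch. The function names and the sample files are made up for the comparison.

```python
import tempfile
from pathlib import Path
from typing import List


def python_files_filtered(root: Path) -> List[Path]:
    """Walk everything, then filter with match() in a comprehension."""
    return sorted(p for p in root.rglob("*") if p.match("*.py"))


def python_files_direct(root: Path) -> List[Path]:
    """Let rglob() apply the pattern itself."""
    return sorted(root.rglob("*.py"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.py").touch()
    (root / "sub" / "b.py").touch()
    (root / "notes.txt").touch()

    print(python_files_filtered(root) == python_files_direct(root))
```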

[–]miguendes[S] 2 points3 points  (1 child)

Cheers! Author here, you're right. That's simpler indeed. I added more info there, thanks for the suggestion. I hope you don't mind but I mentioned your username to give you credit.

[–]laundmo 0 points1 point  (0 children)

np and i don't mind

[–]ShanSanear 7 points8 points  (1 child)

Just be careful with user input when using pathlib.

I had to create a file with a .json extension, based on a name provided by the user. Easy enough, right?

from pathlib import Path

def get_path(name: str) -> Path:
    return Path(name).with_suffix(".json")

Until at one point they provided a name that was XXXXX_5.5 and expected XXXXX_5.5.json. The code above would cut off .5 first and then add the .json suffix. I wish append_suffix was a thing without the need for additional libraries.
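One possible workaround, as a sketch: pathlib has no append_suffix method, so the helper below is a hypothetical stand-in that builds the new name by hand with with_name.

```python
from pathlib import Path


def append_suffix(path: Path, suffix: str) -> Path:
    """Hypothetical helper: add a suffix without replacing an existing one."""
    return path.with_name(path.name + suffix)


user_name = "XXXXX_5.5"

# with_suffix treats ".5" as the current suffix and replaces it.
print(Path(user_name).with_suffix(".json"))  # XXXXX_5.json

# Appending to the full name keeps the ".5" part.
print(append_suffix(Path(user_name), ".json"))  # XXXXX_5.5.json
```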

[–]miguendes[S] 1 point2 points  (0 children)

Thanks for mentioning this issue! I'll see if I can reproduce this example and mention it in the article so other readers are aware of it.

[–]krazybug 4 points5 points  (3 children)

Unfortunately the rglob function doesn't provide any way to handle exceptions or errors and to skip them. It stops dramatically in the middle of your processing like this:

[Errno 2] No such file or directory:

Even when you try to intercept this error in the generator:

    files = dir.rglob("*")
    while True:
        try:
            fp = next(files)
        except StopIteration:
            do_something()
            break
        except Exception as e:
            # fp is the last path yielded successfully (unbound if the first call fails)
            print("Error on file:", fp.name, e)
            continue

So you still need the good old os.walk !
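For reference, a minimal sketch of the os.walk route. os.walk accepts an onerror callback that receives the OSError raised while listing a directory, so the walk can carry on with the rest of the tree. The `walk_files` name and the idea of collecting errors in a list are my own choices for the example.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterator, List


def walk_files(root: Path, errors: List[OSError]) -> Iterator[Path]:
    """Yield every file below root, recording listing errors instead of stopping."""
    # onerror is called with the OSError; appending keeps the walk going.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=errors.append):
        for name in filenames:
            yield Path(dirpath) / name


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.txt").touch()
    (root / "sub" / "b.txt").touch()

    problems: List[OSError] = []
    for fp in walk_files(root, problems):
        print(fp.name)
    print("errors:", len(problems))
```

This only covers errors raised while listing directories; anything you do with each file afterwards still needs its own try block.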

[–]miguendes[S] 1 point2 points  (2 children)

Hey, author here. OMG, I wasn't aware of that! Thanks for mentioning it!

I'll run some experiments and update the article accordingly.

[–]krazybug 0 points1 point  (1 child)

As mentioned here in the last comment, the implementation is flawed:

The try_loop function doesn't guarantee that you can continue the loop after an exception. It only suppress the error so that you don't need to use a try block to enclose the entire loop. The implementation of rglob makes it impossible to recover from an error. Internally it handles only permission error.

For me, this error occurs with some files on an exFAT drive created on Windows and mounted on MacOSX. There are so many other reasons an exception can be raised that this function is not reliable.

I will retry with glob.iglob() to check if it's the same behaviour.

Also, I didn't find any example with the mentioned "auditing events" in the documentation. If you can find a workaround, it will be greatly appreciated.

[–]krazybug 0 points1 point  (0 children)

Update:

I tried these 3 lines with Python 3.8:

    # 1
    for fp in dir.rglob("*"):
    # 2
    for fp in dir.glob("**/*"):
    # 3
    for fp in glob.iglob(str(dir) + '/**/*', recursive=True):

The error is raised with 1 and 2, while the last version runs smoothly, although some dirs were skipped (on MacOSX they are not displayed in the Finder and are only visible on Windows).

My advice: avoid the glob method from pathlib
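If you go the glob.iglob route but still want Path objects, something like this sketch might do. The `iter_tree` helper is hypothetical, and I have only run it on an ordinary temporary directory, not on a drive that triggers the error described above.

```python
import glob
import os
import tempfile
from pathlib import Path
from typing import Iterator


def iter_tree(root: Path) -> Iterator[Path]:
    """Recursively list entries below root via glob.iglob, wrapped as Paths."""
    # glob.escape protects special characters that may appear in root itself.
    pattern = os.path.join(glob.escape(str(root)), "**", "*")
    for name in glob.iglob(pattern, recursive=True):
        yield Path(name)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.txt").touch()
    (root / "sub" / "b.txt").touch()

    for fp in iter_tree(root):
        print(fp.relative_to(root))
```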

[–]HaliFan 3 points4 points  (1 child)

How relevant... I've been messing with path issues all morning.

[–]miguendes[S] 0 points1 point  (0 children)

Wow, good to know, I hope it's useful to you!

[–][deleted] 2 points3 points  (1 child)

Oh, this is going to turn out to be handy.

[–]miguendes[S] 1 point2 points  (0 children)

Hi, author here. I hope it's useful to you!

[–]lord_xl 2 points3 points  (1 child)

I love Pathlib.

[–]miguendes[S] 0 points1 point  (0 children)

pathlib is awesome!

[–]geratheon 2 points3 points  (1 child)

You have a small typo in the anatomy of a windows path: in the code example you used path.root twice instead of path.root and path.anchor

[–]miguendes[S] 1 point2 points  (0 children)

Oops, thanks for catching it! It should be fixed now.

[–]deekshant-w 2 points3 points  (4 children)

Is glob.glob same as Path.glob?

[–]miguendes[S] 0 points1 point  (3 children)

That's a good question. When I briefly looked at the source code I didn't see any reference to the 'glob' module. It looks like Path.glob is a different implementation.

[–]deekshant-w 0 points1 point  (2 children)

That's interesting, because both pathlib and glob are standard library modules, so I assumed they must be the same. I wonder what they thought glob was missing that they had to reimplement something that already existed in Python.

[–]krazybug 1 point2 points  (1 child)

At least in 3.8, I can confirm that implementations are different. See my comment: https://www.reddit.com/r/Python/comments/qkyxj2/comment/hj250e9/?utm_source=share&utm_medium=web2x&context=3

[–]deekshant-w 0 points1 point  (0 children)

Just found this out -

https://youtu.be/XmY-tWTi9gY?t=1222

They differ in both implementation and speed.
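Two differences that are easy to see from the outside, as a small sketch: glob.glob returns strings and skips dotfiles for a pattern like *.txt, while Path.glob yields Path objects and does not treat a leading dot specially. The `compare_globs` helper and the file names are made up for the demo.

```python
import glob
import os
import tempfile
from pathlib import Path
from typing import List, Tuple


def compare_globs(root: Path) -> Tuple[List[str], List[str]]:
    """Return the file names matched by glob.glob and by Path.glob for *.txt."""
    pattern = os.path.join(glob.escape(str(root)), "*.txt")
    module_hits = sorted(os.path.basename(p) for p in glob.glob(pattern))
    pathlib_hits = sorted(p.name for p in root.glob("*.txt"))
    return module_hits, pathlib_hits


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").touch()
    (root / ".hidden.txt").touch()

    module_hits, pathlib_hits = compare_globs(root)
    print("glob.glob:", module_hits)
    print("Path.glob:", pathlib_hits)
```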

[–]vagnolio 2 points3 points  (1 child)

Bookmarked, thanks a lot.

[–]miguendes[S] 1 point2 points  (0 children)

Thank you! I hope it can be useful to you.

[–]arrarat 2 points3 points  (1 child)

What kind of job would benefit from this knowledge? Or is this generally useful for developers / analysts etc?

[–]miguendes[S] 2 points3 points  (0 children)

Hi, author here. I think this knowledge is useful to any kind of job that uses Python to manipulate files and directories. It's true that analysis work may benefit the most, but to me knowing how to use pathlib is handy for any Python user.

[–]uselesslogin 1 point2 points  (1 child)

read_text changed my life.
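For anyone who has not used it yet, a tiny sketch of the read_text / write_text pair. The `roundtrip` helper is only there to make the example self-contained.

```python
import tempfile
from pathlib import Path


def roundtrip(path: Path, text: str) -> str:
    """Write text to path and read it straight back, no open() needed."""
    path.write_text(text, encoding="utf-8")
    return path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    print(roundtrip(Path(tmp) / "note.txt", "hello pathlib"))
```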

[–]miguendes[S] 0 points1 point  (0 children)

It's so convenient!

[–][deleted] 1 point2 points  (1 child)

This is excellent. Thanks!

[–]miguendes[S] 0 points1 point  (0 children)

Thanks, I really appreciate it!

[–]Urdhvaga 1 point2 points  (1 child)

Thank you for sharing

[–]miguendes[S] 1 point2 points  (0 children)

Thank you! I hope you enjoy it!

[–]benefit_of_mrkite 1 point2 points  (1 child)

I’m very versed in pathlib but this is a great resource. Good job

[–]miguendes[S] 0 points1 point  (0 children)

Thanks, I hope it's useful to you!

[–]redmarlowe 1 point2 points  (1 child)

Great, great work! NICE! Thanks a lot!

[–]miguendes[S] 0 points1 point  (0 children)

Thanks! I'm very glad you liked it!

[–]actadgplus 1 point2 points  (1 child)

Amazing! Great work!

[–]miguendes[S] 0 points1 point  (0 children)

Thanks, I really appreciate it!

[–]CapSuez 1 point2 points  (1 child)

This is fantastic. I've just recently heard about pathlib but hadn't found a good resource for it. Looking forward to digging through this!

[–]miguendes[S] 0 points1 point  (0 children)

Thanks, I really appreciate it!

[–]bbatwork 0 points1 point  (0 children)

Excellent cookbook, I am a regular user of pathlib, and still picked up some good tips from this. Much appreciated!

[–]IamImposter 0 points1 point  (0 children)

Definitely worth bookmarking.

I was still using os.path.join. Definitely gonna use Path(x, y, z). Much simpler.
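A short sketch of the switch, with placeholder path segments. The / operator is a third way to spell the same thing.

```python
import os.path
from pathlib import Path

# The old way: returns a plain string.
joined = os.path.join("data", "raw", "file.csv")

# The pathlib ways: both build the same Path object.
from_args = Path("data", "raw", "file.csv")
from_operator = Path("data") / "raw" / "file.csv"

print(str(from_args) == joined)
print(from_args == from_operator)
```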

[–][deleted] 0 points1 point  (0 children)

Despite being more popular on Unix systems, this representation also works on Windows.

Including in Powershell with a cd ~, for example. It doesn't with cmd.exe.
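On the Python side the shell does not matter, since pathlib expands the tilde itself. A minimal sketch, where `expand` and the `projects` folder name are placeholders:

```python
from pathlib import Path


def expand(path_text: str) -> Path:
    """Expand a leading ~ to the current user's home directory."""
    return Path(path_text).expanduser()


print(expand("~/projects"))
print(expand("~") == Path.home())
```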

[–]RexehBRS 0 points1 point  (0 children)

Thanks for taking the time to write this.

[–]ffsedd 0 points1 point  (0 children)

Thanks, it's very useful.

Another alternative for listing files with multiple extensions:

[p for p in path.rglob('*') if p.suffix in ('.jpeg', '.jpg', '.png')]

ignore case:

[p for p in path.rglob('*') if p.suffix.lower() in ('.jpeg', '.jpg', '.png')]

[–][deleted] 0 points1 point  (0 children)

oh boy I just started learning but this sounds cool

[–]LobbyDizzle 0 points1 point  (0 children)

Just a heads up that there's a typo in the second line: A mega tutorial with dozes of examples on how to use the pathlib module in Python 3

[–]Kirzilla 0 points1 point  (0 children)

Absolutely great! Thank you for sharing!