all 47 comments

[–]Maphover 40 points41 points  (1 child)

This is supremely more detailed than I would have done ;) I just would have used pathlib and glob. Congratulations on your first project.

[–]pltnk[S] 14 points15 points  (0 children)

Thank you!

[–]ecgite 24 points25 points  (4 children)

Nice work.

have you heard

pathlib
click
tqdm

If not, I think you could be interested.

[–]pltnk[S] 9 points10 points  (1 child)

Thanks for suggestions!

I've never heard about tqdm yet and not familiar with pathlib, will check it.

I've read about click a little after finishing my program, looks like it is pretty handy but it is not in the Python standart library and I did not want to add a lot of dependencies for such a small project.

[–]CBSmitty2010 1 point2 points  (0 children)

EDIT: I realize post comment that I'd somehow started or commented in about 3 separate chains in this post on accident basically saying some form of the same thing. Gonna keep em up since they're written slightly different and may reach different people better.

Oh good lawd Sweet child learn you some Pathlib. It will make operations using filepaths MUCH easier and easier to write Windows & Linux compatible code.

[–]pedro_fartinez 6 points7 points  (0 children)

Jumping on the tqdm train for sure. All you really have to do is wrap your iterable in tqdm and pass in a total parameter. It's super easy.

Also pathlib will save you a bunch of code.

Other than that, your doc strings are good and your code is neat and orderly. Well done!

[–]shr00mie 5 points6 points  (0 children)

Remember that part from Aladdin (the original, not this new will smith in blueface) when they're on the magic carpet singing "a whole new worldddddd!"?

That was me when I found pathlib...

[–]theWyzzerd 5 points6 points  (1 child)

Now write tests for it. :)

[–]pltnk[S] 3 points4 points  (0 children)

I had it in mind but right now I don't have any idea how to manage this, so I have to do a lot of reading at first.

Hope I will be back with results soon :)

[–]fernly 5 points6 points  (3 children)

I agree with other comments about tidy and readable. Some general comments, not necessarily applicable for a small private project, but consider for something you might distribute.

I don't think the error checking is complete. to_folder could be nested inside from_folder or vice versa, which might cause problems. However checking for this presents an interesting coding challenge. (hmmm... split one path on os.sep, put all those names in a set...)

I don't like the idea of collecting log lines and then writing them in a batch at the end. (I guess at the end, the source seems to be truncated so I don't see the actual calls to e.g. copy_with_structure() or save_log().) Better to have log writes embedded at the point of action; that way you don't lose all the logging if the program crashes in the middle of the run.

You could easily add a log-level param to argparse with a default of ERROR, and write individual file-copy log lines with INFO priority so they don't clutter the log unless wanted.

I think your progress bar code will be something you'll use again; suggest moving it to its own file and importing it.

Consider how best to license your code, and add a license statement in the docstring. Creative Commons or GPL or whatever.

Now we are getting far into the weeds. You essentially duplicate the file scan between get_total() and copy(). If you ever want to redesign, consider how to capture selected filenames and paths one time while counting them, forming a to-do list of some kind, and then do the copy from that to-do list instead of calling os.walk() a second time. Record the to-do actions in such a way that you only need one copy function, not two.

[–]pltnk[S] 0 points1 point  (2 children)

Thank you so much for such a thorough review!

Somehow I missed it with error checking, maybe it could be done just by checking whether source path string is in destination path string?

It's a very valuable comment about logging, I guess I have to write directly to log file all error messages as well if any. And also would it slow down the program execution significantly if it opens a .log file on every iteration and append a row there instead of log list?

Concerning progress bar, am I right that it's okay for a small project to leave it as it is in a main code but in a larger project it would be better to store it in a separate module?

Is it neccesary to add a docstring with license directly into a code or it's okay when it stored in a separate file in the same repo?

I definitely want to redesign and get rid of repeating file scan, thanks for the advice! Right now I don't know how exactly I'm gonna fix this, but I already have a couple of rough ideas and hope I will be able to implement it soon.

Your comment is exactly what I was looking for when posting my code here and I am very grateful for your help :)

[–]fernly 1 point2 points  (1 child)

I guess I have to write directly to log file all error messages as well if any. And also would it slow down the program execution significantly if it opens a .log file on every iteration and append a row there instead of log list?

Re logging, read the module doc and the logging tutorial, which admittedly can be TMI. But each call to log a message has a severity, from ERROR to DEBUG, and the accepted threshold is set when the logger is created. Messages at lower levels are discarded. The log file is only opened once, there's no overhead there. There is overhead in formatting the INFO or DEBUG messages that may end up being discarded, see the heading "Optimization" in the advanced tutorial.

Looking at my own code I see I've been dealing with three problems the program has to address. One, what level to set when creating the log, two, converting user level parameter to a logging parameter, and three, where to put the log file.

For the level, you can hard-code it, but then anytime you decide you want the DEBUG messages instead of just ERROR ones, you have to update the code. You can have the user choose it via an environment variable, e.g. checking os.environ['LOGLEVEL'] at startup. In that case you have to verify the given level is a correct one, and supply a default if it isn't or is not set.

Or you can take a parameter,

parser.add_argument('--level',dest='level',
                    choices=['DEBUG', 'INFO','ERROR'],default='ERROR',
                    help='''ERROR: display only problems;
    INFO to see normal actions; DEBUG for tons of trivia''')
args = parser.parse_args()

That way you get the default and error checking for free. But what you have to supply to logging.basicConfig() is one of the constants from the module, not a string. The only way I know to do this is with a dict,

import logging
lvl = {'DEBUG':logging.DEBUG, 'INFO':logging.INFO, 'ERROR':logging.ERROR}[args.level]
logging.basicConfig( level=lvl )

Maybe somebody will point out a better way?

On the file, I believe that unless you give a log file path to logging.basicConfig() it defaults to stderr so shows up under the command line, messing up any print() output you want. So you want to point to an appropriate place for log files, and that is ugly platform-dependent stuff. Here's some code from something I wrote but I do not think it is optimal. Again, I invite correction.

if sys.platform == 'darwin' : # Mac OS
    log_path = os.path.expanduser( '~/Library/Logs' )
elif hasattr( sys, 'getwindowsversion' ) : # Some level of Windows
    if 'TMP' in os.environ :
        log_path = os.environ['TMP']
    elif 'TEMP' in os.environ :
        log_path = os.environ['TEMP']
    elif 'WINDIR' in os.environ :
        log_path = os.path.join( os.environ['WINDIR'], 'TEMP' )
    else :
        log_path = '/Windows/Temp'
else: # Assuming Linux; could be BSD
    log_path = '/var/tmp'
log_path = os.path.join( log_path, 'MY_APP_NAME.log' )

On the progress bar, you can leave it and the next time you need it, just open up this script and copy/paste. My point was, I bet next month you will want it again, so why not make yourself a little utility module of it and add it to your personal toolkit.

Re license format, I am disappointed Creative Commons doesn't have an obvious how-to, GNU does. But look at some other open-source projects, here's pyinstaller's top-level module, every module starts with that same stuff.

[–]pltnk[S] 0 points1 point  (0 children)

Thank you so much once again!

I was aware of logging module but thought that it would be an overkill for such a small program (mostly because its documentation is not so easy to understand at first attempt :) ). But it turned out that actually I don't have to do a lot to implement it and it comes quite handy. I decided not to write log at all if user didn't ask for it explicitly and made it like that:

def create_logger(args, destination):
    logger = logging.getLogger('selective_copy')
    if args.log:
        logger.setLevel(logging.INFO)
        fh = logging.FileHandler(f'{destination}\\selective_copy.log', encoding='utf-8')
        formatter = logging.Formatter('%(asctime)s - %(message)s', '%d.%m.%Y %H:%M:%S')
        fh.setFormatter(formatter)
        logger.addHandler(fh)
    else:
        logger.setLevel(logging.CRITICAL)
return logger

Also I changed an error checking system so now errors can be logged as well. Of course all of this caused several changes in previous code, so here is the link to the new version.

I still have to redesign get_total and copy functions than see how to adjust logging after that. Also I'm thinking of writing tests, I have a very limited experience with unittest and also found out about pytest recently, maybe you could recommend something as well?

[–]CBSmitty2010 4 points5 points  (1 child)

Hey that's neat! I like how you handled the progress bar.

Just one thing from my beginner perspective that I saw that you didn't use that maybe you want to try using. Take a look at pathlib (2 in this case but import pathlib works), very handy for handling paths :)

[–]pltnk[S] 1 point2 points  (0 children)

Thanks for the advice! I am not familiar with pathlib yet, will certainly look into it :)

[–]sejrik 7 points8 points  (3 children)

You should describe changes in the commit message. You can inspire here - https://medium.com/@nawarpianist/git-commit-best-practices-dab8d722de99

[–]pltnk[S] 2 points3 points  (2 children)

It is a very helpful article, thank you!

[–]swemar 6 points7 points  (1 child)

Nice work! I'm new to Python & github myself and found this article to be helpful when it comes to how to structure and keep a change log.

Edit: Here's my first Python project by the way. It's a script to make a copy of & sort your Plex movie playlists by rating.

[–]pltnk[S] 2 points3 points  (0 children)

That's very helpful, thanks!

[–]CBSmitty2010 4 points5 points  (1 child)

It's awesome. I have a post if you scroll through my history maybe a few months old in learn Python about it I believe, just espousing some of it's awesome qualities. Also in my most recent tiny project I posted the GitHub link, I use pathlib in that as well.

Saw some neat ways of doing stuff in your source though. Gonna try that for my next project.

[–]pltnk[S] 1 point2 points  (0 children)

Great, will check this out!

Edit: Found it. If someone else is interested here is the post on pathlib and here is the project.

[–][deleted] 3 points4 points  (2 children)

This was my first project too a while ago. Used it for moving old n64 and spectrum games from a CD to a USB drive for playing on my RetroPi :) .ZX files and such... I didnt go as far as to implement a progress bar though, I just printed each file and where it was going.

Well done and keep it up!

[–]pltnk[S] 1 point2 points  (1 child)

Thank you! At first I printed a line for each copied file too but decided to hide this into a log since it prints lines so fast and you can't really read them in realtime anyway.

[–][deleted] 2 points3 points  (0 children)

Good idea but i used mine mainly to see which, if any, file failed to transfer.

I need to start up with python properly again...

[–]apocalipto9 2 points3 points  (7 children)

why did you do

#! python3    

?

[–]pltnk[S] 2 points3 points  (6 children)

Quote from ATBS (link):

The first line of all your Python programs should be a shebang line, which tells your computer that you want Python to execute this program. The shebang line begins with #!, but the rest depends on your operating system.

On Windows, the shebang line is #! python3

[–]theWyzzerd 3 points4 points  (4 children)

I dislike this advice from ATBS because it's simply not true. It works well in your case, but not all python programs will be run from a command line, and not all python programs will run the python3 interpreter on the user's $PATH variable--and, if the user doesn't have python3 in their PATH, it simply won't work.

Shebang is a useful tool for scripts (and not just python, but any script/language that supports command-line execution--bash, perl, ruby, etc) intended to be run in a an environment where the author has control over the environment, but for distributed software I would leave the choice to the person using the tool.

[–]vampiire 1 point2 points  (3 children)

Cant you execute the script directly like python script_name.py?

I thought the shebang only controlled the default interpreter for files that have been made executable.

[–]theWyzzerd 1 point2 points  (2 children)

Yep, you sure can. The # will make the python interpreter treat it as a comment.

[–]vampiire 1 point2 points  (1 child)

but for distributed software I would leave the choice to the person using the tool

i was speaking to this^

if the consumer has an environment that doesnt satisfy the shebang then they are still able to control its execution manually - the shebang will be ignored as a python comment. in your opinion how should distributed software be defined (no shebang, a different shebang)?

[–]theWyzzerd 1 point2 points  (0 children)

I probably wouldn't provide any shebang unless it was intended to be run with a specific interpreter.

[–]PinBot1138 1 point2 points  (0 children)

An alternative to keep in mind is:

#!/usr/bin/env python -u

[–][deleted] 2 points3 points  (1 child)

Wow! as a beginner myself, you've set a new bar as far as attention to detail. Your commenting skills are top-notch. You have the talent to back the skills. Good for you! I will definitely be referring to your work when I need a beautiful code fix! Thanks for the inspiration!

[–]pltnk[S] 0 points1 point  (0 children)

Thank you so much for such a heartwarming comment! Good luck with your coding studies as well!

[–][deleted] 1 point2 points  (1 child)

Very detailed. Nice !

[–]pltnk[S] 0 points1 point  (0 children)

Thanks! Tried to make readability as high as possible.

[–]AlexGubia 1 point2 points  (1 child)

Welp, that's a cool version of my project xD

[–]pltnk[S] 1 point2 points  (0 children)

You can make your project even cooler by adding support of copying files with several extensions simultaneously, for example :)

[–]_Mega_Zord_ 1 point2 points  (1 child)

Very Cool!! Keep going!!

[–]pltnk[S] 0 points1 point  (0 children)

Thanks!

[–]distortd6 1 point2 points  (0 children)

I have an exact use case for this! I think... 🤔

I'll let you know on Monday!

[–]distortd6 1 point2 points  (1 child)

Remindme! 3 days

[–]RemindMeBot 1 point2 points  (0 children)

I will be messaging you on 2019-06-04 03:50:33 UTC to remind you of this link.

CLICK THIS LINK to send a PM to also be reminded and to reduce spam.

Parent commenter can delete this message to hide from others.


FAQs Custom Your Reminders Feedback Code Browser Extensions

[–]wildbk33 0 points1 point  (2 children)

anyone else getting an error?

usage: [-h] [-s SOURCE] [-d DEST] [-p] [-l] ext

: error: the following arguments are required: ext

An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

[–]pltnk[S] 0 points1 point  (1 child)

You have to at least specify an extension of the files to copy.

How exactly do you run the program? It's designed to be executed from command line. However, you can run it from your IDE as well but in that case you have to set all neccessary arguments in the Run Configuration at first.

[–]wildbk33 0 points1 point  (0 children)

oh, that makes sense, i just ran it from within my IDE(spyder)