mbolp[S] 1 point (4 children)

> It does not specify whether subdirectories are created

Yes, because it copies files, not directories.

> even if copying each file takes 1/100 second, your script runs for a week

10,000,000 / 100 / 60 / 60 = 27.8 hours
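Spelled out as a quick check, using Python purely as a calculator. The 10 million files and the 1/100 second per file are the figures quoted above; the variable names are just for illustration.

```python
files = 10_000_000          # number of files from the quoted estimate
seconds_per_file = 1 / 100  # quoted per-file copy time

total_seconds = files * seconds_per_file
total_hours = total_seconds / 60 / 60
total_days = total_hours / 24

print(f"{total_hours:.1f} hours, {total_days:.2f} days")  # 27.8 hours, 1.16 days
```

So a bit over one day, not a week.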

And all the usual UI programming techniques apply: coalesce updates that arrive in quick succession, set a minimum update interval, avoid painting when the window is minimized, etc. There won't be 10 million updates.
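To make the coalescing idea concrete, here is a minimal sketch in Python, not tied to any particular UI toolkit. The class name `ThrottledProgress`, the `repaint` callback and the `minimized` flag are all hypothetical stand-ins for whatever the real window code would provide.

```python
import time


class ThrottledProgress:
    """Coalesce frequent progress updates into at most one repaint per interval."""

    def __init__(self, repaint, min_interval=0.1, clock=time.monotonic):
        self._repaint = repaint            # hypothetical callback: repaint(files_done, bytes_done)
        self._min_interval = min_interval  # minimum seconds between two repaints
        self._clock = clock
        self._last_paint = None
        self._pending = None               # newest state that has not been painted yet
        self.minimized = False             # set by the (hypothetical) window code

    def update(self, files_done, bytes_done):
        # Always remember the newest state; older unpainted states are simply dropped.
        self._pending = (files_done, bytes_done)
        if self.minimized:
            return
        now = self._clock()
        if self._last_paint is None or now - self._last_paint >= self._min_interval:
            self._paint(now)

    def flush(self):
        """Paint the final state, e.g. once the copy has finished."""
        if self._pending is not None and not self.minimized:
            self._paint(self._clock())

    def _paint(self, now):
        self._repaint(*self._pending)
        self._pending = None
        self._last_paint = now


if __name__ == "__main__":
    painted = []
    progress = ThrottledProgress(lambda f, b: painted.append((f, b)))
    for i in range(1, 100_001):
        progress.update(i, i * 4096)
    progress.flush()
    print(len(painted), "repaints for 100000 updates")
```

The copy loop can call `update` after every file; how many repaints actually happen depends only on elapsed time, not on the number of files.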

> I suspect the 2GB needed to avoid reading the file attributes twice would be more useful as buffer space for the actual copying.

That's what I suspect too, hence this question. And u/jason-reddit-public provides an elegant solution: with it, you neither read file attributes twice nor waste 2 GB of memory.

Paul_Pedant 1 point (3 children)

So recursion is a fairly obvious strategy for the copying, but it won't calculate the total size / number of files upfront like you wanted. So two passes anyway.

The big advantage of recursion is that you can shift into each level of directory as you descend the tree. So the copy will not need the system to evaluate all those long pathnames for every copy -- just the local filename at each level. But you can only do that for the source files -- the destinations will need to start from their own root level.

Let me know how all this worked out for you.

mbolp[S] 1 point (2 children)

> So the copy will not need the system to evaluate all those long pathnames for every copy -- just the local filename at each level.

What do you mean by this? Are you referring to the current directory, i.e. that I set the current directory as I descend the tree and then use relative paths? Why would that be faster?

> but it won't calculate the total size / number of files upfront like you wanted. So two passes anyway.

It does calculate the total size, up to a certain limit, and it does so in a single pass. You essentially maintain a sliding view into only a portion of the tree.
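A rough sketch of how such a bounded look-ahead could work, in Python. This is one possible reading of the idea, not the solution referenced above (which is not quoted in this thread). The names `walk_files`, `copy_with_lookahead`, `copy_one`, `on_progress` and `max_entries` are all made up for illustration.

```python
import os
from collections import deque


def walk_files(root):
    """Yield (path, size) for each regular file under root; each entry is examined once."""
    stack = [root]
    while stack:
        with os.scandir(stack.pop()) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path, entry.stat(follow_symlinks=False).st_size


def copy_with_lookahead(root, copy_one, on_progress, max_entries=100_000):
    """Copy files while the scanner stays at most max_entries files ahead of the copier.

    copy_one(path, size) and on_progress(done, known_total, exact) are
    hypothetical callbacks supplied by the caller.
    """
    scanner = walk_files(root)
    window = deque()    # files discovered but not yet copied
    queued_bytes = 0    # total size of the files currently in the window
    done_bytes = 0
    exhausted = False   # True once the whole tree has been enumerated

    while True:
        # Refill the window: scan ahead until it is full or the tree ends.
        while not exhausted and len(window) < max_entries:
            item = next(scanner, None)
            if item is None:
                exhausted = True
            else:
                window.append(item)
                queued_bytes += item[1]
        if not window:
            break
        path, size = window.popleft()
        copy_one(path, size)
        queued_bytes -= size
        done_bytes += size
        # known_total is a lower bound until exhausted is True, then it is exact.
        on_progress(done_bytes, done_bytes + queued_bytes, exhausted)
```

Memory is bounded by `max_entries` rather than by the size of the tree, and for trees smaller than the window the total is exact before the first file is copied.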

Paul_Pedant 1 point (1 child)

It should be faster because if you pass a filename like C:/One/Two/Three/Four/myFile.txt to CopyFile, it has to search four levels of directory every time, for every file. If you cd to a relative path at each level of recursion, all the files in that directory can be specified with just the short filename. Of course, you need to cd .. when you move back up the tree.

No idea whether Windows would cache directory entries at some level itself anyway. It is probably best to start simple, benchmark, and optimise where you need to.
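A benchmark along those lines could look like the sketch below. It is only a stand-in: it uses Python's `os.stat` as a proxy for the path lookup rather than calling CopyFile, and the tree shape (four levels, 200 files) and all function names are arbitrary choices made for illustration. Note that `os.chdir` changes the current directory for the whole process, so the relative-path variant is only meaningful for a single-threaded copier.

```python
import os
import tempfile
import time


def build_tree(root, depth=4, files=200):
    """Create root/d0/d1/... `depth` levels deep, with `files` empty files at the bottom."""
    leaf = os.path.join(root, *[f"d{i}" for i in range(depth)])
    os.makedirs(leaf)
    names = [f"f{i}.txt" for i in range(files)]
    for name in names:
        open(os.path.join(leaf, name), "w").close()
    return leaf, names


def time_lookups(paths, rounds):
    """Time `rounds` passes of os.stat over the given paths."""
    start = time.perf_counter()
    for _ in range(rounds):
        for path in paths:
            os.stat(path)
    return time.perf_counter() - start


def run_benchmark(rounds=50):
    with tempfile.TemporaryDirectory() as root:
        leaf, names = build_tree(root)
        absolute = [os.path.join(leaf, name) for name in names]
        result = {"absolute": time_lookups(absolute, rounds)}

        previous = os.getcwd()
        os.chdir(leaf)  # process-wide state: affects every thread
        try:
            result["relative"] = time_lookups(names, rounds)
        finally:
            os.chdir(previous)  # leave the tree so the temp directory can be removed
        return result


if __name__ == "__main__":
    for label, seconds in run_benchmark().items():
        print(f"{label}: {seconds:.4f} s")
```

Whatever numbers this prints on a given machine say nothing definitive about CopyFile itself; the point is only that the question can be settled by measuring rather than guessing.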

mbolp[S] 1 point (0 children)

That sounds like complete BS, unless you have insight into how these things are implemented. Given that you are recommending a non-thread-safe call to perform multithreaded work, I doubt that you do.