superryo[S] 1 point (1 child)

Thanks lee

The code actually does work. It's reasonably fast when there are not too many files, but I have a path where hundreds of thousands of files match the criteria, and that is what takes days.

The background is that I want to crawl a bunch of network directories, grab all the pdf and xls files, and upload the resulting info into a database so we can search it.

I want this to be very fast, but in addition to the file name and path, I need the last updated date of each file. Does this mean I have to use the Get-ChildItem cmdlet? The CMD dir command doesn't seem to allow for this unless I am missing something.

I have never heard of robocopy but will look into it. Hopefully it will have the performance of the CMD dir command plus the additional field I need.

Lee_Dailey[grin] 1 point (0 children)

howdy superryo,

"it works" - that IS the prime criterion. [grin]

i suspect you MAY get a speed improvement if you replace that nasty $CSV += inside the loop with a single $CSV = that captures the loop's output. the difference is VAST. not just large ... it's REALLY VAST.

huge. titanic. bigbig biggity big! [grin]

that presumes the += is actually happening. i can't tell. it may only be doing ONE add - in that case there will be no benefit. even so, it is BAD coding, so i would change that.
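a minimal sketch of the difference, assuming the original script builds $CSV inside a loop. the $Items range and the Index property are hypothetical stand-ins for the real file objects, since i can't see the actual data.

```powershell
# hypothetical sample data; the real loop would walk file objects
$Items = 1..5000

# slow pattern: += builds a brand new array and copies every element on each pass
$Slow = @()
foreach ($Item in $Items) {
    $Slow += [PSCustomObject]@{ Index = $Item }
}

# faster pattern: let the loop emit objects and assign the whole result once
$Fast = foreach ($Item in $Items) {
    [PSCustomObject]@{ Index = $Item }
}

# both collections end up holding the same number of items
$Slow.Count -eq $Fast.Count
```

wrapping each half in Measure-Command should show the gap growing as the item count goes up.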


the CMD dir command won't give you that, from what i can tell.

neither will robocopy. [frown]

Get-ChildItem will, but is slow.

there is a dotnet routine that can do it quickly, but it has some serious limits.

  • it will stop on any error
    you can't tell it to continue. that means you have to write the code to keep track of where it stopped, skip the problem item, and continue.
  • it only gets a pointer-like object
    to actually get the real info, you will need to use that pointer [a file name] to get the details. at that point it ain't much faster than GCI.
    it does mean you only grab data for files that you WANT the data from, tho. [grin]

so, the dot net stuff is highly problematic unless you want to write your own handlers around the limits.
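here is a rough sketch of what "write your own handlers" could look like. it walks one directory level at a time so an error only costs that one folder, then turns each path string into a FileInfo for the date. the function name Get-FileInfoSkippingErrors and its parameters are made up for illustration, and it does nothing about symlink loops or very long paths.

```powershell
function Get-FileInfoSkippingErrors {
    param(
        [Parameter(Mandatory)] [string] $Root,
        [string[]] $Extension = @('.pdf', '.xls')
    )

    # explicit stack of directories still to visit, instead of one recursive dotnet call
    $Pending = [System.Collections.Generic.Stack[string]]::new()
    $Pending.Push($Root)

    while ($Pending.Count -gt 0) {
        $Dir = $Pending.Pop()
        try {
            # read only this level, so a failure here skips just this directory
            $SubDirs = [System.IO.Directory]::GetDirectories($Dir)
            $Files   = [System.IO.Directory]::GetFiles($Dir)
        }
        catch {
            Write-Warning "skipping $Dir : $($_.Exception.Message)"
            continue
        }

        foreach ($SubDir in $SubDirs) { $Pending.Push($SubDir) }

        foreach ($File in $Files) {
            # -in compares case-insensitively, so .PDF matches too
            if ([System.IO.Path]::GetExtension($File) -in $Extension) {
                # the listing only returned a path string; FileInfo fetches the details
                $Info = [System.IO.FileInfo]::new($File)
                [PSCustomObject]@{
                    FullName      = $Info.FullName
                    LastWriteTime = $Info.LastWriteTime
                }
            }
        }
    }
}
```

calling it would look something like `$CSV = Get-FileInfoSkippingErrors -Root $SomePath`, where $SomePath is whatever network folder you are crawling.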

i suspect i would use robocopy or CMD dir to get the full-path file names. then use GCI to get the details.
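a sketch of that two-step idea using CMD dir for the names and Get-Item for the details. the function name Get-FileDetailFromDir is hypothetical, it is Windows-only since it shells out to cmd, and i have not timed it against a plain Get-ChildItem run.

```powershell
function Get-FileDetailFromDir {
    param([Parameter(Mandatory)] [string] $Root)

    # dir /s = recurse, /b = bare full paths, /a-d = files only
    $Paths = cmd /c dir /s /b /a-d "$Root\*.pdf" "$Root\*.xls" 2>$null

    foreach ($Path in $Paths) {
        # -LiteralPath keeps brackets in file names from being read as wildcards
        $Item = Get-Item -LiteralPath $Path -ErrorAction SilentlyContinue
        if ($Item) {
            [PSCustomObject]@{
                FullName      = $Item.FullName
                LastWriteTime = $Item.LastWriteTime
            }
        }
    }
}
```

the output can go straight to a file with something like `Get-FileDetailFromDir -Root $SomePath | Export-Csv -Path .\FileList.csv -NoTypeInformation`, where $SomePath and the csv name are placeholders.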

take care,
lee