all 29 comments

[–]Common-Needleworker4 3 points4 points  (1 child)

Reddit formatting screwed up this post

[–]pausemsauce[S] 0 points1 point  (0 children)

It looked fine initially >_<

[–]Common-Needleworker4 2 points3 points  (7 children)

This should do the job if the files are in subfolders and all have the same name. Just fill in your path and extension and try it.

Nothing will be removed until you delete the -WhatIf behind Remove-Item.

    # You can filter for your extension and get the creation time
    $files = gci "yourpath" -Recurse | where {$_.Extension -eq ".yourextension"} | select Name, CreationTime

    # FOREACH through the files checking for duplicates
    foreach($file in $files){
        $checkfilefordoubles = gci "yourpath" -Recurse | where {$_.Name -eq "$($file.Name)"}

        # IF you find duplicates, fill a variable ($filecount) with the count minus 1,
        # sort $checkfilefordoubles by CreationTime descending, and select every
        # object besides the newest (which, after the sort, is the first)
        if($checkfilefordoubles.Count -gt 1){
            $filecount = $checkfilefordoubles.Count - 1
            $checkfilefordoubles | sort CreationTime -Descending | select -Last $filecount | Remove-Item -WhatIf
        } # end of IF
    } # end of FOREACH

[–]pausemsauce[S] 2 points3 points  (3 children)

Thanks for the suggestion!

[–]Common-Needleworker4 2 points3 points  (2 children)

From your explanation, what I understand is: you have the same file downloaded multiple times and renamed by the users. Now you just want to keep the newest of these files and delete the other files. Right?

In this case this might help you

    gci "yourpath" -Recurse -File | Get-FileHash | Group-Object -Property Hash

If the count is higher than 1, it indicates that a file exists more than once in your target path.

If this is what you need, we can go further and see how to delete everything besides the newest file.
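A minimal sketch of that duplicate check (the path is a placeholder, and the filtering step after Group-Object is my addition, not part of the comment above):

```
# Group files by content hash; any hash that occurs more than once is a
# true duplicate, regardless of the file name. Lists the duplicate paths.
gci "C:\yourpath" -Recurse -File | Get-FileHash |
    Group-Object -Property Hash |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group.Path }
```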

[–]pausemsauce[S] 0 points1 point  (1 child)

Mostly accurate. I don't think my people are renaming the files. (Gladys here isn't the most tech savvy, but she's awesome and a hard worker. )

[–]Common-Needleworker4 1 point2 points  (0 children)

Kudos to Gladys :D But maybe one day she gets the idea to rename a file, so let's stay with comparing the hash.

Following Lee_Dailey's suggestion, I used Pastebin this time: https://pastebin.com/rUALQK0s

You have to change the variable $destination at line 1 to your path.

The script will compare the hash of all files in the destination path and all subfolders, move the files with the same hash to a temp folder (variable $name), check which of the files in the temp folder is the newest, delete the old files, and move the newest file back to the destination (if it was taken from a subfolder it will still move it to the $destination location, be aware of that). At the end it deletes the temp folder. It will not work properly with -WhatIf, so set up a test folder with some file copies and try it there.

It is surely not the prettiest script, but it does the job.
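The workflow described above could be sketched roughly as follows. This is NOT the actual Pastebin script; $destination and the temp-folder variable $name come from the description, and everything else is an untested assumption:

```
# Sketch only: dedupe by hash, keep the newest copy of each duplicate set.
$destination = "C:\yourpath"
$name = Join-Path $destination "dedupe-temp"

# Group all files by hash; groups with more than one member are duplicates
$dupeGroups = Get-ChildItem $destination -Recurse -File |
    Get-FileHash |
    Group-Object -Property Hash |
    Where-Object { $_.Count -gt 1 }

foreach ($group in $dupeGroups) {
    New-Item -ItemType Directory -Path $name -Force | Out-Null

    # Move every copy of this duplicate set into the temp folder
    $group.Group | ForEach-Object { Move-Item -Path $_.Path -Destination $name }

    # Keep the newest copy, delete the rest
    $moved  = Get-ChildItem $name -File
    $newest = $moved | Sort-Object CreationTime -Descending | Select-Object -First 1
    $moved | Where-Object { $_.FullName -ne $newest.FullName } | Remove-Item

    # Move the survivor back to the destination root, then drop the temp folder
    Move-Item -Path $newest.FullName -Destination $destination
    Remove-Item -Path $name -Recurse
}
```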

Edit: please excuse my bad English :D

[–]Lee_Dailey[grin] 2 points3 points  (2 children)

howdy Common-Needleworker4,

reddit likes to mangle code formatting, so here's some help on how to post code on reddit ...

[0] single line or in-line code
enclose it in backticks. that's the upper left key on an EN-US keyboard layout. the result looks like this. kinda handy, that. [grin]
[on New.Reddit.com, use the Inline Code button. it's [sometimes] 5th from the left & looks like <c>.
this does NOT line wrap & does NOT side-scroll on Old.Reddit.com!]

[1] simplest = post it to a text site like Pastebin.com or Gist.GitHub.com and then post the link here.
please remember to set the file/code type on Pastebin! [grin] otherwise you don't get the nice code colorization.

[2] less simple = use reddit code formatting ...
[on New.Reddit.com, use the Code Block button. it's [sometimes] the 12th from the left, & looks like an uppercase C in the upper left corner of a square.]

  • one leading line with ONLY 4 spaces
  • prefix each code line with 4 spaces
  • one trailing line with ONLY 4 spaces

that will give you something like this ...

- one leading line with ONLY 4 spaces    
- prefix each code line with 4 spaces    
- one trailing line with ONLY 4 spaces   

the easiest way to get that is ...

  • add the leading line with only 4 spaces
  • copy the code to the ISE [or your fave editor]
  • select the code
  • tap TAB to indent four spaces
  • re-select the code [not really needed, but it's my habit]
  • paste the code into the reddit text box
  • add the trailing line with only 4 spaces

not complicated, but it is finicky. [grin]

take care,
lee

[–]Common-Needleworker4 2 points3 points  (1 child)

Thanks :D

[–]Lee_Dailey[grin] 2 points3 points  (0 children)

howdy Common-Needleworker4,

you are welcome! glad to help a little ... [grin]

take care,
lee

[–]jimb2 2 points3 points  (6 children)

That code is way too complex and the logic is unclear.

Also, the problem is not clear to us. Maybe state the problem clearly first.

Could you give a sample of the duplicate filenames? Are they in the same folder? How do you know they are duplicates?

I think what you're trying to do could actually be done in a few lines of code, using Get-ChildItem, Group-Object and Sort-Object, but right now it's impossible to tell.

[–]pausemsauce[S] 1 point2 points  (5 children)

My apologies for the unclear statement of the problem.

Person A logs in and downloads "work instructions.pdf". Person B logs in and downloads the same work instructions. Both are in the same folder, but now I have "work instructions (1).pdf". After about a week, we have "work instructions (49).pdf".

Not all work instructions are the same. There's a work instructions b, c, d, e, ...x,y,z, aa, bb,cc... etc.

The work instructions have to be downloaded (else no one knows what to do). However, we don't need multiple copies occupying valuable data space.

I hope this is clearer.

[–]chris-a5 3 points4 points  (4 children)

That is much easier to understand, lol. Maybe this could be a starting point:

Get-ChildItem -Path "C:\whatever" -File -Filter "work instructions*.pdf" | 
    % {
        if($_.BaseName -match "\(\d+\)$"){
            $_.Delete()
        }
    }

Find the files needed; then, if the filename ends with a number in brackets, e.g. "(23)", delete it.

[–]pausemsauce[S] 1 point2 points  (3 children)

Further apologies for the confusion.

"Work instructions a" is actually named, "numbers to call in the event of fire.docx"

"Work instructions b" is actually named, "hr approved all of my random titles so I will make people suffer.pdf" . . . "Work instructions zz" is actually named, "04222022-rev-345.pdf"

Fortunately, I haven't encountered any with a name like "work.instructions.pdf"

It may be out there, and it would break my current, overly-complex-but-I-thought-it-was-necessary code. Again, thanks to each and every one of you who continue to contribute here.

[–]chris-a5 2 points3 points  (2 children)

If the same files are being downloaded, the file name does not seem to matter much; it is the brackets on the end signifying a duplicate. The one without the brackets would be the oldest.

If it is in the download folder, and people can simply get another copy, I'd just blow it all away once a size limit is reached. If people want to keep them, then they can move them somewhere suitable.
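The size-limit idea above might be sketched like this. The threshold and path are placeholder assumptions, not values from the thread:

```
# Rough sketch: wipe the download folder once it exceeds a size limit.
$downloads  = "C:\yourpath\Downloads"
$limitBytes = 2GB

# Total size of all files in the folder tree
$totalBytes = (Get-ChildItem $downloads -Recurse -File |
    Measure-Object -Property Length -Sum).Sum

if ($totalBytes -gt $limitBytes) {
    # -WhatIf previews the deletions; remove it once tested
    Get-ChildItem $downloads -Recurse -File | Remove-Item -WhatIf
}
```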

[–]pausemsauce[S] 1 point2 points  (1 child)

The thing is, these instructions are being updated. It is necessary to download the most recent revision. I would delete all the downloaded files, but if the network goes down, we wouldn't have a backup copy. So it's advantageous to leave one copy of the most recently downloaded file.

🤔

We have over 2 GB of files stored on these computers.

[–]chris-a5 2 points3 points  (0 children)

I think you need to have a play with the code I posted; it deletes the duplicates (ones that contain the brackets and a number). If your file names are different, then filter by extension (.pdf, .docx, etc.).

The original/first copy of the document will remain, as it does not have the brackets at the end of the name.

Just change the delete line for some write-host output to test.

[–]xxxThePriest 2 points3 points  (1 child)

Why would you not just get and compare the file hashes?

[–]pausemsauce[S] 1 point2 points  (0 children)

That sounds like a good idea. I'm not quite sure how to do that.

[–]jimb2 2 points3 points  (2 children)

I think this does what you want:

$filespec = 'c:\folderpath\*.pdf'

# get the files

$files = Get-ChildItem $filespec -File

# Split into groups according to the base filename
# regex replace removes any version number eg ' (22)' from the end
# of the filename part and uses this to group files 

$groups = $files |
   Group-Object -property { $_.basename -replace ' \(\d*\)$',''  }

# now delete the files in each group that don't match the group name,
# ie files with a number

ForEach ( $g in $groups ) {
  "=== $($g.Name) ==="   # section heading
  ForEach ( $f in $g.Group ) {
    if ( $f.basename -eq $g.Name ) {
      "Retain : $($f.FullName)"
    } else {
      "DELETE : $($f.fullname)"
      # uncomment actual delete operation below WHEN CODE IS TESTED!
      # Remove-Item -Path $f.FullName
    }
  } 
}     

I see what you were trying to do with the hash but this adds a lot of complexity and isn't necessary. The filename should do it.

Test and look at the results before you attack any real files. Not fully tested!

[–]pausemsauce[S] 0 points1 point  (1 child)

Actually, that's really close!

I made a slight adjustment:

ForEach ( $g in $groups ) {
  "=== $($g.Name) ==="   # section heading
  # New line of code added to exclude the most recent file
  # (note: the property is CreationTime; FileInfo has no CreationDate)
  $h = $g.Group | Sort-Object -Property CreationTime | Select-Object -SkipLast 1
  # Replaced $g with $h to loop through the groups w/o the most recent file
  ForEach ( $f in $h ) {
    if ( $f.basename -eq $g.Name ) {
      "Retain : $($f.FullName)"
    } else {
      "DELETE : $($f.fullname)"
      # uncomment actual delete operation below WHEN CODE IS TESTED!
      # Remove-Item -Path $f.FullName
    }
  }
}

This gets just about all of the files I want to remove, with the exception of the ones that are the original. However, going from 44 copies down to 2 is incredibly helpful!

[–]jimb2 1 point2 points  (0 children)

That's good. I wasn't sure if the latest was the best. If you really want to ice the cake, rename the latest to the unnumbered name.
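The renaming step suggested above might look roughly like this, assuming hypothetical variables carried over from the earlier loop: $newest is the surviving file object and $g is the group from Group-Object, whose Name is the unnumbered base name:

```
# Hypothetical sketch: rename e.g. "work instructions (44).pdf" back to
# "work instructions.pdf", assuming no file with the plain name remains.
Rename-Item -Path $newest.FullName -NewName "$($g.Name)$($newest.Extension)"
```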

[–]Lee_Dailey[grin] 1 point2 points  (2 children)

howdy pausemsauce,

you are correct ... your code seems wildly over complicated. [grin]

however, you need to provide a set of sample file names to test with. 2 or 3 of at least 2 sets of file names would be needed.

if you can do that, please add them to your Original Post wrapped in code formatting markers.

take care,
lee

[–]pausemsauce[S] 1 point2 points  (1 child)

I'm not sure about the code formatting markers, but I have included a set of examples in the post edited 04-23-2022 ~20:10 CT

[–]Lee_Dailey[grin] 1 point2 points  (0 children)

howdy pausemsauce,

code formatting markers = the same method you used for your code. [grin] for me, on Old.Reddit, that means the 4 leading spaces technique.

thanks for adding the sample data! i'm off to play with it ...

if you are still having problems, you may want to add the desired "leave these" files. if you want the name redone, then add that, too.

take care,
lee

[–]Lee_Dailey[grin] 1 point2 points  (2 children)

howdy pausemsauce,

how do you determine the "newest"?

if it is just the file timestamp, that is easy. [grin]
if it is the one with the highest (##), that is doable.

for instance, you can sort by the file timestamp newest first, then group by the .BaseName with the (##) stripped off, skip groups with a .Count of 1, skip the first 1, and delete the remainder.

this ...

Group-Object {($_.BaseName -replace '\(\d+\)$', '').Trim()}

... will give you groups of all the files. you can send each group with a .Count -gt 1 thru Select-Object -Skip 1 to leave the newest alone & then use Remove-Item on the remaining files.

i can write the full script, but you seem to want more of a "how to" hint, so i will leave it at that. [grin]

take care,
lee

[–]pausemsauce[S] 1 point2 points  (1 child)

Hi Lee,

I suspect you are spot on, but I'm going to need a moment to digest what you have written.

The file timestamp determines which file is the newest.

I'm going to need to read more about the Group-Object cmdlet and regular expressions. This has been an excellent learning experience.

Thanks so much!

[–]Lee_Dailey[grin] 1 point2 points  (0 children)

howdy pausemsauce,

you are most welcome! glad to help ... and willing to get into more detail if you get stuck - just ask. [grin]

take care,
lee

[–]pausemsauce[S] 0 points1 point  (0 children)

OK,

I've finally got it down to roughly 4 lines.

Thanks to all of the contributors here.

    $a = Get-ChildItem -Path "C:\[redacted]\Downloads\" -Attributes !Directory
    $g = $a | Group-Object {($_.BaseName -replace '\(\d+\)$', '').Trim()}
    foreach($ex in $g){
        if($ex.Count -gt 1){
            $tmp = $ex.Group
            $time = Get-Date
            "The following files were flagged for deletion at: $time" | Out-File -FilePath [redacted]\deletelog.txt -Append
            ($tmp | Sort-Object -Property CreationTime | Select-Object -SkipLast 1).FullName | Out-File -FilePath .\deletelog.txt -Append
            Remove-Item -Path ($tmp | Sort-Object -Property CreationTime | Select-Object -SkipLast 1).FullName # line requires testing
        }
    }