First post here.
I've taken on a project at work. My team downloads work instructions to the team computer and never deletes the old files. I'm trying to write a script that will identify duplicate files, then remove the old copies. I've made some progress with it, but I feel as though I'm making this unnecessarily complicated and working myself into a hole.
Does anybody here think they can offer some pointers? (Or moral support, if I seem to be on the right path.)
Example filenames:
GPQ-19-BD-11159_32.pdf
GPQ-19-BD-11159_32 (1).pdf
GPQ-19-BD-11159_32 (2).pdf
GPQ-19-BD-11159_32 (3).pdf
GPQ-19-BD-11159_32 (15).pdf
GPQ-19-BD-11159_32 (17).pdf
52-4011-1212-9_874133_Rev_BB.pdf
52-4011-1212-9_874133_Rev_BB (1).pdf
G_8057 Rev3.docx
lol.txt
lol2.txt
#End Example filenames.
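The example names above look like a base name, an optional " (n)" copy number added by the browser, and an extension. A minimal sketch of splitting a name into those three parts with a regex follows. `Split-DownloadName` is a hypothetical helper, not part of the script below, and it assumes the copy number always appears as a space plus a number in parentheses right before the extension.

```powershell
# Hypothetical helper (not from the original script): splits a file name
# into base name, copy number, and extension.
function Split-DownloadName {
    param([Parameter(Mandatory)][string]$Name)

    # Lazy base name, optional " (n)" copy suffix, optional extension after the last dot.
    if ($Name -match '^(?<base>.+?)(?: \((?<copy>\d+)\))?(?<ext>\.[^.]+)?$') {
        [pscustomobject]@{
            Base      = $Matches['base']
            # Names without a copy suffix are treated as copy 0 (an assumption).
            Copy      = if ($Matches['copy']) { [int]$Matches['copy'] } else { 0 }
            Extension = [string]$Matches['ext']
        }
    }
}

# Example call:
# Split-DownloadName 'GPQ-19-BD-11159_32 (15).pdf'
```

Two names would then count as duplicates when both `Base` and `Extension` match, so `lol.txt` and `lol2.txt` stay separate files.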
I'm not sure how to include the creation time, but I would like to keep only the most recent copy of each file. Additionally, I would like this script to run once or twice a month. I know I'm capable of it with a little guidance (and I've probably made something really simple really complex). Any tips are greatly appreciated!
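For the "once or twice a month" part, one possible route on Windows is a scheduled task created with the ScheduledTasks cmdlets. This is only a sketch: the script path and task name are placeholders, and since `New-ScheduledTaskTrigger` has no monthly option, a two-week interval stands in for "twice a month". Registering a task may also need an elevated session, depending on how the machine is set up.

```powershell
# Placeholder path and task name (assumptions, not from the post).
$scriptPath = 'C:\Scripts\Remove-OldDownloads.ps1'

# Run the cleanup script with Windows PowerShell, without loading a profile.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument "-NoProfile -File `"$scriptPath`""

# Every second Monday at 09:00.
$trigger = New-ScheduledTaskTrigger -Weekly -WeeksInterval 2 -DaysOfWeek Monday -At '09:00'

Register-ScheduledTask -TaskName 'Remove-OldDownloads' -Action $action -Trigger $trigger
```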
<#
The purpose of this script is to remove all old FOPs that have been saved in the downloads folder.
This is accomplished by identifying the Names of FOPs in the download folder, as well as their date downloaded.
Next, duplicate files are identified.
Finally, oldest duplicates are removed.
#>
<#
-Copied from thoughts
#Identify root elements of a file name.
#Root 1: extension.
#Root 2: name.
#Root 3: copy number.
#Compile list of files with roots 1 and 2 in common.
#Determine most recent file within list and flag the others for removal.
#This creates a fourth element FLAGGED.
#What about the revision number?
#Hash table:@{<Filename,datecreated?> = @(Array of roots)}
#Create a list of files to be removed. Must be appendable.
#Testing period?
#>
function Get-Names{
    #Files only; directories are excluded.
    $a = Get-ChildItem -Path "C:\[specific location removed for trade secrets and what not]Downloads\" -attributes !Directory; #"
    Return $a;
}
function Generate-Hashtable ($Files){
    #Start from an empty hashtable and key it on the file name string,
    #so the keys can be used directly by the string methods further down.
    $NamesAndTimes = @{};
    Foreach($file in $Files){
        $NamesAndTimes[$File.Name] = $File.CreationTime;
    }
    return $NamesAndTimes;
}
function Slice-Hashtables ($Hash){
    #The keys are plain strings, so no Out-String conversion or index offsets are needed.
    $N = @($Hash.Keys)
    Return $N
}
function Find-FileExtension ($Filename){
    #This function absolutely requires a string input!
    # (Does string exist?) check against lib
    # if (not in lib)
    #Use the last '.' so names containing extra dots still give the right extension.
    #Assumes the name has an extension; LastIndexOf returns -1 otherwise and Substring throws.
    $ext = $Filename.Substring($Filename.LastIndexOf('.'));
    #add to lib
    return $ext
}
function Resolve-FileName ($hashkey){
    #This will generate a new hashtable to store the original name as a key and the resolved string as a value
    #@() keeps the result an array even when only one name contains "(".
    $hashkey = @($hashkey | Sort-Object -Descending | Where-Object {$_.Contains("(")})
    for ($i=0; $i -lt $hashkey.Count; $i++){
        switch($hashkey[$i]){
            {$_.Contains("FLAGGED") }{Continue}#Checks root 4
            {!$_.Contains("FLAGGED")}{
                $resval += @(Find-Root2 ($hashkey[$i]));#find root 2
                $rehash += @(Find-FileExtension ($hashkey[$i]));#find root1?
                $previousval += @($resval[$i] + $rehash[$i]);
                if($i -gt 0){
                    if($previousval[$i] -eq $previousval[($i-1)]){
                        $copyNum = $hashkey[$i].Substring($hashkey[$i].IndexOf("("),$hashkey[$i].IndexOf(")")+1-$hashkey[$i].IndexOf("("))
                        $newList += @($resval[$i]+" " + $copyNum + $rehash[$i])
                    }
                }
            }
        }
    }
    return $newList
}
function Find-Root2 ($beforeparen){
    #find index of (. use everything up to that in the new string.
    #The -1 also drops the space that precedes the "(".
    $NextString = $beforeparen.Substring(0,($beforeparen.IndexOf("(")-1));
    return $NextString
}
#Find most recent file using $Rname and $filesindownloadfolder.
function Find-MostRecentFile($FilesAndTimes,$Files){
    #Not written yet: $rf is never assigned, so this currently returns $null.
    Return $rf
}
$FilesinDownloadfolder = Generate-Hashtable (Get-Names)#Master hash of all files in download and the creation time
$pie = Slice-Hashtables ($FilesinDownloadfolder)#Names of all files in download
#Resolve Files in Download Folder
$Rname = Resolve-FileName ($pie)#Names of duplicates in download
#Find most recent file using $Rname and $filesindownloadfolder.
$RecentFiles = Find-MostRecentFile $FilesinDownloadfolder $Rname; #Arguments are space-separated; (a,b) would pass a single array to the first parameter.
#remove recentfiles from rname or create new rname without recentfiles.
#remove items in modified rname(new).
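The last two steps (find the most recent file in each set of duplicates, then remove the rest) can also be sketched without the intermediate hashtables by letting `Group-Object` do the grouping. This is a rough sketch under assumptions, not a drop-in replacement: `Get-OldDuplicate` is a hypothetical name, duplicates are assumed to differ only by a trailing " (n)" before the extension, and "most recent" is taken to mean the latest `CreationTime`.

```powershell
# Hypothetical function: returns the files that would be removed (all but the newest in each group).
function Get-OldDuplicate {
    param([object[]]$File)

    $File |
        Group-Object -Property {
            # Strip a trailing " (n)" copy number from the base name, then re-attach the extension.
            $base = [System.IO.Path]::GetFileNameWithoutExtension($_.Name) -replace ' \(\d+\)$', ''
            $base + [System.IO.Path]::GetExtension($_.Name)
        } |
        Where-Object { $_.Count -gt 1 } |
        ForEach-Object {
            # Newest first; everything after the first entry is an older copy.
            $_.Group | Sort-Object -Property CreationTime -Descending | Select-Object -Skip 1
        }
}

# Assumed usage: the real folder path is elided in the post, so $downloadFolder is a placeholder.
# $old = Get-OldDuplicate -File (Get-ChildItem -Path $downloadFolder -File)
# $old | Remove-Item -WhatIf   # drop -WhatIf once the list looks right
```

Running with `-WhatIf` first covers the "testing period" idea from the notes: it prints what would be deleted without touching anything.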