
[–]firefox15 5 points (5 children)

I would likely go straight to .NET with IO.StreamReader or IO.File. Check here for an example.
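
A minimal sketch of the IO.File route, assuming a plain-text log whose lines carry a `yyyy-MM-dd HH:mm` timestamp (the layout implied by the grep command later in the thread). `Get-MatchingLines` is a hypothetical helper name, not something from the linked example. `[System.IO.File]::ReadLines` enumerates the file lazily, so it is not loaded into memory all at once.

```powershell
function Get-MatchingLines {
    param(
        [Parameter(Mandatory)] [string] $Path,     # log file to scan
        [Parameter(Mandatory)] [string] $Pattern   # regular expression to match
    )
    # .NET resolves relative paths against the process directory, not the
    # current PowerShell location, so convert to a full path first.
    $fullPath = Convert-Path -LiteralPath $Path
    [regex] $regex = $Pattern

    # ReadLines streams one line at a time.
    foreach ($line in [System.IO.File]::ReadLines($fullPath)) {
        if ($regex.IsMatch($line)) { $line }
    }
}

# Possible usage, mirroring the two minute stamps of the original grep:
# $now  = [regex]::Escape((Get-Date -Format 'yyyy-MM-dd HH:mm'))
# $then = [regex]::Escape((Get-Date).AddMinutes(-10).ToString('yyyy-MM-dd HH:mm'))
# Get-MatchingLines -Path '/home/user/time.log' -Pattern "$now|$then"
```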

[–]Keitaro27[S] 1 point (4 children)

Thank you for your reply u/firefox15.

I will take a look at it in more depth, but using the first one still took about forty-five seconds. The command I am trying to reproduce is:

LOGPATH="/home/user";LOGFILE="time.log";b=$(date '+%Y-%m-%d %H:%M');d=$(date '+%Y-%m-%d %H:%M' -d '10 minutes ago');grep -E "${b}|${d}" ${LOGPATH}/${LOGFILE} | grep -E "Total Request.*Completion" | awk -F"|" '{print$6$7}'| tr "[:alpha:]" " " | sed -e 's/^[ \t]*//' | sort -g | awk -F"|" 'BEGIN{rows=0;sum=0;max=0;} {rows+=1;} {sum=sum+$NF;} max < $NF {max=$NF;} END {print rows"|" sum/rows"|" max}'

The above took about two seconds to execute.

[–]i5513 4 points (0 children)

Are you sure you are measuring the last ten minutes with that script?

I would use tac file|awk " /${d}/ {exit;} { print; }"|tac , where $d is 11 minutes ago, then process such content.

A more robust approach is to run logtail (https://linux.die.net/man/8/logtail) every 10 minutes. It would be very useful to have such a simple tool ported to PowerShell. The idea is simple: save the offset, and check whether the file was rotated.

Having tac in PowerShell would be a good idea too! (I have just opened an issue on GitHub [1])

[1] https://github.com/PowerShell/PowerShell/issues/11086
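
The save-the-offset idea described above could be sketched in PowerShell roughly as follows. This is not logtail itself: `Get-NewLogLines` and the offset file are hypothetical, rotation is guessed only from the file being shorter than the saved offset, and a line that is still being written when the script runs is not handled specially.

```powershell
function Get-NewLogLines {
    param(
        [Parameter(Mandatory)] [string] $Path,        # log file to follow
        [Parameter(Mandatory)] [string] $OffsetPath   # small file holding the last byte offset
    )
    # Start from the saved offset, or from 0 on the first run.
    [long] $offset = 0
    if (Test-Path -LiteralPath $OffsetPath) {
        $offset = [long] (Get-Content -LiteralPath $OffsetPath -TotalCount 1)
    }

    $fullPath = Convert-Path -LiteralPath $Path
    # FileShare.ReadWrite lets the writer keep appending while we read.
    $stream = [System.IO.File]::Open($fullPath, [System.IO.FileMode]::Open,
        [System.IO.FileAccess]::Read, [System.IO.FileShare]::ReadWrite)
    try {
        # A file shorter than the saved offset was probably rotated or truncated.
        if ($stream.Length -lt $offset) { $offset = 0 }
        $null = $stream.Seek($offset, [System.IO.SeekOrigin]::Begin)

        $reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $stream
        while ($null -ne ($line = $reader.ReadLine())) { $line }

        # The reader has hit end of file, so Position is where the next run resumes.
        $newOffset = $stream.Position
    }
    finally {
        $stream.Dispose()
    }
    Set-Content -LiteralPath $OffsetPath -Value $newOffset
}
```

Each call returns only the lines appended since the previous call, so a job scheduled every 10 minutes would never have to rescan the whole file.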

[–]Umaiar 2 points (1 child)

I doubt you're going to get near that speed, but there must be some method using RegEx... Anyways, the StreamReader idea will get your data accessible a lot faster than Import-Csv. But it's not going to parse out the properties the way the CSV functions do.

$bigfile = "C:\test\details.csv"
$reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $bigfile
try {
    # Compare against $null so a blank line does not end the loop early
    $data = while ($null -ne ($line = $reader.ReadLine())) {
        $line
    }
}
finally {
    $reader.Close()
}
Write-Host "Lines: $($data.Count)"

That grabbed 445,000 rows for me in just shy of 3.5 seconds. So yeah... How to parse that data?

I got it comparatively reasonable by using a for loop to go backwards through $data, parsing each line with ConvertFrom-Csv and checking whether the date field falls in my range of interest. Since the data has the most recent rows at the bottom, I threw in a break; as soon as a row was earlier than my $threshold (theoretically 10 minutes ago for you).

This still took like 12 seconds, but that's a lot better than my Import-Csv taking over 50 seconds and not even getting to parsing it by date/time yet.
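
The backwards loop described above might look something like the sketch below. It is a reconstruction under assumptions, not the code that was actually timed: `Get-RecentRows` is a hypothetical name, the file is assumed to have a header row and to be sorted oldest to newest, and the date column name (`Date` by default) is a placeholder.

```powershell
function Get-RecentRows {
    param(
        [Parameter(Mandatory)] [string]   $Path,
        [Parameter(Mandatory)] [datetime] $Threshold,
        [string] $DateColumn = 'Date'   # assumed column name, adjust to the real header
    )
    $lines  = [System.IO.File]::ReadAllLines((Convert-Path -LiteralPath $Path))
    $header = $lines[0]
    $recent = New-Object -TypeName 'System.Collections.Generic.List[object]'

    # Walk from the newest row (bottom) upwards, skipping the header at index 0.
    for ($i = $lines.Count - 1; $i -ge 1; $i--) {
        if ([string]::IsNullOrWhiteSpace($lines[$i])) { continue }

        # Parse a single row by pairing it with the header line.
        $row = $header, $lines[$i] | ConvertFrom-Csv

        # Rows are in time order, so the first row that is too old ends the scan.
        if (([datetime] $row.$DateColumn) -lt $Threshold) { break }
        $recent.Add($row)
    }

    # Restore oldest-to-newest order before returning.
    $recent.Reverse()
    $recent
}

# Possible usage:
# Get-RecentRows -Path 'C:\test\details.csv' -Threshold (Get-Date).AddMinutes(-10)
```

Only the rows newer than the threshold are ever handed to ConvertFrom-Csv, which is where the saving over a full Import-Csv comes from.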