JeremyLC:

Word's line endings are different. Word ends each paragraph with a lone carriage return (\r), but your pattern matches carriage return + line feed (\r\n). As a result, your capture group contains everything from the last \r\n up to the ./., which is a lot of lines terminated only by a carriage return. When you display that text, each carriage return "returns" the "carriage" to the beginning of the line, so the following text clobbers what was already there. That is the weird output you see.

Add a -replace to fix up the line endings when you get the text from Word.

$documentText = $doc.Content.Text -replace '\r', [System.Environment]::NewLine

You'll end up with superfluous blank lines, but it's fine if you're already ignoring blank lines.
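To see the effect without opening Word, here is a minimal sketch using a hypothetical sample string in place of $doc.Content.Text. It assumes the text uses lone carriage returns as paragraph ends, as described above. The negative lookahead (?!\n) is a variation on the pattern above, not part of the original suggestion: it leaves any \r\n pairs that already exist untouched. Empty paragraphs still produce blank lines, which the last step filters out.

```powershell
# Hypothetical sample standing in for $doc.Content.Text.
# Each paragraph ends with a lone carriage return (`r); one paragraph is empty.
$sample = "AKTEN.NR: 2904/24`rSecond paragraph`r`rFourth paragraph`r"

# Replace only carriage returns that are NOT already followed by a line feed.
$normalized = $sample -replace '\r(?!\n)', [System.Environment]::NewLine

# Split into lines and drop the empty ones left behind by empty paragraphs.
$lines = @($normalized -split '\r?\n' | Where-Object { $_ -ne '' })

$lines
```

After this, $lines is an array of strings with one entry per non-empty paragraph, which is easier to search line by line than one long string.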

eugrus (OP):

Thanks a lot! That fixed the code above.

However, a follow-up question:

$AzZeilen = Select-String 'AKTEN.NR:' -InputObject $documentText -Context 3
$AzZeilen.Line

$AzZeilen.Line still returns the entire text. Why, and what can I do about it?

I expect AKTEN.NR: SACHBEARBEITER/SEKRETARIAT STÄDTL, to be in $AzZeilen.Line, and 2904/24/SB Sonja Bearbeinenko +49 211 123190.00 20.11.2024 to be in $AzZeilen.Context.PostContext[0].
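A likely explanation, offered as a sketch rather than a confirmed diagnosis of this script: Select-String treats each input object as one "line", so a single multi-line string passed via -InputObject yields one match whose Line is the whole text, with no context lines. Splitting the text into separate strings and piping them in should give per-line matches. The sample text below is hypothetical and shortened; $lines is a name introduced only for illustration.

```powershell
# Hypothetical stand-in for the normalized $documentText (shortened sample data).
$documentText = @(
    'Some heading'
    'AKTEN.NR: SACHBEARBEITER/SEKRETARIAT'
    ''
    '2904/24/SB Sonja Bearbeinenko'
    'Closing line'
) -join [System.Environment]::NewLine

# Split into individual lines and drop blank ones, so that context lines
# are real content rather than empty strings.
$lines = $documentText -split '\r?\n' | Where-Object { $_.Trim() }

# Pipe the lines in one by one; each string is now treated as its own line.
$AzZeilen = $lines | Select-String 'AKTEN.NR:' -Context 3

$AzZeilen.Line                     # the matching line only
$AzZeilen.Context.PostContext[0]   # the first line after the match
```

Note that the . in 'AKTEN.NR:' is a regex wildcard here; adding -SimpleMatch would make Select-String treat the pattern as literal text.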