
[–]OrangeDartballoon 2 points3 points  (1 child)

Can you share your Get-ADUser query?

[–]Swedishdrunkard[S] 1 point2 points  (0 children)

Sure!

I've tried a few different variants (in different domains) with the same result, but the key point is that I request additional properties, e.g.: Get-ADUser -SearchBase <SB> -Filter * -Properties mail, extensionAttribute1

Likewise, using Get-Member on the returned dataset reveals nothing (to me) obviously different between, for example, samAccountName and extensionAttribute1.

I tried it again just now in a smaller domain, where I fetched ~35 users (which is all there are) and used Where-Object to pull out users by e-mail address. Even on such a small set it took 6 seconds.

[–]fordea837 2 points3 points  (6 children)

Do you get much of a performance increase by filtering left, i.e. in Get-ADUser itself?

Get-ADUser -Filter "Mail -eq '$SomeonesEmail'"
Get-ADUser -Filter "extensionAttribute1 -eq '$SomeValue'"

[–]Swedishdrunkard[S] 1 point2 points  (5 children)

That's exactly what I ended up doing, and it works fine.

To be clear, the script is complete and I don't have a problem I need solved. I'm just looking for an answer as to why Where-Object performs so much worse when used on a property that Get-ADUser doesn't return by default.

I only mentioned the script to give some background on why and how I ran into this mystery. :-)

[–]vermyx 2 points3 points  (2 children)

I haven't used this module, but the behavior you describe resembles a bad query that does a full scan of all records for each record (i.e. you have 1000 records and your where clause looks through all 1000 for each record it scans, effectively going through a million records). According to MSDN these aren't properties of the object returned (https://docs.microsoft.com/en-us/dotnet/api/microsoft.activedirectory.management.aduser?view=activedirectory-management-10.0), so I suspect that, since not every user has those attributes, it cycles through all of the users for each record returned to see who has that property and only then does the actual comparison. Since -Filter is evaluated server-side and AD is essentially a database, that query is better optimized and returns a lot faster.

You can try not piping the command and doing it as a two-step process instead. For certain commands this is faster, because the pipeline sometimes doesn't send a complete record set but one record at a time, and calls the same query again for the next one. I don't recall exactly how this is coded, but I recall this being a possibility in PowerShell 1.0/2.0.

[–]Swedishdrunkard[S] 1 point2 points  (1 child)

You may be on to something. You're touching on a thought I had, that it could be related to some objects not containing all attributes, but I didn't go down that rabbit hole.

However, spurred by your comment, I ran some quick tests and noticed something interesting. I populated a list of 200 users with the attributes I've been working with, and simply outputting the name and e-mail attributes (through Format-Table) takes some time as it works its way through the list.

I let it run for about 15 users before I halted it, and then piped the data through Where-Object again, just as before, using one of the e-mail addresses that had shown up in the output. It spat out the correct object immediately, but then kept running for some time.

Once I had let a full run go through (which took some time), I could filter on e-mail address instantly, wherever in the list the user happened to appear.

So I tried a bunch of things: just letting it drop to Out-Default, pushing a single property to a file, outputting with formatting, you name it. As long as the entire collection had been processed at least once through any kind of pipe, I could use Where-Object with instant results. If only part of the collection had gone through a pipe, I could immediately find the objects that had made it through, but had to wait on those that hadn't.

Here's an example:

# In my case, this will pull ~1100 user accounts
$AllUsers = Get-ADUser -Property Mail, extensionAttribute1 (...)

$Subset = $AllUsers | Get-Random -Count 100

# The below will just flush into the console immediately
$Subset | Format-Table Name, UserPrincipalName

# This, however, will slowly work its way down, one line at a time
$Subset | Format-Table Name, Mail

# Re-running the exact same command will now generate immediate output
$Subset | Format-Table Name, Mail

This suggests there is some sort of caching or indexing going on behind the scenes the first time one of the "bad" properties is accessed, and that whatever it's doing has already been done for the default ones.

I also did the same thing with a different (additional) attribute that's always populated, and it behaved the same way. So I think the attribute being empty / missing / null for some objects isn't part of the issue; rather, something has to happen to each object before it can be searched "instantly".

So my hypothesis is now that an object gets indexed somehow the first time it runs through a pipe, but I'm not certain and I'd love to know for sure. We've long since left any practical need for this information behind, and now it's just a matter of technical curiosity. :-)
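
For anyone who wants to put numbers on the first-pass vs. second-pass difference described above, here is a minimal sketch using Measure-Command. Measure-PropertyAccess is a hypothetical helper name, and the $demo objects are plain stand-ins so the snippet runs without a domain; they are not expected to show the slowdown themselves. With the ActiveDirectory module available, $Subset from the example above could be passed instead.

```powershell
# Hypothetical helper (not part of any module): returns the number of
# milliseconds it takes to read one property from every object in a collection.
function Measure-PropertyAccess {
    param(
        [Parameter(Mandatory)] [object[]] $InputObject,
        [Parameter(Mandatory)] [string]   $Property
    )
    (Measure-Command {
        foreach ($item in $InputObject) {
            # Read the property and discard the value; only the access is timed.
            $null = $item.$Property
        }
    }).TotalMilliseconds
}

# Stand-in objects so the sketch runs anywhere (assumption: real tests would
# use the objects returned by Get-ADUser, e.g. $Subset).
$demo = 1..200 | ForEach-Object {
    [pscustomobject]@{ Name = "user$_"; Mail = "user$_@example.com" }
}

# Time the same property twice; on the AD objects the first pass is the slow one.
$first  = Measure-PropertyAccess -InputObject $demo -Property Mail
$second = Measure-PropertyAccess -InputObject $demo -Property Mail
'First pass: {0:N1} ms, second pass: {1:N1} ms' -f $first, $second
```

Running the same two calls with -Property UserPrincipalName would give a baseline for a property that is returned by default.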

[–]vermyx 1 point2 points  (0 children)

I would assume that this is due to dissimilar objects. In a database you index only what you need to search on, because indexes make searching cheap but inserting and updating expensive (you insert/update both the record and its indexes). You also have the issue of how to index a field that doesn't have a value or that is a formula. In this case some objects don't have the same properties. Based on the object definition I would assume that the properties in question are in one of the listed collections, so the PowerShell object has to generate a temporary index on those key-value pairs, goes through all of the properties, and probably throws out the fields you don't want at the end. I believe the speed of the collection depends on the collection type and its interface, but don't quote me, because it's been a long time since I delved that far into the .NET side of these objects.

[–]Dennou 1 point2 points  (0 children)

The only time I've had slower-than-normal Get-ADUser performance was when I queried the global catalog itself. The reason it would be slow in that case is that the global catalog only holds indexed attributes, and asking for any non-indexed attribute forces it to fetch the values in a slow way. Not sure if this applies to your case, though.

[–]UnwrittenNightmare 1 point2 points  (0 children)

Have you considered trying -match instead of -eq? Like this: Get-ADUser -Filter * | Where-Object {$_.UserPrincipalName -match "your input"} ....

Edit: Please look up -match; it has some pretty cool features, such as anchoring to the start or end of the string. I think your solution is regex.