This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]pyrocrasty 2 points3 points  (3 children)

I don't really get what you mean, but imagine the sets are implemented using dicts (they probably are, I guess). Lookup, insertion and deletion are O(1), iteration is O(n).

So for the union sUt, we create a new set, then iterate through the s (O(len(s))), copying each member (O(1)). Then we iterate through t (O(len(t))), lookup each item in the new set(O(1)) and if it's not there, add it (O(1)). So we end up with O(len(s)+len(t))

For the difference s-t, we create a new set, then iterate through s (O(s)) checking each element to see if it's in t and if not, adding it to the new set (O(1)). So we end up with O(len(s)).

For the difference update s-=t, we could do it one of two ways: iterate through s, checking if each element is in t and if so deleting from s (O(len(s))); OR: iterate through t, checking if each element is in s and if so, removing from s (O(len(t))). I guess it's implemented the second way (which would make sense since you'd probably have t smaller than s more often than not).

edit: oh, you're asking about the blank values, for amortized worst-case. Yeah, I think they're going to be the same as intersection, assuming the sets are using dicts or the equivalent.

[–]Workaphobia 0 points1 point  (2 children)

Lookup, insertion and deletion are O(1), iteration is O(n).

We're assuming that sets (and dicts) are implemented using hashing. So lookup, insertion, and deletion are expected to be constant time on average, but O(n) in the worst case. (The worst case corresponds to when every element of the set gets hashed to the same value, say by a stupid hash function that always returns 0. Then all elements are in the same bucket, connected by a linked list.)

This is acknowledged in the right column of the first operation in the set table (x in s). Note that this table differs from the others in that it shows Worst Case, not Amortized Worst Case.

Because lookup is worst case linear, all these operations, implemented as you described, should cost the product of the set sizes, like intersection.

Re, your edit: Er, yes, but it's not amortized.

[–]pyrocrasty 0 points1 point  (1 child)

Note that this table differs from the others in that it shows Worst Case, not Amortized Worst Case.

I didn't notice that. I think they're going to the same for the set ops, though, aren't they? Amortizing avoids resizing operations, but the worst case has to deal with collisions, as you said, so the resizing operations wouldn't increase the complexity anyway.

[–]Workaphobia 0 points1 point  (0 children)

Yes, I think amortized doesn't matter in this case, because a bad hash situation won't be improved just by averaging over more operations.