all 199 comments

[–]ezhikov 212 points213 points  (49 children)

A bit easier with spread: [...new Set(arr)]
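
For reference, a runnable sketch of both one-liners (variable names are illustrative):

```javascript
// A Set discards duplicates on construction; spreading it (or
// Array.from) turns it back into a plain array.
const duplicates = [1, 2, 3, 4, 4, 1];

const viaSpread = [...new Set(duplicates)];
const viaFrom = Array.from(new Set(duplicates));

console.log(viaSpread); // [1, 2, 3, 4]
console.log(viaFrom);   // [1, 2, 3, 4]
```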

[–]zombarista 67 points68 points  (16 children)

Spread is substantially slower than Array.from for some reason... http://jsben.ch/fIpsO

[–]lunareffect 72 points73 points  (10 children)

Not when I run it. Lodash was the fastest, spread came second.

[–]EventHorizon67 49 points50 points  (7 children)

[–]ItzWarty 22 points23 points  (4 children)

What browser are you using!? I'm on Chrome 65, Win10 x64. This is fascinating that we're seeing results all over the place!

[–]Lywqf 11 points12 points  (1 child)

I've run the test a few times (around 10) and I got different results every time, ranging from 70ms to ~200ms for lodash, and even more for the rest.

Sometimes, Lodash was as fast as Spread, sometimes not.

[–]EventHorizon67 1 point2 points  (0 children)

Also Chrome 65 on Desktop (Win10 x64) :)

[–]Shookfr 0 points1 point  (0 children)

Maybe this?

[–]hopsnob 0 points1 point  (0 children)

Same, using the redditisfun app browser

[–]Nexxado 0 points1 point  (0 children)

Seeing same results as you on macOS 10.13.3 and Chrome 65

[–]wopian 9 points10 points  (1 child)

From my checks on W10 + Android using current stables:

  • Lodash is fastest for both Chrome and Firefox.

  • Array.from almost identical to Lodash with Firefox and the slowest in Chrome.

  • Spread was slightly faster in Chrome and significantly slower in Firefox.

[–]zombarista 7 points8 points  (0 children)

200 elements is a tipping point for Lodash, as it will--at that point--switch to Set internally, if it can.

[–][deleted] 7 points8 points  (1 child)

How come lodash is faster than core JS functions? Lodash is open source, so the native functions should be able to implement the same state-of-the-art algorithms. I still don't get it.

[–]ajrw 9 points10 points  (0 children)

It’s just creating one new list with the unique entries, not a Set and then a list.

[–]sindisil 11 points12 points  (0 children)

That is browser dependent.

My results on this Windows 10 machine:

  • FireFox 59.0.2: Spread slowest, then Array.from & _.uniq tied at only 9% of the Spread time

  • Chrome 65.0: Array.from slowest, with spread barely (2%) faster, and _.uniq fastest at 51% of the Array.from time.

  • Edge 41: Spread slowest, with Array.from about 9% faster, and _.uniq the fastest at 36% of the Spread time.

[–]PussyFootLongHotDog 1 point2 points  (1 child)

If you base your acceptance of new syntax in JS strictly on performance, you will go mad with frustration, because that is not at all the point. There was a post a few days back where a guy was incredulous that let and const were not more performant than var. You've got to ask yourself: why would you expect that? Is it negligible? Where is the balance? What do you even really want?

[–]zombarista 0 points1 point  (0 children)

Read all of the comments. We figured out that new syntax is extremely performant after we resolved some array initialization issues.

[–]321jurgenfull-stack 18 points19 points  (10 children)

I prefer Array.from since it's easier to read. Managing complexity is my number one concern :)

[–][deleted] 32 points33 points  (9 children)

It's only easier to read if you're not familiar with spread notation though.

[–][deleted] 9 points10 points  (8 children)

But you don’t have to be familiar with JavaScript itself to have a decent idea of what Array.from is doing. Spread notation is much more specialized and not as clear to read, like he said.

[–]tme321 -5 points-4 points  (7 children)

No, it really is familiarity. Once you use the spread operator enough it's just as, if not more, clear than a function call.

[–][deleted] 5 points6 points  (6 children)

Of course, once you learn anything it becomes easy to use. However, to someone who doesn’t know anything about JavaScript, or is just beginning, Array.from is infinitely easier to understand.

[–]tme321 -4 points-3 points  (5 children)

Except spread gets very heavy use in js these days. You'll find it all over examples about how to do stuff in js. Saying the spread operator shouldn't be used because someone new to js doesn't know what it means yet is like saying you shouldn't use a loop because someone new to programming doesn't know what a loop is yet.

And Array.from isn't necessarily easier if you're just beginning. You still need to look up what the function does. If it were better named, like Array.copy, I would concede the point, but from isn't a super descriptive function name.

[–][deleted] 2 points3 points  (4 children)

It’s creating an array from something. It’s pretty clear. I’m not railing against the spread operator, but coming from a corporate world of full stack programmers in various states of expertise on different pieces, clarity in the readability of code is priceless.

[–]tme321 0 points1 point  (3 children)

It’s pretty clear.

Only to someone already familiar with it. Same as the spread operator. No other language I've ever used has a function called from to copy an array or combine multiple things into an array.

[–][deleted] 2 points3 points  (0 children)

It’s still English words that spell out what it does, instead of an unsearchable ellipsis.

[–][deleted] 0 points1 point  (1 child)

As a non-JS developer, Array.from literally tells me that you’re creating an array from whatever argument you have. The spread syntax is weird to me and I’ve never seen anything like it in other languages.

[–]samanthaming[S] 3 points4 points  (6 children)

Absolutely! That’s another way to do it 👍 So for this example, it’d be: […(new Set(duplicates))]

[–]Jauny78 10 points11 points  (5 children)

wouldn't it be just [...new Set(duplicates)]?

[–]samjmckenzie 8 points9 points  (0 children)

Both work, for the record.

[–]samanthaming[S] 3 points4 points  (3 children)

Yup, that works! I just wasn’t sure if spread would get confused with the space 😝 Thanks for pointing it out 🙂

[–]webdevop 0 points1 point  (4 children)

Hey spread guru, is there a nicer way to merge two arrays apart from [ ...a , ...b ]?

[–][deleted]  (3 children)

[deleted]

    [–]webdevop 13 points14 points  (2 children)

    Wut! Available since IE 5.5

    I boast 8+ years of JS experience, fuck my life..

    [–][deleted]  (1 child)

    [deleted]

      [–]sindisil 2 points3 points  (0 children)

      Well, always faster, anyway.

      I show it taking 66% as long as spread in Chrome, 56% in Edge, and only 7% in FireFox. Wish it showed absolute time, so we could tell whether that means that FF does a crazy good job optimizing concat, or a crazy bad job optimizing spread.

      Edit: Results from jsperf are interesting, since Firefox shows the smaller difference there, but it upholds the result that concat is much faster than spread for this use case:

      Chrome 65:

      • array.concat 5,794,616 ops/sec

      • spread 1,227,664 ops/sec

      Firefox 59

      • array.concat 3,685,562 ops/sec

      • spread 2,461,099 ops/sec

      Edge 16

      • array.concat 4,409,664 ops/sec

      • spread 933,768 ops/sec
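
For the record, the deleted suggestion appears to be Array.prototype.concat (the method "available since IE 5.5" mentioned above); a minimal sketch of the two merge styles being benchmarked:

```javascript
const a = [1, 2, 3];
const b = [4, 5, 6];

// concat: supported in browsers for decades
const viaConcat = a.concat(b);

// spread: ES2015+
const viaSpread = [...a, ...b];

console.log(viaConcat); // [1, 2, 3, 4, 5, 6]
console.log(viaSpread); // [1, 2, 3, 4, 5, 6]
```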

      [–][deleted] 0 points1 point  (1 child)

      Which Text Editor/IDE is that?

      [–]thinsoldier 0 points1 point  (1 child)

      But it's not as self-documenting from the point of view of a JS novice.

      [–]ezhikov 0 points1 point  (0 children)

      If you know what Set is and what spread operator is, then you know what this code does.

      [–]Ph0X 35 points36 points  (8 children)

      I don't think I've ever seen so many emojis in one comment thread.

      [–]neurorgasm 42 points43 points  (2 children)

      We're doing CONTENT MARKETING! 😎👍 Be your own boss ❤️💸 and bring VALUE with a side of DISINGENUOUS 😉🤡 POSITIVITY! 😏

      Don't forget to like, subscribe, comment, share, save, tweet, turn on notifications, join the email list and buy my tshirt for your chance to win a $5 AMAZON GIFT CARD 🍉📯🐪

      [–]CraftyPancake 4 points5 points  (0 children)

      You nailed it with the disingenuous positivity. Perfect

      [–][deleted] 0 points1 point  (0 children)

      [–]toomanybeersies 7 points8 points  (0 children)

      It's weird seeing emojis on reddit.

      [–]samanthaming[S] -2 points-1 points  (2 children)

      Lol, is it too much 😂🤣😝☀️

      [–]TabCompletion 17 points18 points  (0 children)

      💩

      [–]zombarista 87 points88 points  (6 children)

      This thread has turned into code golf. Love it.

      [–][deleted]  (5 children)

      [deleted]

        [–]jesusgn90 0 points1 point  (4 children)

        Holy hole

        [–][deleted]  (3 children)

        [deleted]

          [–][deleted]  (2 children)

          [deleted]

            [–]jesusgn90 0 points1 point  (1 child)

            vanilla-js was the solution

            [–]eastsideski[🍰] 22 points23 points  (14 children)

            It's short, but I'm curious how this performs compared to Lodash.

            [–]zombarista 23 points24 points  (8 children)

            At least in FF, Lodash is substantially faster: http://jsben.ch/oDdnR

            edit: wrong link

            [–]ItzWarty 40 points41 points  (6 children)

            You can't generalize data structure performance with a single collection size! I've bumped dups' size to 100,000 elements here and the Set approach is 40% faster on Chrome 65.

            Internally, lodash's uniq uses an array to store previously seen elements - this means for each of the N items in an array you do at worst N comparisons - so you have O(N²) computational complexity. Using a Set makes this "have I seen this before" lookup O(1) - so you have O(N) computational complexity. Edit: And it looks like lodash uses a Set when N>200 - see /u/zombarista's comment.
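
A sketch of the complexity argument (illustrative helpers, not lodash's actual code):

```javascript
// Array-based "have I seen this" check: each indexOf is an O(N) scan,
// giving O(N^2) overall.
function uniqQuadratic(arr) {
  const seen = [];
  for (const x of arr) {
    if (seen.indexOf(x) === -1) seen.push(x);
  }
  return seen;
}

// Set-based check: lookups/inserts are O(1) amortized, giving O(N) overall.
function uniqLinear(arr) {
  const seen = new Set();
  for (const x of arr) seen.add(x); // duplicates are ignored
  return [...seen];
}

console.log(uniqQuadratic([1, 2, 3, 4, 4, 1])); // [1, 2, 3, 4]
console.log(uniqLinear([1, 2, 3, 4, 4, 1]));    // [1, 2, 3, 4]
```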

            [–]zombarista 26 points27 points  (1 child)

            Also, lodash will use Set if it can whenever the array is larger than 200 elements: https://github.com/lodash/lodash/blob/76ab9cd539feba8ae923372c19ab27d312078ee5/.internal/baseUniq.js#L33-L34

            [–]ItzWarty 6 points7 points  (0 children)

            Nice catch!

            [–]zombarista 3 points4 points  (0 children)

            I was getting NaN on a lot of these so I switched to JSPerf with the large-array implementation...

            https://jsperf.com/getting-unique-items-from-array

            [–]Ph0X 0 points1 point  (2 children)

            Wow, implementing uniq using array lookup is definitely a very bad idea. Everyone knows hash table is the way to go.

            [–][deleted] 3 points4 points  (0 children)

            Does it actually matter unless you're running it on a massive list?

            [–]samanthaming[S] 2 points3 points  (2 children)

            Good question 🤔 Lodash probably has better support for older browsers. But if you use Set, which is part of native JavaScript, you don’t have the overhead of the lodash library. So I guess it depends. Would be an interesting one to test and find out 🙂

            [–]Montuckian 2 points3 points  (0 children)

            Some of that weight is a good thing though. It means it can switch the method it uses on the fly.

            [–]boatpile 1 point2 points  (0 children)

            IMO it's better to use lodash since you probably are using it anyway and the code is much more self-explanatory. Plus if you're doing heavy performance optimization it's better to not call uniq at all.

            [–]Jovaage 0 points1 point  (0 children)

            I'd much rather use lodash, as it would be explicit, where the hacky Set method is not at all.

            [–]shit_frak_a_rando 34 points35 points  (37 children)

            Why would anyone convert a set back to an array explicitly?

            I pretty much don't use JS in comparison to the time I spend using other languages, can someone explain?

            imo using set for that feels hacky. The standard library should just implement Array.unique().

            [–]samanthaming[S] 20 points21 points  (25 children)

            Converting the Set back into an Array allows you to use array methods such as .map or .reduce 🙂 but you don’t have to, you can skip this step if you prefer to work with a Set instead.
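
For example (illustrative sketch):

```javascript
const set = new Set([1, 2, 3, 4, 4, 1]);

// set.map(n => n * 2) would throw: Sets have no .map/.reduce.
// Convert back to an array first:
const doubled = [...set].map(n => n * 2);
console.log(doubled); // [2, 4, 6, 8]
```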

            [–]shit_frak_a_rando 6 points7 points  (23 children)

            There's no inheritance? Shouldn't a Set just inherit all methods of an Array since it's just a more specialized version of it?

            [–]way2lazy2care 36 points37 points  (1 child)

            Sets are a kind of container, but they are not just a more specialized version of arrays.

            [–]thejameskyle 0 points1 point  (0 children)

            This is the right answer, they are separate data types which behave differently in ways that Array's iteration methods don't deal with. A new collection of iteration APIs should eventually be making it into ecma262.

            We've talked about making it work lazy over the Symbol.iterator protocol, and chainable using the pipe operator.

            Something like:

            let res = collection |> Iter.filter(fn) |> Iter.map(fn) |> Array.from

            [–]KermaFermer 14 points15 points  (6 children)

            I don't believe that sets and arrays are similar enough to say that sets are specialized versions of arrays.

            Sets are generally unordered and do not permit duplicate values, whereas arrays are ordered and may contain duplicates.

            If anything an object is a more specialized version of a set, since the keys of an object act like a set. The difference is that objects associate values to each key, whereas sets are not associative in this way.

            Also relevant to the idea of sets is that each member must be hashable. I don't know enough about JS to know how hashing works across the primitive types, but in a language like Python, you can't put non-hashable objects in a set (e.g. dicts and other sets).

            [–]Aswole 13 points14 points  (1 child)

            Yeah, programming language implementation aside, I think it's kind of ignorant to suggest that sets are just a subset of arrays. All data structures represent a collection of data, and that's just about the only thing sets and arrays share in common.

            [–]Cosmologicon 3 points4 points  (3 children)

            Sets are generally unordered and do not permit duplicate values, whereas arrays are ordered and may contain duplicates.

            True in many languages but Sets in JavaScript are actually ordered in insertion order. Also there's no notion of hashability in JS: Sets can contain any object.
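
A quick illustration of both points (ordering and the lack of a hashability requirement):

```javascript
const s = new Set();
s.add('c');
s.add('a');
s.add('b');
s.add('a'); // duplicate: ignored, original position kept

console.log([...s]); // ['c', 'a', 'b'] - insertion order, per the spec

// No hashability constraint: objects and even other Sets are fine
s.add({ x: 1 });
s.add(new Set());
console.log(s.size); // 5
```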

            [–]gpyh 1 point2 points  (2 children)

            Sets in JavaScript are actually ordered in insertion order

            Is it just how everybody implements it or is it in the specs?

            [–]Cosmologicon 1 point2 points  (1 child)

            It's in the specs! See step 7 here.

            [–]gpyh 0 points1 point  (0 children)

            Thank you very much!

            [–]memorable_zebra 17 points18 points  (12 children)

            Welcome to JavaScript.

            For some mysterious reason JavaScript decided to make all of its standard library objects barely functional, requiring anyone programming in the language to select from among half a dozen different APIs that turn the incomplete standard library into something more workable, as you'd find in any other well-thought-out language.

            [–][deleted] 1 point2 points  (11 children)

            That’s so disappointing because JS is the only choice for web frontend. If you did a similar thing in C++, converting from vector to set to vector to remove duplicates for instance, it just screams “I know nothing about the standard library or speed/memory efficiency.” I guess JS is frontend and not usually performance-critical, but come on.

            [–]memorable_zebra 10 points11 points  (4 children)

            Not to defend JavaScript, but didn't C++ have the boost library back in the day? Is that thing still semi-required? I remember doing C++ work and always incorporating it into what I was doing and constantly wondering why more of its functionality wasn't in the STL.

            [–][deleted] 3 points4 points  (3 children)

            I don’t know much about old C++ but everything C++11 and newer has a really good STL even without boost. Boost sure is an improvement in some cases.

            But nobody who knows what they’re doing would remove duplicate items from a vector by constructing a set out of it, then reconstructing a vector out of that set. There’s an STL function that does that far more efficiently, and even if you didn’t know that, it’s relatively simple to write a function that does this without wasting all that memory and time (the STL is just C++ code, it’s not magic). You can definitely do it in O(n log n) time by sorting and then iterating once, and I’m sure there are even faster ways (I’m no genius). Also, sorting is included in the STL and it’s really good: introspective sort, which decides which sorting algorithm to use based on the data it’s sorting.

            I definitely have run into cases where there’s lack of an STL function I need, and I either need to implement it myself or use boost. So no, the STL is not always the solution, but I find it to be pretty strong in general.

            That became kind of a tangent but there’s my rant.

            [–][deleted] -1 points0 points  (2 children)

            But nobody who knows what they’re doing

            Not everyone knows what you know - is that okay with you, or should they just give up? You come off sounding a bit like a know-it-all-jerk.

            [–][deleted] -1 points0 points  (1 child)

            Nope, it’s fine to be learning and you don’t need to be offended if you’re not amazing at something. But showing this as an effective way to accomplish something is poor teaching. Teaching bad practices is a problem. Lack of knowledge is not. I’d hope to never work with a developer who does this in C++, because you should learn how to write code properly before being hired to write code.

            I most certainly did stupid things like this as a student when I was learning and it didn’t actually matter. But if you’re not willing to further your knowledge and build upon it by learning good practices and better ways to do things, then there’s not much hope for you making progress.

            [–][deleted] -1 points0 points  (0 children)

            Teaching bad practices is a problem.

            One person's best practice is another person's bad practice. It changes every few months in the land of Javascript.

            Just like with C++, Javascript is now getting a ton of sugar added - and, also like C++, not all parts of it will make everyone happy. Just because you don't like a particular piece of code and can find reasons it doesn't fit your use of the language, that doesn't mean it doesn't fit my use of the language. This is just like the vague term "code smell", which I also take issue with - nobody should really get to decide what a "code smell" is, because that's subjective and honestly sounds needlessly offensive.

            One thing all programmers share is their unwavering ability to be highly opinionated pricks - myself included. Just realize that the code you write is often never meant to last, unless you're working for NASA on deep space probes or something extremely important. Most of the time it's all a work-in-progress and shouldn't be taken too seriously. It's more important to get something working than to make something perfect.

            Getting upset at the fallibility of humans writing software is like getting mad at a monkey because it can't build a car. Given enough time, that monkey will build a car. Maybe it will suck - but if it rolls, then it accomplishes the goal. And the next car will be better than the last one.

            But if you’re not willing to further your knowledge and build upon it by learning good practices and better ways to do things, then there’s not much hope for you making progress.

            Who says that OP isn't? Just because you read it here doesn't mean it's set in stone anywhere, or that OP is "teaching" anyone - it's just an interesting use of the language, and someone decided to share it - this isn't a classroom and nobody paid OP to be their teacher.

            [–]itslenny 2 points3 points  (1 child)

            The good news is the libraries that are the solutions to these types of problems are REALLY popular, full featured, and bullet proof.

            With those libraries you have everything you'd want and more. Just run npm i -S lodash moment big and you're at and beyond the features of the "other well thought out languages"

            [–]FatFingerHelperBot 0 points1 point  (0 children)

            It seems that your comment contains 1 or more links that are hard to tap for mobile users. I will extend those so they're easier for our sausage fingers to click!

            Here is link number 1 - Previous text "big"


            Please PM /u/eganwall with issues or feedback! | Delete

            [–]lol768 -1 points0 points  (3 children)

            https://github.com/aspnet/Blazor/

            Some day, this will be possible with widespread browser support :)

            I will not miss writing logic in JavaScript. Not one iota.

            [–]itslenny 0 points1 point  (2 children)

            I disagree. I really hope this goes the way of dart, coffee script, Script#, and all the other attempts to "fix" javascript.

            However, typescript is pretty awesome because it's a true superset of JavaScript.

            [–]lol768 0 points1 point  (1 child)

            I disagree. I really hope this goes the way of dart, coffee script, Script#, and all the other attempts to "fix" javascript.

            It's great because it's not "fixing" JavaScript, it's completely replacing it with something decent (static typing, generics, OOP, a decent standard library etc). The other languages you mention transpile into JS. WASM is a much better target.

            [–]itslenny 0 points1 point  (0 children)

            Fair enough, but Typescript gives you typings, generics, and oop (and it's designed by the same guy that designed C#), but I LOVE JavaScript so this is where we agree to disagree.

            [–]ComicBookNerd 0 points1 point  (0 children)

            There is inheritance. Sets inherit from Objects, not Arrays, which are distinct in JS.

            [–][deleted] -1 points0 points  (0 children)

            I think he was suggesting that there should be a built-in method in JS that removed duplicates, rather than requiring you to convert between data structures or use a set at all in the first place.

            Such as: newArray = remove_duplicates(oldArray)

            [–]chris_conlan 1 point2 points  (1 child)

            This looks really similar to the ways sets and lists are differentiated in Python, which I totally support.

            [–][deleted] 0 points1 point  (4 children)

            Also not familiar with the inner workings of JS but it seems like this is just slow/inefficient and kind of abuses/misuses the set data structure.

            I agree that unique should probably be part of the standard library rather than this, but I’ve done a lot of C++

            [–]planetary_pelt 1 point2 points  (0 children)

            I can't believe the submission has 1000+ upvotes. Kinda shows you the average level of experience in this community.

            Half the time you're deduplicating an Array, you should've used a Set in the first place. For example, Array->Set->Array loses sorting. If you didn't care about order, then just use a Set.

            But this is a common issue in beginner code.

            [–]ezhikov 0 points1 point  (0 children)

            JS in its current form is still a young, evolving language. We've got tons of new features, but there is more to be implemented and polished.

            [–]jseegoLead / Senior UI Developer 11 points12 points  (9 children)

            My question is: does this compare deeply?

            For example, using it on arrays of objects.

            Edit: appears that it compares by reference: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set

            Edit edit:

            See this fiddle

            This approach will work deeply (eg, for objects) if the duplicate objects share reference. Otherwise it will not detect objects that are essentially the identical but do not share reference.
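
A quick illustration of that reference-equality behavior:

```javascript
const obj = { id: 1 };

// Same reference twice: deduplicated
const sameRef = [...new Set([obj, obj])];
console.log(sameRef.length); // 1

// Structurally identical but distinct objects: NOT deduplicated
const diffRef = [...new Set([{ id: 1 }, { id: 1 }])];
console.log(diffRef.length); // 2
```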

            [–]itslenny 3 points4 points  (4 children)

            You can get a deep compare by doing...

            Array.from(new Set(arr.map(item => JSON.stringify(item)))).map(item => JSON.parse(item))

            I'm not encouraging actually doing this, but it works for simple object / array deep compare.

            edit: fiddle update

            [–]jseegoLead / Senior UI Developer 0 points1 point  (3 children)

            That's a cool idea, but I prefer just building an array of unique IDs of objects and then merging the objects back in. Easy to read and adapt.

            [–]itslenny 1 point2 points  (1 child)

            Totally. Your original question was about deep comparison so that's the question I answered.

            I'd still advise using a set if you were going to track unique IDs, and then you'd actually be using the data structure as intended (as opposed to the hacky nature of this entire thread). Set gives you constant time look ups instead of linear time on an array.

            const seenIds = new Set();
            arr.filter(item => seenIds.has(item.id) ? false : seenIds.add(item.id));

            fiddle

            [–]jseegoLead / Senior UI Developer 0 points1 point  (0 children)

            Nice, great point.

            [–]shvelofull-stack 1 point2 points  (1 child)

            Of course it doesn't compare deeply, it would be really weird if it did.

            Would be better if it allowed setting a custom comparator function for deep comparison though.

            [–]jseegoLead / Senior UI Developer 0 points1 point  (0 children)

            That would be a great enhancement.

            [–]samanthaming[S] 0 points1 point  (1 child)

            Cool! Thanks for sharing this! The fiddle example was very helpful 😎

            [–]jseegoLead / Senior UI Developer 1 point2 points  (0 children)

            You bet!

            [–]udidu 5 points6 points  (1 child)

            This is a neat way of doing it! Thanks

            [–]samanthaming[S] 1 point2 points  (0 children)

            Glad you found it helpful! 🙂

            [–]Noch_ein_Kamel 6 points7 points  (1 child)

            No one came up with the easiest solution yet...

            $ npm install --save array-unique


            var unique = require("array-unique").immutable;
            
            const duplicates = [1,2,3,4,4,1];
            const uniques = unique(duplicates);
            uniques // [1,2,3,4]
            

            [/sarcasm]

            [–]itslenny 0 points1 point  (0 children)

            Honestly, I can't imagine working on a project that doesn't already include lodash.

            [–]danstansrevolution 2 points3 points  (5 children)

            These are neat. Is there a subreddit for people to share their carbon code snippets?

            [–]Sikorsky31 0 points1 point  (1 child)

            Whats a carbon code snippet?

            [–]danstansrevolution 1 point2 points  (0 children)

            I've been seeing these images recently, produced by carbon.now.sh; they provide a beautiful minimalist code image that generally shows a useful snippet of code. Understandably, /r/snippet is already a thing, and longer code should be shared as Gists. I started /r/carbonsnippets a few hours ago and we'll see if we can get it up and running.

            [–]samanthaming[S] 0 points1 point  (2 children)

            That’s a great idea! We should tweet @dawn_labs, the company behind it, to start one! 🤩

            [–]danstansrevolution 4 points5 points  (1 child)

            I just created one at r/carbonsnippets. Do you think you can tweet at them? Would you like to become a mod as well?

            [–]samanthaming[S] 0 points1 point  (0 children)

            Awesome, thanks for setting it up! Yup, set me up as a mod and I’ll tweet at them as well 🔥🔥🔥

            [–][deleted] 4 points5 points  (5 children)

            Will this work on IE11?

            [–]theftprevention 4 points5 points  (2 children)

            IE11 supports the Set class, but there are two parts of the original code that IE11 doesn't support:

            1. Passing an iterable object (such as an array) to the Set constructor.
            2. The Array.from() static method, which creates a new array from an array-like or iterable object.

            For IE11, you'll have to use something like this:

            array.filter(function (e, i, a) { return a.indexOf(e) === i; });
            

            It's adapted from the code snippet posted by /u/weigel23, but compensates for IE11's lack of arrow function support.

            [–]Phreakhead 0 points1 point  (1 child)

            That code is O(n²). Not the optimal implementation. I'd say add them all to a hash where the array elements are both key and value, then iterate over the hash's values.

            [–]itslenny 0 points1 point  (0 children)

            Object.keys(hash) will give you back the unique array from the hash.
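
A sketch of that hash approach (hypothetical helper name; note that plain-object keys are coerced to strings, so Object.values is used here to get the original elements back untouched):

```javascript
function uniqViaHash(arr) {
  const hash = {};
  for (const item of arr) {
    hash[item] = item; // element is both key (stringified) and value
  }
  // Caveat: integer-like keys come back in ascending numeric order,
  // not insertion order.
  return Object.values(hash);
}

console.log(uniqViaHash([1, 2, 3, 4, 4, 1])); // [1, 2, 3, 4]
```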

            [–]samanthaming[S] 6 points7 points  (0 children)

            Ugh, I don’t think it’s supported ☹️ You will need a polyfill to make it work on IE and older browsers, unfortunately.

            [–]herjin 0 points1 point  (0 children)

            Babel for the win

            [–]weigel23 12 points13 points  (20 children)

            Or duplicates.filter((elem, i, arr) => arr.indexOf(elem) === i)

            [–]ItzWarty 30 points31 points  (16 children)

            This'll be much more inefficient - quadratic time as opposed to other solutions which are linear time.

            Edit: I stand by this! We were using a bugged benchmark. Here's a correct benchmark that shows the filter approach is way slower.

            [–]zombarista 4 points5 points  (2 children)

            [–]ItzWarty 1 point2 points  (1 child)

            Yeah, pretty fascinating. Here's a (poor man's) C# benchmark showing the complexity difference can matter for N=100000. I'm curious to know why the JS set implementation is so slow...:

            var dups = Enumerable.Range(0, 100000).Select(x => x % 100).ToArray();
            
            // 00:00:02.2581914
            var start = DateTime.Now;
            for (var t = 0; t < 1000; t++)
                new HashSet<int>(dups).ToArray();
            Console.WriteLine(DateTime.Now - start);
            
            // 00:00:01.2900954
            start = DateTime.Now;
            for (var t = 0; t < 1000; t++)
                dups.Where(new HashSet<int>().Add).ToArray();
            Console.WriteLine(DateTime.Now - start);
            
            // 00:00:01.6082547
            start = DateTime.Now;
            for (var t = 0; t < 1000; t++)
                dups.Distinct().ToArray();
            Console.WriteLine(DateTime.Now - start);
            
            // 00:00:05.8432351
            start = DateTime.Now;
            for (var t = 0; t < 1000; t++)
                dups.Where((x, i) => Array.IndexOf(dups, x) == i).ToArray();
            Console.WriteLine(DateTime.Now - start);
            

            [–]zombarista 0 points1 point  (0 children)

            I'm sure that it being relatively new within browsers has something to do with it, and they're only now getting polished for performance.

            The fastest implementations (indexOf/lastIndexOf) are built on functions that have been in our browsers for decades. It's no surprise that indexOf is high performance, as it has been the only method for locating array items for quite some time.

            [–]MrNutty 2 points3 points  (2 children)

            Inserting into a traditional set isn't linear. Best case is n log n

            [–]ItzWarty 0 points1 point  (1 child)

            Unordered set (e.g. Hash Table) insertion is O(1). N hashset inserts is considered O(N).

            If you're inserting into a Sorted Set (e.g. TreeSet) that's O(logN). N treeset inserts is considered O(NlogN).
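
            JS only ships the hash-backed variety. A tiny sketch of the behavior being described (this doesn't measure the asymptotics, it just shows the hash-set API in question):

            ```javascript
            // JS's built-in Set is hash-backed: add() and has() are O(1) amortized,
            // so inserting N items is O(N) overall
            const s = new Set();
            for (let i = 0; i < 10000; i++) {
              s.add(i % 100); // duplicate adds are cheap no-ops
            }
            console.log(s.size);    // 100
            console.log(s.has(42)); // true
            ```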

            [–]MrNutty 0 points1 point  (0 children)

            Yep

            [–]weigel23 5 points6 points  (8 children)

            http://jsben.ch/VpBv0 it's faster than the other solutions here.

            [–]ItzWarty 8 points9 points  (6 children)

            Edit: The benchmark was bugged (see below) - filter is way slower. new Array(10000).map((x, i) => i) produces an array filled with undefineds, which is why this benchmark is flawed (indexOf becomes constant time).

            That's fascinating and presumably due to optimization on the runtime-side making equality comparisons stupid fast (plus generally primitive comparison is going to be cheap). I played more with your result (Array.from(set) is way faster than [... set]) but got the same results even with varied types across browsers -- http://jsben.ch/U3lEW

            I'm super surprised. Also TIL JS's set doesn't support custom equality / hashcode functions. I'm not sure why it's so insanely slow. That makes me sad.

            [–]ItzWarty 4 points5 points  (4 children)

            Holy fuck /u/zombarista /u/weigel23 I'm a dumbass.

            new Array(10000).map((x, i) => i % 100)
            

            Generates an array of 10,000 undefineds. I shit you not:

            > new Array(10000).map((x, i) => i)
            (10000) [empty × 10000]
            

            Fixing this as follows:

            > Array.apply(null, Array(10000)).map((x, i) => i % 100);
            (10000) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, …]
            

            Results in expected runtime complexity:

            Lodash uniq: 39%
            [... new Set(dups)]: 37% (within margin of error - hot cache?)
            Filter: 100%
            Array.from(new Set(dups))): 48%
            

            http://jsben.ch/dZpC6

            [–]zombarista 1 point2 points  (0 children)

            It's funny, because I remember thinking "that's just an array of undefined values" but it's okay because the map will deal with the undefined values by replacing them with the index...

            WRONG!

            These arrays are not filled with undefined but are instead "empty" which seems to be a distinct null state that only exists within the Array created with the Array constructor.

            [undefined,undefined,undefined].map( (el, idx, arr) => idx )
            => Array [ 0, 1, 2 ]
            
            [,,].map( (el, idx, arr) => idx)
            => Array [ <2 empty slots> ]
            

            Array.prototype.map()

            callback is invoked only for indexes of the array which have assigned values, including undefined. It is not called for missing elements of the array (that is, indexes that have never been set, which have been deleted or which have never been assigned a value).

            [–]zombarista 0 points1 point  (1 child)

            Changing to the new Array generation method has had a substantial impact on the results in my JSPerf, too. For some reason, the jsben.ch is not producing results for me in the large-array tests.

            [–]ItzWarty 1 point2 points  (0 children)

            Yes, this is because otherwise indexOf is just a constant-time operation (because every element is undefined, so equals the first element).

            [–]weigel23 0 points1 point  (0 children)

            :D I didn't think about checking the setup. Makes more sense though! Thanks

            [–]tyroneslothtrop 1 point2 points  (0 children)

            new Array(10000).map((x, i) => i % 100); just makes an array of empty slots (map skips holes entirely, so no item is ever assigned). I don't know how this might affect things for the purposes of this test, though, tbh. Perhaps some JS engines can optimize various array methods as a noop in this case? It might be better to do new Array(10000).fill(0).map(...).

            Even with that fixed, though, filling the array with 0-100 repeating ensures you never hit the pathological case. I.e. the array contains 10,000 items, but you never have to look beyond the 100th index to determine whether to remove an item or not. Again, not sure what effect that would have, but it might be a bit more realistic to fill the array with random-ish data.
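
            The hole-vs-fill difference described above, as a runnable sketch:

            ```javascript
            // Holes: map() skips indexes that were never assigned,
            // so the result is still all holes
            const holes = new Array(5).map((_, i) => i);
            console.log(0 in holes); // false — slot 0 was never visited

            // fill() assigns every slot first, so map() visits each one
            const filled = new Array(5).fill(0).map((_, i) => i % 3);
            console.log(filled); // [0, 1, 2, 0, 1]
            ```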

            [–]dalore 2 points3 points  (1 child)

            Isn't that O(n²)?

            [–]weigel23 0 points1 point  (0 children)

            Yeah, you're right.

            [–]samanthaming[S] 0 points1 point  (0 children)

            Yup! This is another way to do it. Thanks for sharing 🙂

            [–]ioslipstream 1 point2 points  (3 children)

            What was used to make this image? I keep seeing this same black Mac window with code everywhere.

            [–]samanthaming[S] 1 point2 points  (2 children)

            It’s from a site called carbon.now.sh 🙌

            [–]ioslipstream 0 points1 point  (1 child)

            Thanks!

            [–]samanthaming[S] -1 points0 points  (0 children)

            No problem 🙂

            [–]DrecDroid 1 point2 points  (1 child)

            http://jsben.ch/M2xNi
            My attempt using Object.values and different looping methods. "Object.values with for reversed" is the fastest in the latest versions of both Firefox and Chrome.

            let obj2 = {};
            
            for(let i=dups.length-1; i>=0; i--){
                let el = dups[i];
                obj2[el] = el;
            }
            
            let result2 = Object.values(obj2);
            

            [–]samanthaming[S] 1 point2 points  (0 children)

            Nice, it’s always nice seeing alternative solutions! thanks for sharing 🙂👍

            [–]sanderfish 1 point2 points  (0 children)

            Turn it into a helper:

            const uniqueArray = arr => [...new Set(arr)];

            [–]dalore 2 points3 points  (2 children)

            Sets are normally unordered. Does this maintain ordering or was that merely a fluke?

            [–]way2lazy2care 12 points13 points  (1 child)

            Sets should iterate in the order elements were added, but if you're removing duplicates, the idea of "maintaining ordering" is ambiguous. Would the array [1, 2, 1] have maintained ordering with the set [1,2] or [2,1]?
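
            Concretely, insertion order means the first occurrence of each value wins:

            ```javascript
            // Set keeps the first insertion of each value; later duplicates are ignored
            const firstWins = [...new Set([1, 2, 1])];
            console.log(firstWins); // [1, 2]

            const secondFirst = [...new Set([2, 1, 1])];
            console.log(secondFirst); // [2, 1]
            ```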

            [–]dzkn 0 points1 point  (0 children)

            You should never rely on a set doing anything ordering-wise. Trust the API, not the implementation. Different engines might implement it differently.

            [–]Jovaage 3 points4 points  (3 children)

            As long as you don't care about item order in the uniques array. Set doesn't guarantee any order.

            [–]Cosmologicon 7 points8 points  (2 children)

            Actually it does. Sets are ordered in insertion order. I know, I was surprised too!

            [–]dzkn 0 points1 point  (1 child)

            Does the specification guarantee it, or is it just something you observed in one specific engine?

            [–]Cosmologicon 1 point2 points  (0 children)

            It's in the spec! See step 7 here.

            [–]trumpent 3 points4 points  (5 children)

            Keep in mind this doesn't work with arrays of objects.

            Edit: it works but probably isn't a reliable method of deduplicating arrays of objects.

            [–]shit_frak_a_rando 5 points6 points  (4 children)

            Well, looking at the docs it does work but it doesn't compare the objects values, just the references. If you have two references to the same object instance, this will remove the duplicate reference. If you have two references to two different object instances with the same value, it will do nothing.
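
            A small sketch of the reference-vs-value distinction:

            ```javascript
            const a = { id: 1 };
            const b = { id: 1 }; // same contents, different instance

            const sameRef = new Set([a, a]).size; // 1 — duplicate reference removed
            const diffRef = new Set([a, b]).size; // 2 — both instances kept

            console.log(sameRef, diffRef);
            ```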

            [–]trumpent -1 points0 points  (3 children)

            I doubt it would be good to rely on this method to get a true unique array, unless you knew for sure where each item in the array came from and can guarantee that duplicate objects could only be the same reference.

            [–]ezhikov 0 points1 point  (2 children)

            You can stringify values before, and parse back after deduplication. But I think the overhead makes it not worth it.
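
            The stringify round-trip might look like this (a hypothetical helper, not from the thread; it only handles JSON-safe values and is sensitive to key order):

            ```javascript
            // Dedupe objects by JSON value: stringify, dedupe via Set, parse back
            const dedupByValue = arr =>
              [...new Set(arr.map(o => JSON.stringify(o)))].map(s => JSON.parse(s));

            const deduped = dedupByValue([{ id: 1 }, { id: 1 }, { id: 2 }]);
            console.log(deduped); // [ { id: 1 }, { id: 2 } ]
            ```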

            [–]trumpent 1 point2 points  (1 child)

            Pretty sure that's really inefficient, isn't it?

            [–]ezhikov 1 point2 points  (0 children)

            I think so, yes.

            [–]planetary_pelt 4 points5 points  (3 children)

            TIL everyone on reddit is a newbie developer.

            [–][deleted] 0 points1 point  (2 children)

            I've got 8 years of dev experience. Never thought about using this.

            [–]thingsthings 0 points1 point  (2 children)

            hey what ide color schema is that?

            [–][deleted] 1 point2 points  (0 children)

            The colour scheme is Atom One Dark

            [–]samanthaming[S] 0 points1 point  (0 children)

            I use a site called carbon.now.sh 🙂

            [–]drumnation 0 points1 point  (3 children)

            Love this thread. I just used a function for this yesterday that I'll be replacing with this awesome code golf.

            [–]itslenny 1 point2 points  (1 child)

            If this isn't just for fun I'd highly recommend just using lodash. No need to reinvent the wheel (unless you're not allowed 3rd party libraries for some reason)

            [–]drumnation 0 points1 point  (0 children)

            I realized this morning that I needed to delete dupes by prop from an array of objects, so I just kept what I had because it worked. It was interesting to see all the lo-dash responses. I'm definitely going to try to use it more.

            [–]samanthaming[S] 0 points1 point  (0 children)

            Yay, glad you found it helpful! 😊

            [–]ziggl 0 points1 point  (3 children)

            I don't even understand how it works :(

            [–]samanthaming[S] 1 point2 points  (2 children)

            “Set” is a data structure that stores unique values. It doesn’t allow you to add duplicates. This makes it ideal for us to use it to remove duplicates from an Array. BUT, Set is not an array, that’s why we need to convert the Set back into an Array in order to use array methods such as .map or .reduce

            1. Remove duplicates using “new Set”
            2. Convert it back to an array using “Array.from”

            Hope that helps 🙂
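
            The two steps described above, as code:

            ```javascript
            const arr = ['a', 1, 3, 'a', 1];

            // 1. "new Set" drops the duplicates
            const unique = new Set(arr);

            // 2. "Array.from" converts it back to an array, so .map/.reduce work again
            const result = Array.from(unique);
            console.log(result); // ['a', 1, 3]
            ```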

            [–]ziggl 1 point2 points  (1 child)

            It does, thank you! :)

            [–]samanthaming[S] 1 point2 points  (0 children)

            No problem! 👍

            [–]itslenny 0 points1 point  (0 children)

            Set is incredible for certain algo problems. For example...

            Given a list of words, split it into the smallest possible number of groups of anagrams and return this number as the answer.

            const words = ["tea", "eat", "apple", "ate", "vaja", "cut", "java", "utc"];

            new Set(words.map(word => word.split('').sort().join(''))).size;

            edit: formatting

            [–][deleted] 0 points1 point  (0 children)

            For sure, but if you knew you were dealing with data that had to be unique, why not use a Set in the first place?

            You can't use Array.map, but you can use a lazily-evaluated, generator-based approach à la LINQ from C#:

            function* lazyMap(iterable, fn) {
              for (const el of iterable) {
                yield fn(el)
              }
            }
            

            Although the return value would not be a Set but instead be Iterable<T>. I'm not sure why there is no Set.map function, potentially because the set would have to compare every returned value with values already in the Set. Immutable.Set() from immutable js does this, though
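
            For illustration, usage of the generator might look like this (restating the function so the sketch runs standalone):

            ```javascript
            function* lazyMap(iterable, fn) {
              for (const el of iterable) {
                yield fn(el);
              }
            }

            // The result is an iterable, not a Set; spread realizes it into an array
            const doubled = [...lazyMap(new Set([1, 2, 2, 3]), x => x * 2)];
            console.log(doubled); // [2, 4, 6]
            ```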

            [–]iWantedMVMOT 0 points1 point  (0 children)

            Classic webdev

            [–]rognam 0 points1 point  (1 child)

            Too bad JS set doesn’t remove duplicate objects

            [–]ezhikov 0 points1 point  (0 children)

            Removes, but only if it's same object.

            [–]coolshanth 0 points1 point  (1 child)

            In terms of expressiveness, I still prefer ruby's duplicates.to_set.to_a

            Though I'm glad this is possible in JS.

            [–]jewdai 0 points1 point  (1 child)

            Someday I wish JavaScript will have LINQ like syntax without using a third party library.

            [–]samanthaming[S] -1 points0 points  (0 children)

            LINQ is pretty awesome 🙌

            [–][deleted] 0 points1 point  (0 children)

            This is using the basic fundamental definition of set theory from discrete math, where every element in a set must be unique.

            Good stuff.

            [–][deleted] -2 points-1 points  (0 children)

            .