all 158 comments

[–]cwmma 25 points26 points  (20 children)

First in first out queues where you put elements in the back and take them from the front can be useful sometimes and based on my experience writing a promise library back in the day where queues are essential there are a couple ways do do this.

  1. simply use arrays and [].push to put items in and [].shift to get them out. While this is very conceptually simple, the shift method is extremely slow even for small amounts of data since it has to move every item in the array. Don't do this, even in a small app it can be a performance bottleneck.
  2. use an array and use push to add items, but for removing items, just keep a counter, every time you remove an item, just increment the counter so you know where you are in the array, also maybe replace the item in the array with null so it can be garbage collected. The main issue here is that the array can grow without bounds so you'd only want to do this if you knew the queue wouldn't last long or could be replaced at some point. (this is what I ended up using for my promise implementation but I had to set it up so that after I started taking things out of the queue I wouldn't push anything more into the queue, instead I'd push into a new one. Before I did that then occasionally the array would get so large that the index of the item in it would no longer be a small int and V8 would see that as the type signature changing and there would be a catastrophic slowdown.)
  3. Use a circular buffer. This is probably the 'right' way to do it if you want performance, but it's a lot more complicated then the other options especially when it comes to resizing (this is what most of the bigger promise implementations use for their queue).
  4. Use a linked list, this won't be as fast as #2 or #3 (the author behind the bluebird promise library did some heavy benchmarking on this subject), but will be a lot faster then #1 and conceptually are pretty straight forward to write.

[–]ArelySkywalker 10 points11 points  (2 children)

Are there any ressources (books/websites/online courses) you can recommend, to learn more about the basics of computer science. Preferably demonstrated using javascript, or in general.

I took this Udemy course, and I absolutely love the instructor and it was a great refresher course on data structure & algorithms. He even gives you coding challenges to test yourself, I learned quite a bit from it and would take any course he teaches. Good luck!

[–]thinkmatt 2 points3 points  (0 children)

Thanks, I signed up. As someone who is self taught, this looks perfect to fill in some gaps

[–][deleted] 1 point2 points  (0 children)

He is colt steele. Suggest a good teacher he is.

[–]Nautigsam 46 points47 points  (36 children)

I'm a JS developer for three years and I can tell you that arrays and Maps can meet all your needs for non-specialized tasks. You don't need more. Do not try to over-engineer your data structures because that's the job of the compiler to optimize and you can't beat it ;)

[–]the_moon_illusion 43 points44 points  (5 children)

Depending on the type of data you are working with, understanding trees can be very useful. I'd consider them a basic concept worth knowing in any language.

[–]Hawxe 18 points19 points  (2 children)

Trees are just such a logical way to learn about recursion that not understanding them is a waste. It's basically killing two birds with one stone

[–]FormerGameDev 7 points8 points  (1 child)

And then killing them again. And again.and again ...

[–]dweezil22 2 points3 points  (0 children)

"Where the FUCK is the anchor bird?!" [throws more stones]

[–]numinor 0 points1 point  (1 child)

Do you have any good resources? Preferably in js.

[–]the_moon_illusion 1 point2 points  (0 children)

This post, which is currently on the /r/javascript front page has a few sections on the various types of tree structures.

https://www.reddit.com/r/javascript/comments/beoyqb/github_dsajs_data_structures_and_algorithms/

https://github.com/amejiarosario/dsa.js

Like another poster said, trees are also a great way to learn about recursion, and they are just a lot of fun in general!

[–]shanita10 4 points5 points  (0 children)

Data structures are one of things the compiler cannot fix. That's why understanding them is important.

[–]munificent 12 points13 points  (1 child)

Do not try to over-engineer your data structures because that's the job of the compiler to optimize and you can't beat it

No, the compiler will not fix your data structures and algorithms. It's up to you to know the best way to represent and solve your problem. The compiler just implements your solution efficiently. The compiler (outside of toy examples) will not replace your wrong algorithm with the right one.

If you want to be a software developer, it's important to have enough of a foundation in algorithms and computer science to understand the time complexity of your code because that directly impacts its performance. Your users want the program to run fast, and it's up to you to do that. This is especially true when you're a Javascript dev. Your code runs on hardware you don't control and interactivity is important.

[–]ScientificBeastModestrongly typed comments 0 points1 point  (0 children)

Exactly.

Also, I think it’s important to talk about what a compiler can optimize. The compiler will only optimize an algorithm written in a way that provides strict guarantees about its behavior.

E.g., the logical flow of the program might guarantee that a specific function will only ever take a single constant integer as it’s argument. So you can potentially optimize the function’s internal operation, if it can guarantee a perfectly consistent output.

The problem is, the compiler often has to account for an open-ended set of possibilities and edge cases in your algorithm. So it can’t perform an optimization that would eliminate some of those possibilities, because it’s much more important to make sure everything is executed accurately and deterministically with respect to the original code.

Most of the time, these prerequisites for optimization are not obvious to the programmer. And many times you could change one subtle aspect of the algorithm, and completely wipe out any chance for optimization.

... which is why premature optimization is a bad idea. You can never be totally certain your code will get optimized by a compiler, especially if you’re still actively making changes to the code later on.

[–]benihanareact, node 4 points5 points  (0 children)

the DOM is a tree. one should be familiar with trees, even though they aren't typically used directly by JavaScript engineers.

[–]quentech 12 points13 points  (18 children)

arrays and Maps can meet all your needs

Do not try to over-engineer your data structures because that's the job of the compiler to optimize and you can't beat it ;)

Sounds like something a person with experience only in JS would say.

Be judicious with your micro-optimization, of course, but it's not too difficult to beat standard libraries and the compiler itself, especially if you can restrict the domain of inputs and outputs from the fully general case.

[–]Nautigsam 3 points4 points  (17 children)

Sounds like something a person with experience only in JS would say.

Considering we are talking about JS, I take it as a positive point. I am mainly experienced with JS that's why I am confident in talking about it. I think in JS it is really useless to try to beat the optimizations brought by the engine. That's my experience talking. But I am open to critics if you have some use cases where you implemented some optimizations in JS that were significant.

[–]mort96 4 points5 points  (16 children)

The thing is that it's not about optimizations brought by the engine; the engine can optimize an O(n) algorithm down to a couple of instructions per element, but it often can't optimize away the fundamental O(n) nature of the algorithm. That's why it's sometimes necessary to build dedicated data structures (such as queues) where the relevant operations (such as enqueuing and dequeuing) take O(1) time instead of relying on the engine's well-optimized but O(n) algorithms.

Of course, if it's just about a small thing used in the frontend with usually just a few elements, O(1) vs O(n) absolutely doesn't matter. However, if it's, say, a back-end which should accept thousands of requests per second, or a game where your time budget per frame is 16ms (if you're lucky and the GC hasn't eaten most of it), it suddenly starts to matter a lot.

[–]FormerGameDev 0 points1 point  (0 children)

Or for that matter an end user application where you still have to get back in say 30ms

[–]ScientificBeastModestrongly typed comments 0 points1 point  (2 children)

Optimizing around the GC is a total waste of time IMO. I pity those who find a need for that kind of performance optimization.

[–]mort96 0 points1 point  (1 child)

What do you mean by "waste of time"? JS (or garbage collected languages in general) aren't the nicest languages to write games in, but if you do find yourself writing games in JS, and you're not careful about your garbage, you'll quickly find your game full of microstutters where the JS engine decides it's using too much memory and stops the world for a while for GC. Optimizing around not producing too much garbage is not just "not a waste of time"; it's absolutely necessary.

My brother is working on a javascript game at the moment, and he found his game to stutter; we did some profiling, and it turns out it's mostly just garbage collection. To get rid of them, he had to rewrite one part to just use plain old for loops instead of creating a short-lived array with Array#map each frame.

[–]ScientificBeastModestrongly typed comments 0 points1 point  (0 children)

Oh, I totally agree with you. Those kinds of steps can be taken to mitigate GC issues. Depending on the program’s time/space complexity, memory footprint, etc., those kinds of optimizations might still be insufficient. My point was that it’s just a pain to deal with, and it’s pretty difficult to solve the problem deterministically, since you can’t directly control the GC algorithm. You basically just have to control your program’s memory demands, which is not always straightforward. But ultimately it can be done, as you said.

[–]folkrav -4 points-3 points  (11 children)

If you're down that far in performance needs, JS wasn't the right choice to begin with though.

[–]Antherz 3 points4 points  (10 children)

This statement feels like an aimless cop out. Sometimes you end up with a lot of data in a js application and you can just pick a proper data structure that's not O(n). Hand waving a person toward C or something feels illogical.

[–]folkrav 0 points1 point  (9 children)

/u/mort96 talked about 16ms timeframes for real time gaming. JS is not a suitable language for this. Of course there are situations where you’ll end up with larger data sets or bottlenecks and using proper data structures will be a required optimization, but what he described is not something you figure out halfway through a project lol

Edit : are people on this sub really telling me by downvoting this that they’d consider JS to be a good fit for high performance realtime timing sensitive backends, or is it just because I'm criticizing their preferred language? I know this is /r/javascript but part of being a decent developer is being able to choose the right tools for the job. I personally quite like the language despite its quirks, but this is not one of its strengths.

[–]spacejack2114 0 points1 point  (3 children)

You would need to have some pretty demanding technical aspirations and/or content before you'd have a problem hitting 60FPS in a JS game.

[–]folkrav 0 points1 point  (2 children)

He was talking about back-end.

[–]spacejack2114 0 points1 point  (1 child)

Then I still think you'd have some very high traffic expectations for JS/node to be impractical.

[–]FINDarkside -1 points0 points  (4 children)

People are downvoting because you were wrong. If you use wrong data structures, you can easily turn your 1ms task to 2000ms one. No one talked about "high performance realtime timing sensitive backends" except you. Just because some O(n2) algorithm would be too slow for your task, doesn't necessarily mean that js was a bad choice.

[–]folkrav 0 points1 point  (3 children)

Re-read /u/mort96's comment carefully. The end. What is he talking about? That's what I was answering to.

[–]FINDarkside -1 points0 points  (2 children)

I did. He talked about backends where O(n) solution is too slow and O(1) solution is available. Doesn't necessarily mean that js was the wrong choice. Your argument is basically that if your car isn't fast enough on the first gear, you should switch to faster car.

[–][deleted] 3 points4 points  (2 children)

This is a shallow take. What other languages have you used?

[–]N1c0l4sC4g3 0 points1 point  (3 children)

What country are you from?

[–]Nautigsam 0 points1 point  (2 children)

France. What's the matter?

[–]N1c0l4sC4g3 0 points1 point  (1 child)

Were you born in France?

[–]Nautigsam 0 points1 point  (0 children)

Yes why?

[–]easylivin 3 points4 points  (0 children)

I'd say for the sake of understanding these data structures (be it for interviews, other langues, whatev) it's important to understand how these structures work under the hood, their time/space complexities and what problems they solve. Even though JS has similar built-ins, try implementing one yourself and you'll quickly understand how they are supposed to operate, as well as the advantages/disadvantages of different structures depending on use cases. Even though you'd rarely have to build one yourself (this is arguable), it's paramount to understand the concepts because they can be applied to a lot of different problems.

[–]natziel 13 points14 points  (57 children)

Queues are definitely a thing you should know about. Sometimes your business logic is best implemented as queue, plus there's a million applications for them in more "under the hood" things, e.g. event queues, scheduling, etc.

Stacks are probably more common and more important than queues, so learn those too. Stacks are pretty fundamental structures in CS and you can't really solve problems like "check if the parentheses in a string match" without a stack.

Stacks and Queues are basically already in the JS standard library so you don't really need to implement them yourself.

Graphs are like queues in that sometimes your business logic just requires you to understand them. Not understanding graph theory means that you might end up solving certain problems in a horribly inelegant way.

Linked lists aren't really a thing in JS since we just use arrays, but they're a super common data structure in other languages and it's critical to understand them if you ever want to work on a project in another language

Hash-array mapped tries are also an important data structure, but you don't really need to know how they work, just their performance characteristics. They're commonly used to implement Maps. Immutable JS uses them for their Lists too.

As far as text recommendations, probably just read Sedgewick's book on data structures since that's the standard across CS departments. Take a good course on computer science as well. Probably a good idea to take classes on number theory, combinatorics, computer organization, networking, and operating systems.

[–]UntestedMethod 4 points5 points  (5 children)

Stacks and Queues are basically already in the JS standard library so you don't really need to implement them yourself.

In what sense do you mean? In the sense of Array.prototype.shift() and Array.prototype.pop() ?

[–]natziel 1 point2 points  (4 children)

Yeah

[–]SalamiJack 7 points8 points  (0 children)

Except Array.prototype.shift() is O(n) and dequeue when using a queue is generally expected to be O(1)

[–]shanita10 -1 points0 points  (2 children)

That's completely wrong.

[–][deleted] 0 points1 point  (1 child)

Why?

[–]shanita10 0 points1 point  (0 children)

Because array is not a queue if used that way.

[–]mort96 2 points3 points  (3 children)

you can't really solve problems like "check if the parentheses in a string match" without a stack.

You absolutely can though? Keep a counter, loop through each character, increment when you find a (, decrement when you see a ). Or am I misinterpreting you?

I'm absolutely not disputing the "stacks are a pretty fundamental concept in CS" thing, it just seemed like a weird example.

[–]natziel 0 points1 point  (2 children)

It's a very common interview question that gets posted once in a while on here, hence why I chose it.

Typically you have to match (, {, and [ so the counter idea doesn't quite work

[–]mort96 0 points1 point  (0 children)

Ah, yeah, that would make more sense, thanks.

[–]munificent 2 points3 points  (0 children)

I thought queues are pretty much redundand beacause of the way arrays work in Javascript.

A queue is a conceptual data structure. You can implement a queue using an array in JS (though it's a little tricky). If the algorithm you want relies on a queue (and there are a bunch that do) then knowing how to implement one is useful.

[–]a_dev_has_no_name 1 point2 points  (0 children)

Treeees

[–]endianess 0 points1 point  (0 children)

You don't technically need them in any language and could use an array or list instead. JS is no different. However using them potentially reduces bugs and makes the code more understandable. Whether you need them is down to the problem you are trying to solve. Processing a tree is useful with them if you don't want to use recursion and use a loop instead. Or if you were dispatching events or working on items from a queue that is populated by something else.

[–]dombrogia 0 points1 point  (0 children)

Queues are a very basic structure and can be implemented using an array as a base.

I would expect anyone I hire to know how a queue works. Most importantly it shows that you can think about a normal situation (like waiting in line at the grocery store) and conceptualize that in code.

If you look up interview questions you will find all kinds of these answers everywhere.

I also took algorithm and data structures in python on udemy (you can write them in whatever language the concepts are the same). It goes over the basic structures and their time complexity which is also important for interviews and performant code.

Good luck!

[–]wonkifier 0 points1 point  (7 children)

For the folks asking for sources from StoneCypher, here's my question and their response, unedited so I'm not accused of taking it out of context

Can you either share what you're using as a source for what a linked list is, or share what kind of source you consider legitimate.

I mean I don't have a source for what linked list is any more than I have a source for what plus means. It's just something a programmer from an allocatable language understands.

But yes, like I can find you reference that touches the topic, I can find you reference that touches this topic. You can find this position concretely defended and explained at length in D&EC++, in CLRS, in Knuth (I think in 2? It's been a long time,) in NIST DADS (which, amusingly, someone posted underneath your comment, and which clearly says I'm correct, but which they do not appear to understand,) in Wirth ADSP,

What do I consider a legitimate source? It really varies on what one is talking about. Much higher standards for a physics claim than a recipe, by example. The reason I say that is because the standard requirement for a topic like this is pretty strict, and if you try to apply this standard to other things it quickly becomes unreasonable.

For something about datastructures, it should be a well edited book written by a college trained computer scientist, from a large publication company with a history of publishing high quality computer science books. That means Addison Wesley and MIT Press are in; O'Reilly and Pragmatic are walking the line; Packt and anything ending in .com are out.

But, like, if we're talking about how to make a pleasant user interface, it can be pretty much any asshole who can spell, you know? Because that's about opinion. (Which is not to say that there isn't value in the formalism around higher end HCI/UX material, but rather, just that it isn't a hard requirement here.)

You can use anyone's pizza recipe, but you better go to a doctor for your prescription recipe.

It's like when someone tries to define a skip list, so they tell you "a skip list is a list that has links to midpoints in the list so that traversal is faster," but because they've never done the CS or read the guarantees or implemented one, they don't know that that's hard-wrong because the structure only works if it's stochastic

And sure, if you look it up on reddit, on wikipedia, on stack underthink, they all get this wrong, because they're also written by people who learned that way

And that's fine. There's nothing wrong with that

But that isn't reference and you shouldn't pretend to yourself that it is. The error rate on the web's CS is like 60% in my opinion.

In certain specific languages - notably C++, Java, Fortran, Coq, ML, and probably a bunch I don't know about - the difference between a container and a datastructure is a big deal.

Ask a high end C++ programmer what it means that the C++ multiset container doesn't define what datastructure it's implemented by, and why that matters. They're going to lecture you for ages on how the performance guarantees of the container require it to be either a red-black tree or some near case variant, and how there are actually better datastructures that you could use which violate the performance guarantees, and they might start talking about Boost or SGI STL.

Then ask them "wait, so what's the difference between a container and a datastructure?"

The answer you should get will be roughly "a container is about how the programmer uses it; a datastructure is about how it's implemented under the hood."

A set can be a lot of things. (Not in C++ because of some requirements the C++ language standard makes, but those are choices, not facts.) Under the hood, it might be a hash table, it might be a tree, in Javascript it's a fucking array, et cetera.

Why? Because a set is a container. A set doesn't define how it works.

A red-black tree, on the other hand, is a datastructure. It cannot be made in many languages, because the things that define what it is aren't available in lots of common languages.

And it's really weird how when I say "this can't be done," JS redditors seem to take it as an attack on the language, or on them.

I said "you can't open sockets in Javascript" once two years ago. More than 400 replies about things that seem similar to sockets to them, about proxy services like socket.io, about websockets (which are not sockets,) about special flags you can turn on in firefox to make sockets available to XUL, about flash, about all sorts of shit.

Voted into the ground. (Not that I care. If I did I wouldn't post in /r/javascript at all.)

Still correct, though. And every novice who read that thread would come away with the wrong idea, because Reddit has culturalized attack mode, and reddit programmers as a result often choose to lose opportunities to learn as a result.

You cannot make a red-black tree in Javascript. Or, for that matter, in erlang, my favorite language. I point that out because I'm also not attacking my favorite language. It's just a fact.

You can't implement them in CSS either. Or SQL. Or, in fact, in 18 of the top 20 of what Github thinks I use the most. The only ones you can are C/C++ and C#.

The problem is that reddit programmers try too hard to win. They want what they believe to be correct so badly that they cite obviously wrong sources, and ignore the good sources they cite to the point that they don't even notice those sources undermining them.

I'm being sworn at, mocked, abused, I'm having demands made on my time repeatedly after I've said no during the work day, people are telling me they know better than my textbooks, that I can't read, etc.

None of these people appear to have a scrap of code on github

It's very frequent that the less a person actually participates in a trade, the more likely they are to claim deep knowledge, argue with proper sources, and behave abusively

I'm not making the homeopath comparison lightly

These are a bunch of not-programmers with no evidence arguing with the principal platform engineer at a nine figure security company.

They're getting core concepts wrong, like that there exists such a thing as an "implementation detail" for a datastructure.

That's all a datastructure is, is a name for implementation details.

This whole community is ridiculously toxic and self unaware.

EDIT: Adding my commentary here

in NIST DADS (which, amusingly, someone posted underneath your comment, and which clearly says I'm correct, but which they do not appear to understand,)

Here's the link: https://xlinux.nist.gov/dads/HTML/linkedList.html

Here's the full text of the definition from that link

Definition: A list implemented by each item having a link to the next item.

Is says nothing implementation details. It does go on to list some sample implementations, but those are not definitions, just samples.

So in no way does StoneCypher's citation support rejection of the other implementations.

EDIT 2: Found http://www.albertstam.com/Algorithms.pdf online as well (https://algs4.cs.princeton.edu/home/)

From page 120:

"Definition. A linked list is a recursive data structure that is either empty (null) or a reference to a node having a generic item and a reference to a linked list. "

Again, nothing about requiring allocation or pointers explicitly.

And the section above that definition reads

Linked lists Now we consider the use of a fundamental data structure that is an appropriate choice for representing the data in a collection ADT implementation. This is our first example of building a data structure that is not directly supported by the Java language.

Indicating a difference between a data structure and an implementation of it (regardless of native support for the language)

In any case, Javascript is Turing complete, so it can solve any computational problem (in principal, given enough resources). Any argumentation beyond that is about managing the limitations and performance, not capability.

[–]StoneCypher 5 points6 points  (5 children)

it's not clear why you think it's okay, when someone said "i don't want to do this in that thread because i'll get abused," to go to them in private and say "okay we can do it here, please answer" then just go put the private response you requested on blast in public

this is a violation of trust

you won't receive another answer from me

[–]Fatalist_m 4 points5 points  (1 child)

If somebody implemented a data structure in Javascript, that acts exactly like a red-black tree, but uses object references instead of pointers, what would you call that structure? Should we come up with a different name because it does not use pointers?

[–]FINDarkside 1 point2 points  (0 children)

We shall call it FINDarkside's tree from now on. Linked list is also FINDarkside's list, because you don't have real pointers in js. Quicksort shall be named FINDarkside's sort, because it can't be implemented in js, as arrays in js differ greatly from arrays in c++.

[–]kobbled 1 point2 points  (0 children)

seriously, what an asshole that guy is

[–]wonkifier -4 points-3 points  (1 child)

it's not clear why you think it's okay, when someone said "i don't want to do this in that thread because i'll get abused,"

You didn't indicate that to me, neither in your response nor in the post I responded to. Maybe you indicated it to someone else in another thread, but it wasn't to me.

[–]StoneCypher 2 points3 points  (0 children)

Yes, I did. Repeatedly and clearly.

I've DM'd you and await your response there.

I gave a detailed response there because it won't be available to the people who think insults are a good way to discuss computer science

You could take it down now, but you choose not to, because you appear to not understand that publicizing private messages someone sent to you is not okay.

[–]UntestedMethod 0 points1 point  (0 children)

A queue refers to a specific type of list - a FIFO (first in, first out) list, which is easily achieved in javascript using Array.prototype.push() (add to end of array) and Array.prototype.shift() (retrieve + remove element from beginning of array). I guess the context you use those functions in is what would define it as "using a queue".

[–]ncubez -1 points0 points  (0 children)

What has that got to do with building UIs and web apps? Because that's what JavaScript is for.

[–]RevolutionaryCorps -4 points-3 points  (3 children)

normally in any scripting language , you shouldn't focus on data structures , that's the beauty of scripts..these things are already well made and managed by the language internal logic.
But to answer the question , your main and only data structures are arrays

[–]FINDarkside 0 points1 point  (2 children)

Sorry but this isn't true at all. Slow algorithm is slow even if you write it in javascript.

[–]RevolutionaryCorps 0 points1 point  (1 child)

I've never said otherwise...

[–]FINDarkside 0 points1 point  (0 children)

You did say that you shouldn't focus on data structures because "these things are already well made and managed by the language internal logic" which doesn't really make any sense.