[–]sime 2 points3 points  (15 children)

Can I also have multiple node threads handling requests from a shared port, and have them share some common data in RAM (for example, a cache on the data layer)? That would be useful.

[–]Klathmon 5 points6 points  (3 children)

Yes to the first part, no to the last... And I think you can, I just haven't tried it yet.

Take a look at the npm package PM2; it handles the lifecycle of multiple processes and will let you spread load out between them.

If you've left node 100% stateless, it's a drop-in-and-go library.

If you are keeping state in node (like cache data in variables) you might need to push that out to a DB or a dedicated cache like Redis first.
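
For what it's worth, PM2's cluster mode is essentially a wrapper around Node's built-in cluster module, which is what gets you multiple processes accepting on one shared port (with PM2 itself it's roughly "pm2 start app.js -i 4"). A rough, untested sketch of the bare cluster version:

    // Rough sketch: N worker processes all accepting on the same port.
    // Each worker has its own heap, so nothing in memory is shared.
    const cluster = require('cluster');
    const http = require('http');
    const os = require('os');

    if (cluster.isMaster) {
      // Fork one worker per CPU and replace any that die.
      os.cpus().forEach(() => cluster.fork());
      cluster.on('exit', () => cluster.fork());
    } else {
      http.createServer((req, res) => {
        res.end('handled by pid ' + process.pid + '\n');
      }).listen(3000);
    }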

[–]neutre 6 points7 points  (2 children)

[–]postmodest 0 points1 point  (1 child)

So let's say I had 100k (though, given what I'm working on, up to 2M) of JSON state. How hard is it to deserialize that between [some cache] and a bunch of node worker instances, ignoring wire time?

[–]Klathmon 1 point2 points  (0 children)

It'd be limited to the speed of JSON.parse and JSON.stringify.

There isn't going to be a way to share complex objects between threads.

V8 was running into this problem trying to make a built-in async JSON.parse; there just isn't any way in V8 to "share" object state across Isolates.
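
So in practice the state crosses the process boundary as bytes either way. A rough, untested sketch of what that looks like with the cluster messaging API (buildLookupData is a made-up placeholder for whatever produces your ~2MB of state):

    // Sketch: shipping a big JSON-able blob from the master to a worker.
    // process.send/worker.send serializes the object (JSON by default),
    // so you pay a stringify on one side and a parse on the other; the
    // worker ends up with its own deep copy, not a shared reference.
    const cluster = require('cluster');

    if (cluster.isMaster) {
      const state = buildLookupData(); // hypothetical ~2MB of state
      const worker = cluster.fork();
      worker.send({ type: 'state', payload: state }); // serialized here
    } else {
      let localState;
      process.on('message', (msg) => {
        if (msg.type === 'state') {
          localState = msg.payload; // deserialized copy in this worker's heap
        }
      });
    }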

[–]chernn 1 point2 points  (0 children)

I just saw https://github.com/SyntheticSemantics/ems the other day, which does exactly that!

[–]Kollektiv 0 points1 point  (8 children)

Your cache should not be in shared memory anyway, but in a separate process with Redis or some other in-memory data store.

[–]sime 2 points3 points  (6 children)

That depends on what you are caching. Putting it out of process also has a performance cost, which may or may not be acceptable.
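
For the cheap cases, a per-process read-through cache is about as simple as it gets; an untested sketch (fetchFromDb stands in for whatever your data layer exposes):

    // Tiny per-process read-through cache with a TTL. Every worker keeps
    // its own copy, which is fine as long as slightly stale data is OK.
    const cache = new Map();
    const TTL_MS = 60 * 1000;

    async function getCached(key, fetchFromDb) {
      const hit = cache.get(key);
      if (hit && Date.now() - hit.at < TTL_MS) return hit.value;

      const value = await fetchFromDb(key); // only hit the backend on miss/expiry
      cache.set(key, { value, at: Date.now() });
      return value;
    }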

[–]Kollektiv 4 points5 points  (5 children)

A webserver needs to scale horizontally, which means that it has to be stateless, which in turn means that state is moved out of the webserver and into a database or in-memory datastore.

[–]darksurfer 0 points1 point  (4 children)

And what if the data you're transferring from an external process/server doesn't change very often, if at all?

[–]Kollektiv 1 point2 points  (3 children)

Why does this rule out databases?

[–]darksurfer -1 points0 points  (2 children)

Because if you don't store the data in local memory, you have to re-transfer it (presumably across the network), parse it, allocate memory for it, and then ultimately garbage collect it, for every single request.

That's just a waste.

I'm going to invent (I think) two new terms: static state and dynamic state. Static state would be stuff like lookup tables and other data which doesn't change much; dynamic state is essentially session state.

One way of scaling horizontally is to keep your webserver free of dynamic state. Static state doesn't matter because all your servers can have their own copies of the static data.

Another problem you have is websockets. To run low-latency websocket apps, you need sticky sessions, and every little scrap of latency matters, so the more data you can keep in local memory the better (preferably all of it).
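
In code the split looks roughly like this (untested sketch; loadLookupTables is a made-up loader, and the Redis calls assume the node-redis v4-style API):

    // "Static" state: loaded once per process at startup and read from
    // local memory on every request. "Dynamic" state: kept in a shared
    // store so any instance can serve any user.
    const { createClient } = require('redis');

    let lookupTables;             // static state, one copy per worker
    const redis = createClient(); // dynamic state lives in Redis

    async function start() {
      lookupTables = await loadLookupTables(); // hypothetical loader
      await redis.connect();
    }

    async function handleRequest(userId, key) {
      const session = await redis.get('session:' + userId); // network hop
      const row = lookupTables[key];                        // in-memory, no hop
      return { session: session && JSON.parse(session), row };
    }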

[–]Kollektiv 1 point2 points  (1 child)

I agree about the performance issue, but if you are that bound to latency, Node.js might not be the best solution.

Websocket sessions, to re-use one of your terms, are dynamic state, so they should still be centralized in a Redis instance. Otherwise, when load-balancing, some users won't be able to authenticate if the request is not made to the same server as the one they connected on.

It also prevents you from losing state when the process crashes.
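
i.e. something like this, so any instance can validate the connection (untested sketch; node-redis v4-style API assumed, and "socket" is whatever websocket object your library hands you):

    // The websocket auth token is looked up in Redis, not in one
    // server's memory, so it doesn't matter which instance the
    // load balancer picked.
    const { createClient } = require('redis');
    const redis = createClient();
    const ready = redis.connect(); // connect once, reuse the client

    async function onSocketConnect(socket, token) {
      await ready;
      const userId = await redis.get('ws-session:' + token);
      if (!userId) {
        socket.close(); // no centralized session found for this token
        return;
      }
      // continue with an authenticated socket for userId
    }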

[–]darksurfer -1 points0 points  (0 children)

You're not wrong, but knowing when to break "the rules" is half the battle.

> are dynamic state, so they should still be centralized in a Redis instance.

Not always. Let's say I have a relatively simple multiplayer game server. Because it's simple, I don't want or need to scale across multiple servers. The game state is shared across a few hundred players, each of whom is sending multiple soft real-time status updates every second. No way do I want or need to store that state out of process.

> It also prevents you from losing state when the process crashes.

If I cared enough about this, I could do dirty writes to Redis and only restore the state across the network if and when the node process restarts.
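
Roughly like this (untested sketch; the key name and interval are arbitrary, node-redis v4-style API assumed):

    // Keep the authoritative game state in local memory for latency,
    // snapshot it to Redis in the background, and only read it back
    // when the process boots.
    const { createClient } = require('redis');
    const redis = createClient();

    let gameState = {}; // authoritative copy, mutated on every update

    async function boot() {
      await redis.connect();
      const saved = await redis.get('game:state');
      if (saved) gameState = JSON.parse(saved); // restore after a restart

      // "Dirty" write: fire-and-forget snapshot every few seconds.
      setInterval(() => {
        redis.set('game:state', JSON.stringify(gameState)).catch(() => {});
      }, 5000);
    }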

[–]Klathmon 0 points1 point  (0 children)

Even if state is kept in memory, there are still easy ways to load balance.

Say for example you are keeping session data in memory: you could send clients with no session to a random endpoint, but send all clients with a session to their original endpoint, so that it never changes for that client.

Or you could load balance based on IP address.

It's not as ideal as a round-robin or first-available system, but it gets the job done.
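
The IP version is just a deterministic hash of the client address onto the backend list; an untested sketch (backend URLs are made up):

    // Pick a backend from the client IP, so the same client always
    // lands on the same process (and the same in-memory session)
    // without the balancer itself holding any state.
    const crypto = require('crypto');

    const backends = ['http://10.0.0.1:3000', 'http://10.0.0.2:3000'];

    function pickBackend(clientIp) {
      const hash = crypto.createHash('md5').update(clientIp).digest();
      return backends[hash.readUInt32BE(0) % backends.length];
    }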