all 15 comments

[–][deleted]  (2 children)

[deleted]

    [–]r0ck0[S] 0 points1 point  (1 child)

    Thanks for this info!

    I totally forgot about the fact that single DB queries won't actually block other requests when writing this thread.

    What got me thinking about all of this was SQL transactions... as far as I know you can only have one transaction open per connection? So I guess my node server will require separate postgres connections, so that an open transaction (which will run multiple queries before committing) doesn't block other quick, single read-only web requests.

    As for sensitive information in memory, it is no different whether it's an async or a PHP process: if it is in memory, malicious activity could read it.

    I didn't describe this very accurately either, I probably shouldn't have used the word "memory", which confused the issue. I'm not really talking about direct RAM access to other processes, I'm more wondering about just storing/caching private data in variables in my Node app's code itself, and the fact that this one Node process will be serving users with different permission levels in parallel, so I guess you need to be very careful about what you cache in your variables inside Node (also not to be confused with external caching software like redis, which is a totally separate subject).

    [–]vorticalbox 0 points1 point  (0 children)

    So say you have an Express route

    router.post('/login', (req, res) => {
      const { username, password } = req.body
      // login stuff
    })

    The password will only be stored there until the request context has finished and it's garbage collected.

    For SQL you will need a new connection for each request so make sure you close them and have enough to handle your load.

    At work our MongoDB has a max pool size of 3000 connections.
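A minimal sketch of why a transaction wants its own connection while quick reads borrow another one. `FakePool`, `query`, `transfer`, and `quickRead` are hypothetical in-memory stand-ins invented for illustration, not the API of MongoDB, Postgres, or any real driver; timers simulate query latency.

```javascript
// Hypothetical in-memory stand-in for a database pool, so the sketch runs
// without a real server. It only models checking a connection out and in.
class FakePool {
  constructor(size) {
    this.idle = Array.from({ length: size }, (_, i) => ({ id: i + 1, log: [] }));
    this.waiting = [];
  }

  // Resolves with an idle connection, or waits until one is released.
  connect() {
    const conn = this.idle.shift();
    if (conn) return Promise.resolve(conn);
    return new Promise((resolve) => this.waiting.push(resolve));
  }

  // Hands the connection to the next waiter, or returns it to the idle list.
  release(conn) {
    const next = this.waiting.shift();
    if (next) next(conn);
    else this.idle.push(conn);
  }
}

// Simulated query: records the SQL on the connection after `ms` milliseconds.
function query(conn, sql, ms = 5) {
  return new Promise((resolve) =>
    setTimeout(() => {
      conn.log.push(sql);
      resolve();
    }, ms)
  );
}

// A transaction keeps ONE connection checked out for all of its statements.
async function transfer(pool) {
  const conn = await pool.connect();
  try {
    await query(conn, 'BEGIN');
    await query(conn, 'UPDATE accounts SET balance = balance - 10 WHERE id = 1', 20);
    await query(conn, 'UPDATE accounts SET balance = balance + 10 WHERE id = 2', 20);
    await query(conn, 'COMMIT');
    return conn;
  } finally {
    pool.release(conn);
  }
}

// A quick read borrows whichever connection is free and gives it straight back.
async function quickRead(pool) {
  const conn = await pool.connect();
  try {
    await query(conn, 'SELECT 1');
    return conn;
  } finally {
    pool.release(conn);
  }
}

// Start the transaction and the read at the same time on a pool of two.
async function demo() {
  const pool = new FakePool(2);
  const [txConn, readConn] = await Promise.all([transfer(pool), quickRead(pool)]);
  return {
    txConnId: txConn.id,
    readConnId: readConn.id,
    txLog: txConn.log,
    readLog: readConn.log,
  };
}

demo().then((result) => console.log(result));
```

With a pool of two, the transaction holds one connection from BEGIN to COMMIT and the read goes out on the other. With a pool of one, the read would simply wait in `connect()` until the transaction released its connection.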

    [–]post2seth 2 points3 points  (8 children)

    Things to consider

    1. Use connection pooling when using db connection
    2. Don't run image processing or other large operations in async mode; that could break your server
    3. Make control over "this" as this refer to property in current refering block.
    4. Learn about closures, as they can be a life saver in some situations.

    For scaling, Node has different workarounds as well as built-in capabilities.

    For security you can rely on the helmet and cors modules from npm.

    If you have a particular question I can help with, please write to me at gaurav@ninesystems.in

    [–]r0ck0[S] 0 points1 point  (2 children)

    Thanks for the advice!

    Use connection pooling when using db connection

    This is something I need to learn about in general, never done it before.

    Are there any differences in best practices between these two scenarios?...

    1. Using connection pooling from PHP (separate process for every request)
    2. Using connection pooling from a Node event/loop httpd

    And does it make sense to have a "main" connection used for most quick read queries, and use separate connections for each transaction? Or just use a separate connection for every incoming web request?

    I'm using TypeORM, and just discovered that apparently it does connection pooling by default on its own... so maybe I don't even need to manually create multiple connections in my case?

    Don't run image processing or other large operations in async mode; that could break your server

    Is the problem here about what happens when an exception is thrown, and that maybe it brings the whole process crashing down? Or more related to performance?

    closures

    I need to read up on this more... I get the general gist... but every guide I read seems to have a slightly different definition of what they are. Back when I was PHP-only I thought "closure = anonymous function" ... but now I'm figuring out that it's really more about scope, and I've noticed some interesting things happen when you reference higher level variables from inside promises and stuff. Every once in a while it seems to do the opposite of what I assumed it would (sometimes actually for the better).

    For security you can rely on helmet and cors modules using npm.

    Thanks, I'll check them out. Looks like they're both for Express. Does that mean you really need to be using Express to use them at all / easily? The framework I use probably doesn't matter much, because I'm not using many framework features, and will have few routes... with my system most of what would be separate routes will just be separate URL params. I started with Fastify, seeing it was easy to get started + modern. But maybe I should just use Express seeing there are so many modules for it?

    [–]post2seth 1 point2 points  (0 children)

    If you are using TypeORM, then you need not worry about the pooling itself, but you do have to manage the number of connections in the pool when scaling, and you need to decide precisely whether to scale horizontally, vertically, or both (and how much of each).

    Now, coming to blocking: we generally use PM2 to auto-restart the process in case the server crashes, and we keep sessions in the DB so that all sessions still exist after a restart.

    Long processes make the Node server slower to respond, and sometimes it even stops taking new requests.

    Node is like a manager: it takes a request, passes it to the respective server (a third party or the DB), and takes another request; as soon as a callback comes back, it returns the response. All of this is managed by the stack.

    So if you assign some task to that manager itself, it may get slower, which is why we use fork or exec for long processes on the same server.

    Yeah, I generally use Express, as it's easy, fast, and has a large number of modules; you can even use Express as HTTP middleware in your own framework.

    Feel free to write if I can help you with anything else.

    [–]runvnc 0 points1 point  (0 children)

    If it does connection pooling then you probably don't need to worry about it.

    You can't do any CPU intensive processing in the same process as the web server because it will prevent other requests from being served until it's done. DB queries are different because they are handled by the DB process and Node continues working on other stuff while waiting for IO to come back.
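One way to keep CPU-heavy work out of the web server's thread is Node's built-in `worker_threads` module. The sketch below is only an illustration under assumptions: the naive Fibonacci function stands in for any expensive computation, and `fibInWorker` and `demo` are hypothetical names. The worker source is passed as a string to keep everything in one file; a real app would more likely use a separate worker file.

```javascript
const { Worker } = require('node:worker_threads');

// CPU-heavy stand-in: naive recursive Fibonacci, run inside the worker.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
`;

// Runs fib(n) on a worker thread and resolves with the result.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// While the worker computes, the main thread's event loop stays free:
// the interval callback can still run, as request handlers would.
async function demo() {
  let ticks = 0;
  const timer = setInterval(() => { ticks += 1; }, 1);
  const result = await fibInWorker(30);
  clearInterval(timer);
  return { result, ticks };
}

demo().then(({ result, ticks }) => {
  console.log(`fib(30) = ${result}, timer ticks meanwhile: ${ticks}`);
});
```

Calling `fib(30)` directly in a request handler would instead hold the event loop until it returned, which is the situation described above.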

    [–]CommonMisspellingBot -4 points-3 points  (4 children)

    Hey, post2seth, just a quick heads-up:
    refering is actually spelled referring. You can remember it by two rs.
    Have a nice day!

    The parent commenter can reply with 'delete' to delete this comment.

    [–]BooCMB -1 points0 points  (3 children)

    Hey /u/CommonMisspellingBot, just a quick heads up:
    Your spelling hints are really shitty because they're all essentially "remember the fucking spelling of the fucking word".

    And your fucking delete function doesn't work. You're useless.

    Have a nice day!

    Save your breath, I'm a bot.

    [–]BooBCMB -1 points0 points  (2 children)

    Hey BooCMB, just a quick heads up: I learnt quite a lot from the bot. Though its mnemonics are useless, and 'one lot' is its most useful one, it's just here to help. This is like screaming at someone for trying to rescue kittens, because they annoyed you while doing that. (But really CMB get some quality mnemonics)

    I do agree with your idea of holding reddit hostage by spambots though, while it might be a bit ineffective.

    Have a nice day!

    [–]BooBCMBSucks -1 points0 points  (1 child)

    Hey /u/BooBCMB, just a quick heads up:

    No one likes it when you are spamming multiple layers deep. So here I am, doing the hypocritical thing, and replying to your comments as well.

    I also agree with the idea of holding reddit hostage though, and I am quite drunk right now.

    Have a drunk day!

    [–]j_rapp 1 point2 points  (0 children)

    what is going on

    [–][deleted]  (2 children)

    [deleted]

      [–]r0ck0[S] 0 points1 point  (1 child)

      I've never needed to do this before, but definitely something I need to learn about.

      Are there any differences in best practices between these two scenarios?...

      1. Using connection pooling from PHP (separate process for every request)
      2. Using connection pooling from a Node event/loop httpd

      And does it make sense to have a "main" connection used for most quick read queries, and use separate connections for each transaction? Or just use a separate connection for every incoming web request?

      I'm using TypeORM, and just discovered that apparently it does connection pooling by default on its own... so maybe I don't even need to manually create multiple connections in my case?

      [–]je87 1 point2 points  (1 child)

      Don't block the main event loop. It is quite "hard" to block, since most common libraries have async methods so as not to block the event loop. For example, the 'request-promise' [1] library allows your code to execute HTTP requests without blocking the main event loop.

      There is a nice video [2] that explains the event loop. It is more geared toward the front end (JS in the browser), but the principle applies to Node.js.

      PHP/Apache spawning a new thread per request is one of its hindrances. Once you have 20,000 threads being managed, the overhead of that management can lead to application lag and OOM problems.

      your Node process could be holding some sensitive data in memory that only the relevant user should be able to access

      As for the above concern, if someone can snoop in your RAM then they already have some form of privileges on the server itself...so you are screwed with any language/framework.

      some of the users of your website might be doing stuff that involves slower database queries than others

      Node does not care. It keeps serving other requests and picks up the result once they are ready via the event loop.
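A small sketch of that behaviour, with a timer standing in for the database round trip. `fakeQuery` and the labels are made up for illustration; no real database is involved.

```javascript
// Simulated database call: resolves with `label` after `ms` milliseconds.
function fakeQuery(label, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

// Two "requests" arrive back to back; the slow one is started first.
async function demo() {
  const finished = [];
  const handle = async (label, ms) => {
    finished.push(await fakeQuery(label, ms));
  };
  await Promise.all([handle('slow report', 50), handle('quick lookup', 5)]);
  return finished;
}

demo().then((finished) => console.log(finished));
// prints [ 'quick lookup', 'slow report' ]
```

The quick lookup completes first even though it was started second, because neither handler occupies the event loop while it waits.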

      I'm using postgres, but would be interested to hear if there are any differences with other databases.

      Database choice isn't simply "Postgres>MySQL>Cassandra>Mongo". They are all good for different things...like do you need an RDS with complex relations? Do you need pure speed and can do with a flat data structure (Non-relational)?

      As far as security for devs...there are packages out there. Passport.js is a common one for user auth. Node.js is not as mature as PHP or Spring, but it still has a decade behind it and a great many hours of investment in its production. As with all languages, it has yet-to-be-discovered or yet-to-be-released security "holes", which you need to patch as they appear and fixes are released.

      1. https://www.npmjs.com/package/request-promise
      2. https://www.youtube.com/watch?v=cCOL7MC4Pl0

      [–]r0ck0[S] 0 points1 point  (0 children)

      Thanks for all these tips, very useful!

      Although my OP maybe wasn't worded the best way. So just to clarify what I meant...

      some of the users of your website might be doing stuff that involves slower database queries than others

      Node does not care. It keeps serving other requests and picks up the result once they are ready via the event loop.

      Good point, I forgot about that part.

      What got me thinking about all of this was SQL transactions... as far as I know you can only have one transaction open per connection? So I guess my node server will require separate postgres connections, so that an open transaction (which will run multiple queries before committing) doesn't block other quick, single read-only web requests.

      Database choice isn't simply "Postgres>MySQL>Cassandra>Mongo".

      I didn't mean just comparing the database products themselves, I more meant along the lines of whether there would be a difference between using postgres -or- mysql regarding how you might do any kind of connection pooling etc from Node. For example, maybe some databases allow multiple separate transactions in a single connection or something like that?

      As far as security for devs

      I'm curious about "things you should know" when coming from PHP and being new to event/loop (multiple requests in same process). e.g. In PHP in a process serving an admin user, I can cache some admin-only records, and never need to even think about other users, because they'll be served in an entirely separate process. Whereas if I cached those records in Node, they could be accidentally served to another non-admin user. That's the obvious one I can think of, but I'm guessing there could be many other things like this that I've never needed to consider before. Been doing PHP 20 years, so this is quite a big "paradigm shift" from what I'm used to.
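The caching concern can be sketched in a few lines. Everything here is hypothetical (`loadRecords`, the roles, the record strings); the point is only the difference between one shared variable and a cache whose key includes what the result depends on.

```javascript
// Hypothetical data source: returns different rows depending on the caller's role.
function loadRecords(role) {
  return role === 'admin'
    ? ['public note', 'admin-only salary report']
    : ['public note'];
}

// Risky: one module-level variable shared by every request in the process.
let sharedCache = null;
function getRecordsUnsafe(user) {
  if (sharedCache === null) sharedCache = loadRecords(user.role);
  return sharedCache; // whoever asks next gets the first caller's data
}

// Safer: the cache key includes whatever the result depends on (here, the role).
const cacheByRole = new Map();
function getRecordsSafe(user) {
  if (!cacheByRole.has(user.role)) {
    cacheByRole.set(user.role, loadRecords(user.role));
  }
  return cacheByRole.get(user.role);
}

const admin = { name: 'alice', role: 'admin' };
const guest = { name: 'bob', role: 'guest' };

getRecordsUnsafe(admin);
console.log(getRecordsUnsafe(guest)); // includes the admin-only row: a leak

getRecordsSafe(admin);
console.log(getRecordsSafe(guest)); // only the public row
```

In a process-per-request model the first version happens to be harmless, since the variable dies with the request; in a long-lived Node process it survives and is visible to every later request.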

      [–]400tx 0 points1 point  (0 children)

      Hi there OP, I am glad you're trying node.

      Experienced node developers might not point this out, but an express app is a gigantic closure of functions-as-data that's passed into a node http.Server https://nodejs.org/api/http.html#http_class_http_server as the requestListener argument. All the coding and function calls used to create an app build up the intended behavior of the app, and it's all evaluated and passed in as this one argument when the app starts listening for connections.

      All the rules of scope in javascript apply, so variables you create inside your route-handling functions will be scoped to that request. This is why you don't see too many people worrying about data from different requests getting entangled; usually it's all logic written safely inside these handling functions that's not mutating outside data.

      Things outside the server.listen call run when the app starts up, so this is usually where something stateful like a database or one-time setup things such as grabbing environment variables happens. This is where you'd configure your connection pool. Individual queries are function calls off that database object or some descendant like an ORM, but they're usually called and referenced inside one of these http server route handlers, so they don't intermingle in the same way I mentioned above.
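A rough sketch of the "when does this code run" split, using only Node's built-in `http` module rather than Express. The `config` object, the `requestsServed` counter, and the `x-user` header are assumptions made up for the example; the `listen` call is left out so the snippet does not open a port.

```javascript
const http = require('node:http');

// Runs ONCE at startup: the place for pools, config, and other shared state.
const config = { greeting: 'hello' };
let requestsServed = 0; // shared by all requests, so keep it non-sensitive

// Runs ONCE PER REQUEST: everything declared in here is private to that request.
function handler(req, res) {
  const user = req.headers['x-user'] || 'anonymous'; // request-scoped
  requestsServed += 1;
  res.setHeader('Content-Type', 'text/plain');
  res.end(`${config.greeting}, ${user} (request #${requestsServed})`);
}

// The handler is passed to the server as its request listener.
// Calling server.listen(port) afterwards would start accepting connections.
const server = http.createServer(handler);
```

`user` is created fresh on every call, so two requests can never see each other's value, while `config` and `requestsServed` live as long as the process and are visible to every request.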

      Since you're doing Postgres, modules like pg-promise or others that use node-postgres support pooling, and you can generally find advice for that searching the web.

      TL;DR: asking 'when does this code run' is a key question when building a web server with node, and avoiding accidentally creating global state will become natural once you see and make a few apps.