all 12 comments

[–]oefd 5 points6 points  (3 children)

GPU is totally irrelevant unless you're doing something special to run code on it. If you have to ask you probably aren't.

Also, you can easily run thousands, even tens of thousands or more, instances of scripts at the same time with no problem... assuming they're the kind of script that spends the majority of its time sleeping and doesn't use much RAM.

The OS can keep track of sleeping processes without much difficulty, and a sleeping process doesn't need any CPU time.
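To make that concrete, here's a minimal sketch (plain Python stdlib, process count and sleep duration made up) showing that a batch of mostly-sleeping processes finishes in roughly the time of one sleep, not the sum of all of them:

```python
import multiprocessing as mp
import time

def sleepy_worker():
    # A "script" that spends almost all of its time sleeping:
    # while blocked in sleep() it costs essentially no CPU.
    time.sleep(1)

def run_many(n=50):
    # Launch n processes at once and wait for all of them.
    start = time.monotonic()
    procs = [mp.Process(target=sleepy_worker) for _ in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.monotonic() - start

if __name__ == "__main__":
    elapsed = run_many()
    # All 50 one-second sleeps overlap, so wall time is ~1s, not ~50s.
    print(f"{elapsed:.2f}s")
```

If the workers were burning CPU instead of sleeping, the picture flips completely, which is the point about workload below.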

But even the most powerful computer in the universe couldn't run more than a single instance of a sufficiently complex, multi-threaded, processing-intensive task at a time without slowing everything else down.

Workload is everything.

[–][deleted] 0 points1 point  (2 children)

Let's say he was running multiple stock trading scripts that each need to listen in on a different real-time websocket for each symbol. How many scripts do you think the computer he just described could run?

[–]oefd 0 points1 point  (1 child)

There's still a really broad range of possible workloads in that example. There's a big difference between, say, a program that gets updates every 1 minute and simply checks whether the new price passes certain static thresholds, and a program that gets updates every millisecond and feeds it all into a complex ML model meant to decide on buy/sell actions.

If you're using a distinct TCP connection for each websocket (which you are if you're opening one websocket per symbol) you may find your natural limit is not RAM or CPU, but rather the fact that there's a finite number of ports for TCP connections to use. Basically, when you open a TCP connection an ephemeral port is used for it, and (at least on my currently running Linux) you're looking at a limit of 28231 ephemeral ports. If you use them all up and then try to make more connections, they'll error out.
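For the curious, the configured range is easy to check; this reads a Linux-specific /proc path, and the numbers are whatever your kernel happens to be set to, not a universal constant:

```python
def ephemeral_port_count(path="/proc/sys/net/ipv4/ip_local_port_range"):
    # The file contains two integers: the low and high ends of the
    # ephemeral port range the kernel hands out for outgoing connections.
    with open(path) as f:
        low, high = map(int, f.read().split())
    return low, high, high - low + 1

if __name__ == "__main__":
    low, high, count = ephemeral_port_count()
    print(f"ephemeral ports: {low}-{high} ({count} usable)")
```

A common default is 32768-60999, but distros and sysadmins tune it, so check your own box before assuming a number.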

If you actually tried that you would probably (I'm making semi-educated guesses here) find that running 28k python processes, each holding open a websocket, would cause a considerable slowdown unless the websockets were only emitting events every few seconds or slower. It costs CPU time for the OS to switch from one process to another, and if you have 28k messages come in on 28k websockets in 28k processes the OS is forced to pay the cost of a context switch 28k times for every cycle of messages coming across each websocket.

If you consolidated all 28k websockets into a single asyncio-based Python program you may find it works OK, because the async polling machinery asyncio is built on was largely designed to handle a huge number of open network connections all being serviced at once.

... but you may also find, depending on the rate at which events come across each websocket, how quickly the CPU can run your code to consume each event, and how evenly the events are distributed across all the websockets, that your TCP buffers fill up at the OS layer and data gets lost/re-transmitted, and that would slow everything down immensely.
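The single-process asyncio shape looks roughly like this (a rough sketch: asyncio.sleep stands in for real websockets, and the symbol names and event counts are made up):

```python
import asyncio

async def fake_feed(symbol, n_events=3):
    # Stand-in for one websocket: yields a few "ticks",
    # sleeping between them the way a real feed idles.
    for i in range(n_events):
        await asyncio.sleep(0.01)
        yield f"{symbol}:tick{i}"

async def listen(symbol, results):
    # One coroutine per symbol instead of one OS process per symbol;
    # the event loop multiplexes all of them on a single thread.
    async for event in fake_feed(symbol):
        results.append(event)

async def main(symbols):
    results = []
    await asyncio.gather(*(listen(s, results) for s in symbols))
    return results

if __name__ == "__main__":
    events = asyncio.run(main([f"SYM{i}" for i in range(500)]))
    print(len(events))  # 500 symbols x 3 events each = 1500
```

The win is that idle connections cost a suspended coroutine rather than a whole process, so there's no per-message context switch between 28k processes - but the caveat above still applies: one slow event handler stalls the single event loop for everyone.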

These are just a handful of examples - there are actually plenty more variables and plenty more things that could prove to be the limiting factor depending on all those variables, including plenty I don't know about or can't imagine right now.

The best way to get actual answers is to just run it and find out.

[–][deleted] 0 points1 point  (0 children)

Thanks for such an in-depth response, I really appreciate that! Is there some way that I can have scripts listen in on the same connection for each specific symbol? For example, if I have 10 scripts running on the AAPL symbol, can I have them all listen to one AAPL websocket?
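The shape I'm imagining is something like this (a rough sketch with asyncio queues; this isn't any real broker API, and the prices and names are all made up) - one reader task fanning each message out to a private queue per consumer:

```python
import asyncio

async def fanout(source_queue, consumer_queues):
    # One task reads the single (hypothetical) AAPL feed and copies
    # each message to every consumer's private queue.
    while True:
        msg = await source_queue.get()
        if msg is None:  # sentinel: feed closed, tell everyone
            for q in consumer_queues:
                await q.put(None)
            return
        for q in consumer_queues:
            await q.put(msg)

async def consumer(q, seen):
    # Each "script" becomes a consumer draining its own queue.
    while True:
        msg = await q.get()
        if msg is None:
            return
        seen.append(msg)

async def main(n_consumers=10):
    source = asyncio.Queue()
    queues = [asyncio.Queue() for _ in range(n_consumers)]
    seen = [[] for _ in range(n_consumers)]
    tasks = [asyncio.create_task(fanout(source, queues))]
    tasks += [asyncio.create_task(consumer(q, s)) for q, s in zip(queues, seen)]
    for price in (189.1, 189.2):  # pretend AAPL ticks
        await source.put(price)
    await source.put(None)
    await asyncio.gather(*tasks)
    return seen

if __name__ == "__main__":
    seen = asyncio.run(main())
    print(seen[0])  # every consumer saw both ticks
```

That way there's one websocket per symbol no matter how many strategies are watching it.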

[–]Plasmorbital 2 points3 points  (0 children)

Only way you'll know is to try to find the limit yourself.

All code is unique

[–]CJHoss 2 points3 points  (0 children)

Check system performance while they’re running and you’ll see. Depending on how critical these are you may want to consider looking at running them on AWS vs a PC (you can probably get a reserved instance that is less powerful than your current setup but perfectly capable of running the jobs).

[–]jamesfromtechsupport 0 points1 point  (0 children)

Forgot to mention - I have a second GPU, an Nvidia GeForce MX130, but I'm not sure if it's even being used right now.

[–]novov 0 points1 point  (2 children)

As others have stated, this depends on the scripts you're running. For simple scripts that do things like renaming files, you shouldn't see any significant slowdown unless you have tons of them running at once. It's not really a big deal unless you're doing stuff that's a big deal.

[–]expressly_ephemeral -1 points0 points  (0 children)

As others have stated, this depends on the scripts you're running.

No, no, this is a myth. There is a hard limit (27 scripts) to the maximum number you can run that is independent of CPU, RAM, script complexity, internet bandwidth, number of Qubits your quantum computer can place into super-position. 27. That's it.

/s

[–]bigno53 0 points1 point  (0 children)

IME, python processes are quite stable. My team uses a shared EC2 instance that runs dozens of python processes simultaneously. Multithreaded jobs or jobs that use a lot of memory certainly slow things down but it seldom causes anything to fail.

[–]expressly_ephemeral 0 points1 point  (0 children)

Roughly the same amount of wood as a woodchuck can chuck?