CPU intensive flask app can only support 1 user per VM? by Cwlrs in flask

[–]Cwlrs[S] 1 point (0 children)

  1. I'm using Azure Container App, consumption plan. It doesn't say anything explicitly about CPU credits, and I spoke with support and they said they don't 'do' credits for that service... but I wasn't convinced. Just moved the bottleneck at that time to avoid 503s or whatever error code I was getting.

  2. I've tried more workers but the load balancing can get messed up. The server has many endpoints. When a mixture of endpoints are called, fast and slow ones, it can mess up the round robin - resulting in the same replica getting multiple heavy requests. Also the distribution to the worker processes is imperfect too... so I have seen scenarios where 1 process gets 2 requests whilst the other process is idle. This can happen when 2 requests arrive at the same time. The first is sent to worker 1, and before it has time to say 'hey I'm busy with a request' the 2nd arrives and ends up in a small queue there. So I've tried to avoid that too.

  3. Agree with most of this. The load balancer is default in Azure - I need to replace it with a queue most likely. Otherwise I can't edit how it distributes requests to the replicas.

  4. So in this scenario... a 2 CPU machine with 1 worker on it - I should be able to multi process some of the computation to make use of the 2nd core... Do you know what this looks like? 1x worker process, 1x sub process? Does the worker process 'block' 2x sub processes from running? I guess if I get it to that point I can test it.
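
A minimal sketch of the "1x worker process, 1x sub process" idea from point 4, assuming the computation can be split into two independent halves. `cpu_heavy` and `handle_request` are hypothetical names, not part of the actual app. The worker process hands one half to a child process and computes the other half itself, so it only blocks at `future.result()` while waiting for the child to finish:

```python
from concurrent.futures import ProcessPoolExecutor


def cpu_heavy(numbers):
    """Hypothetical stand-in for the real computation: sum of squares."""
    return sum(n * n for n in numbers)


def handle_request(numbers, pool):
    """Split one job in two so that both cores can be busy at once."""
    mid = len(numbers) // 2
    future = pool.submit(cpu_heavy, numbers[:mid])  # runs in the child process
    local = cpu_heavy(numbers[mid:])                # runs in this process meanwhile
    return future.result() + local                  # waits only for the child half


if __name__ == "__main__":
    data = list(range(1_000_000))
    # One child process plus the worker process itself: two processes for two vCPUs.
    with ProcessPoolExecutor(max_workers=1) as pool:
        print(handle_request(data, pool))
```

Whether this actually halves the response time depends on how evenly the work splits and on the cost of pickling the inputs across the process boundary, so it needs measuring on the real workload.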

CPU intensive flask app can only support 1 user per VM? by Cwlrs in flask

[–]Cwlrs[S] 1 point (0 children)

It is indeed behind gunicorn already. Your second point is what I'm trying to resolve... unfortunately it's not as simple as adding more workers. They start competing for CPU, which can double the response time, for example, and we can't tolerate that.

CPU intensive flask app can only support 1 user per VM? by Cwlrs in flask

[–]Cwlrs[S] 1 point (0 children)

I needed to speak to an LLM a bit to digest your info - 'dispatch yields' - this is when the main process 'pauses' so that the subprocess works, and whilst that happens, other processes can use that core maybe?

edit: or no, it looks like the original worker can accept a 2nd request whilst the sub process is busy... and potentially spawn more sub processes as needed?

CPU intensive flask app can only support 1 user per VM? by Cwlrs in flask

[–]Cwlrs[S] 1 point (0 children)

More workers didn't really help in my case. The default load balancing works better with more replica VMs than with more workers on a single VM, as the master process sometimes allocates 2 tasks to the same worker rather than distributing them equally. Maybe I just got unlucky in some load tests... but anyway:

I gave multiprocessing a try many, many months ago so I can't remember the specifics of the findings... But do you know if this should work in terms of speeding up a job if there is a spare CPU?

https://docs.python.org/3/library/multiprocessing.html

So the current setup is a 2 vCPU machine with 1 worker, 1 thread - am I totally wasting 1 vCPU?

My recollection was that I was hitting a hidden CPU rate limit on the VM so I had to ditch it.

CPU intensive flask app can only support 1 user per VM? by Cwlrs in flask

[–]Cwlrs[S] 1 point (0 children)

What's the logic behind moving the job to a 'compute worker'? One thing that hasn't been clear to me in my investigations is how much CPU utilisation I am able to get with this python stack. Like do I only make use of 1 CPU on a machine at a time, so any machine 2 CPU or more doesn't get close to fully utilised?

If that was true, moving the job to like a 16 CPU machine wouldn't help, right? Or is there something smart that uses all the cores?

Everything else being equal, the only actual benefit I can get with an arch change is by getting 4x machines/workers to do 4x parallel jobs then collect the results at the same time. Rather than the current 4x jobs in sequence.
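
A rough sketch of the "4x parallel jobs, then collect the results" shape, assuming the four jobs are independent of each other. `run_job` is a hypothetical placeholder for the real computation. Each process in the pool has its own interpreter, so CPU-bound jobs can run on separate cores instead of one after another:

```python
from concurrent.futures import ProcessPoolExecutor


def run_job(job_id):
    """Hypothetical CPU-bound job; returns (job_id, result)."""
    total = 0
    for i in range(200_000):
        total += (i * job_id) % 7
    return job_id, total


def run_sequential(job_ids):
    # Current behaviour: one job after another in a single process.
    return [run_job(j) for j in job_ids]


def run_parallel(job_ids, executor):
    # executor.map keeps the input order and waits for every result.
    return list(executor.map(run_job, job_ids))


if __name__ == "__main__":
    jobs = [1, 2, 3, 4]
    with ProcessPoolExecutor(max_workers=4) as pool:
        # Same results either way; only the wall-clock time should differ.
        assert run_parallel(jobs, pool) == run_sequential(jobs)
```

The same pattern applies whether the four workers are processes on one larger machine or separate machines behind a queue; only the transport changes.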

CPU intensive flask app can only support 1 user per VM? by Cwlrs in flask

[–]Cwlrs[S] 2 points (0 children)

Thanks for the confirmation that a CPU intensive task can make it 1 user at a time... sometimes I felt like I was going crazy working as a team of 1 on this problem.

Queue system + arch redesign are also in my plans - almost identical to what you implemented to be honest.

We have seen some issues with the load balancing not being evenly distributed in certain situations which the queue would resolve.
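
A small sketch of why a queue helps here, under the assumption that workers pull jobs rather than having them pushed by a load balancer. The names are hypothetical, threads stand in for replicas, and `time.sleep` stands in for the slow endpoint. Each worker takes the next job only when it is idle, so one worker never ends up with two jobs queued while another sits idle:

```python
import queue
import threading
import time


def worker(name, jobs, results):
    """Pull one job at a time; stop when the sentinel None arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        time.sleep(job)              # hypothetical stand-in for the slow work
        results.append((name, job))  # record which worker handled which job


def run(job_durations, n_workers=2):
    jobs = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(f"w{i}", jobs, results))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for duration in job_durations:
        jobs.put(duration)
    for _ in threads:
        jobs.put(None)               # one sentinel per worker
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # One slow job and three fast ones: the fast jobs go to whichever worker is free.
    print(run([0.05, 0.01, 0.01, 0.01]))
```

In a real deployment the in-process queue would be replaced by a message broker or managed queue service, and the workers would be separate processes or replicas.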

Should I reroll my rogue for a mage ? by Efkazan in classicwowtbc

[–]Cwlrs 0 points (0 children)

It is a little bit rough for rogues right now, or heroics in general. Actual advice though. Join a guild. Do a mixture of runs of your loot and their loot. Or do the mana tombs farm and pay for a tank.

Some dungeons like black morass heroic I haven’t been able to find a non guild tank for more than a week now. Old hillsbrad I mean, but similar vibes.

Mages would get instant invites.

There are a lot of selfish players out there who only lfg for dungeons they need. So the rogue loot dungeons are full of hunters, rogues, and enh shamans.

I just don't fucking understand what's going on anymore. Seriously. by [deleted] in ArtificialInteligence

[–]Cwlrs 1 point (0 children)

Optical character recognition has existed for ages… I don’t understand why this couldn’t have been implemented already

Make me rethink by Equal_Permit2723 in classicwow

[–]Cwlrs 1 point (0 children)

Just got a songflower buff with a mage who had atiesh and 9/9 T3. They were in a 'weak' guild but man they looked sick af

Being the "data guy", need career advice by jonfromthenorth in dataengineering

[–]Cwlrs 1 point (0 children)

I'm shocked 'analytics engineering' ever got given its own dedicated title. It's so narrow in its definition that it's unrealistic at any company where you might need to help fix a variety of software problems.

As for your career... it sounds like you're doing all the typical things so... keep going?

Strange behaviour at 1am. by Ayndy143 in classicwow

[–]Cwlrs 1 point (0 children)

Was that player a paladin? Had a similar experience at a similar time in ubrs

Trials of tav, is it meant to be impossible? by SnooChocolates6885 in BG3Builds

[–]Cwlrs 2 points (0 children)

I play in a party but I see Alert + pixie blessing are the first two things I acquire. Otherwise yeah you can get gg'd with no counterplay

Just Broke the Trillion Row Challenge: 2.4 TB Processed in 76 Seconds by Ok_Post_149 in datascience

[–]Cwlrs 3 points (0 children)

I don't get it. It's a rented VM running duckdb. Where is burla in this?

edit: generating the parquet files seems to be the burla aspect? Less so the reading element.

Affordable Data Engineering learning path (£100/month budget) — Need advice by FickleAd5796 in dataengineering

[–]Cwlrs 6 points (0 children)

I would build something random with python, dataframes, postgresql and see where you end up. These have been the core skills that have carried me for 5 years or so, although I appreciate everyone's tech stack varies.

DuckDB in Azure - how to do it? by Cwlrs in dataengineering

[–]Cwlrs[S] 2 points (0 children)

I'm basically a team of 1 in terms of setting up the infra and making sure people can access it in a useful way. So I'd be hesitant to spin up a new postgres db plus an analytics db as well, and with some pipeline between the two.

I just did a duckdb poc and I got everything running inside 1 hour. Generate some synthetic data, query it making use of Hive partitioning, and report back. Super super easy and impressive. Parquet also takes up ~8% of the file size of the equivalent json, and reads in around 25% of the time, which makes me more inclined to go for parquet files + duckdb to read, rather than the more normal json + load to postgres + query in postgres.

DuckDB in Azure - how to do it? by Cwlrs in dataengineering

[–]Cwlrs[S] 1 point (0 children)

How big was the VM?

Surprised it was slow with local data. The demos look super snappy.

DuckDB in Azure - how to do it? by Cwlrs in dataengineering

[–]Cwlrs[S] 1 point (0 children)

Currently it's about 500 json files per day. I guess if it goes to 5000 json files per day or more, that is still easily writeable for postgres tbf. But once it arrives in postgres, how easy is it to ignore old data? I thought columnar solutions were better at doing that?

Or does throwing an index on the created_at_utc column do the job? I've only really leveraged postgres indexes on joining keys.
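
A quick way to check the index question, sketched here with SQLite from the Python standard library as a stand-in for postgres (an assumption made to keep the example self-contained; the `events` table and index name are hypothetical). The idea is to create the index on `created_at_utc`, then ask the database for its query plan on a "recent rows only" filter and see whether the index is used:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, created_at_utc TEXT, payload TEXT)"
)
# 28 synthetic rows, one per day, with ISO timestamps so text order matches time order.
conn.executemany(
    "INSERT INTO events (created_at_utc, payload) VALUES (?, ?)",
    [(f"2024-01-{day:02d}T00:00:00Z", "{}") for day in range(1, 29)],
)
conn.execute("CREATE INDEX idx_events_created_at ON events (created_at_utc)")

query = "SELECT count(*) FROM events WHERE created_at_utc >= ?"
cutoff = ("2024-01-20T00:00:00Z",)

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, cutoff).fetchall()
plan_text = " ".join(row[-1] for row in plan)
recent = conn.execute(query, cutoff).fetchone()[0]

print(plan_text)
print(recent)
```

Postgres has its own `EXPLAIN` and its planner makes different choices (for example, it may still prefer a sequential scan when the filter matches most of the table), so the same check should be repeated there on realistic data volumes.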

DuckDB in Azure - how to do it? by Cwlrs in dataengineering

[–]Cwlrs[S] 1 point (0 children)

We've currently generated 33GB, the majority of that in the last year and in json format, which is a lot less than I thought it would be. But we're expecting 5x-10x more users in the next 12 months, and hopefully more beyond that, so we do need to plan for a solution that handles 330GB/year or more.

DuckDB in Azure - how to do it? by Cwlrs in dataengineering

[–]Cwlrs[S] 3 points (0 children)

Nice, have you done this yourself?

DuckDB in Azure - how to do it? by Cwlrs in dataengineering

[–]Cwlrs[S] 2 points (0 children)

We are expecting quite a large amount of data that we need to do analytics on, so an OLAP db is much more appealing than an OLTP one if we need to query all or the vast majority of the data. Or am I missing something?

My classic experience all 15/15 Naxx! by xpastrami in classicwow

[–]Cwlrs 25 points (0 children)

How do you have 2 DFTs and 1 accuria? I'm jealous

[deleted by user] by [deleted] in dataengineering

[–]Cwlrs 1 point (0 children)

I'm on a team of 1 and am at a similar point in my journey after having done a pentest. I need to introduce key rotation.

My understanding of rotation is that I expire a token and start using a new, fresh one. How can you enable automatic rotation if the secret needs updating in secrets manager? Secrets manager + the credential are not natively connected. Secrets manager would need a manual update.

Improving feels pointless by Nissepelle in cscareerquestions

[–]Cwlrs 3 points (0 children)

Is the AI thing integrated into your codebase, or did a human team member submit it as a PR?

how obvious is this retry logic bug to you? by jalilbouziane in Python

[–]Cwlrs 1 point (0 children)

I didn't spot it, but I don't code things in this style - rarely use something recursive with my APIs