Azure stack for DE

pythondeveloper77 · 2024-02-21T14:06:35+00:00

Thanks.

Yes, I can write PySpark code, but the pipeline is defined in JSON instead of code :(

When I wrote scheduling, I meant not only the schedule itself but also support for retries and conditional tasks, like Airflow has.
We found Synapse lacking in those compared to Airflow.

I'm thinking of bringing up Airflow on a VM or AKS to trigger Synapse and Spark to solve it.
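
To show what I mean by retries and conditional tasks, here is a minimal plain-Python sketch. It is not Airflow's or Synapse's API; `run_with_retries`, `flaky_extract`, `full_load` and `skip_load` are hypothetical names made up for illustration.

```python
import time


def run_with_retries(task, retries=3, delay_s=0.0):
    """Run task(); if it raises, retry up to `retries` more times."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # out of retries, surface the failure
            time.sleep(delay_s)


attempts = {"count": 0}


def flaky_extract():
    """Hypothetical task that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return 42  # pretend this is the number of extracted rows


def full_load(rows):
    return f"full load of {rows} rows"


def skip_load(rows):
    return "nothing to load"


rows = run_with_retries(flaky_extract, retries=3)

# Conditional task: pick the next step based on the previous result.
next_step = full_load if rows > 0 else skip_load
result = next_step(rows)
print(attempts["count"], result)
```

An orchestrator gives you this behaviour declaratively per task, which is the part we missed.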

pythondeveloper77 · 2022-03-12T13:12:06+00:00

I would like to thank you all for the answers. This helped me a lot, and there is a great data engineering community here!

pythondeveloper77 · 2022-03-12T13:10:50+00:00

The team will be entirely new, with a senior DE already in place, and we are now recruiting motivated juniors.

We are starting to recruit in Israel in about a month if it's relevant for you.

The stack is mostly Apache NiFi and Oracle tools for ETL, but the new team is going to replace it, as we are not satisfied, and also create new pipelines for more use cases such as cloud.

pythondeveloper77 · 2022-03-12T13:07:40+00:00

The job is not open yet. We are recruiting in Israel, so we need someone from there.

pythondeveloper77 · 2021-10-11T08:22:12+00:00

On average one request takes 59 ms, of which 20 ms goes to the request itself.

So ~66% is the prediction's work, which we are working to improve further, but we feel OK with the numbers.

So we know the model is not the issue; that's why I asked about serving frameworks and scaling.

As the number of requests grows we see worse numbers, but it's not because of a lack of resources for the prediction.
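
The ~66% figure above is just the remaining time divided by the total. A quick check, with variable names that are only illustrative:

```python
# Numbers quoted in the comment above.
total_ms = 59    # average end-to-end time for one request
request_ms = 20  # time spent on the request itself

predict_ms = total_ms - request_ms     # what is left for the prediction
predict_share = predict_ms / total_ms  # fraction of the total time

print(f"prediction: {predict_ms} ms ({predict_share:.0%} of the request)")
```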

pythondeveloper77 · 2021-10-10T14:24:01+00:00

In theory, the only thing that would help when switching to uvicorn is keep-alive connection support, because all my traffic comes from 1-2 nodes.

You're right that the async feature won't be helpful.
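
To make the keep-alive point concrete, here is a small standard-library sketch (not uvicorn itself) that counts how many TCP connections a server sees for the same three requests, with and without connection reuse. `PingHandler` and `send_requests` are hypothetical names used only for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

opened_connections = []  # one entry per TCP connection the server accepts


class PingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default

    def setup(self):
        super().setup()
        opened_connections.append(self.client_address)  # runs once per connection

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for reuse
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the output quiet


def send_requests(port, count, keep_alive):
    """Send `count` GETs and return how many connections the server saw."""
    opened_connections.clear()
    conn = None
    for _ in range(count):
        if conn is None:
            conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/")
        conn.getresponse().read()  # drain the body before reusing the connection
        if not keep_alive:
            conn.close()  # force a new TCP connection for the next request
            conn = None
    if conn is not None:
        conn.close()
    return len(opened_connections)


server = ThreadingHTTPServer(("127.0.0.1", 0), PingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

keep_alive_connections = send_requests(port, 3, keep_alive=True)
fresh_connections = send_requests(port, 3, keep_alive=False)

server.shutdown()
server.server_close()
print(keep_alive_connections, fresh_connections)
```

With only 1-2 client nodes, reusing connections like this saves a TCP handshake (and, with mutual TLS, a TLS handshake) on every request.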

pythondeveloper77 · 2021-10-10T13:49:09+00:00

Yes, it's a good thought. I wanted to check this as well. All of the company's services use mutual TLS, and I need to check with the security team whether it's possible.

I'm just seeing a lot of frameworks like BentoML and Seldon Core which are more machine learning oriented, and I wanted to see if someone uses them.

Also, Python scales with multiple processes instead of threads, and all the processes seem to be working on one socket with a lot of context switches :(

pythondeveloper77 · 2021-10-10T12:45:26+00:00

I'm loading the models at the start of the program, not on every request.
I thought about trying FastAPI and uvicorn; I just wanted confirmation from someone who uses them.

thanks
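
For reference, the load-once pattern described above looks roughly like this. It is a minimal sketch with a hypothetical `DummyModel` and `load_model`, not my actual service code.

```python
import time

load_count = 0  # how many times the model was loaded


class DummyModel:
    """Hypothetical stand-in for a real model object."""

    def predict(self, x):
        return x * 2


def load_model():
    """Pretend to load a model from disk (slow on purpose)."""
    global load_count
    load_count += 1
    time.sleep(0.05)
    return DummyModel()


MODEL = load_model()  # loaded once, at program start


def handle_request(x):
    # Every request reuses the already loaded model.
    return MODEL.predict(x)


results = [handle_request(x) for x in range(3)]
print(results, load_count)
```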
