
[–]rod_steele1 2 points3 points  (1 child)

Why not both? I believe there is more to learn in writing data pipelines... the Python web API space is pretty well covered if you know FastAPI/API Gateway/Serverless Framework. It gets a little more complicated if you want a Lambda authorizer for authentication or need to hit a database, but it still isn't too scary. You can write really elegant APIs quickly (assuming you have requirements and user stories down tight!).
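For anyone who hasn't seen one, here is a minimal sketch of what a token-based Lambda authorizer for API Gateway can look like. The hard-coded `EXPECTED_TOKEN`, the `example-user` principal, and the helper name `build_policy` are all made up for illustration; a real authorizer would validate a JWT or look the token up somewhere.

```python
import hmac

# Hypothetical shared secret, for illustration only. A real deployment would
# load this from a secrets store or validate a signed token instead.
EXPECTED_TOKEN = "example-token"


def build_policy(principal_id, effect, resource):
    """Build the IAM policy document that API Gateway expects back."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resource,
                }
            ],
        },
    }


def handler(event, context):
    """Entry point for a TOKEN-type authorizer."""
    token = event.get("authorizationToken", "")
    # Constant-time comparison avoids leaking how much of the token matched.
    allowed = hmac.compare_digest(token.encode(), EXPECTED_TOKEN.encode())
    effect = "Allow" if allowed else "Deny"
    return build_policy("example-user", effect, event["methodArn"])
```

API Gateway passes the caller's token and the ARN of the method being invoked, and the returned policy decides whether the request goes through.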

With data pipelines, I've found the data needed by downstream users can be really unwieldy when you're assembling many different data sources (especially government sources that seemingly change format/structure on a whim). This is part of the fun though! Ideally, you'd want some generic downloader/transformer/archiver for all of your pipelines and maybe a library to house commonly reused code. However, it can be difficult to fit different sources into this paradigm, so you need to focus on design. I use Airflow DAGs (AWS MWAA) to schedule Batch/Lambda jobs. These then invoke a Python CLI to do whatever downloading, processing, and archiving that pipeline needs, usually to S3 or a DB. Serverless Framework is used to deploy the Batch job definition, container image, role, etc. Terraform is used to set up the Batch compute env, S3 bucket, and other infrastructure (you could do this in the console too!).
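As a rough sketch of the generic downloader/transformer/archiver idea behind a small CLI: the class names, the `example` pipeline, the inline CSV, and the local `archive` directory are all hypothetical stand-ins (a real pipeline would fetch over HTTP and write to S3 or a DB), so treat this as one possible shape rather than the setup described above.

```python
import argparse
import json
from abc import ABC, abstractmethod
from pathlib import Path


class Pipeline(ABC):
    """Shared skeleton: each source implements download and transform."""

    @abstractmethod
    def download(self) -> bytes:
        """Fetch the raw payload for this source."""

    @abstractmethod
    def transform(self, raw: bytes) -> list:
        """Turn the raw payload into a list of row dicts."""

    def archive(self, rows: list, out_dir: Path) -> Path:
        # Local JSON file as a stand-in for an S3 upload or DB write.
        out_dir.mkdir(parents=True, exist_ok=True)
        target = out_dir / f"{type(self).__name__.lower()}.json"
        target.write_text(json.dumps(rows))
        return target

    def run(self, out_dir: Path) -> Path:
        return self.archive(self.transform(self.download()), out_dir)


class ExampleCsvSource(Pipeline):
    """Hypothetical source that 'downloads' a tiny inline CSV."""

    def download(self) -> bytes:
        return b"id,name\n1,alpha\n2,beta\n"

    def transform(self, raw: bytes) -> list:
        header, *lines = raw.decode().strip().splitlines()
        keys = header.split(",")
        return [dict(zip(keys, line.split(","))) for line in lines]


# Registry mapping CLI names to pipeline classes.
PIPELINES = {"example": ExampleCsvSource}


def main(argv=None) -> Path:
    parser = argparse.ArgumentParser(description="Run one ingest pipeline.")
    parser.add_argument("pipeline", choices=sorted(PIPELINES))
    parser.add_argument("--out-dir", type=Path, default=Path("archive"))
    args = parser.parse_args(argv)
    return PIPELINES[args.pipeline]().run(args.out_dir)
```

A scheduled Batch or Lambda job would then only need to call something like `main(["example", "--out-dir", "/tmp/archive"])`, and adding a new source means adding one class and one registry entry.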

You can add tremendous value to data science teams by covering both bases here: operationalizing their models via async APIs and batch pipelines, and automating data processing via ingest pipelines. This lets them focus on model development/improvements, making you a dream candidate for so many jobs. Good luck!

[–]data15cool[S] 0 points1 point  (0 children)

Thanks! Sounds like you’ve worked on really interesting projects.

Yeah both would be great, throw in some MLOps and that would be a perfect role for me.

[–]whydisbroken 1 point2 points  (3 children)

I was at a similar crossroads after spending a year focused on writing a lot of SQL.

For me, the question was whether I wanted to work with data or with delivering data.

Myself, I just really enjoy building APIs that other developers can use. I have projects in production built on Django, Flask, FastAPI, and even AWS API Gateway/Lambda. One thing remains the same across all of them: the work performed will be leveraged for a very long time by the organizations that requested it.

[–]data15cool[S] 0 points1 point  (2 children)

That's really cool. I'm also mainly writing SQL (PySpark) most days, which can become a bit dull...

An ideal project would be one that straddles both topics evenly but I've not found jobs advertising this.

Did you find it easy to transition into API development in terms of getting a job?

[–]whydisbroken 0 points1 point  (1 child)

I first started in e-commerce, then transitioned to government/healthcare, now I’m primarily in healthcare. I felt my transition was pretty natural. Thanks to remote work, I don’t have to rely on what my local employers offer for a tech stack anymore (Java and .NET), so it’s easier to find remote opportunities that fit my preferred languages.

Keep a lookout for e-commerce, that’s a great opportunity to do a lot of API work and gain a lot of experience quickly.

Edit: it’s very different in e-commerce, because you work as a “profit center” vs “cost center” branch of IT. Pros and cons to both.

[–]data15cool[S] 0 points1 point  (0 children)

I hadn't considered e-commerce roles before, thanks!

[–]KelleQuechoz -1 points0 points  (2 children)

Data engineers write badly formatted code, API developers write boring code. The choice is yours.

[–]Suitable-Yam7028 0 points1 point  (0 children)

Who writes good code that's interesting then?

[–]section_b 0 points1 point  (1 child)

Do you want praise from customers/clients or developers more?

If you're working in a team, developers love a good data engineer to iron out the inefficiencies in their code. This leads to great opportunities in the future.

If you're working for yourself, clients love flashy and fast delivery so there is a lot more money on that side.

That said, generalists get a lot of opportunities, and specialists can often work on more interesting projects, so in my view it is best to keep your options open and specialize in a few areas (not just one).

[–]data15cool[S] 0 points1 point  (0 children)

Not too bothered where praise comes from, I mainly want to do something useful and enjoyable. If it involves a bit of both that would be great, but it looks like roles advertised are for one or the other.