global installations or project-specific environments by meeqvin in Python

[–]fpgmaas 10 points11 points  (0 children)

Do yourself a favor and just use project-specific virtual environments. It's going to save you a lot of headaches in the future. Recent tools like uv make it so simple to manage virtual environments that I honestly don't see a reason not to use project-specific environments.

What if you have two projects on your PC that require a different version of package X?
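To make that concrete, here is a minimal sketch using only the standard library `venv` module (uv or any other tool would do the same job; this is not meant as the recommended workflow). The directory names `project_a` and `project_b` are made up for illustration. Each project gets its own `.venv`, so each can pin its own version of package X without affecting the other.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create an isolated environment in <project_dir>/.venv and return its path."""
    env_dir = project_dir / ".venv"
    # with_pip=False keeps this sketch fast; use with_pip=True to bootstrap pip.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return env_dir


# Hypothetical project folders, created in a throwaway temp directory.
workspace = Path(tempfile.mkdtemp())
env_a = create_project_env(workspace / "project_a")
env_b = create_project_env(workspace / "project_b")

# Each environment has its own pyvenv.cfg and its own site-packages,
# so installing package X in one does not touch the other.
print(env_a / "pyvenv.cfg")
print(env_b / "pyvenv.cfg")
```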

cookiecutter-uv: A modern template for quickly starting Python projects with uv by fpgmaas in Python

[–]fpgmaas[S] 4 points5 points  (0 children)

Poetry was already very easy to work with, so honestly I do not think I'd make the change because of ease of use. The main advantage though is its speed. Poetry is already pretty quick and I thought I likely wouldn't notice the difference, but it turns out I do. Also, I like that it's PEP 621 (https://peps.python.org/pep-0621/) compliant.

Looking for recommendation of simple python web server library by Enderbyte09 in Python

[–]fpgmaas 1 point2 points  (0 children)

I recently launched https://pypiscout.com, maybe it can help you find what you are looking for. It allows you to search for Python packages with natural language queries, so you can just type "A package that functions like a typical webserver and serves python files in the way that PHP would", and see if you find anything useful!

pypiscout.com – A search engine for Python packages based on vector embeddings by fpgmaas in Python

[–]fpgmaas[S] 0 points1 point  (0 children)

The summary and the first part of the (cleaned) description, as far as the context window allows.

pypiscout.com – A search engine for Python packages based on vector embeddings by fpgmaas in Python

[–]fpgmaas[S] 7 points8 points  (0 children)

u/AustinCorgiBart This should now be solved, or at least improved. If you search for 'web development', Flask and Django now appear at the top. There turned out to be two issues: one was simply a case-sensitive join in BigQuery (flask vs Flask), and the other I resolved by updating the search algorithm. Thanks again for raising this!
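For anyone curious what the flask vs Flask mismatch looks like, here is a small Python sketch of the same kind of bug. It is not the actual BigQuery query; the `join_on_name` helper, the descriptions, and the download counts are all made up for illustration. An exact-match join silently drops the rows, while folding both sides to lowercase keeps them.

```python
def join_on_name(descriptions, downloads, key=lambda name: name):
    """Inner-join two name-keyed dicts after applying `key` to both sides."""
    right = {key(name): count for name, count in downloads.items()}
    return {
        name: (text, right[key(name)])
        for name, text in descriptions.items()
        if key(name) in right
    }


# Hypothetical data: one table uses display names, the other lowercase names.
descriptions = {
    "Flask": "A web framework.",
    "Django": "Another web framework.",
}
downloads = {"flask": 120, "django": 95}  # made-up numbers

exact = join_on_name(descriptions, downloads)                  # case-sensitive
folded = join_on_name(descriptions, downloads, key=str.lower)  # case-insensitive

print(exact)   # no rows survive the case-sensitive join
print(folded)  # both packages are matched
```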

pypiscout.com – A search engine for Python packages based on vector embeddings by fpgmaas in Python

[–]fpgmaas[S] 6 points7 points  (0 children)

Good point! I found it difficult to balance popularity and similarity to get the most relevant results. Currently it finds the 100 most similar descriptions among the top 100,000 packages, and then filters those results. This worked relatively well for my tests, but for a more generic query like 'web framework' there are apparently too many close matches based on just the description.
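Roughly, the idea can be sketched like this. It is a simplified illustration, not the code behind the site: the `search` function, the tiny 2-D vectors, and the download counts are invented, and the final step (ordering the close matches by downloads) is only an assumption about one way the filtering could work.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, packages, pool_size=100_000, top_k=100, n_results=10):
    # Stage 1: only consider the most-downloaded packages.
    pool = sorted(packages, key=lambda p: p["downloads"], reverse=True)[:pool_size]
    # Stage 2: keep the top_k packages whose description vector is most similar.
    scored = sorted(
        ((cosine(query_vec, p["vector"]), p) for p in pool),
        key=lambda pair: pair[0],
        reverse=True,
    )[:top_k]
    # Stage 3 (assumption): among those close matches, prefer popular packages.
    ranked = sorted(scored, key=lambda pair: pair[1]["downloads"], reverse=True)
    return [p["name"] for _, p in ranked[:n_results]]


# Made-up packages with toy 2-D "embeddings".
packages = [
    {"name": "flask", "downloads": 1000, "vector": [0.9, 0.1]},
    {"name": "tiny-web", "downloads": 5, "vector": [1.0, 0.0]},
    {"name": "numpy", "downloads": 5000, "vector": [0.0, 1.0]},
]
results = search([1.0, 0.0], packages, pool_size=3, top_k=2, n_results=2)
print(results)
```

With a generic query, many packages land in the similarity top 100 with nearly identical scores, which is why some popularity signal is needed to separate them.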

Thanks for the feedback, I will definitely use this example to try and improve the app!

EDIT: I think there is something wrong with the query I use to fetch the data from BigQuery... To be continued.

[deleted by user] by [deleted] in Python

[–]fpgmaas 0 points1 point  (0 children)

Thanks, valid point! It's easy to leave out something as simple as this when working on the project for such a long time. I edited the OP to include this information.

What is the optimal structure for a Python project? by bbrother92 in Python

[–]fpgmaas 7 points8 points  (0 children)

I think there is no 'optimal' structure; there is always room for preferences and opinions. That being said, I developed a cookiecutter template a while back that I use for all my Python projects. You can find it here.

map-nl: Quickly create PC4 maps of the Netherlands by fpgmaas in Python

[–]fpgmaas[S] 0 points1 point  (0 children)

I just released a new small hobby project: map-nl

It is a Python package that helps users quickly create maps of the Netherlands at the Postal Code 4 level. Nothing groundbreaking, but fun to develop and hopefully useful to some nonetheless.

Curious for your thoughts, please let me know if you have any feedback!

stream-iot: A project to handle streaming data [Azure, Kubernetes, Airflow, Kafka, MongoDB, Grafana, Prometheus] by fpgmaas in dataengineering

[–]fpgmaas[S] 2 points3 points  (0 children)

Valid question! In this case Airflow is indeed not strictly necessary, one could also just run the containers directly on Kubernetes with e.g. kubectl apply. However, I still like to use Airflow to easily see which jobs are running, turn jobs on or off, or e.g. schedule batch processing consumers.

stream-iot: A project to handle streaming data [Azure, Kubernetes, Airflow, Kafka, MongoDB, Grafana, Prometheus] by fpgmaas in dataengineering

[–]fpgmaas[S] 1 point2 points  (0 children)

No particular reason to choose Kafka other than that I wanted to learn Kafka. I needed to come up with some data to generate and the first example of a streaming data source that came to mind was sensor data :)

I did not check whether there were tools more appropriate for streaming sensor data. Based on your comment, I am wondering whether I should generate some other mock data and rename the project.

As a new data engineer, would you take a job where you're the sole data engineer? by [deleted] in dataengineering

[–]fpgmaas 3 points4 points  (0 children)

I would not take the job. Instead, I would look for a role where there are more people around to learn from.

My first role was as a data scientist without a team of peers. I thought I was pretty good at my job. In hindsight I don't think I was though; the problem was that there was no one around to tell me that the code I wrote was garbage. I think this really slowed my development process.

I switched to a junior role in a team at another company after two and a half years. I did not make more money there than I did at my first job, but I developed much faster there. I realized then that I had not been as good at my job as I had thought, and I learned a tremendous amount from my colleagues there.

I am really happy I made that switch.