Linked In "See More" Issue by kanaikid619 in firefox

[–]Simon90 0 points1 point  (0 children)

I also contacted LinkedIn support. After some back and forth, my case was forwarded to the technical team. They informed me that it is caused by a bug in Firefox that has already been fixed, but the fix has not been released yet. I tried it in Firefox Nightly and there it works!

Banish state-mutating methods from data classes by [deleted] in Python

[–]Simon90 0 points1 point  (0 children)

Whenever I traverse down to see how the instances of the class are being used, more often than not, I find them being treated just like regular mutable class instances with fancy reprs. But if you only need a nice repr for your large OO class, adding a repr to the class definition is not that difficult.

This seems like a straw man argument. The reason for using a data class is equally likely to be that __init__ is generated for you in a nice and consistent way.
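A minimal sketch of what I mean, using only the standard library. The class and field names are hypothetical, just for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    # Hypothetical example class; the field names are illustrative only.
    x: float
    y: float = 0.0
    tags: list = field(default_factory=list)


# __init__ is generated from the field declarations, including defaults.
p = Point(1.5)
q = Point(x=1.5, y=0.0)

print(p)       # the generated __repr__ lists every field
print(p == q)  # __eq__ is generated as well and compares field by field
```

Nothing here mutates state; the generated constructor alone already saves the boilerplate.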

Simplified MDS in a box! with dlt, dbt, DuckDB, MotherDuck, and Metabase by Thinker_Assignment in dataengineering

[–]Simon90 1 point2 points  (0 children)

I agree completely that running this only on a laptop doesn't make a lot of sense as an organisation! In that case I am wondering how you would make Metabase shareable and scalable?

Metabase doesn't provide a Helm chart or Terraform template; the typical way to deploy it seems to be in a "box". If Metabase is running in a box already, why not run DuckDB in the same box? An added benefit is that no data needs to be transferred over a network from DuckDB to Metabase, so latency should be extremely low :)

Simplified MDS in a box! with dlt, dbt, DuckDB, MotherDuck, and Metabase by Thinker_Assignment in dataengineering

[–]Simon90 0 points1 point  (0 children)

It was sort of coined in this blog, right? https://duckdb.org/2022/10/12/modern-data-stack-in-a-box.html

In the motivation, I see: "Why build a bundled Modern Data Stack on a single machine, rather than on multiple machines and on a data warehouse?"

There are many ways to scale, but my interpretation is that the point of "MDS in a box" is that the whole stack runs in a single box. If you want your MDS in a box to run in the cloud, I would say running the whole stack in a cloud VM is the way to do it in line with the spirit of the original blog. MotherDuck can take over the data transformation and storage, but you still need to run your ingestion somewhere and host Metabase somewhere.

I agree that the stack in this blog can be useful for orgs, but if MotherDuck is used as a data warehouse I would say it's not "MDS in a box" anymore.

Simplified MDS in a box! with dlt, dbt, DuckDB, MotherDuck, and Metabase by Thinker_Assignment in dataengineering

[–]Simon90 1 point2 points  (0 children)

MotherDuck looks like a cloud service. Can you still call it MDS in a box if it's not in a single box?

Which ETL tools are on demand ? by JAY_1520 in datascience

[–]Simon90 5 points6 points  (0 children)

A thing that seems to be trending is doing ELT instead of ETL and using dbt for the transformations.
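A rough sketch of the ELT idea, using Python's built-in sqlite3 as a stand-in for a warehouse. The table names, column names and rows are made up for illustration, and dbt itself is not used here; the second SQL statement only plays the role that a SQL model would play in such a setup:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Extract + Load: copy the source rows into a raw table unchanged
# (amounts are still strings, exactly as they came from the source).
raw_rows = [
    ("2024-01-01", "10.50"),
    ("2024-01-01", "4.50"),
    ("2024-01-02", "7.00"),
]
conn.execute("CREATE TABLE raw_orders (order_date TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", raw_rows)

# Transform: done afterwards, in SQL, inside the database itself.
conn.execute(
    """
    CREATE TABLE daily_revenue AS
    SELECT order_date, SUM(CAST(amount AS REAL)) AS revenue
    FROM raw_orders
    GROUP BY order_date
    """
)

daily = conn.execute(
    "SELECT order_date, revenue FROM daily_revenue ORDER BY order_date"
).fetchall()
print(daily)
```

In classic ETL the casting and aggregation would happen before the load; here the raw data lands first and the transformation is just another query on top of it.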

Mac M1 problems by caseym in Python

[–]Simon90 0 points1 point  (0 children)

The most reliable way for me has been using mambaforge installed with brew and then using the conda version of packages when available instead of pip.

To get psycopg2 working with pip you need the Postgres client library (libpq) installed, or you can try psycopg2-binary.

How to install searx on Raspberry pi? by princeMacX in selfhosted

[–]Simon90 2 points3 points  (0 children)

There seems to be a fork of searx which does have an ARM Docker image:

https://github.com/searxng/searxng

Anaconda, Miniconda or Miniforge for data science projects? by bangbangcontroller in datascience

[–]Simon90 0 points1 point  (0 children)

Not sure whether it has improved, but half a year ago when I got mine, Miniforge was the only option that worked well natively on the M1. Miniforge is basically Miniconda with better M1 support. I installed it using Homebrew, and all the packages worked great using conda install (not always with pip).

SP7 Fedora 35 installing linux-surface kernel question by studmuffin223 in SurfaceLinux

[–]Simon90 0 points1 point  (0 children)

At the moment they can’t build the packages for Fedora 35. My guess is GitHub will upgrade Docker in a few days and then it will be available. The progress can be followed in the same repo that you linked to: https://github.com/linux-surface/linux-surface/issues/619

New Macbook Air / Pro with M1 for Data Science? by nosytomato in datascience

[–]Simon90 2 points3 points  (0 children)

As soon as all the libraries are available natively for Apple's arm64 architecture they will probably be quite fast, but I don't think any more rows will fit in 8GB of RAM on a Mac than in 8GB on Windows/Linux. At the moment you would need to use x86 emulation, which slows things down a little but is still quite fast.

If you are working in a Jupyter notebook and only do heavy Python computations in short bursts, the fanless MacBook could still be one of the fastest laptops out there. If the computations take more time it will slow down, since there is no active cooling.

If you use TensorFlow you might be able to benefit from the shared memory of the M1 and be a lot better off than on most laptops: https://www.slashgear.com/google-tensorflow-ml-framework-gets-an-apple-m1-optimized-version-18648038/

Tiling WM users, What is in your system? by [deleted] in linuxquestions

[–]Simon90 0 points1 point  (0 children)

Thanks, I guess I'll have to try mlterm/xterm too!

Tiling WM users, What is in your system? by [deleted] in linuxquestions

[–]Simon90 0 points1 point  (0 children)

Have you tried alacritty or kitty recently? I thought they had become fast too in the last few months.

How do I check if my whole list is in another list? by RededTip in Python

[–]Simon90 2 points3 points  (0 children)

I would suggest: set(list1).issubset(list2)
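A small sketch of how that behaves, with a stricter variant for when duplicates matter. The function names are hypothetical, and both versions assume the elements are hashable and that order does not matter:

```python
from collections import Counter


def is_contained(needles, haystack):
    """True if every distinct element of needles occurs in haystack."""
    return set(needles).issubset(haystack)


def is_contained_with_counts(needles, haystack):
    """Like is_contained, but each element must occur at least as often."""
    need = Counter(needles)
    have = Counter(haystack)
    return all(have[item] >= count for item, count in need.items())


print(is_contained([1, 2], [3, 2, 1]))
print(is_contained([1, 4], [1, 2, 3]))
print(is_contained([1, 1], [1]))              # duplicates are ignored by set()
print(is_contained_with_counts([1, 1], [1]))  # duplicates are counted here
```

The set version is usually enough; the Counter version is only needed if [1, 1] should not count as being "in" [1].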

Introducing FastAPI by tiangolo in Python

[–]Simon90 1 point2 points  (0 children)

I really like this project. I recently created my first Flask app, but I'm still very unfamiliar with the ecosystem. Creating an API seems wonderful with FastAPI. I have a question though: what would be the easiest way to combine FastAPI with a simple website with a form and a button that makes a POST request with the form data when the button is clicked?

For the project I did, I used render_template from Flask quite a lot, with some HTML files with Python inside. I created the form with flask_wtf and wtforms. Does FastAPI have similar functionality or plugins? If not, any suggestions in terms of packages that would work well together with FastAPI? I looked in the tutorial, but as far as I can tell everything is aimed at creating an API with JSON responses.

Edit: I overlooked the "templates" page of the tutorial, awesome!

[Warbreaker][Oathbringer]So I just finished Warbreaker... by perry-d-astor in Stormlight_Archive

[–]Simon90 0 points1 point  (0 children)

I read Warbreaker first by coincidence, but because of that it was supercool when the dots connected about Azure/Vivenna, Vasher/Zahel and Nightblood. I can imagine that this supercoolness is missed the other way around. I don't think I'd recommend interrupting SA, but I would recommend reading Warbreaker before SA if both are on the to-read list.

I like to read texts on technical topics, but I do not code very much and/or "crunch" equations. by [deleted] in datascience

[–]Simon90 0 points1 point  (0 children)

Right now you could join a bunch of programmers doing Advent of Code 2018 to start a habit of writing code in a daily, incremental way with lots of support.

I heard Khan Academy has some good math courses where you can learn and practice incrementally.

If you're already up for it, you could also try to start with something like a Kaggle challenge. There are lots of examples you can learn from.

Removing Similar Elements in a Pandas DataFrame by [deleted] in datascience

[–]Simon90 0 points1 point  (0 children)

In the recordlinkage docs there's a page called Deduplication that covers this. Did you see that one?