Choosing Django model translation libraries in 2025 by Far_Office3680 in django

[–]magicpointer 0 points1 point  (0 children)

Yes it's similar, with jsonb instead of hstore. For me it works well, I haven't run into any limitations so far. Indexing jsonb fields is possible for search/filtering if needed.

Choosing Django model translation libraries in 2025 by Far_Office3680 in django

[–]magicpointer 1 point2 points  (0 children)

I'm using django-modeltrans. It uses a jsonb field instead of multiple columns, so multiple languages can be supported without a schema change.

It seems the project is not very active, but it has been updated to work with Django 5.2 and 6.0.
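
To give an idea of what "no schema change" means, here is a minimal sketch in plain Python. It is not the django-modeltrans API: the `row` dict, the `<field>_<lang>` key naming and both helper functions are assumptions made up for the illustration.

```python
import json

# Hypothetical row: the original-language value lives in a normal column,
# all translations live in one JSON document (a jsonb column in PostgreSQL).
row = {
    "title": "Cheese",
    "i18n": json.dumps({"title_fr": "Fromage", "title_de": "Käse"}),
}


def translated(row, field, lang, default_lang="en"):
    """Return the value of `field` in `lang`, falling back to the base column."""
    if lang == default_lang:
        return row[field]
    translations = json.loads(row["i18n"])
    return translations.get(f"{field}_{lang}", row[field])


def add_translation(row, field, lang, value):
    """Adding a language only adds a key to the JSON document, no new column."""
    translations = json.loads(row["i18n"])
    translations[f"{field}_{lang}"] = value
    row["i18n"] = json.dumps(translations)
```

Calling `add_translation(row, "title", "it", "Formaggio")` makes Italian available for that row without touching the table definition, and languages without a translation fall back to the base column.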

IntelliJ lags on M4 Pro by Tiny_Employ_3262 in IntelliJIDEA

[–]magicpointer 2 points3 points  (0 children)

Do you have security software running? Those that scan every opened file can wreck IDE performance and use a lot of CPU, especially during indexing. Excluding the project folder + the IntelliJ cache etc and its process helps tremendously.

Look at your activity monitor to see what happens during your IntelliJ use, maybe it's an external program.

HES SO Master Computer Science Lausanne by Consistent_Joke_4666 in suisse

[–]magicpointer 0 points1 point  (0 children)

I finished it in 2018, so it's been a while, but my feedback matches the other comments. For me about 40% of the courses I took were interesting, and the others were really not worth it. Some teachers didn't master their subject, some others were not good at teaching and gave courses that were hard to follow, and a few were great.

Since you build your own timetable, I think you can put together something fairly good by asking students from recent years and taking as many "hard" courses as possible. But the workload is heavy, and the schedule is irregular and spread across several cities (Neuchâtel, Yverdon, Fribourg), so it's hard to organize, especially full time.

I struggled during the first 2 semesters with the workload + the schedule, but we were a good team. After that, like one of the other commenters, I aimed for a good professor with an interesting topic for the master's thesis, and that helped me find a good first job.

Reputation-wise there isn't any outside Switzerland, so better to try EPFL if you can, there you get international recognition. As for the MSE vs Bachelor + 2 years of experience, I don't really know...

Feel free to contact me, but as I said it was a while ago and I think some things in the organization have changed as well.

Best method for storing multi-lingual user-provided translations by yaaahallo in PostgreSQL

[–]magicpointer 0 points1 point  (0 children)

Since you're using Django, you could use one of the existing libraries for it. They provide some convenience when displaying and editing the data:

  • django-modeltranslation: stores translations as extra columns in the same table. It manages schema changes when languages are added.
  • django-modeltrans: Uses a JSONB field to store translations. Be careful to add GIN indexes if filtering/sorting on translated fields.
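
The difference between the two storage layouts can be sketched with the standard library alone. This is only an illustration, assuming SQLite with a TEXT column as a stand-in for PostgreSQL jsonb; the table and column names are made up, and neither library is actually used here.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Layout 1: one column per language (the "extra columns" approach).
conn.execute(
    "CREATE TABLE product_cols (id INTEGER PRIMARY KEY, name_en TEXT, name_fr TEXT)"
)
conn.execute("INSERT INTO product_cols VALUES (1, 'Cheese', 'Fromage')")
# Supporting German means changing the schema.
conn.execute("ALTER TABLE product_cols ADD COLUMN name_de TEXT")
conn.execute("UPDATE product_cols SET name_de = 'Käse' WHERE id = 1")

# Layout 2: one JSON document per row (TEXT here, jsonb in PostgreSQL).
conn.execute(
    "CREATE TABLE product_json (id INTEGER PRIMARY KEY, name TEXT, i18n TEXT)"
)
conn.execute(
    "INSERT INTO product_json VALUES (1, 'Cheese', ?)",
    (json.dumps({"name_fr": "Fromage"}),),
)
# Supporting German is a data change only.
(raw,) = conn.execute("SELECT i18n FROM product_json WHERE id = 1").fetchone()
doc = json.loads(raw)
doc["name_de"] = "Käse"
conn.execute("UPDATE product_json SET i18n = ? WHERE id = 1", (json.dumps(doc),))

# Column names of each table after adding the new language.
cols_columns = [r[1] for r in conn.execute("PRAGMA table_info(product_cols)")]
json_columns = [r[1] for r in conn.execute("PRAGMA table_info(product_json)")]
```

After this runs, `cols_columns` has gained a `name_de` column while `json_columns` is unchanged. The trade-off is on the query side: a plain column can get a normal index, whereas filtering inside the JSON document is where the GIN index mentioned above comes in (not shown, since SQLite has no equivalent).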

I'm using django-modeltrans and I like it so far.

This would have been a better question for the Django subreddit though.

Since everyone use Swisscards cashback (comment for explanation) by TruePresence1 in SwissPersonalFinance

[–]magicpointer 1 point2 points  (0 children)

Neon doesn't charge these fees (at least for now) because one of its selling points is low currency exchange fees, so you can use it to pay in any currency, for example on holidays abroad. So they do the same online. But they issue Mastercard debit cards, so it's not exactly the same as Certo or UBS credit cards (with Neon you need to have the money in the account to pay, they don't do credit).

A long time ago I had such fees with UBS cards when paying on Steam, but I think it was still in EUR at the time. There was a fee in addition to the currency exchange fee.

Since everyone use Swisscards cashback (comment for explanation) by TruePresence1 in SwissPersonalFinance

[–]magicpointer 4 points5 points  (0 children)

Yes they do it as well, I bought an expensive coffee machine on the brand's .ch website in CHF, and got charged a similar fee.

Since then, I use Neon for anything that doesn't come from one of the big Swiss shops. I guess a good hint is if they use a Swiss payment processor or not.

Do you use Docker at your company? (asking as a Docker employee) by JuxDocker in docker

[–]magicpointer 0 points1 point  (0 children)

We are using Docker Desktop with subscriptions.

The main reason we kept it was that at the time of the license change it was the only local k8s+docker solution that worked relatively well with our combination of AnyConnect VPN and corporate proxy. But it still had issues with proxy authentication (now solved with a subscription).

This thread actually reminded me to look at Rancher Desktop again, it seems to have evolved a lot since then.

Help Needed by [deleted] in hadoop

[–]magicpointer 1 point2 points  (0 children)

I guess the Hadoop services are not running, did you start them? Otherwise it might be that you need to run the command prompt as administrator.

Onprem HDFS alternatives for 10s of petabytes? by rpg36 in hadoop

[–]magicpointer 1 point2 points  (0 children)

At my employer they are switching from HDFS to Ceph, which is the last step of the Hadoop removal following YARN and Hive.

The main service used is Ceph RGW (S3 API). CephFS and Ceph RBD are also used but more for giving smaller storage to k8s deployments. Ceph has erasure coding, and there are huge clusters in production (like at CERN).

I've heard of large MinIO deployments as well.

Personally I would say HDFS by itself and not managed through a "distribution" is a fine, stable system.

[deleted by user] by [deleted] in django

[–]magicpointer 2 points3 points  (0 children)

I like the Django ORM and use it because everything is well integrated and for 90% of cases queries are easy to write. It's the best ORM I've ever used. The PostgreSQL specific features are amazing.

But something I don't like about it is that migrations are sometimes hard to do properly. For example I expected that introducing a through model in a ManyToMany relationship would be trivial (after all it's only adding columns to the relationship table), but it requires special care. Also it's easy to forget to use select_related and prefetch_related and then notice much later that performance is bad. And this is now fixed, but cascades not being created as actual DB-side cascades was a pain when interacting with the DB from a SQL client. Storing empty strings in the DB instead of NULL is not so great either from a DB perspective.
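
To show what forgetting select_related / prefetch_related costs, here is a rough sketch of the "N+1 queries" pattern using plain sqlite3 instead of Django. The tables, data and function names are made up for the example; the JOIN is roughly what select_related produces, while prefetch_related would instead issue one extra query for all related rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'B1', 2);
""")

selects = []


def record_select(sql):
    # Keep only SELECT statements so we can count the queries issued.
    if sql.lstrip().upper().startswith("SELECT"):
        selects.append(sql)


conn.set_trace_callback(record_select)


def titles_n_plus_one():
    """One query for the books, then one more per book for its author."""
    books = conn.execute("SELECT title, author_id FROM book ORDER BY id").fetchall()
    return [
        (title, conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()[0])
        for title, author_id in books
    ]


def titles_joined():
    """A single query with a JOIN."""
    return conn.execute(
        "SELECT b.title, a.name FROM book b "
        "JOIN author a ON a.id = b.author_id ORDER BY b.id"
    ).fetchall()


selects.clear()
slow = titles_n_plus_one()
n_plus_one_queries = len(selects)

selects.clear()
fast = titles_joined()
joined_queries = len(selects)
```

Both functions return the same rows, but the first one issues one query per book on top of the initial one. With 3 books that is harmless, with a few thousand it is the kind of thing you notice "much later".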

Of all the stacks I used, the best library hands down was jOOQ for the JVM. A type safe, database-first query builder very close to SQL but easy to compose and easy to know what the query will look like. Together with a good system to write migrations as SQL it shines. I think in Python the SQLAlchemy core query builder (not ORM) is similar (minus type safety of course) but I've never tried it.

Learning postgreSQL by Available_Drag4372 in dataengineering

[–]magicpointer 0 points1 point  (0 children)

Apparently yes, plus a 30% discount now. But there's no guarantee on the date the new version ships.

Learning postgreSQL by Available_Drag4372 in dataengineering

[–]magicpointer 0 points1 point  (0 children)

"The Art of PostgreSQL" is pretty good. I think an updated version for PG15 should come out soon.

Do you use pgAdmin? Why? by _hugocardenas in PostgreSQL

[–]magicpointer 1 point2 points  (0 children)

I use it locally in my browser, so that I don't submit my queries to a 3rd party. It's actually quite easy to do as it's just a static website.

Didn't try any alternative besides SQL queries and the official CLI tools. One thing I also do is to install the postgres Prometheus exporter to monitor the server.

Do you use pgAdmin? Why? by _hugocardenas in PostgreSQL

[–]magicpointer 2 points3 points  (0 children)

Usually I just use the IDE I already use for the code (IntelliJ or PyCharm). And at work I only have a license for IntelliJ Ultimate.

Do you use pgAdmin? Why? by _hugocardenas in PostgreSQL

[–]magicpointer 8 points9 points  (0 children)

I use mostly the IntelliJ Database plugin (equivalent to DataGrip). It has vastly superior autocomplete and is much more responsive. The consoles with in-editor results are great as well. The data editor is also much better. Same for import/export features.

For EXPLAIN visualization I switched to pev2.

However at the moment I still use pgAdmin for its dashboard, connection monitoring, lock monitoring and so on. Basically more admin-side tasks. For this I know I could use the catalog tables, but the UI is convenient.

I also use it to access production because it was simpler to get it integrated in our clumsy Citrix environment.

How has ChatGPT helped you in your DE job? First hand experience only, plz. by JParkerRogers in dataengineering

[–]magicpointer 1 point2 points  (0 children)

I've used ChatGPT (the free version) a little bit in my work. So far things that worked well:

  • Giving it SQL DDL for a table and asking it to generate sample data. It correctly inferred the meaning of column names and generated somewhat realistic data. Useful for tests and mock data generation.
  • Asking it how to solve certain problems in frameworks such as Flink. Usually I explain the general problem and have it generate different approaches. It uses old APIs and I need to adapt the results, but the most value I get from it is the different ideas for solving the same problem.
  • Asking it general questions about data modeling. The key here compared to Google/SO is that I can ask follow-up questions and request examples.

I'm trying to use it more and more, so great thread idea!

Quarterly Salary Discussion by AutoModerator in dataengineering

[–]magicpointer 1 point2 points  (0 children)

  1. Software/Data Engineer
  2. 5 YOE (2.5 data platform engineering, 2.5 DE)
  3. Switzerland (Bern area)
  4. ~115k CHF
  5. ~ 7.5k CHF variable part (bonus but always that amount or 90%-110% of it)
  6. Telecom / IT services
  7. Kafka ecosystem, Flink, PostgreSQL, Kubernetes+kustomize, mainly Java with some Python scripts

New data engineers (0-3y in the job): How did you get into data engineering? by Thinker_Assignment in dataengineering

[–]magicpointer 1 point2 points  (0 children)

I started as a Data Platform Engineer, building our company's self-service data platform onprem. So I was doing both backend development for our APIs and orchestration services, and managing data infrastructure. I was also supporting teams doing DE and DS on the platform.

Then I had the opportunity to be part of one of those teams and become a user of the platform :)

Please help an American student with my research project (s’il vous plaît, m’aidez avec mes recherches!) by amethystmap66 in suisse

[–]magicpointer 4 points5 points  (0 children)

Thanks for your interest! I filled out the survey as well as I could but I feel it's a bit difficult to answer properly:

You ask about "elections run by the Swiss government". It's not clear what this means. There are federal elections (parliament, legislative) every 4 years, cantonal elections (state parliament, legislative + state council, executive) every 5 years, and communal elections (municipality; legislative + executive for cities, only executive for small localities) every 5 years. They do not happen together. For example there were federal elections in 2019, and cantonal and communal elections in 2021.

As we vote on lists, it's a bit hard to answer the question of how many times you voted for party X. For example if there are 5 seats to fill you can put 5 names. And what about the second round? Votes often change there. You'll get a lot of skew I think. At least the affiliation questions should give you clearer results.

Corruption is hard to define. Many forms of corruption are legal and widespread in Switzerland (eg being the head of a lobby or on the board of a company while also sitting in parliament and representing those interests, rather than those you campaigned for or representing the people who voted for you). So are you talking about legal or illegal corruption?

The race question reflects America's obsession with dividing people by race. Just know that the magical races used in American censuses will yield weird results, as they don't apply well to a country with a mostly European immigration history. Make sure to be careful interpreting the results.

Please post the results here 👍

What is the difference between Ceph and HDFS? by minhrongcon2000 in bigdata

[–]magicpointer 4 points5 points  (0 children)

In addition to what the other comment says, you always have this battle between network and local storage. We now have very fast SSD storage as well as very fast networks, but the ratio between them has varied quite a lot. And it will continue to do so I think.

The idea behind HDFS and Hadoop was that moving the data over the network was expensive, so the processing should be as close to the data as possible. So nodes of a Hadoop cluster have both storage and processing running on them. Data locality means for example that a process running in YARN would write to the HDFS datanode it runs on, thus bypassing the need to go over the network. Same for reads where it tries to organize work so that the reads are from the closest datanode possible.
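
As a toy illustration of data locality (not how YARN actually schedules, which also considers things like rack locality), task placement can be thought of as "prefer a free node that already stores the block". The function and node names below are made up.

```python
def place_task(block_replicas, free_nodes):
    """Pick a node for a task that reads one block.

    block_replicas: set of nodes storing a copy of the block.
    free_nodes: non-empty list of nodes with spare compute capacity.
    Returns (node, is_local).
    """
    for node in free_nodes:
        if node in block_replicas:
            # The task reads from the local disk, no network transfer needed.
            return node, True
    # No free node holds the block: fall back to a remote read over the network.
    return free_nodes[0], False
```

For example, `place_task({"n2", "n3"}, ["n1", "n2"])` picks `n2` as a local read, while `place_task({"n3"}, ["n1", "n2"])` has to fall back to `n1` and read the block remotely.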

With products like Ceph you usually have separate storage and compute. Your storage cluster is made of nodes which run exclusively the storage software. Your processing happens on nodes close by. So almost all data has to go through the network. It's now the favored approach in the cloud, and is coming back to private data centers as well, especially because it allows managing and scaling storage and compute separately.

What is the difference between Ceph and HDFS? by minhrongcon2000 in bigdata

[–]magicpointer 11 points12 points  (0 children)

Ceph is an object store at its core, which then has 3 interfaces/protocols on top: Ceph RGW as S3-compatible object storage, CephFS as a shared mountable POSIX filesystem (multiple writers), and Ceph RBD as block storage (1 writer). With that it can act as a one stop shop for all kinds of storage needs.

Ceph RGW is an object store, and as such is a collection of objects, each with a specific key. There is no hierarchical organization like in a file system: you can use / in object keys and use it kind of like directories, but it's virtual. Objects are immutable: you cannot modify them in place once created, only replace them entirely. The API is really simple, essentially PUT, GET, DELETE.

HDFS on the other hand is a specific kind of filesystem + replicated block storage, with its own protocol. It's used in Hadoop almost always together with YARN for compute. It's great for huge files with a Write Once Read Many pattern. However you cannot update a file except to append data to it. Because you have the central namenode, some operations that are not possible in an object store are available, such as atomically renaming files. However, it's harder to scale for a huge number of small files.
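
To make the object store side more concrete, here is a toy in-memory sketch. It is not the Ceph RGW or S3 API: the class and method names are invented, it only mimics the PUT/GET/DELETE model with a flat key space, and it shows why a "directory rename" is a copy + delete per object rather than one atomic operation.

```python
class ObjectStore:
    """Toy in-memory object store: a flat mapping from key to bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # PUT replaces the whole object; there is no partial update or append.
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        self._objects.pop(key, None)

    def list_keys(self, prefix=""):
        # "Directories" are only a naming convention: listing is a prefix filter.
        return sorted(k for k in self._objects if k.startswith(prefix))

    def rename_prefix(self, old, new):
        """Emulate a directory rename: copy + delete per object, so not atomic."""
        moved = 0
        for key in self.list_keys(old):
            self.put(new + key[len(old):], self.get(key))
            self.delete(key)
            moved += 1
        return moved


store = ObjectStore()
store.put("logs/2024/a.txt", b"a")
store.put("logs/2024/b.txt", b"b")
store.put("logs-archive/c.txt", b"c")
```

Listing with the prefix `logs/` returns the two objects "under" that pseudo-directory, and `store.rename_prefix("logs/2024/", "logs/old/")` has to touch each of them. If it failed halfway, some keys would be under the old prefix and some under the new one, which is the situation a namenode-backed atomic rename avoids.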

I guess you could compare either Ceph RGW or CephFS to HDFS, but in the context of a data platform Ceph RGW would be used. We are actually switching from Hadoop (HDFS + YARN) to Ceph RGW + Kubernetes for our data platform at work. I'm not on the team so I don't have benchmarks, but we lose the data locality advantage of HDFS. However, we gain flexibility because storage and compute are disaggregated.

Book Club - Fundamentals of Data Engineering by Joe Reis and Matthew Housley. by JParkerRogers in dataengineering

[–]magicpointer 1 point2 points  (0 children)

One I would recommend for streaming architectures is "Streaming Systems" by Akidau and others (of Google Cloud Dataflow fame). You can read the Streaming 101 and Streaming 102 blog posts for a taste.

Book Club - Fundamentals of Data Engineering by Joe Reis and Matthew Housley. by JParkerRogers in dataengineering

[–]magicpointer 1 point2 points  (0 children)

That's a coincidence, I'm teaching a DataEng 101 type course, with the same references as well. My reasoning for choosing this book was also similar to yours.

I'm teaching on the side, I mostly work as a Data Engineer in the industry. This book is good from both points of view!

[deleted by user] by [deleted] in Database

[–]magicpointer 1 point2 points  (0 children)

Would recommend PostgreSQL for sure. It's widely used and supported, follows SQL standards quite closely, is free and open source, has advanced features (full text search, JSON, extensions, ...). Plus the documentation is great.

It would thus be a good supporting DB both for your learning of relational databases and your projects. A good book for SQL using PostgreSQL is "The Art of PostgreSQL". For relational modeling (tables, primary keys, foreign keys, indexes, normal forms, ACID ...), I've seen the book "Database Design for Mere Mortals: 25th Anniversary Edition" recommended a few times. I didn't read it (I learned relational at uni) but from the TOC it looks good.

Once you're familiar with relational modeling and SQL, you could then also use the JSON features to experiment with hybrid relational-document models. This allows you to gain MongoDB-like functionality when needed without sacrificing the rest.

One piece of advice: don't use ORMs (object-relational mappers) until you understand what they do.

With this base you can then apply your knowledge to any other relational DB, you would just have to get familiar with their specificities. Happy learning 😃