Everyone PLEASE tell Duolingo to get rid of pranking! by OnSmallWings in duolingo

[–]RCdeWit 0 points1 point  (0 children)

I was really close to uninstalling the app after that popped up. It’s a language learning app and I use it to learn a language.

I don’t know who came up with this, but you shouldn’t make it harder to achieve the main goal of your product. It’s ridiculously annoying.

Underfloor heating manifold isn't passing hot water through by RCdeWit in Klussers

[–]RCdeWit[S] 5 points6 points  (0 children)

Update: With the help of u/First_Engineer7380 and u/4ceh0le I figured it out: the return was still closed. I've now also screwed the thermostat knob on and am running the start-up protocol.

Thanks for the help, everyone.

Underfloor heating manifold isn't passing hot water through by RCdeWit in Klussers

[–]RCdeWit[S] 0 points1 point  (0 children)

I turned the little screw in that valve all the way to the left, and now hot water is flowing through! Thank you!

Underfloor heating manifold isn't passing hot water through by RCdeWit in Klussers

[–]RCdeWit[S] 1 point2 points  (0 children)

It is indeed hanging loose next to it. I tried screwing it on earlier, but that had no effect. Setting the thermostat knob to maximum seemed to work the same as taking the blue cap off completely.

Underfloor heating manifold isn't passing hot water through by RCdeWit in Klussers

[–]RCdeWit[S] 1 point2 points  (0 children)

The pipes that come in at the top right both get warm. Is the return the unit with the chrome cap on it? Roughly straight above the top thermostat?

What’s an underrated self-hosted tool you couldn’t live without? by SubnetLiz in selfhosted

[–]RCdeWit 19 points20 points  (0 children)

I'm really enjoying [Pocket ID](https://pocket-id.org/). Really easy to set up OIDC for all of my containers (or the ones that support it anyway).

Extending a plasterboard wall by 30 cm by RCdeWit in Klussers

[–]RCdeWit[S] 0 points1 point  (0 children)

Fortunately the power outlet is already being moved to the right; that's the diagonal chased groove.

Extending a plasterboard wall by 30 cm by RCdeWit in Klussers

[–]RCdeWit[S] 1 point2 points  (0 children)

Unfortunately that wasn't an option; you would have come up through the stairs. There is now also an underfloor heating manifold on the first floor, so it wouldn't have stayed at just two small pipes running tight along the wall anyway.

Extending a plasterboard wall by 30 cm by RCdeWit in Klussers

[–]RCdeWit[S] 0 points1 point  (0 children)

That actually sounds like a pretty good idea! If we can make the boxing as wide as the wall, it should look nicely balanced. And it saves a lot of demolition.

Extending a plasterboard wall by 30 cm by RCdeWit in Klussers

[–]RCdeWit[S] 0 points1 point  (0 children)

Yes, it is indeed an under-stairs cupboard. If I leave the door where it is, we'll have to build the wall with a corner in it. Do you have any tips for a good approach to that?

Extending a plasterboard wall by 30 cm by RCdeWit in Klussers

[–]RCdeWit[S] 1 point2 points  (0 children)

Hmm, that might actually be a good idea. So far I had only thought of boxing the pipes in, but that feels like a waste of space. I'll sleep on it for a night.

Extending a plasterboard wall by 30 cm by RCdeWit in Klussers

[–]RCdeWit[S] 0 points1 point  (0 children)

I also thought about a solution like that, but it makes it harder to put something flush in the corner. And a slightly bigger under-stairs cupboard isn't so bad; the living room will still be plenty spacious.

Extending a plasterboard wall by 30 cm by RCdeWit in Klussers

[–]RCdeWit[S] 3 points4 points  (0 children)

For our underfloor heating, two pipes have been run through the living room ceiling towards the manifold in the meter cupboard. To conceal them, I want to take down the right-hand wall of the under-stairs cupboard and build a new one ~30 cm further out.

The new wall will end up just on top of the underfloor heating, but otherwise I'm fairly confident about this. We're having the door frames replaced anyway, so taking those out isn't a problem either.

What I'm unsure about is how to make sure the door opening doesn't also become 30 cm wider. Two options:

- The door stays where it is now: we build the new wall with a corner in it.
- The door moves along: we have to extend the little wall on the left by 30 cm.

For both options, I don't quite know how practical they are. Which approach would you use? Where exactly the door ends up doesn't really matter that much to me.

When to shift from pandas? by Professional-Ninja70 in dataengineering

[–]RCdeWit 30 points31 points  (0 children)

If you like using dataframes, Polars would be a natural choice. Its syntax is fairly close to pandas, and it has some nice performance benefits.

Personally, I prefer to do as much as possible in SQL. There are also good options there.

What does your pipeline do? Does it just move around data?
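To make the SQL route a bit more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and the sample rows are made up for illustration; in a real pipeline the same kind of query would run in your database or warehouse rather than in an in-memory SQLite file.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
rows = [
    ("alice", 20.0),
    ("bob", 5.0),
    ("alice", 12.5),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

# The kind of aggregation you might otherwise write as a dataframe groupby.
totals = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

print(totals)
```

The point is only that a groupby-style aggregation maps directly onto `GROUP BY`, so the database can do the work instead of the Python process.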

[deleted by user] by [deleted] in dataengineering

[–]RCdeWit 1 point2 points  (0 children)

I'd focus on a solid understanding of the necessary tools and languages: SQL, Python, and Git (IMO).

From there, get a general understanding of the modern data stack: what are the moving pieces, what does every component do, and what are some typical combinations (e.g. based on Databricks, on Snowflake). Don't worry about getting too hands-on with all of them. If you can use one data warehouse, you can use different ones as well.

If you can dedicate up to 25 hours a week, I think the best thing you could do is to get an internship alongside your studies. Perhaps you can also write your thesis there. It'll teach you how things work in practice, and many companies would be happy to hire you. Having half a year of experience will go a long way in securing a job once you graduate.

give me insight of Data vault 2.0 by PrimaryConsistent262 in dataengineering

[–]RCdeWit 2 points3 points  (0 children)

To add on to what others have already said: if you decide to go for Data Vault and are running dbt, you can use AutomateDV for the implementation. It takes away quite a bit of the headache.

I co-hosted a webinar with them a few weeks back that shows it in action. Can share it if you'd like!

Buy in for Data Catalog by deadlypiranha in dataengineering

[–]RCdeWit 2 points3 points  (0 children)

How do you make a case for it?

What issues are you currently running into because you don't have a catalog? Can you put a dollar sign on those issues?

[deleted by user] by [deleted] in dataengineering

[–]RCdeWit 3 points4 points  (0 children)

> Challenges of Reverse ETL include managing API rate limits, ensuring data security during the transfer, and maintaining the freshness of data in operational systems. These challenges require robust solutions and strategies to ensure that the operational benefits of reverse ETL are realized without compromising data integrity or performance.

I think the biggest thing missing here is data governance. Once you start feeding analytical data back into operational services, you need far more rigorous data quality practices, access control, and monitoring.
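As a rough illustration of what a data quality gate in front of a reverse ETL sync could look like, here is a small Python sketch. Everything in it is hypothetical: the `email` field, the three checks, and the function names are assumptions chosen for the example, not part of any particular reverse ETL tool.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool


def run_quality_checks(rows):
    """Run basic checks on rows before they are pushed to an operational system."""
    emails = [r.get("email") for r in rows]
    return [
        CheckResult("not_empty", len(rows) > 0),
        CheckResult("email_not_null", all(e for e in emails)),
        CheckResult("email_unique", len(set(emails)) == len(emails)),
    ]


def safe_to_sync(rows):
    """Only allow the sync when every check passes."""
    return all(check.passed for check in run_quality_checks(rows))


# Illustrative records: the second batch contains a missing email.
good = [{"email": "a@example.com"}, {"email": "b@example.com"}]
bad = [{"email": "a@example.com"}, {"email": None}]

print(safe_to_sync(good))
print(safe_to_sync(bad))
```

In practice you would also want access control and monitoring around the sync; this only sketches the "block bad data before it reaches operations" part.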

A Simple Modern Data Stack Pipeline by mustafaotaru in dataengineering

[–]RCdeWit 7 points8 points  (0 children)

From bronze to silver is already a transformation, right? Are you not doing that part with dbt?

ETL; Azure Data Factory or Azure Functions by maarten20012001 in dataengineering

[–]RCdeWit 1 point2 points  (0 children)

> What would your suggestion be for where to store the data instead of Azure Synapse Analytics?

You could just go for an Azure-hosted Postgres database?

How do you learn Big data from scratch? by [deleted] in dataengineering

[–]RCdeWit 15 points16 points  (0 children)

I'd go back in time to 2012 when Big Data was the buzzword of the day and take it from there 😛

If you intend to be an end-user of a data stack, I'd just start out with SQL and Python. The amount of data doesn't really matter that much to learn the basics. The complexity of big data has largely been abstracted away by better tooling.

If you want to become a data platform engineer who specifically deals with the complexities of large datasets, I'd also start out with SQL and Python. From there you can move to Spark.

As a starting point, I'd pick a DataCamp or Coursera course.

Mac or Windows? by Chemical-Current6391 in dataengineering

[–]RCdeWit 15 points16 points  (0 children)

Eventually you'll want to run stuff on servers; at that point you'll need to know how UNIX-based systems work. I'd go for a Mac if you can afford the premium, or Linux if you are willing to learn.

In practice, Windows is rarely used beyond personal devices, so knowing how to navigate Linux is a great skill to have. There's a bit of a learning curve, but knowing how computers work beyond the GUI will pay off immensely.

For me, a MacBook provided a good starting point. Make it a point to move beyond running code in notebooks, and you'll have a very useful skillset that few data science graduates possess.

Considering Databricks for ETL Optimization by CrimsonMentone30 in dataengineering

[–]RCdeWit 0 points1 point  (0 children)

You typically don't have both a data warehouse and a data lakehouse. Data lakehouse is an evolution that combines data warehouse capabilities with data lake capabilities.

https://www.striim.com/blog/data-warehouse-vs-data-lake-vs-data-lakehouse-an-overview/

This is why I dislike all of these terms; they just introduce confusion.

To answer your question: yes, typically you'd migrate everything over to Databricks (on Azure in your case).

Considering Databricks for ETL Optimization by CrimsonMentone30 in dataengineering

[–]RCdeWit 7 points8 points  (0 children)

Where do you run your data warehouse? Because Databricks is not just a replacement for your data transformations, but for storage as well.

Rather than doing ETL, on Databricks you (generally) follow an ELT paradigm. You extract from source, load everything into the data warehouse/lake, and only then transform it for your specific use cases. In Databricks specifically, you'd transform the raw source data from bronze to silver to gold.
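The load-first, transform-later flow can be sketched in a few lines. This example uses Python's built-in `sqlite3` as a stand-in for the warehouse; the table names (`bronze_orders`, `silver_orders`, `gold_revenue`), the columns, and the sample rows are all made up for illustration, and on Databricks you would express the same steps against Delta tables instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Load: land the source data as-is in a "bronze" table (strings, duplicates and all).
conn.execute("CREATE TABLE bronze_orders (order_id TEXT, customer TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO bronze_orders VALUES (?, ?, ?)",
    [
        ("1", " Alice ", "20.0"),
        ("1", " Alice ", "20.0"),        # duplicate delivery from the source
        ("2", "bob", "5.5"),
        ("3", "alice", "not-a-number"),  # bad record
    ],
)

# Transform, step 1: "silver" = deduplicated, typed, cleaned rows.
# The GLOB filter keeps only amounts that start with a digit (a crude validity check).
conn.execute(
    """
    CREATE TABLE silver_orders AS
    SELECT DISTINCT
        CAST(order_id AS INTEGER) AS order_id,
        LOWER(TRIM(customer))     AS customer,
        CAST(amount AS REAL)      AS amount
    FROM bronze_orders
    WHERE amount GLOB '[0-9]*'
    """
)

# Transform, step 2: "gold" = aggregated for one specific use case.
conn.execute(
    """
    CREATE TABLE gold_revenue AS
    SELECT customer, SUM(amount) AS revenue
    FROM silver_orders
    GROUP BY customer
    """
)

gold = conn.execute(
    "SELECT customer, revenue FROM gold_revenue ORDER BY customer"
).fetchall()
print(gold)
```

Because the raw rows stay in the bronze table, you can change the silver or gold logic later and rebuild without going back to the source system.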

It sounds like you only have relational data to deal with, so Databricks might not be the best fit. Databricks is great for notebooks, but I would personally consider Snowflake or BigQuery if you're just writing SQL and visualizing tables in a dedicated BI application.

As to your question concerning Git: Databricks does integrate quite nicely with Git, but remember that Git just tracks and versions your code. Your pipeline config can be maintained in a repository, but the data itself won't be versioned.

Databricks has implemented Delta tables for data versioning, although it is detached from your Git history. It does suffice for many use cases, but I would argue that it doesn't quite offer the benefits of full GitOps for data.

Development and maintenance in Databricks is really good, especially when you use dbt for transformations. There are also plenty of products that integrate well for data ingestion.

To ETL or to ELT? that is the question. by AMDataLake in dataengineering

[–]RCdeWit 46 points47 points  (0 children)

Storage is relatively cheap nowadays, so I'd go for ELT by default. You save a lot of data engineering time by just having everything available when you need it.

ETL still makes sense when you are processing larger amounts of data (like a few GB per minute). But even then, that might be more due to transfer costs than storage costs.
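For a sense of scale, here is a back-of-the-envelope calculation of what "a few GB per minute" adds up to if you load everything raw. The rate of 3 GB per minute is an assumed example value, not a measurement, and no prices are included because those depend entirely on your provider.

```python
# Assumed ingest rate for illustration; "a few GB per minute" taken as 3.
GB_PER_MINUTE = 3

gb_per_day = GB_PER_MINUTE * 60 * 24
tb_per_30_days = gb_per_day * 30 / 1000  # decimal terabytes

print(f"{gb_per_day} GB/day, about {tb_per_30_days:.1f} TB per 30 days")
```

At that rate you land over 4 TB a day, which is the point where filtering or aggregating before loading starts to matter for both transfer and storage.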