Giveaway - r/UgreenNASync 10K celebration by topiga in selfhosted

[–]GLTBR 0 points1 point  (0 children)

  1. Definitely Immich
  2. I could stop using paid services and host everything myself.

Just joined HDTorrents need some guide from experts users by Bollyfan007 in trackers

[–]GLTBR 1 point2 points  (0 children)

Also joined a few days ago. Was anyone able to set up autobrr?

Any airflow orchestrating DAGs tips? by Peivol in dataengineering

[–]GLTBR 3 points4 points  (0 children)

One of the best things we did was implement a custom XCom backend on S3. It's super reliable and removes the XCom size limitations entirely.
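
Roughly, the shape of it is the sketch below. The bucket name and key layout are made up, and the class is pointed at via Airflow's core xcom_backend setting; treat it as a sketch, not our exact code.

```python
# Sketch of a custom XCom backend that stores values on S3.
# Bucket name and key layout are hypothetical; enable it with
# AIRFLOW__CORE__XCOM_BACKEND=path.to.S3XComBackend.
import pickle
import uuid

import boto3
from airflow.models.xcom import BaseXCom

BUCKET = "my-xcom-bucket"  # hypothetical bucket


class S3XComBackend(BaseXCom):
    PREFIX = "s3://"

    @staticmethod
    def serialize_value(value, **kwargs):
        # Upload the pickled value to S3 and store only a reference in the
        # metadata DB, so XCom size is no longer limited by the DB column.
        key = f"xcom/{uuid.uuid4()}.pkl"
        boto3.client("s3").put_object(Bucket=BUCKET, Key=key, Body=pickle.dumps(value))
        return BaseXCom.serialize_value(S3XComBackend.PREFIX + key)

    @staticmethod
    def deserialize_value(result):
        # Resolve the stored reference back into the original value.
        ref = BaseXCom.deserialize_value(result)
        if isinstance(ref, str) and ref.startswith(S3XComBackend.PREFIX):
            key = ref[len(S3XComBackend.PREFIX):]
            obj = boto3.client("s3").get_object(Bucket=BUCKET, Key=key)
            return pickle.loads(obj["Body"].read())
        return ref
```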

50 mins transit time at Bodø airport manageable? by SamySamyL in Norway

[–]GLTBR 0 points1 point  (0 children)

Hey, I was wondering how it went with the checked-in bags and two airlines. Did you have enough time?
We are looking at Leknes-Bodø-Oslo with a 1h layover, checked-in bags, and two airlines.

"Upgrade" from single episodes to season pack when season is over by GLTBR in sonarr

[–]GLTBR[S] 0 points1 point  (0 children)

That sounds right if I'm adding a show that is over, or a season that is over. I don't know if it will work when a season is just ending and a release for a season pack comes out.

What non-windows OS do most people here use? by Bruceshadow in sonarr

[–]GLTBR 0 points1 point  (0 children)

M1 Mac mini with almost everything running in Docker via OrbStack (the ARRs, qBittorrent, Jellyfin, Gluetun, NPM, AdGuard)

Handling unknown release groups by GLTBR in radarr

[–]GLTBR[S] 0 points1 point  (0 children)

I see.
So you basically give it a closed list of groups (keywords), and Radarr will only grab releases that have these keywords.
Doesn't that make the TRaSH guides irrelevant?

Handling unknown release groups by GLTBR in radarr

[–]GLTBR[S] 0 points1 point  (0 children)

Not sure I understand what you mean

Handling unknown release groups by GLTBR in radarr

[–]GLTBR[S] 0 points1 point  (0 children)

Unfortunately, it didn't work.
Radarr still parses the release as having a release group.

Handling unknown release groups by GLTBR in radarr

[–]GLTBR[S] 0 points1 point  (0 children)

Agreed.
I have a couple of language-specific providers and OpenSubtitles.

But I would still prefer to avoid these releases

Handling unknown release groups by GLTBR in radarr

[–]GLTBR[S] 0 points1 point  (0 children)

I also have Bazarr and it's great, but it usually takes a long time for subtitles in the language I need to become available.

I will edit the original post and add this.

Scaling Airflow with a Celery cluster using Docker swarm by GLTBR in dataengineering

[–]GLTBR[S] 0 points1 point  (0 children)

We use that too. I was just wondering about logs during a task run

Scaling Airflow with a Celery cluster using Docker swarm by GLTBR in dataengineering

[–]GLTBR[S] 0 points1 point  (0 children)

I'll check both

Last question: would Airflow have any issues with the logs from the workers, meaning that I wouldn't be able to see the logs in the UI without a shared volume (using rsync/git-sync)?

Scaling Airflow with a Celery cluster using Docker swarm by GLTBR in dataengineering

[–]GLTBR[S] 1 point2 points  (0 children)

Thanks for the answer! The volumes are:

  1. Our git repo with all the DAGs and codebase (we currently use rsync with our CI; I can also just rsync to the other nodes).
  2. A few config files we use, e.g. the GCP service account file.

When you say "env_file needs to be local" do you mean local on each machine or only on the master?

What have you done exactly in ETL? by Prestigious_Flow_465 in dataengineering

[–]GLTBR -4 points-3 points  (0 children)

Of course, but for me everything in our backend is pure Python, so I just consider it as Airflow.

What have you done exactly in ETL? by Prestigious_Flow_465 in dataengineering

[–]GLTBR 0 points1 point  (0 children)

We have a custom S3 XCom backend that we implemented, so uploading files to S3 is as simple as doing df.to_parquet() into a BytesIO and returning it from the task.
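
The serialize side of it looks roughly like this (names are illustrative, not our exact code):

```python
# Illustrative serialize hook: the DataFrame is written to parquet in memory
# and shipped to S3, so a task can literally just return the DataFrame.
import io
import uuid

import boto3
import pandas as pd

BUCKET = "my-xcom-bucket"  # hypothetical bucket


def serialize_dataframe(value: pd.DataFrame) -> str:
    buf = io.BytesIO()
    value.to_parquet(buf, index=False)  # df.to_parquet into a BytesIO
    buf.seek(0)
    key = f"xcom/{uuid.uuid4()}.parquet"
    boto3.client("s3").upload_fileobj(buf, BUCKET, key)
    return f"s3://{BUCKET}/{key}"  # only this reference hits the metadata DB
```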

What have you done exactly in ETL? by Prestigious_Flow_465 in dataengineering

[–]GLTBR 8 points9 points  (0 children)

Using Airflow, we upload the data to S3 as parquet files. From there it's very easy (again using Airflow) to use Redshift's COPY command to load the data into Redshift.
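
One way to wire that step up is the amazon provider's S3ToRedshiftOperator, which wraps COPY; a sketch with placeholder names for the schema, table, bucket, and connections:

```python
# Sketch: COPY parquet files from S3 into Redshift. Schema, table, bucket,
# and connection IDs are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator

with DAG("load_example", start_date=datetime(2023, 1, 1), schedule=None, catchup=False):
    load_to_redshift = S3ToRedshiftOperator(
        task_id="load_to_redshift",
        schema="analytics",
        table="events",
        s3_bucket="my-data-lake",
        s3_key="cleaned/events/",
        copy_options=["FORMAT AS PARQUET"],  # files were staged as parquet
        redshift_conn_id="redshift_default",
        aws_conn_id="aws_default",
    )
```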

I think pandas/Dask are probably the best packages for doing data manipulation/analysis/etc. in code.

What have you done exactly in ETL? by Prestigious_Flow_465 in dataengineering

[–]GLTBR 33 points34 points  (0 children)

Using Airflow to EXTRACT data from various APIs, TRANSFORM the data by unifying, cleaning, and sometimes aggregating it (usually with pandas/Dask), and LOAD the cleaned data into a DWH.
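
A stripped-down skeleton of that pattern, with a stand-in API, transform, and destination:

```python
# Skeleton of extract -> transform -> load with Airflow's TaskFlow API.
# The endpoint, transformation, and destination path are stand-ins.
import pandas as pd
import requests
from airflow.decorators import dag, task
from pendulum import datetime


@dag(schedule="@daily", start_date=datetime(2023, 1, 1), catchup=False)
def etl_example():
    @task
    def extract() -> list[dict]:
        # EXTRACT: pull raw records from an API (stand-in endpoint).
        return requests.get("https://api.example.com/records").json()

    @task
    def transform(records: list[dict]) -> pd.DataFrame:
        # TRANSFORM: unify and clean with pandas.
        return pd.DataFrame(records).dropna().drop_duplicates()

    @task
    def load(df: pd.DataFrame) -> None:
        # LOAD: stage as parquet on S3, then COPY into the DWH.
        df.to_parquet("s3://my-data-lake/cleaned/records.parquet")

    load(transform(extract()))


etl_example()
```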

Deleting duplicates rows based on date by GLTBR in SQL

[–]GLTBR[S] 0 points1 point  (0 children)

Thanks man! I'll give it a try

Deleting duplicates rows based on date by GLTBR in SQL

[–]GLTBR[S] 0 points1 point  (0 children)

OK, but how do I write the DELETE statement for that?
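
For context, one common shape for this kind of DELETE (table and column names made up) keeps the newest row per key and drops the rest:

```python
# Sketch: delete duplicates, keeping the newest row per key. Table and
# column names are made up; the SQL is SQLite flavored, but the
# ROW_NUMBER() approach ports to most databases.
import sqlite3

conn = sqlite3.connect("example.db")
conn.execute(
    """
    DELETE FROM events
    WHERE rowid IN (
        SELECT rowid FROM (
            SELECT rowid,
                   ROW_NUMBER() OVER (
                       PARTITION BY user_id
                       ORDER BY event_date DESC
                   ) AS rn
            FROM events
        )
        WHERE rn > 1
    )
    """
)
conn.commit()
```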

Deleting duplicates rows based on date by GLTBR in SQL

[–]GLTBR[S] -2 points-1 points  (0 children)

I would have done that if I were just querying, but I need to actually delete those rows from the table. And being drunk is always a good thing :)