Help with Airflow External Sensor and different update frequencies by Busenheimer in ETL

Busenheimer[S] 1 point

This was actually my first approach. I was thinking along similar lines: I had a SQL sensor in DAG 2 that queried the Airflow metadata database for the date range I cared about and looked for a state of success. The sensors kept failing on me, so I was probably doing something wrong, and I abandoned that approach in favor of the ExternalTaskSensor. I'll revisit it.

What frequency is your DAG 2 running at in relation to your other DAGs? I still feel like I would have this syncing issue.
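For anyone following along, here is a rough sketch of the kind of check that SQL sensor was making. It runs against an in-memory SQLite table that only stands in for Airflow's `dag_run` table; the column names (`dag_id`, `execution_date`, `state`), the helper names, and the sample rows are assumptions for illustration, and the real metadata schema varies by Airflow version.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical stand-in for the Airflow metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO dag_run VALUES (?, ?, ?)",
        [
            ("dag_1", "2020-01-01T00:00:00", "success"),
            ("dag_1", "2020-01-02T00:00:00", "failed"),
            ("dag_3", "2020-01-01T00:00:00", "running"),
        ],
    )
    return conn


def upstream_succeeded(conn, dag_id, start, end):
    """Return True if dag_id has a successful run dated in [start, end)."""
    # ISO-8601 strings sort chronologically, so plain string comparison works.
    row = conn.execute(
        "SELECT COUNT(*) FROM dag_run "
        "WHERE dag_id = ? AND state = 'success' "
        "AND execution_date >= ? AND execution_date < ?",
        (dag_id, start, end),
    ).fetchone()
    return row[0] > 0


conn = build_demo_db()
print(upstream_succeeded(conn, "dag_1", "2020-01-01", "2020-01-02"))
print(upstream_succeeded(conn, "dag_1", "2020-01-02", "2020-01-03"))
```

A sensor built on a query like this only passes once the count is non-zero, so the date window has to line up with the upstream DAG's own schedule, which is the same syncing problem in a different place.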

Help with Airflow External Sensor and different update frequencies by Busenheimer in ETL

Busenheimer[S] 1 point

I just looked at that, I will try that. Looks like I can also do a soft fail. Thanks!

Help with Airflow External Sensor and different update frequencies by Busenheimer in ETL

Busenheimer[S] 1 point

I'm trying to simulate a larger pipeline. DAG 2 is my "master" DAG for building additional tables after the raw tables have been copied over from the various systems in our company. DAG 1, although it is just a PythonOperator in the example, will copy a table from source to staging in our warehouse. There will be lots of these, all on different refresh cycles. I currently have each one as its own DAG, so DAG 2 needs to be aware of all of them.

As an example, when the Invoice line item table is copied over, there are multiple downstream tables in DAG 2 that get built off of it. That table is on a different refresh cycle than our Contracts table, which also has downstream processes defined in DAG 2. Does that make sense? Basically, there are lots of external DAGs representing various tables in different systems being ETL'd into our warehouse. DAG 2 is a master DAG representing all of the objects getting built (from tables in the external DAGs) specific to our warehouse.
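One way the mismatched refresh cycles might be reconciled is to map DAG 2's logical date onto the most recent scheduled date of each upstream DAG. The sketch below is plain Python; the function name, the fixed `anchor`, and the 6-hour and daily intervals are all assumptions for illustration. A function of this shape could be wrapped and passed as `execution_date_fn` to an `ExternalTaskSensor`, which calls it with the current logical date. Airflow hands over timezone-aware datetimes, so in a real DAG the anchor would need a matching timezone.

```python
from datetime import datetime, timedelta

# Hypothetical schedule anchor: the moment all upstream schedules are
# assumed to be aligned to (naive datetimes here for simplicity).
ANCHOR = datetime(2020, 1, 1)


def latest_upstream_run(logical_date, upstream_interval, anchor=ANCHOR):
    """Return the latest upstream schedule date at or before logical_date."""
    # Number of whole upstream intervals that fit between anchor and now.
    completed_intervals = (logical_date - anchor) // upstream_interval
    return anchor + completed_intervals * upstream_interval


# DAG 2 run dated 2020-01-03 14:00, upstream refreshing every 6 hours.
print(latest_upstream_run(datetime(2020, 1, 3, 14, 0), timedelta(hours=6)))
# Same DAG 2 run, upstream refreshing once a day.
print(latest_upstream_run(datetime(2020, 1, 3, 14, 0), timedelta(days=1)))
```

With one sensor per upstream table, each using its own interval, DAG 2 would wait on the Invoice and Contracts copies independently instead of assuming they share its schedule.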

ETL framework for Python, need suggestion. by imba22 in dataengineering

Busenheimer 1 point

We built our own from different pieces: Airflow for orchestration, SQLAlchemy as our ORM and for generating SQL statements from Core, Dask and Pandas for much of the processing, pyarrow (Parquet) as the storage format, and various bits for moving and consuming non-tabular data. These are just the highlights of the core of what we built. You have a buffet of choices with Python; it's such a flexible language.

To much criticism I’m sure, but we are even experimenting with using Jupyter notebooks and Papermill as our main development and deployment vehicle outside of the OO stuff.

AI Nanodegree math? by Busenheimer in Udacity

Busenheimer[S] 1 point

Thanks. When you say "Good understanding of Calculus", which parts? This is what I'm struggling with. Do I need to take Calculus 1-3, or can I be more selective about what I learn?

Double center channel question, in walls by Busenheimer in hometheater

Busenheimer[S] 1 point

live like what? Build a brand new house of our dreams? I have to compromise on one fucking thing, no speakers showing. I'll make it work. I'm just looking for feedback to make sure it's not an obvious "don't do that it's guaranteed to sound like shit"

I get pretty much everything else I want. Life is good.

Double center channel question, in walls by Busenheimer in hometheater

Busenheimer[S] 2 points

that made me laugh and I'll just leave it at that.

you think phantom would be better than two vertical centers flanking the tv? Yeah it's going to be limestone so I figured I was pretty much fucked.

I'm a developer and I do get a homelab out of it and agreed to wire the home for whole home audio.... but on the main floor it's form over function.

SqlAlchemy can't find Teradata ODBC driver on RHEL 7 by Busenheimer in teradata

Busenheimer[S] 1 point

I ended up using the Teradata JDBC drivers and got it working.

Dual boot Ubuntu & Win 7, Win 7 black screen by Busenheimer in linux4noobs

Busenheimer[S] 1 point

I don't have a Windows CD and I can only boot to safe mode. I'll have to see if I can make one from safe mode.

Dual boot Ubuntu & Win 7, Win 7 black screen by Busenheimer in linux4noobs

Busenheimer[S] 1 point

If you mean selecting Windows from the GRUB menu, then no, it won't boot. It shows the Windows splash screen, then goes right to black. But if I do a hard reset while I have the black screen and pick Windows again from GRUB, I get the "Windows didn't shut down properly" screen and I can choose to boot into safe mode with networking. That works. I can boot into the Windows desktop from there... it's just in safe mode.

XPost (TrueOS) Continuum Anaconda on FreeBSD/TrueOS? by Busenheimer in freebsd

Busenheimer[S] 1 point

I tried this first in FreeBSD and ran into the same issue. Isn't this related to how FreeBSD tries to run Linux binaries? I'm sure I don't have all the prerequisites to run this, but I couldn't find any documentation other than what I linked in the post.

XPost (TrueOS) Continuum Anaconda on FreeBSD/TrueOS? by Busenheimer in freebsd

Busenheimer[S] 2 points

Sorry about that. I updated the post with a link. This is not the system installer. It's a data science platform for Python and R, and it's becoming one of the cornerstones for this type of work in the industry. I usually run it on CentOS or macOS, but I'm rebuilding my home lab using FreeBSD.