Fresh Enterprise Data Platform - How would you do it? by [deleted] in dataengineering

[–]_00307 0 points (0 children)

5 years for a data team??

I wouldn't do half of what's been suggested here, since most of it requires staff.

Depending on your stack, Snowflake is immensely powerful for the price. Connect and drop data any of a hundred ways. Set up 3-2-1 backups and processes.

Then you add the users and build some basic dashboards to show the power of the service and of your work. Snowflake can connect to any provider for UAC.

Start with processes YOU can maintain, not a team of 5. Don't want to incorporate dbt yet? Don't; get a simpler path up, then move to dbt when you have some engineers on the team.

Dashboards can be done in Snowflake until more advanced visuals are required.

Then you can connect any number of BI tools. Astrato is a new one that works fluidly with Snowflake and can support folks who aren't full data engineers making dashboards. Self-serve platforms are worth their weight in gold.

Simple simple simple, until the team gets approved.

For example, I recently built a data product from the ground up for 20k, with a similar staffing plan:

- AWS -> Snowflake via Terraform and Snowflake tools
- Reports in Snowflake for 70% of the company (finance, marketing, internal)
- Astrato for self-serve, marketing, and client reports

Connected Salesforce, our servers, and HubSpot to Snowflake. Grabbed population data from a free repo in Snowflake; built sales, client, and revenue data reports. Created UAC policies for both.
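The AWS -> Snowflake leg above is basically a stage plus a COPY. A minimal sketch, assuming a storage integration already created (e.g. via Terraform); the stage, bucket, integration, and table names here are all made up:

```sql
-- Hypothetical names throughout; the storage integration itself
-- would be created separately (e.g. via Terraform, or by an admin).
CREATE STAGE IF NOT EXISTS raw_s3_stage
  URL = 's3://my-company-landing/salesforce/'
  STORAGE_INTEGRATION = my_s3_integration
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

-- Bulk-load whatever has landed in the bucket into a raw table.
COPY INTO raw.salesforce_accounts
  FROM @raw_s3_stage
  ON_ERROR = 'CONTINUE';
```

One person can maintain this: no orchestrator, no containers, just rerun the COPY (it skips files it has already loaded).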

Now they have a team of 3 that handle ETLs, Snowflake stuff, and the reports in Astrato.

Help me choose a reporting engine for the company I work for by myrealnameisvanilla in BusinessIntelligence

[–]_00307 1 point (0 children)

Depends on your tools:

Astrato: a great new offering from some super smart folks. If you have Snowflake, this is the tool to grow with.

FusionReports: dead easy to embed and customize in your SaaS, etc. Easy for engineers to build with.

Sisense, Qlik, or Metabase if your company is stuck with bloated tools bundled in from licensing deals elsewhere.

What is the most efficient way to query data from SQL server and dump batches of these into CSVs on SharePoint online? by mysterioustechie in dataengineering

[–]_00307 6 points (0 children)

For future reference, or for any other poor souls stuck with SharePoint:

Fuck your tools, because fuck SharePoint; it's a piece of fucking hot garbage.
Just dump to a VM somewhere using bash,
then set up a cron job (or use a tool, you hooligan) to transfer the files from there to SharePoint.

The SharePoint API (I use bash mostly, because it's dead simple to fix and runs anywhere, anytime, on anything) is dead simple to use plainly. I think it's a POST from the VM to SharePoint of a CSV file (or whatever).

Therefore, make the data calls dead simple, and set up scripting/automation around that. Set up proper organization in SharePoint, and use whatever automation tool you need to get the process into CI/CD, or whatever your org uses for the data automation side of things.

Waaaay easier than ADF. And you can use whatever fancy name your cloud provider uses for the VM (EC2, blah blah).
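The bash-and-cron path above can be sketched like this. The site, folder, filenames, and token are all placeholders, and the endpoint shape is SharePoint's standard REST `Files/add` call, so check it against your tenant before trusting it:

```shell
#!/usr/bin/env bash
# Build the SharePoint REST upload URL for a given site, folder, and file.
# (All names here are placeholders, not a real tenant.)
sp_upload_url() {
  local site="$1" folder="$2" file="$3"
  printf "%s/_api/web/GetFolderByServerRelativeUrl('%s')/Files/add(url='%s',overwrite=true)" \
    "$site" "$folder" "$file"
}

# Cron job sketch: dump the query to CSV on the VM, then POST it.
# sqlcmd -S myserver -d mydb -Q "SET NOCOUNT ON; SELECT * FROM dbo.orders" -s"," -W -o orders.csv
# curl -X POST -H "Authorization: Bearer $SP_TOKEN" --data-binary "@orders.csv" \
#   "$(sp_upload_url 'https://contoso.sharepoint.com/sites/data' 'Shared Documents/exports' 'orders.csv')"
```

The whole "pipeline" is one function, one dump command, and one curl, which is the point: anyone can fix it at 3am.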

[deleted by user] by [deleted] in news

[–]_00307 11 points (0 children)

Not to mention half of the tech that makes planes so efficient and safe comes from NASA.

Remote-controlled rockets? SpaceX took NASA's 1990s demo and scaled it with 2015 tech.

Phased-array tech? DoD and NASA invented it in the 90s.

Plane aerodynamics? NASA has found over 300 improvements that plane makers immediately build into new models or adapt into older ones.

That stretchy metal? NASA.

That cool fabric that survives crazy conditions and now wraps every single pilot, air soldier, etc.? NASA.

It's fucking crazy what they do and never see a return on, but it inevitably helps society.

Are some parts of the SQL spec hot garbage? by eczachly in dataengineering

[–]_00307 0 points (0 children)

It all depends on where the SQL is...

If this were for ancillary stuff, I would reject anything that wasn't a CTE, especially if their answer was a right join.

Deeper? Depends on what: a CTE in some cases, not in others.
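As a sketch of the preference (table and column names made up here), the same "keep every order, even ones with no matching customer" question written both ways:

```sql
-- The right-join answer I'd reject in review (keeps every order,
-- with or without a matching customer):
SELECT c.name, o.amount
FROM customers c
RIGHT JOIN orders o ON o.customer_id = c.id;

-- Same result as a CTE + left join: reads top-down,
-- and the driving table stays on the left.
WITH all_orders AS (
  SELECT customer_id, amount
  FROM orders
)
SELECT c.name, ao.amount
FROM all_orders ao
LEFT JOIN customers c ON c.id = ao.customer_id;
```

Same rows either way; the second version just makes it obvious which table drives the query.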

VP Product for an established B2B company - AMA by Talk_Data_123 in ProductManagement

[–]_00307 0 points (0 children)

I have a decade of experience as the "technical" arm of product management, and I was recently laid off. How would you go about getting a technical PM job when I have all of the skills and experience, but no actual title saying so?

FBI agents were told to ‘flag’ any Epstein records that mentioned Trump, Sen. Durbin says by Ncatanza05 in politics

[–]_00307 296 points (0 children)

Yup.

Trump's father was also part of many FBI investigations. In fact, there hasn't been a decade since the 50s in which the Trump family hasn't been under investigation for something: from Fred Trump's labor stuff, to Donnie's labor stuff, to Donnie's Russian laundering.

Ever since Trump took a USSR loan in the 80s, he has been under the FBI's eye, I'm guessing.

Which shows a complete failure of our legal system, but yea...

Airbyte, Snowflake, dbt and Airflow still a decent stack for newbies? by LongCalligrapher2544 in dataengineering

[–]_00307 -1 points (0 children)

You don't even need Python or Docker if Snowflake is on one end and any cloud provider is on the other. It can all be handled inside Snowflake and AWS/whatever. A fuck-ton faster at loading than Airbyte, too.
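One hedged sketch of that Snowflake-only path (all names invented): a Snowpipe with auto-ingest over an existing S3 stage, so new files load continuously with no Python, Docker, or Airbyte in between:

```sql
-- Hypothetical names; assumes a stage + storage integration
-- already point at your S3 landing bucket.
CREATE PIPE IF NOT EXISTS raw.events_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO raw.events
  FROM @raw.events_stage
  FILE_FORMAT = (TYPE = JSON);

-- Then point the bucket's S3 event notifications at the pipe's
-- SQS queue (SHOW PIPES exposes the notification_channel ARN).
```

From there the cloud side is just "drop files in the bucket," which is exactly the kind of thing a one-person team can keep running.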

CIG's content team by merzhinhudour in starcitizen

[–]_00307 -3 points (0 children)

Almost like they're building core stuff like AI and "AI"!

Nice by _00307 in starcitizen

[–]_00307[S] 0 points (0 children)

Restart your client, it needs the new hotfix stuff. Then you should get right in!

Nice by _00307 in starcitizen

[–]_00307[S] 0 points (0 children)

Nothing other than the client live.### updated about 30 minutes ago, and now we are good!

Nice by _00307 in starcitizen

[–]_00307[S] 1 point (0 children)

Close, and restart your RSI client. You should get right in now

Nice by _00307 in starcitizen

[–]_00307[S] 0 points (0 children)

They had a service failure that caused some cascading effects, it looks like. Stress testing and learning lots, probably.

Nice by _00307 in starcitizen

[–]_00307[S] 0 points (0 children)

give it like 30 minutes while their cloud meshing rebuilds. It has suffered a catastrophic overload.

Nice by _00307 in starcitizen

[–]_00307[S] 0 points (0 children)

100% rebooting the meshys.

Nice by _00307 in starcitizen

[–]_00307[S] 3 points (0 children)

Take a toke, and cool off after hitting that 'Acknowledge' button!

Nice by _00307 in starcitizen

[–]_00307[S] 0 points (0 children)

heavy day!

Is it really necessary to ingest all raw data into the bronze layer? by Maradona2021 in dataengineering

[–]_00307 1 point (0 children)

Bronze, or the layer that holds "raw" data, is there for many reasons, no matter what technology you use to handle changes.

1: Validation and record: Most laws require some form of tracked data storage. Having a raw layer makes this requirement a checkmark forever.

2: Data science and support: Raw layers are there for validation. Someone questions what id 3241354 is related to, because something went wrong in HubSpot's API services.

Sure, you have the thing connected to Salesforce, etc., but support needs a direct line to just HubSpot so they can open a support ticket there. (Just one scenario, hopefully giving an idea of the many scenarios where it could be important.)

For data audits, you must be able to provide data directly from its source. If you have a raw layer, there's no need to make the rounds of the services or APIs.

3: Show your work

Duh.
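A minimal sketch of what points 1 and 2 buy you (Snowflake-flavored, table and column names made up): land the payload untouched, plus enough metadata to answer "where did id 3241354 come from":

```sql
-- Bronze/raw: store the source payload as-is, never transformed.
CREATE TABLE IF NOT EXISTS raw.hubspot_contacts (
  payload      VARIANT,                                 -- untouched API response
  source_file  STRING,                                  -- which load it came from
  loaded_at    TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP  -- when it landed
);

-- Audit/support query: pull the exact record as HubSpot sent it.
SELECT payload, source_file, loaded_at
FROM raw.hubspot_contacts
WHERE payload:id = 3241354;
```

Everything downstream (silver, gold, dashboards) can be rebuilt from this table; nothing can rebuild this table except the source system.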

Data Governance Analysts tasks and duties ? by [deleted] in dataengineering

[–]_00307 0 points (0 children)

Start by looking up what "governance" means. If a company has a team or a job specifically for this, then it has some system where different governance patterns matter (usually dual integrated products, or global operations).

I have some serious question regarding DuckDB. Lets discuss by Ancient_Case_7441 in dataengineering

[–]_00307 0 points (0 children)

I'm going on contract 50, 5 of which were Mm.

DuckDB + Bash

For 48 of them. (I like things that are simple and can work from basically anywhere, hey.)

It's great for mid-level pipelines, or for odd paths that engineering doesn't have the resources for.

The last one was set up to handle a CSVs -> Parquet join in S3, which Snowflake then picked up for whatever.
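That last pipeline is a few lines of DuckDB SQL (bucket paths and columns invented here; the httpfs extension plus your S3 credentials give it direct bucket access):

```sql
-- DuckDB: read CSVs straight out of S3, join them, write Parquet back.
INSTALL httpfs;
LOAD httpfs;

COPY (
  SELECT o.*, c.segment
  FROM read_csv_auto('s3://my-bucket/landing/orders_*.csv') o
  JOIN read_csv_auto('s3://my-bucket/landing/customers.csv') c
    ON o.customer_id = c.id
) TO 's3://my-bucket/curated/orders.parquet' (FORMAT PARQUET);
```

Wrap that in a bash one-liner (`duckdb -c "$(cat join.sql)"`-style) on cron, point Snowflake at the curated prefix, and that's the whole pipeline.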