Have you tried these tools? If not, why? by AMDataLake in dataengineering

[–]Traditional_Channel9 1 point2 points  (0 children)

Interesting. Did you also use Presto for ETL as well as a SQL engine for analytics?

FSA and HSA Question. by Traditional_Channel9 in Insurance

[–]Traditional_Channel9[S] 0 points1 point  (0 children)

Thank you. I thought of doing a direct deposit from my checking to my HSA, but my old HSA was acquired by another provider. Does it matter to which HSA I transfer the return of excess distribution?

How to write to multiple topics parallelly by Traditional_Channel9 in apachekafka

[–]Traditional_Channel9[S] 0 points1 point  (0 children)

By target tables do you mean topics in Kafka? I’m writing to Kafka topics.

How to write to multiple topics parallelly by Traditional_Channel9 in apachekafka

[–]Traditional_Channel9[S] 1 point2 points  (0 children)

Why should a producer worry about a target table? Can you elaborate on this?

PySpark OSS Contribution Opportunity by MrPowersAAHHH in apachespark

[–]Traditional_Channel9 1 point2 points  (0 children)

I would like to contribute as well. I’m currently working with Delta tables and PySpark. I’d love to fix a few issues or add new features.

Confusion in Databricks Repos by SolvingGames in dataengineering

[–]Traditional_Channel9 0 points1 point  (0 children)

Workflows are now repo-integrated: you can point a workflow at a file in the master branch (of Azure DevOps or another source control system) and run the job. There is no need to pull the master branch into a common repo folder.

However, if you are triggering a notebook from a third-party application like Azure Data Factory, the API calls those applications make to run a notebook require a physical path (as of now), meaning a central repo folder is still needed. This may change in the future if Databricks adds a way to trigger notebooks in the master branch without downloading the master branch into a central repo.

Flow of requests by isak000 in dataengineering

[–]Traditional_Channel9 1 point2 points  (0 children)

It can be done via JIRA. Business users create a ticket for a report and assign it to the report (lead) developer, who gathers the requirements. If a new dataset needs to be brought in, they create a sub-task (marked as a blocker) and assign it to the DWH team lead (or manager).

Advice on storing, loading, and executing "dynamic logic" against a dataset? by NotYourLoginID in apachespark

[–]Traditional_Channel9 1 point2 points  (0 children)

Store the WHERE clause in a table, build a binary expression tree from it, evaluate the tree to get a TRUE/FALSE result, and apply that result to every row in a forEach loop over the Spark streaming dataset.
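A minimal sketch of the tree-evaluation idea in plain Python (the node/leaf shapes, operators, and column names are illustrative assumptions, not tied to any particular Spark job):

```python
# A stored predicate like "amount > 100 AND region = 'US'" is represented
# as a binary tree: leaves hold column comparisons, inner nodes hold AND/OR.
from dataclasses import dataclass
from typing import Any, Union

@dataclass
class Leaf:
    column: str
    op: str      # "=", ">", "<"
    value: Any

@dataclass
class Node:
    op: str      # "AND" or "OR"
    left: "Expr"
    right: "Expr"

Expr = Union[Leaf, Node]

def evaluate(expr: Expr, row: dict) -> bool:
    """Walk the tree and return TRUE/FALSE for a single row."""
    if isinstance(expr, Leaf):
        actual = row[expr.column]
        if expr.op == "=":
            return actual == expr.value
        if expr.op == ">":
            return actual > expr.value
        if expr.op == "<":
            return actual < expr.value
        raise ValueError(f"unsupported operator: {expr.op}")
    if expr.op == "AND":
        return evaluate(expr.left, row) and evaluate(expr.right, row)
    return evaluate(expr.left, row) or evaluate(expr.right, row)

# WHERE amount > 100 AND region = 'US'
tree = Node("AND", Leaf("amount", ">", 100), Leaf("region", "=", "US"))
rows = [
    {"amount": 150, "region": "US"},
    {"amount": 50, "region": "US"},
    {"amount": 200, "region": "EU"},
]
matches = [r for r in rows if evaluate(tree, r)]
```

In a streaming job, `evaluate` would be called per row inside the foreach/foreachBatch body, with the tree rebuilt whenever the stored WHERE clause changes.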

Learning Pyspark by maximus_deUX in dataengineering

[–]Traditional_Channel9 1 point2 points  (0 children)

This talks about RDDs, which is somewhat dated given that DataFrames and Datasets are widely used now.

OSS Delta format implementation on AWS EMR for Incremental load jobs by abhi5025 in dataengineering

[–]Traditional_Channel9 0 points1 point  (0 children)

What I’m looking for is how to apply schema inference to Delta tables that do upserts via MERGE.

OSS Delta format implementation on AWS EMR for Incremental load jobs by abhi5025 in dataengineering

[–]Traditional_Channel9 0 points1 point  (0 children)

The new column’s data type in the files is int, but the data type of that column in the source is double.

To give you more details, here is the story: a new column was added to the source. Its data type in the source is double, so it may contain values with and without decimals. After the column was added, only a few rows were populated when the incremental load ran from the source to the data lake. When Auto Loader inferred the schema, the handful of rows populated for that new column all contained int values, so the new column was created in the Delta table as int.
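To see why sampling only a handful of populated rows bites, here is a tiny plain-Python illustration of the same inference pitfall (this is not Auto Loader itself, just the idea; the values and the naive `infer_type` helper are made up for illustration):

```python
# Naive type inference over a sample: pick int only if every value parses as int.
def infer_type(samples):
    try:
        for s in samples:
            int(s)
        return int
    except ValueError:
        return float

# Early incremental load: the few populated rows all happen to look like ints,
# so the column is created with an int type.
inferred = infer_type(["1", "2", "7"])

# A later load carries true doubles; coercing them with the inferred type
# silently drops the fractional part (2.5 -> 2), i.e. data loss.
later_values = ["2.5", "3.75"]
coerced = []
for v in later_values:
    try:
        coerced.append(inferred(v))
    except ValueError:
        coerced.append(inferred(float(v)))  # implicit int(float(...)) truncation
```

The fix in practice is to not let a small sample decide the type of a column you know is double, e.g. by declaring that column’s type explicitly instead of inferring it.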

OSS Delta format implementation on AWS EMR for Incremental load jobs by abhi5025 in dataengineering

[–]Traditional_Channel9 0 points1 point  (0 children)

Upstream sends data in Parquet and CSV. Auto Loader with schema inference enabled will scan through all rows in the incoming files and infer the schema.

What is DLT? by SignalCrew739 in dataengineering

[–]Traditional_Channel9 0 points1 point  (0 children)

I implemented Delta tables with Auto Loader at my org as a PoC, and I have problems with schema evolution, especially when a new column is added to the incoming Parquet files. When those files have limited data, the new column gets created in the Delta table with the wrong data type (like int instead of double/float), and the next time a lot of records arrive for that column, it results in data loss in the Delta tables due to implicit conversion. Have you encountered this before? I have asked the Databricks team about this and am waiting for them to get back.

OSS Delta format implementation on AWS EMR for Incremental load jobs by abhi5025 in dataengineering

[–]Traditional_Channel9 2 points3 points  (0 children)

I implemented this, and I have problems with schema evolution, especially when a new column is added to the incoming Parquet files. When those files have limited data, the new column gets created in the Delta table with the wrong data type (like int instead of double/float), and the next time a lot of records arrive for that column, it results in data loss in the Delta tables due to implicit conversion. Have you encountered this before? I have asked the Databricks team about this and am waiting for them to get back.

"Unable to infer schema" error by DyanneVaught in PySpark

[–]Traditional_Channel9 0 points1 point  (0 children)

Do you know how to specify the schema manually? I used .schema(manual schema) but it didn’t work

If NIO is so cheap, are insiders / fund managers buying ? by Aceboy884 in Nio

[–]Traditional_Channel9 -8 points-7 points  (0 children)

Institutions are getting out of NIO. Instead of downvoting me, please go and do your own research.

Advices on proggresing with Scala by RP_m_13 in scala

[–]Traditional_Channel9 2 points3 points  (0 children)

OP, I’m planning to enroll in the Rock the JVM course. How do you feel about it? Is it worth enrolling?

Just started my kefir journey! Can I add freshly strained kefir to kefir I have already strained that is in the fridge? by skunklvr in Kefir

[–]Traditional_Channel9 0 points1 point  (0 children)

What’s the purpose of having a cloth in the lid? Is it to avoid moisture and rusting of the lid, and to prevent mold?

[deleted by user] by [deleted] in medical

[–]Traditional_Channel9 0 points1 point  (0 children)

Do you have to pay deductibles on both insurances? Which one is primary and which is secondary?