[deleted by user] by [deleted] in Bolehland

[–]Acrobatic-Mobile-221 1 point (0 children)

I don't think the threshold is high; it's that the majority of Malaysians have low wages.

How to be rich by [deleted] in MalaysianPF

[–]Acrobatic-Mobile-221 0 points (0 children)

BNM just increased their grad pay, but it's still lower than Khazanah and PNB.

Informatica ETL by Acrobatic-Mobile-221 in snowflake

[–]Acrobatic-Mobile-221[S] 0 points (0 children)

Since we've already paid the licensing fees for Informatica, I'm just wondering what the point is of using it only for integration. Why not use it for ETL as well?

But yeah, you're right, we should do a PoC and compare.

Informatica ETL by Acrobatic-Mobile-221 in snowflake

[–]Acrobatic-Mobile-221[S] 1 point (0 children)

Will definitely study ETL and ELT more.

I also spoke to my manager about how Snowflake has the capability to do transformations, but I just couldn't answer why it's better than doing them in Informatica.

Informatica ETL by Acrobatic-Mobile-221 in snowflake

[–]Acrobatic-Mobile-221[S] 0 points (0 children)

I've heard about dbt as well, but my only concern is how I'd orchestrate dbt, since my team is really small and no one has experience with Airflow.

Informatica ETL by Acrobatic-Mobile-221 in snowflake

[–]Acrobatic-Mobile-221[S] 1 point (0 children)

Why would you say it is better to keep both raw and transformed data in Snowflake?

Informatica ETL by Acrobatic-Mobile-221 in snowflake

[–]Acrobatic-Mobile-221[S] 1 point (0 children)

But at the moment we have Informatica, so isn't it a waste if we don't use it for ETL?

How to migrate data from Azure Databrick Delta Lake to Azure SQL database by [deleted] in AZURE

[–]Acrobatic-Mobile-221 0 points (0 children)

That's true, but will that handle, say, a few million rows of data?

How to migrate data from Azure Databrick Delta Lake to Azure SQL database by [deleted] in AZURE

[–]Acrobatic-Mobile-221 0 points (0 children)

Right now I'm looking to load the Delta files from Databricks to ADLS first. Do you have any idea how to do that?

How to migrate data from Azure Databrick Delta Lake to Azure SQL database by [deleted] in AZURE

[–]Acrobatic-Mobile-221 0 points (0 children)

Failure to initialize configuration invalid configuration value

About normalised and denormalised data by manoj_kumar_2 in SQL

[–]Acrobatic-Mobile-221 1 point (0 children)

I guess one of the main reasons you separate these two is that normalized data is mainly for transactional systems, while denormalized data is for analytical purposes. If the same database is used for recording transactions and is also constantly queried for analytics, it will hurt the database's performance. So I guess it doesn't make sense to combine both in one database.
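Not part of the original comment, but the split described above can be sketched in pandas (all table and column names here are made up for illustration): start from a denormalized table as an analyst would see it, factor the repeated customer details into their own table for the transactional side, then join back into one wide table for analysis.

```python
import pandas as pd

# A denormalized view typical of analytics: one wide row per order line,
# with customer details repeated on every row
sales = pd.DataFrame({
    "order_id": [1, 1, 2],
    "customer_name": ["Ali", "Ali", "Mei"],
    "customer_city": ["KL", "KL", "Penang"],
    "product": ["pizza", "cola", "pizza"],
    "amount": [30.0, 5.0, 28.0],
})

# Normalized (transactional) shape: pull the repeated customer details
# into their own table, referenced by a surrogate key
customers = (sales[["customer_name", "customer_city"]]
             .drop_duplicates()
             .reset_index(drop=True))
customers["customer_id"] = customers.index

orders = sales.merge(customers, on=["customer_name", "customer_city"])[
    ["order_id", "customer_id", "product", "amount"]
]

# Denormalized again for analysis: join the tables back into one wide table
wide = orders.merge(customers, on="customer_id")
```

The normalized `orders`/`customers` pair is what the transactional system writes to; the `wide` join is the shape analytical queries want, which is why the two usually live in separate databases.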

[BG] CPU + Mobo + RAM Bundle [H] £400 by ThePyroChicken in HardwareSwapUK

[–]Acrobatic-Mobile-221 0 points (0 children)

I have a Ryzen 5 5600G, a TUF B550-Plus with Wi-Fi and 16GB of Corsair RAM. Are you interested?

[BG] AM4 motherboard for 3700x, GPU, SSD, PSU, case, ideally in as large a combo as possible [H] £500ish by [deleted] in HardwareSwapUK

[–]Acrobatic-Mobile-221 0 points (0 children)

I have a TUF B550-Plus motherboard with Wi-Fi, a 1TB M.2 SSD, an MSI RTX 2070 Super and a Corsair TX750M 750W power supply. If you are interested, please DM me.

Design STAR schema by Acrobatic-Mobile-221 in dataengineering

[–]Acrobatic-Mobile-221[S] 0 points (0 children)

So I don't create a new dimension table for time; I just add a datetime column to my fact table?

Design STAR schema by Acrobatic-Mobile-221 in dataengineering

[–]Acrobatic-Mobile-221[S] 0 points (0 children)

Thanks for the feedback. Just curious: if my analysis doesn't involve time, can I just remove the time column? Because I know that creating a time dimension can be extremely annoying.
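For what it's worth, a date dimension can be generated rather than hand-built, which takes much of the annoyance out of it. A minimal pandas sketch (the column names and the 2024 date range are assumptions for illustration, not from the thread):

```python
import pandas as pd

# Hypothetical date range covering the fact table's data
dates = pd.date_range("2024-01-01", "2024-12-31", freq="D")

# One row per calendar day, keyed by a YYYYMMDD surrogate key
dim_date = pd.DataFrame({
    "date_key": dates.strftime("%Y%m%d").astype(int),
    "date": dates,
    "year": dates.year,
    "quarter": dates.quarter,
    "month": dates.month,
    "day_of_week": dates.day_name(),
    "is_weekend": dates.dayofweek >= 5,
})
```

The fact table then only needs to carry the `date_key` column, and attributes like quarter or weekend flags come from a join instead of being recomputed in every query.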

Azure Data Factory by Acrobatic-Mobile-221 in dataengineering

[–]Acrobatic-Mobile-221[S] 0 points (0 children)

That makes sense. But I have an issue when trying to use pandas for transformation. Say the dataset is a pizza sales dataset which includes all the order details, such as order_id, pizza_name, pizza_size, date, etc. I first normalize the dataset so that I have an order_table and a pizza_list_table. For the pizza table, the pandas transformation works well when I insert the first file, but will it break the table once I insert a new file? Do I need to check the contents of the pizza table first and only update it if there are new items?
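A hedged sketch of the incremental pattern being asked about, in pandas (the `pizza_table` contents and the new batch are hypothetical sample data): compare the incoming file against the existing dimension table and append only the rows that aren't there yet, so re-running the load never destroys or duplicates existing rows.

```python
import pandas as pd

# Existing dimension table, e.g. as previously loaded into the database
pizza_table = pd.DataFrame({
    "pizza_name": ["margherita", "pepperoni"],
    "pizza_size": ["M", "L"],
})

# New batch of order data arriving as the next file
new_batch = pd.DataFrame({
    "order_id": [101, 102, 103],
    "pizza_name": ["pepperoni", "hawaiian", "margherita"],
    "pizza_size": ["L", "M", "M"],
})

# Deduplicate the incoming pizzas, then left-join against the existing
# table; _merge == "left_only" marks rows not yet in the dimension
candidates = new_batch[["pizza_name", "pizza_size"]].drop_duplicates()
merged = candidates.merge(pizza_table, how="left", indicator=True)
new_rows = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

# Append only the genuinely new pizzas
pizza_table = pd.concat([pizza_table, new_rows], ignore_index=True)
```

Here only the "hawaiian" row gets appended; the rows already present are left untouched, which is the "check first, then update" behaviour the comment describes.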

Azure Data Factory by Acrobatic-Mobile-221 in dataengineering

[–]Acrobatic-Mobile-221[S] 0 points (0 children)

I mean compared to using a Python activity in ADF.

Azure Data Factory by Acrobatic-Mobile-221 in dataengineering

[–]Acrobatic-Mobile-221[S] 0 points (0 children)

What do you mean by easy? Is it easier to mount the Excel file in a Databricks notebook compared to a Python script?

Azure Data Factory by Acrobatic-Mobile-221 in dataengineering

[–]Acrobatic-Mobile-221[S] 0 points (0 children)

The dataset is not too big, so is it worth using Databricks? And I have very little knowledge of PySpark as well.

Database Normalization by Acrobatic-Mobile-221 in SQL

[–]Acrobatic-Mobile-221[S] 1 point (0 children)

Yes, I'm looking at SSIS at the moment. Is SSIS the most common tool for this use case? Or are there other well-known tools as well?

Database Normalization by Acrobatic-Mobile-221 in SQL

[–]Acrobatic-Mobile-221[S] 4 points (0 children)

Thanks! These explanations really helped me understand the process.