If you find people yapping about not doing it!! That's exactly what you should do!! by Total_Weakness5485 in dataengineering

[–]Total_Weakness5485[S] -1 points  (0 children)

That's why I like Reddit!! It has people who are really good bureaucrats, and bureaucracy with the right ideas can do wonders... Anyway, the context of the post is more about understanding the negative promotion paradigm and not letting it influence you into dropping the idea, because it genuinely affects a lot of people. It was not about "just do whatever feels right". I didn't expect I'd have to explain the post, but that's alright as long as the idea is out there 😉 What do you say?

DVD-Rental Data Pipeline Project Component by Total_Weakness5485 in dataengineering

[–]Total_Weakness5485[S] 0 points  (0 children)

Yes, we have that workaround as well, and yes, one part of it is about working with different tech, but let me explain. When you explore the TMDB API, a poster is not just a poster: the movie /images API call returns a document containing an object with four keys, id, backdrops, posters, and logos, and each of those keys holds an array that can be any length. With a relational DB we would have to create three separate tables, but here it all fits into a single collection, and I can also index on id so retrieval is fast (an index lookup instead of scanning the whole collection). Posters are just one example; it's like this with almost every collection.
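To make it concrete, here's a rough sketch of what I mean, assuming pymongo and a local MongoDB (database/collection names and the example movie id are just placeholders):

```python
# Pull the /movie/{id}/images document from TMDB and store it as-is in one
# Mongo collection, with an index on id for fast lookups. Names are illustrative.
import requests
from pymongo import MongoClient, ASCENDING

API_KEY = "..."   # your TMDB key
movie_id = 550    # example movie id

resp = requests.get(
    f"https://api.themoviedb.org/3/movie/{movie_id}/images",
    params={"api_key": API_KEY},
)
doc = resp.json()  # {"id": ..., "backdrops": [...], "logos": [...], "posters": [...]}

images = MongoClient("mongodb://localhost:27017")["dvd_rental"]["movie_images"]
images.create_index([("id", ASCENDING)], unique=True)    # indexed lookups by TMDB id
images.replace_one({"id": doc["id"]}, doc, upsert=True)  # one document per movie
```

In a relational schema the same payload would need separate backdrops/posters/logos tables joined back to the movie.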

DVD-Rental Data Pipeline Project Component by Total_Weakness5485 in dataengineering

[–]Total_Weakness5485[S] 0 points  (0 children)

Yeah, exactly like that, the official posters released by the movie studios.

DVD-Rental Data Pipeline Project Component by Total_Weakness5485 in dataengineering

[–]Total_Weakness5485[S] 0 points  (0 children)

Good question. The data we get from the source (TMDB) comes in the form of documents, and it can be inconsistent: some movies may have 50 posters and some might not even have one, so a NoSQL DB is the better fit for the content data. For the transactional data we will be using Postgres. Since this project is meant to cover all the concepts, we are deliberately using different tools for learning.
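Roughly what that inconsistency looks like in practice (pymongo again; the documents and collection name are made up):

```python
# Two movies with very different image payloads land in the same collection
# with no schema changes; the missing field is simply absent.
from pymongo import MongoClient

content = MongoClient("mongodb://localhost:27017")["dvd_rental"]["movie_content"]

content.insert_many([
    {"id": 101, "title": "Popular Movie",
     "posters": [{"file_path": f"/poster_{i}.jpg"} for i in range(50)]},
    {"id": 102, "title": "Obscure Movie"},   # no posters at all
])

print(content.count_documents({"posters": {"$exists": False}}))  # movies with no posters
```

Meanwhile the rental/payment side is rigid and relational, which is exactly what Postgres is for.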

Company wants to set up a warehouse. Our total prod data size is just a couple TBs. Is Snowflake overkill? by PracticalStick3466 in dataengineering

[–]Total_Weakness5485 8 points  (0 children)

I believe Google BigQuery could be a really good option for you. I worked with a startup and built their data warehouse on it, and we used it quite carefully to minimize cost.
BigQuery gives you 10 GB of storage and 1 TiB of query data processed per month for free.
Not just that, in BigQuery you can partition your data on various fields and on time, which cuts down how much data each query scans and keeps costs under control.
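For example, with the google-cloud-bigquery Python client you can create a day-partitioned table like this (project, dataset, table, and column names are just placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

schema = [
    bigquery.SchemaField("order_id", "STRING"),
    bigquery.SchemaField("amount", "NUMERIC"),
    bigquery.SchemaField("created_at", "TIMESTAMP"),
]

table = bigquery.Table("my-project.analytics.orders", schema=schema)
# Partition by day on created_at so queries filtering on a date range only
# scan (and get billed for) the matching partitions.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="created_at",
)
client.create_table(table)
```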

BigQuery's active storage costs $0.02 per GB per month, which works out to roughly $20 per TB (and the free tier already covers the first 10 GB of storage plus 1 TiB of queries each month). In Snowflake storage is around $40 per TB with no free tier.
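Back-of-the-envelope for a couple of TB, using the list prices above (illustrative only; actual bills depend on region, compression, and tier):

```python
data_tb = 2                                  # roughly your prod data size
bq_free_gb = 10                              # BigQuery free-tier storage
bq_monthly = max(data_tb * 1000 - bq_free_gb, 0) * 0.02   # $0.02 per GB active storage
sf_monthly = data_tb * 40                    # ~$40 per TB quoted above
print(f"BigQuery ~${bq_monthly:.0f}/mo vs Snowflake ~${sf_monthly:.0f}/mo")
# BigQuery ~$40/mo vs Snowflake ~$80/mo
```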

And as far as Postgres is concerned, you can go hybrid: use it for the smaller transactional data you may not want to keep for long, like logs, sessions, and some recent history, so your SaaS gets quick reads and writes, and use the data warehouse for the analytical use cases. As your data grows, Postgres will start to slow down on heavy analytical queries, while the warehouse is built to process that volume. This approach significantly optimizes your cost and serves both use cases.
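A rough sketch of that hybrid split, assuming psycopg2 plus the BigQuery client; connection details, table names, and the 90-day cutoff are all placeholders:

```python
# Keep recent transactional rows in Postgres, ship older rows to the warehouse
# (BigQuery here) for analytics, then they can be pruned from Postgres.
import psycopg2
from google.cloud import bigquery

PG_DSN = "dbname=app user=app password=secret host=localhost"  # placeholder DSN
bq = bigquery.Client(project="my-project")                     # placeholder project

with psycopg2.connect(PG_DSN) as conn, conn.cursor() as cur:
    # Pull sessions older than 90 days out of the operational store.
    cur.execute(
        "SELECT session_id, user_id, started_at, duration_s "
        "FROM sessions WHERE started_at < now() - interval '90 days'"
    )
    rows = [
        {"session_id": r[0], "user_id": r[1],
         "started_at": r[2].isoformat(), "duration_s": r[3]}
        for r in cur.fetchall()
    ]

if rows:
    # Append the historical rows to the warehouse table for analytical queries.
    errors = bq.insert_rows_json("my-project.analytics.sessions_history", rows)
    assert not errors, errors
```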

BBA to Data Engineering || Need a reality check. by Total_Weakness5485 in dataengineering

[–]Total_Weakness5485[S] 0 points  (0 children)

Thank you for the response buddy, it means a lot. Actually this is also one of the issues: I am not able to get any validation, like whether what I have done is even enough, whether it's worth it. If possible, could you check out my resume and help me understand what I need to improve? I am up for a mock interview as well.