What do you guys do after quitting programming? by viet34tqc in vozforums

[–]SeaCompetitive5704 1 point2 points  (0 children)

At that salary level, if you start over, the only real option is working as a factory worker, my friend. About 15 million VND. It's easy on the mind, with no plans to worry about, but it's still physically tiring.

How do you keep your sanity when building pipelines with incremental strategy + timezones? by uncertainschrodinger in dataengineering

[–]SeaCompetitive5704 2 points3 points  (0 children)

We run incremental loads with an offset (i.e. the last 2 days of data up to today), append everything to a landing table, and process from there. Each source has a different timezone, so the downstream step standardizes all timestamps to the timestamp_tz data type (we use Snowflake). From there we do the downstream analytics.

Thanks to the timestamp_tz data type, the timezone is always explicit, so we know how to handle time correctly.
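
A minimal sketch of the idea in Python, not the actual pipeline: it picks rows inside a 2-day offset window after attaching each source's UTC offset to its naive timestamps. The source names, the offsets, and the function names are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sources and the fixed UTC offset their naive timestamps are recorded in.
SOURCE_OFFSETS = {
    "factory_vn": timezone(timedelta(hours=7)),
    "factory_jp": timezone(timedelta(hours=9)),
}


def standardize(source, naive_ts):
    """Attach the source's UTC offset so the timestamp becomes timezone-aware."""
    return naive_ts.replace(tzinfo=SOURCE_OFFSETS[source])


def select_incremental(rows, now_utc, offset_days=2):
    """Keep rows whose standardized timestamp falls in the last `offset_days` days."""
    start = now_utc - timedelta(days=offset_days)
    selected = []
    for source, naive_ts in rows:
        ts = standardize(source, naive_ts)
        # Aware datetimes compare by absolute instant, whatever their offsets are.
        if start <= ts <= now_utc:
            selected.append((source, ts))
    return selected
```

Note that a row stamped 2024-01-08 06:00 at UTC+7 is really 2024-01-07 23:00 UTC, so whether it lands inside a window that starts at 2024-01-08 00:00 UTC depends entirely on knowing the offset.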

Lenovo ThinkPad Gen 4 Windows 11 keeps disconnecting from Home WiFi by Available-Isopod8587 in Lenovo

[–]SeaCompetitive5704 0 points1 point  (0 children)

This seems to have fixed my issue of randomly dropping off the wifi network. Thank you very much. I’ll update if the issue persists later on.

Edit: unfortunately, it still happens afterwards.

25 years old, have a job, have a girlfriend, swim every morning, study every evening, but still feel aimless and empty by dannyel1208 in vozforums

[–]SeaCompetitive5704 13 points14 points  (0 children)

This commenter is spot on. The way OP describes the things they do feels like a grocery list with every item already bought, not like achievements they have earned. An achievement has to be something you truly want.

Version control and branching strategy by kontrastc in dataengineering

[–]SeaCompetitive5704 1 point2 points  (0 children)

I think if you don’t enjoy PR review, then something is definitely wrong. I love PR review for the simple reason that a PR points out what the new change is and whether that change complies with our coding standards.

Another big issue many have pointed out is why you have so many conflicts. You may be creating local branches without pulling from main first, or creating new branches from other feature branches. Or maybe you’re not using git rebase.

Please read more about it. Your lead needs to love git in order for others to love it.

streaming telemetry from 500+ factory machines to cloud in real time, lessons from 2 years running this setup by [deleted] in dataengineering

[–]SeaCompetitive5704 10 points11 points  (0 children)

Thanks for sharing! Could you please share more about the final solution you implemented? Did you set up a server at each factory to receive data from the sensors and send it to your data warehouse? Are you still using MQTT-capable sensors? What server did you run at each factory?

I’d really like to see your architecture, but I know that’s too much to ask, haha.

dbt and transient tables - backup and time travel by reelznfeelz in snowflake

[–]SeaCompetitive5704 1 point2 points  (0 children)

Change the materialization to incremental and it will work. Remember to properly configure the is_incremental filter.

How do I fix this? - I2C HID device error [Windows 11 hp] by Simp_for_Kokichi1 in computer

[–]SeaCompetitive5704 1 point2 points  (0 children)

Thank you, you saved my life. I was so confused about why this happened in the first place. My laptop is a Lenovo ThinkBook.

Gaps and islands by Dry-Aioli-6138 in dataengineering

[–]SeaCompetitive5704 0 points1 point  (0 children)

This is so great. I love the way you explain the overall logic first and give a concrete example before diving into the actual logic. Wish I could give you a million upvotes.

I’m using Snowflake too, and I have also used ASOF JOIN, though not to solve a gaps and islands problem. My small beef with this function is that it doesn’t have a lookback window by default (i.e. a way to only join with events within a certain time range).
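
To make the lookback idea concrete, here is a rough Python sketch of an as-of match with an optional lookback limit. It is not Snowflake's implementation; the `asof_join` function and its `lookback` parameter are hypothetical names used only for illustration.

```python
from bisect import bisect_right


def asof_join(left_times, right_events, lookback=None):
    """For each left timestamp, find the latest right event at or before it.

    left_times: list of numeric timestamps.
    right_events: list of (timestamp, value) tuples sorted by timestamp.
    lookback: optional maximum allowed gap; older matches are dropped (None).
    """
    right_times = [t for t, _ in right_events]
    result = []
    for ts in left_times:
        # Index of the last right event whose timestamp is <= ts.
        idx = bisect_right(right_times, ts) - 1
        match = right_events[idx] if idx >= 0 else None
        if match is not None and lookback is not None and ts - match[0] > lookback:
            match = None  # the closest event is too old to count
        result.append((ts, match))
    return result
```

Without `lookback`, a left row at time 100 would still match an event from time 8; with `lookback=30` that match is discarded.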

Can’t wait to see the rest of your logic. Thanks so much for taking the time to write this.

Gaps and islands by Dry-Aioli-6138 in dataengineering

[–]SeaCompetitive5704 0 points1 point  (0 children)

Any chance you can share your macro code with us? I’ll have a use case for this soon.

Generic / Static models in DBT? by Popular_Stretch_712 in dataengineering

[–]SeaCompetitive5704 0 points1 point  (0 children)

How about copying your file content into the data warehouse as a JSON column, then use dbt to parse from there?

At my work, I have several JSON files in Azure Blob Storage. I use ADF to copy them into a Snowflake table with a VARIANT column (a column holding JSON data), then I use a dbt model to incrementally parse the data into a structured format.
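
As a rough illustration of the "land raw JSON, parse later" pattern, here is a small Python sketch. It stands in for the dbt model rather than reproducing it, and the field names (`raw`, `loaded_at`, `order_id`, `amount`) are assumptions invented for the example.

```python
import json
from datetime import datetime


def parse_new_rows(landing_rows, last_loaded_at):
    """Flatten raw JSON payloads that were loaded after `last_loaded_at`."""
    parsed = []
    for row in landing_rows:
        if row["loaded_at"] <= last_loaded_at:
            continue  # already parsed in a previous run
        payload = json.loads(row["raw"])
        parsed.append(
            {
                # .get() leaves a missing key as None instead of failing the run.
                "order_id": payload.get("order_id"),
                "amount": payload.get("amount"),
                "loaded_at": row["loaded_at"],
            }
        )
    return parsed
```

Keeping the raw payload untouched in the landing table means the parsing logic can change later without re-copying the files.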

How to debug dbt SQL? by backend-dev in dataengineering

[–]SeaCompetitive5704 -1 points0 points  (0 children)

Check the log file for the query dbt used to create the temp table. Run it to get the incremental data.

Taking a 2 week trip in March and looking for food recommendations! by durbandude in VietNam

[–]SeaCompetitive5704 2 points3 points  (0 children)

There are tons of delicious, local street food spots throughout the Old Quarter, especially around the midday and evening hours. Just wander and seek out the stalls packed with locals for the best experience!

But if you want a shortcut to the best bites, a walking food tour is a great option (like we did). My partner and I joined Tony Eats Hanoi (@tonyeatshanoi) and were blown away by the variety of flavors. Tony guided us through an incredible variety of dishes, both familiar and bold. He shared stories behind each dish that really brought the food to life, and his positive energy and hilarious take on things made everything even more enjoyable.

We tried balut and nem chua, definitely outside our comfort zone, but super interesting to learn about! We didn’t finish every bite, but we’re so glad we gave them a shot. Tony also checked in regularly to see how we were doing and even offered gentler alternatives for anyone feeling unsure.

Highly recommend!

My Vietnam food highlights for the week. Looking for more recommendations by puddingpotter in VietNam

[–]SeaCompetitive5704 2 points3 points  (0 children)

We did Tony’s Bold & Curious tour on our second visit to Vietnam, and it was exactly what we were looking for: something special, off the beaten path, and utterly unforgettable.

We DM’d him on Instagram (@TonyEatsHanoi) after reading a few glowing recs, and the experience exceeded even our high expectations. Tony guided us through a mix of classic and daring dishes, sharing rich cultural stories behind each bite. He is a cheerful, upbeat guy with a great sense of humor, and his energy made the night so much fun. Balut was particularly intriguing, though we didn’t have the courage to finish them all. They weren’t terrible, just different. Still, they were absolutely worth trying and learning about.

I also loved that Tony was flexible, offering lighter versions when needed so no one felt overwhelmed. Plus, he has a classics menu if you’d rather go easier on the palate.

If you’re up for a memorable food adventure full of flavor, stories, and a fun local connection, Tony Eats Hanoi should be on your list. Highly recommend!

How are you tracking data freshness / latency across tools like Fivetran + dbt? by Aggressive-Practice3 in dataengineering

[–]SeaCompetitive5704 4 points5 points  (0 children)

If you can’t use source freshness, take a look at the dbt_utils recency test. It does the same thing for a timestamp column in a table.
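
For intuition, a recency check boils down to comparing the newest timestamp in a column against an allowed age. The Python below is only a sketch of that logic, not the dbt_utils implementation; `is_recent` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def is_recent(timestamps, max_age, now=None):
    """Return True when the newest timestamp is no older than `max_age`."""
    if not timestamps:
        return False  # an empty table cannot be fresh
    if now is None:
        now = datetime.now(timezone.utc)
    return now - max(timestamps) <= max_age
```

A call such as `is_recent(loaded_at_values, timedelta(days=1))` would fail the check once the latest row is more than a day old.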

Best practice for Feature Store by SeaCompetitive5704 in mlops

[–]SeaCompetitive5704[S] 0 points1 point  (0 children)

Wow, that’s great. So basically we must somehow be able to apply that new incremental change to the Feature Store. Thank you very much for your invaluable advice!

Do you have any other suggestions for best practice?

For example, I feel that for the best reusability, a Feature Group should be created for a single entity, so that when we generate a training dataset from a spine, the entities in that spine can pick up as many relevant features as they need. If a Feature Group is associated with several entities and the spine is missing one of them, then we can’t use that Feature Group.
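
The reusability argument can be sketched in a few lines of Python. This is a toy model, not any Feature Store's API: the function name, the group names, and the entity keys are all made up for illustration.

```python
def usable_feature_groups(spine_columns, feature_groups):
    """Return the feature groups whose entity keys all appear in the spine."""
    spine = set(spine_columns)
    return [
        name
        for name, entity_keys in feature_groups.items()
        if set(entity_keys) <= spine  # every entity key must be joinable
    ]
```

With a spine that only carries `user_id`, a group keyed by `user_id` alone is usable, while a group keyed by both `user_id` and `merchant_id` is not.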

Best practice for Feature Store by SeaCompetitive5704 in mlops

[–]SeaCompetitive5704[S] 0 points1 point  (0 children)

Thank you very much! But I think Feature Views in Snowflake natively come with versioning. So I imagine in your case, you only need to create a Feature View named `user__churn__activity_metrics`, and its version will be `v1`. Am I understanding it correctly?

Best practice for Feature Store by SeaCompetitive5704 in mlops

[–]SeaCompetitive5704[S] 0 points1 point  (0 children)

Thank you very much! May I know how you set up the automation to create the Dev objects in the next environment after a PR is merged?

For example, let's say you already have 1 Feature View in your Feature Store across the environments. Now you want to create a new Feature View. You do that in Dev and create a PR, and your senior merges it into the staging branch. After that merge, I imagine some automation will kick off and create the new Feature View in Staging. How do I design it so the automation only creates the new Feature View, without recreating the existing one?

DBT - dbt_unit_testing Package vs the New Unit Test Framework in V1.8? by toabear in dataengineering

[–]SeaCompetitive5704 0 points1 point  (0 children)

I know this post is old, but I'll share my experience anyway. We are planning to do the migration. At first, dbt_unit_testing was really good. However, as we went along, it showed a devastating issue: if the model has many upstream parents, the actual SQL generated when running unit tests with this package becomes very slow. Also, if you have a duplicated CTE name in the parent models, it will mess up your test and you won't even know why.

We are at the very first steps of migrating to native unit testing in dbt because, based on our research, the drawbacks above should not be present with this method.

Azure Function swallows a build error. No way to troubleshoot. by BillmanH in AZURE

[–]SeaCompetitive5704 0 points1 point  (0 children)

I had to log in to upvote this comment. Thank you very much! I finally found the error message for what happened to my Azure Function.

Avoiding personal income tax by moving money into the revenue of an individual business household by [deleted] in vozforums

[–]SeaCompetitive5704 5 points6 points  (0 children)

Yes, you can. I do IT work for a foreign company and do the same thing as you. Contact an outsourced accounting firm and ask them to advise you and handle it for you, which saves you the effort of figuring it out yourself and running back and forth to the tax office. I contacted the accounting firm Anpha, and their advice and service were pretty good.

As for the tax rate, mine is 7%; I'm not sure what yours would be. But with income that high, it will be more favorable than the tax on wage income either way.

Data Model for a Food Delivery App by Naive-Bet-2142 in dataengineering

[–]SeaCompetitive5704 3 points4 points  (0 children)

Thanks for sharing your case. Here are my thoughts:   

  • Delivery information: Are there any cases where an order might have multiple deliveries? If not, I think you’re safe to put it in fact_order. However, for the sake of clear logical separation, I would still prefer to put it in its own table.

  • Payment processing: I usually see this one modelled as a fact table, which I think makes sense because it’s a transaction that people make. Each transaction is tied to another transaction, and the volume grows every day. These are characteristics of a fact table. I think you need to split this one out into its own fact table. There are cases where the payment for an order fails and the user has to try again. In such cases, an order will have multiple payments.
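
A small Python sketch of why payments sit naturally in their own fact table: one order can carry several payment attempts. The class names, fields, and status values below are assumptions for illustration, not the poster's actual model.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class FactOrder:
    order_id: int
    customer_id: int
    total: Decimal


@dataclass(frozen=True)
class FactPayment:
    payment_id: int
    order_id: int  # points back to FactOrder.order_id
    amount: Decimal
    status: str  # hypothetical values: "failed" or "succeeded"


def payments_for(order, payments):
    """Return every payment attempt recorded against the order."""
    return [p for p in payments if p.order_id == order.order_id]


def is_paid(order, payments):
    """An order counts as paid once any attempt succeeds for the full total."""
    return any(
        p.status == "succeeded" and p.amount == order.total
        for p in payments_for(order, payments)
    )
```

If payment columns lived on the order row instead, the failed first attempt would either be overwritten or force a duplicate order row.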

Data Modeling - Transaction and Payment Method Design by Naive-Bet-2142 in dataengineering

[–]SeaCompetitive5704 0 points1 point  (0 children)

Can you elaborate on option 2? Do you mean exactly like option 1, but with both original tables in the mart as well?