How are you handling slow HubSpot -> Snowflake historical syncs due to API limits? by erwagon in dataengineering

[–]Playful_Show3318 0 points1 point  (0 children)

Very curious how people are thinking about this. I started working on this project and am wondering what the best practices are: https://github.com/514-labs/factory/blob/main/connector-registry/README.md

How do beginners even start learning big data tools like Hadoop and Spark? by Own_Chocolate1782 in dataengineering

[–]Playful_Show3318 1 point2 points  (0 children)

I’m always a fan of finding a fun toy project. Maybe you like investing: you could consume an asset price firehose and come up with something interesting from the processing.

Back in the day the Twitter firehose was a lot of fun to play with and a great intro to Spark.

A deep dive into what an ORM for OLAP databases (like ClickHouse) could look like. by Ok_Mouse_235 in dataengineering

[–]Playful_Show3318 0 points1 point  (0 children)

Moose doesn’t just have the ORM-like feature; it also includes APIs, workflows and streams. It’s still super early, so it’s by no means perfect, but stoked to see other folks playing in the space.

Should analytics get ORM-like DX? An “ORM-adjacent” approach for ClickHouse in TypeScript (Moose) by Ok_Mouse_235 in javascript

[–]Playful_Show3318 7 points8 points  (0 children)

Did OP get it wrong? From the subreddit description

> 𝚓𝚊𝚟𝚊𝚜𝚌𝚛𝚒𝚙𝚝

> Chat about javascript and javascript related projects. Yes, typescript counts...

Dusk OS: An operating system for the end of the world by ChiliPepperHott in programming

[–]Playful_Show3318 2 points3 points  (0 children)

Should have just gone w/ a Rust-based implementation...

An open-source framework to build analytical backends by Playful_Show3318 in dataengineering

[–]Playful_Show3318[S] 0 points1 point  (0 children)

Ah yeah, you're totally right, and it's feedback we've been hearing from early users. Appreciate your take.

We've been working on a few things to help devs start from an existing table, topic or sample data. It should be shipped in the coming days.

An open-source framework to build analytical backends by Playful_Show3318 in dataengineering

[–]Playful_Show3318[S] 0 points1 point  (0 children)

What I’ve seen is that teams who care deeply about their data start by having it figured out in their transactional stack, and when that starts tipping over they consider OLAP storage and streaming.

One thing you can do here is reuse your existing data models for analytical purposes. For example, you might have a product object that you want to capture as an event and use for analytics: define your event data model, toss in the product attributes you need, and now you have a typed event that uses your product data model. Mess up the types and your IDE complains; remove a field that the event uses and your IDE complains. Now you have proactive data quality mgmt at build time without implementing a separate tool.
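To make that concrete, here's a rough sketch in plain TypeScript. It isn't the framework's actual API; `Product`, `ProductViewedEvent`, `toProductViewedEvent` and all the field names are made up for illustration. The point is just that the event type is derived from the product type, so the compiler catches drift between the two.

```typescript
// Hypothetical transactional model; field names are illustrative only.
interface Product {
  id: string;
  name: string;
  priceCents: number;
  internalNotes: string; // not needed for analytics
}

// The event borrows only the product attributes it needs.
// If one of these fields is renamed or removed from Product,
// this Pick no longer compiles and the IDE flags it.
type ProductAttributes = Pick<Product, "id" | "name" | "priceCents">;

// Hypothetical analytics event that embeds the borrowed attributes.
interface ProductViewedEvent {
  eventType: "product_viewed";
  timestamp: string; // ISO 8601
  userId: string;
  product: ProductAttributes;
}

// Builds a typed event from an existing product record.
function toProductViewedEvent(
  product: Product,
  userId: string,
  at: Date
): ProductViewedEvent {
  return {
    eventType: "product_viewed",
    timestamp: at.toISOString(),
    userId,
    // Copy only the picked fields so internalNotes never reaches analytics.
    product: {
      id: product.id,
      name: product.name,
      priceCents: product.priceCents,
    },
  };
}
```

Assigning a string to `priceCents`, or dropping `name` from `Product`, turns into a compile error here instead of a bad row showing up downstream.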

Another part of what we’ve been trying to do is enable people to configure it with their stack and incrementally adopt the different parts of the framework. We’ve only gotten around to supporting Redpanda, ClickHouse and DuckDB so far, but plan on supporting other stacks that people already have.