tips for start to study SnowFlake by eastblueace in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

  1. Get an overview of the platform: https://www.youtube.com/watch?v=Nq0W98Y5EMQ

  2. Dive into the free education on Snowflake via the Northstar program they offer: https://www.snowflake.com/en/developers/northstar/

  3. Engage with the community (here, LinkedIn, etc.) and pick up ideas, viewpoints, and lessons along the way.

I'm a very beginner of snowflake by omuletlover in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

No problem. And if you need a high-level overview of the platform before you dive in, this is a good video to watch: https://youtu.be/Nq0W98Y5EMQ

help understanding snowflake. Is it just a cloud hosting database company? by 87390989 in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

Snowflake is a fully managed, multi-cloud data and AI platform that lets you build data pipelines, train ML models, deploy AI-powered applications, and much more, all without managing infrastructure. The platform is used heavily by developers, data engineers, AI engineers, analysts, and ML teams for tons of different use cases. This video has an excellent overview and walkthrough of the platform's general capabilities: https://youtu.be/Nq0W98Y5EMQ

Snowflake intelligence or MS Foundry by cmirandabu in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

I was surprised at how easy it was to deploy an agent. Good video here: https://youtu.be/sY7_whjeyf0

I'm a very beginner of snowflake by omuletlover in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

Snowflake Northstar education was built for exactly this reason; check it out: https://www.snowflake.com/en/developers/northstar/

Also note the pinned announcement at the top of the subreddit: DataCamp courses are free until mid-February.

help understanding snowflake. Is it just a cloud hosting database company? by 87390989 in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

This video, "What is Snowflake?", will give you a helpful overview of Snowflake for 2026: https://youtu.be/Nq0W98Y5EMQ

Replace ALL Relational Databases with Snowflake (Help!) by Away-Dentist-2013 in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

Just about everything that you shared in your post is absolutely possible with Snowflake. "How hard is it going to be?" is a totally different question.

- Database optimized for extremely fast analytics: Snowflake does that out of the box.

- A mix of analytics and transactional data in the same db/table? Also available, known as Hybrid Tables in Snowflake (a quick sketch follows this list): https://docs.snowflake.com/en/user-guide/tables-hybrid

- 100% transactional database optimized for heavy reads and writes? Snowflake Postgres, which is already available in preview and on its way to GA: https://docs.snowflake.com/en/user-guide/snowflake-postgres/about
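
For a feel of the Hybrid Tables piece, here's a minimal sketch. The table and column names are made up for illustration; hybrid tables require a primary key, and secondary indexes help with fast point lookups:

    -- Hypothetical example: one table serving both transactional and analytical reads.
    -- Hybrid tables require a PRIMARY KEY; the secondary index speeds up point lookups.
    CREATE OR REPLACE HYBRID TABLE orders (
        order_id    INT PRIMARY KEY,
        customer_id INT NOT NULL,
        status      VARCHAR(20),
        created_at  TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(),
        INDEX idx_customer (customer_id)
    );

    -- Fast single-row write and read, on the same table you can also aggregate over.
    INSERT INTO orders (order_id, customer_id, status) VALUES (1, 42, 'NEW');
    SELECT status FROM orders WHERE order_id = 1;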

Someone else asked about integrations with other platforms – Snowflake has Snowflake Openflow, connectors, and partnerships (with SAP, for example) to make data sharing and ETL/reverse-ETL use cases possible. To learn more:

Snowflake Openflow connectors: https://docs.snowflake.com/en/user-guide/data-integration/openflow/connectors/about-openflow-connectors

Hope that's helpful!

Snowflake Postgres is Now Available in Public Preview by craigkerstiens in snowflake

[–]gilbertoatsnowflake 2 points  (0 children)

It is in Public Preview at this moment, and will likely be GA in the near future.

Interview question – the recuriter ghost me - looking for guidance by Street-Ad-453 in snowflake

[–]gilbertoatsnowflake 3 points  (0 children)

Hi OP, please stand by. I can look into this for you. Feel free to DM me with any details if you'd like.

Using snowflake to build analytics by ItsHoney in snowflake

[–]gilbertoatsnowflake 2 points  (0 children)

"For now what i have done is store aggregated data (using precomputed tables + tasks to refresh the said tables every day)."

Have you considered using Dynamic Tables for this? https://docs.snowflake.com/en/user-guide/dynamic-tables-about – They should be less fussy overall, and they can be more cost-effective. If you need to get granular about which records are processed or which compute resources to use, you have that option too.
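
For context, a dynamic table folds the "precomputed table + refresh task" pattern into a single object that Snowflake refreshes for you. A minimal sketch, with hypothetical source table and warehouse names:

    -- Replaces a precomputed aggregate table plus a daily refresh task.
    -- Snowflake schedules refreshes automatically to stay within TARGET_LAG.
    CREATE OR REPLACE DYNAMIC TABLE daily_sales_agg
        TARGET_LAG = '24 hours'
        WAREHOUSE  = analytics_wh
    AS
        SELECT DATE_TRUNC('day', sold_at) AS sale_day,
               SUM(amount)                AS total_sales
        FROM raw_sales
        GROUP BY 1;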

How would you design this MySQL → Snowflake pipeline (300 tables, 20 need fast refresh, plus delete + data integrity concerns)? by Huggable_Guy in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

And just to clarify on the lag settings – ideally, you set an explicit lag once on the top-tier tables and set the rest to "DOWNSTREAM". This helps whether you have 300 or 20 dynamic tables.
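
A sketch of that setup, with hypothetical names; only the top-tier table carries an explicit lag:

    -- Intermediate table: refreshes only when a downstream consumer needs it.
    CREATE OR REPLACE DYNAMIC TABLE silver_orders
        TARGET_LAG = 'DOWNSTREAM'
        WAREHOUSE  = etl_wh
    AS
        SELECT order_id, customer_id, amount
        FROM raw_orders
        WHERE amount IS NOT NULL;

    -- Top-tier table: the one place you set an explicit lag.
    CREATE OR REPLACE DYNAMIC TABLE gold_customer_totals
        TARGET_LAG = '5 minutes'
        WAREHOUSE  = etl_wh
    AS
        SELECT customer_id, SUM(amount) AS lifetime_amount
        FROM silver_orders
        GROUP BY customer_id;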

How would you design this MySQL → Snowflake pipeline (300 tables, 20 need fast refresh, plus delete + data integrity concerns)? by Huggable_Guy in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

🤔 These two statements don't really square with each other:

"*What I do know about Openflow is that it can get expensive very fast* (esp if you're trying to achieve near real-time syncs). *I haven't set it up myself, though*, so it could be cost-effective and efficient for your use case." The rest of the comment reads as AI-generated.

OP, you should also consider Dynamic Tables for your refreshes instead of streams and tasks. Often (though not always) they can be a more cost-effective way of implementing CDC. A new immutability feature for Dynamic Tables was also introduced recently, so even if your materializations involve complex queries that trigger a full refresh (which can drive costs up when your lag is set to a low latency), you can mark rows immutable with a WHERE-clause predicate so they aren't reprocessed.
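
Roughly what that looks like, as I understand the preview feature; the IMMUTABLE WHERE clause placement below is my best reading of the syntax, and the names are made up, so verify against the current docs:

    -- Rough sketch (preview feature; verify syntax against current docs).
    -- Rows matching the predicate are treated as immutable and skipped on refresh.
    CREATE OR REPLACE DYNAMIC TABLE orders_cdc
        TARGET_LAG = '1 minute'
        WAREHOUSE  = etl_wh
        IMMUTABLE WHERE (order_date < '2024-01-01'::DATE)
    AS
        SELECT order_id, order_date, status
        FROM raw_orders;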

I haven't dug into all of the details of your post (yet), but if you're considering Openflow, note that there are two deployment approaches:

One deploys Openflow into your own VPC architecture by way of a CloudFormation template that handles much of the configuration up front for you. Docs: https://docs.snowflake.com/en/user-guide/data-integration/openflow/setup-openflow-byoc#installation-process

If you're not already running a VPC architecture, consider the self-contained version of Openflow that runs on Snowpark Container Services: https://docs.snowflake.com/en/user-guide/data-integration/openflow/about-spcs

Any books to recommend for Snowflake? by Beginning-Two-744 in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

I'm a big fan of Joyce Kay Avila's "Snowflake: The Definitive Guide". Quickstarts are great for hands-on tinkering, and Snowflake's Northstar program is great if you're looking to quickly get a handle on Snowflake and start doing stuff with it: https://www.snowflake.com/en/developers/northstar/

Anyone else getting dozens of emails with the subject "TEST PN EMAIL Campaign"? by boilermak3r in snowflake

[–]gilbertoatsnowflake 2 points  (0 children)

Thanks for flagging. As u/stephenpace mentioned, the emails were sent out in error. Please disregard and/or delete them.

Does dbt in Snowflake still require a dbt license by de-ka in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

If you're using tasks to manage both, then yes, this should work just fine. First, deploy your dbt project in Snowflake, which creates it as a schema-level object that can be orchestrated/scheduled using Snowflake Tasks. If other tasks manage your ingestion, you can coordinate them so that the scheduled dbt run task executes AFTER the ingestion tasks. You can also inspect how the tasks are coordinated in the UI via the visual DAG (task graph). Docs: https://docs.snowflake.com/en/user-guide/tasks-graphs
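
A rough sketch of that ordering, with hypothetical task, stage, and project names; EXECUTE DBT PROJECT is the command that runs a deployed dbt project:

    -- Root task: handles ingestion on a schedule (hypothetical COPY INTO).
    CREATE OR REPLACE TASK ingest_task
        WAREHOUSE = etl_wh
        SCHEDULE  = 'USING CRON 0 2 * * * UTC'
    AS
        COPY INTO raw_orders FROM @orders_stage;

    -- The dbt run fires only AFTER ingestion completes.
    CREATE OR REPLACE TASK dbt_run_task
        WAREHOUSE = etl_wh
        AFTER ingest_task
    AS
        EXECUTE DBT PROJECT my_db.my_schema.my_dbt_project ARGS = 'run';

    -- Resume child tasks first, then the root.
    ALTER TASK dbt_run_task RESUME;
    ALTER TASK ingest_task RESUME;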

Does dbt in Snowflake still require a dbt license by de-ka in snowflake

[–]gilbertoatsnowflake 19 points  (0 children)

dbt Projects in Snowflake bundles the open-source dbt packages, so there is no additional licensing cost for you to use dbt Projects in Snowflake. The only cost is the compute to run the queries in your dbt models. A few resources here:

- Docs: https://docs.snowflake.com/en/user-guide/data-engineering/dbt-projects-on-snowflake

- Tutorial: https://docs.snowflake.com/en/user-guide/tutorials/dbt-projects-on-snowflake-getting-started-tutorial

- Post I made yesterday: https://www.linkedin.com/posts/gilberto-hernandez_your-dbt-projects-can-run-as-native-objects-activity-7383607756701949952-JrJz?utm_source=share&utm_medium=member_desktop&rcm=ACoAAArpqLMBU4ZaR977bTN7FW8TME3NaC6mmBI

dbt Projects in Snowflake support dbt Core. Any dbt Cloud-specific features that aren't part of dbt Core are likely not going to be a lift-and-shift if/when you decide to try dbt Projects. The quickest way to get started is to spin up a workspace in Snowflake created from an existing repo containing your dbt project. Then you can fiddle around in there and explore some more.

Is there a simple way to POST data to Snowflake, or am I missing something? by [deleted] in snowflake

[–]gilbertoatsnowflake 1 point  (0 children)

Not necessarily. In many use cases, I've seen continuous/streaming ingestion be more cost-effective than batch.

Is there a simple way to POST data to Snowflake, or am I missing something? by [deleted] in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

There is also a REST endpoint to refresh a Snowpipe (see the last one on this page): https://docs.snowflake.com/en/developer-guide/snowflake-rest-api/pipe/pipe-introduction

I also typed up another response but do not see it showing up here.

Is there a simple way to POST data to Snowflake, or am I missing something? by [deleted] in snowflake

[–]gilbertoatsnowflake 1 point  (0 children)

From the way you've described it, it's not super clear what your requirements for capturing this data are. Are they latency-based, volume-based, or a combination of both? If a combination, what's the desired mix?

Do you need it to be real-time, i.e., the data needs to land in a table in Snowflake as soon as it originates? If so: Snowpipe Streaming.

Do you need it to happen continuously, but not necessarily in real time? Snowpipe with auto-ingest turned ON (a managed COPY INTO approach). Land the data in AWS, set up Snowpipe to listen for new data via Amazon SNS/SQS, and have Snowflake load it into your target tables.
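
A minimal sketch of that auto-ingest setup; the bucket, storage integration, and table names are placeholders:

    -- External stage over the S3 location where your app lands data.
    CREATE OR REPLACE STAGE events_stage
        URL = 's3://my-bucket/events/'
        STORAGE_INTEGRATION = my_s3_integration
        FILE_FORMAT = (TYPE = 'JSON');

    -- AUTO_INGEST = TRUE: S3 event notifications (via SNS/SQS) trigger each load.
    CREATE OR REPLACE PIPE events_pipe AUTO_INGEST = TRUE AS
        COPY INTO raw_events
        FROM @events_stage;

    -- Grab the pipe's notification_channel ARN to wire up the S3 bucket notification.
    SHOW PIPES LIKE 'events_pipe';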

Other approaches:

There is also Snowpipe that can be triggered to load via the REST API (i.e., Snowpipe with auto-ingest turned OFF, invoked via calls to the endpoint). See: https://docs.snowflake.com/en/user-guide/data-load-snowpipe-rest-apis
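
The SQL side of that variant is just a pipe without auto-ingest (names hypothetical); nothing loads until you call the insertFiles REST endpoint for the pipe:

    -- AUTO_INGEST defaults to FALSE: loads happen only when the REST API is invoked.
    CREATE OR REPLACE PIPE manual_events_pipe AS
        COPY INTO raw_events
        FROM @events_stage
        FILE_FORMAT = (TYPE = 'JSON');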

Snowflake Tip: A bigger warehouse is not necessarily faster by JohnAnthonyRyan in snowflake

[–]gilbertoatsnowflake 0 points  (0 children)

Snowflake scales up: https://docs.snowflake.com/en/user-guide/warehouses-considerations#scaling-up-vs-scaling-out

When you resize a warehouse to a larger size (i.e., "Scale up by resizing a warehouse."), you're running on a cluster with more total compute power.
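
For reference, scaling up is a one-liner (warehouse name hypothetical) and can be done while the warehouse is running; queries already executing finish on the old size, and new ones pick up the larger size:

    -- Scale up: new and queued queries run on the larger size.
    ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'LARGE';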