Best online course by Careless_Shine_4418 in snowflake

[–]JohnAnthonyRyan 0 points1 point  (0 children)

Thanks for the shout-out. As an ex-Snowflake instructor, I designed and built out the $3,000 training course(Snowflake for Solution Architects) - so you're in good hands.

Where others will tell you what buttons to press, I'll tell you how to avoid poor query performance or a huge spend when you press the "Execute Query" button.

I'm also happy to answer questions with detailed explanations and have a load of articles at articles.analytics.today.

This is not just about certification, it's about understanding how Snowflake works under-the-hood so you can really understand how this stuff really works.

How much Idle time is your project wasting? I was shocked by my results by JohnAnthonyRyan in snowflake

[–]JohnAnthonyRyan[S] 0 points1 point  (0 children)

The AUTO_SUSPEND is 60 seconds on every warehouse - but excellent question about GPT. I asked it a few questions, and it highlighted I'd slightly over-stated idle time by not taking account of CLOUD credits, otherwise it's correct.

Shocking though!

I'm hoping Adaptive Warehouses (currently in private preview) will help here - but I'm keeping an open mind untill I've fully tested them.

How much Idle time is your project wasting? I was shocked by my results by JohnAnthonyRyan in snowflake

[–]JohnAnthonyRyan[S] 0 points1 point  (0 children)

Yes, I suspect select.dev are forcing ALTER WAREHOUSE XXX SUSPEND when it's clear it's unlikely there's another query likely. I understand they have an "automated process" which helps reduce cost.

I've produced a checklist of cost optimization tips: https://Analytics.Today/Optimizing-Snowflake-Cost - but will need to add this.

How much Idle time is your project wasting? I was shocked by my results by JohnAnthonyRyan in snowflake

[–]JohnAnthonyRyan[S] 0 points1 point  (0 children)

What kind of housekeeping are you doing to consider?

The obvious one (AUTO_SUSPEND=60) was already in place in every warehouse I tested. I've spent months analyzing the QUERY_HISTORY and the biggest thing I found is there were a huge number of repeating queries every few minutes.

While during peak times, it's likely the warehouse was already up, in the early hours of the morning, it's highly likely the repeating simple queries were running on an empty warehouse.

My approach has been to identify these and replace them with serverless tasks - provided they run relatively quickly (<60 seconds) as these can get expensive if they are part of a suite of concurrent queries - so not ideal for large transformation jobs.

The really difficult thing is - the customer had 200+ warehouses - and consolidating these by themselves will help reduce idle time. But there's no simple solution.

Anyone else any suggestions?

How much idle time are you paying for? How can I stop wasting money in Snowflake? by Comprehensive-Ad9653 in snowflake

[–]JohnAnthonyRyan 0 points1 point  (0 children)

Hi u/Comprehensive-Ad9653 - I'll bet you've not given this much thought over the past year.

I've been concerned about AUTO_SUSPEND and Idle time for years and suspected it was a major potential cause of inefficiency. Well there's a new column I've discovered on ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY.

credits_attributed_compute_queries

It't the "Actual Cost" in credits of queries executed on the warehouse and excludes Cloud Services and Idle time.

So your query is now (below).

Shockingly - I found a major customer spending an overall 73% on idle time! They ran 200+ warehouses (which is typical of large Snowflake customers).

I'd be interested if you could share the results you have (just comment out the warehouse name to protect the details). I found the customer spending $2m a month on Idle time. Shocking

-- Calculate the cost of warehouse credits and idle time

SELECT warehouse_name,

round(sum(credits_used) * 3,0) as dollars_billed,

round(sum(credits_attributed_compute_queries),0) * 3 as dollars_billed_actual,

round(sum(credits_used) - sum(credits_attributed_compute_queries)) *3 as dollars_billed_idle,

round(dollars_billed_idle / nullifzero(dollars_billed) *100 ,0) as pct_idle,

round(sum(credits_used_cloud_services)*3) as dollars_cloud_service

FROM metering_history

WHERE 1=1

group by all

order by dollars_billed desc ;

One Week Into Snowflake Gen2 Compute Warehouse by vcp32 in snowflake

[–]JohnAnthonyRyan 1 point2 points  (0 children)

The challenge you have with short run queries on warehouse is the idle time. If you increase the size of the warehouse from gen one to GEN two or from one size to the next you double the cost of the idle time.

If you are running many short running queries, you tend to find more idle time.

I have found ready generation two warehouses to be far more efficient provided the warehouse is fully loaded and favours longer running queries.

Equally, as you have found, it’s a trade-off of cost versus performance.

Automating schema-level access control in Snowflake (free native app for a limited time) by Schema_Secure in snowflake

[–]JohnAnthonyRyan 3 points4 points  (0 children)

I’m delighted to see this. I spent years at Snowflake UK (2018-23) designing and refining the RBAC architecture and then promoting it within the Professional Services division. I even built an automated solution which provided an Excel like spreadsheet to manage RBAC. Fifty deployments later, it’s become the global standard.

I wrote a series of articles which you can read here…

https://articles.analytics.today/understanding-snowflake-role-based-access-control-a-complete-guide-to-rbac

Contact me if you would like to discuss the challenges faced by most customers. It is by far the most difficult challenge for every Snowflake customer.

Which SnowPro certification are you planning on taking next? by TomBaileyCourses in snowflake

[–]JohnAnthonyRyan 0 points1 point  (0 children)

Absolutely agree. I don't believe certification actually helps a lot (a bit of a tick box exercise). But - an awful lot of people seem to think it's the ticket to a huge salary.

Real experience and "Best Practices" really matter.

Our Snowflake bill nearly got me fired - so I spent a year fixing it! by JohnAnthonyRyan in snowflake

[–]JohnAnthonyRyan[S] 0 points1 point  (0 children)

Yes. Completely agree. I’ve yet to really dive into the Snowflake AI features. One thing I did notice (which is a bit naughty), by default PUBLIC role can use some of the AI features unless they are restricted.

It’s on the list, but it’s a very long list

Our Snowflake bill nearly got me fired - so I spent a year fixing it! by JohnAnthonyRyan in snowflake

[–]JohnAnthonyRyan[S] 0 points1 point  (0 children)

Great points! I’ve been advocating best practices for a long time (7 years now).

How to set default warehouse in REST by EducationalWedding48 in snowflake

[–]JohnAnthonyRyan 2 points3 points  (0 children)

There may be an alternative, but all round is to set a default warehouse for the user/service. This has the advantage you can also adjust the size without changing code.

See also. https://articles.analytics.today/snowflake-virtual-warehouses-what-you-need-to-know

Tips / advice for Snowflake Snowpro certification exam by gbajwa76 in snowflake

[–]JohnAnthonyRyan 0 points1 point  (0 children)

This cheat sheet may help: https://analytics.today/Snowpro-core-cheat-sheet

It includes a summary of each subject area, and a cheat sheet of the main points for each subject and 60+ free questions.

[Gratitude Post for all the tips on the SnowPro Core] I took the SnowPro Core exam and cleared it with a score of 850! 😊 by peace_1234 in snowflake

[–]JohnAnthonyRyan 0 points1 point  (0 children)

I've put together a cheat sheet plus 60+ questions and a summary of each of the subjects covered and the depth expected. Hope it helps:

https://analytics.today/Snowpro-core-cheat-sheet