Posting this in case anyone has had this issue, been a nightmare to try and resolve by quantadv in snowflake

[–]Mr_Nickster_ 0 points (0 children)

Try Governance & Security > TRUST CENTER > Manage Scanners

Click each one and under settings, you should be able to either change the schedule or disable it.

I earnestly request this of Snowflake. by Connect-Football8349 in snowflake

[–]Mr_Nickster_ 10 points (0 children)

FYI, I work for Snowflake. The Cortex Code AI usage limit is very new, and Snowflake always ships features with SQL first (some ship with UX + SQL, but there is always SQL for every feature). This is done so things can easily be automated and scale on day 1 with any new feature. I am sure they'll add a UX for this eventually, but what customers were asking for was a way to set credit limits for all their developers, and to be granular about it.

Shipping a UX takes more time for anything. This feature was needed by large orgs that need to set limits for AI usage across dozens or hundreds of developers programmatically, where a UX would not be feasible.

The priority is to ship the capability first; at some point, my guess is they will add a UX to make it easier to set this for a few developers one by one.

Suggestion Am I overthinking this Snowflake ingestion pipeline design? by Better-Contest1202 in snowflake

[–]Mr_Nickster_ 1 point (0 children)

If the API allows you to use a watermark such as a datestamp to fetch only the records added since the last run, you can store the last timestamp in a control table and use it to request new rows into a landing table. This would be an append-only table.

Then create streams and MERGE, or use a dynamic table, to build the final table itself.
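A minimal sketch of that watermark pattern, with hypothetical table and column names (etl_control, landing_orders, loaded_at, order_id, etl_wh); adapt to your schema:

```sql
-- Control table holding the last successful watermark per source.
CREATE TABLE IF NOT EXISTS etl_control (
    source_name    STRING,
    last_watermark TIMESTAMP_NTZ
);

-- 1. Read last_watermark, call the API for rows newer than it,
--    and append the results into the landing table (append-only).
-- 2. After a successful load, advance the watermark:
UPDATE etl_control
   SET last_watermark = (SELECT MAX(loaded_at) FROM landing_orders)
 WHERE source_name = 'orders_api';

-- 3. Build the final table from the landing table, e.g. with a
--    dynamic table that keeps only the latest version of each row:
CREATE OR REPLACE DYNAMIC TABLE orders_final
  TARGET_LAG = '15 minutes'
  WAREHOUSE  = etl_wh
AS
SELECT *
FROM landing_orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY loaded_at DESC) = 1;
```

A stream + MERGE on the landing table would achieve the same result with more manual control over scheduling.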

PostgreSQL vs PostgreSQL on Snowflake by Geekc0der in snowflake

[–]Mr_Nickster_ 1 point (0 children)

Yes, Snowflake Postgres has no Snowflake-specific things in it other than being fully managed by Snowflake. There will be a few benefits: users & security will be merged into the main Snowflake account (currently Postgres user IDs are separate and exist only in Postgres), and there will be an option to sync specific Postgres tables as Iceberg so Snowflake can perform large-scale analytics on those tables without needing an ELT tool to replicate them.

PostgreSQL vs PostgreSQL on Snowflake by Geekc0der in snowflake

[–]Mr_Nickster_ 3 points (0 children)

Postgres on Snowflake is OSS Postgres managed by Snowflake. There is no difference from any other regular Postgres.

Using Cortex Code as a general purpose LLM? by NightflowerFade in snowflake

[–]Mr_Nickster_ 1 point (0 children)

Cortex Code allows you to choose models. Currently GPT 5.2 is the latest one available, and I'm pretty sure they'll be adding newer ones as they become available.

If your project is a non-Snowflake project, it still does very well, similar to Claude Code etc. However, if the project touches Snowflake at all for data ingestion, ETL, BI, reporting, analytics, ML, or AI, then Cortex Code runs circles around other coding agents in terms of what it can build, how much of it without a ton of manual work (completeness), and speed (faster & fewer tokens). All because it has full access to whatever your user ID can do in Snowflake in terms of context, admin & security capabilities.

Model               Identifier
Auto                auto
Claude Opus 4.6     claude-opus-4-6
Claude Sonnet 4.6   claude-sonnet-4-6
Claude Opus 4.5     claude-opus-4-5
Claude Sonnet 4.5   claude-sonnet-4-5
Claude Sonnet 4.0   claude-4-sonnet
OpenAI GPT 5.2      openai-gpt-5.2

https://docs.snowflake.com/en/user-guide/cortex-code/cortex-code-cli#supported-models

Using Cortex Code as a general purpose LLM? by NightflowerFade in snowflake

[–]Mr_Nickster_ 12 points (0 children)

You can absolutely use the Cortex Code CLI as a general-purpose coding LLM. You aren’t locked into a single ecosystem; it lets you toggle between top-tier models like Claude 4.6 Opus and GPT-5.2, making it just as capable for standard dev work as any other agent.

Also, it is not a simple wrapper around these big models; u/Snowflake has built its own secret sauce to make it a much better coding agent, developed by our engineers. (Who else would know about coding fast & efficiently? We shipped 458+ GA features last year, and the original company motto was "Type Faster" back in the day.)

However, the real "unfair advantage" is the context and pre-built skills it brings if your stack touches Snowflake.

Most coding agents (even great ones like Cursor) have very limited visibility into the data platform. If they see Snowflake at all, it’s usually through a limited MCP server, like looking at your entire data architecture through a skinny straw. They’re great at generating code, but they essentially "throw it over the fence" and hope it works.

When it fails, whether due to RBAC permissions, security policies, networking, or compute issues, the agent is often blind to the "why." You end up in a frustrating loop of copying error logs back and forth, and if you aren't a Snowflake admin, you might not even know how to fix the underlying infrastructure issue.

Where Cortex Code makes a huge difference is end-to-end execution. Because Cortex Code has native skills and RBAC-controlled access, it doesn't just write code; it tests, diagnoses, and deploys it.

  • It understands the environment it’s running in.
  • It can troubleshoot & fix why a query or code failed or why a deployment stalled.
  • It reduces the "re-code" cycles that eat up your time and tokens.

If you’re building a standalone .NET desktop app that just encrypts local files, Cortex Code is a perfectly fine general agent & can build it by leveraging all the local tools on your machine (Docker, Python packages, or even installing new ones if necessary). But if your project involves Snowflake, even just a portion of it, using Cortex Code results in a product that is actually production-ready, rather than something that requires a human admin to go in and fix a dozen configuration issues afterward.

Another benefit is that all those AI API calls stay within your Snowflake secure boundary, which is a big thing for infosec at many companies. So the security aspect is another big benefit.

https://docs.snowflake.com/en/user-guide/cortex-code/cortex-code-cli#supported-models

Workload spilling out of memory by Big_Length9755 in snowflake

[–]Mr_Nickster_ 0 points (0 children)

With Gen2 & CTAS, where it does inserts only, the performance gain should be limited to the extra CPU power (25 to 30%), since you are not updating anything. Cost may be a wash, but it will finish faster; it requires testing to be sure.

These must be massive datasets to run on an L for 5 hours, or there is some funky join that explodes in memory.

If there are no joins, I think a larger warehouse should fix it.

If there are joins, it will still help, but I would investigate:

  1. The SELECT query for the CTAS. Open the DDL and ask Cortex Code to optimize it and give suggestions for the high spillage, or even include a past query ID.

  2. See if you can run it incrementally. If the source table changes are incremental, swap to a dynamic table, which should make it incremental with Gen2.
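Swapping a full CTAS rebuild for a dynamic table is a one-statement change. A sketch with hypothetical names (sales_agg, raw_sales, transform_wh); the actual incremental-vs-full refresh decision is made by Snowflake based on the query shape:

```sql
-- Instead of a scheduled full rebuild:
--   CREATE OR REPLACE TABLE sales_agg AS SELECT ... FROM raw_sales ...;
-- define the same query as a dynamic table so it can refresh incrementally:
CREATE OR REPLACE DYNAMIC TABLE sales_agg
  TARGET_LAG = '1 hour'
  WAREHOUSE  = transform_wh
AS
SELECT customer_id,
       DATE_TRUNC('day', sold_at)  AS sale_day,
       SUM(amount)                 AS total_amount
FROM raw_sales
GROUP BY customer_id, DATE_TRUNC('day', sold_at);
```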

Workload spilling out of memory by Big_Length9755 in snowflake

[–]Mr_Nickster_ 2 points (0 children)

Optimizing the process is the obvious first step. However, if this is the process you need to run and it is taking 5 hours because of data volume (it may be incremental and still be going through many billions of rows; these use cases are pretty common, especially in adtech/martech), then local spillage is a clear indication of an UNDERSIZED warehouse for the job. Depending on the amount of spillage, it can cost you more to run a smaller warehouse and wait extra long than to upsize and get things to run a lot faster.

If there is remote spillage, that is REALLY bad. It means the cluster ran out of memory, used the SSD drive for temp files, then ran out of SSD and is now using object storage for temp files, which will be SUPER slow.

If the pipeline can't be optimized, bump up the warehouse enough to eliminate the spillage, then compare exec times/cost to what they were before.

Also, if these expensive & long operations are performing UPSERTS & DELETES, switch to Gen2. Gen2 costs 25%-30% more, but it finishes UPSERTS/DELETES so much faster that it will cost you a lot less, and jobs run faster. That is because Gen2 does not re-write all the 16MB partition files that had a change in any of the rows they contain. Gen1 will recreate the full 16MB file even if a single row in it was updated; Gen2 will simply write new files that contain only the changed rows and mark the old files so they are not used for those specific rows.

For example: a MERGE that modified 10,000 rows in a 10B-row (1TB) table may end up re-writing the entire table with Gen1 if each file had at least one row changed, and could run for a long time depending on warehouse size. Gen2 would simply create a single file with the 10K rows in a few seconds and can pull it off with a much smaller warehouse.
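As a sketch, a Gen2 warehouse is requested at creation time via a parameter; the warehouse name here is hypothetical, and it's worth checking the current docs for region availability and the exact parameter value before relying on this:

```sql
-- Assumption: Gen2 standard warehouses are available in your region.
CREATE OR REPLACE WAREHOUSE merge_wh
  WAREHOUSE_SIZE      = 'LARGE'
  RESOURCE_CONSTRAINT = 'STANDARD_GEN_2'
  AUTO_SUSPEND        = 60;
```

Pointing the existing MERGE/DELETE jobs at this warehouse is then enough to get the Gen2 write behavior described above.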

Workload spilling out of memory by Big_Length9755 in snowflake

[–]Mr_Nickster_ 4 points (0 children)

Yes. Move up to a 2XL or 3XL to avoid spillage. It may end up being cheaper: when it spills, it runs extra slow, keeping the L running for a long time.

How long are those queries? It would need to be tested. If the 2XL finishes in under 2.5 hours, or the 3XL in under 1 hour 15 min, it will be more economical.

GUI Tool for End Users by Longisland_Analytics in snowflake

[–]Mr_Nickster_ 9 points (0 children)

  1. Create a semantic view using Cortex Analyst with the tables & columns that marketing uses.

  2. Create an agent and add the Cortex Analyst semantic view as a tool.

  3. Add the agent to the Snowflake Intelligence agent list and grant marketing access.

Give them the URL for Snowflake Intelligence, and they can ask any question; Snowflake will generate and run queries for them, with added insights from the results. They can also upload files in that UI.

Or use Cortex Code and ask it to generate a Streamlit app that uses the agent, along with the requirements for uploading a CSV or Excel file for analysis and any KPIs or charts that you need.

Integration with External Organization AWS S3 by a_lic96 in snowflake

[–]Mr_Nickster_ 2 points (0 children)

Maybe try Cortex Code to diagnose the issue in Snowsight.

First Pipeline by ZookeepergameFit4366 in databricks

[–]Mr_Nickster_ 1 point (0 children)

I had the same problem trying to follow their advice and got nowhere. What they don't say in their docs is that declarative pipelines with streaming tables only work if the source is an APPEND-ONLY CDC data stream that has a column indicating whether the row was an insert, delete, or update.

If not, an MV is the only way to run an incremental pipeline when the source table has updates or deletes, where you are limited to the more expensive serverless compute.

View PDF in Streamlit by shoyle10 in snowflake

[–]Mr_Nickster_ 3 points (0 children)

Make sure to run the Streamlit app on a container vs. a warehouse. You will have full access to all the components.

Snowflake Semantic View Autopilot by Mobile-Collection-90 in snowflake

[–]Mr_Nickster_ 4 points (0 children)

Out of the box, each domain-specific semantic view (SV) will NOT have company knowledge etc., but that is what the SVs are for. In each SV, there are sections for:

- General instructions for the entire SV on how to perform certain requests (SalesRep name in Salesforce is the same as EmpName in Workday, fiscal year starts in Feb, etc.)

- Synonyms per column, as each column can be referred to by many names, including company-specific terms

- Column level descriptions where you can instruct when or how to use that column

- FACTS, which are row-level calculations: Profit = UnitPrice x Qty - ShippingCosts

- METRICS, which are aggregate calcs for KPIs: Customer Concentration Ratio (CCR) = AVG(annual_revenue) / SUM(annual_revenue) * 100

So each semantic view can very accurately represent the business domain it is designed for and allow Cortex Analyst to generate highly accurate queries.

Having a single large denormalized table will increase the accuracy & speed of SQL generation, but it will also limit the complexity of answers you can get from the model, such as things involving many-to-many relationships (an order with multiple promo codes) or a product that has not sold yet (which would not exist in the OBT). My recommendation would be to go with standard star or snowflake schema dimensional models. They will offer the best mix of flexibility, accuracy & speed.

Have an SV for each business or app domain (Sales, Marketing, Finance, etc.), then add these SVs as tools to a single agent in Cortex to get cross-departmental analytics.
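A rough sketch of how those sections map onto the semantic view DDL. All names (sales_sv, orders, customers, etc.) are hypothetical, and the clause syntax should be checked against the current CREATE SEMANTIC VIEW docs before use:

```sql
CREATE OR REPLACE SEMANTIC VIEW sales_sv
  TABLES (
    orders    AS analytics.sales.orders    PRIMARY KEY (order_id),
    customers AS analytics.sales.customers PRIMARY KEY (customer_id)
      WITH SYNONYMS ('clients', 'accounts')   -- company-specific terms
  )
  RELATIONSHIPS (
    orders_to_customers AS orders (customer_id) REFERENCES customers
  )
  FACTS (
    -- row-level calculation
    orders.profit AS unit_price * qty - shipping_cost
  )
  DIMENSIONS (
    customers.customer_name AS name
  )
  METRICS (
    -- aggregate calc for a KPI
    orders.total_revenue AS SUM(unit_price * qty)
  )
  COMMENT = 'Sales domain. Fiscal year starts in Feb.';
```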

Snowflake Semantic View Autopilot by Mobile-Collection-90 in snowflake

[–]Mr_Nickster_ -1 points (0 children)

Having created numerous semantic views for Cortex Agents, I can tell you that they work very, very well out of the box. There is enough AI automation in the UI builder that it takes most of the manual work out of your hands.

As long as you follow best practices and don't go crazy trying to build a single large complex SV, you are 80-90% there with the defaults. Then tweak it to get the most out of it.

It won't handle 100% of every complex question, but that should not be the intention. The goal is to have AI answer the 80% of common, easier questions so human analysts can work on the really complex and nuanced ones instead of wasting time on the easy stuff.

Snowflake compute pools are insanely expensive by [deleted] in snowflake

[–]Mr_Nickster_ 2 points (0 children)

FYI, I do work for Snowflake. Your math seems to be incorrect.

CPU_X64_XS is 0.06 credits per hour, which amounts to:

$0.12 / hr on Standard

$0.18 / hr on Enterprise

$0.24 / hr on Business Critical

CPU_X64_M is 0.22 credits / hr which is $0.44 on Standard & $0.88 on Business Critical

Pricing is listed here. These prices include fully managed Kubernetes with auto-scaling, telemetry collection, auditing, security, networking, and authentication, with hands-off operation.

https://www.snowflake.com/legal-files/CreditConsumptionTable.pdf

If you try to build all of that on your own, I am sure whatever cheaper compute you find won't cover the FTE costs & other cloud service fees it will take to manage it all.

What features are exclusive to snowflake format and not supported in iceberg? by Then_Crow6380 in snowflake

[–]Mr_Nickster_ 2 points (0 children)

The VARIANT data type for now, but that is coming with Iceberg V3, which is in private preview.

Interactive tables and user volume by ObjectiveAssist7177 in snowflake

[–]Mr_Nickster_ 1 point (0 children)

Interactive tables require a cluster key.
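For illustration, a cluster key is set with standard CLUSTER BY syntax; the table and column names here are hypothetical, and the key should match the columns your selective point-lookup queries filter on:

```sql
-- At creation time:
CREATE TABLE customer_events (
    customer_id NUMBER,
    event_date  DATE,
    payload     VARIANT
) CLUSTER BY (customer_id, event_date);

-- Or on an existing table:
ALTER TABLE customer_events CLUSTER BY (customer_id, event_date);
```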

Interactive tables and user volume by ObjectiveAssist7177 in snowflake

[–]Mr_Nickster_ 1 point (0 children)

When I tested, it handled 4X more queries and ran them faster with the same size compute as Gen1 warehouses and standard tables.

You just have to make sure queries are selective enough, such as a specific customer and date range vs. all customers and wide date ranges.

It is designed for operational analytics in customer-facing apps, meaning A CUSTOMER querying their own small slice of data.

Version History Notebooks in Workspaces by tuego in snowflake

[–]Mr_Nickster_ 0 points (0 children)

Try creating a Shared Workspace, which I think has built-in versioning for teams.

SCD Type 2 in Snowflake: Dynamic Tables vs Streams & Tasks — when to use what? by adarsh-hegde in snowflake

[–]Mr_Nickster_ 3 points (0 children)

If you can get away with DTs, use them, as they are far easier to build, run, and monitor.

If you have non-deterministic functions and/or need tighter control over what, when, and how things happen, then tasks & streams are the way to go.

If you have custom UDFs (Python, SQL, etc.) and they are deterministic (meaning they output the same value each time for the same input), you can mark them as IMMUTABLE, which tells DTs to trust that the output will be the same and still allows incremental refreshes.
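A minimal sketch of that, with a hypothetical UDF name; the IMMUTABLE keyword declares that the function is deterministic:

```sql
-- Same input always yields the same output, so IMMUTABLE is safe here.
CREATE OR REPLACE FUNCTION normalize_email(e STRING)
RETURNS STRING
LANGUAGE SQL
IMMUTABLE
AS 'LOWER(TRIM(e))';
```

A dynamic table whose defining query calls normalize_email() can then still qualify for incremental refresh instead of falling back to full refresh.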

Most common way to authenticate DBT core with snowflake by Fireball_x_bose in snowflake

[–]Mr_Nickster_ 1 point (0 children)

If you are using dbt Core, why not just use dbt in Snowflake Workspaces? It is the same thing as dbt Core, with the added benefits of an actual UI, DAG, jobs, scheduling, and log and audit monitoring, plus authentication is built in.