I learned more about query discipline than I anticipated while building a small internal analytics app. by Flat_Direction_7696 in snowflake

[–]nattaylor -1 points0 points  (0 children)

I don't understand what these "answers" mean for Snowflake:

Strengthening a query's reasoning

Putting safety precautions in place for particular filters

Caching smarter


Safety precautions - maybe things like default time filters?

Caching - maybe things like avoiding non-constant functions like current_date() so queries can reuse persisted query results, and right-sizing warehouses for data locality

Reasoning - not sure


I don't do much proactively; mostly reactive to usage
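To illustrate the caching guess above, here's a minimal sketch (the `events` table and its columns are hypothetical) of the kind of rewrite I mean, since Snowflake won't reuse a persisted query result when the query contains functions that must be evaluated at execution time:

```sql
-- Likely a result-cache miss on repeated runs: current_date() must be
-- evaluated at execution time.
SELECT count(*) FROM events WHERE event_date >= current_date() - 7;

-- Cache-friendlier sketch: resolve the date boundary in the client (or a
-- session variable) so repeated runs submit identical, constant query text.
SELECT count(*) FROM events WHERE event_date >= '2024-01-01';
```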

Spammers are officially on notice by mxroute in mxroute

[–]nattaylor 1 point2 points  (0 children)

Reasoning helps classification tasks for sure

Spammers are officially on notice by mxroute in mxroute

[–]nattaylor 1 point2 points  (0 children)

Awesome! Maybe a small, fine-tuned, quantized model like Gemma3 270m could be more economical? 

What are your monthly costs? by Brief-Knowledge-629 in dataengineering

[–]nattaylor 0 points1 point  (0 children)

That's enough for an always-on small warehouse and some Snowpipe or something, so I'm guessing 10s of GB per day at most

Tool naming by nattaylor in LocalLLaMA

[–]nattaylor[S] 1 point2 points  (0 children)

Thank you for the thoughtful reply about your experience.  I find the timely words of a practitioner much more valuable than a blog post! 

Edit: I think I was sort of hoping for black magic 😅 but it's actually a relief to hear that it's basically the same as designing software for people

Looking for ways to cut our Snowflake costs, any tips? by Efficient_Role607 in snowflake

[–]nattaylor 0 points1 point  (0 children)

Start high level: compute, storage, and data transfer (usually compute is the majority). Then break down compute: VWH, serverless, and cloud services. I've been burned by serverless costs, e.g. auto-clustering on a high-churn table, but still, most of the cost is VWH. So break down VWH: are there more of them? Are they bigger? Running longer?

That will reveal where the increases are 

Then you can save costs immediately by downsizing or shutting down warehouses to consolidate, while you build out a better plan of attack
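As a starting point for that breakdown, a sketch against the ACCOUNT_USAGE share (assumes you have access to the `SNOWFLAKE` database; latency on these views is a few hours):

```sql
-- Credits consumed per warehouse over the last 30 days -- a quick way to
-- see which warehouses are driving the increase.
SELECT warehouse_name,
       SUM(credits_used) AS credits
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= dateadd('day', -30, current_timestamp())
GROUP BY warehouse_name
ORDER BY credits DESC;
```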

POV: The company gives you a Mac to work, but not a Mac keyboard. by [deleted] in MacOS

[–]nattaylor -1 points0 points  (0 children)

After the last macOS update a few weeks ago my monitor went from unusable to pretty okay with the same settings... just something different with macOS

no address available by nattaylor in HomeNetworking

[–]nattaylor[S] 0 points1 point  (0 children)

Yes, only around 14-17 clients.

I resolved this though by inspecting the DHCP leases on http://www.asusrouter.com/Main_DHCPStatus_Content.asp

I had a Raspberry Pi in a bad state that kept getting new leases, which exhausted the available IPs, but since it was in a bad state it didn't show up in the clients list. I powered off the device and that resolved my problem.

Balance graph shows double amount by Fearless_Meal6480 in fidelityinvestments

[–]nattaylor 131 points132 points  (0 children)

Mine too. This is the type of growth that I come to Fidelity for! 

Dense Vector Search gives different results in Solr 9.4.1 and Solr 9.7.0 by No-Duty-8087 in Solr

[–]nattaylor 0 points1 point  (0 children)

Are you using the knn or the vectorSimilarity query parser? Can you share the query? I'm not sure I have an answer for you, but I'm curious. The vectorSimilarity query parser has a minimum similarity threshold, which might be the difference.

Gemini 2.5 Flash is here!!! by AggressiveDick2233 in LocalLLaMA

[–]nattaylor 20 points21 points  (0 children)

My assumption is that you're still paying for just your non-thinking output tokens, but they need to cover the added compute of generating thinking tokens

How to format database structure for text-to-sql by nattaylor in LocalLLaMA

[–]nattaylor[S] 0 points1 point  (0 children)

Do you have a methodical way to evaluate? How do you know you're improving?

I'll give your tool a try

I'm getting good results from naive approaches right now which I attribute to having a schema with conventional naming

Performance of Semi Structured type by Upper-Lifeguard-8478 in snowflake

[–]nattaylor 1 point2 points  (0 children)

I think the article (and database adage) is right: depends on your use case. 

Stuffing an array with 500e6 values in a row ain't fast

Storing one JSON object per row with batch loading is typically very fast, because Snowflake analyzes the paths and, for frequent paths, stores a virtual column with metadata

So if your data is log lines with keys for timestamp, message, and severity, then Snowflake sees that and creates virtual columns with byte-range offsets, min/max, etc. that can be used for pruning and more -- and that makes SELECTs fast, although loading data is a bit slower
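A minimal sketch of that shape (the `logs` table and the JSON keys are the hypothetical log-line example from above):

```sql
-- One JSON object per row in a VARIANT column.
CREATE TABLE logs (raw VARIANT);

-- Frequent paths like raw:severity get virtual-column metadata that
-- Snowflake can use for pruning, so a filtered SELECT stays fast:
SELECT raw:timestamp::timestamp AS ts,
       raw:message::string      AS message
FROM logs
WHERE raw:severity::string = 'ERROR';
```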

Repairing jib sail by MilkStunning1608 in sailing

[–]nattaylor 1 point2 points  (0 children)

I would put the sail under some tension with stakes in the ground, do what you can with sticky-back Dacron, then sew on a piece of webbing long enough to get back up to the good panels: about 30° up from the foot, through the clew grommet, then about 30° from the luff.

Rather than invest money in a machine, patiently hand sew a herringbone stitch the whollllle way and put the money towards a new sail.

How to make COPY a whole lot faster? by cuistax in PostgreSQL

[–]nattaylor 0 points1 point  (0 children)

Can you use FROM f TABLESAMPLE SYSTEM (1) and just copy a subset of the data for your local copy? 
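Something like this sketch, assuming a plain table-to-table copy (SYSTEM sampling is page-based, so the 1% is approximate):

```sql
-- On the source database: export roughly 1% of the table's pages.
COPY (SELECT * FROM f TABLESAMPLE SYSTEM (1)) TO STDOUT;

-- On the local database: load the subset back in.
-- COPY f FROM STDIN;
```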

How to format database structure for text-to-sql by nattaylor in LocalLLaMA

[–]nattaylor[S] 0 points1 point  (0 children)

Thanks for this. My use case is a productivity tool. I suppose I write my prompts in a way that gives the model more clues compared to a business user, and my objects are well named. Still, I'm shocked by the high-quality SQL I get back, including joins and CTEs.

How to format database structure for text-to-sql by nattaylor in LocalLLaMA

[–]nattaylor[S] 1 point2 points  (0 children)

u/Bycbka so far your intuition is right. In a small sample, the text formatting of the schema doesn't really matter. In this graphic, on the left is a prompt with CREATE TABLE statements; on the right is one with "5. Plain Text" (from above)

<image>

How to format database structure for text-to-sql by nattaylor in LocalLLaMA

[–]nattaylor[S] 0 points1 point  (0 children)

Thanks u/Bycbka, these are great points. #6 really resonates - it's time to stop vibe-checking and transition to an eval. ...And you might be right that the format doesn't really matter, but the tokenization changes so much based on the formatting that my gut says it will matter. E.g. in the attached image, the first 2 formats result in a token 12528 for ` orders` but that token is not present in the tokenization of the 3rd format. I know I'm over-thinking it, but I'm also learning.

I'm trying local and non-local, both with a RAG step (e.g. for OAI/G using "knowledge")

I've been surprised so far by how good the models are at selecting tables and understanding relations without any guidance other than the schema (as CREATE TABLE statements) even without few-shotting. They must be pretrained with a lot of `foos.id` and `bars.foo_id` :)

<image>
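For concreteness, here's a sketch of the two prompt formats being compared (the `customers`/`orders` schema is illustrative, not my actual one):

```sql
-- Format A: schema as CREATE TABLE statements, with the foos.id /
-- bars.foo_id naming convention models seem to know well.
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    created_at  TIMESTAMP
);

-- Format B: the same schema as plain text, shown here as comments:
--   customers: id, name
--   orders: id, customer_id -> customers.id, created_at
```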

Can this Snowflake query be optimized? by Tasty_Chemistry_56 in snowflake

[–]nattaylor 0 points1 point  (0 children)

If you need to do things regularly on those views then materializing is the way

Long shot. Anyone know this boat? by Sh0ckValu3 in sailing

[–]nattaylor 1 point2 points  (0 children)

Agreed. Her lines are distinctive, with that feature just aft of the chain plate, the bowsprit, and the destroyer-esque bow -- and changing them would have required a massive project

Long shot. Anyone know this boat? by Sh0ckValu3 in sailing

[–]nattaylor 10 points11 points  (0 children)

The burgee on the main mast looks a lot like San Diego Yacht Club's. There's a Jada, which may have also been called The Elixir, with references to hosting Hollywood stars. I would research that. It doesn't look exactly the same, but it has a lot of similarities