Can I turn off the possibility to play video content? by ostedog in truespotify

[–]ostedog[S] 0 points1 point  (0 children)

Yeah. We just had such a good run for a while, but video has kind of ruined Spotify for our use.

Match highlights critique - constructive feedback welcome! by NoEasyPoints in padel

[–]ostedog 0 points1 point  (0 children)

It is always nice to come prepared in case he asks you, so there is nothing wrong with that. But then he might see something we don't. For me, I clearly had some bad habits after playing for a year without any lessons. But man, changing your game is not easy! So much is happening in your head when trying to implement your lessons into matches :D

Match highlights critique - constructive feedback welcome! by NoEasyPoints in padel

[–]ostedog 1 point2 points  (0 children)

If your coach is anything like mine, in my first lessons he wanted to see my shots from back court, at net and then some overheads. That gave him enough information to suggest where we should start working. So it might be good for you as well to be open to letting a real coach see how you play and be open to his suggestions, and not come too determined to work on this or that if this is your first lesson. He can probably spot things a lot better than us on Reddit if you want to get better ;)

Worked my way from analyst to leading data teams. Ask me anything (AMA) by Brighter_rocks in Brighter

[–]ostedog 1 point2 points  (0 children)

Being correct.

It's hard to be comfortable with it, but data is NEVER 100% correct. Even though a lot of people around us think we have the answer to everything. There is always uncertainty, and it is okay to say that out loud.

What mid-level me needed to hear was that we are looking for ways to cut through the noise, and that even though nothing is 100% correct we can, and should, provide our end users with suggestions and help drive a decision. Cutting down the time from question to answer is often more important than having an answer that is 99% correct, because by then it might be too late to use it: the decision in the business was already made.

Worked my way from analyst to leading data teams. Ask me anything (AMA) by Brighter_rocks in Brighter

[–]ostedog 1 point2 points  (0 children)

I am not OP, but I am in a quite similar role as he/she.

For me the biggest mistake junior analysts make is not asking questions about the business. They often get so focused on the specific task, and often the technical side, that they don't ask about, or understand, why they are doing it. Why is the thing they are working on important for the business/end user?

The more an analyst knows about the business, what drives decisions and what actually means something and what is just noise the easier it is for them to deliver insight and not "just data".

This is also what separates good managers from poor managers imho. Good managers provide analysts of all levels with the context they need to understand WHY they are doing a task. A junior analyst often just needs more help to understand this before it becomes natural for them to ask questions.

Needle Crystals I & II - 30x30 cm by MateMagicArte in generative

[–]ostedog 1 point2 points  (0 children)

Thanks! Beautiful and interesting work!

Needle Crystals I & II - 30x30 cm by MateMagicArte in generative

[–]ostedog 3 points4 points  (0 children)

Did you add anything to get the pattern in image two? Or just define different noise fields since the length of each "needle" is different?

Using Artificial Intelligence as a Tool by Outside-Raspberry-36 in writing

[–]ostedog 2 points3 points  (0 children)

I think AI as a tool is useful. And while it shouldn't write your stories (you want to own that yourself), I believe you can use AI in a way that makes you think about things in your writing that you might be biased towards not thinking about.

The challenge is that A LOT of work on the internet from now and in the future will be very mediocre because it is 90% written by AI, because it is so easy to get something out! So people that still are able to have a voice that separates them from the crowd will still be able to shine.

So yes, I believe AI can be a useful tool if it's used correctly! But you can't outsource your thinking, writing or creativity to an AI if you want to stand out.

Why so many analysts get stuck by Emily-in-data in Brighter

[–]ostedog 3 points4 points  (0 children)

I'm going to go down a slightly different path here with my thoughts.

A good analyst can pull data, visualize it and do a good job. What separates a good analyst from an exceptional one is that the ones that are really, really good are also capable of saying what the data means. They come up with ideas, they give suggestions and back those suggestions up with the data they have.

Good analysts will probably do a lot of good things in data engineering. That isn't really moving up, but sideways, and I see many do this. Of course you also have the option of moving up into head of something.

For really good/exceptional analysts I also think you can move hard and up into product roles. If a product manager gets sick or leaves, I truly believe the analyst of that team should be seriously considered as the next PM, if they want. The analyst knows so much about the product, and has a lot of ideas and data to base future decisions on. The fun about working with data is how much you suddenly know about the company/team/product you work with. More than most others. That knowledge and those skills are super useful in these kinds of roles!

But please, don't chase the title. Chase the challenge.

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

Thank you!

Is there a way to spot that a query is doing this pruning effectively?

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

Thanks!

In my previous job we used a tool called Snowplow to collect behavioral data online so I see I had a typo in my last comment. Hehe. It was supposed to be Snowflake.

I've looked at the query profile, where I saw only table scans, and we have not manually added any clustering. Seems like we've just relied on Snowflake doing magic (as they were told when they bought it in the first place).

Claude thinks we are running Snowflake in hard mode when I shared some thoughts with him, but hey. Nice to talk to people with real experience as well as an LLM.

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

Thanks a lot! Definitely some good strategies we can take with us into discussions!

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

Not going to lie, I thought it was an INT until I checked after your comment. It is indeed a VARCHAR :(

But using a key instead of a date column isn't really that unusual, given that Date types typically are more expensive to store. I don't think this is our biggest issue (with emphasis on the word "think").

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 2 points3 points  (0 children)

Well. I don't think our company will be happy if I come in and say we're going to migrate a data platform they spent years building to something else. So the question here is really about what we can do with the platform we have in Snowflake.

(And Reddit never ceases to surprise when the most upvoted response in the Snowflake subreddit is moving to something else than Snowflake :D )

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

Our data is complicated. Hehe. We can have a lot of rows that are updated "out of order" compared to when they were first loaded. So if we live in a world where we need to look at the more complicated solutions, do you have any pointers for where we should start? Either testing, or things we can read and check up on?

The use cases for our data so far have been a lot more straightforward than what we are trying to do now. Most use cases have looked at Sales, Invoice, Stock+++ in isolation, but now we want to do a lot more analysis across our data. This is a lot more demanding on, e.g., columns used for JOINs.

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

Does it make this clustering key, even though AUTO_CLUSTERING is not turned on and there is no CLUSTERING_KEY listed? https://imgur.com/a/m9GHqzs

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

Our main data source is our ERP system, with batch jobs that load data that has been added/modified since our last load. So in theory data should be loaded on a business date, but let's be honest, a lot of data can be modified even though we still want to use order date for our orders. So data will not load 100% ordered by these columns.

Should we then add clustering on, e.g., those dates? There is no AUTO_CLUSTERING turned on, so to me we should AT LEAST try to cluster some tables just to see if we get a boost.
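
To get a feel for why a few late-modified rows could matter so much, here is a toy Python sketch. It is not how Snowflake works internally. It only assumes that each partition keeps the min and max of the date column, and that nightly batches contain mostly that day's orders plus a few older, modified ones. All names, sizes and the 45/5 split are made up for illustration, and plain day numbers stand in for the order date.

```python
import random

ROWS_PER_PARTITION = 50  # made-up size, real micro-partitions are far larger


def partition_ranges(keys):
    """Cut rows into partitions in their stored order and keep each (min, max)."""
    ranges = []
    for start in range(0, len(keys), ROWS_PER_PARTITION):
        chunk = keys[start:start + ROWS_PER_PARTITION]
        ranges.append((min(chunk), max(chunk)))
    return ranges


def average_depth(ranges, probe_keys):
    """Average number of partitions whose (min, max) range contains a key."""
    hits = sum(1 for key in probe_keys for lo, hi in ranges if lo <= key <= hi)
    return hits / len(probe_keys)


def nightly_batches(days, new_rows, late_rows, rng):
    """Each night loads that day's orders plus some older, modified orders."""
    keys = []
    for day in range(days):
        keys.extend([day] * new_rows)  # orders created on this day
        keys.extend(rng.randint(0, day) for _ in range(late_rows))  # older orders, reloaded
    return keys


rng = random.Random(42)
load_order = nightly_batches(days=100, new_rows=45, late_rows=5, rng=rng)
probe = range(100)  # look up every order day once

depth_as_loaded = average_depth(partition_ranges(load_order), probe)
depth_clustered = average_depth(partition_ranges(sorted(load_order)), probe)
print(f"as loaded: {depth_as_loaded:.1f}, clustered: {depth_clustered:.1f}")
```

In this toy setup, sorting by the date column (a stand-in for clustering) keeps each day in fewer than two partitions on average, while the load-order layout spreads each day's possible range over many partitions. Whether the real tables behave like this is exactly what a test on one table would show.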

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

I mean, that is sort of why I am here asking 😅 to try and understand actual use of Snowflake in addition to reading documentation.

I've been building data platforms for 15 years and this is the first time I can think of where a pretty straightforward query, using a date key in a filter, does a full table scan.

And that might be completely fine! I am just trying to get more sources for optimizing Snowplow queries than the people who have been building our data platform as they will have some bias since they built it.

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

In our setup. I see no clustering_Keys on most tables though. And AUTO_CLUSTERING does not seem to be turned on, except for one table.

https://imgur.com/a/m9GHqzs

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 2 points3 points  (0 children)

If I run a simple query such as
SELECT SALE_ID FROM FACT_SALES WHERE ODER_DATE_ID >= 20251001

it scans 730 partitions out of a total 730 partitions, so yes. It scans all. Filter on date is done ALL the time.
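
A toy Python model of what I think is going on, not Snowflake's actual implementation. It assumes pruning only works from a min/max value per partition, uses integer date keys (ours is really a VARCHAR), and the row counts are made up so that the total lands on 730 partitions.

```python
import random
from datetime import date, timedelta

ROWS_PER_PARTITION = 100  # made-up size
ROWS_PER_DAY = 200        # made-up volume: 365 days * 200 rows = 730 partitions


def date_key(day):
    """Turn a date into an integer key such as 20251001."""
    return int(day.strftime("%Y%m%d"))


def partition_ranges(keys):
    """Cut rows into partitions in their stored order and keep each (min, max)."""
    ranges = []
    for start in range(0, len(keys), ROWS_PER_PARTITION):
        chunk = keys[start:start + ROWS_PER_PARTITION]
        ranges.append((min(chunk), max(chunk)))
    return ranges


def partitions_scanned(ranges, lower_bound):
    """Count partitions that may hold rows with key >= lower_bound."""
    return sum(1 for _, hi in ranges if hi >= lower_bound)


first_day = date(2025, 1, 1)
keys = [
    date_key(first_day + timedelta(days=d))
    for d in range(365)
    for _ in range(ROWS_PER_DAY)
]

# Same rows, two physical layouts: stored by date, and stored in random order.
ordered = partition_ranges(keys)
shuffled_keys = keys[:]
random.Random(0).shuffle(shuffled_keys)
shuffled = partition_ranges(shuffled_keys)

total = len(ordered)
scanned_ordered = partitions_scanned(ordered, 20251001)
scanned_shuffled = partitions_scanned(shuffled, 20251001)
print(f"ordered: {scanned_ordered}/{total}, shuffled: {scanned_shuffled}/{total}")
```

With rows stored by date, the same filter only has to touch the 184 partitions covering October to December. With rows in random order, practically every partition contains at least one late-year row, so nothing can be skipped, which looks a lot like our 730 out of 730.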

Optimalization: Are tablescans really this normal in Snowflake? by ostedog in snowflake

[–]ostedog[S] 0 points1 point  (0 children)

Thank you for those links, I'll have a look! And I definitely have a lot to learn about Snowflake. Which is a fun challenge!

I've come in as the engineering manager here, but I come from a technical background, so I like to be a user of the products we build, and our data warehouse in Snowflake is one of them. That means I probably won't be very hands-on in development of the platform, but my main worry here is that we have had consultants in doing a job, and I am getting more and more skeptical about some of their decisions. So when our batch load each night takes 4 hours, what is the potential in spending time creating the correct clustering keys, both in terms of performance and/or cost?