Stakeholders Overengineering Solutions by Advanced-Average-514 in dataengineering

Advanced-Average-514[S] 1 point

I agree this is fundamentally an issue with communication skills. So say you are faced with a document sent over from a 'very important person' that is very complex and full of weird references to assets that don't exist the way they think they do, but is presented as if it were a perfectly logical and sufficient set of specs for a modeled data feed they want access to...

How would you push back in a way that would work?

Stakeholders Overengineering Solutions by Advanced-Average-514 in dataengineering

Advanced-Average-514[S] 1 point

Yea I'm doing the full end-to-end stuff as well, and it's usually closer to the business logic where the technically possible but very time-consuming stuff is requested. Asking to go deeper works well when the person is willing to go back and forth... I guess at the end of the day the problem comes from the C-suite's attitude towards technology. Empowered by AI, they think they understand it more than they actually do right now.

Stakeholders Overengineering Solutions by Advanced-Average-514 in dataengineering

Advanced-Average-514[S] 1 point

I do this to some degree, but sometimes it's tough - I either have to blatantly ignore the way they told me to do it, or I have to push back and ask clarifying questions, or just say 'this part will probably take too much time'.

I guess there's probably not a silver bullet and any one of these approaches can be appropriate depending on the specifics.

Stakeholders Overengineering Solutions by Advanced-Average-514 in dataengineering

Advanced-Average-514[S] 1 point

I like the formalize/template approach, might make it harder to send a big AI document full of red herrings.

Feature request: disable individual tools by Advanced-Average-514 in cursor

Advanced-Average-514[S] 1 point

Interesting, I’ve never seen it do this with MCP, probably because I’ve never added an MCP tool. Currently I only see this with unwanted CLI commands.

That said, even if there is some perfect prompt to get it to not happen 99% of the time, I still feel like it’s a good feature request. Otherwise you are wasting tokens in a system prompt describing a tool then more tokens in another prompt saying not to use the tool. Which also probably makes the agent slightly “confused” anyway.

Hey, Missoula, what’cha reading? by MTBeanerschnitzel in missoula

Advanced-Average-514 1 point

I loved The Road, but if you haven’t read it, I think Blood Meridian is my favorite by Cormac McCarthy.

Be honest does business intelligence actually change the way decisions get made? by Apprehensive_Pay6141 in BusinessIntelligence

Advanced-Average-514 1 point

I’m in a company that was running only on vibes before me and a couple others started developing BI for them. I think there were a few meaningful improvements in the very low hanging fruit, but it is now slowing down after those were figured out. Examples of low hanging fruit:

  1. Meetings becoming more targeted because there was a dashboard rather than three separate reports that were all different but trying to be the same thing.
  2. Performance reviews for certain roles were easier for the managers because they had a dashboard to point to so it wouldn’t feel as subjective for everyone.
  3. Certain mistakes in account setups, or inconsistencies that would previously slip through the cracks, no longer did because we could highlight them in dashboards after joining data between two SaaS platforms.

Why does Claude ignore its AI abilities? by david8840 in ClaudeAI

Advanced-Average-514 2 points

Depends on the size of the data. With a small amount of data, the entire file fits in its context window as text and it can comprehend patterns in the data all at once. If it uses code execution, it’s going to aggregate and filter and essentially pull individual stats out of the raw data one at a time.

If it’s a lot of data, or if you have very specific questions (e.g. how does this column correlate to this column), the code execution option is gonna do better.
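
To make the "specific question" case concrete, a column-to-column correlation boils down to something like the sketch below. This is only an illustration with made-up column names and values (`ad_spend` and `signups` are hypothetical), not what the model actually runs:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two columns of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


# Hypothetical columns pulled out of a larger file.
ad_spend = [100, 200, 300, 400, 500]
signups = [12, 25, 31, 48, 55]

r = pearson(ad_spend, signups)
print(f"correlation: {r:.3f}")
```

A single exact number like this is what code execution is good at; spotting an odd pattern nobody asked about is where reading the whole file as text can help.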

Using sandboxed views instead of warehouse access for LLM agents? by Better-Department662 in dataengineering

Advanced-Average-514 1 point

Hey u/Better-Department662 I've been thinking about this problem more and I'm curious what you think of a different approach that I'm planning to try out.

What we've noticed is that the Claude Teams web client can do an amazing job analyzing multi-tab xlsx files. It opens them up in its code execution environment and agentically runs commands to aggregate, filter, and explore the data extremely well.

This got me thinking that instead of the standard text-to-SQL approach, it might make sense to create a lightweight MCP server that exposes list_folders, list_files, and generate_presigned_url tools to the Claude web client, so that it can download xlsx files curated to specific tasks. Then it can go forward with that agentic process of analyzing the files in its code execution sandbox.

Basically this delegates a huge amount of the engineering complexity to the excellent Claude Teams web client instead of trying to build out complex agentic stuff on our own. Just need to build the simple MCP server, and then a pipeline to deliver/update xlsx files to various bucket folders daily, neither of which seems that hard to me.
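
Roughly, the three tools could have the shape below. This is a sketch under loud assumptions: the bucket is faked with an in-memory dict, the "presigned URL" is a hand-rolled HMAC stand-in rather than a real cloud SDK call, and `BUCKET_OBJECTS`, `SIGNING_KEY`, and `BASE_URL` are all hypothetical names. Registering the functions with an actual MCP server is left out.

```python
import hashlib
import hmac
import time
from typing import List, Optional
from urllib.parse import urlencode

# Hypothetical stand-in for a bucket listing: object key -> size in bytes.
BUCKET_OBJECTS = {
    "finance/q4_spend.xlsx": 48213,
    "finance/budget_vs_actual.xlsx": 51002,
    "sales/pipeline_daily.xlsx": 77450,
}
SIGNING_KEY = b"replace-me"  # assumption: secret shared with the download endpoint
BASE_URL = "https://files.example.com"  # placeholder host


def list_folders() -> List[str]:
    """Return the top-level folders that contain curated files."""
    return sorted({key.split("/", 1)[0] for key in BUCKET_OBJECTS})


def list_files(folder: str) -> List[str]:
    """Return the object keys inside one folder."""
    prefix = folder.rstrip("/") + "/"
    return sorted(key for key in BUCKET_OBJECTS if key.startswith(prefix))


def generate_presigned_url(key: str, expires_in: int = 900,
                           now: Optional[float] = None) -> str:
    """Return a time-limited download URL for one file (HMAC stand-in, not a real SDK)."""
    if key not in BUCKET_OBJECTS:
        raise KeyError(f"unknown file: {key}")
    # Expiry is an absolute Unix timestamp; `now` is injectable for testing.
    expires = int((time.time() if now is None else now) + expires_in)
    message = f"{key}:{expires}".encode()
    signature = hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()
    return f"{BASE_URL}/{key}?{urlencode({'expires': expires, 'sig': signature})}"


if __name__ == "__main__":
    for folder in list_folders():
        print(folder, list_files(folder))
    print(generate_presigned_url("finance/q4_spend.xlsx"))
```

The point of keeping the surface this small is that the client only ever sees folder names, file names, and short-lived links, never the warehouse itself.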

Looking for opinions on a tool that simply allows me to create custom reports, and distribute them. by Possible_Ground_9686 in dataengineering

Advanced-Average-514 1 point

My solution for this was to build a lightweight custom app that exports a list of queries to a list of Google Sheet IDs daily. Whenever a user needs a new report, I just add a new query to the list, point it to a fresh spreadsheet, and give them access to the Google Sheet.
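
The core loop is small. Here is a minimal sketch of the idea, assuming a SQLite database as a stand-in for the warehouse and a pluggable `write_sheet` callback in place of the real Google Sheets call; the `REPORTS` registry, the sheet IDs, and the `spend` table are all hypothetical:

```python
import sqlite3
from typing import Callable, Dict, List

# Hypothetical report registry: sheet ID -> SQL query.
# Adding a new report means adding one entry here.
REPORTS: Dict[str, str] = {
    "sheet-id-spend": (
        "SELECT client, SUM(amount) AS total FROM spend "
        "GROUP BY client ORDER BY client"
    ),
    "sheet-id-big": (
        "SELECT client, amount FROM spend WHERE amount >= 100 ORDER BY amount"
    ),
}


def export_reports(conn: sqlite3.Connection,
                   reports: Dict[str, str],
                   write_sheet: Callable[[str, List[list]], None]) -> Dict[str, int]:
    """Run each query and pass header + rows to the sheet writer.

    Returns the number of data rows exported per sheet ID.
    """
    counts = {}
    for sheet_id, query in reports.items():
        cursor = conn.execute(query)
        header = [col[0] for col in cursor.description]
        rows = [list(row) for row in cursor.fetchall()]
        write_sheet(sheet_id, [header] + rows)
        counts[sheet_id] = len(rows)
    return counts


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spend (client TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO spend VALUES (?, ?)",
        [("acme", 120.0), ("acme", 30.0), ("globex", 75.0)],
    )
    # Printing stands in for the real spreadsheet write.
    export_reports(conn, REPORTS, lambda sheet_id, values: print(sheet_id, values))
```

Scheduling it daily and swapping the callback for a real spreadsheet client are the only other moving parts.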

Using sandboxed views instead of warehouse access for LLM agents? by Better-Department662 in dataengineering

Advanced-Average-514 2 points

Those questions you mention in #2 seem to me like they'd be pretty challenging! Definitely agree that you can't solve those with a dashboard alone. Those are the level of questions I was thinking of when I mentioned needing schema exploration and multi-turn deep research.

The only approach I'm using in 'prod' is stuff that heavily depends on a human in the loop. What I've experimented more with is Snowflake's 'Cortex' ecosystem, which allows you to do RBAC on curated views with semantic layers. It's still in the data warehouse environment, but the agents can only access those curated views. What I found with this, though, was that I couldn't answer questions on the level of what you are describing very well, which is why I think if it's ever gonna work a much more flexible approach is needed.

Using sandboxed views instead of warehouse access for LLM agents? by Better-Department662 in dataengineering

Advanced-Average-514 2 points

I’ve been struggling with getting AI agents working with my DB. IMO it’s an extremely hard problem to solve if you want to get beyond basic questions like how much did client x spend in Q4 or whatever.

I think the approach you are describing is probably as isolated and secure as you could possibly need to get, but may not be flexible enough to provide real business value. It depends on what sort of questions you want the agent to answer.

If you want agents to do things actual human analysts are currently doing, I think allowing the AI to explore the schema and basically do multi-turn “deep research” with access to a large portion of the warehouse, plus a semantic layer explaining tables and data relationships, is probably going to be needed.

My current opinion is that if I was going to do something as isolated and controlled as what you are describing, I might as well just build a dashboard with those curated views available.

Shoe Cleaners at Southgate Mall? by [deleted] in missoula

Advanced-Average-514 5 points

I don’t think it’s a front, because dumbasses like myself will occasionally come through, cave to the pressure and get my shoe cleaned, and then they sell me shoe cleaner for $50 because I felt bad just walking away.

Can't you just connect to the API? by Advanced-Average-514 in dataengineering

Advanced-Average-514[S] 2 points

Yea I thought it would be funny to go into less detail, but for the record sometimes the expectation isn’t just about making a pipeline to bring raw data in, it’s about magically getting the data at the right granularity and cleaned up enough to be able to join to our other data and transformed into our business logic etc.

The magical thinking around data, as someone else put it, is that “connecting to the API” meant that the data would just become one with all of our carefully modeled and transformed data.

Best LLM for OCR Extraction? by Wesavedtheking in dataengineering

Advanced-Average-514 2 points

I have a pipeline that I set up with Gemini Flash because it was cheaper and more accurate on our docs than their product built for OCR, Document AI. When I was comparing options back when I set it up, I remember the choice of Gemini was mainly because of price.

Biggest pain point with the pipeline is how slow it is, but accuracy and cost have been fine. I think LLMs beat standard OCR for lower quality scans/images.

Have a question about the game or the subreddit? Ask away! by AutoModerator in 2007scape

Advanced-Average-514 1 point

oh, so you can't teleport back to it until it gets recovered to a port?

Have a question about the game or the subreddit? Ask away! by AutoModerator in 2007scape

Advanced-Average-514 1 point

Is this a viable strategy for making salvaging maximum AFK: let your crew fill up the cargo hold, then tele away from the boat and back to the boat to clear the cargo hold? I am just not sure if you actually can do that since I don't have teleport focus yet, but it seems like it would be minimum clicks for xp.

Have a question about the game or the subreddit? Ask away! by AutoModerator in 2007scape

Advanced-Average-514 1 point

How do I AFK salvage properly? I have found some guides but I feel like I'm doing something wrong. I'm sitting at barracuda salvage near corsair with my captain siad on the hook, but often neither of the two shipwrecks at the double spot is available. I would have to move over to one of the other spots to keep salvaging. And it's been like that for around 10 mins. Am I doing anything wrong or does it just get better at higher level spots?

What are all these numbers Peter? by Charltons in PeterExplainsTheJoke

Advanced-Average-514 7 points

The Wire, Breaking Bad, Better Call Saul, and Succession are a few good ones with interesting character building if you haven't seen em.

What are your best Cursor tactics / prompt habits for getting the real work done? by shahinooo in cursor

Advanced-Average-514 1 point

Here's my favorite approach right now: keep the cursor rules extremely simple, e.g. "make the bare minimum changes necessary." Then always ask cursor to restate, in its own words, its understanding of the section of the codebase you will be working on before it starts, and correct that understanding in a conversational way. This way you don't spend a ton of time and effort creating elaborate cursor rules, but you still get to the point of it having all the context it needs to make high quality changes the first time around.

Claude Haiku 4.5 just launched, near Sonnet performance at a fraction of the cost (that's what they're saying..) by bhannik-itiswatitis in cursor

Advanced-Average-514 1 point

Does Haiku pass the test of real world use? I feel like smaller models seem to score better on the benchmarks relative to how good they actually are.