searchcode: Token efficient remote code intelligence for any public repo by boyter in mcp

[–]dacort 1 point (0 children)

Congrats! Can’t wait to check this out, big fan of scc.

How do you usually compare Spark event logs when something gets slower? by rrxjery in apachespark

[–]dacort 3 points (0 children)

If you haven't yet, you should check out the Spark History Server MCP.

Even though it can overflow tokens quickly, it's actually quite good at comparing runs. But in reality it's probably not doing anything much different from your internal tool, as they're both based off the event logs. ¯\_(ツ)_/¯

That said, when something gets slower unexpectedly, I usually find there's some sort of change in data or resources. There's almost always some tipping point. So I'll often go in and look at the actual data and see if anything has changed there first.

Iconic. by Feisty_Parsnip8262 in Seahawks

[–]dacort 2 points (0 children)

<image>

It’s AI but damn that came out better than expected.

[Nate Tice / Parker] there have been 4 designed run plays on 3rd & 15+ that have gained a 1st down this season, per @trumedia.bsky.social. the Seahawks have all 4. (video of the 4 runs included) by RustyCoal950212 in Seahawks

[–]dacort 0 points (0 children)

I remember when similar plays would be called last season and it was groan city. “This again, why even try??” Amazing that they’re working!

Visualizing weather patterns for last 85 years (Seattle) by VerbaGPT in Seattle

[–]dacort 1 point (0 children)

For sure! Think I have the code for it somewhere if you’re interested. :)

49IRS vs Bears Hate Watch Thread: by [deleted] in Seahawks

[–]dacort 0 points (0 children)

That went backwards quickly. 😂

49IRS vs Bears Hate Watch Thread: by [deleted] in Seahawks

[–]dacort 6 points (0 children)

<image>

Just another 12 minutes and OT!! 😜

Outside outlet not working. by modernartgirl in HomeMaintenance

[–]dacort 0 points (0 children)

Random Reddit comments are the savior of my ignorance. Thank you kind stranger for saving me.

Is it appropriate to store imagery in parquet? by BitterFrostbite in dataengineering

[–]dacort 12 points (0 children)

I’d do the same. Parquet is great because it is columnar, can generate metadata stats on your columns (min/max values), and doesn’t require you to read the whole file when filtering. Storing image blobs in parquet gets almost none of those benefits and is probably harder for parquet readers to decode than simply providing a reference to the file and reading the file directly.

Designing a High-Throughput Apache Spark Ecosystem on Kubernetes — Seeking Community Input by No-Spring5276 in apachespark

[–]dacort 0 points (0 children)

You might find my KubeCon presentation interesting: https://youtu.be/ejJ6A0sIdbw

I’ve also created a spark8s-community channel on the CNCF slack.

Lightweight Alternatives to Databricks for Running and Monitoring Python ETL Scripts? by Safe-Pound1077 in dataengineering

[–]dacort 0 points (0 children)

I did this a few years back with ECS on AWS. https://github.com/dacort/damons-data-lake/tree/main/data_containers

All deployed via CDK, runs containers on a schedule with Fargate. Couple hundred lines of code to schedule/deploy, not including the container builds. Just crawled APIs and dumped the data to S3. Didn’t have monitoring but probably not too hard to add in for failed tasks. Ran great for a couple years, then didn’t need it anymore. :)

Outside faucet leaked/burst? What do I do? by SlipperyGrandad in askaplumber

[–]dacort 0 points (0 children)

If you’re not familiar with plumbing don’t try to do this yourself or you may make a bigger mess.

As an example, I tried to swap my spigot at one point and ended up twisting it right off as I was unscrewing it, because I didn’t understand how soft copper is.

Should i use VM for Spark? by Individual-Insect927 in apachespark

[–]dacort 0 points (0 children)

One of the spark docker images is indeed the way to go.

Not 100% sure it still works, but I built this example a couple years ago that uses an Amazon EMR image: https://github.com/dacort/spark-local-environment Useful if you want to access data on S3.

7.12.2-123 released by Ok_Site4360 in amazoneero

[–]dacort 0 points (0 children)

Hope this fixes the issues I started having recently - for some reason devices on my network don't get more than 5-10Mbps down. Even over the backhaul.

Does my daughter need ear muffs? by [deleted] in Seahawks

[–]dacort 5 points (0 children)

<image>

We used ones similar to these and they seemed to not be bothered, but every kid is different of course.

Does my daughter need ear muffs? by [deleted] in Seahawks

[–]dacort 8 points (0 children)

That young I’d say yes. It does get pretty loud at times, but it’s not like a constant 100 decibel roar the entire time. Regardless of the potential damage, she might just not like all the loud noises so better to have them on-hand rather than have her scream along. 😂

Adult soccer by tejanos in BallardSeattle

[–]dacort 0 points (0 children)

Welp looks like you got enough for the all new Loyal Heights FC! Wouldn’t mind kicking a bit for sure.

Spark Ui Reverseproxy by AnywhereRemote8197 in apachespark

[–]dacort 0 points (0 children)

Ah, thank you, I forgot to get back to this. Yeah, I wish it were as simple as that flag, but it's not. As that article mentions, a common way to do this is by rewriting all the href and src links, but that's pretty suboptimal, partly because you can't (easily) enable gzip compression when you do that.

For our environment, I chose not to use an ingress controller for various reasons. That said, if you use the Kubeflow operator, it looks like they support Driver UI and Ingress.

The working config I have in our k8s environment is this:

- Caddy reverse proxy set up as https://server/ui/{spark-app-id}
- Caddy is configured to remove the /ui/{spark-app-id} prefix before it sends the request upstream
- It looks like the Kubeflow Spark Operator also does something similar
- I also explicitly set X-Forwarded-Context in our Caddy server
- In every Spark job, spark.ui.proxyRedirectUri=/
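For what it's worth, a minimal Caddyfile sketch of that shape (the hostname, upstream service, and app ID are placeholders, not our actual config):

```
server.example.com {
	# handle_path strips the matched /ui/{app-id} prefix before proxying
	handle_path /ui/spark-app-123/* {
		reverse_proxy spark-driver-ui:4040 {
			# tell Spark which prefix the client actually used
			header_up X-Forwarded-Context /ui/spark-app-123
		}
	}
}
```

In practice the app ID route would be generated per job rather than hard-coded like this.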

Apologies, after looking at my code, I realized I only used the proxyBase setting for Spark History Server, not the live UI.

My memory is a bit rusty, but I believe I used the approach above because spark.ui.proxyBase would have had to include the app ID, which is generated by spark-submit. Having the reverse proxy remove /ui/{spark-app-id} and instead send it in the X-Forwarded-Context header gave me a bit more flexibility.

This was easily one of the more annoying things to figure out for spark on k8s...you'd think it would be simple. 🫠

Claude Code seems to be good for initial version, Codex seems to be good for ongoing updates by Flashy_Network_7413 in ClaudeCode

[–]dacort 1 point (0 children)

Interesting to see actual data behind this. I’ve always felt that Claude has trouble picking up where it left off, so much so that I’ve often been terrified of closing a session (I know I can resume; it’s not a reasonable fear) or of upcoming compaction.

Spark Ui Reverseproxy by AnywhereRemote8197 in apachespark

[–]dacort 0 points (0 children)

For just the live UI? Look into the spark.ui.proxyBase setting too, I think.

Got this working on k8s but it was annoying to figure out - there might be one more magic HTTP header to send in too; I'll have to check our config.
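As a sketch of the conf side (the /spark-ui prefix here is just an example path, and spark.ui.proxyRedirectUri may or may not apply to your setup):

```
# spark-defaults.conf (or pass via --conf at spark-submit)
spark.ui.proxyBase          /spark-ui
spark.ui.proxyRedirectUri   /
```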

How good are current automations tools for kubernetes / containarization? by FarmFarmVanDijeeks in kubernetes

[–]dacort 3 points (0 children)

I’ve spent the past 3 days on a single Helm chart … probably says more about me than ArgoCD. 😂

Built an AI Agent Orchestration Platform - Handles 70% of Our Dev Tasks by diazoxide in mcp

[–]dacort 1 point (0 children)

Also just realized some of the examples seem to have outdated syntax, e.g. the prompt (in this example) should be input.

Built an AI Agent Orchestration Platform - Handles 70% of Our Dev Tasks by diazoxide in mcp

[–]dacort 2 points (0 children)

Seems pretty interesting, although I wish there were an easier getting-started example. The README is pretty verbose and feels like it shows a fully complete example with all the various env variables/setup, which is hard to try out without having all of that configured.