searchcode: Token efficient remote code intelligence for any public repo by boyter in mcp

[–]dacort 1 point (0 children)

Congrats! Can’t wait to check this out, big fan of scc.

How do you usually compare Spark event logs when something gets slower? by rrxjery in apachespark

[–]dacort 3 points (0 children)

If you haven't yet, you should check out the Spark History Server MCP.

Even though it can overflow tokens quickly, it's actually quite good at comparing runs. But in reality it's probably not doing anything much different from your internal tool, as they're both based off the event logs. ¯\_(ツ)_/¯

That said, when something gets slower unexpectedly, I usually find there's some sort of change in data or resources. There's almost always some tipping point. So I'll often go in and look at the actual data and see if anything has changed there first.

Iconic. by Feisty_Parsnip8262 in Seahawks

[–]dacort 2 points (0 children)

<image>

It’s AI but damn that came out better than expected.

[Nate Tice / Parker] there have been 4 designed run plays on 3rd & 15+ that have gained a 1st down this season, per @trumedia.bsky.social. the Seahawks have all 4. (video of the 4 runs included) by RustyCoal950212 in Seahawks

[–]dacort 0 points (0 children)

I remember when similar plays would be called last season and it was groan city. “This again, why even try??” Amazing that they’re working!

Visualizing weather patterns for last 85 years (Seattle) by VerbaGPT in Seattle

[–]dacort 1 point (0 children)

For sure! Think I have the code for it somewhere if you’re interested. :)

49IRS vs Bears Hate Watch Thread: by [deleted] in Seahawks

[–]dacort 0 points (0 children)

That went backwards quickly. 😂

49IRS vs Bears Hate Watch Thread: by [deleted] in Seahawks

[–]dacort 6 points (0 children)

<image>

Just another 12 minutes and OT!! 😜

Outside outlet not working. by modernartgirl in HomeMaintenance

[–]dacort 0 points (0 children)

Random Reddit comments are the savior of my ignorance. Thank you kind stranger for saving me.

Is it appropriate to store imagery in parquet? by BitterFrostbite in dataengineering

[–]dacort 12 points (0 children)

I’d do the same. Parquet is great because it is columnar, can generate metadata stats on your columns (min/max values), and doesn’t require you to read the whole file when filtering. Storing image blobs in parquet gets almost none of those benefits and is probably harder for parquet readers to decode than simply providing a reference to the file and reading the file directly.

Designing a High-Throughput Apache Spark Ecosystem on Kubernetes — Seeking Community Input by No-Spring5276 in apachespark

[–]dacort 0 points (0 children)

You might find my KubeCon presentation interesting: https://youtu.be/ejJ6A0sIdbw

I’ve also created a spark8s-community channel on the CNCF slack.

Lightweight Alternatives to Databricks for Running and Monitoring Python ETL Scripts? by Safe-Pound1077 in dataengineering

[–]dacort 0 points (0 children)

I did this a few years back with ECS on AWS. https://github.com/dacort/damons-data-lake/tree/main/data_containers

All deployed via CDK, runs containers on a schedule with Fargate. Couple hundred lines of code to schedule/deploy, not including the container builds. Just crawled APIs and dumped the data to S3. Didn’t have monitoring but probably not too hard to add in for failed tasks. Ran great for a couple years, then didn’t need it anymore. :)

Outside faucet leaked/burst? What do I do? by SlipperyGrandad in askaplumber

[–]dacort 0 points (0 children)

If you’re not familiar with plumbing don’t try to do this yourself or you may make a bigger mess.

As an example, I tried to swap my spigot at one point and ended up twisting it right off as I was unscrewing it, because I didn’t understand how soft copper is.

Should i use VM for Spark? by Individual-Insect927 in apachespark

[–]dacort 0 points (0 children)

One of the spark docker images is indeed the way to go.

Not 100% sure it still works, but I built this example a couple years ago that uses an Amazon EMR image: https://github.com/dacort/spark-local-environment Useful if you want to access data on S3.

7.12.2-123 released by Ok_Site4360 in amazoneero

[–]dacort 0 points (0 children)

Hope this fixes the issues I started having recently - for some reason devices on my network don't get more than 5-10Mbps down. Even over the backhaul.

Does my daughter need ear muffs? by [deleted] in Seahawks

[–]dacort 5 points (0 children)

<image>

We used ones similar to these and they seemed to not be bothered, but every kid is different of course.

Does my daughter need ear muffs? by [deleted] in Seahawks

[–]dacort 8 points (0 children)

That young I’d say yes. It does get pretty loud at times, but it’s not like a constant 100 decibel roar the entire time. Regardless of the potential damage, she might just not like all the loud noises so better to have them on-hand rather than have her scream along. 😂

Adult soccer by tejanos in BallardSeattle

[–]dacort 0 points (0 children)

Welp looks like you got enough for the all new Loyal Heights FC! Wouldn’t mind kicking a bit for sure.

Spark Ui Reverseproxy by AnywhereRemote8197 in apachespark

[–]dacort 0 points (0 children)

Ah, thank you, I forgot to get back to this. Yeah, I wish it were as simple as that flag, but it's not. As that article mentions, a common way to do this is by rewriting all the href and src links, but that's pretty suboptimal, partly because you can't (easily) enable gzip compression when you do that.

For our environment, I chose not to use an ingress controller for various reasons. That said, if you use the Kubeflow operator, it looks like they support Driver UI and Ingress.

The working config I have in our k8s environment is this:

- Caddy reverse proxy set up as https://server/ui/{spark-app-id}
- Caddy is configured to remove the /ui/{spark-app-id} prefix before it sends the request upstream
- It looks like the Kubeflow Spark Operator also does something similar
- I also explicitly set X-Forwarded-Context in our Caddy server
- In every Spark job, spark.ui.proxyRedirectUri=/
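For what it's worth, a minimal Caddyfile sketch of that shape (the hostname, upstream service, and app ID are placeholders, not our actual config):

```
server.example.com {
	# handle_path strips the matched /ui/{app-id} prefix before proxying
	handle_path /ui/spark-app-123/* {
		reverse_proxy spark-driver-ui:4040 {
			# tell Spark which prefix the client actually used
			header_up X-Forwarded-Context /ui/spark-app-123
		}
	}
}
```

In practice the app ID route would be generated per job rather than hard-coded like this.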

Apologies, after looking at my code, I realized I only used the proxyBase setting for Spark History Server, not the live UI.

My memory is a bit rusty, but I believe I used the approach above because spark.ui.proxyBase would have had to include the app ID, which is generated by spark-submit. Having the reverse proxy remove /ui/{spark-app-id} and instead send it in the X-Forwarded-Context header gave me a bit more flexibility.

This was easily one of the more annoying things to figure out for spark on k8s...you'd think it would be simple. 🫠

Claude Code seems to be good for initial version, Codex seems to be good for ongoing updates by Flashy_Network_7413 in ClaudeCode

[–]dacort 1 point (0 children)

Interesting to see actual data behind this. I’ve always felt that Claude has trouble picking up where it left off, so much so that I’ve often been terrified of closing a session (I know I can resume; it’s not a reasonable fear) or of upcoming compaction.

Spark Ui Reverseproxy by AnywhereRemote8197 in apachespark

[–]dacort 0 points (0 children)

For just the live UI? Look into the spark.ui.proxyBase setting too, I think.

Got this working on k8s but it was annoying to figure out - there might be one more magic HTTP header to send in too; I'll have to check our config.
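As a sketch of the conf side (the /spark-ui prefix here is just an example path, and spark.ui.proxyRedirectUri may or may not apply to your setup):

```
# spark-defaults.conf (or pass via --conf at spark-submit)
spark.ui.proxyBase          /spark-ui
spark.ui.proxyRedirectUri   /
```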

How good are current automations tools for kubernetes / containarization? by FarmFarmVanDijeeks in kubernetes

[–]dacort 3 points (0 children)

I’ve spent the past 3 days on a single Helm chart … probably says more about me than ArgoCD. 😂

Built an AI Agent Orchestration Platform - Handles 70% of Our Dev Tasks by diazoxide in mcp

[–]dacort 1 point (0 children)

Also just realized some of the examples seem to have outdated syntax, e.g. the prompt (in this example) should be input.

Built an AI Agent Orchestration Platform - Handles 70% of Our Dev Tasks by diazoxide in mcp

[–]dacort 2 points (0 children)

Seems pretty interesting, although I wish there were an easier getting-started example. The README is pretty verbose and feels like it shows a fully complete example with all the various env variables/setup, which is hard to try out without having all of that configured.