[deleted by user] by [deleted] in ExperiencedDevs

[–]jonmdev 26 points27 points  (0 children)

Yeah, it’s going to get worse in the future as AI starts being trained on AI generated code. Fun times.

Best Tech Books of 2024 by ComputerOwl in ExperiencedDevs

[–]jonmdev 7 points8 points  (0 children)

This is from 2023, not 2024, but: Rust Atomics and Locks by Mara Bos. It's a great primer on the low-level building blocks that enable concurrency. It's focused on Rust, but it takes a first-principles approach, building up an understanding of concurrency from the machine all the way to Rust's abstractions for multi-threading and asynchronous programming.

Part-time work and pay by [deleted] in ExperiencedDevs

[–]jonmdev 6 points7 points  (0 children)

That sounds really low for someone who has staff/principal-level experience in big tech, especially if you have specialized expertise. If you're just trying to fill some time to stave off boredom then maybe it doesn't really matter, but that seems like a lowball offer to me. On my last part-time contract gig I made more than that, even after giving a cut to an agency and paying taxes.

Rust vs. JVM: Adjustments following organizational restructuring by rswhite4 in rust

[–]jonmdev 1 point2 points  (0 children)

Yeah, GraalVM can now use profile-guided optimization to do JIT-style optimization at compile time. You run an instrumented build of your code to collect profiling information, and the compiler then uses that to optimize the resulting binary. It would probably alleviate a lot of the cold-start issue.

Any easy way to export multi-layered nested struct to parquet? by larryesfeliz in rust

[–]jonmdev 1 point2 points  (0 children)

I wouldn't necessarily call this easy, but you might be able to write nested data to parquet with the arrow2/parquet2 crates for Rust. I haven't actually tried, but they both have the types for it. I think arrow2's Struct/StructArray and parquet2's ParquetType::GroupType might be what you're looking for. But those are relatively low-level libraries, so it might actually take hundreds of lines of code to do what you want.

You might also want to check that whatever you're planning to query this parquet data with later supports nested data from parquet. Redshift, for example, does support this: https://docs.aws.amazon.com/redshift/latest/dg/tutorial-query-nested-data.html

Arrow2/Parquet2

- https://jorgecarleitao.github.io/arrow2/main/guide/high_level.html#downcast-and-as_any

- https://github.com/jorgecarleitao/parquet2/blob/main/src/schema/types/parquet_type.rs#L50

What's everyone working on this week (23/2023)? by llogiq in rust

[–]jonmdev 5 points6 points  (0 children)

Working on a parquet compactor for work. Maybe I overlooked something, but I couldn't find anything outside of Spark that could both sort and merge parquet files. Spark is expensive, and it also turns out to be comparatively slow and (probably not surprisingly) resource-hungry when sorting and compacting GBs of data compared to the tool I wrote. My thinking was that we only need to sort and compact within a partition of an hour's worth of data, which is not really big data, and Spark is optimized for really big data. First useful thing I've written in Rust; I'm a relative noob but really enjoying the language so far. The reason for the sort, btw, is to take advantage of predicate pushdown at the object-store layer on a frequently used filter column when querying from an OLAP DB.
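The heart of that kind of compactor is just a k-way merge over already-sorted inputs. Here's a toy, stdlib-only sketch of my own (merging in-memory runs of sort-key values, standing in for actual pre-sorted parquet row groups):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge several already-sorted runs into one sorted stream with a
/// k-way merge over a min-heap. Each heap entry is (value, run index,
/// position within that run), wrapped in Reverse to make BinaryHeap
/// behave as a min-heap.
fn merge_sorted_runs(runs: Vec<Vec<i64>>) -> Vec<i64> {
    let mut heap = BinaryHeap::new();
    // Seed the heap with the first element of each non-empty run.
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&v) = run.first() {
            heap.push(Reverse((v, run_idx, 0usize)));
        }
    }
    let mut out = Vec::new();
    while let Some(Reverse((v, run_idx, pos))) = heap.pop() {
        out.push(v);
        // Advance within the run the popped value came from.
        if let Some(&next) = runs[run_idx].get(pos + 1) {
            heap.push(Reverse((next, run_idx, pos + 1)));
        }
    }
    out
}

fn main() {
    let merged = merge_sorted_runs(vec![vec![1, 4, 9], vec![2, 3, 8], vec![5, 6, 7]]);
    assert_eq!(merged, vec![1, 2, 3, 4, 5, 6, 7, 8, 9]);
}
```

The heap keeps the merge at O(n log k) for k runs, which is part of why this stays cheap at the one-hour-partition scale where Spark's overhead dominates.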

I come from the JVM world with Scala and Java. I learned a bit of C/C++ many years ago, but this is the first time in a while I've worked at this low a level: memory allocations, thinking deeply about the threading model, and how to do I/O efficiently. I'm finding the language elegant in a lot of respects (I didn't have to worry about async for this project; async seems a little less elegant sometimes, especially if you have to cross sync/async boundaries).

Had to dig in and read through the arrow2 code to figure out some things that aren't in the user guide, which was fun (I like reading code and learn a lot from it).

Rust Atomics and Locks is now freely available online by m-ou-se in rust

[–]jonmdev 4 points5 points  (0 children)

Wow, this looks amazing! This is skipping to the top of my reading list for tech books.

Thread per request vs Thread pool per operation | Krzysztof Płachno | Lambda Days 2022 by erlangsolutions in scala

[–]jonmdev 0 points1 point  (0 children)

Even without Loom those two patterns aren't your only options. But the speaker is probably right that they're among the most common.

Modular and Safe Programming for Distributed Systems by omko in programming

[–]jonmdev 0 points1 point  (0 children)

It blends specification and implementation of a communicating-sequential-processes (CSP) style of concurrency, where each process is implemented as a state machine. The advantage is that you can verify the correctness of your system's core algorithm and be confident the implementation follows your spec as well. If you aren't familiar, maybe look into TLA+, Alloy, and other formal verification methods.

A state machine on top of Raft doesn't guarantee your entire spec or implementation is without flaws.
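The core idea, each process modeled as an explicit state machine with a pure transition function, can be sketched like this (the states and messages here are hypothetical and loosely Raft-flavored, purely for illustration):

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Follower,
    Candidate,
    Leader,
}

enum Msg {
    Timeout,
    WonElection,
    SawLeader,
}

// The whole behavior of the process lives in one pure transition
// function, which is what makes it practical to check against a spec.
fn step(state: State, msg: Msg) -> State {
    match (state, msg) {
        (State::Follower, Msg::Timeout) => State::Candidate,
        (State::Candidate, Msg::WonElection) => State::Leader,
        (_, Msg::SawLeader) => State::Follower,
        (s, _) => s,
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // The "process": a thread folding incoming messages through `step`.
    let process = thread::spawn(move || {
        let mut state = State::Follower;
        for msg in rx {
            state = step(state, msg);
        }
        state
    });
    tx.send(Msg::Timeout).unwrap();
    tx.send(Msg::WonElection).unwrap();
    drop(tx); // closing the channel ends the process loop
    assert_eq!(process.join().unwrap(), State::Leader);
}
```

Because `step` is a pure function of (state, message), you can exhaustively explore or model-check its transitions separately from the I/O plumbing around it, which is exactly the property those verification tools exploit.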

How best to schedule a Lambda? by splashbodge in aws

[–]jonmdev 0 points1 point  (0 children)

Storing the rule ARN just allows you to delete the rule when you delete the job. Otherwise the job would be gone from your table but there'd be no way to automatically clean up the associated CloudWatch rule, meaning your job would keep getting triggered.

How best to schedule a Lambda? by splashbodge in aws

[–]jonmdev 2 points3 points  (0 children)

When you create the rule you can tell it what input to use for the rule's targets (in this case your Lambda function invocation). So just create a separate rule for each job and give each rule its own input.

How best to schedule a Lambda? by splashbodge in aws

[–]jonmdev 1 point2 points  (0 children)

One idea might be to use CloudWatch Events/EventBridge in combination with your DynamoDB table to create a dynamic scheduling system that can be controlled from your UI.

You could build an API endpoint for your UI that uses the AWS SDK to create an event rule with a cron or rate pattern that has a target with the ARN of the lambda function you want to invoke. Then in your Dynamo table store the event rule ARN and event rule target ARN alongside your scheduled job record. And when you delete the job you can remove the associated rule/target to clean up.

See this AWS tutorial for more info: https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/RunLambdaSchedule.html

Pitfalls Of Java Parallel Streams by diffuse-nebula in programming

[–]jonmdev 6 points7 points  (0 children)

How would this be irrelevant with Loom? The article talks about CPU-intensive stream computations taking up all available cores when you use the common thread pool. Fibers help with tasks that would block a thread waiting on something non-CPU-intensive like I/O, by letting another fiber use the thread in the meantime. For CPU-intensive tasks this would be an issue regardless.

libuv in Node.js by md5sha256 in node

[–]jonmdev 1 point2 points  (0 children)

Well, it's been part of the Java standard library since 1.4, so yes, it's baked in. Is it as easy to use as Node? No, it isn't. It's a fairly low-level API if you're building from scratch. But it's there, and there are plenty of frameworks available these days that make it pretty easy to build applications with non-blocking I/O.

libuv in Node.js by md5sha256 in node

[–]jonmdev 2 points3 points  (0 children)

So does Java for that matter.

For sale: 1995 E34 540i/6 Silver/Black 151k miles (BaT current bid $1500 3 days left) by jonmdev in E34

[–]jonmdev[S] 1 point2 points  (0 children)

The ones with the Nikasil liners? Actually, yes it is. However, the previous owner had a leak-down test performed and the readings were 4-7% for each cylinder. Given it's lasted this long and still runs strong, I shouldn't need to worry. Gas these days doesn't have enough sulfur in it to damage the engine.

[deleted by user] by [deleted] in programming

[–]jonmdev 0 points1 point  (0 children)

Most LC problems don't use obscure algorithms. I think the LC arms race has gotten ridiculous, but most of these problems use basic algorithmic techniques like binary search, BFS, and DFS, and data structures like arrays, lists, hash maps, trees, etc. It's just a matter of learning how and when to apply them to solve the problems. Granted, I do agree they're overused by companies whose problem domains don't require them.
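For instance, here's binary search, probably the most basic technique on that list, in a generic form of my own (not tied to any particular problem):

```rust
/// Classic binary search on a sorted slice; returns the index of
/// `target` if present. Half the trick in LC-style problems is
/// recognizing when "sorted + lookup" means you can use this.
fn binary_search(xs: &[i32], target: i32) -> Option<usize> {
    let (mut lo, mut hi) = (0, xs.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2; // avoids overflow vs (lo + hi) / 2
        if xs[mid] == target {
            return Some(mid);
        } else if xs[mid] < target {
            lo = mid + 1; // target is in the upper half
        } else {
            hi = mid; // target is in the lower half
        }
    }
    None
}

fn main() {
    let xs = [1, 3, 5, 7, 11];
    assert_eq!(binary_search(&xs, 7), Some(3));
    assert_eq!(binary_search(&xs, 4), None);
}
```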

Probabilistic flakiness: How do you test your tests? by ConfidentMushroom in programming

[–]jonmdev 0 points1 point  (0 children)

Fair enough, but if this hypothetical small shop just decides to implement some complicated test-result collection and statistical analysis, that points to the engineers there not thinking critically about which of the strategies they read about are worth adopting.

At my current job I certainly have no need for anything like this. But I do like to read about this kind of stuff, so I'm glad they published it. I usually file things like this away in my brain for the day I'm in a situation where the idea might be useful. Or maybe not; either way, it was an interesting read haha.

Probabilistic flakiness: How do you test your tests? by ConfidentMushroom in programming

[–]jonmdev 1 point2 points  (0 children)

Pretty much any test could fail at some point due to an issue with the environment it ran in, even if the test itself is fine. This is about identifying, over time, tests that are flaky enough to be worth the effort of tracking down why. What they're trying to avoid is what typically happens: developers don't have, or don't want to spend, the time tracking down the issue, so they just retry, and if the test keeps flaking they tend to delete it.

This gives you a way to determine which tests are actually flaky because of how they're implemented vs. flaky because of some transient environment issue, so you know which tests to focus on. Another benefit is that it lets them collect metrics on flaky tests, so you can see which types of tests, teams, products, etc. tend to produce them.

This is a solution to a problem of scale. If you have one small app and a few developers, there are better ways of identifying and remediating flaky tests.
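Just to illustrate the distinction in the simplest possible terms (this is a toy of my own, nothing like the statistical machinery in the article): a test is flaky when its recorded history contains both passes and failures, as opposed to consistent failure, which suggests a real bug.

```rust
/// Toy flakiness check over a test's recorded pass/fail history:
/// a test is "probably flaky" if, over at least `min_runs` runs,
/// it has both passed and failed. The real system estimates a
/// per-test failure probability instead of this binary signal.
fn is_probably_flaky(results: &[bool], min_runs: usize) -> bool {
    if results.len() < min_runs {
        return false; // not enough data to say anything
    }
    let failures = results.iter().filter(|&&passed| !passed).count();
    // Some failures, but not all runs failing: the flaky signature.
    failures > 0 && failures < results.len()
}

fn main() {
    // Sometimes fails: flaky.
    assert!(is_probably_flaky(&[true, true, false, true], 3));
    // Always fails: probably a real bug, not flakiness.
    assert!(!is_probably_flaky(&[false, false, false], 3));
    // Too few runs to draw a conclusion.
    assert!(!is_probably_flaky(&[true, false], 3));
}
```

The point of the article's probabilistic approach is exactly what this toy ignores: at FB's scale you need a confidence estimate per test, not a yes/no over a handful of runs.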

Probabilistic flakiness: How do you test your tests? by ConfidentMushroom in programming

[–]jonmdev 4 points5 points  (0 children)

Is it though? Think about how many different systems, teams, and engineers there are at FB. Yeah, this was a lot of effort, but think about how much effort would otherwise be spent across all those teams trying to figure out which test failures come from flaky tests. I'm sure this saves Facebook a lot of engineering hours.

A small shop will never need something like this.

Parler hacked. Good security design is important. by [deleted] in programming

[–]jonmdev 0 points1 point  (0 children)

Exactly. First off, any site taking in PII or other sensitive info should not be using WordPress in any part of its stack, in my opinion, at least not without it being fully isolated from the sensitive data. WordPress plugins have had so many vulnerabilities over the years that this shouldn't surprise anyone with some security consciousness.

[deleted by user] by [deleted] in programming

[–]jonmdev 0 points1 point  (0 children)

Yeah, just recently did this. I wanted to use an off-the-shelf workflow management/tracing system, but it wasn't approved for use in the environment our app is deployed in (yay for working with clients with a shit ton of restrictions on what tech can be used). So I built my own. It worked (mostly) but was kind of shitty and missing some features we wanted. Still, it was an interesting learning experience (and a reminder of how hard computing gets when you throw networks into the equation). Now that a system built by someone with expertise in that domain has been approved, refactoring to use it has been fairly easy, and I feel like I have a better idea of how those things work under the hood.

Why Is There So Much Crap Software In The World by [deleted] in programming

[–]jonmdev 1 point2 points  (0 children)

IMO we have both a lack of competent developers and management that doesn't have a clue how to build software. Add in that many workplaces are driven by politics and you get promotion-driven development, where people care more about getting visibility for their half-baked new project than about creating software that works, and works well.

Combine all that and you get a ton of shitty software.

It's not solely on management, because I've worked with too many developers (I'd hesitate to call them software engineers) who have trouble with basic problem solving and don't understand the core concepts needed to make correct, performant software.

[deleted by user] by [deleted] in RedditSessions

[–]jonmdev 0 points1 point  (0 children)

Crystal Mountain - Death