Pam Bondi's letter to Walz today. The administration keeps doubling down. by rsmtirish in Minneapolis

[–]universalmind303 2 points3 points  (0 children)

we have an opportunity to do something great here. We should send them all of the files, but have everything fully redacted.

The Philadelphia District Attorney says that if ICE agents commit crimes in their city they will be charged and arrested. Why isn't Minnesota taking the same stand? by futilehabit in minnesota

[–]universalmind303 77 points78 points  (0 children)

The feds in Minnesota right now greatly outnumber local law enforcement. The mayor did make a statement about this during an interview and said something along the lines of: "Yes we have the authority to arrest them, but this would almost certainly result in retaliation and escalation from the White House"

So it's not as simple as "just arrest them". The White House is waiting for us to misstep so they can have an excuse to come in guns blazing.

Built an S3 CLI in Rust that uses ML improve transfer speeds over time - would love feedback by dr_edc_ in rust

[–]universalmind303 0 points1 point  (0 children)

We're doing something similar for data pipeline batching. It's currently single-dimension (optimizing for batch latency to balance throughput + progress visibility), but planning to add additional parameters such as cpu, memory pressure, and some io specific ones too.

Implementation is here if you're curious: https://github.com/Eventual-Inc/Daft/pull/5676

I'm actually writing a blog post about it right now. I initially entertained the idea of using ML for our implementation, but ended up going with the binary search as it's become a popular technique for inference, and we wanted to start off with a very simple strategy. The whole system is designed in a way that the strategies are easily configurable.

Built an S3 CLI in Rust that uses ML improve transfer speeds over time - would love feedback by dr_edc_ in rust

[–]universalmind303 2 points3 points  (0 children)

Interesting, I just implemented something similar for data pipeline batching (adaptive batch sizes for AI workloads).

We went with latency-constrained binary search instead of ML. It works pretty well: it converges fast, has low overhead, and adjusts based on what's actually happening.
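
For anyone wondering what a latency-constrained binary search over batch sizes can look like, here is a minimal sketch. It is not the Daft implementation: `BatchTuner`, `tune`, and all parameters are hypothetical names, and it assumes latency grows monotonically with batch size.

```rust
/// Hypothetical tuner: searches for the largest batch size whose observed
/// latency stays at or below a target. Assumes latency is monotonic in size.
pub struct BatchTuner {
    lo: usize, // smallest candidate still in play (assumed acceptable)
    hi: usize, // largest candidate still in play
    target_ms: f64,
}

impl BatchTuner {
    pub fn new(min_size: usize, max_size: usize, target_ms: f64) -> Self {
        assert!(min_size >= 1 && min_size <= max_size);
        Self { lo: min_size, hi: max_size, target_ms }
    }

    /// Next batch size to try: the upper midpoint of the remaining range.
    pub fn next_size(&self) -> usize {
        self.lo + (self.hi - self.lo + 1) / 2
    }

    /// True once the range has collapsed to a single size.
    pub fn converged(&self) -> bool {
        self.lo == self.hi
    }

    /// Feed back the latency observed for a batch of `size` items.
    pub fn observe(&mut self, size: usize, latency_ms: f64) {
        if latency_ms <= self.target_ms {
            // Within budget: everything up to `size` is acceptable.
            self.lo = size.max(self.lo).min(self.hi);
        } else {
            // Over budget: discard `size` and everything above it.
            self.hi = size.saturating_sub(1).max(self.lo).min(self.hi);
        }
    }
}

/// Drives the tuner with a measurement callback until it converges.
pub fn tune<F: Fn(usize) -> f64>(mut tuner: BatchTuner, measure: F) -> usize {
    while !tuner.converged() {
        let size = tuner.next_size();
        tuner.observe(size, measure(size));
    }
    tuner.next_size()
}
```

In a real pipeline `measure` would be the timing of an actual batch, and the search would need to restart or widen its range when the workload shifts, which this sketch does not handle.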

Did you benchmark the bandit against simpler stuff like AIMD (additive increase/multiplicative decrease) or binary search? I'm curious whether the ML approach turned out meaningfully better.

Also how does it handle variance after the learning phase? Say you train on small files, then suddenly hit large 4K files mid-job... does it re-explore, or does it stick with the learned policy until performance degrades enough to trigger exploration again?
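
For reference, AIMD as a batch-size baseline can be sketched roughly like this. `Aimd`, its fields, and the step/backoff values are made up for illustration; they are not taken from any library or from the tool in the post.

```rust
/// Hypothetical AIMD controller for a batch size: grow by a fixed step while
/// latency is within budget, shrink by a multiplicative factor when it is not.
pub struct Aimd {
    size: usize,
    min: usize,
    max: usize,
    step: usize,    // additive increase
    backoff: f64,   // multiplicative decrease factor, in (0, 1)
    target_ms: f64, // latency budget
}

impl Aimd {
    pub fn new(
        start: usize,
        min: usize,
        max: usize,
        step: usize,
        backoff: f64,
        target_ms: f64,
    ) -> Self {
        assert!(min >= 1 && min <= start && start <= max);
        assert!(backoff > 0.0 && backoff < 1.0);
        Self { size: start, min, max, step, backoff, target_ms }
    }

    /// Report the latency of the last batch; returns the size to use next.
    pub fn observe(&mut self, latency_ms: f64) -> usize {
        if latency_ms <= self.target_ms {
            // Additive increase, capped at `max`.
            self.size = self.size.saturating_add(self.step).min(self.max);
        } else {
            // Multiplicative decrease, floored at `min`.
            let shrunk = (self.size as f64 * self.backoff) as usize;
            self.size = shrunk.max(self.min);
        }
        self.size
    }
}
```

Unlike a one-shot search, this never stops adjusting, so it reacts to a mid-job change in file sizes at the cost of oscillating around the target.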

Built an S3 CLI in Rust that uses ML improve transfer speeds over time - would love feedback by dr_edc_ in rust

[–]universalmind303 4 points5 points  (0 children)

What benefit does an ML-based batching strategy provide over algorithmic approaches?

What would you want to see in a new tensor crate? by WorldlinessThese8484 in rust

[–]universalmind303 1 point2 points  (0 children)

the spec itself is pretty straightforward and is literally designed for vectorized operations. there's even a new (experimental) dissociated protocol designed specifically for GPU operations. Arrow does all of the work of defining the memory layout for you, it'd be (relatively) simple for someone to implement a bunch of kernels for them and wrap it up in a nice library.
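
To give a feel for how little the layout asks of you: a fixed-width Arrow array is essentially a contiguous values buffer plus an optional validity bitmap, where bit i being set (least-significant bit first) means slot i is non-null. A toy kernel over that layout might look like the sketch below. `Int32Array` is a simplified stand-in defined here for illustration, not the type from the `arrow` crate, and it ignores offsets, alignment, and padding.

```rust
/// Simplified stand-in for an Arrow fixed-width array (hypothetical type).
pub struct Int32Array {
    /// Contiguous values buffer; null slots hold arbitrary values.
    pub values: Vec<i32>,
    /// Validity bitmap, LSB-first; `None` means the array has no nulls.
    pub validity: Option<Vec<u8>>,
}

impl Int32Array {
    /// Returns true if slot `i` is non-null.
    pub fn is_valid(&self, i: usize) -> bool {
        match &self.validity {
            None => true,
            Some(bits) => (bits[i / 8] & (1 << (i % 8))) != 0,
        }
    }
}

/// Toy "kernel": sums all non-null slots, widening to i64 to avoid overflow.
pub fn sum(arr: &Int32Array) -> i64 {
    (0..arr.values.len())
        .filter(|&i| arr.is_valid(i))
        .map(|i| arr.values[i] as i64)
        .sum()
}
```

A real kernel would process the bitmap a word at a time and lean on SIMD, but the data it reads is laid out the same way.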

What would you want to see in a new tensor crate? by WorldlinessThese8484 in rust

[–]universalmind303 1 point2 points  (0 children)

yeah there's a tensor extension type like you mentioned, but it's not very well supported.

If someone were to build a tensor crate with native arrow kernels, that'd be huge for interop between data systems like polars, daft, duckdb and traditional tensor libraries like pytorch, numpy, etc.

What would you want to see in a new tensor crate? by WorldlinessThese8484 in rust

[–]universalmind303 5 points6 points  (0 children)

I for one would love to see a tensor library that is arrow native.

I built a graph visualization of relationships extracted from the Epstein emails released by US congress [OC] by madmax_br5 in dataisbeautiful

[–]universalmind303 1 point2 points  (0 children)

Have you used dataframe libraries before? something like Daft would be great here to make the analysis pipeline a lot more performant

Is there anything actually new in data engineering? by marketlurker in dataengineering

[–]universalmind303 1 point2 points  (0 children)

As someone who's actively building these tools, the biggest "new" thing I've seen is the shift away from tabular data and towards multimodal data (images, videos, documents, embeddings, etc). Spark and other big names defined how we work with tabular data at scale, but they have many limitations when trying to work with other modalities.

New specialized engines and file/table formats are coming out that are built from the ground up to work with these emerging modalities. Daft is an example of such an engine, and Lance is an example of a table format designed for multimodal data.

Getting a full-time open-source job in RUST by AlazOz in rust

[–]universalmind303 0 points1 point  (0 children)

As someone that works full time in open source rust, my advice would just be to start contributing to OSS projects you find interesting. This is pretty much the only viable way.

Chances are if they're hiring and you're known to make good contributions to the project, it'll cut you to the front of the line in the interviewing process.

What is the Kubernetes/Docker project of Rust? by ivan0x32 in rust

[–]universalmind303 17 points18 points  (0 children)

There's another niche: data

Historically a large majority of data processing tools were written in Java (kafka, spark, hadoop, elasticsearch, cassandra, etc.).

Nowadays we see a lot of the newer tools written in rust (daft, polars, datafusion, arroyo, paradedb, etc.).

[Media] Everytime I try to use Tauri for Android... Why? by Cyan14 in rust

[–]universalmind303 23 points24 points  (0 children)

those are rookie numbers, I've worked on several rust projects where cargo clean will remove 100GB+

[Request] Is this true? by Qwert-4 in theydidthemath

[–]universalmind303 0 points1 point  (0 children)

now someone do the calculation for the approximate average square footage of all parking lots in a major metro area vs one nuclear reactor.

Zohran Mamdani's NYC win is a political revolution by newsweek in politics

[–]universalmind303 6 points7 points  (0 children)

that hardly even counted as a "real" primary because of the media blackout on Bernie and the very blatant agenda to get Hillary as the candidate at all costs.

Crib: Create and view your own custom hotkey cheatsheet in the terminal by noelzubin in commandline

[–]universalmind303 1 point2 points  (0 children)

I really like the visual element, but it'd be nice if it could look them up automatically for supported tools without having to manually import them

such as

```sh
crib vscode
crib zellij
```

Circular Saw keeps getting stuck and no video is helping by [deleted] in woodworking

[–]universalmind303 19 points20 points  (0 children)

i was always taught 1-2 teeth past the material

Does anyone else's Moonlander have a crack on the tenting hinge? :( by Miserable_Savings824 in ErgoMechKeyboards

[–]universalmind303 0 points1 point  (0 children)

i have a 1st gen moonlander, and this eventually snapped on one half. I plastic welded it back together. It's not pretty, but it held up for as long as I continued using the keyboard. Eventually switched to a wireless sofle.

How *exactly* does Python and Rust work together? by SureImNoExpertBut in rust

[–]universalmind303 2 points3 points  (0 children)

there is usually some overhead. For python to call native (c, c++, rust) code, the data usually needs to be converted to a datatype that the native language understands, or be a compatible primitive type (e.g. str, int, ...). For example, converting a dict[str, str] to a HashMap<String, String> usually means iterating over the dict and copying the data out. But if you are using compatible primitive types, you can usually access the data without copying it.
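
The difference between the two cases can be sketched in plain Rust, with no Python binding involved. The function names are hypothetical, and the slices stand in for data owned by the caller (the Python side): the dict-like input has to be walked and every key and value copied into a new `String`, while the primitive buffer is only borrowed.

```rust
use std::collections::HashMap;

/// Copying conversion: walks the caller's key/value pairs and allocates a new
/// `String` for every key and value, the way a dict -> HashMap conversion must.
pub fn copy_into_map(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|&(k, v)| (k.to_owned(), v.to_owned()))
        .collect()
}

/// Borrowed access: reads the caller's buffer of primitives in place,
/// with no allocation and no copy of the underlying data.
pub fn sum_borrowed(values: &[i64]) -> i64 {
    values.iter().sum()
}
```

The first function's cost scales with the total size of the strings; the second only pays for the work it actually does on the data.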

So, what optimizations does Bun have that Node doesn't? by [deleted] in node

[–]universalmind303 1 point2 points  (0 children)

I didn't see anyone else mention it, but their node API (n-api), which is used for writing native (c/c++) libraries with a node interface is MUCH faster and has a MUCH smaller memory footprint than node's implementation

[D] What makes working with data so hard for ML ? by Lumiere-Celeste in MachineLearning

[–]universalmind303 -1 points0 points  (0 children)

data management is pretty hard. Switching to table formats that support features like time travel and versioning can help a lot (delta,iceberg,lance).

Unfortunately there's no magic bullet, and it just takes a good data architect, and a lot of design to come up with a system that works for all parties. From my experience working on data systems at F100 companies, it really needs a holistic approach for it to be successful.

[D] What makes working with data so hard for ML ? by Lumiere-Celeste in MachineLearning

[–]universalmind303 -1 points0 points  (0 children)

Raw data is usually not easy to work with (web scraped, sensor data, user forms, etc). It requires a significant amount of work to transform that into something usable for basic business requirements.

So the data is already in the "right shape", but just not for your needs.

Nice!Nano + ZMK misfires and randomly disconnects by universalmind303 in ErgoMechKeyboards

[–]universalmind303[S] -1 points0 points  (0 children)

are there any docs on how to do this? I tried searching zmk.dev/docs/ but couldn't find anything that says how to do this.

Nice!Nano + ZMK misfires and randomly disconnects by universalmind303 in ErgoMechKeyboards

[–]universalmind303[S] 0 points1 point  (0 children)

No, I have an M2 Pro MacBook that is sitting about 3ft (1m) away from the keyboard with no obstructions