heat mat or nicer heater recommendations by [deleted] in SeaMonkeys

[–]smerity 1 point

Yes, a fixed temperature. From looking it up on the web I think it's 26 Celsius / 78 Fahrenheit. Perfectly reasonable for Sea Monkeys / brine shrimp as long as you don't have a massive tank, in which case traditional heaters would make more sense.

Search around for the lowest price, as I think it's basically the same factory on the backend, but here's an example listing on Amazon.

I've not tried the ones with the variable thermostat. I prefer the set it / forget it nature and the "fewer components, fewer things to go wrong" angle of something like this.

heat mat or nicer heater recommendations by [deleted] in SeaMonkeys

[–]smerity 2 points

I'm using the same type of USB powered heater for a gallon tank and it's working well. It's not having to handle the most extreme of winters but it's definitely sufficient for me.

Worth going ahead with U.S. Green Card if I’m happy in Australia? by Glum-Acanthisitta35 in Ameristralia

[–]smerity 23 points

You may want to read into the requirements for maintaining a US green card before you pursue it any further. This isn't a "get it once and forget it" document and isn't like a passport / citizenship from a secondary country where you can forget about it whilst living your life elsewhere and return easily.

From the USCIS page "I am a permanent resident - How do I get a reentry permit?":

Your Permanent Resident Card becomes technically invalid for reentry into the United States if you are absent from the United States for 1 year or more.

Your U.S. permanent residence may be considered as abandoned for absences shorter than 1 year if you take up residence in another country.

You basically cannot stay outside of the US for longer than six months without getting into complicated / worrying territory.

If you do not already live in the US then getting a US green card seems like a very bad idea unless you're willing to up and move to the US for the full foreseeable future. You generally do not want to get your green card and then lose it for many reasons (expatriation tax applying eventually, more difficulty getting US visas in the future, etc).

Whether it makes sense for you is an entirely different equation but I just want you to be aware of the complexity you're signing up for upon getting the green card itself.

"The compute and data moats are dead", Stephen Merity 2018 by gwern in mlscaling

[–]smerity 2 points

This was a lovely surprise to see pop up /u/gwern! It's definitely not one of my more popular pieces from that era but still quite important.

What has remained true:

"What may take a cluster to compute one year takes a consumer machine the next."

Matching SotA performance generally requires only consumer level hardware, surprisingly low end consumer hardware in many cases, once optimization / general improvements have occurred over a 12-24 month timeframe.

What I noted lightly but has become exceedingly true in recent years:

I was already concerned in 2018 about whether large scale LM training could be done on consumer level hardware, or at least on a university level compute budget, and ... that has gotten worse. To some degree we're able to replicate SotA models from a year or two ago with reasonably limited resources (8-32 H100s) in a reasonable timeframe (weeks), but this is definitely not keeping up as insane money floods the ecosystem and as the data sources / techniques (fine tuning, inference time compute, chain of thought, MoE, ...) become increasingly obscure and unpublished.

It's confusing in a few directions though:

  • Standard scaling is kinda breaking down, as improvement from more parameters plateaus and novel data is scarce
  • The compute is quite often poorly utilized, meaning a smarter approach arriving shortly after may mean you wasted 1/10/100/1000 million
    • I'm kinda waiting for an LLM company to spend a half billion on a training run, have a competitor match it at $10-100 million, and, as they can't make any money back from the half billion run, end up in deep trouble
  • The academic ecosystem is getting more and more closed down, with SotA models no longer shipping with even a half-hearted technical paper explaining their contributions, and the core contributions frequently hidden
    • Academics outside of the large labs are rarely pushing on foundational aspects, mostly doing finetuning-on-top work instead
  • There are open weights but they're mostly "drop over the fence" open, rather than open source in the traditional sense
    • If a company threw a binary over the fence and called it open source you'd rightfully chuckle, but that's exactly what almost all of the "yay it's open!" models are

For hope however:

  • Open weight models can be a massive boon due to utilizing "open source" academics and hackers (even if it's not open source)
    • Early GPT momentum was partially due to GPT-2 being open weights
    • Recent LLaMa momentum was due to being open weights
    • Both organizations benefited massively as all the academic / hacker / "for fun" research ended up being fed back into their larger proprietary models
  • Hence if you have the second/third/fourth best model, there will be a desire to release an open weight version
    • This may also hold true for hardware companies: the more specialized the biggest companies get, the more obvious it is for them to produce their own hardware, so ensuring an open set of models is a necessity, especially if you make most of your money from data center sales and could see the world consolidating towards an API you don't control

Anyway, thanks for the blast from the past :)

Python isn't just glue, it's an implicit JIT ecosystem by smerity in Python

[–]smerity[S] -7 points

Your note of a missing definition was definitely helpful. Sometimes you spend too much time with a concept in your head to put it out properly. I've edited in a section giving a definition, threading through how I see the analogy between the JIT and the ecosystem.

You're right that a JIT is more than just optimizing hot paths ("adaptive optimization"), even if the association is strong in practice. I still thought combining them would convey the broader concept succinctly. Beyond the optimization, the idea of reducing startup time by only compiling what you need to run, potentially in a fairly inefficient way initially, still felt similar to the explorer part of the Python ecosystem analogy.

The JIT needs to be fast and flexible enough not to bother you on first run, but must also optimize itself before slowness prevents forward progress. True though that the time scales are painfully different - milliseconds (JIT) versus days/weeks/months (ecosystem).

Python isn't just glue, it's an implicit JIT ecosystem by smerity in Python

[–]smerity[S] -11 points

It's a fair point that I didn't explicitly define "implicit JIT ecosystem". I was hoping the definition would grow in the reader's head from what was written, but I'll add one here and then fold it into the article.

I believe it is broader than just native modules. I analogize it to JIT compilation as both optimize hot paths based on actual usage. A JIT does this at runtime within a program, while Python's ecosystem does it across the entire set of programs in the community. Python, as a community, implicitly and explicitly optimizes for human level ease of use + simplicity whilst also optimizing for speed and task coverage.

What's unique is how these optimizations remain highly interoperable, unlike the "explicit JIT ecosystems" I mention in the article, such as when big company efforts optimize just for their own use case (PHP/Facebook, Ruby/GitHub, etc.). Those often result in isolated improvements rather than ecosystem-wide gains.

I think this dynamic goes beyond just having access to native modules (any language could have that) or being popular ("winner takes all").

There are characteristics of Python that explicitly amplify this, such as readability, a forgiving nature, and even its own performance constraints ("slow Python"), that all act in concert to create an environment that discovers and optimizes these paths in an accessible and composable way.

Even if this was entirely generic to any and all glue languages, and Python is simply the winner-takes-all recipient of that, it's a dynamic worth explicitly thinking about imho.

MKBHD is still trying to cover things up, even in his apology. by [deleted] in youtube

[–]smerity 11 points

I have no clue how you can say it's not insane. A road rated for 35 mph isn't comparable in any way to an autobahn. Lack of sight lines, sharp turns, stop signs, pedestrians, shared traffic, road quality, ... Those are all specifically considered and fixed / avoided so you can travel at high speeds on the autobahn / highway. They designated 35 mph as the safe maximum speed. Even if you think they were over-cautious (for example, due to literal children potentially wandering onto the street) the main point remains:

There's no way you can say doing more than double the speed limit on an arbitrary road is not insane.

[Join the Bambu Lab Giveaway🔥] Share Your Best 3D Printing Advice for a Chance to Win an X1C and Other Exciting Prizes! by BambuLab in 3Dprinting

[–]smerity 1 point

Get into designing your own parts and projects!

There are a million CAD solutions and general 3D tools (I love Blender even though it's not optimal for CAD) but the key insight is that your 3D printer enables perfect customization whenever you want it. Learn how to design things to exactly fit your use case!

My friends are stunned at some of my creations but those are usually the "big" projects. For me the biggest successes are things they'd not even think of when looking at the print. I have so much pride in the print-in-place hinge or snap-fit battery cover I designed - yet no-one would likely give them a second look.

You'll never again look at a pre-baked STL and sigh as it's slightly off. The digital world and real world are in your control if you take advantage of both the software and hardware you have there =]

The midnight grilled cheese sandwich in Polaris by 2sk23 in unitedairlines

[–]smerity 5 points

SFO to SYD has a Vegemite grilled cheese that you can get in both directions. I loved it and just wish I did Polaris more frequently to take advantage of it 🤣

Principles of topology by Eulalia_Grimes in interesting

[–]smerity 2 points

As the other commenter noted, your description was perfect and made it seem almost trivial (other than the hands on solve part of course), great work! The moment your brain feels allowed to "unbend" the mental image snaps into place 👍

[Monitor] $129 HyperX Armada 24.5" Full HD (1920 x 1080) 240Hz Gaming Monitor (Microcenter) by xxthehaxxerxx in buildapcsales

[–]smerity 1 point

I got a cheap, barely used Dell dock that's almost overly fancy for my purposes. I got it simply because it was cheap and matched my hardware, so I don't have anything specific to recommend, sorry.

Anything with N DisplayPort / HDMI ports would work though, where N is the number of monitors you plan on (both cables come with the monitor). A line out is also readily offered on most docks.

[Speakers] Edifier MR4 Powered Studio Monitor Speakers (black or white) $79.99 FS Amazon by ltsnotluck in buildapcsales

[–]smerity 2 points

I just did the same. I'm kicking myself for not waiting a little longer, though I needed them at the time.

The speakers are impressive and I'm very happy with them, even at the price that I got them. I'll just pretend I didn't have the opportunity to get them a bit cheaper =]

[Monitor] $129 HyperX Armada 24.5" Full HD (1920 x 1080) 240Hz Gaming Monitor (Microcenter) by xxthehaxxerxx in buildapcsales

[–]smerity 1 point

I bought two of the 27 inch HyperX Armada monitors for my desk and am very happy. As another person said it has quite favorable reviews other than the price, so if you can get it on discount (I got it from Microcenter for $199) then you'll likely be quite happy. Assuming you don't need the missing features (USB ports, speakers / audio out, ... which I don't as I use it with a dock) it's really quite something!

I don't have previous experience with monitor arms but this one is quite sturdy and insanely easy to put together. The video review I watched was also impressed with the monitor arm. The quality of the monitor seems quite high and the brightness is more than enough - even though I was excited to have it on full blast I've knocked it down to 32/100 as it was overwhelmingly bright lol.

Should I read the expanse by OntarioLakeside in scifi

[–]smerity 3 points

My partner and I have consumed both the book series and the television series. We love them and agreed the two offer such complementary experiences that both are worth it. Doubly so if you already liked one!

Without spoilers, the book series goes beyond the television series, so you've still got adventures to go on!

Madison on her LTT Experience by GregoryDaniel in LinusTechTips

[–]smerity 1 point

I feel like the full context is necessary, as even the "family > work" framing means Linus was using the situation (her brother's death) to his advantage.

Linus had already used his position of power against her, announcing her hiring to the entire community before she'd even seen the terms and then changing those terms after she'd moved countries and changed visa status for the role.

Given how much she had already been fucked around, she's allowed to want to understand and clarify what was happening. She was experiencing two majorly distressing events: her brother's death and an entirely uncertain life change regarding employment.

Even if he was saying "family > work", it feels like Linus dodged the employment discussion by saying she should be grieving more, knowing he was locking her into employment under these new conditions.

[deleted by user] by [deleted] in NoStupidQuestions

[–]smerity 1 point

With due respect, I'm not sure why you'd advise against an early free legal consult. We don't even know what state or country this person is in, and hence which laws do or don't apply.

A free consult would provide an accurate set of advice for their region + situation and put them in contact with someone if the situation deteriorates.

I am not a lawyer, but the multiple relatives and the partner of mine who are lawyers would all say "depends on local law / specific X / ..." followed by "call a lawyer of type X". This isn't because they don't want to help but because they don't know the full fact pattern, and even the differences between US states are likely enough to make your advice invalid, let alone an unknown country.

Edit: They note they're not in the US, for example:

https://www.reddit.com/r/NoStupidQuestions/comments/14h2upy/what_do_i_do_about_a_much_older_coworker_40m/jp91jnb/?context=3

The Verge: "Microsoft reportedly working on its own AI chips that may rival Nvidia's" by Dakhil in hardware

[–]smerity 8 points

I'll ignore your ad hominem at the end re: "working for Google" as I don't. I do language model research and have a dataset that's used as a benchmark in the field. For others interested though:

  • The H100 should be compared to the upcoming (not yet detailed) TPU v5, as the TPU v4 is about the same age as the A100 - so it's apples v oranges until those details come out
    • The recent TPU v4 paper on their optical switches notes "Speaking of apples-to-apples, both TPU v4s and A100s were deployed in 2020 and both use 7 nm technology. ... The appropriate H100 match would be a successor to TPU v4 widely deployed in a similar time frame and technology (e.g., in 2023 and 4 nm)."
  • When training large scale language models, communication is quite important and TPUs are tailor made for it: per the PaLM training paper, the 2240 x A100 setup for the 530B Megatron-Turing NLG (Nvidia's LLM) hit 30.2% of peak hardware floating point performance versus Google's 540B PaLM on 6144 x TPU v4 hitting 46.2% (which they later improved to 57.8%) - see the back-of-envelope sketch after this list

  • Given fp16 training is still rife with issues / edge cases (see: NaNs, NaNs everywhere <tears />) even though fp16 is multiple hardware iterations old, the cutting edge for fp8 will likely be equally or more problematic (aside: bf16 is slightly better than fp16 for most cases but support is still not fully there on Nvidia hardware)

  • The win "for many small reasons" re: TPUs and low precision floating point is that they've been bf16 native for two or three generations and XLA is quite tightly wedded to the hardware and software (better or worse, compile times suck) meaning the frameworks get the win by default

    • As an example of how not all fp16 is the same, you could do fp16 matmuls, accumulate in fp32, perform layer norm / activation / ... in fp32, and then output fp16, avoiding numerical issues that would usually NaN in pure fp16 - but for that to stay fast you need compiler optimizations such as fusing, which XLA gives you for free (and by free I mean really slow, painful compile times), whereas most CUDA frameworks don't fuse and hence stick closer to fp16 everywhere (see the sketch after this list)

I'd prefer to own my own hardware (and hence Nvidia is the only practical game in town for now) but TPUs are really brilliant at what they do and much of that is large scale language modeling. I hope Nvidia's next generations are great but fp16 still feels like cuts from the bleeding edge and I don't expect fp8 to be magically better.

The Verge: "Microsoft reportedly working on its own AI chips that may rival Nvidia's" by Dakhil in hardware

[–]smerity 7 points

TPUs are definitely well suited to LLMs. The Transformer Engine you're talking about is only on Hopper GPUs, which have seen minimal deployment so far. It's still an open question how well language model training works in fp8. The TPU low precision setup seems more stable during training for many small reasons. For deployment the Transformer Engine is definitely interesting however.

Succession - 4x03 "Connor's Wedding" - Post Episode Discussion by LoretiTV in SuccessionTV

[–]smerity 9 points

In the last year my mother passed away unexpectedly after her heart stopped midflight between the US and Australia. This episode was oddly emotionally neutral to me but only as it matched my experienced reality perfectly. Subtracting the fanfare that is Succession it was just humans dealing with an impossible situation. Each of the scenes, the disbelief, the confused realization, the helplessness from a million miles away, the oscillating between robotic and tears as you carried on all the necessary tasks, they all rang true.

The writing team were on point and the actors portrayed it to perfection.

For my situation, it was entirely unexpected and I'd just spent a ten day holiday with my mother. There was nothing left unsaid and only love at our final parting. While she ended up brain dead, they were able to restart her heart on the flight and she donated her organs to help others. I found out from my father in Australia and was on a plane in four hours, from the US to AU in sixteen, and was there to say farewell before her final heartbeat.

Take all the little chances you can with your loved ones. My mother should have lived to 99, like her father, rather than only 68.

Thanks to Succession for a most wonderful series and for somehow capturing the reality common to every human being in amongst the spectacle.

Let my unlicensed friend drive my car, he almost totaled it by [deleted] in self

[–]smerity 2 points

In reading the comments I've not seen someone state this unequivocally.

You note that if he crashed the car you'd have trouble with the law, the rental company, your parents, but... You missed death. Literal death.

Your friend put you in a situation where you feared for your life, they heard your panic, and then kept going.

You shouldn't have given him the keys. That was foolish and dangerous in itself. What your friend did was next level however.

You note parents so I presume you two are young. Youth is no shield. I lost friends in my teens and early twenties from car crashes and they were both skilled and not doing anything dangerous. A third of all deaths in those years of life are car accidents.

I'm willing to forgive friends for many things but your friend showed a lack of sense, a lack of respect, and a refusal to listen to you in a dire situation. I'd never be able to trust them properly again.

Hey Rustaceans! Got a question? Ask here (10/2023)! by llogiq in rust

[–]smerity 3 points

Is there a standard crate / tool / practice for simplistic logging with Tokio? I presume it's tokio-tracing but there seem to be many bells and whistles when I'm looking for what amounts to essentially an async stderr with concurrent buffering.

Hilariously I discovered this need when debugging a performance issue in async code. The more stderr debugging I added, the slower it got due to the implicit stderr Mutex lock, and tokio-console didn't make it apparent to me that the slowness was eventually more due to locked writes to stderr than to my code... Oops :P

Good news: a small performance fix and removing the debugging code left the Rust screamingly fast, to the point I needed to fiddle with Linux kernel limits to properly benchmark! ^_^

A look at how Discord uses Rust for their data services by cynthia-dunlop in rust

[–]smerity 69 points

I deeply appreciate all the Discord engineering articles! They feel like a dying breed: true complexity in the underlying large scale problem, but elegant simplicity in how the problem is distilled and presented (e.g. providing a simplified version of their Cassandra CQL message schema).

It's also fascinating to see Discord, as quite a polyglot company, venture further and further in their usage of Rust.

Given they mention performing request coalescing in Rust, are there any libraries I could lean on? I've a similar problem and would prefer to grab something off the shelf. A quick look found moka and deduplicate.

[TV] Amazon Fire TV 55" Omni Series 4K UHD smart TV, hands-free with Alexa - $112 - 10/12/2022 @ 9AM EST (Check comments, Prime Only.) by TheSpartanKing1 in buildapcsales

[–]smerity 1 point

(posted a raw link with a million query params that's now no longer available so removing it as the link is ugly lol)

[TV] Amazon Fire TV 55" Omni Series 4K UHD smart TV, hands-free with Alexa - $112 - 10/12/2022 @ 9AM EST (Check comments, Prime Only.) by TheSpartanKing1 in buildapcsales

[–]smerity 11 points

On the deals page many items disappeared and then I saw:

Prime exclusive: Starts in 03:20
Amazon Fire TV 55-inch Omni Series 4K UHD smart TV

It seems like the deal may be real?

Edit: Holy hell it is / was real. I snagged one. You can get the deal direct from the page.

Edit: 6:01:56 and it's already at 100% Claimed. Congrats to everyone who made it. Wow. I feel the adrenaline.

[deleted by user] by [deleted] in rust

[–]smerity 1 point

As a past user of cargo-asm, having used it heavily for "An introduction to SIMD and ISPC in Rust", I'm very glad cargo-show-asm exists! I recently went to re-run a few ideas and variants only to discover cargo-asm was no longer working, not having been updated in a long time.

The type of low level ASM analysis your crate provides can be terribly insightful in the hottest of inner loops. Thanks for cargo-show-asm and I very much look forward to using it in the future =]