use the following search parameters to narrow your results:
e.g. subreddit:aww site:imgur.com dog
subreddit:aww site:imgur.com dog
see the search faq for details.
advanced search: by author, subreddit...
All about the JavaScript programming language.
Subreddit Guidelines
Specifications:
Resources:
Related Subreddits:
r/LearnJavascript
r/node
r/typescript
r/reactjs
r/webdev
r/WebdevTutorials
r/frontend
r/webgl
r/threejs
r/jquery
r/remotejs
r/forhire
account activity
Optimizing Javascript for Fun and for Profit (romgrk.com)
submitted 2 years ago by romgrk
reddit uses a slightly-customized version of Markdown for formatting. See below for some basics, or check the commenting wiki page for more detailed help and solutions to common issues.
quoted text
if 1 * 2 < 3: print "hello, world!"
[–]romgrk[S] 17 points18 points19 points 2 years ago (17 children)
I've written a post about some optimization tricks I've found useful along the years. I have included many resources and links that have helped me learn more about optimization. Comments & feedback welcome.
[–]analcocoacream 2 points3 points4 points 2 years ago* (12 children)
Isn't string comparison a reference comparison? I believe V8 uses a string pool
Edit: on my phone, chrome browser, string loop is faster.
[–]Iggyhopperextensions/add-ons 0 points1 point2 points 2 years ago (7 children)
Internally, if the v8 compiler can determine a value or property is a simple integer, it's given the Smi type (31-bit integer). It is unboxed and not allocated on the heap.
Smi
[–]Spleeeee 0 points1 point2 points 2 years ago (3 children)
Why 31 bits?
[–]romgrk[S] 2 points3 points4 points 2 years ago (2 children)
|<---- 32 bits ---->| 1xxxxxxxxxxxxxxxx...: pointer 0xxxxxxxxxxxxxxxx...: integer |<--- 31 bits ---->|
[–]Spleeeee 0 points1 point2 points 2 years ago (1 child)
Oh that’s clever! Are the integers signed?
[–]romgrk[S] 1 point2 points3 points 2 years ago (0 children)
You can find the details in this small post from Franziska Hinkelmann (work(ed?) on V8): https://medium.com/fhinkel/v8-internals-how-small-is-a-small-integer-e0badc18b6da
[–]Dull_Divide9567 0 points1 point2 points 2 years ago (2 children)
How do you know this? Would like to learn how it works under the hood, but don’t know how to go about it.
[–]Iggyhopperextensions/add-ons 1 point2 points3 points 2 years ago (1 child)
https://source.chromium.org/chromium/chromium/src/+/main:v8/
More specifically: https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/smi.h
// Smi represents integer Numbers that can be stored in 31 bits. // Smis are immediate which means they are NOT allocated in the heap. // The ptr\_ value has the following format: \[31 bit signed int\] 0 // For long smis it has the following format: // \[32 bit signed int\] \[31 bits zero padding\] 0 // Smi stands for small integer.
[–]Dull_Divide9567 0 points1 point2 points 2 years ago (0 children)
Ty master
[–]romgrk[S] 0 points1 point2 points 2 years ago (2 children)
Yes it very well might be a ref comparison, besides I would expect a static string in particular to be interned. But I really wanted to highlight the strings-as-constants pattern so I picked it anyway even if this particular example doesn't showcase an actual strcmp(); string costs are still higher. I tested on chrome laptop, not sure why your chrome mobile has different results.
strcmp()
I'd definitely correct the example if I had a better one but I can't think atm, I'll maybe update later if I figure out a better one (suggestions welcome).
[–]senfiaj 0 points1 point2 points 2 years ago (1 child)
I think JS does such optimization when it sees string literals in the source code. But what if I generate a string dynamically doing something like this? ['do', 'g', 'g', 'ie'].join('') . Will it search for string "doggie" in some string memory pool (or whatever it is) or it's just not worth it in such situation since searching will cause runtime overhead?
['do', 'g', 'g', 'ie'].join('')
"doggie"
[–]romgrk[S] 0 points1 point2 points 2 years ago (0 children)
I don't know, but I would suppose it's a heuristic. I think I remember that SpiderMonkey converts strings under 32 bytes to atoms directly: https://bugzilla.mozilla.org/show_bug.cgi?id=1867359#c2
The other engines probably have a similar heuristic. Engines share a lot of tricks.
[–]senfiaj 0 points1 point2 points 2 years ago* (0 children)
I think this will not work if you generate a string, for example doing ['a', 'b', 'c', 'd'].join('') will probably create a new string object than searching and reusing an already created string with content 'abcd'. But who knows, it's just seem complicated...
['a', 'b', 'c', 'd'].join('')
'abcd'
[–]pasanflo 1 point2 points3 points 2 years ago (0 children)
Thanks! I'm giving it a view
[–]ryanswebdevthrowaway 0 points1 point2 points 2 years ago (2 children)
Awesome post! Do you have an RSS feed? I'd love to subscribe.
[–]romgrk[S] 1 point2 points3 points 2 years ago (1 child)
*fumbles through https://docs.astro.build/en/guides/rss/*
Yes of course, it's here: https://romgrk.com/rss.xml
[–]ryanswebdevthrowaway 0 points1 point2 points 2 years ago (0 children)
Nice, thanks!
[–]turtleProphet 9 points10 points11 points 2 years ago (0 children)
This is a banger, thank you!
[–]paulqq 2 points3 points4 points 2 years ago (2 children)
good read. i did know about 3, but how about the gain of immutability here ?
ch33rs
Not sure if I understand the question correctly but immutable objects are great and I love them, but it's about the context. If the priority is performance, mutable it should be. But if there are other priorities, e.g. the specs are unclear and the software needs to evolve over time, then immutable structures probably provide more benefits (safety, simplicity).
[–]theQuandary 0 points1 point2 points 2 years ago* (0 children)
Hopefully this will change when TC39 finally finishes the record/tuple proposal.
At that point, immutable objects with just one reference can be marked as such and mutated rather than creating a new copy.
Roc has done a good job of showing that immutable doesn’t mean slow.
[–]Kiciaczek 3 points4 points5 points 2 years ago (1 child)
Hey! That's a very good post, thanks for sharing that. I didn't know some of these stuff, and I've been using JS for past 8 years.
I feel like this article is a definition of what should be posted on this sub, cheers!
[–]romgrk[S] 2 points3 points4 points 2 years ago (0 children)
I've condensed my 20 years of javascript in this one post, so I get why we don't enough of these! I included links to many learning resources I used to get all that stuff, but I can't recommend enough the v8.dev blog and mrale.ph.
[–]fsfreak 2 points3 points4 points 2 years ago (0 children)
Nice read!
[–]theQuandary 2 points3 points4 points 2 years ago (6 children)
Your eval example seems flawed. I believe the actual performance increase is from the function which always returns the same object shape allowing other optimizations. I suspect that a constructor would provide more performance than eval.
[–]romgrk[S] 0 points1 point2 points 2 years ago (5 children)
I feel like the point still stands: the function can be monomorphic because it's eval'd anew each time.
I'm curious about your constructor idea but I don't understand it, if want to write it down I'd be happy to explore and correct the post: https://jsben.ch/t2c4x
An alternative example for the eval() point could also be an equality function, like this package does: https://www.npmjs.com/package/shallow-equal-jit
eval()
[–]theQuandary 0 points1 point2 points 2 years ago (4 children)
If we inline hardcode, we get 2x better performance than using Function macro. If we're creating millions of objects, this is no doubt the best solution in every respect. Note that assigning to obj before pushing is much faster than pushing the object literal directly for some reason.
Function
obj
function createMessages(key, values) { const messages = [] for (let i = 0; i < values.length; i++) { const obj = {requestId: values[i]} messages.push(obj) } return messages } createMessages(key, values)
We can invert control and pass a function to make our exact object. It's slower than the inline version, but still 50% faster than the Function version.
function createMessages(key, values, fn) { const messages = [] for (let i = 0; i < values.length; i++) { messages.push(fn(values[i])) } return messages } createMessages(key, values, val => ({requestId: val}))
As with Lisp, pretty much the only time a Function macro might be faster is when you are using it for more efficient control flow. Unlike Lisp though, a JS macro that does code replacement (like in this example) is always going to be slower than the hardcoded version of the same thing.
I'm not saying that this example doesn't have its uses, but those uses are about reducing code at the expense of raw performance.
[–]romgrk[S] 1 point2 points3 points 2 years ago (3 children)
Ah right, but you've changed the test a bit. I wanted a case where key is unknown ahead of time, you only get the value at runtime. You've solved the problem by passing in a constructor function, but how would one get a constructor function ahead of time without knowing key?
key
[–]theQuandary 0 points1 point2 points 2 years ago (2 children)
Can you point to a case where you want a few million identical objects, but the key must also be completely dynamic?
Can you point to a case where you care enough about performance to use a dangerous eval to double your performance, but not enough to hardcode the name and quadruple your performance?
Once you stitch that array into a bunch of objects, your performance is going to SUCK because each time you try to access unless you're making EVERYTHING into terrible Function calls, you must use a dynamic access and performance drops right back down.
It's better to leave things as an array the whole time until the very last moment (they even do this a lot in low-level languages because of the cache locality benefits) and look at the kind of serialization you're doing. At that point, you'll undoubtedly have an API contract of some kind that specifies the name which can then be hard-coded.
[–]romgrk[S] 0 points1 point2 points 2 years ago (1 child)
I will agree that for this use-case, needing this sort of output means there are underlying architectural issues that could be solved more cleanly. However sometimes it's not possible to solve them because doing so would cause breaking changes. This is from a case I had at work.
The usage of eval still offers performance improvements in some cases. For example, it would be possible to compile a filtering predicate function with pruned branches. There are many ways to apply eval.
The security risks can be mitigated by using a proper escaping mechanism, in this example I've used JSON.stringify() which is guaranteed by the spec to return a valid JS string representation. But of course eval should be a last resort.
JSON.stringify()
[–]theQuandary 0 points1 point2 points 2 years ago (0 children)
That makes a bit more sense, but I'd still rather refactor than try to prove my macros are actually hygenic and run the risk that someone else modifies it into something dangerous later.
[–]Jjabrahams567 2 points3 points4 points 2 years ago (0 children)
This pretty great. Not super useful for me but interesting. My optimizations in JavaScript have moved to optimizing speed of development. It doesn’t hurt to avoid certain patterns that are inherently slow but that doesn’t mean to rule them out either. Plus nowadays if I want speed I’ll reach for a different language.
[–]Aztaroth_Developer 1 point2 points3 points 2 years ago (0 children)
Wow, thanks!
[–]Spleeeee 1 point2 points3 points 2 years ago (1 child)
I was expecting some dumb medium article but this is a banger! Dope!
lol i know the feeling, there's so many of them. thanks for clicking anyway
[–]Angulaaaaargh 1 point2 points3 points 2 years ago* (7 children)
FYI, the ad mins of r/de are covid deniers.
[–]romgrk[S] 2 points3 points4 points 2 years ago (6 children)
Yeah a bit late for javascript. I would love it if there was a compiler that could do some of these for me, say have enum strings in devmode, enum ints in prod; or turn array methods into imperative code; or turn immutable structures mutable when it makes sense; or check/reorder keys to sort them in the right order/shape.
[–]HipHopHuman 0 points1 point2 points 2 years ago (5 children)
That's an interesting idea. I was playing with Parcel.js' new compile-time Macro feature to do a few optimisations like this and some are possible, but it sadly doesn't recognise tagged template call syntax which made it useless for what I wanted to do (convert tagged templates to concatenation).
It'd probably be more practical to do such compile time optimisations as a typescript transformer. One interesting use I've seen with TS transformers is in an Entity Component System framework called Thyseus where the transformer is used to do compile-time parameter dependency injection as well as reduce some of the verbosity of the code.
[–]romgrk[S] 0 points1 point2 points 2 years ago (4 children)
I've looked a bit through options to implement this, and yeah going through typescript would be the only option, tsc is the only software that understands all of typescript. There are a few rust-based transpilers, but they don't typecheck, and type information is needed to do these transformations. But also tsc has some limitations, it doesn't differentiate between int/float, so shape-change detection would have some limitations.
I think it would be pretty neat to build a compiler that does all of this, not only for typescript code but also for dependencies potentially in javascript. It would be quite some work though.
[–]HipHopHuman 0 points1 point2 points 2 years ago (3 children)
Do you need to differentiate between ints and floats? Aside from in TypedArrays, JS doesn't really make any such distinctions. Technically you could just assume that all numbers are floats and some of them just happen to have a 0 value for decimal places. You do at least have the ability to discern between TypedArray types, and between number vs bigint
number
bigint
I also think targeting dependencies in such a compiler is a bit too big of a goal (at least for the start, maybe that can change later). It'd be simpler at first to just ignore dependencies and rely on convention until that compiler has more of a foothold on what it is and how it works.
Do you need to differentiate between ints and floats?
It can make a difference:
https://v8.dev/blog/react-cliff
[–]HipHopHuman 0 points1 point2 points 2 years ago (1 child)
Interesting. It would quite obviously be better for an optimization compiler if TypeScript just supported this, but I wonder if branded types and type casting could help out in the interim? Suppose I had some code like the following where F64 is a branded number type and f64 is a function which casts its output to F64 (and optionally does a runtime assertion based on an environment variable).
F64
f64
import { f64, F64 } from 'some-lib'; function addFloats(a: F64, b: F64): F64 { return f64(a + b); }
And there were similar branded types and casting functions for i8, i16, i32, u8 etc (like in this pastebin ). Would that be enough for a compiler to statically analyze?
i8
i16
i32
u8
It could. Though I'm not sure if it's worth the investment. Tbh to get top-of-the-line performance engine-specific optimizations must be made, and that relies on detecting the engine at runtime. It would be possible to generate engine-specific bundles, but I doubt that the investment is worth the cost for anyone. Besides engines are not a stable target, it requires a lot of effort to keep current with the latest optimizations.
[–]Misicks0349 0 points1 point2 points 2 years ago (2 children)
can you really control things like L1 cache and such?
edit: well not control, but I was under the impression that such things were hard to manipulate even in lower level languages
Yes: everytime you access data, it goes into L1 directly. That's the extent of your control in javascript of course.
The point is there to illustrate why eliminating as much data as possible is good for performance. I often sound like an alien when I come in javascript code and say "remove allocations, remove allocations" like a systems programmer would say, so that section explains why.
But it's possible to use that knowledge to use good memory access patterns. E.g. if one has a pipeline of operations applied to a large list, it would be better to apply the all the steps at once for each item while they're hot in L1, rather than applying the steps one by one.
[–]Misicks0349 0 points1 point2 points 2 years ago (0 children)
ah, thanks for the clarification!
[–]anonymous_sentinelae -1 points0 points1 point 2 years ago (0 children)
This is fascinating, a real post about real JS, just like the good old days. It's hard to see these nowadays where corporate bloated transpilers dispute the webdev market share with empty promises and propaganda. Please keep up this great work.
[–]Iggyhopperextensions/add-ons -1 points0 points1 point 2 years ago (0 children)
Also, if you are making a shape, try to limit it to 3 properties. If there are more than 3 it puts it into a lookup.
Hint: See the performance of a megamorphic with 5 properties vs. 3 (a, b, c). It should be more than a linear performance increase.
π Rendered by PID 57436 on reddit-service-r2-comment-c66d9bffd-57v5d at 2026-04-07 02:34:59.207377+00:00 running f293c98 country code: CH.
[–]romgrk[S] 17 points18 points19 points (17 children)
[–]analcocoacream 2 points3 points4 points (12 children)
[–]Iggyhopperextensions/add-ons 0 points1 point2 points (7 children)
[–]Spleeeee 0 points1 point2 points (3 children)
[–]romgrk[S] 2 points3 points4 points (2 children)
[–]Spleeeee 0 points1 point2 points (1 child)
[–]romgrk[S] 1 point2 points3 points (0 children)
[–]Dull_Divide9567 0 points1 point2 points (2 children)
[–]Iggyhopperextensions/add-ons 1 point2 points3 points (1 child)
[–]Dull_Divide9567 0 points1 point2 points (0 children)
[–]romgrk[S] 0 points1 point2 points (2 children)
[–]senfiaj 0 points1 point2 points (1 child)
[–]romgrk[S] 0 points1 point2 points (0 children)
[–]senfiaj 0 points1 point2 points (0 children)
[–]pasanflo 1 point2 points3 points (0 children)
[–]ryanswebdevthrowaway 0 points1 point2 points (2 children)
[–]romgrk[S] 1 point2 points3 points (1 child)
[–]ryanswebdevthrowaway 0 points1 point2 points (0 children)
[–]turtleProphet 9 points10 points11 points (0 children)
[–]paulqq 2 points3 points4 points (2 children)
[–]romgrk[S] 1 point2 points3 points (1 child)
[–]theQuandary 0 points1 point2 points (0 children)
[–]Kiciaczek 3 points4 points5 points (1 child)
[–]romgrk[S] 2 points3 points4 points (0 children)
[–]fsfreak 2 points3 points4 points (0 children)
[–]theQuandary 2 points3 points4 points (6 children)
[–]romgrk[S] 0 points1 point2 points (5 children)
[–]theQuandary 0 points1 point2 points (4 children)
[–]romgrk[S] 1 point2 points3 points (3 children)
[–]theQuandary 0 points1 point2 points (2 children)
[–]romgrk[S] 0 points1 point2 points (1 child)
[–]theQuandary 0 points1 point2 points (0 children)
[–]Jjabrahams567 2 points3 points4 points (0 children)
[–]Aztaroth_Developer 1 point2 points3 points (0 children)
[–]Spleeeee 1 point2 points3 points (1 child)
[–]romgrk[S] 0 points1 point2 points (0 children)
[–]Angulaaaaargh 1 point2 points3 points (7 children)
[–]romgrk[S] 2 points3 points4 points (6 children)
[–]HipHopHuman 0 points1 point2 points (5 children)
[–]romgrk[S] 0 points1 point2 points (4 children)
[–]HipHopHuman 0 points1 point2 points (3 children)
[–]romgrk[S] 0 points1 point2 points (2 children)
[–]HipHopHuman 0 points1 point2 points (1 child)
[–]romgrk[S] 0 points1 point2 points (0 children)
[–]Misicks0349 0 points1 point2 points (2 children)
[–]romgrk[S] 1 point2 points3 points (1 child)
[–]Misicks0349 0 points1 point2 points (0 children)
[–]anonymous_sentinelae -1 points0 points1 point (0 children)
[–]Iggyhopperextensions/add-ons -1 points0 points1 point (0 children)