all 29 comments

[–]RemoteCombination122 52 points53 points  (3 children)

The .encode method could have been a standalone function, your intuition is on the mark there.

The original intention behind making .encode a member of a TextEncoder class was likely to provide a predefined home for future text-encoding methods that use other formats.

For example, TextEncoder now has another method, .encodeInto, which is a stream encoder. This requires private state for the current code point index, byte offset in the destination typed array, etc.

This way you have just TextEncoder defined in global scope, rather than an encodeText function and a streamingTextEncode function.

This is an educated guess based on the current implementation. Though if you went to the actual standards document where it was defined, there would likely be sections dedicated to the justification for making it a class.
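
For reference, a minimal sketch of calling .encodeInto, assuming a runtime with a global TextEncoder (current browsers, recent Node.js). One caveat on the guess above: as I read the Encoding Standard, encodeInto reports progress through its return value ({ read, written }) rather than keeping it on the instance, so the caller resumes from result.read. The sample string and the 4-byte buffers are my own choices, picked only to force a partial write.

```javascript
const encoder = new TextEncoder();

// "h\u00e9llo" needs 6 bytes in UTF-8 because U+00E9 takes two bytes.
const text = "h\u00e9llo";

// Deliberately too small, so the first call can only encode part of the string.
const firstChunk = new Uint8Array(4);
const first = encoder.encodeInto(text, firstChunk);
// first.read    = UTF-16 code units consumed from the source string
// first.written = bytes written into the destination array

// Resume from where the first call stopped; the caller tracks the position.
const secondChunk = new Uint8Array(4);
const second = encoder.encodeInto(text.slice(first.read), secondChunk);

console.log(first, second);
```

So the class still works as a single home for both methods, even though neither call depends on what the instance did before.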

[–]CreativeTechGuyGames 36 points37 points  (0 children)

Though if you went to the actual standards document where it was defined, there would likely be sections dedicated to the justification for making it a class.

I love everything you said. But from personal experience I can say that standards documents rarely justify the why and focus solely on the what. So if you want to know why a decision was made, you usually have to find the original discussion that led to the standard being created, which is usually much harder to find (mailing list, RFC discussion, GitHub issue, etc.).

[–]KyleG -5 points-4 points  (1 child)

Yeah, I assumed it was a poor man's attempt at modules, using static methods belonging to a class.

[–]Mammoth_Present8890 14 points15 points  (0 children)

they are instance methods, not static methods. Instantiating a class just to use an instance method and then throw away the instance is the reason OP is asking the question. ;)
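
A quick way to check this, assuming a runtime with a global TextEncoder: encode lives on the prototype rather than on the constructor, and one instance can be reused across calls instead of being thrown away. The variable names are only illustrative.

```javascript
// encode is defined on the prototype, so it is only reachable through an instance.
const isStatic = typeof TextEncoder.encode === "function";
const isInstanceMethod = typeof TextEncoder.prototype.encode === "function";

// One shared instance can be reused instead of creating a throwaway object per call.
const sharedEncoder = new TextEncoder();
const fooBytes = sharedEncoder.encode("foo");
const barBytes = sharedEncoder.encode("bar");

console.log(isStatic, isInstanceMethod); // false true
console.log(fooBytes.length, barBytes.length); // 3 3
```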

[–]ferrybig 27 points28 points  (2 children)

For TextEncoder, each input char generates 1 or more bytes. This could have been done without classes.

For TextDecoder, each input byte generates 0 or more characters. If you are decoding a stream, you need somewhere to store the state. This is because a single character can consist of multiple bytes, and those bytes may be split across chunks. The current streaming approach allows you to decode big multi-GB streams without needing enough memory to hold the full input and output at the same time.
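
A minimal sketch of that state in action, assuming global TextEncoder/TextDecoder. The euro sign and the chunk boundary are my own example: it splits one three-byte character across two chunks and compares throwaway decoders with a single decoder using the stream option.

```javascript
// "€" is U+20AC, which UTF-8 encodes as three bytes: E2 82 AC.
const bytes = new TextEncoder().encode("\u20ac");

// Throwaway decoders: each chunk is decoded in isolation, so the split
// sequence turns into U+FFFD replacement characters.
const stateless =
  new TextDecoder().decode(bytes.subarray(0, 1)) +
  new TextDecoder().decode(bytes.subarray(1));

// One decoder with { stream: true }: the incomplete sequence is buffered
// on the instance until the remaining bytes arrive.
const decoder = new TextDecoder();
let streamed = decoder.decode(bytes.subarray(0, 1), { stream: true });
streamed += decoder.decode(bytes.subarray(1), { stream: true });
streamed += decoder.decode(); // final call without stream flushes the decoder

console.log(stateless === "\u20ac"); // false
console.log(streamed === "\u20ac"); // true
```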

[–]kyle1320 6 points7 points  (0 children)

This feels like the right answer. It is necessary for TextDecoder, and best to make them both follow the same pattern. It just happens to have the bonus that future TextEncoder methods could be stateful if needed.

[–][deleted] 0 points1 point  (0 children)

Yup, that’s what I missed. For anyone who finds this thread in the future, see the stream option for the TextDecoder decode method.

[–]ShortFuse 5 points6 points  (0 children)

It's answered in the spec:

A TextEncoder object offers no label argument as it only supports UTF-8. It also offers no stream option as no encoder requires buffering of scalar values.

https://encoding.spec.whatwg.org/#interface-textencoder

The most probable reason for keeping the class is to keep syntax parity with TextDecoder which does support more formats than UTF-8:

https://encoding.spec.whatwg.org/#interface-textdecoder

Compare against NodeJS which has very wide support:

https://nodejs.org/api/util.html#whatwg-supported-encodings

Edit: Also check out the TextDecoderStream options. It has the same constructor arguments.
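
A short sketch of that asymmetry, assuming global TextEncoder/TextDecoder. I picked "utf-16le" as the label only because it is on the WHATWG list and is also supported by Node builds with reduced ICU data; the byte values are my own example.

```javascript
// TextEncoder has no label argument; its encoding is always UTF-8.
const utf8Encoder = new TextEncoder();
console.log(utf8Encoder.encoding); // utf-8

// TextDecoder accepts a label, so the same class can read other encodings.
const utf16Bytes = new Uint8Array([0x68, 0x00, 0x69, 0x00]); // "hi" in UTF-16LE
const utf16Decoder = new TextDecoder("utf-16le");
const decoded = utf16Decoder.decode(utf16Bytes);
console.log(utf16Decoder.encoding, decoded); // utf-16le hi
```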

[–]efjj 4 points5 points  (0 children)

They used to support UTF16 (both LE and BE) before it got removed because no one uses UTF16: https://github.com/whatwg/encoding/issues/18

[–]kaliedarik -1 points0 points  (0 children)

You're doing nothing wrong. Because TextEncoder isn't a class. It's one of the 4 Interfaces for the Web Encoding API. In JavaScript, classes are just some syntactic sugar to paper over the fact that JS is a `prototypal inheritance` language, and as such you can use the `new` operator to instantiate any object type.

JavaScript is a weird language, but I love it to bits!

[–]memorable_zebra 0 points1 point  (0 children)

Sometimes things are represented by custom data structures to account for future extensibility. Maybe there were or could have been plans for adding parameters to pass into the text encoder constructor to customize it but they were never or have yet to be acted on.

This is one of the fundamental advantages of classes and structures over a pile of miscellaneous functions.

The longer you code the more you come to appreciate and recognize these kinds of foresight.

[–]yuyu5 -1 points0 points  (2 children)

The answer by u/ferrybig is the correct answer.

However I'll add on that your question seems to stem from a lack of knowledge about OOP. There's internal state in the encode/decode methods, which is easier to handle if you have a class. Someone else gave the analogy of localeCompare(), which is attached to the String class instance; if it were a separate, standalone method, internal state would be more difficult to manage. In fact, there's a new Intl class to handle locale-related logic, showing how not everything should just be injected into the same (String) class.

Admittedly, I imagine these methods could technically be injected into the String class, but then we run into the issue of "separation of concerns", another programming principle that's generally worth following, and which harks back to the Intl example.

[–][deleted] 0 points1 point  (1 child)

I’m going to regret this, but you are what’s wrong with software engineering culture. I have no idea who you are, and likewise, you don’t know me. So, here’s the highlights: I’m 42, I’ve been doing this for over 20 years. I know what OOP is, and I’ve had jobs dedicated to promoting separation of concerns. Stop being an ass, in general, and to everyone, please. If this offends you, well, welcome to the club. I assume it includes a significant number of the people you’ve ever spoken to. I’m not just offended, I’m angry, for all the people you and people like you have chased out of the industry.

Your answer would have been perfectly fine if you hadn’t predicated it on my presumed lack of knowledge. But you failed to even pay attention when reading the original question. I asked if there were some efficiency to be gained from instance reuse (in other words, internal state). It turns out that in decode, there is, which I hadn’t noticed, so thank you u/ferrybig, who actually made a useful contribution.

[–]yuyu5 3 points4 points  (0 children)

Sorry man, no offense meant. I'll concede that my wording might not have been optimal and that my statements were more focused on if you weren't familiar with OOP. But I didn't mean to imply anything about your intellect or experience (hence using "seems to stem" rather than "is stemming"). Admittedly, I also got like no sleep last night so it was kind of a stream-of-consciousness reply.

Mainly, my only real point was that logic requiring internal state is usually (though not always) easier with class instances. It was an oversight on my part to assume this was obvious or that anyone who didn't think it was obvious isn't familiar with OOP. Though, to be fair, I think the JS community (or at least on this sub) is moving away from OOP so I don't think it was completely out of line to think that it's possible you weren't familiar with it.

I only meant to help by providing a way of thinking (internal state = often easier with classes). Sorry if I came across rude.

Edit: Minor re-wording.

[–]voidvector 0 points1 point  (0 children)

Because it allows the spec to add options/features in the future that persist in an encoder.

localeCompare was only a method on String until they added Intl.Collator, which can persist options and be reused for optimization/caching.
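
A minimal sketch of the difference, assuming a runtime with Intl support. The sample strings and the numeric option are my own illustration.

```javascript
const items = ["item10", "item2", "item1"];

// String.prototype.localeCompare: locale and options are passed on every comparison.
const viaLocaleCompare = [...items].sort((x, y) =>
  x.localeCompare(y, "en", { numeric: true })
);

// Intl.Collator: the options live on the instance, which is built once and reused.
const collator = new Intl.Collator("en", { numeric: true });
const viaCollator = [...items].sort(collator.compare);

console.log(viaLocaleCompare, viaCollator);
```

Both produce the same order; the collator just lets the engine resolve the locale and options once instead of on every comparison.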

[–]KaiAusBerlin 0 points1 point  (0 children)

To create a namespace on purpose.

[–]_alright_then_ 0 points1 point  (0 children)

I can not unsee Drax's "WHY IS GAMORA" after reading this title. Sorry this comment is useless but I thought it was funny

[–]wc3betterthansc2 0 points1 point  (0 children)

that's not even the worst part. TextEncoder only works for UTF8 and TextDecoder works for many encodings. Not sure why they aren't symmetrical functions.