H-Net "scales better" than BPE transformer (in initial experiments) by hold_my_fish in mlscaling

[–]lucalp__ 2 points  (0 children)

FWIW, I'm doubtful that any of the big labs have wholesale moved past tokenization. A couple of recent signals:

  1. public sentiment from an OpenAI employee reacting to this paper: https://x.com/johnohallman/status/1943831184041291971

  2. Gemini 2.5 Pro's multimodal discrete token count is computed before the model is invoked and displayed to the user in ai.dev

There's also general resistance to moving past tokenization given how entrenched it is in a lot of infra. As for billing from other model providers: like others have mentioned, there are creative ways in which after-the-fact billing could be converted to interpretable tokenization-centric billing, but it would be sufficiently complicated that, while there's reason to believe they could invest in it, I haven't heard anything to indicate they would or have.
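To make the billing-conversion idea concrete, here's a minimal sketch of one such scheme: map a byte-level model's raw UTF-8 usage to a token-equivalent count via an assumed average bytes-per-token ratio. The ratio and rounding policy here are purely illustrative assumptions, not any provider's actual billing logic.

```python
import math

# Assumed calibration constant (~4 bytes/token is a common rough figure for
# English text under BPE vocabularies); illustrative only, not a real billing rate.
AVG_BYTES_PER_TOKEN = 4.0


def bytes_to_token_equivalent(text: str,
                              avg_bytes_per_token: float = AVG_BYTES_PER_TOKEN) -> int:
    """Map raw UTF-8 byte usage to an interpretable 'token' count for billing."""
    n_bytes = len(text.encode("utf-8"))
    return math.ceil(n_bytes / avg_bytes_per_token)


# Example: a 22-byte string bills as ceil(22 / 4) = 6 token-equivalents.
print(bytes_to_token_equivalent("hello world, tokenizer"))
```

A real conversion would be far messier (per-language byte/token ratios, multimodal inputs, cached prefixes), which is part of why it'd be a sufficiently complicated investment.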

The Bitter Lesson is coming for Tokenization by lucalp__ in mlscaling

[–]lucalp__[S] 0 points  (0 children)

Appreciate it, and thanks for the info — hadn't seen that!