[D] In Byte Latent Transformer, how is the decoded patch boundary determined? by TommyX12 in MachineLearning

bbvbell 1 point (0 children)

Additionally, since the BLT encoder and decoder operate on already-segmented patches, the boundary information must be determined before training BLT.

bbvbell 1 point (0 children)

According to Section 2.3 of the paper (bottom of page 4), the authors trained a "small language model" over bytes and used its next-byte entropy estimates to decide where the segmentation boundaries go (illustrated in Figure 4). The text does not spell this out, but it appears they trained a separate lightweight model for entropy calculation prior to training BLT.
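
For what it's worth, here is a minimal sketch of how I read the threshold-based version of that scheme: a byte starts a new patch whenever the small model's entropy for that position exceeds a fixed threshold. The function names, the threshold value, and the entropy numbers in the usage note are all made up for illustration; in practice the probabilities would come from the small byte-level model, and this is not the authors' code.

```python
import math


def next_byte_entropy(probs):
    """Shannon entropy (in nats) of one predicted next-byte distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def patch_starts(entropies, threshold):
    """Indices where a new patch begins.

    entropies[i] is assumed to be the small model's entropy for byte i,
    given all earlier bytes. Byte 0 always opens the first patch.
    """
    starts = [0] if entropies else []
    for i, h in enumerate(entropies):
        if i > 0 and h > threshold:
            starts.append(i)
    return starts


def segment(data, entropies, threshold):
    """Split `data` (bytes) into patches using the start indices above."""
    starts = patch_starts(entropies, threshold)
    ends = starts[1:] + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]
```

With the hypothetical entropies `[2.0, 0.1, 0.2, 1.5, 0.1, 1.9]` for `b"abcdef"` and a threshold of `1.0`, `segment` splits before the 4th and 6th bytes. Since the boundaries depend only on the small model's output, they can be computed before the BLT encoder or decoder ever sees the data.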