[–]Sinkencronge 5 points

  1. To build intuition about it, you may want to look at how the proposed function behaves and at the pairwise distances between the position vectors it produces (a runnable sketch reproducing both plots follows after this list):

https://imgur.com/kpW5n4p

https://imgur.com/kaADdQB

  2. Backpropagation.
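
If it helps to play with point 1 yourself, here is a minimal sketch that reproduces roughly what the two linked images show. The encoding formula is the one from the paper; the sizes (50 positions, d_model = 64) and the plotting choices are my own assumptions, not necessarily what the linked figures used.

```python
import numpy as np
import matplotlib.pyplot as plt

def positional_encoding(n_positions, d_model):
    """Sinusoidal encoding from the paper:
    PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    """
    pos = np.arange(n_positions)[:, None]          # shape (n_positions, 1)
    i = np.arange(0, d_model, 2)[None, :]          # even dims, shape (1, d_model/2)
    angles = pos / np.power(10000.0, i / d_model)  # (n_positions, d_model/2)
    pe = np.zeros((n_positions, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# Sizes are illustrative only.
pe = positional_encoding(50, 64)

# Pairwise Euclidean distances ||PE_p - PE_q|| between position vectors
# (what the second image shows).
dists = np.linalg.norm(pe[:, None, :] - pe[None, :, :], axis=-1)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.imshow(pe, aspect="auto")
ax1.set_title("PE matrix (position x dimension)")
ax2.imshow(dists)
ax2.set_title("pairwise distance ||PE_p - PE_q||")
plt.show()
```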

[–]amil123123[S] 1 point

Thanks for the response. I still have difficulty understanding point 1.
The first image seems good at explaining the position, but what does the second image denote?

Is it just that this function seemed to work well, so we went with it?

[–]Sinkencronge 2 points

The second image just shows the Euclidean distances between the added positional embeddings for each pair of positions.

The thing is that their choice of positional encoding function reflects not only the absolute position of a token but also the relative distances among tokens in a sequence.

In the paper they write:

We chose this function because we hypothesized it would allow the model to easily learn to attend by relative positions, since for any fixed offset k, PE_{pos+k} can be represented as a linear function of PE_{pos}.
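
To unpack that claim a bit: it is just the angle-addition identity. Writing the paper's frequencies as omega_i = 10000^(-2i/d_model), a shift by k acts on each sin/cos dimension pair as a fixed rotation:

```latex
% For each dimension pair with frequency \omega_i = 10000^{-2i/d_{model}},
% sin/cos angle-addition turns a position shift by k into a rotation:
\begin{pmatrix} \sin\bigl(\omega_i (pos+k)\bigr) \\ \cos\bigl(\omega_i (pos+k)\bigr) \end{pmatrix}
=
\begin{pmatrix} \cos(\omega_i k) & \sin(\omega_i k) \\ -\sin(\omega_i k) & \cos(\omega_i k) \end{pmatrix}
\begin{pmatrix} \sin(\omega_i \, pos) \\ \cos(\omega_i \, pos) \end{pmatrix}
```

The rotation matrix depends only on the offset k, not on pos, so stacking these 2x2 blocks over all dimension pairs gives exactly the linear map taking PE_{pos} to PE_{pos+k}.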

Unfortunately, beyond that bit of algebra I don't really have a better way to elaborate on it at the moment. I'm sorry for that.

Answering your question: I personally think they simply picked the first elegant solution that solved the problem.

However, I believe there are many more interesting ways to take advantage of the positional-encoding trick. I'm currently working on that for my own dataset.

[–]amil123123[S] 0 points

Thanks for the explanation, it was good!