Algorithm – czxttkl

LLM Long Context

In this post, let’s look at how modern LLMs encode positional information. We start from the most famous paper in this domain [1] and dive into some key details.

Why we need positional encoding

LLMs need positional encodings to distinguish between the different semantic meanings of the same word. We use the motivating example from [2]: The two …
