It has been a while since I learned GPU knowledge. I am going to keep updating more recent materials for ramping up my GPU knowledge. FlashAttention [1] We start from recapping the standard Self-Attention mechanism, which is computed in 3-passes: Notes: The shape and represent the sequence length and internal dimension, respectively. , , , …
Monthly Archives: September 2025
Journey to Agents
We are entering the second half of AI [1]. Environments and evals are becoming as important as algorithms. In my vision, a real useful consumer AI will be a general agent supporting both GUI and bash environments with real-time voice support. In this post, I’m going to list relevant pointers for building such an agent. …