czxttkl
Author Archives: czxttkl
Information Bottleneck + RL Exploration
View LLMs as compressors + Scaling laws
TQQQ/UPRO + volatility
More details in DPO
Minimal examples of HuggingFace LLM training
Causal Inference 102
Reinforcement Learning in LLMs
Llama code anatomy
Improve reasoning for LLMs
Dollar cost average on TQQQ vs QQQ [Real Data]