In the past, we have tested TQQQ/UPRO on simulated data and real data. Today, I came across an interesting video about using volatility indicators to decide when to hold leveraged ETFs. Here, I am just recording its link and its main result. We may come back and add more discussion in the future. …
Monthly Archives: May 2024
More details in DPO
In this post, we dig into the details of Direct Preference Optimization (DPO) [1], a popular method used in RLHF. We start from the standard RLHF objective commonly used in the PPO literature, which is equation 3 in the DPO paper [1]. We have input prompts x and an LLM’s responses y. The objective …