Algorithm – czxttkl

Causal Inference 102

On this blog, I have covered causal inference in several earlier posts. In “Causal Inference”, we talked about (a) two-stage regression for estimating the causal effect of X on Y even when there is a confounder between them, and (b) causal invariant prediction. In “Tools needed to build an RL debugging tool”, we talked about 3 main …

Continue reading “Causal Inference 102”

Reinforcement Learning in LLMs

In this post, we give an overview of Reinforcement Learning techniques used in LLMs and of the alternative techniques they are often compared with. PPO: The PPO-based approach is the best-known RL approach. The detailed derivation of PPO and its implementation tricks are covered thoroughly in [2]. In particular, we want to call out their recommended implementation tricks. SLiC-HF: SLiC-HF …

Continue reading “Reinforcement Learning in LLMs”

Llama code anatomy

This is my first time reading the Llama 2 code. Much of it is similar to the original transformer code, but there are also some new things, and I am documenting my findings here. Where is the Llama 2 code? The modeling (training) code is hosted here: https://github.com/facebookresearch/llama/blob/main/llama/model.py The inference code is hosted here: https://github.com/facebookresearch/llama/blob/main/llama/generation.py Annotations: There are two online annotations …

Continue reading “Llama code anatomy”

Improve reasoning for LLMs

LLMs became the hottest topic of 2023, a year in which I did not have much time to cover them. Let’s dive deep into the topic at the beginning of 2024. Prompts: Using few-shot prompts to hint to LLMs how to solve problems is the simplest way to improve their reasoning. When you first come across …

Continue reading “Improve reasoning for LLMs”

Diffusion models

Diffusion models are popular these days. This blog [1] summarizes how diffusion models compare with other generative models. Before we go into the technical details, I want to summarize my understanding of diffusion models in my own words. Diffusion models have two subprocesses: a forward process and a backward process. The forward process is non-learnable …

Continue reading “Diffusion models”

Mode collapse is real for generative models

I am very curious to see whether generative models like GANs and VAEs can fit multi-modal data. [1] gives an overview of different generative models and mentions that VAEs have a clear probabilistic objective function and are more efficient. [2] showed that diffusion models (score-based generative models) can fit multi-modal distributions better than VAE and …

Continue reading “Mode collapse is real for generative models”

Causal Inference in Recommendation Systems

We have briefly touched on some concepts of causal inference in [1, 2]. This post introduces some more specific works that apply causal inference to recommendation systems. Some of these works assume background knowledge of backdoor and frontdoor adjustments, so we will introduce those first. Backdoor and frontdoor adjustment: Suppose we have a causal graph like …

Continue reading “Causal Inference in Recommendation Systems”

GATO and related AGI research

Policy Generalist: DeepMind has recently published a work named Gato. I find it interesting because Gato learns a multi-modal, multi-task policy for many tasks, such as robot arm manipulation, playing Atari, and image captioning. I don’t think the original paper [2] includes every implementation detail, but I’ll try my best to summarize what I understand. …

Continue reading “GATO and related AGI research”

Some latest recsys papers

Seven years ago I posted a tutorial about recommendation systems. It is now 2022 and there have been many more advancements. This post gives an overview of several recent ideas. CTR models: Google’s RecSys 2022 paper [1] introduces many practical details of their CTR models. First, to reduce training cost, there are 3 effective ways: applying bottleneck layers …

Continue reading “Some latest recsys papers”

New Model Architectures

There are many advancements in new model architectures in the AI domain. Let me give an overview of them in this post. Linear Compression Embedding: LCE [1] simply uses a matrix to project one embedding matrix to another. Pyramid networks, inception network, DHEN, LCE. Perceiver and Perceiver IO: Perceiver-based architectures [5,6] solve …

Continue reading “New Model Architectures”