Information Bottleneck + RL Exploration

In this post, we are going to discuss a good idea from 2017, the deep variational information bottleneck [2]. Then, we will see how the idea can be applied to exploration in meta-RL [1].

Mutual Information

We will warm up by revisiting a classic concept from information theory: mutual information [3]. The mutual information I(X;Y) measures the amount of information obtained about one random variable by observing the other:

I(X;Y) = \int\int dx\, dy\, p(x,y)\log\frac{p(x,y)}{p(x)p(y)} = H(X) - H(X|Y) = H(Y) - H(Y|X).

The entropy forms on the right follow directly from the definition [3]:
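\begin{aligned}
I(X;Y) &= \int\int dx\, dy\, p(x,y)\log\frac{p(x|y)}{p(x)} \\
&= -\int dx\, p(x)\log p(x) + \int\int dx\, dy\, p(x,y)\log p(x|y) \\
&= H(X) - H(X|Y),
\end{aligned}

and H(Y) - H(Y|X) follows by the symmetry of I(X;Y) in X and Y. Intuitively, mutual information is the reduction in uncertainty about one variable once the other is observed.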

Stochastic Variational Inference and VAE

In a latent-variable model p(x) = \int p(x|z)p(z)dz, the exact posterior p(z|x) is usually intractable. Stochastic variational inference [4] sidesteps this by introducing an approximate posterior q(z|x) and maximizing the evidence lower bound (ELBO) on the data log-likelihood: \log p(x) \geq \mathbb{E}_{q(z|x)}[\log p(x|z)] - KL(q(z|x)\|p(z)). A variational autoencoder (VAE) implements both q(z|x) (the encoder) and p(x|z) (the decoder) as neural networks and maximizes the ELBO by stochastic gradient ascent, using the reparameterization trick to backpropagate through samples of z. See [4] for a full derivation.
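To make this concrete, here is a minimal PyTorch sketch of the negative ELBO as a training loss, assuming a Gaussian encoder, a standard normal prior, and a Bernoulli decoder (the layer sizes and names are illustrative, not taken from [4]):

import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=400, z_dim=20):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(h_dim, z_dim)  # log-variance of q(z|x)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))  # logits of p(x|z)

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
        # so gradients flow through mu and logvar
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def neg_elbo(x, x_logits, mu, logvar):
    # -E_{q(z|x)}[log p(x|z)]: one-sample Monte Carlo estimate (Bernoulli decoder)
    recon = F.binary_cross_entropy_with_logits(x_logits, x, reduction='sum')
    # KL( N(mu, sigma^2) || N(0, I) ), available in closed form for Gaussians
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

The negative ELBO decomposes into a reconstruction term plus a KL regularizer. The information bottleneck objective below has exactly the same two-term structure, which is why the same machinery carries over.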

Information Bottleneck

The information bottleneck principle makes the notion of "a good representation" precise: given an input X and a target Y, we want a stochastic encoding Z of X that is maximally informative about Y while compressing away everything else about X. Formally, we maximize I(Z;Y) - \beta I(Z;X), where \beta > 0 trades off prediction against compression. Both mutual information terms are intractable for deep encoders, so [2] bounds them variationally: a decoder q(y|z) yields a lower bound on I(Z;Y), and an approximate marginal r(z) yields an upper bound I(Z;X) \leq \mathbb{E}_x[KL(p(z|x)\|r(z))]. Combining the two gives the deep variational information bottleneck (VIB) loss to minimize:

J = \mathbb{E}_{x,y}\left[\mathbb{E}_{z\sim p(z|x)}[-\log q(y|z)] + \beta\, KL(p(z|x)\|r(z))\right].

Note that this is exactly a VAE-style loss with the reconstruction term replaced by a prediction term, so it can be trained with the same reparameterization trick.
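Concretely, the VIB loss is only a few lines on top of the VAE sketch above. Below is a minimal sketch for classification with a Gaussian encoder and r(z) = N(0, I); the layer sizes and \beta roughly follow the MNIST setup in [2], but treat them as illustrative:

import torch
import torch.nn as nn
import torch.nn.functional as F

class VIB(nn.Module):
    def __init__(self, x_dim=784, z_dim=256, n_classes=10, beta=1e-3):
        super().__init__()
        self.beta = beta
        # stochastic encoder p(z|x): outputs mean and log-variance of a Gaussian
        self.enc = nn.Sequential(nn.Linear(x_dim, 1024), nn.ReLU(),
                                 nn.Linear(1024, 2 * z_dim))
        # variational decoder q(y|z), here a linear classifier
        self.dec = nn.Linear(z_dim, n_classes)

    def loss(self, x, y):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        # reparameterized sample z ~ p(z|x)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # -E_{p(z|x)}[log q(y|z)]: the prediction term (bounds -I(Z;Y) up to a constant)
        pred = F.cross_entropy(self.dec(z), y)
        # KL(p(z|x) || r(z)) with r(z) = N(0, I): upper-bounds I(Z;X)
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
        return pred + self.beta * kl

With \beta = 0 this is just a stochastic classifier; increasing \beta squeezes more information about X out of Z, which [2] shows improves generalization and robustness to adversarial inputs.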

This compression view is what [1] brings to meta-RL exploration: a task encoding z is learned end-to-end with the exploitation policy under an information-bottleneck penalty on the task information retained in z, so that z keeps only what is actually needed to solve the task. A separate exploration policy is then trained to collect trajectories that are maximally informative about z, i.e., to maximize the mutual information between its trajectories and the task encoding. Decoupling the two objectives this way avoids the chicken-and-egg problem of learning exploration and exploitation jointly, without sacrificing optimality.

References

[1] Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices: https://arxiv.org/pdf/2008.02790

[2] Deep Variational Information Bottleneck: https://arxiv.org/pdf/1612.00410

[3] Mutual information (Wikipedia): https://en.wikipedia.org/wiki/Mutual_information

[4] Stochastic Variational Inference: https://czxttkl.com/2019/05/04/stochastic-variational-inference/

 
