Predictive Information Bottleneck

$$\max_{p(\theta \mid \mathbf{x}_P)} I(\theta; \mathbf{x}_F) \quad \text{s.t.} \quad I(\theta; \mathbf{x}_P) = I_0$$

$$\max_{p(\theta \mid \mathbf{x}_P)} I(\theta; \mathbf{x}_F) - (1 - \beta)\, I(\theta; \mathbf{x}_P)$$

$$I(\theta; \mathbf{x}_F, \mathbf{x}_P) = I(\theta; \mathbf{x}_F) + I(\theta; \mathbf{x}_P \mid \mathbf{x}_F) = I(\theta; \mathbf{x}_P) + \cancel{I(\theta; \mathbf{x}_F \mid \mathbf{x}_P)}$$
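
The chain-rule decomposition above can be checked numerically on a small discrete example. The sketch below is only an illustration: the binary variables and the probability tables `p_xp`, `p_xf_given_xp`, and `p_th_given_xp` are hypothetical, not taken from the slides. It assumes the encoder sees only the past, so that, given $\mathbf{x}_P$, $\theta$ is independent of $\mathbf{x}_F$ and the cancelled term is zero.

```python
import itertools
import math

# Hypothetical toy distributions over binary variables (assumed for illustration).
p_xp = [0.5, 0.5]                          # p(x_P)
p_xf_given_xp = [[0.9, 0.1], [0.2, 0.8]]   # p(x_F | x_P), rows indexed by x_P
p_th_given_xp = [[0.7, 0.3], [0.1, 0.9]]   # encoder p(theta | x_P), rows indexed by x_P

# Joint p(x_F, x_P, theta): theta depends on x_P only.
joint = {
    (f, p, t): p_xp[p] * p_xf_given_xp[p][f] * p_th_given_xp[p][t]
    for f, p, t in itertools.product(range(2), repeat=3)
}

F, P, T = 0, 1, 2  # positions of x_F, x_P, theta in each joint key


def marginal(joint, axes):
    """Sum the joint down to the variables at the given key positions."""
    out = {}
    for key, prob in joint.items():
        sub = tuple(key[a] for a in axes)
        out[sub] = out.get(sub, 0.0) + prob
    return out


def entropy(joint, axes):
    """Shannon entropy (in nats) of the marginal over `axes`."""
    return -sum(q * math.log(q) for q in marginal(joint, axes).values() if q > 0)


def mutual_info(joint, a, b, given=()):
    """I(A; B | C) = H(A,C) + H(B,C) - H(A,B,C) - H(C), in nats."""
    a, b, c = tuple(a), tuple(b), tuple(given)
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))


total = mutual_info(joint, [T], [F, P])
via_future = mutual_info(joint, [T], [F]) + mutual_info(joint, [T], [P], given=[F])
via_past = mutual_info(joint, [T], [P]) + mutual_info(joint, [T], [F], given=[P])
residual = mutual_info(joint, [T], [F], given=[P])  # the cancelled term

print(f"I(theta; x_F, x_P)                = {total:.6f}")
print(f"I(theta; x_F) + I(theta; x_P|x_F) = {via_future:.6f}")
print(f"I(theta; x_P) + I(theta; x_F|x_P) = {via_past:.6f}")
print(f"I(theta; x_F | x_P)               = {residual:.2e}")
```

Under these assumptions both expansions agree with the joint term, and the residual $I(\theta; \mathbf{x}_F \mid \mathbf{x}_P)$ vanishes up to floating-point error, which is why it is struck out in the identity.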
