Natural Gradient Boosting

$$E_{y \sim Q}\left[-\log Q(y)\right] \;\le\; E_{y \sim Q}\left[-\log P(y)\right] \quad \forall\, P, Q$$

The log score is a proper scoring rule: its expectation under $Q$ is minimized when the forecast distribution is $Q$ itself (Gibbs' inequality).
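The inequality can be checked by Monte Carlo. A minimal sketch, assuming the hypothetical choices $Q = N(0,1)$ and $P = N(1,1)$ (any two distinct distributions would do):

```python
import math
import random

random.seed(0)

# Hypothetical example: Q = N(0, 1), P = N(1, 1).
def log_q(y):
    return -0.5 * y * y - 0.5 * math.log(2 * math.pi)

def log_p(y):
    return -0.5 * (y - 1) ** 2 - 0.5 * math.log(2 * math.pi)

N = 100_000
samples = [random.gauss(0, 1) for _ in range(N)]  # y ~ Q

entropy_term = -sum(log_q(y) for y in samples) / N   # E_{y~Q}[-log Q(y)]
cross_entropy = -sum(log_p(y) for y in samples) / N  # E_{y~Q}[-log P(y)]

print(entropy_term, cross_entropy)  # the first should be smaller
```

The gap between the two estimates is exactly the Monte Carlo estimate of $D_{\mathrm{KL}}(Q \,\|\, P)$ derived below.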


from statistics import mean

def expectation(l, N=10000):
    # Monte Carlo estimate: average N independent draws of l()
    return mean(l() for _ in range(N))

The divergence induced by the log score $L(P, y) = -\log P(y)$ is the KL divergence:

$$D_L(Q \,\|\, P) = E_{y \sim Q}\left[L(P, y)\right] - E_{y \sim Q}\left[L(Q, y)\right] = E_{y \sim Q}\left[\log \frac{Q(y)}{P(y)}\right] =: D_{\mathrm{KL}}(Q \,\|\, P)$$

In code: calling `expectation` on a function that draws `y = sample(Q)` and returns `log(Q(y) / P(y))` gives a Monte Carlo estimate of $D_{\mathrm{KL}}(Q \,\|\, P)$.
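The Monte Carlo recipe runs directly. A sketch assuming $Q = N(0,1)$ and $P = N(1,1)$, hypothetical distributions whose closed-form KL divergence is $\frac{(\mu_Q - \mu_P)^2}{2\sigma^2} = 0.5$:

```python
import math
import random
from statistics import mean

random.seed(0)

def expectation(l, N=10_000):
    # Monte Carlo estimate: average N independent draws of l()
    return mean(l() for _ in range(N))

# Hypothetical example: Q = N(0, 1), P = N(1, 1); true KL is 0.5.
def log_density(y, mu):
    return -0.5 * (y - mu) ** 2 - 0.5 * math.log(2 * math.pi)

def log_ratio():
    y = random.gauss(0, 1)                        # y ~ Q
    return log_density(y, 0) - log_density(y, 1)  # log(Q(y)/P(y))

kl_estimate = expectation(log_ratio)
print(kl_estimate)  # close to 0.5, up to Monte Carlo noise
```

With `N=10_000` samples the standard error here is about 0.01, so the estimate lands near the analytic value of 0.5.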


The divergence induced by CRPS $C(P, y)$ is, by the same construction, the squared $L^2$ distance between CDFs (the Cramér distance):

$$D_C(Q \,\|\, P) = E_{y \sim Q}\left[C(P, y)\right] - E_{y \sim Q}\left[C(Q, y)\right] = \int_{-\infty}^{\infty} \left(F_Q(z) - F_P(z)\right)^2 \, dz =: D_{L^2}(Q \,\|\, P)$$

where $F_Q$ and $F_P$ are the CDFs of $Q$ and $P$.
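The $L^2$ divergence can be evaluated by numerically integrating the squared gap between the two CDFs. A sketch, again assuming the hypothetical pair $Q = N(0,1)$ and $P = N(1,1)$:

```python
import math

def Phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def F_Q(z):
    return Phi(z)        # Q = N(0, 1)

def F_P(z):
    return Phi(z - 1)    # P = N(1, 1) — hypothetical example pair

# Riemann-sum approximation of the integral of (F_Q(z) - F_P(z))^2
# over a grid wide enough that the tails contribute nothing.
dz = 0.001
d_l2 = sum((F_Q(z) - F_P(z)) ** 2 * dz
           for z in (i * dz for i in range(-10_000, 10_000)))
print(d_l2)  # about 0.27 for this pair
```

Unlike the KL estimate, this quantity stays finite and well-behaved even when $P$ and $Q$ have disjoint support, which is one motivation for CRPS-based boosting.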
