Model

$$p'_t = \mathrm{vec}(\cdot)$$

$$p_t = \mathrm{ReLU}\left(W_2^P \left[\, p'_t \,;\, \mathrm{ReLU}\left(W_1^P p'_t + b_1^P\right) \right] + b_2^P\right)$$
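
As a rough illustration of the layer above, here is a minimal pure-Python sketch: a hidden ReLU layer is applied to the flattened input, its output is concatenated with that same input, and a second ReLU layer maps the concatenation to p_t. The function names (`relu`, `affine`, `feature_layer`) and all toy weights are assumptions made for this example, not values from the source.

```python
def relu(v):
    """Element-wise max(0, x) over a list of floats."""
    return [max(0.0, x) for x in v]


def affine(W, x, b):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def feature_layer(p_prime, W1, b1, W2, b2):
    """p_t = ReLU(W2 [p' ; ReLU(W1 p' + b1)] + b2)."""
    hidden = relu(affine(W1, p_prime, b1))
    concat = list(p_prime) + hidden  # skip connection: raw input stacked on hidden units
    return relu(affine(W2, concat, b2))


# Toy example (hypothetical numbers): 2 inputs, 3 hidden units, 2 outputs.
p_prime = [1.0, -2.0]
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b1 = [0.0, 0.0, 0.5]
W2 = [[1.0, 0.0, 2.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0, 1.0]]
b2 = [0.5, 0.0]

p_t = feature_layer(p_prime, W1, b1, W2, b2)
print(p_t)
```

Note that W2 must have as many columns as the input and hidden sizes combined (here 2 + 3 = 5), because it acts on the concatenated vector.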

Model

$$h_t = \mathrm{RNN}^A(\cdot)$$

$$p(a) \propto \exp\left(W^A h_t + b^A\right)$$
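
The proportionality above becomes a proper distribution once it is normalized, which is a softmax. The sketch below assumes the logits are an affine function of the recurrent state h_t; the RNN itself is not shown because its inputs are not given here. The function name `action_distribution` and the toy numbers are hypothetical.

```python
import math


def action_distribution(h, W, b):
    """Return p(a) proportional to exp(W h + b), normalized over actions."""
    logits = [sum(w * x for w, x in zip(row, h)) + bi for row, bi in zip(W, b)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example (hypothetical numbers): 2-dimensional state, 3 actions.
h_t = [2.0, 0.0]
W_A = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b_A = [0.0, 0.0, 0.0]

probs = action_distribution(h_t, W_A, b_A)
print(probs)
```

Subtracting the largest logit before exponentiating leaves the result unchanged, since the common factor cancels in the normalization, and avoids overflow for large logits.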

Learning

$$L_1(\theta_1) = \lambda_V L_V(\theta_1) + \lambda_G L_G(\theta_1) + \lambda_A L_A(\theta_1) + \lambda_P L_P(\theta_1) + \lambda_{G'} L_{G'}(\theta_1)$$
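
The objective is a weighted sum of five loss terms, so combining them is a one-line reduction once each term has been evaluated. The following is a minimal sketch under that reading; the dictionary keys, the `total_loss` helper, and every numeric value are placeholders chosen for illustration.

```python
def total_loss(losses, weights):
    """Weighted sum of per-term losses; both dicts must share the same keys."""
    mismatched = set(losses) ^ set(weights)
    if mismatched:
        raise KeyError(f"terms without a matching loss/weight: {sorted(mismatched)}")
    return sum(weights[name] * losses[name] for name in losses)


# Hypothetical per-term loss values and lambda weights.
losses = {"V": 0.5, "G": 1.0, "A": 2.0, "P": 0.25, "G_prime": 4.0}
weights = {"V": 1.0, "G": 0.5, "A": 1.0, "P": 2.0, "G_prime": 0.25}

L1 = total_loss(losses, weights)
print(L1)
```

Keeping the weights in a separate mapping makes it easy to switch a term off by setting its lambda to zero without touching the code that computes the individual losses.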
