‹
Statistical Learning and Generalization: PAC Learning

$$ \mathbb{1}_{h(x) \neq c(x)} = \begin{cases} 1 & \text{if } h(x) \neq c(x), \\ 0 & \text{if } h(x) = c(x). \end{cases} $$

$$ R(h) = \Pr_{x \sim D}\left[ h(x) \neq c(x) \right] = \mathbb{E}_{x \sim D}\left[ \mathbb{1}_{h(x) \neq c(x)} \right] $$

$$ \hat{R}(h) = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}_{h(x^{(i)}) \neq c(x^{(i)})} $$
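The empirical error defined above can be computed directly from a labeled sample. The following is a minimal sketch, not taken from the source: the distribution D (uniform on [0, 1)), the target concept `c`, the hypothesis `h`, and all function names are hypothetical choices made only for illustration. Under these assumptions the generalization error is the probability mass of the interval where `h` and `c` disagree, so the empirical error over a large sample should be close to it.

```python
import random


def zero_one_loss(h, c, x):
    """Indicator of disagreement: 1 if h(x) != c(x), else 0."""
    return 1 if h(x) != c(x) else 0


def empirical_error(h, c, sample):
    """Average 0-1 loss of h over a finite sample labeled by c."""
    return sum(zero_one_loss(h, c, x) for x in sample) / len(sample)


# Hypothetical target concept and hypothesis: two threshold functions.
def c(x):
    return x >= 0.5


def h(x):
    return x >= 0.6


# Assumed distribution D: uniform on [0, 1). Here h and c disagree exactly
# on [0.5, 0.6), so the generalization error R(h) is 0.1 by construction.
rng = random.Random(0)
sample = [rng.random() for _ in range(10_000)]
r_hat = empirical_error(h, c, sample)
print(f"empirical error on N={len(sample)} points: {r_hat:.4f}")
```

With a fixed hypothesis and an i.i.d. sample, the empirical error is an unbiased estimate of the generalization error, which is why the printed value concentrates around 0.1 as N grows.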

(Example of Inductive Bias: Preference of Hypothesis Sets with Low Complexity (Regularization)) (Perceptron)

(Perceptron) (Multilayer Perceptron and Representation Learning) (Deep Learning: from Late 60's to Today) (Early Stopping)

(Deep Learning: from Late 60's to Today) Fig. 2 Fig. 3 (Experiments with fully labeled data) (Multi-task Learning) (Transfer Learning) (Parameter Sharing: Particular Case of Convolutional Networks and Auto-encoders) (Dropout)

(Transfer Learning) (Datasets) Fig. 1 Linear adaptation of the importance weights during training. (General training setup) (Experiments with fully labeled data) Fig. 4 Fig. 5 (Experiments with fully labeled data) (Learning the TL-CNN) (Implementation and Optimization Details) (On Learning Invariance within Neural Networks) Fig. 6 Finding the L3 slice within a whole CT scan. (Prologue) (MIP Transformation)

(Decision Process using a Sliding Window over the MIP Images) (Backpropagation, Computing the Derivatives, and Issues) (Terminology) (Bias-variance Tradeoff)

(Backpropagation, Computing the Derivatives, and Issues) (Nonlinear Activation Functions)

Fig. 7 Rectified linear unit function. (Nonlinear Activation Functions) (Universal Approximation Properties and Depth) (Experimenting in Deep Learning) (Prologue) (Deep Neural Networks Approaches)

›