Introduction to Machine Learning

1 Introduction

What is machine learning?

"Learning denotes changes in a system that enable a system to do the same task more efficiently the next time." – Herbert Simon

"Learning is constructing or modifying representations of what is being experienced." – Ryszard S. Michalski

Learning = improving with experience at some task – Tom M. Mitchell

T (Task), E (Experience), P (Performance)

Hint

Learning: change / construct or modify / improve

How do we design a machine learning system?

Training Experience

What experience?
Watch out for training data bias, which can come from the data, the training procedure, or the features.
What exactly should be learned?
How shall it be represented?
What specific algorithm to learn it?

2 Preliminaries: Basic Concepts

Given:
Instance Space \(\color{green}{X}\): the set of all possible instances
Hypothesis Class \(\color{green}{H}\): the set of candidate hypotheses
Training Examples \(\color{green}{D}\): positive and negative examples of the Target Function \(\color{green}{c}\), i.e. \(\langle x_1, c(x_1)\rangle,\ \ldots ,\ \langle x_m, c(x_m)\rangle\)

Determine:
A hypothesis \(h \in H\) such that \(\color{green}{h(x) = c(x) ,\ \forall x \in X}\); that is, find a hypothesis whose output matches the true output for every \(x \in X\).

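In general, \(X\) is infinite or exponentially large, while the learner only sees the finite sample \(D\), so there is usually no way to guarantee that \(h(x) = c(x)\) for every \(x \in X\). Instead, we pick a good approximation, e.g. a hypothesis with \(h(x) = c(x) ,\ \forall x \in D\).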
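
The gap between agreeing with \(c\) on \(D\) and agreeing with \(c\) on all of \(X\) can be seen in a small sketch. Everything concrete here is an assumption made for illustration and is not part of the notes: the instance space is taken to be all 3-bit vectors, the target function `c` is "the first two bits are both 1", and the hypothesis class `H` is the set of conjunctions that require some subset of the bits to be 1.

```python
from itertools import combinations, product

# Instance space X (assumed): all binary vectors of length 3, so |X| = 8.
X = list(product([0, 1], repeat=3))

# Target function c (assumed): positive iff bit 0 and bit 1 are both 1.
def c(x):
    return x[0] == 1 and x[1] == 1

# Hypothesis class H (assumed): conjunctions over a subset of the bits.
# A hypothesis is keyed by the tuple of bit indices that must equal 1.
def make_hypothesis(required):
    return lambda x: all(x[i] == 1 for i in required)

H = {
    required: make_hypothesis(required)
    for size in range(4)
    for required in combinations(range(3), size)
}

# Training examples D: a small labeled sample <x, c(x)> drawn from X.
D = [(x, c(x)) for x in [(1, 1, 1), (0, 1, 1)]]

def consistent(h, examples):
    """True if h agrees with the label on every example."""
    return all(h(x) == label for x, label in examples)

# Hypotheses that match c on the training set D only.
consistent_with_D = sorted(key for key, h in H.items() if consistent(h, D))

# Hypotheses that match c on the whole instance space X.
full_labels = [(x, c(x)) for x in X]
exact = [key for key in consistent_with_D if consistent(H[key], full_labels)]

print("consistent with D:", consistent_with_D)
print("equal to c on all of X:", exact)
```

With this particular sample, several hypotheses fit \(D\) but only one of them equals \(c\) on every instance, which is why consistency on \(D\) is treated as an approximation and not as a guarantee.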