Introduction to Machine Learning

  These notes are mainly based on the Tsinghua University course Introduction to Machine Learning (机器学习概论).


1 Introduction

What is machine learning?

"Learning denotes changes in a system that enable a system to do the same task more efficiently the next time." – Herbert Simon

"Learning is constructing or modifying representations of what is being experienced." – Ryszard S. Michalski

Learning = improving with experience at some task – Tom M. Mitchell

T (Task)    E (Experience)    P (Performance)

Tip

Learning: change / construct or modify / improve

How do we design a machine learning system?

Training Experience

  • What experience?
    Watch out for training data bias: it can come from the data, the training procedure, or the features.
  • What exactly should be learned?
  • How shall it be represented?
  • What specific algorithm to learn it?

2 Preliminaries: Basic Concepts

Given:
  Instance Space \(\color{green}{X}\): the set of all possible instances
  Hypothesis Class \(\color{green}{H}\): the set of candidate hypotheses
  Training Examples \(\color{green}{D}\): positive and negative examples of the Target Function \(\color{green}{c}\): \(\langle x_1, c(x_1)\rangle,\ \ldots,\ \langle x_m, c(x_m)\rangle\)

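Determine:
  A hypothesis \(h \in H\) such that \(\color{green}{h(x) = c(x) ,\ \forall x \in X}\), i.e., find a hypothesis whose output matches the true output for every \(x \in X\).

  In general, \(X\) is infinite or exponentially large, so we usually cannot guarantee \(h(x) = c(x)\) for every \(x \in X\). Instead, we settle for a good approximation, e.g. \(h(x) = c(x) ,\ \forall x \in D\).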
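
The gap between agreeing with \(c\) on \(D\) and agreeing with \(c\) on all of \(X\) can be seen in a small sketch. Everything concrete here is an assumption made for illustration and is not from the course: the instance space is the set of 3-bit Boolean vectors, the target function `c` is "the first bit is 1", the hypothesis class `H` contains only single-bit tests, and `D` is a hand-picked sample of two examples.

```python
from itertools import product

# Instance space X: all 3-bit Boolean vectors (a toy, hypothetical choice).
X = list(product([0, 1], repeat=3))

# Target function c (hypothetical): an instance is positive iff its first bit is 1.
def c(x):
    return x[0] == 1

# Hypothesis class H (hypothetical): "bit i equals v" for each position i and value v.
H = {
    f"x[{i}] == {v}": (lambda x, i=i, v=v: x[i] == v)
    for i in range(3)
    for v in (0, 1)
}

# Training examples D: labeled pairs <x, c(x)> for a small sample of X.
D = [(x, c(x)) for x in [(1, 1, 0), (0, 0, 1)]]

def consistent(h, examples):
    """Return True if h agrees with the label on every example."""
    return all(h(x) == label for x, label in examples)

# Hypotheses that match c on the training examples only.
consistent_on_D = [name for name, h in H.items() if consistent(h, D)]

# Hypotheses that match c on the whole instance space.
all_labeled = [(x, c(x)) for x in X]
consistent_on_X = [name for name, h in H.items() if consistent(h, all_labeled)]

print("consistent on D:", consistent_on_D)
print("consistent on X:", consistent_on_X)
```

With this particular `D`, three hypotheses fit the training examples, but only one of them matches `c` everywhere in `X`. Fitting `D` is therefore only an approximation of the real goal, and which hypothesis is picked among the consistent ones still matters.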