ื”ืจืฆืื” 4 - ืกื™ื•ื•ื’ ื“ื™ืกืงืจื™ืžื™ื ื˜ื™ื‘ื™

ืžื” ื ืœืžื“ ื”ื™ื•ื

ื‘ืขื™ื•ืช ืกื™ื•ื•ื’

ื‘ืขื™ื•ืช ืกื™ื•ื•ื’ ื”ื ื‘ืขื™ื•ืช supervised learning ืฉื‘ื”ื ื”ืžืฉืชื ื” ื”ืืงืจืื™ ืฉืื•ืชื• ืžื ืกื™ื ืœื—ื–ื•ืช y\text{y} ื”ื•ื ืžืฉืชื ื” ืืงืจืื™ ื‘ื“ื™ื“ ืืฉืจ ื™ื›ื•ืœ ืœืงื‘ืœ ืกื˜ ืขืจื›ื™ื ืกื•ืคื™ (ืœืจื•ื‘ ืงื˜ืŸ).

ื“ื•ื’ืžืื•ืช ืœื‘ืขื™ื•ืช ืกื™ื•ื•ื’:

  • ืžืขืจื›ืช ืœื”ืชืจืื” ืขืœ ืžื›ืฉื•ืœื™ื ื‘ื›ื‘ื™ืฉ (ื”ื•ืœืš ืจื’ืœ, ืงืจื‘ื” ืœืจื›ื‘ ืฉืžืœืคื ื™ื ื•ื›ื•') ืžืชื•ืš ืชืžื•ื ื•ืช ืžืžืฆืœืžืช ื“ืจืš.
  • ืžืขืจื›ืช ืœืกื™ื ื•ืŸ ื“ื•ืืจ ื–ื‘ืœ.
  • ืžืขืจื›ืช ืœื–ื™ื”ื•ื™ ื›ืชื‘ ื™ื“ ื‘ืชืžื•ื ื”.
  • ืžืขืจื›ื•ืช speach-to-text ืืฉืจ ื”ื•ืคื›ื•ืช ืงื˜ืข ืื•ื“ื™ื• ืœืžื™ืœื™ื.
  • ืžืขืจื›ืช ืœื–ื™ื”ื•ื™ ืคื ื™ื ื‘ืชืžื•ื ื•ืช.

ื”ื”ื‘ื“ืœ ื‘ืื•ืคื™ ืฉืœ ื”ืžืฉืชื ื” ืฉืื•ืชื• ืžืกื ื™ื ืœื—ื–ื•ืช ื‘ื™ืŸ ื‘ืขื™ื•ืช ืจื’ืจืกื™ื” ืœื‘ืขื™ื•ืช ืกื™ื•ื•ื’, ื”ื•ื ืœืžืขืฉื” ืžืื“ ืžืฉืžืขื•ืชื™ ืžืฉืคื™ืข ื‘ืื•ืคืŸ ื ื™ื›ืจ ืขืœ ื”ืžื•ื“ืœื™ื ื•ื”ืฉื™ื˜ื•ืช ืฉื‘ื”ื ื ืฉืชืžืฉ ืขืœ ืžื ืช ืœืคืชื•ืจ ื‘ืขื™ื•ืช ืกื™ื•ื•ื’. ืฉื ื™ ื”ืกื™ื‘ื•ืช ื”ืขื™ืงืจื™ื•ืช ืœืฉื•ื ื™ ื”ื™ื ื:

  • ื‘ื‘ืขื™ื•ืช ืกื™ื•ื•ื’ ืื ื• ื ื—ืคืฉ ื—ื–ืื™ ืืฉืจ ืžื™ื™ืฆืจ ืขืจื›ื™ื ื‘ื“ื™ื“ื™ื (ื‘ื“ื•ืžื” ืœy\text{y}) ื•ืœื›ืŸ ืœื ื ื•ื›ืœ ืœื”ืฉืชืžืฉ ื‘ืžื•ื“ืœื™ื ืจืฆื™ืคื™ื, ื›ื’ื•ืŸ ืคื•ืœื™ื ื•ืžื™ื, ื›ืคื™ ืฉื”ื. ื‘ื ื•ืกืฃ, ื‘ืคื•ื ืงืฆื™ื•ืช ืืฉืจ ืžื•ืฆื™ืื•ืช ืขืจื›ื™ื ื‘ื“ื™ื“ื™ื ืื™ืŸ ื˜ืขื ืœื“ื‘ืจ ืขืœ ื”ื ื’ื–ืจืช (ื”ื™ื ืงื‘ื•ืขื” ื‘ื›ืœ ืžืงื•ื ืžืœื‘ื“ ื ืงื•ื“ื•ืช ืื™ ื”ืจืฆื™ืคื•ืช ืฉื‘ื”ื ื”ื™ื ืขื•ื‘ืจืช ืžืขืจืš ืื—ื“ ืœืื—ืจ). ืœื›ืŸ ืœื ื ื•ื›ืœ ืœื—ืคืฉ ืืช ื”ื—ื–ืื™ ื”ืื•ืคื˜ื™ืžืืœื™ ืขืœ ื™ื“ื™ ื’ื–ื™ืจื” ื•ื”ืฉื•ื•ืื” ืœ-0 ืื• ื‘ืขื–ืจืช ืฉื™ื˜ื•ืช ื›ืžื• gradient descent.
  • ื‘ื‘ืขื™ื•ืช ืกื™ื•ื•ื’ ืœืจื•ื‘ ืœื ืชื”ื™ื” ืžืฉืžืขื•ืช ืœืžืจื—ืง ื‘ื™ืŸ ื”ื—ื™ื–ื•ื™ y^\hat{y} ืœืขืจืš ื”ืื™ืžื™ืชื™ ืฉืœ yy. ืœื“ื•ื’ืžื, ื‘ื ื™ืกื™ื•ืŸ ืœื–ื”ื•ืช ืืช ื”ืื•ืช gg, ื—ื™ื–ื•ื™ ืฉืœ ื”ืื•ืช ff (ืืฉืจ ืžื•ืคื™ืขื” ื‘ืฆืžื•ื“ ืœgg ื‘ืืœืฃ ื‘ื™ืช) ื”ื•ื ืœื ื‘ื”ืจื›ื— ื—ื™ื–ื•ื™ ื˜ื•ื‘ ื™ื•ืชืจ ืž qq (ืืฉืจ ื ืžืฆื ืจื—ื•ืง ื™ื•ืชืจ). ืขืงื‘ ื›ืš ืคื•ื ืงืฆื•ืช ืžื—ื™ืจ ื›ืžื• MSE ืืฉืจ ืžืชื™ื™ื—ืกื•ืช ืœื’ื•ื“ืœ ืฉืœ ืฉื’ื™ืืช ื”ื—ื™ื–ื•ื™ ื™ื”ื™ื• ืœื ืจืœื•ื•ื ื˜ื™ื•ืช ื•ื ืืœืฅ ืœื”ืฉืชืžืฉ ื‘ืคื•ื ืงืฆื™ื•ืช ืžื—ื™ืจ ืื—ืจื•ืช ื›ื’ื•ืŸ misclassification rate (ืฉืื•ืชื” ื ื–ื›ื™ืจ ืžื™ื“).

ื‘ื”ืจืฆืื” ื”ืงืจื•ื‘ื” ื ื›ื™ืจ ืฉืชื™ ืฉื™ื˜ื•ืช ื“ื™ืกืงืจื™ืžื™ื ื˜ื™ื‘ื™ื•ืช ืœืคืชืจื•ืŸ ื‘ืขื™ื•ืช ืกื™ื•ื•ื’.

ื”ื’ื™ืฉื” ื”ื“ื™ืกืงืจื™ืžื™ื ื˜ื™ื‘ื™ืช - ืชื–ื›ื•ืจืช: ืชื—ืช ื”ื’ื™ืฉื” ื”ื“ื™ืกืงืจื™ืžื™ื ื˜ื™ื‘ื™ืช ืื ื• ืžื ืกื™ื ืœื‘ื ื•ืช ื—ื–ืื™ ื‘ืขืœ ื‘ื™ืฆื•ืขื™ื™ื ื˜ื•ื‘ื™ื ื›ื›ืœ ื”ืืคืฉืจ ืขืœ ื”ืžื“ื’ื. ื›ื“ื™ ืœื ืกื•ืช ื•ืœื”ื’ื‘ื™ืœ ืืช ืžื™ื“ืช ื” overfitting ืื ื• ื ืฉื™ื ืžื’ื‘ืœื•ืช ืขืœ ืคื•ื ืงืฆื™ื™ืช ื”ื—ื™ื–ื•ื™, ื›ืคื™ ืฉืจืื™ื ื• ื‘ื”ืจืฆืื” ื”ืื—ืจื•ื ื”.

ื‘ืขื™ื” ืœื“ื•ื’ืžื

ื‘ื—ื‘ืจื•ืช ื›ืจื˜ื™ืกื™ ืืฉืจืื™ ืžืฉืชืžืฉื™ื ื‘ืžืขืจื›ื•ืช ืื•ื˜ื•ืžื˜ื™ืช ืฉืื•ืžืจื•ืช ืœื”ืชืจื™ืข ืขืœ ืขืกืงื•ืื•ืช ื—ืฉื•ื“ื•ืช ืฉืขืฉื•ื™ื•ืช ืœื”ื™ื•ืช ื”ื•ื ืื•ืช ืืฉืจืื™, ืœื“ื•ื’ืžื ื›ืืฉืจ ืžื‘ืฆืข ื”ืขื™ืกืงื” ืžืฉืชืžืฉ ื‘ื›ืจื˜ื™ืก ื”ืืฉืจืื™ ื’ื ื•ื‘. ืžืขืจื›ื•ืช ืืœื• ืžื ืกื•ืช ืœื‘ืฆืข ืกื™ื•ื•ื’ ืฉืœ ื”ืขื™ืกืงื” ืœืขื™ืกืงื” ืœื’ื™ื˜ื™ืžื™ืช ืื• ื—ืฉื•ื“ื” ืขืœ ืกืžืš ืคืจื˜ื™ ื”ืขื™ืกืงื”, ื›ื’ื•ืŸ:

  • ื”ืกื›ื•ื.
  • ืžืจื—ืง ื”ืขื™ืกืงื” (ื ื ื™ื— ื”ืžื™ืงื•ื ืฉืœ ื”ื—ื ื•ืช) ืžื”ืขื™ืกืงื” ื”ืื—ืจื•ื ื”.
  • ืžืจื—ืง ื”ืขื™ืกืงื” ืžื”ื›ืชื•ื‘ืช ืฉืœ ื”ืœืงื•ื—.
  • ื”ืฉืขื” ื‘ื™ื•ื.
  • ืื•ืคื™ ื”ืžื•ืฆืจื™ื ืฉื”ื—ื ื•ืช ืžื•ื›ืจืช (ืžื›ื•ืœืช, ืžื•ืฆืจื™ ื—ืฉืžืœ, ื‘ื™ื’ื•ื“, ืจื›ื‘, ื ื“ืœ"ืŸ, ื•ื›ื•')

ื“ืจืš ืื—ืช ืœื‘ื ื•ืช ืžืขืจื›ืช ืฉื›ื–ื• ื”ื™ื ื‘ืขื–ืจืช supervised learning ืขืœ ื™ื“ื™ ืฉื™ืžื•ืฉ ื‘ืžื“ื’ื ื’ื“ื•ืœ ืฉืœ ื“ื•ื’ืžืื•ืช ืžืชื•ื™ื™ื’ื•ืช ืžื”ืขื‘ืจ. ื‘ื”ืจืฆืื” ื–ื• ื ืฉืชืžืฉ ื‘ืขื™ื” ื–ื• ื›ื“ื•ื’ืžื. ื ืชื™ื™ื—ืก ืœืžื“ื’ื ื”ื‘ื:

ื ืจืฆื” ืœืžืฆื•ื ื—ื–ืื™ ืืฉืจ ืœื›ืœ ืฆืžื“ ื—ื“ืฉ ืฉืœ ืžืจื—ืง ื•ืžื—ื™ืจ ื™ื—ื–ื™ืจ ื—ื™ื–ื•ื™ ืฉืœ ื”ืื ื”ืขืกืงื” ื—ืฉื•ื“ื” ืื• ืœื. ืœื“ื•ื’ืžื:

(ื”ื—ื™ื–ื•ื™ ื‘ื›ืœ ื ืงื•ื“ื” ื”ื•ื ื”ืฆื‘ืข ืฉืœ ื”ืจืงืข)

ื—ืœื•ืงื” ืœ train-test

ื ื—ืœืง ืืช ื”ืžื“ื’ื ืœ80% train ื• 20% test:

ืฉืžื•ืช ื•ืกื™ืžื•ื ื™ื

ื ืฆื™ื’ ืืช ื”ืฉืžื•ืช ื”ืกื™ืžื•ื ื™ื ื”ืžืงื•ื‘ืœื™ื ื‘ื‘ืขื™ื•ืช ืกื™ื•ื•ื’:

  • ื‘ื‘ืขื™ื•ืช ืกื™ื•ื•ื’ ื ื”ื•ื’ ืœื”ืชื™ื™ื—ืก ืœื—ื–ืื™ ื›ืืœ ืžืกื•ื•ื’ (classifier) ืื• discriminator (ืžืงื˜ืœื’).
  • ืืช ื”ืขืจื›ื™ื ื”ืฉื•ื ื™ื ืฉืื•ืชื ื”ืชื•ื•ื™ื•ืช ื™ื›ื•ืœ ืœืงื‘ืœ ืžื›ื ื™ื ืžื—ืœืงื•ืช.
  • ืืช ืžืกืคืจ ื”ืžื—ืœืงื•ืช ื ืกืžืŸ ื‘ืงืจื•ืก ื‘ CC.
  • ื‘ืขื™ื•ืช ืกื™ื•ื•ื’ ืฉื‘ื”ื ื™ืฉ ืจืง 2 ืžื—ืœืงื•ืช, C=2C=2, ืžื›ื•ื•ื ื•ืช ื‘ืขื™ื•ืช ืกื™ื•ื•ื’ ื‘ื™ื ืืจื™.
  • ื‘ืกื™ื•ื•ื’ ื‘ื™ื ืืจื™, ืžืงื•ื‘ืœ ืœื”ืฉืชืžืฉ ื‘ืื—ื“ ืžื”ืื•ืคืฆื™ื•ืช ื”ื‘ืื•ืช ืœืกื™ืžื•ืŸ ื”ืžื—ืœืงื•ืช:

    • yโˆˆ{0,1}y\in\{0,1\}.
    • yโˆˆ{โˆ’1,1}y\in\{-1,1\}.
  • ื‘ืกื™ื•ื•ื’ ืœื ื‘ื™ื ืืจื™, ืžืงื•ื‘ืœ ืœื”ืฉืชืžืฉ ื‘ืื—ื“ ืžื”ืื•ืคืฆื™ื•ืช ื”ื‘ืื•ืช ืœืกื™ืžื•ืŸ ื”ืžื—ืœืงื•ืช:

    • yโˆˆ{1,2,โ€ฆ,C}y\in\{1,2,\dots,C\}.
    • yโˆˆ{0,1,โ€ฆ,Cโˆ’1}y\in\{0,1,\dots,C-1\}.

Misclassification rate

ืคื•ื ืงืฆื™ื™ืช ื”ืžื—ื™ืจ ื”ื ืคื•ืฆื” ื‘ื‘ืขื™ื•ืช ืกื™ื•ื•ื’ ื”ื™ื ื” ืคื•ื ืงืฆื™ื™ืช ื” misclassification rate. ืคื•ื ืงืฆื™ื” ื–ื• ืžื—ืฉื‘ืช ืืช ื”ืชื“ื™ืจื•ืช ืฉื‘ื” ืฆืคื•ื™ ื”ื—ื–ืื™ ืœื‘ืฆืข ืฉื’ื™ืื•ืช ื—ื™ื–ื•ื™ (ืœืœื ืงืฉืจ ืœื’ื•ื“ืœ ื”ืฉื’ื™ืื”). ื‘ื“ื•ืžื” ืœืจื•ื‘ ืคื•ื ืงืฆื™ื•ืช ื”ืฉื’ื™ืื” ื”ื ืคื•ืฆื•ืช ื’ื ืคื•ื ืงืฆื™ื” ื–ื• ืžื•ื’ื“ืจืช ื›ืคื•ื ืงืฆื™ื™ืช risk ื•ื”ื™ื ืžืฉืชืžืฉืช ื‘ืคื•ื ืงืฆื™ื™ืช ื” loss ื”ื‘ืื”:

l(y^,y)=I{y^โ‰ y}l(\hat{y},y)=I\{\hat{y}\neq y\}

ืคื•ื ืงืฆื™ื™ืช loss ื–ื• ื ืงืจืืช ืคื•ื ืงืฆื™ื™ืช ื” zero-one loss. ืชื—ืช ืคื•ื ืงืฆื™ื” ื–ื• ื—ื™ื–ื•ื™ื™ื ื ื›ื•ื ื™ื ื™ืงื‘ืœื• ืฆื™ื•ืŸ 0 ื•ืฉื’ื™ืื•ืช ื™ืงื‘ืœื• ืฆื™ื•ืŸ 1 (ืœื ืชืœื•ืช ื‘ื’ื•ื“ืœ ื”ืฉื’ื™ืื”). ืคื•ื ืงืฆื™ื™ืช ื” misclassification rate ื ืจืื™ืช ื›ืš:

R(h)=E[I{h(x)โ‰ y}]R(h)=\mathbb{E}\left[I\{h(\mathbf{x})\neq\text{y}\}\right]

ื”ื—ื–ืื™ ื”ืื•ืคื˜ื™ืžืืœื™

ื‘ื”ื™ื ืชืŸ ื”ืคื™ืœื•ื’ ื”ืžืฉื•ืชืฃ ืฉืœ x\mathbf{x} ื• y\text{y} ื ื™ืชืŸ ืœื—ืฉื‘ ืืช ื”ื—ื–ืื™ ื”ืื•ืคื˜ื™ืžืืœื™ ืฉืœ ื” misclassification rate. ื—ื–ืื™ ื–ื” ื ืชื•ืŸ ืขืœ ื™ื“ื™:

hโˆ—(x)=argโกmaxโกyย p(yโˆฃx=x)h^*(\boldsymbol{x})=\underset{y}{\arg\max}\ p(y|\mathbf{x}=\boldsymbol{x})

ื—ื–ืื™ ื–ื” ืœืžืขืฉื” ืžื—ื–ื™ืจ ืืช ื” y\text{y} ื”ื›ื™ ืกื‘ื™ืจ (ื”ื›ื™ ืฉื›ื™ื—, ื” mode) ื‘ื”ืกืชื‘ืจื•ืช ืฉืœ y\text{y} ื‘ื”ื™ื ืชืŸ x\mathbf{x}.

(ื”ืคื™ืชื•ื— ืœืžืงืจื” ืฉืœ ื—ื™ื–ื•ื™ ืœื ืžื•ืชื ื” ืžื•ืคื™ืข ื‘ืชืจื’ื•ืœ 2 ื•ื”ืคื™ืชื•ื— ืœืžืงืจื” ื”ืžื•ืชื ื” ื™ื ืชืŸ ื‘ืชืจื’ื™ืœ ื‘ื™ืช 1)

1-NN (1-Nearest Neighbours)

1-NN ื”ื•ื ืื—ื“ ื”ืืœื’ื•ืจื™ืชืžื™ื ื”ืคืฉื•ื˜ื™ื ื‘ื™ื•ืชืจ ืœื‘ื ื™ื™ืช ื—ื–ืื™ ืœื‘ืขื™ื•ืช superviesed learning. ื‘ืฉื™ื˜ื” ื–ื• ื”ื—ื™ื–ื•ื™ ืžืชื‘ืฆืข ื‘ืื•ืคืŸ ื”ื‘ื, ื‘ืขื‘ื•ืจ x\boldsymbol{x} ืžืกื•ื™ื™ื, ืฉื‘ืขื‘ื•ืจื• ื ืจืฆื” ืœื‘ืฆื ืืช ื”ื—ื™ื–ื•ื™, ื ื—ืคืฉ ื‘ืžื“ื’ื ืืช ื”ื“ื’ื™ืžื” ืขื ื” x(i)\boldsymbol{x}^{(i)} ื”ืงืจื•ื‘ ืขืœื™ื• ื‘ื™ื•ืชืจ (ื”ืฉื›ืŸ ื”ืงืจื•ื‘ ื‘ื™ื•ืชืจ) ื•ื ื‘ืฆืข ืืช ื”ื—ื™ื–ื•ื™ ืขืœ ืคื™ ื” y(i)y^{(i)} ืฉืžืชืื™ื ืœืื•ืชื” ื“ื’ื™ืžื”. ื‘ื—ื–ืื™ ื–ื” ืœืžืขืฉื” ืื™ืŸ ืฉืœื‘ ืฉืœ ืœื™ืžื•ื“ (ื‘ื ื™ื” ืฉืœ ืžื•ื“ืœ) ื•ื”ื—ื™ื–ื•ื™ ื ืขืฉื” ื™ืฉื™ืจื•ืช ืขืœ ืคื™ ื”ืžื“ื’ื. ืฉื™ื˜ื•ืช ืžืกื•ื’ ื–ื” ืžื›ื•ื ื•ืช ืฉื™ื˜ื•ืช ื-ืคืจืžื˜ืจื™ื•ืช.

ื ืจืฉื•ื ื–ืืช ื‘ืื•ืคืŸ ืžืชืžื˜ื™. ื ืชื•ืŸ ืžื“ื’ื D={x(i),y(i)}\mathcal{D}=\{\boldsymbol{x}^{(i)},y^{(i)}\} ื•ื“ื’ื™ืžื” ื ื•ืกืคืช x\boldsymbol{x} (ืืฉืจ ืื™ื ื” ื—ืœืง ืžื”ืžื“ื’ื) ืฉืขืœื™ื” ื ืจืฆื” ืœื‘ืฆืข ืืช ื”ื—ื™ื–ื•ื™. ื ื‘ืฆืข ื—ื™ื–ื•ื™ ื‘ืฉื™ื˜ืช ื” 1-NN ื‘ืื•ืคืŸ ื”ื‘ื:

  1. ื ืžืฆื ืืช ื”ืื™ื ื“ืงืก ืฉืœ ื”ืฉื›ืŸ ื”ืงืจื•ื‘ ื‘ื™ื•ืชืจ i=argโกminโกiโˆฅx(i)โˆ’xโˆฅ2i=\underset{i}{\arg\min} \lVert \boldsymbol{x}^{(i)}-\boldsymbol{x}\rVert_2
  2. ื”ื—ื™ื–ื•ื™ ื™ื”ื™ื” ื”ืชื•ื•ื™ืช ืฉืœ ื”ืฉื›ืŸ ื”ืงืจื•ื‘ ื‘ื™ื•ืชืจ y^=y(i)\hat{y}=y^{(i)}

ื›ืืŸ ื”ืฉืชืžืฉื ื• ื‘ืžืจื—ืง ืื•ืงืœื™ื“ื™ (ื ื•ืจืžืช l2l_2 ืฉืœ ื”ืžืจื—ืง ื‘ื™ืŸ ื” x\boldsymbol{x}-ื™ื). ื ื™ืชืŸ ื›ืžื•ื‘ืŸ ืœื”ื—ืœื™ืฃ ืžื“ื“ ื–ื” ื‘ื›ืœ ืžื“ื“ ืžืจื—ืง ืื—ืจ ื›ืชืœื•ืช ื‘ืฆืจื›ื™ ื”ื‘ืขื™ื”.

ื“ื•ื’ืžื

ื ืคืขื™ืœ ืืช 1-NN ืขืœ ื”ื“ื•ื’ืžื ืฉืœื ื•:

ืืœื’ื•ืจื™ืชื ื–ื” ื™ื“ืื’ ื›ืžื•ื‘ืŸ ืฉื”ื—ื™ื–ื•ื™ ื‘ื ืงื•ื“ื•ืช ืฉืœ ื”ืžื“ื’ื ื•ื‘ืกื‘ื™ื‘ืชื ื”ืžื™ื™ื“ื™ืช ืชืชืื™ื ืœืชื™ื•ื•ืช ืฉืœื”ื. ื–ืืช ืื•ืžืจืช ืฉืขืœ ื”ื ืงื•ื“ื•ืช ืžื”ืžื“ื’ื ืื ื• ืžืฆืคื™ื ืœืงื‘ืœ ื—ื™ื–ื•ื™ ืžื•ืฉืœื. ื ืฉื™ื ืื‘ืœ ืœื‘ ืฉื—ื–ืื™ ื–ื” ื™ื•ืฆืจ ื”ืจื‘ื” "ืื™ื™ื", ืœืžืฉืœ ื‘ืžืงืจื™ื ืฉื‘ื”ื ื™ืฉื ื” ื ืงื•ื“ื” ื›ืชื•ืžื” ื‘ืื™ื–ื•ืจ ืฉืœ ื”ืจื‘ื” ื ืงื•ื“ื•ืช ื›ื—ื•ืœื•ืช ื•ืœื”ื™ืคืš. ืกื‘ื™ืจ ืœื”ื ื™ื— ืฉืื™ื™ื ืืœื• ืžืชืื™ืžื™ื ืกืคืฆื™ืคื™ืช ืœืžื“ื’ื ื”ื ืชื•ืŸ ื•ืœื ื‘ื”ื›ืจื— ื™ื”ื™ื• ื ื›ื•ื ื™ื ื‘ืขื‘ื•ืจ ืžื“ื’ื ืื—ืจ. ื–ื•ื”ื™ ื‘ื“ื™ื•ืง ื‘ืขื™ื™ืช ื” overfitting.

ื ืชื™ื™ื—ืก ืœืจื’ืข ืœืฆื•ืจื” ืฉืœ ื”ื’ื‘ื•ืœื•ืช ื‘ื™ืŸ ืฉื ื™ ืื™ื–ื•ืจื™ ื”ื”ื—ืœื˜ื” ื”ืฉื•ื ื™ื. ืœืฉื ื›ืš ื ื—ืœืง ืœืจื’ืข ืืช ื”ืžืจื—ื‘ ืœืื™ื–ื•ืจื™ื ืฉื•ื ื™ื ืขืœ ืคื™ ื”ืฉื›ืŸ ื”ื›ื™ ืงืจื•ื‘ ื”ืžืฉื•ื™ื™ืš ืœื›ืœ ื ืงื•ื“ื” ื‘ืžืจื—ื‘. ื—ืœื•ืงื” ืฉื›ื–ื• ืชื™ืฆื•ืจ ืœืžืขืฉื” ืชืื™ื ื”ืžื•ื›ื•ื ื™ื Voronoi cells ืืฉืจ ืžืงื™ืคื™ื ื›ืœ ืื—ืช ืžื”ื ืงื•ื“ื•ืช. ื ืฉืจื˜ื˜ ื—ืœื•ืงื” ื–ื• ื‘ื’ืจืฃ ืžื•ื’ื“ืœ:

ื—ืœื•ืงื” ื–ื• ืœืชืื™ื ืงื˜ื ื™ื ื™ื›ื•ืœื” ืœืขื–ื•ืจ ืœื ื• ื”ื‘ื™ืŸ ืืช ื”ืฆื•ืจื” ื”ื›ื•ืœืœ ืฉืœ ืคื•ื ืงืฆื™ื™ืช ื”ื—ื™ื–ื•ื™. ืœืžืขืฉื” ื”ืื™ื–ื•ืจ ืฉื‘ื• ื”ื—ื™ื–ื•ื™ ืฉื”ื•ื ืฉื”ืขื™ืกืงื” ื—ืฉื•ื“ื” ืžื•ืจื›ื‘ ืžืื•ืกืฃ ื›ืœ ื”ืชืื™ื ืฉืœ ื”ื“ื’ื™ืžื•ืช ื‘ืžื“ื’ื ืฉืœ ืขืกืงืื•ืช ื—ืฉื•ื“ื•ืช, ื•ื‘ื "ืœ ืœื’ื‘ื™ ื”ืขืงืื•ืช ื”ื—ื•ืงื™ื•ืช.

ื ื‘ื“ื•ืง ืืช ื‘ื™ืฆื•ืขื™ ื”ื—ื–ืื™ ืขืœ ื” test set ืœืคื™ ืคื•ื ืงืฆื™ื™ืช ื” miscalssification rate:

testย score=1Ntestโˆ‘{x(i),y(i)}โˆˆDtestI{h(x(i))โ‰ y(i)}=0.12\text{test score}=\frac{1}{N_{\text{test}}}\sum_{\{\boldsymbol{x}^{(i)},y^{(i)}\}\in\mathcal{D}_{\text{test}}}I\{h(\boldsymbol{x}^{(i)})\neq y^{(i)}\}=0.12

ื–ืืช ืื•ืžืจืช ืฉืขืœ ื™ื“ื™ ืฉื™ืžื•ืฉ ื‘ 1-NN ืื ื• ืฆืคื•ื™ื™ื ืœืฆื“ื•ืง ื‘88% ืžื”ื—ื™ื–ื•ื™ื™ื ืฉืœื ื•.

ื”ืชืœื•ืช ื‘ื™ื—ื™ื“ื•ืช ืฉืœ x\mathbf{x}

ืžื›ื™ื•ื•ืŸ ืฉืืœื’ื•ืจื™ืชื ื” 1-NN ืชืœื•ื™ ื‘ืžืจื—ืง ื‘ื™ืŸ ื ืงื•ื“ื•ืช ื‘ืžืจื—ื‘, ื™ืฉื ื” ื—ืฉื™ื‘ื•ืช ืจื‘ื” ืœื™ื—ื™ื“ื•ืช, ืื• ื™ื•ืชืจ ื ื›ื•ืŸ ืœืกื“ืจ ื”ื’ื“ื•ืœ, ืฉืœ ื”ืจื›ื™ื‘ื™ื ืฉืœ x\boldsymbol{x}. ืจื›ื™ื‘ื™ื ื‘ืขืœื™ ื’ื•ื“ืœ ืื•ืคื™ื™ื ื™ ื’ื“ื•ืœ ื™ื•ืชืจ ื™ืงื‘ืœื• ืžืฉืงืœ ื’ื“ื•ืœ ื™ื•ืชืจ. ืœืžืฉืœ, ืื ื‘ื“ื•ื’ืžื ืฉืœื ื• ื”ื™ื™ื ื• ืžื•ื“ื“ื™ื ืืช ื”ืžืจื—ืง ื‘ืžื˜ืจื™ื ื”ืขืจื›ื™ื ืฉืœ ืจื›ื™ื‘ ื”ืžืจื—ืง ื”ื™ื• ื‘ืืœืคื™ื. ืœื›ืŸ ื›ืืฉืจ ื ื ืกื” ืœื—ืฉื‘ ืืช ื”ืžืจื—ืง ื‘ื™ืŸ ืฉืชื™ ื ืงื“ื•ืช ื ืงื‘ืœ ืฉื”ืฉื•ื ื™ ื‘ืจื›ื™ื‘ ืฉืœ ื”ืžืจื—ืง ื™ื”ื™ื” ื”ืจื‘ื” ื™ื•ืชืจ ืžืฉืžืขื•ืชื™.

ืœืžืขืฉื” ื”ื“ืจืš ื”ื ื›ื•ื ื” ื™ื•ืชืจ ืœืฆื™ื™ืจ ืืช ื”ืžื“ื’ื ื‘ืžืงืจื” ื–ื” ืชื”ื™ื”:

ืœื›ืŸ ื‘ืืœื’ื•ืจื™ืชื ื–ื” ื™ืฉ ืœื“ืื•ื’ ืฉื”ืจื›ื™ื‘ื™ื ืฉืœ x\boldsymbol{x} ื™ื”ื™ื• ื‘ืขืจืš ื‘ืื•ืชื• ืกื“ืจ ื’ื•ื“ืœ.

K-NN

ื ื™ืชืŸ ืœืงืœื•ืช ืœืฉืคืจ ืืช ื”ื‘ื™ืฆื•ืขื™ื ืฉืœ ื”ืืœื’ื•ืจื™ืชื ืขืœ ื™ื“ื™ ืฉื™ืžื•ืฉ ื‘ืžืกืคืจ ืฉื›ื ื™ื. ืืช ืžืกืคืจ ื”ืฉื›ื ื™ื ื ืกืžืŸ ื‘ KK. ื’ื•ื“ืœ ื–ื” ื”ื•ื hyper-parameter ืฉืœ ื”ืืœื’ื•ืจื™ืชื. ื”ื—ื™ื–ื•ื™ ื‘ K-NN ื™ืชื‘ืฆืข ืœืคื™ ื”ืชื•ื•ื™ืช ื”ื›ื™ ืฉื›ื™ื—ื” ืžื‘ื™ืŸ KK ื”ืฉื›ื ื™ื:

  1. ื ืžืฆื ืืช KK ื”ืฉื›ื ื™ื ื‘ืขืœื™ ื” x(i)\boldsymbol{x}^{(i)} ื”ืงืจื•ื‘ื™ื ื‘ื™ื•ืชืจ ืœ x\boldsymbol{x}. (ืœืจื•ื‘ ื ืฉืชืžืฉ ื‘ืžืจื—ืง ืื•ืงืœื™ื“ื™, ืืš ื ื™ืชืŸ ื’ื ืœื‘ื—ื•ืจ ืคื•ื ืงืฆื™ื•ืช ืžื—ื™ืจ ืื—ืจื•ืช).
  2. ืชื•ืฆืืช ื”ื—ื™ื–ื•ื™ ืชื”ื™ื” ื”ืชื•ื•ื™ืช ื”ืฉื›ื™ื—ื” ื‘ื™ื•ืชืจ (majorety vote) ืžื‘ื™ืŸ KK ื”ืชื•ื•ื™ื•ืช ืฉืœ ื”ื“ื’ื™ืžื•ืช ืฉื ื‘ื—ืจื• ื‘ืฉืœื‘ 1.

ื‘ืžืงืจื” ืฉืœ ืฉื™ื•ื•ืŸ:

  • ื‘ืžืงืจื” ืฉืœ ืฉื™ื•ื•ื™ื•ืŸ ื‘ืฉืœื‘ 2, ื ืฉื•ื•ื” ื’ื ืืช ื”ืžืจื—ืง ื”ืžืžื•ืฆืข ื‘ื™ืŸ ื” x\boldsymbol{x}-ื™ื ื”ืฉื™ื™ื›ื™ื ืœื›ืœ ืชื•ื•ื™ืช. ืื ื• ื ื‘ื—ืจ ื‘ืชื•ื•ื™ืช ื‘ืขืœืช ื”ืžืจื—ืง ื”ืžืžื•ืฆืข ื”ืงืฆืจ ื‘ื™ื•ืชืจ.
  • ื‘ืžืงืจื” ืฉืœ ืฉื™ื•ื•ื•ืŸ ื’ื ื‘ื™ืŸ ื”ืžืจื—ืงื™ื ื”ืžืžื•ืฆืขื™ื, ื ื‘ื—ืจ ืืงืจืื™ืช.

ื“ื•ื’ืžื

ื ืฉืชืžืฉ ื‘ 5-NN ืขืœ ืžื ืช ืœื‘ื ื•ืช ืืช ื”ื—ื–ืื™ ืฉืœื ื•:

ื ืฉื™ื ืœื‘ ืฉื›ืžื•ืช ื”ืื™ื™ื ื”ืฆื˜ืžืฆืžื” ื”ื’ื‘ื•ืœื•ืช ื‘ื™ืŸ ื”ืื™ื–ื•ืจื™ื ื ืขืฉื• ื™ื•ืชืจ ื—ืœืงื™ื. ื ื‘ื“ื•ืง ืืช ื”ืชืชื•ืฆืื•ืช ืขืœ ื” test set:

testย score=1Ntestโˆ‘{x(i),y(i)}โˆˆDtestI{h(x(i))โ‰ y(i)}=0.10\text{test score}=\frac{1}{N_{\text{test}}}\sum_{\{\boldsymbol{x}^{(i)},y^{(i)}\}\in\mathcal{D}_{\text{test}}}I\{h(\boldsymbol{x}^{(i)})\neq y^{(i)}\}=0.10

ื”ื•ืจื“ื ื• ืืช ืชื“ื™ืจื•ืช ื”ืฉื’ื™ืื•ืช ืœ10%.

ื‘ื—ื™ืจืช ื” KK ื”ืื•ืคื˜ื™ืžืืœื™

ื ื—ืœืง ืืช ื” train set ืœ 75% train ื• 25% validation:

ื ื—ืฉื‘ ืืช ื‘ื™ืฆื•ืขื™ ื”ื—ื–ืื™ ื‘ืขื‘ื•ืจ ืขืจื›ื™ KK ื”ืฉื•ื ื™ื ืขืœ ื” validation set

ืื ื• ืจื•ืื™ื ืคื” ืฉื•ื‘ ืืช ื” bias-variance tradeoff, ืฉื™ืžื• ืœื‘ ืื‘ืœ ืฉื”ืชืœื•ืช ื‘ KK ื›ืืŸ ื”ื™ื ื”ืคื•ื›ื”. ื›ืคื™ ืฉืฆื™ื™ื ื• ืงื•ื“ื, ื‘ืขื‘ื•ืจ 1-NN ื™ืฉื ื” ื›ืžื•ืช ื’ื‘ื•ื”ื” ืฉืœ overfitting ื•ื ื™ืชืŸ ืœืจืื•ืช ืขื“ื•ืช ืœื–ื” ื‘ื’ืจืฃ ื–ื” ื‘ื›ืš ืฉื” train score ื™ื•ืจื“ ืœ0. ื›ื›ืœ ืฉื ื’ื“ื™ืœ ืืช KK ื”ืืœื’ื•ืจื™ืชื ื™ืขืฉื” ืคื—ื•ืช overfitting ืืš ื”ื•ื ื’ื ื™ืžืฆืข ืขืœ ืื™ื–ื•ืจ ื’ื“ื•ืœ ื™ื•ืชืจ ื•ื‘ื›ืš ื™ื—ืœื™ืง ืžืื“ ืืช ื”ืฉืคื•ืช ืฉืœ ืื™ื–ื•ืจื™ ื”ื”ื—ืœื˜ื”, ืขื“ ืœื ืงื•ื“ื” ืฉื‘ื” ื™ื”ื™ื” ืจืง ืื™ื–ื•ืจ ื”ื—ืœื˜ื” 1.

ื ืฉืจื˜ื˜ ืืช ื”ื—ื–ืื™ ืฉืœ 40-NN:

ื–ื”ื• ื›ืžื•ื‘ืŸ ืžืงืจื” ืงื™ืฆื•ื ื™ ืฉืœ underfitting, ืฉื‘ื• ื”ื—ื–ืื™ ืžืชืื™ื ืœืžื“ื’ื ืจืง ื‘ืื•ืคืŸ ืžืื“ ื’ืก ื•ื ื™ืชืŸ ืขื•ื“ ืœืฉืคืจ ืืช ื”ื—ื™ื–ื•ื™ ืขืœ ื™ื“ื™ ื‘ื—ื™ืจืช ื—ื–ืื™ ื™ื•ืชืจ ืžืชืื™ื ืœืžื“ื’ื.

ื‘ื—ื™ืจื” ืฉืœ K=13K=13 (ืฉื ืžืฆื ื‘ืžืจื›ื– ื”ืชื—ื•ื ืฉื‘ื• KK ื ื•ืชืŸ ืฆื™ื•ืŸ ืžื™ื ืžืืœื™) ื ื•ืชืŸ misclassification rate ืฉืœ 10%.

K-NN ืœื‘ืขื™ื•ืช ืจื’ืจืกื™ื”

ื ื™ืชืŸ ืœื”ืฉืชืžืฉ ื‘ K-NN ื’ื ืœืคืชืจื•ืŸ ื‘ืขื™ื•ืช ืจื’ืจืกื™ื”, ืื ื›ื™ ืคืชืจื•ืŸ ื–ื” ื™ื”ื™ื” ืœืจื•ื‘ ืคื—ื•ืช ื™ืขื™ืœ. ื‘ื‘ืขื™ื•ืช ืจื’ืจืกื™ื” ืื ื• ื ื‘ืฆืข ืืช ื”ื—ื™ื–ื•ื™ ืœืคื™ ื”ืžืžื•ืฆืข ืขืœ ื”ืชื•ื•ื™ื•ืช ืฉืœ ื”ืฉื›ื ื™ื (ื‘ืžืงื•ื ืœื‘ื—ื•ืจ ืืช ืชื•ื•ื™ืช ื”ืฉื›ื™ื—ื”).

Decision trees (ืขืฆื™ ื”ื—ืœื˜ื”)

ืขืฆื™ ื”ื—ืœื˜ื” ื”ื ื›ืœื™ ื ืคื•ืฅ ืœืงื‘ืœืช ื”ื—ืœื˜ื•ืช ื•ื”ื ืžื•ืคื™ืขื™ื ื‘ืžืงื•ืžื•ืช ืจื‘ื™ื ื’ื ืžื—ื•ืฅ ืœืชื—ื•ื ืฉืœ ืžืขืจื›ื•ืช ืœื•ืžื“ื•ืช. ื”ื ืžื‘ื•ืกืกื™ื ืขืœ ืกื“ืจื” ืฉืœ ืฉืืœื•ืช ื›ืืฉืจ ื”ืชืฉื•ื‘ื” ืœื›ืœ ืฉืืœื” ืžื•ื‘ื™ืœื” ืื•ืชื ื• ืœืžืกืœื•ืœ ืื—ืจ ื‘ืขืฅ, ืขื“ ืืฉืจ ืื ื• ืžื’ื™ืขื™ื ืœืชืฉื•ื‘ื” ื”ืกื•ืคื™ืช ืฉื ืžืฆืืช ื‘ืงืฆื” ื”ืขืฅ:

ืœื“ื•ื’ืžื, ื›ืืฉืจ ืื ื• ื‘ืื™ื ืขื ื›ืื‘ ื‘ื‘ื˜ืŸ ืœืจื•ืคื, ื”ื•ื ืœืจื•ื‘ ื™ืฉืืœ ืื•ืชื ื• ื‘ื›ืœ ืคืขื ืฉืืœื” ื•ืขืœ ืคื™ ื”ืชืฉื•ื‘ื” ื™ืฉืืœ ืฉืืœื” ื ื•ืกืคืช (ืื• ื™ื‘ืฆืข ื‘ื“ื™ืงื” ืžืกื•ื™ื™ืžืช) ื•ื‘ืกื•ืฃ ืกื™ื“ืจืช ื”ืฉืืœื•ืช (ื•ื”ื‘ื“ื™ืงื•ืช) ื™ื’ื™ืข ืœื”ื—ืœื˜ื” ืœื’ื‘ื™ ื”ืื™ื‘ื—ื•ืŸ ืฉื”ื•ื ื—ื•ื–ื”.

ื“ื•ื’ืžื ื ื•ืกืคืช ื”ื™ื ื–ื•:

ื‘ื”ืงืฉืจ ืฉืœ ื‘ืขื™ื•ืช supervised learning ื ื•ื›ืœ ืœื ืกื•ืช ืœื‘ื ื•ืช ืขืฅ ืืฉืจ ื™ืฉืžืฉ ื›ื—ื–ืื™ ืž x\mathbf{x} ืœ y\text{y}. ืชื—ืช ื”ื’ื™ืฉื” ื”ื“ื™ืกืงืจื™ืžื™ื ื˜ื™ื‘ื™ืช, ืื ื• ื ื ืกื” ืœื‘ื ื•ืช ืื•ืชื• ื›ืš ืฉื™ืชืŸ ื—ื™ื–ื•ื™ ื›ืžื” ืฉื™ื•ืชืจ ื˜ื•ื‘ ืขืœ ื”ืžื“ื’ื ืชื—ืช ืื™ืœื•ืฆื™ื ืžืกื•ื™ื™ืžื™ื ืฉื ืฉื™ื ืขืœื™ื•.

ื”ื™ืชืจื•ื ื•ืช ืฉืœ ื”ืฉื™ืžื•ืฉ ื‘ืขืฅ ื”ื—ืœื˜ื” ื›ื—ื–ืื™:

  1. ืคืฉื•ื˜ ืœืžื™ืžื•ืฉ (ืื•ืกืฃ ืฉืœ ืชื ืื™ if .. else ..).
  2. ืžืชืื™ื ืœืขื‘ื•ื“ื” ืขื ืžืฉืชื ื™ื ืงื˜ื’ื•ืจื™ื™ื (ืจื›ื™ื‘ื™ื ืฉืœ x\mathbf{x} ืฉื”ื ืžืฉืชื ื™ื ื‘ื“ื“ื™ื ืืฉืจ ืžืงื‘ืœื™ื ืื—ื“ ืžืกื˜ ืžืฆื•ืžืฆื ืฉืœ ืขืจื›ื™ื).
  3. Explainable - ื ื™ืชืŸ ืœื”ื‘ื™ืŸ ื‘ื“ื™ื•ืง ืžื” ื”ื™ื• ื”ืฉื™ืงื•ืœื™ื ืฉืœืคื™ื”ื ื”ืชืงื‘ืœ ื—ื™ื–ื•ื™ ืžืกื•ื™ื™ื.

ื˜ืจืžื™ื ื•ืœื•ื’ื™ื”

ื ืฆื™ื’ ืืช ื”ืฉืžื•ืช ื”ืžืงื•ื‘ืœื™ื ื‘ืขื‘ื•ื“ื” ืขื ืขืฆื™ื:

  • Root (ืฉื•ืจืฉ) - ื ืงื•ื“ืช ื”ื›ื ื™ืกื” ืœืขืฅ.
  • Node (ืฆื•ืžืช) - ื ืงื•ื“ื•ืช ื”ื”ื—ืœื˜ื” / ืคื™ืฆื•ืœ ืฉืœ ื”ืขืฅ - ื”ืฉืืœื•ืช.
  • Leaves (ืขืœื™ื) - ื”ืงืฆื•ื•ืช ืฉืœ ื”ืขืฅ - ื”ืชืฉื•ื‘ื•ืช.
  • Branch (ืขื ืฃ) - ื—ืœืง ืžืชื•ืš ื”ืขืฅ ื”ืžืœื (ืชืช-ืขืฅ).
  • Depth (ืขื•ืžืง) - ืžืกืคืจ ื”ืฆืžืชื™ื ื‘ืžืกืœื•ืœ ื”ืืจื•ืš ื‘ื™ื•ืชืจ.

ื”ืฆืžืชื™ื

ื‘ืื•ืคืŸ ื›ืœืœื™ ื ื™ืชืŸ ืœื”ืฉืชืžืฉ ื‘ื›ืœ ืฉืืœื” ืฉืจื•ืฆื™ื ืขืœ x\boldsymbol{x} ื‘ื›ืœ ืฆื•ืžืช, ืืš ืœืฉื ื”ืฉืžื™ืจื” ืขืœ ื”ืคืฉื˜ื•ืช ืฉืœ ื”ืขืฅ (ืืฉืจ ื ื—ื•ืฅ ื’ื ืœืฉื ื”ืžื™ืžื•ืฉ ื•ื’ื ืœืžื ื™ืขืช overfitting) ืžืงื•ื‘ืœ ืœื”ื’ื‘ื™ืœ ืืช ื”ืฉืืœื•ืช ื‘ืฆืžืชื™ื ืœืชื ืื™ื ืคืฉื•ื˜ื™ื ืขืœ ืจื›ื™ื‘ ื‘ื•ื“ื“ ืฉืœ x\mathbf{x} ื‘ื›ืœ ืคืขื (ื–ืืช ืื•ืžืจืช ืœ xi\text{x}_i ืžืกื•ื™ื™ื). ืชื ืื™ื ืืœื• ื™ื”ื™ื•:

  • ืขื‘ื•ืจ ืจื›ื™ื‘ื™ื ืจืฆื™ืคื™ื: ื ืฉืชืžืฉ ื‘ืชื ืื™ ืžื”ืฆื•ืจื” ืฉืœ xi>ax_i>a, ื›ืืฉืจ ื™ืฉ ืœื‘ื—ื•ืจ ืืช ื”ืจื›ื™ื‘ ืฉืื•ืชื• ื ืจืฆื” ืœื‘ื“ื•ืง ii ื•ืขืจืš ื”ืกืฃ (threshold) aa ืฉืœืคื™ื• ื ืจืฆื” ืœืคืฆืœ.
  • ืขื‘ื•ืจ ืจื›ื™ื‘ื™ื ืงื˜ื’ื•ืจื™ื™ื (ื‘ื“ื™ื“ื™ื ืืฉืจ ืžืงื‘ืœื™ื ืกื˜ ืงื˜ืŸ ืฉืœ ืขืจื›ื™ื - ืื—ื“ื•ืช ื‘ื•ื“ื“ื•ืช): ื ืฉืชืžืฉ ื‘ืชื ืื™ ืืฉืจ ืžืคืฆืœ ืืช ื”ืขืฅ ืœื›ืœ ืื—ื“ ืžื”ืขืจื›ื™ื ืฉืื•ืชื ื™ื›ื•ืœ ื”ืžืฉืชื ื” ืœืงื‘ืœ.

ืœื“ื•ื’ืžื, ื—ื–ืื™ ืืฉืจ ืžื ืกื” ืœื—ื–ื•ืจ ื”ื•ื ืื•ืช ืืฉืจืื™ ืขืœ ืกืžืš ื ืชื•ื ื™ ื”ืขื™ืกืงื” ื™ื›ื•ืœ ืœื”ืจืื•ืช ื›ืš:

ื”ืคื™ืฆื•ืœ ื”ืจืืฉื•ืŸ ื‘ืขืฅ ื”ื•ื ืงื˜ื’ื•ืจื™ ืœืคื™ ืกื•ื’ ื”ืžื•ืฆืจ ื•ืฉืืจ ื”ืคื™ืฆื•ืœื™ื ื”ื ืœืคื™ ื”ืฉื•ื•ืื” ืœืขืจืš ืกืฃ ืžืกื•ื™ื™ื.

ื‘ื ื™ื™ืช ืขืฅ ื”ื—ืœื˜ื” ืœืกื™ื•ื•ื’

ืžืฆื“ ืื—ื“, ื›ื›ืœ ืฉื ื’ื“ื™ืœ ืืช ื›ืžื•ืช ื”ืฆืžืชื™ื ื‘ืขืฅ ื›ืš ืชื’ื“ืœ ื’ื ื™ื›ื•ืœืช ื”ื‘ื™ื˜ื•ื™ ืฉืœื• ื•ื ื•ื›ืœ ืœื”ืงื˜ื™ืŸ ืืช ืฉื’ื™ืืช ื”ื—ื™ื–ื•ื™ ืขืœ ื”ืžื“ื’ื. ืžืฆื“ ืฉื ื™, ื›ื›ืœ ืฉื”ืขืฅ ื™ื›ื™ืœ ื™ื•ืชืจ ืฆืžืชื™ื ื•ื™ื›ื•ืœืช ื”ื‘ื™ื˜ื•ื™ ืชื’ื“ืœ ื•ื›ืš ืชื’ื“ืœ ื’ื ื”ืชืืžืช ื”ื™ืชืจ ืฉื”ื•ื ื™ืขืฉื”. ื“ืจืš ืื—ืช ืœืžื ื™ืขืช ื”ืชืืžืช ื™ืชืจ ื”ื™ื ื” ืœื”ื’ื‘ื™ืœ ืืช ื›ืžื•ืช ื”ืฆืžืชื™ื. ื‘ืžืงืจื™ื ืจื‘ื™ื ืื ื• ื ืจืฆื” ืœื”ื’ื‘ื™ืœ ืืช ื”ื›ืžื•ืช ื”ืฆืžืชื™ื ื’ื ืžืฉื™ืงื•ืœื™ื ืžืขืฉื™ื™ื ืฉืœ ื—ื™ืฉื•ื‘ื™ื•ืช ื•ื–ื™ื›ืจื•ืŸ. ืœื›ืŸ ื”ืžื˜ืจื” ืฉืœื ื• ื‘ืฉืœื‘ ื‘ื ื™ื™ืช ื”ืขืฅ ื”ื™ื ื” ืœื‘ื ื•ืช ืขืฅ ืืฉืจ ืžืงื˜ื™ืŸ ืืช ืฉื’ื™ืืช ื”ื—ื™ื–ื•ื™ ืชื•ืš ืฉื™ืžื•ืฉ ื‘ื›ืžื” ืฉืคื—ื•ืช ืฆืžืชื™ื.

ื‘ืื•ืคืŸ ื›ืœืœื™ ืžืœื‘ื“ ื‘ืžืงืจื™ื ืฉื‘ื”ื ื™ืฉื ื ืฉืชื™ ื“ื’ื™ืžื•ืช ืขื x\boldsymbol{x}-ื™ื ื–ื”ื™ื ืืš ืขื yy ืฉื•ื ื”, ืชืžื™ื“ ื ื™ืชืŸ ื™ื”ื™ื” ืœืžืฆื•ื ืขืฅ ืขื ืžืกืคื™ืง ืฆืžืชื™ื ืืฉืจ ื™ื’ื™ืข ืœื—ื™ื–ื•ื™ ืžื•ืฉืœื. (ื‘ืžืงืจื” ื”ืงื™ืฆื•ื ื™ ื ื™ืชืŸ ืœืคืฆืœ ืืช ื›ืœ ื”ืžื“ื’ื ื›ืš ืฉืœื›ืœ ืขืœื” ืชื’ื™ืข ื“ื’ื™ืžื” ื‘ื•ื“ื“ืช ื•ืœืงื‘ื•ืข ืืช ื”ืชื•ืฆืืช ื”ื—ื™ื–ื•ื™ ื‘ืขืœื” ื–ื” ืœื”ื™ื•ืช ื”ืขืจืš ืฉืœ ืื•ืชื” ื“ื’ื™ืžื”).

ื™ืฉื ื ืืœื’ื•ืจื™ืชืžื™ื ืจื‘ื™ื ืœื‘ื ื™ื” ืฉืœ ืขืฅ ื”ื—ืœื˜ื”. ืฉื ื™ ื”ืืœื’ื•ืจื™ืชืžื™ื ื”ื ืคื•ืฆื™ื ื‘ื™ื•ืชืจ ื”ื ืฉื ื™ ืืœื’ืจื•ืชืžื™ื ื™ื—ืกื™ืช ื“ื•ืžื™ื ื‘ืฉื C4.5 (Ross Quinlan, 1986) ื• CART (Breiman et al., 1984). ื‘ืงื•ืจืก ื–ื” ื ืชืืจ ื’ื™ืจืกื” ืืฉืจ ืžืขืจื‘ืช ื‘ื™ืŸ ืฉื ื™ื”ื. ื ืฆื™ื™ืŸ ืจืง ืฉื›ื™ื•ื ื ืžืฆืื™ื ื‘ืฉื™ืžื•ืฉ ื”ืจื‘ื” ื’ืจืกืื•ืช ืฉื•ื ื•ืช ืฉืœ ืฉื™ื˜ื•ืช ืœื‘ื ื™ื™ืช ืขืฆื™ ื”ื—ืœื˜ื” ืืฉืจ ืžื‘ื•ืกืกื•ืช ืขืœ ืฉื™ื˜ื•ืช ืืœื•.

ื‘ื ื™ื™ืช ื”ืขืฅ ืชืขืฉื” ื‘ืฉื ื™ ืฉืœื‘ื™ื:

  1. ื’ื™ื“ื•ืœ ื”ืขืฅ: ื ื ืกื” ืœื‘ื ื•ืช ืขืฅ ืืฉืจ ืžื’ื™ืข ืœื—ื™ื–ื•ื™ ื”ื˜ื•ื‘ ื‘ื™ื•ืชืจ ืฉื ื™ืชืŸ ืขืœ train set.
  2. ื’ื™ื–ื•ื (pruning): ื ืฉืชืžืฉ ื‘ validation set ืขืœ ืžื ืช ืœื”ืกื™ืจ ืฆืžืชื™ื.

ืฉืœื‘ 1: ื’ื™ื“ื•ืœ ื”ืขืฅ

ื”ื‘ืขื™ื” ืฉืœ ืžืฆื™ืืช ื”ืขืฅ ืืฉืจ ืžื’ื™ืข ืœื—ื™ื–ื•ื™ ื”ืื•ืคื˜ื™ืžืืœื™ ืขืœ ื”ืžื“ื’ื ื‘ื›ืžื” ืฉืคื—ื•ืช ืจืžื•ืช ื“ื•ืจืฉืช ืœืขื‘ื•ืจ ืขืœ ื›ืœ ื”ืขืฆื™ื ื”ืืคืฉืจื™ื™ื, ื•ื›ืžื•ื‘ืŸ ืฉืคืชืจื•ืŸ ื–ื” ืœื ืžืขืฉื™. ื‘ืžืงื•ื ื–ืืช ื ื ืกื” ืœื‘ื ื•ืช ืืช ื”ืขืฅ ืฉืœื‘ ืื—ืจ ืฉืœื‘ ื‘ืฆื•ืจื” ื—ืžื“ื ื™ืช (greedy). ืื ื• ื ืชื—ื™ืœ ืžื”ืฉื•ืจืฉ ื•ื‘ื›ืœ ืคืขื ื ื•ืกื™ืฃ ืฆื•ืžืช ืœืขืฅ ื›ืืฉืจ ืื ื• ื‘ื•ื—ืจื™ื ืืช ื”ืชื ืื™ ื”ืžื•ืคื™ืข ื‘ืฆื•ืžืช ืขืœ ื™ื“ื™ ืžืขื‘ืจ ืขืœ ื›ืœ ื”ืชื ืื™ื ื”ืืคืฉืจื™ื™ื ื•ื‘ื—ื™ืจืช ื”ืชื ืื™ ืืฉืจ ื ื•ืชืŸ ืืช ืฉื’ื™ืืช ื”ื—ื™ื–ื•ื™ ื”ืงื˜ื ื” ื‘ื™ื•ืชืจ ืขืœ ื” train set.

ื“ื•ื’ืžื

ื ื’ื“ื™ื ื–ืืช ืขืœ ื”ื“ื•ื’ืžื ืฉืœ ื–ื™ื”ื•ื™ ื”ื•ื ืื•ืช ื”ืืฉืจืื™:

ื ืชื—ื™ืœ ืžื” node ื”ืจืืฉื•ืŸ:

ื ื‘ื“ื•ืง ื›ืขืช ื‘ืขื‘ื•ืจ ื›ืœ ืขืจืš ืฉืœ x\boldsymbol{x} ื‘ืขื‘ื•ืจ ื›ืœ ืกืฃ ืืคืฉืจื™ ืžื™ื”ื• ื”ืชื ืื™ ืืฉืจ ื™ื™ืฆืจ ืืช ื”ืขืฅ ื‘ืขืœ ืฉื’ื™ืืช ื”ื—ื™ื–ื•ื™ ื”ื ืžื•ื›ื” ื‘ื™ื•ืชืจ. ืืช ื”ืขืจื›ื™ื ื‘ืขืœื™ื ืฉืœ ื” node ืื ื• ื ืงื‘ืข ืขืœ ื™ื“ื™ ืคื™ืฆื•ืœ ืฉืœ ื”ืžื“ื’ื ืขืœ ืคื™ ื”ืชื ืื™ ื•ื‘ื›ืœ ืขืœื” ื ืฉื™ื ืืช ื”ืขืจืš ืฉืœ ื”ืชื•ื•ื™ืช ื”ืฉื›ื™ื—ื” ื‘ื™ื•ืชืจ ื‘ื“ื’ื™ืžื•ืช ืฉื”ื’ื™ืขื• ืœืื•ืชื• ืขืœื”.

ื›ื“ื™ ืœื”ื‘ื™ืŸ ืžื”ื ืขืจื›ื™ ื” treshold ืฉืขืœื™ื”ื ื ืจืฆื” ืœืขื‘ื•ืจ ื ืกืชื›ืœ ืœืจื’ืข ืขืœ ืกื™ื“ืจืช ื”ืžืกืคืจื™ื ื”ื‘ืื”:

{3,5,8}\{3,5,8\}

ืืช ืกื“ืจืช ื”ืžืกืคืจื™ื ื”ื–ื• ื ื™ืชืŸ ืœืคืฆืœ (ืขืœ ืคื™ ืขืจืš ืกืฃ) ืœ 2 ืคื™ืฆื•ืœื™ื ืืคืฉืจื™ื™ื ืขืœ ื™ื“ื™:

  • ื”ืขื‘ืจืช ืกืฃ ื‘ื™ืŸ ื” 3 ืœ 5
  • ื”ืขื‘ืจืช ืกืฃ ื‘ื™ืŸ ื” 5 ืœ 8

(ื”ืžืงืจื™ื ืฉื‘ื”ื ืื—ื“ ื”ืคื™ืฆื•ืœื™ื ืจื™ืง ืœื ืžืขื ื™ื™ืŸ ืื•ืชื ื•)

ืœื›ืŸ ืžืกืคื™ืง ืœื‘ื—ื•ืŸ ืฉื ื™ ืชื ืื™ื ืฉื™ืชืื™ืžื• ืœืฉื ื™ ื”ืคื™ืฆื•ืœื™ื ื”ืืคืฉืจื™ื™ื, ืœื“ื•ื’ืžื:

  • xโ‰ฅ5x\geq 5
  • xโ‰ฅ8x\geq 8

ื ื‘ื“ื•ืง ืื ื›ืŸ ืืช ื›ืœ ื”ืคื™ืฆื•ืœื™ื ื”ืืคืฉืจื™ื™ื ืขืœ ื”ืžื“ื’ื ืฉืœื ื•. ื‘ืขื‘ื•ืจ ื”ืžืจื—ืง ื ืงื‘ืœ ืืช ืฉื’ื™ืื•ืช ื”ื—ื–ื™ื•ื™ (misclassification rate) ื”ื‘ืื•ืช:

ื‘ืขื‘ื•ืจ ื”ืžื—ื™ืจ ื ืงื‘ืœ ืืช ืฉื’ื™ืื•ืช ื”ื—ื™ื–ื•ื™ ื”ื‘ืื•ืช:

ืžืŸ ื”ื’ืจืคื™ื ื”ืืœื” ื ื•ื›ืœ ืœื”ืกื™ืง ื›ื™ ื”ืชื•ืฆืื” ื”ื˜ื•ื‘ื” ื‘ื™ื•ืชืจ (0.16) ืžืชืงื‘ืœ ื‘ืขื‘ื•ืจ ื”ืชื ืื™ ืฉืœ xDistanceโ‰ฅ75x_{\text{Distance}}\geq 75.

ื•ืœื›ืŸ ื ื‘ื—ืจ ืืช ื” node ืœื”ื™ื•ืช:

  • ืžื›ื™ื•ื•ืŸ ืฉืœืขืœื” ื”ืฉืžืืœื™ ืžื’ื™ืขื•ืช 13 ื“ื’ื™ืžื•ืช ืฉืœ ื”ื•ื ืื” ื• 108 ื—ื•ืงื™ื•ืช ื”ื—ื™ื–ื•ื™ ื‘ืขืœื” ื–ื” ื™ื”ื™ื” ืฉื”ืขืกืงื” ื—ื•ืงื™ืช
  • ืžื›ื™ื•ื•ืŸ ืฉืœืขืœื” ื”ื™ืžื ื™ ืžื’ื™ืขื•ืช 18 ื“ื’ื™ืžื•ืช ืฉืœ ื”ื•ื ืื” ื• 11 ื—ื•ืงื™ื•ืช ื”ื—ื™ื–ื•ื™ ื‘ืขืœื” ื–ื” ื™ื”ื™ื” ืฉื”ืขืกืงื” ื—ืฉื•ื“ื” ื›ื”ื•ื ืื”

ื ืงื‘ืœ ืื ื›ืŸ ืืช ื”ื—ื–ืื™ ื”ื–ื”:

ื ื•ื›ืœ ืœื”ืžืฉื™ืš ื›ืš ื•ืœื”ื•ืกื™ืฃ nodes ืขื“ ืืฉืจ ื ื’ื™ืข ืœืฉื’ื™ืื” 0 ืื• ืœืขื•ืžืง ืžืงืกื™ืžืืœื™ ืฉืื•ืชื• ื”ื’ื“ืจื ื• ืžืจืืฉ. ืืš ืœืคื ื™ ื›ืŸ ืื ื• ื ื›ื ื™ืก ืฉื™ื ื•ื™ ืงื˜ืŸ ื‘ืืœื’ื•ืจื™ืชื ืฉื™ืฉืคืจ ืืช ื‘ื™ืฆื•ืขื™ื•.

ืžื“ื“ื™ ื—ื•ืกืจ ื”ื•ืžื•ื’ื ื™ื•ืช

ื‘ื“ื•ื’ืžื ืฉื”ืจืื™ื ื• ื”ืžื“ื“ ืฉืื•ืชื• ื ื™ืกื™ื ื• ืœืฉืคืจ ื‘ื›ืœ ื”ื•ืกืคื” ืฉืœ ื” node ื”ื™ื” ื” misclassification rate. ืžืกืชื‘ืจ ืฉื ื™ืชืŸ ืขืœ ื™ื“ื™ ื”ื—ืœืคื” ืฉืœ ืžื“ื“ ื–ื” ื‘ืžื“ื“ ื˜ื™ืคื” ืฉื•ื ื” ืœืฉืคืจ ืืช ื‘ื™ืฆื•ืขื™ ื”ื—ื–ืื™. ื”ืžื“ื“ื™ื ื”ืืœื˜ืจื ื˜ื™ื‘ื™ื ืฉื ืฆื™ื’ ืžื ืกื™ื ืœื”ืชื™ื™ื—ืก ืœื ืจืง ืœืฉื’ื™ืืช ื”ื—ื™ื–ื•ื™ ื”ืžื™ื™ื“ื™ืช ืืœื ื’ื ืœื”ืกืชื›ืœ ืงื“ื™ืžื” ื•ืœื ืกื•ืช ืœืฉืคืจ ืืช ื”ืžืฆื‘ ืœืคื™ืฆื•ืœื™ื ื”ื‘ืื™ื.

ืžื“ื“ื™ื ืืœื• ืžืกืชืžื›ื™ื ืขืœ ื”ืจืขื™ื•ืŸ ื”ื‘ื. ื ืกืชื›ืœ ืขืœ ื”ืชืคืœื’ื•ืช ืฉืœ ื”ืชื•ื•ื™ื•ืช ืฉื”ื’ื™ืขื• ืœืขืœื” ืžืกื•ื™ื™ื. ื›ื“ื™ ืœืงื‘ืœ ื‘ืขืœื” ื–ื” ืžืขื˜ ืฉื’ื™ืื•ืช ื—ื™ื–ื•ื™ ืขืœื™ื ื• ืœื“ืื•ื’ ืฉื”ืคื™ืœื•ื’ ืฉืœ ื”ื“ื’ื™ืžื•ืช ื™ื”ื™ื” ืžืจื•ื›ื– ื›ืžื” ืฉื™ื•ืชืจ ื‘ืขืจืš ืื—ื“ ืžืกื•ื™ื™ื ืฉืื•ืชื• ื ื‘ื—ืจ ืœื”ื™ื•ืช ื”ื—ื™ื–ื•ื™ ืฉืœ ืื•ืชื• ืขืœื”. ื›ืืฉืจ ื–ื” ืœื ื”ืžืฆื‘, ื•ื”ืชื•ื•ื™ื•ืช ืžืคื•ืœื’ื•ืช ืขืœ ืคื ื™ ืžืกืคืจ ืขืจื›ื™ื ื ืงื‘ืœ ื”ืจื‘ื” ืฉื’ื™ืื•ืช ื—ื™ื–ื•ื™.

ื‘ืžืงืจื” ืฉื‘ื• ื”ืชื•ื•ื™ื•ืช ืžืจื•ื›ื–ื•ืช ืกื‘ื™ื‘ ืขืจืš ื™ื—ื™ื“ ืื ื• ื ืื•ืžืจ ืฉื”ื ืžืคื•ืœื’ื•ืช ื‘ืฆื•ืจื” ื”ื•ืžื•ื’ื ื™ืช (ืื• ืœื—ื™ืœื•ืคื™ืŸ ืฉื”ืคื™ืœื•ื’ ื˜ื”ื•ืจ - pure). ืœืขื•ืžืช ื–ืืช, ื‘ืžืงืจื” ืฉื‘ื• ื”ืชื•ื•ื™ื•ืช ืžืคื•ืœื’ื•ืช ื‘ืฆื•ืจื” ืื—ื™ื“ื” ืขืœ ืคื ื™ ื›ืœ ื”ืขืจื›ื™ื ืื ื• ื ืื•ืžืจ ืฉื”ื ืžืคื•ืœื’ื™ื ื‘ืฆื•ืจื” ื”ื˜ืจื•ื’ื ื™ืช:

ื ืฉืืฃ ืื ื›ืŸ, ืฉื”ืชื•ื•ื™ืช ื‘ื›ืœ ืขืœื” ื™ื”ื™ื• ื›ืžื” ืฉื™ื•ืชืจ ื”ื•ืžื•ื’ื ื™ื•ืช. ืžื™ื–ืขื•ืจ ืžื“ื“ ื” misclassification rate ืื•ืžื ื ื™ื•ื‘ื™ืœ ื‘ืื•ืคืŸ ื™ืฉื™ืจ ืœื”ื’ื“ืœืช ื”ื”ื•ืžื•ื’ื ื™ื•ืช, ืืš ื ื™ืชืŸ ืœื—ื™ืœื•ืคื™ืŸ ืœื”ืฉืชืžืฉ ื’ื ื‘ืžื“ื“ื™ ื—ื•ืกืจ ื”ื•ืžื•ื’ื ื™ื•ืช ืื—ืจื™ื.

ื‘ื”ื™ื ืชืŸ ืžืฉืชื ื” ื“ื™ืกืงืจื˜ื™ ืžืกื•ื™ื™ื y\text{y} ื‘ืขืœ ืคื™ืœื•ื’ p(y)p(y), ื ื’ื“ื™ืจ ืขื•ื“ ืฉื ื™ ืžืžื“ื™ ื—ื•ืกืจ ื”ื”ื•ืžื•ื’ื ื™ื•ืช ื ื•ืกืคื™ื, ื‘ื ื•ืกืฃ ืœ misclassification rate:

  • misclassification rate (ื›ืžื•ืช ื”ืฉื’ื™ืื•ืช ืืฉืจ ืฆืคื•ื™ื” ืœื”ืชืงื‘ืœ ื‘ืขื‘ื•ืจ ื—ื™ื–ื•ื™ ืฉืœ ื”ืขืจืš ื”ื›ื™ ืกื‘ื™ืจ)

    Q(p)=1โˆ’maxโกyโˆˆ{1,โ€ฆ,C}p(y)Q(p)=1-\max_{y\in\{1,\dots,C\}}p(y)
  • ืื™ื ื“ืงืก Gini:

    Q(p)=โˆ‘yโˆˆ{1,โ€ฆ,C}p(y)(1โˆ’p(y))Q(p)=\sum_{y\in\{1,\dots,C\}}p(y)(1-p(y))
  • ืื ื˜ืจื•ืคื™ื” (ืืฉืจ ืžืกื•ืžืŸ ื‘ืžืงืจื™ื ืจื‘ื™ื ื’ื ื› HH):

    Q(p)(=H(p))=โˆ‘yโˆˆ{1,โ€ฆ,C}โˆ’p(y)logโก2p(y)Q(p)(=H(p))=\sum_{y\in\{1,\dots,C\}}-p(y)\log_2 p(y)

ืžื“ื“ื™ื ืืœื• ืฉื•ื•ื™ื ืœ-0 ื‘ืขื‘ื•ืจ ืคื™ืœื•ื’ื™ื ื”ื•ืžื•ื’ื ื™ื ื•ื”ื ื’ื“ืœื™ื ื›ื›ืœ ืฉื”ืคื™ืœื•ื’ ื”ื•ืœืš ื•ื ืขืฉื” ื”ื˜ืจื•ื’ื ื™. ื”ืฉืจื˜ื•ื˜ื™ื ื”ื‘ืื™ื ืžืจืื™ื ืืช ื”ื”ืชื ื”ื’ื•ืช ืฉืœ ื”ืžืžื“ื™ื ื”ืืœื” ื‘ืžืงืจื” ืฉืœ ืžืฉืชื ื” ืืงืจืื™ ื‘ื™ื ืืจื™:

ื—ื•ืกืจ ื”ื•ืžื•ื’ื ื™ื•ืช ืžืžื•ืฆืขืช ืฉืœ ืขืฅ

ื‘ื”ื™ื ืชืŸ ืžื“ื’ื ืžืกื•ื™ื™ื, ืขืฅ ืžืกื•ื™ื™ื ื•ืžื“ื“ ื—ื•ืกืจ ื”ื•ืžื•ื’ื ื™ื•ืช ื ื•ื›ืœ ืœื—ืฉื‘ ืืช ื—ื•ืกืจ ื”ื”ื•ืžื•ื’ื ื™ื•ืช ื”ืžืžื•ืฆืขืช ืขืœ ื”ืขืœื™ื ืฉืœ ื”ืขืฅ ื‘ืื•ืคืŸ ื”ื‘ื:

  1. ื ืขื‘ื™ืจ ืืช ื”ื“ื’ื™ืžื•ืช ืžื”ืžื“ื’ื ื“ืจืš ื”ืขืฅ ื•ื ืคืฆืœ ืื•ืชื ืœืชืชื™ ืžื“ื’ืžื™ื ืขืœ ืคื™ ื”ืขืœื™ื ืฉืืœื™ื”ื ื”ื ื”ื’ื™ืขื•. ื ืกืžืŸ ืืช ืื•ืกืฃ ื”ืื™ื ื“ืงืกื™ื ืฉืœ ื”ื“ื’ื™ืžื•ืช ืฉื”ื’ื™ืขื• ืœืขืœื” ื” jj ื‘ Ij\mathcal{I}_j. ื ืกืžืŸ ืืช ื›ืžื•ืช ื”ื“ื’ื™ืžื•ืช ืฉื”ื’ื™ืขื• ืœืขืœื” ื” jj ื‘ NjN_j.
  2. ืœื›ืœ ืขืœื” ื ื—ืฉื‘ ืืช ื”ืคื™ืœื•ื’ ื”ืืžืคื™ืจื™ ืฉืœ ื”ืชื•ื™ื•ืช ืฉื”ื’ื™ืขื• ืขืœื™ื• ื‘ืื•ืคืŸ ื”ื‘ื:

    p^j,y=1Njโˆ‘iโˆˆIjI{yi=y}\hat{p}_{j,y}=\frac{1}{N_j}\sum_{i\in\mathcal{I}_j} I\{y_i=y\}

    (pj,yp_{j,y} ื”ื•ื ืคืฉื•ื˜ ื”ืฉื›ื™ื—ื•ืช ืฉืœ ื”ืขืจืš yy ืžื‘ื™ืŸ ื”ืชื•ื•ื™ื•ืช ื‘ืขืœื” ื” jj)

  3. ื‘ืขื–ืจืช ื”ืคื™ืœื•ื’ ื”ืืžืคื™ืจื™ ื ื—ืฉื‘ ืืช ื—ื•ืกืจ ื”ื”ื•ืžื•ื’ื ื™ื•ืช ืฉืœ ื›ืœ ืขืœื”:

    Q(p^j)Q(\hat{p}_j)
  4. ื”ืฆื™ื•ืŸ ื”ื›ื•ืœืœ ืฉืœ ื”ืขืฅ ื™ื”ื™ื” ื”ืžืžื•ืฆืข ื”ืžื•ืฉื›ืœืœ ืฉืœ ื—ื•ืกืจ ื”ื”ื•ืžื•ื’ื ื™ื•ืช ืฉืœ ื”ืขืœื™ื ื‘ื™ื—ืก ืœื›ืžื•ืช ื”ื“ื’ื™ืžื•ืช ืฉื”ื’ื™ืขื” ืœื›ืœ ืขืœื”:

    Qtotal=โˆ‘jNjNQ(p^j)Q_{\text{total}}=\sum_j \frac{N_j}{N}Q(\hat{p}_j)

ื›ืขืช ื ื•ื›ืœ ืœื‘ื ื•ืช ืืช ื”ืขืฅ ืชื•ืš ื ืกื™ื•ืŸ ืœืžื–ืขืจ ืืช ืžืžื“ื™ ื”ืฉื’ื™ืื” ื”ืืœื˜ืจื ื˜ื™ื‘ื™ื ื‘ืžืงื•ื ืืช ื” misclassification rate. ืœืจื•ื‘ ืฉื™ืžื•ืฉ ื‘ Gini ืื• ื‘ืื ื˜ืจื•ืคื™ื” ื™ื•ื‘ื™ืœ ืœื‘ื™ืฆื•ืขื™ื ื˜ื•ื‘ื™ื ื™ื•ืชืจ.

ื‘ื—ื–ืจื” ืœื“ื•ื’ืžื

ื ืชื—ื™ืœ ืžื—ื“ืฉ ืืช ื”ื‘ื ื™ื” ืฉืœ ื”ืขืฅ ืืš ื”ืคืขื ื ืžื–ืขืจ ืืช ืžื“ื“ Gini.

ื‘ืขื‘ื•ืจ ื”ืฆื•ืžืช ื”ืจืืฉื•ืŸ ื ื—ืคืฉ ืืช ื”ืชื ืื™ ืืฉืจ ืžืžื–ืขืจ ืืช ื”ืžื“ื“:

ื‘ื“ื•ืžื” ืœืงื•ื“ื, ื ืงื‘ืœ ื›ื™ ื”ืชื•ืฆืื” ื”ื˜ื•ื‘ื” ื‘ื™ื•ืชืจ (0.25) ืžืชืงื‘ืœ ื‘ืขื‘ื•ืจ ื”ืชื ืื™ ืฉืœ xDistanceโ‰ฅ75x_{\text{Distance}}\geq 75.

ื ืžืฉื™ืš ื‘ืื•ืชื” ื”ืฉื™ื˜ื” ื•ื ืงื‘ืœ

ื ื‘ื“ื•ืง ืืช ื‘ื™ืฆื•ืขื™ ื”ื—ื–ืื™ ืขืœ ื” test set ืœืคื™ ืคื•ื ืงืฆื™ื™ืช ื” miscalssification rate:

testย score=1Ntestโˆ‘{x(i),y(i)}โˆˆDtestI{h(x(i))โ‰ y(i)}=0.14\text{test score}=\frac{1}{N_{\text{test}}}\sum_{\{\boldsymbol{x}^{(i)},y^{(i)}\}\in\mathcal{D}_{\text{test}}}I\{h(\boldsymbol{x}^{(i)})\neq y^{(i)}\}=0.14

ืฉืœื‘ ืฉื ื™ - pruning (ื’ื™ื–ื•ื)

ืขืœ ืžื ืช ืœืงื˜ื™ืŸ ืืช ื›ืžื•ืช ื” overfitting ืื ื• ื ืฉืชืžืฉ ื‘ validation set ืขืœ ืžื ืช ืœืืชืจ ืขื ืคื™ื ืืฉืจ ืื™ื ื ืžืฉืคืจื™ื ืื• ืคื•ื’ืขื™ื ื‘ื‘ื™ืฆืขื™ ื”ื—ื–ืื™. ืื ื• ื ืขืฉื” ื–ืืช ืขืœ ื™ื“ื™ ืžืขื‘ืจ ืฉืœ ืขืœ ื›ืœ ื”ืฆืžืชื™ื ืฉื ืžืฆืื™ื ื‘ืงืฆื•ื•ืช ื”ืขืฅ ื ื ืกื” ืœื”ืกื™ืจื. ื ื‘ื“ื•ืง ืืช ื”ืฆื™ื•ืŸ ื”ืžืชืงื‘ืœ ืขืœ ื” validation set ืื™ืชื ื•ื‘ืœืขื“ื™ื”ื. ืื ื”ื ื•ื›ื—ื•ืช ืฉืœื”ื ืœื ืžืฉืคืจืช ืืช ื‘ื™ืฆื•ืขื™ ื”ื—ื–ืื™ ืื ื• ื ืกื™ืจ ืื•ืชื. ืื ื• ื ืžืฉื™ืš ื•ื ื‘ื“ื•ืง ืืช ื”ืขื ืคื™ื ื‘ืงืฆื•ืช ื”ืขืฅ ืขื“ ืฉืœื ื™ื™ืฉืืจื• ืฆืžืชื™ื ืฉื™ืฉ ืœื”ืกื™ืจ.

ื“ื•ื’ืžื

ื ืฉืชืžืฉ ื‘ validation set ืขืœ ืžื ืช ืœืขืฉื•ืช pruning ืœืขืฅ ืฉืงื™ื‘ืœื ื•. ื”ืขืฅ ืœืคื ื™ ื” pruning ื”ื™ื ื•:

ื•ืœืื—ืจื™ื• ื”ื™ื ื•:

ื”ื‘ื™ืฆื•ืขื™ื ืขืœ ื” test set ื™ื”ื™ื• ื›ืขืช 8% ืฉื’ื™ืื”.

Regression Tree

ื ื™ืชืŸ ืœื”ืฉืชืžืฉ ื‘ืขืฆื™ื ื’ื ืœืคืชืจื•ืŸ ื‘ืขื™ื•ืช ืจื’ืจืกื™ื”. ื‘ืžืงืจื” ืฉืœ ืจื’ืจืกื™ื” ืขื ืคื•ื ืงืฆื™ื™ืช ืžื—ื™ืจ ืฉืœ MSE, ื”ื‘ื ื™ื” ืฉืœ ื”ืขืฅ ืชื”ื™ื” ื–ื”ื” ืžืœื‘ื“ ืฉื ื™ ื”ื‘ื“ืœื™ื:

  1. ืชื•ืฆืืช ื”ื—ื™ื–ื•ื™ ื‘ืขืœื” ืžืกื•ื™ื™ื ืชื”ื™ื” ื”ืขืจืš ื”ืžืžื•ืฆืข ืฉืœ ื”ืชื•ื•ื™ื•ืช ื‘ืื•ืชื• ืขืœื”. (ื‘ืžืงื•ื ื”ืขืจืš ื”ืฉื›ื™ื—)
  2. ืืช ืžื“ื“ ื—ื•ืกืจ ื”ื”ื•ืžื•ื’ื ื™ื•ืช ื ื—ืœื™ืฃ ื‘ืฉื’ื™ืื” ื”ืจื™ื‘ื•ืขื™ืช ืฉืœ ื”ื—ื™ื–ื•ื™ ืฉืœ ื”ืขืฅ.