1. kNN
- select variables based on PCA results
- normalize each variable (min-max or z-score)
- scatter the datapoints into feature space
- split the datapoints into a labeled group (label O) and an unlabeled group (label X)
- apply the kNN algorithm -> classify each label-X datapoint by a majority vote among its k nearest labeled (label-O) datapoints
* k is an odd number! -> avoids the tie where the k nearest datapoints split 50:50 between two classes (in binary classification)
* supervised learning: requires a labeled group
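The steps above can be sketched as follows; this is a minimal illustration with a made-up 2D dataset, not a production implementation (min-max normalization, odd k, majority vote):

```python
import numpy as np
from collections import Counter

def min_max(X):
    """Min-max normalize each variable (column) to [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

def knn_classify(X_labeled, y_labeled, x_new, k=3):
    """Classify x_new by majority vote among its k nearest labeled points."""
    dists = np.linalg.norm(X_labeled - x_new, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]                    # indices of k nearest
    votes = Counter(y_labeled[i] for i in nearest)
    return votes.most_common(1)[0][0]

# labeled group (label O): two illustrative clusters in 2D
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],   # class 'A'
              [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]])  # class 'B'
y = np.array(['A', 'A', 'A', 'B', 'B', 'B'])

Xn = min_max(X)
# unlabeled datapoint (label X), normalized with the labeled group's min/max
x_new = (np.array([1.1, 1.0]) - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
print(knn_classify(Xn, y, x_new, k=3))  # -> 'A'
```

Note that the new point is normalized with the labeled data's min/max, so both live on the same scale; k=3 is odd, so a two-class vote cannot tie.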
2. k means clustering
- select variables based on PCA results
- normalize each variable (min-max or z-score)
- scatter the datapoints into feature space
- datapoints -> all of them are unlabeled (label X)
(1) scatter k center points randomly
(2) assign each datapoint to the group of its nearest center point
(3) move each center point: old coordinate -> mean coordinate of its assigned datapoints
-> repeat (2), (3) until the center points stop moving
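Steps (1)-(3) above can be sketched like this; the dataset, k, and iteration cap are illustrative assumptions:

```python
import numpy as np

def kmeans(X, k=2, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    # (1) scatter center points randomly (here: pick k random datapoints)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # (2) assign each datapoint to its nearest center point
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # (3) move each center to the mean coordinate of its datapoints
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):  # centers stopped moving
            break
        centers = new_centers
    return labels, centers

# two well-separated illustrative clusters, all unlabeled (label X)
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [4.0, 4.0], [4.1, 3.9], [3.9, 4.2]])
labels, centers = kmeans(X, k=2)
```

Seeding the centers on existing datapoints avoids starting with an empty cluster; k-means can still converge to a local optimum, so in practice it is often run several times with different seeds.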
3. Apriori method
- datapoints: {x1, x2, x3, ..., y1, y2, ...}
- goal: find candidate pairs xn -> yn (an association that may hint at causality, but does not prove it)
- an association rule mining method
- maximize the 'confidence' value: confidence = P(y|x) = P(y&x)/P(x), where P(y&x) and P(x) are 'support' values -> confidence is the ratio of the two supports
* to reduce the cost of iterating over all possible pairs, the Apriori algorithm removes any x with a small support value, filtering out low-probability x, y combinations
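A minimal sketch of the support/confidence idea, restricted to single-item rules x -> y; the transaction data and the min_support threshold are made up for illustration:

```python
from itertools import permutations

# illustrative market-basket transactions
transactions = [
    {'bread', 'milk'},
    {'bread', 'butter'},
    {'bread', 'milk', 'butter'},
    {'milk'},
]
n = len(transactions)

def support(itemset):
    """Fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / n

# Apriori pruning step: drop items whose support is below the threshold,
# so low-support x never enter the pair search
min_support = 0.5
items = {i for t in transactions for i in t}
frequent = {i for i in items if support({i}) >= min_support}

# confidence(x -> y) = support({x, y}) / support({x}) -- ratio of two supports
rules = {(x, y): support({x, y}) / support({x})
         for x, y in permutations(frequent, 2)}
print(rules[('bread', 'milk')])  # -> 0.666... (2 of 3 bread baskets have milk)
```

The full Apriori algorithm extends this pruning to multi-item sets: any superset of an infrequent itemset is skipped, which is what keeps the iteration cost down.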
source: [Practical Artificial Intelligence for Physicians]