Machine Learning

kNN, k-means clustering, Apriori method

jun1-cs 2026. 2. 21. 10:10

1. kNN

- select variables based on PCA results

- normalize each variable (min-max scaling or z-score standardization)

- plot the datapoints in the feature space

- split the datapoints into a labeled group and an unlabeled group

- apply the kNN algorithm -> classify each unlabeled datapoint by majority vote among its k nearest labeled datapoints
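The normalization step in the list above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `min_max_scale` and `z_score` and the toy `ages` values are assumptions made for the example, not something taken from the source book.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale one variable to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:  # constant variable: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

def z_score(values):
    """Shift one variable to mean 0 and scale it to standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:  # constant variable: nothing to scale
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical values for a single variable
ages = [30.0, 50.0, 70.0]
scaled = min_max_scale(ages)
standardized = z_score(ages)
print(scaled)  # [0.0, 0.5, 1.0]
print(standardized)
```

Either method puts variables with different units on a comparable scale, so that no single variable dominates the distance calculation.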

 

* k should be an odd number -> with two classes, this avoids the case where the k nearest datapoints split 50:50 and the vote ties

* supervised learning: the labeled group is required
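The kNN procedure above can be sketched in a few lines. This is a minimal illustration that assumes the variables are already normalized and uses Euclidean distance; the function name `knn_classify`, the class names "A" and "B", and the toy coordinates are hypothetical.

```python
from collections import Counter
from math import dist

def knn_classify(query, labeled_points, k=3):
    """Classify `query` by majority vote among its k nearest labeled points.

    labeled_points is a list of (coordinates, label) pairs.
    """
    neighbors = sorted(labeled_points, key=lambda p: dist(query, p[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Labeled group (made-up, already normalized coordinates)
labeled = [
    ((0.10, 0.20), "A"),
    ((0.20, 0.10), "A"),
    ((0.15, 0.30), "A"),
    ((0.80, 0.90), "B"),
    ((0.90, 0.80), "B"),
    ((0.85, 0.70), "B"),
]

# Unlabeled group to classify
unlabeled = [(0.2, 0.2), (0.8, 0.8)]
predictions = [knn_classify(q, labeled, k=3) for q in unlabeled]
print(predictions)  # ['A', 'B']
```

With k=3 and two classes, the vote can only come out 3:0 or 2:1, which is the reason for choosing an odd k.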

 

2. k-means clustering

- select variables based on PCA results

- normalize each variable (min-max scaling or z-score standardization)

- plot the datapoints in the feature space

- datapoints -> all of them are unlabeled (unsupervised learning)

 

(1) place k center points at random positions

(2) assign each datapoint to the group of its nearest center point

(3) move each center point: old coordinate -> mean coordinate of the datapoints in its group

-> repeat (2),(3) until the group assignments stop changing
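Steps (1) to (3) can be sketched as follows. This is a minimal illustration, not a production implementation: it assumes Euclidean distance, and for step (1) it picks k random datapoints as the initial centers, which is one common way to "place centers randomly". The function name `k_means`, the parameters `max_iter` and `seed`, and the toy points are assumptions made for the example.

```python
import random
from math import dist

def k_means(points, k, max_iter=100, seed=0):
    """Return (centers, assignment) after iterating steps (2) and (3)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # (1) k random datapoints as initial centers
    assignment = None
    for _ in range(max_iter):
        # (2) assign each datapoint to its nearest center
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:  # nothing changed: converged
            break
        assignment = new_assignment
        # (3) move each center to the mean coordinate of its datapoints
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a group is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment

# Two well-separated toy groups (made-up coordinates)
points = [(0, 0), (1, 1), (2, 2), (10, 10), (11, 11), (12, 12)]
centers, assignment = k_means(points, k=2)
print(sorted(centers))  # [(1.0, 1.0), (11.0, 11.0)]
```

The result of k-means generally depends on the initial centers, so in practice the algorithm is often run several times with different random starts.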

 

3. Apriori method

- datapoints: {x1,x2,x3,...,y1,y2,...}

- goal: find candidate pairs xn,yn that may hint at causality (xn->yn?)

- it is an association method: it measures how often x and y occur together, which alone does not prove causality

- maximize the 'confidence' value, P(y|x)=P(y&x)/P(x)   -> P(y&x), P(x): 'support'  -> confidence = ratio of the two supports
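The support and confidence definitions above can be computed directly from a list of records. This is a minimal sketch; the function names `support` and `confidence` and the toy records (sets of made-up findings) are assumptions for illustration only.

```python
def support(transactions, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, x, y):
    """P(y|x) = support({x, y}) / support({x})."""
    return support(transactions, {x, y}) / support(transactions, {x})

# Hypothetical records, each a set of observed items
transactions = [
    {"smoking", "cough"},
    {"smoking", "cough", "fever"},
    {"smoking"},
    {"fever"},
    {"cough", "fever"},
]

s_x = support(transactions, {"smoking"})            # 3 of 5 records
s_xy = support(transactions, {"smoking", "cough"})  # 2 of 5 records
conf = confidence(transactions, "smoking", "cough")
print(s_x, s_xy, round(conf, 3))  # 0.6 0.4 0.667
```

Here the rule smoking->cough has confidence 2/3: of the three records that contain "smoking", two also contain "cough".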

 

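* to reduce the cost of iterating over all possible pairs, the Apriori algorithm removes every x whose support is below a chosen threshold -> any pair x,y containing that x cannot have higher support than x alone, so low-probability pairs are filtered out without being counted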
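The pruning idea above can be sketched as a level-by-level search: only itemsets that meet the support threshold are extended to larger ones. This is a simplified illustration of the Apriori principle, not the book's implementation; the function name `frequent_itemsets`, the threshold `min_support=0.4`, and the toy records are assumptions.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted({i for t in transactions for i in t})
    # Level 1: drop single items with low support right away
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    size = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        size += 1
        # Build larger candidates only from itemsets that survived
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == size}
        # Prune candidates that contain any infrequent subset
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, size - 1))
        }
        current = [c for c in candidates if support(c) >= min_support]
    return frequent

# Hypothetical records; "rash" appears only once
transactions = [
    {"smoking", "cough"},
    {"smoking", "cough", "fever"},
    {"smoking"},
    {"fever", "rash"},
    {"cough", "fever"},
]
frequent = frequent_itemsets(transactions, min_support=0.4)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

In this toy data "rash" has support 0.2 and is removed at the first level, so no pair containing "rash" is ever counted.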

 

 

 

 

source: [Practical Artificial Intelligence for Physicians]

 
