
[모두를 위한 딥러닝] Multinomial Classification and Softmax Regression

ruhz 2020. 8. 3. 17:53

Multinomial Classification

In the previous post we covered logistic regression, which learns from data labeled 0 / 1 and produces (classifies) a binary result. But the world we live in is full of things that cannot be sorted into just two categories. So how can we implement multinomial classification, which separates data into several classes?

Looking at the data and dividing it into three groups is trivial for a person, but a computer has no such intuition. Let's apply the logistic classification we learned earlier: split the problem into three smaller binary problems, distinguishing A (red) vs. not A, B (green) vs. not B, and C (blue) vs. not C.

import tensorflow as tf
tf.set_random_seed(777)  # fix the seed so results are reproducible

x_data = [[1, 2, 1, 1],
          [2, 1, 3, 2],
          [3, 1, 3, 4],
          [4, 1, 5, 5],
          [1, 7, 5, 5],
          [1, 2, 5, 6],
          [1, 6, 6, 6],
          [1, 7, 7, 7]]
y_data = [[0, 0, 1],
          [0, 0, 1],
          [0, 0, 1],
          [0, 1, 0],
          [0, 1, 0],
          [0, 1, 0],
          [1, 0, 0],
          [1, 0, 0]]

X = tf.placeholder("float", [None, 4])  # any number of samples, 4 features each
Y = tf.placeholder("float", [None, 3])  # one-hot label over 3 classes
nb_classes = 3

W = tf.Variable(tf.random_normal([4, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')

What the input data means:
if x_data is [1, 2, 1, 1], then y_data is 'C' (one-hot encoding: [1, 0, 0] = 'A', [0, 1, 0] = 'B', [0, 0, 1] = 'C')
if x_data is [2, 1, 3, 2], then y_data is 'C'
if x_data is [4, 1, 5, 5], then y_data is 'B' ...

nb_classes defines, as a number, how many categories we classify into, and we end up building nb_classes linear expressions, one per class. Here, of course, they are merged and expressed as a single weight matrix.

Imagine the data from the code above actually stored in X as a matrix, and work out the matrix product. You can see that nb_classes = 3 lines come out, each of the form x1*w1 + x2*w2 + x3*w3 + x4*w4 = (score for 'A').
We will use this matrix product, as a whole, as our hypothesis.
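To make those shapes concrete, here is a small NumPy sketch of the same matrix product (NumPy is used only for illustration, and the weights are random stand-ins, not trained values):

```python
import numpy as np

# same 8 samples x 4 features as x_data above
X = np.array([[1, 2, 1, 1], [2, 1, 3, 2], [3, 1, 3, 4], [4, 1, 5, 5],
              [1, 7, 5, 5], [1, 2, 5, 6], [1, 6, 6, 6], [1, 7, 7, 7]], dtype=float)

nb_classes = 3
rng = np.random.default_rng(777)
W = rng.normal(size=(4, nb_classes))  # one column of weights per class
b = rng.normal(size=(nb_classes,))

logits = X @ W + b  # (8, 4) @ (4, 3) + (3,) -> (8, 3): one score per class per sample
print(logits.shape)  # (8, 3)
```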


hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)  # class probabilities

cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))  # cross-entropy

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
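The cost above is the cross-entropy between the one-hot labels Y and the softmax output. A minimal NumPy check of that same formula, on made-up probability values:

```python
import numpy as np

Y = np.array([[1, 0, 0],
              [0, 1, 0]], dtype=float)  # one-hot labels for two samples
H = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2]])         # hypothetical softmax outputs (rows sum to 1)

# mean over samples of -sum(Y * log(H)) along each row
cost = np.mean(-np.sum(Y * np.log(H), axis=1))
print(round(cost, 4))  # 0.5249, i.e. the average of -log(0.7) and -log(0.5)
```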

Softmax

softmax rescales a set of scores so that they are all positive and sum to 1, turning them into something we can read as probabilities over the whole set. Concretely, it exponentiates each score and divides by the sum of the exponentials; for example (2.0, 1.0, 0.1) → roughly (0.66, 0.24, 0.10). Since we defined one linear expression per class (nb_classes of them), we get one probability per class. Feeding these values into the argmax function then marks the highest-probability class as 1 and the rest as 0, expressing the result as one-hot.
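A quick NumPy sketch of softmax followed by argmax (the input scores are arbitrary illustration values):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability; result is unchanged
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(np.round(p, 2))  # [0.66 0.24 0.1 ]
print(np.argmax(p))    # 0, i.e. one-hot [1, 0, 0]
```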

For some input (1, 2, 3, 4) there are three lines, one each for (A / not A), (B / not B), and (C / not C). The hypothesis produces one score Y per line, and softmax converts those scores into probability values; note that softmax takes over the squashing role sigmoid played in binary classification, so the linear scores are fed into it directly. When checking the final result, argmax turns the probabilities into one-hot form, and if we obtain a result like [1, 0, 0], the model has classified the input (1, 2, 3, 4) as 'A'.

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for step in range(2001):
        _, cost_val = sess.run([optimizer, cost], feed_dict={X: x_data, Y: y_data})

        if step % 200 == 0:
            print(step, cost_val)

    print('--------------')
    # Testing & One-hot encoding via argmax
    a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9]]})
    print(a, sess.run(tf.argmax(a, 1)))

    print('--------------')
    b2 = sess.run(hypothesis, feed_dict={X: [[1, 3, 4, 3]]})  # b2, not b, to avoid shadowing the bias variable
    print(b2, sess.run(tf.argmax(b2, 1)))

    print('--------------')
    c = sess.run(hypothesis, feed_dict={X: [[1, 1, 0, 1]]})
    print(c, sess.run(tf.argmax(c, 1)))

    print('--------------')
    all_probs = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9], [1, 3, 4, 3], [1, 1, 0, 1]]})  # all_probs, not all, to avoid shadowing the builtin
    print(all_probs, sess.run(tf.argmax(all_probs, 1)))
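TF 1.x sessions are deprecated nowadays, so as a reference, here is the same softmax regression trained with a hand-derived gradient step in plain NumPy. This is a sketch, not the lecture's code; it assumes the standard result that the gradient of the mean cross-entropy with respect to the logits is (H - Y) / m:

```python
import numpy as np

x_data = np.array([[1, 2, 1, 1], [2, 1, 3, 2], [3, 1, 3, 4], [4, 1, 5, 5],
                   [1, 7, 5, 5], [1, 2, 5, 6], [1, 6, 6, 6], [1, 7, 7, 7]], dtype=float)
y_data = np.array([[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0],
                   [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]], dtype=float)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # row-wise, numerically stable
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(777)
W = rng.normal(size=(4, 3))
b = np.zeros(3)
lr, m = 0.1, len(x_data)

for step in range(2001):
    H = softmax(x_data @ W + b)     # forward pass: class probabilities
    grad = (H - y_data) / m         # d(cost)/d(logits) for softmax + cross-entropy
    W -= lr * (x_data.T @ grad)     # gradient descent update
    b -= lr * grad.sum(axis=0)

pred = softmax(x_data @ W + b).argmax(axis=1)
print(pred)                   # predicted class index per training sample
print(y_data.argmax(axis=1))  # true classes: [2 2 2 1 1 1 0 0]
```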

To recap the full flow built so far: the matrix product XW + b produces one score per class, softmax converts those scores into probabilities, and argmax picks the most probable class as a one-hot result.

This post was written from the notes I took while following the '모두를 위한 딥러닝' lecture linked in the previous post, together with some additional study of my own.