Thesis Information

Material type: Master's thesis
Author: 이보배 (Korea University, Graduate School of Computer and Information Technology)
Advisor: 김현철
Year of publication: 2020
Copyright: Korea University theses are protected by copyright.


Abstract · Keywords

Data imbalance is known to degrade the performance of machine-learning classification models. To address such imbalance, research is actively being conducted on sampling methods and on ensemble techniques that combine two or more models, with the aim of evening out the data distribution across the target classes and improving performance.
In this study, to examine how data imbalance affects machine-learning model performance, we took the reverse approach and deliberately created class imbalance for the experiments. The data used were the MNIST handwritten digit images, and four machine-learning models were tested: a convolutional neural network (CNN), a decision tree (DT), logistic regression (LR), and k-nearest neighbors (k-NN). Experiments were run under three conditions (imbalance in classes 0–4, imbalance in specific classes, and differences in the amount of data across all classes), deleting data in 10% increments in each case. Among these, the settings with 10%, 40%, and 80% imbalance were selected for analysis, and the results confirmed that the greater the imbalance, the greater its effect on model performance.
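The experimental setup described above can be sketched in code. This is a minimal illustration only, not the thesis's actual implementation: scikit-learn's bundled 8×8 digits dataset stands in for MNIST, logistic regression stands in for the four models, and the helper name `make_imbalanced` is an assumption introduced here to mirror Experiment 1 (deleting a fraction of the samples in classes 0–4 in 10% steps).

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

def make_imbalanced(X, y, classes, drop_frac, seed=0):
    """Delete drop_frac of the samples belonging to `classes`
    (Experiment 1 style: imbalance only in the listed classes)."""
    rng = np.random.default_rng(seed)
    keep = np.ones(len(y), dtype=bool)
    for c in classes:
        idx = np.flatnonzero(y == c)
        drop = rng.choice(idx, size=int(len(idx) * drop_frac), replace=False)
        keep[drop] = False
    return X[keep], y[keep]

# Stand-in dataset: sklearn's 8x8 digits (MNIST would be used in the thesis).
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

for frac in (0.1, 0.4, 0.8):  # the three imbalance levels analysed
    X_imb, y_imb = make_imbalanced(X_tr, y_tr, classes=range(5), drop_frac=frac)
    clf = LogisticRegression(max_iter=2000).fit(X_imb, y_imb)
    rec = recall_score(y_te, clf.predict(X_te), average=None)
    # Compare recall on the depleted classes (0-4) vs the untouched ones (5-9).
    print(f"drop={frac:.0%}  recall 0-4: {rec[:5].mean():.3f}  "
          f"recall 5-9: {rec[5:].mean():.3f}")
```

Under this setup one would expect recall on classes 0–4 to fall as `drop_frac` grows while classes 5–9 stay roughly flat, which is the qualitative trend the abstract reports.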

Table of Contents

Chapter 1  Introduction ·········································· 1
  1.1 Research Background and Purpose ························· 1
  1.2 Organization of the Thesis ······························· 2
Chapter 2  Related Work ·········································· 2
Chapter 3  Research Plan ········································· 3
  3.1 Experimental Method ······································· 3
  3.2 Data Preparation ·········································· 3
Chapter 4  Results ··············································· 5
  4.1 Experiment 1: Imbalance in Classes 0–4 ·················· 5
    4.1.1 Convolutional Neural Network ························· 5
    4.1.2 Decision Tree ·········································· 7
    4.1.3 Logistic Regression ···································· 9
    4.1.4 k-Nearest Neighbors ··································· 11
    4.1.5 Experiment 1: Overall Results ························ 13
  4.2 Experiment 2: Imbalance in Specific Classes (1, 3, 4, 6, 7) ·· 15
    4.2.1 Convolutional Neural Network ························· 16
    4.2.2 Decision Tree ········································· 19
    4.2.3 Logistic Regression ··································· 22
    4.2.4 k-Nearest Neighbors ··································· 25
    4.2.5 Experiment 2: Overall Results ························ 28
  4.3 Experiment 3: Differences in Data Volume Across All Classes ·· 29
    4.3.1 Convolutional Neural Network ························· 29
    4.3.2 Decision Tree ········································· 32
    4.3.3 Logistic Regression ··································· 35
    4.3.4 k-Nearest Neighbors ··································· 38
    4.3.5 Experiment 3: Overall Results ························ 41
Chapter 5  Conclusion ··········································· 44
References ······················································ 45
