Data

Sung Yun Byeon edited this page Mar 9, 2019 · 3 revisions

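In machine learning and deep learning, the most important thing is the data.
If the data you want already exists, you can use it right away; if not, you have to obtain it through crawling or similar means.
Representative places to find data are listed below.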

Computer Vision Data

ImageNet
Labelme
LSUN : scene classification data (Bedroom, Bridge, Classroom, ...)
CIFAR-10 : needs no introduction
COCO : object detection, segmentation, captioning dataset
YouTube 8M : a large-scale labeled dataset that consists of millions of YouTube video IDs, with annotations of more than 3,800 visual entities
Visual Genome : Visual Genome is a dataset, a knowledge base, an ongoing effort to connect structured image concepts to language (10k)
Stanford Dogs Dataset : images of 120 dog breeds
VisualQA
DeepFashion2

Natural Language Processing Data (Korean)

KorQuAD : The Korean Question Answering Dataset
KAIST Corpus
Sejong Corpus
University of Ulsan Corpus
Wikipedia dump
Namuwiki dump
Naver movie review

SLAM

Complex Urban Data Set
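
When none of the datasets above fits and the data has to be crawled, the first step is usually to pull link and image URLs out of a page. The following is a minimal sketch of that step using only the Python standard library. The `LinkExtractor` class, the `extract_urls` helper, and the `example.com` sample page are hypothetical names made up for illustration; they are not part of any dataset listed here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Hypothetical helper: collects absolute URLs from <a href> and <img src>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []   # targets of <a href="...">
        self.images = []  # targets of <img src="...">

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            # Resolve relative paths against the page URL.
            self.links.append(urljoin(self.base_url, attr_map["href"]))
        elif tag == "img" and attr_map.get("src"):
            self.images.append(urljoin(self.base_url, attr_map["src"]))


def extract_urls(html, base_url):
    """Return (links, images) found in an HTML string."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links, parser.images


# Made-up sample page, so the sketch runs without network access.
sample_html = """
<html><body>
  <a href="/page/2">next</a>
  <img src="images/cat_001.jpg">
  <img src="https://example.com/images/dog_001.jpg">
</body></html>
"""

links, images = extract_urls(sample_html, "https://example.com/gallery/")
print(links)
print(images)
```

In a real crawl the HTML string would come from an HTTP request (for example `urllib.request.urlopen`), and it is worth checking the site's robots.txt and the license of the content before collecting anything.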

I have started the Kyle School YouTube channel! I plan to share stories from industry practice that this wiki does not cover.

If you have questions or requests, please send an email to snugyun01@gmail.com :)