This repository has been archived by the owner on May 9, 2024. It is now read-only.

# Korean News Dataset from Big Kinds

## Sample

```yaml
name: bigkinds
fullname: Korean News Dataset from Big Kinds
lang: ko
category: formal
description: Korean News Dataset from Big Kinds
homepage: https://www.bigkinds.or.kr
version: 1.0.0
num_docs: 871304
num_docs_before_processing: 912468
num_segments: 6224096
num_sents: 7759115
num_words: 197746184
size_in_bytes: 2138439745
num_bytes_before_processing: 2217210906
size_in_human_bytes: 1.99 GiB
data_files_modified: '2022-02-21 19:33:15'
meta_files_modified: '2022-02-21 19:28:34'
info_updated: '2022-02-26 03:06:08'
data_files:
  train: bigkinds-train.parquet
meta_files:
  train: meta-bigkinds-train.parquet
features:
  columns:
    id: id
    text: text
  data:
    id: int
    text: str
  meta:
    id: int
    filename: str
```
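
The `features` entry lists an `id` column in both the data file and the meta file, which suggests the two can be joined on `id`. The following is a minimal sketch of that join using pandas, not the repository's own loader. It assumes the two Parquet files sit in the working directory, that a Parquet engine such as `pyarrow` is installed, and that each `id` appears at most once in each file. The function names `join_text_with_meta` and `load_bigkinds` are hypothetical.

```python
import pandas as pd

# File names are taken from the `data_files` / `meta_files` entries above.
# Their location (the current directory) is an assumption.
DATA_FILE = "bigkinds-train.parquet"
META_FILE = "meta-bigkinds-train.parquet"


def join_text_with_meta(data: pd.DataFrame, meta: pd.DataFrame) -> pd.DataFrame:
    """Attach each document's source filename to its text via the shared `id`."""
    # Assumption: ids are unique on both sides. validate="one_to_one" makes
    # pandas raise MergeError if that does not hold, instead of silently
    # duplicating rows.
    return data.merge(meta, on="id", how="left", validate="one_to_one")


def load_bigkinds(data_path: str = DATA_FILE, meta_path: str = META_FILE) -> pd.DataFrame:
    """Read the train split and its metadata, then join them on `id`."""
    # Column names follow the `features.data` and `features.meta` entries.
    data = pd.read_parquet(data_path, columns=["id", "text"])
    meta = pd.read_parquet(meta_path, columns=["id", "filename"])
    return join_text_with_meta(data, meta)
```

With both files present, `df = load_bigkinds()` would return one row per document with the columns `id`, `text`, and `filename`. The left join keeps every document even if its metadata row is missing, in which case `filename` is null.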