Commit 0ad9551

Add more dataset
1 parent 55a3082 commit 0ad9551

4 files changed: +58 -0 lines changed
@@ -0,0 +1,17 @@
+type: huggingface
+id: fujiki/japanese_alpaca_data
+url: https://huggingface.co/datasets/fujiki/japanese_alpaca_data
+converted_size: 13.8MB
+license: CC-BY-NC-SA-4.0
+lang: ja
+description: This dataset is based on masa3141's great work on japanese-alpaca-lora [github]. Please also refer to this repo.
+structure:
+  - id: instruction
+    type: string
+    description: Instruction
+  - id: input
+    type: string
+    description: Context
+  - id: output
+    type: string
+    description: Output
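The metadata files in this commit share one shape: seven top-level keys, an optional `structure` list, and a `url` that is the Hugging Face datasets prefix followed by `id`. The sketch below checks an entry against that shape. It is illustrative only: `check_entry`, `REQUIRED_KEYS`, and `HF_DATASET_PREFIX` are hypothetical names, the required-key list is inferred from the four files here rather than from any schema in the repository, and the entry is transcribed by hand as a dict so the example needs no YAML parser.

```python
from __future__ import annotations

# Keys that every file in this commit carries (inferred, not an official schema).
REQUIRED_KEYS = ("type", "id", "url", "converted_size", "license", "lang", "description")
HF_DATASET_PREFIX = "https://huggingface.co/datasets/"


def check_entry(entry: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in entry:
            problems.append(f"missing key: {key}")
    # In all four files the url is the Hugging Face prefix followed by the id.
    if entry.get("type") == "huggingface" and "id" in entry and "url" in entry:
        expected = HF_DATASET_PREFIX + entry["id"]
        if entry["url"] != expected:
            problems.append(f"url should be {expected}")
    # `structure` is optional (the livedoor entry omits it).
    for i, field in enumerate(entry.get("structure", [])):
        for key in ("id", "type", "description"):
            if key not in field:
                problems.append(f"structure[{i}] missing key: {key}")
    return problems


# Hand-transcribed copy of the first file in this commit.
entry = {
    "type": "huggingface",
    "id": "fujiki/japanese_alpaca_data",
    "url": "https://huggingface.co/datasets/fujiki/japanese_alpaca_data",
    "converted_size": "13.8MB",
    "license": "CC-BY-NC-SA-4.0",
    "lang": "ja",
    "description": "This dataset is based on masa3141's great work on japanese-alpaca-lora.",
    "structure": [
        {"id": "instruction", "type": "string", "description": "Instruction"},
        {"id": "input", "type": "string", "description": "Context"},
        {"id": "output", "type": "string", "description": "Output"},
    ],
}

print(check_entry(entry))
```

A check like this could run before merging new dataset entries, though whether the repository enforces any such rule is not stated in the commit.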
@@ -0,0 +1,17 @@
+type: huggingface
+id: fujiki/llm-japanese-dataset_snow
+url: https://huggingface.co/datasets/fujiki/llm-japanese-dataset_snow
+converted_size: 5.54MB
+license: CC-BY-4.0
+lang: ja
+description: This dataset is a subset of izumi-lab/llm-japanese-dataset only including snow tasks.
+structure:
+  - id: instruction
+    type: string
+    description: Instruction
+  - id: input
+    type: string
+    description: Context
+  - id: output
+    type: string
+    description: Output
@@ -0,0 +1,17 @@
+type: huggingface
+id: if001/oasst1_ja_ppl
+url: https://huggingface.co/datasets/if001/oasst1_ja_ppl
+converted_size: 27.2MB
+license: Apache-2.0
+lang: ja
+description: A fork of kunishou/oasst1-89k-ja. Records are consolidated into instruction, input, and output fields, and each is annotated with a perplexity score computed with kenllm.
+structure:
+  - id: instruction
+    type: string
+    description: Instruction
+  - id: input
+    type: string
+    description: Context
+  - id: output
+    type: string
+    description: Output
@@ -0,0 +1,7 @@
+type: huggingface
+id: shunk031/livedoor-news-corpus
+url: https://huggingface.co/datasets/shunk031/livedoor-news-corpus
+converted_size: Unknown
+license: CC-BY-ND-4.0
+lang: ja
+description: This corpus was created by collecting the news articles from "livedoor News", operated by NHN Japan Corporation, to which the Creative Commons license below applies, and removing HTML tags as far as possible.
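The `converted_size` field is free text: three of the files give a decimal number with an `MB` suffix, and this one gives `Unknown`. A consumer that wants to sum or sort sizes has to handle both. The sketch below is one way to do that, not something the repository is known to do. `parse_converted_size` is a hypothetical helper, the `KB` and `GB` units are assumed for completeness (only `MB` appears in this commit), and treating 1 MB as 10^6 bytes is also an assumption.

```python
import re
from typing import Optional

# Decimal (SI) multipliers; whether the source means 10**6 or 2**20 is not stated.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9}
_SIZE_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*(KB|MB|GB)$")


def parse_converted_size(value: str) -> Optional[int]:
    """Return the size in bytes, or None for values such as 'Unknown'."""
    match = _SIZE_RE.match(value.strip())
    if match is None:
        return None
    number, unit = match.groups()
    return round(float(number) * _UNITS[unit])


# The four values that appear in this commit.
for raw in ("13.8MB", "5.54MB", "27.2MB", "Unknown"):
    print(raw, "->", parse_converted_size(raw))
```

Returning `None` instead of raising keeps entries with an unknown size usable, at the cost of callers having to check for it.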
