text2slide

Environment setup

Tested with Python 3.8.6.

$ git clone https://github.com/eeic-ai-01/text2slide --recursive
$ cd text2slide
$ pip install -r requirements.txt
$ python -m spacy download en

In a pyenv environment, fasttext alone must be installed manually:

$ git clone https://github.com/facebookresearch/fastText.git
$ cd fastText
$ pip install .

Installing pandoc

https://pandoc.org/installing.html
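
After installing pandoc, it can help to confirm that the executable is reachable on PATH. The following is a minimal sketch, not part of text2slide; the helper name `tool_version` is hypothetical, and it only assumes that the tool accepts `--version`, which pandoc does.

```python
import shutil
import subprocess


def tool_version(name):
    """Return the first line of `<name> --version`, or None if not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    version = tool_version("pandoc")
    if version is None:
        print("pandoc was not found on PATH")
    else:
        print(f"found: {version}")
```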

Setting up models and data

Loading the Japanese pretrained BERT model

Download the BASE WWM version (1.6G; released 19/11/15) from the "BERT日本語Pretrainedモデル" page of KUROHASHI-KAWAHARA LAB, extract it, and place its contents in summarization/extractive/SlideMan/model/Japanese/. Enter the absolute path to that directory on line 46 of summarization/extractive/SlideMan/src/LangFactory.py. Enter the absolute path to vocab.txt in summarization/extractive/SlideMan/config.ini.

Installing Juman

Install 2.0.0-rc3 as described on the Juman++ V2 development version page. Enter the absolute paths to jumanpp, jumandic.jppmdl, and jumandic.config in summarization/extractive/SlideMan/config.ini.
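
Since config.ini holds several absolute paths, a typo in one of them is easy to miss. The sketch below is one way to check them and is not part of text2slide. It assumes config.ini is a standard INI file with section headers, and it makes no assumption about section or key names: it inspects every value that looks like an absolute path. The function name `find_missing_paths` is hypothetical.

```python
import configparser
import os


def find_missing_paths(config_file):
    """Return (section, key, value) for each absolute-path value that does not exist."""
    parser = configparser.ConfigParser()
    # read() silently skips files it cannot open, so check what it returns.
    if not parser.read(config_file, encoding="utf-8"):
        raise FileNotFoundError(config_file)
    missing = []
    for section in parser.sections():
        # raw=True avoids interpolation errors if a value contains '%'.
        for key, value in parser.items(section, raw=True):
            value = value.strip().strip("\"'")
            if os.path.isabs(value) and not os.path.exists(value):
                missing.append((section, key, value))
    return missing


if __name__ == "__main__":
    config = "summarization/extractive/SlideMan/config.ini"
    if not os.path.isfile(config):
        print(f"{config} not found; run this from the repository root")
    else:
        for section, key, value in find_missing_paths(config):
            print(f"[{section}] {key}: not found: {value}")
```

Values that are not absolute paths are ignored, so other settings in the file do not produce false alarms.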

Loading the model trained on wikihow data

Download cp_step_9000.pt and opt_step_9000.pt from here and place them in summarization/extractive/SlideMan/checkpoint/jp/. Enter their absolute paths on lines 50 and 51 of summarization/extractive/SlideMan/src/LangFactory.py.

Setting up the Japanese Wikipedia corpus data

Download wikipedia_wakati.json from here and place it under scraping/text/.
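
The downloads above land in several different directories, so a quick existence check before the first run can save a failed start. This is a minimal sketch, not part of text2slide: the paths are the ones named in this README, the helper name `missing_assets` is hypothetical, and the check only tests that each path exists, not that its contents are valid.

```python
from pathlib import Path

# Locations named in this README, relative to the repository root.
REQUIRED = [
    "summarization/extractive/SlideMan/model/Japanese",
    "summarization/extractive/SlideMan/checkpoint/jp/cp_step_9000.pt",
    "summarization/extractive/SlideMan/checkpoint/jp/opt_step_9000.pt",
    "scraping/text/wikipedia_wakati.json",
]


def missing_assets(repo_root="."):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in REQUIRED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        print("all model and corpus files are in place")
```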

Registering the DeepL API key

Because some of the summarization uses English-language models, you need to register your DeepL API key in .env.
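
To confirm that the key was actually saved, the .env file can be read back with a few lines of standard-library Python. This is a rough sketch and not how text2slide itself loads the file. The variable name `DEEPL_API_KEY` is an assumption made for illustration; this README does not state the real name, so substitute whatever name the project expects. The helper names `read_env_file` and `require_key` are likewise hypothetical.

```python
import os


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("\"'")
    return values


def require_key(values, name):
    """Return values[name], raising KeyError if it is absent or empty."""
    value = values.get(name, "")
    if not value:
        raise KeyError(f"{name} is missing or empty in .env")
    return value


if __name__ == "__main__":
    if not os.path.isfile(".env"):
        print(".env not found in the current directory")
    else:
        # DEEPL_API_KEY is a placeholder name, not confirmed by the project.
        require_key(read_env_file(".env"), "DEEPL_API_KEY")
        print("DeepL API key is set")
```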

Example run

$ python text2slide.py --input example/test.in
