@@ -30,6 +30,7 @@ The Scribe-Data CLI supports the following commands:
30302. ``get`` (alias: ``g``)
31313. ``total`` (alias: ``t``)
32324. ``convert`` (alias: ``c``)
33+ 5. ``download`` (alias: ``d``)
3334
3435Note: For all language arguments, if the language name is more than one word, pass the argument value in double quotes.
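
The quoting rule follows from how the shell splits arguments on whitespace. The sketch below only illustrates that behavior: ``count_args`` is a hypothetical helper, and ``two words`` is a placeholder rather than a real language name.

.. code-block:: bash

    # Hypothetical helper that only reports how many arguments it received.
    count_args() { echo "$#"; }

    count_args -lang "two words"   # quoted: the name stays one argument (prints 2)
    count_args -lang two words     # unquoted: the name is split in two (prints 3)

Without the quotes, the CLI would receive the second word as a separate, unexpected argument.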
3536
@@ -159,6 +160,55 @@ Examples:
159160 Getting and formatting English verbs
160161 Data updated: 100%| ████████████████████████| 1/1 [00:XY< 00:00, XY.Zs/process]
161162
163+ To retrieve data from a Wikidata lexeme dump, add the ``-wdp`` flag to the ``get`` command:
164+
165+ .. code-block:: bash
166+
167+ $ scribe-data get -lang german -dt nouns -wdp
168+
169+ **Example Output:**
170+
171+ .. code-block:: text
172+
173+ Languages to process: German
174+ Data types to process: ['nouns']
175+ Existing dump files found:
176+ - scribe_data_wikidata_dumps_export/latest-lexemes.json.bz2
177+ ? Do you want to: (Use arrow keys)
178+ » Delete existing dumps
179+ Skip download
180+ Use existing latest dump
181+ Download new version
182+
183+ **Instructions:**
184+
185+ 1. Use the arrow keys to navigate through the options.
186+ 2. Press **Enter** to confirm your selection.
187+
188+ **Options Explained:**
189+
190+ - **Delete existing dumps**: Removes the existing dump files before downloading new ones.
191+ - **Skip download**: Skips the download process.
192+ - **Use existing latest dump**: Processes the existing dump file without downloading a new version.
193+ - **Download new version**: Downloads the latest version of the lexeme dump.
194+
195+ **Note:** Ensure you have sufficient disk space and a stable internet connection if downloading a new version.
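
The disk space check can be done before choosing to download. This is a minimal sketch, not part of Scribe-Data: ``has_room_for_dump`` is a hypothetical helper, and the default threshold assumes a dump of roughly 370 MiB (the size shown in the example download output) with a 2x safety margin.

.. code-block:: bash

    # Hypothetical helper: succeed if the filesystem holding DIR has at least
    # NEEDED_KIB kibibytes available.
    has_room_for_dump() {
        dir="${1:-.}"
        # Assumption: 2 x 370 MiB expressed in KiB (2 * 370 * 1024 = 757760).
        needed_kib="${2:-757760}"
        # df -Pk prints POSIX-format output in 1024-byte blocks;
        # column 4 of the second line is the available space.
        free_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
        [ "$free_kib" -ge "$needed_kib" ]
    }

    if has_room_for_dump . ; then
        echo "Enough free space for the dump."
    else
        echo "Not enough free space; free some disk space before downloading." >&2
    fi

The actual dump size changes over time, so the threshold should be treated as a rough estimate.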
196+
197+ **If No Existing Dump Files Are Found:**
198+
199+ 1. The command displays the following message:
200+
201+ .. code-block:: text
202+
203+ No existing dump files found. Downloading new version...
204+
205+ 2. The command then downloads the latest dump file:
206+ .. code-block:: text
207+
208+ Downloading dump to scribe_data_wikidata_dumps_export\latest-lexemes.json.bz2...
209+ scribe_data_wikidata_dumps_export\latest-lexemes.json.bz2: 100%|███████████████████| 370M/370M [04:20<00:00, 1.42MiB/s]
210+ Wikidata lexeme dump download completed successfully!
211+
162212 Behavior and Output:
163213^^^^^^^^^^^^^^^^^^^^
164214
@@ -304,11 +354,36 @@ If user selects ``Configure total lexemes request``:
304354
305355 Language Data Type Total Lexemes
306356 ======================================================================
307- english nouns 30,841
308- adjectives 12,840
309-
310- basque nouns 14,498
311- adjectives 278
357+ english nouns 123,456
358+ adjectives 234,567
359+
360+ basque nouns 34,567
361+ adjectives 250
362+
363+ The command ``scribe-data total -lang english -wdp`` checks for existing lexeme dumps, then reports the total lexemes and translations for English, broken down by data type.
364+
365+ .. code-block:: text
366+
367+ $ scribe-data total -lang english -wdp
368+ Languages to process: English
369+ Data types to process: None
370+ Existing dump files found:
371+ - scribe_data_wikidata_dumps_export/latest-lexemes.json.bz2
372+ ? Do you want to: Use existing latest dump
373+ We'll use the following lexeme dump scribe_data_wikidata_dumps_export/latest-lexemes.json.bz2
374+ Processing entries: 100%|████████████████████████████████████████████████████| 1406276/1406276 [15:25<00:14, 1495.97it/s]
375+ Language Data Type Total Lexemes Total Translations
376+ ==========================================================================================
377+ english nouns 123,456 12,345
378+ adjectives 345,678 2,345
379+ adverbs 45,678 345
380+ verbs 5,678 4,567
381+ proper_nouns 6,789 5,678
382+ prepositions 789 100
383+ conjunctions 75 25
384+ pronouns 50 25
385+ personal_pronouns 25 50
386+ postpositions 1
312387
313388 Features:
314389^^^^^^^^^
@@ -327,6 +402,22 @@ The interactive mode is particularly useful for:
327402- Complex queries with multiple parameters.
328403- Viewing available options without memorizing commands.
329404
405+ Root Interactive Command
406+ ~~~~~~~~~~~~~~~~~~~~~~~~~
407+ .. code-block:: bash
408+
409+ $ scribe-data interactive
410+ Welcome to Scribe-Data v4.1.0 interactive mode!
411+ ? What would you like to do? (Use arrow keys)
412+ » Download a Wikidata lexemes dump
413+ Check for totals
414+ Get data
415+ Get translations
416+ Convert JSON
417+ Exit
418+
419+ The command ``scribe-data interactive`` starts interactive mode, where users select and run Scribe-Data operations from a menu.
420+
330421Total Command
331422~~~~~~~~~~~~~
332423
@@ -426,3 +517,42 @@ Options:
426517- ``-f, --file FILE``: The file to convert to a new type.
427518- ``-ko, --keep-original``: Whether to keep the file to be converted (default: True).
428519- ``-ot, --output-type {json,csv,tsv,sqlite}``: The output file type.
520+
521+ Download Command
522+ ~~~~~~~~~~~~~~~~
523+ Usage:
524+
525+ .. code-block:: bash
526+
527+ scribe-data download
528+
529+ Behavior and Output:
530+ ^^^^^^^^^^^^^^^^^^^^
531+
532+ - **If Existing Dump Files Are Found:**
533+
534+ 1. The command lists the dump files it found:
535+
536+ .. code-block:: text
537+
538+ Existing dump files found:
539+ - scribe_data_wikidata_dumps_export/latest-lexemes.json.bz2
540+
541+ 2. The command then prompts the user to choose an action:
542+
543+ .. code-block:: text
544+
545+ ? Do you want to: (Use arrow keys)
546+ » Delete existing dumps
547+ Skip download
548+ Use existing latest dump
549+ Download new version
550+ - **If Downloading New Version:**
551+
552+ 1. If the user chooses to download, the dump is saved to the dump directory (``scribe_data_wikidata_dumps_export`` in this example):
553+
554+ .. code-block:: text
555+
556+ Downloading dump to scribe_data_wikidata_dumps_export\latest-lexemes.json.bz2...
557+ scribe_data_wikidata_dumps_export\latest-lexemes.json.bz2: 100%|███████████████████| 370M/370M [04:20<00:00, 1.42MiB/s]
558+ Wikidata lexeme dump download completed successfully!