WIP: Formatting #148
README.md
Outdated
Before you run our pipeline, please choose a version of prompts to proceed, which can be revised in the beginning of **run_prompts.py**

```python
from Database.Prompts.prompts import V_3 as target_prompts
```

#### (Step 1) Raw output
Choose the raw file contains the text you need to process, please use the clear raw file name to indicate your experiment, this name will be used as the output file, the api env you want to use, the decription of the experiment, the prompt category, and the batch file location you want to store the batch file (this is not mandatory, but it's good to check if you create correct batch file)
Choose the raw file that contains the text you need to process. Please use clear raw file names to indicate your experiment. This name will be used as the output file, the api env you want to use, the decription of the experiment, the prompt category, and the batch file location you want to store the batch file (this is not mandatory, but it's good to check if you create correct batch file)
I don't understand this sentence:
> This name will be used as the output file, the api env you want to use, the decription of the experiment, the prompt category, and the batch file location you want to store the batch file (this is not mandatory, but it's good to check if you create correct batch file)

Is it suggesting that the experiment name, description, and category will all be the name of the output file?
Maybe adding a pseudo example (or a real example) could help.
Thanks, I will do that. Where can I edit it? In the same branch?
You can edit the same branch. I think for READMEs you can even safely edit directly on the GitHub website :D
Especially needed after undergoing a large number of changes
Please keep this PR for a while. I will check the other READMEs I edited later, thanks!
Before you run our pipeline, please choose a version of prompts to proceed, which can be revised in the beginning of **run_prompts.py**

```python
from Database.Prompts.prompts import V_3 as target_prompts
```
##### Step 1: Experiment Settings
Wow, this looks great! Thanks :D
One thing that could help the reader is to say that these are the params to pass into `run_prompts`:
##### Step 1: Experiment Settings
Here is what you need to begin an experiment run with `Database/Prompts/run_prompts.py`:
4. **Prompt Category**: Indicate the prompt category, such as "all".

5. **Batch File Location** (Optional): Specify where to store the batch file. This helps verify the batch file's creation.
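
To make the reviewer's question concrete, here is a minimal sketch showing that each setting maps to its own command-line flag rather than all of them ending up in the output file name. The flag names are taken from the `run_prompts.py --help` output quoted in this conversation; every value (`my_experiment.json`, `.env`, `baseline run`, `batches`) is a hypothetical placeholder, and the mapping from README settings to flags is an assumption based on the help text.

```python
import shlex

# Hypothetical placeholder values; only the flag names come from the --help output.
settings = {
    "--filename": "my_experiment.json",   # raw file to process (assumed to also name the output)
    "--api_env": ".env",                  # env file that contains the API keys
    "--description": "baseline run",      # free-text description of the experiment
    "--prompt_category": "all",           # one of: impact, basic, all
    "--batch_dir": "batches",             # optional: where the .jsonl batch file lands
}

# Build the argument list one flag/value pair at a time.
command = ["poetry", "run", "python3", "Database/Prompts/run_prompts.py"]
for flag, value in settings.items():
    command += [flag, value]

# shlex.join quotes values that contain spaces, so the line can be pasted into a shell.
print(shlex.join(command))
```

Keeping the settings in a dictionary makes it easy to see which value belongs to which flag before the command is run.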
Could we add something like this:
# check the args and flags
poetry run python3 Database/Prompts/run_prompts.py --help
Output:
wikimpacts-py3.11➜ Wikimpacts git:(drop-l1-missing-all-impacts) ✗ poetry run python3 Database/Prompts/run_prompts.py --help
usage: run_prompts.py [-h] [-f FILENAME] [-r RAW_DIR] [-b BATCH_DIR] [-m MODEL_NAME] [-t MAX_TOKENS] [-e API_ENV] [-d DESCRIPTION] [-p PROMPT_CATEGORY]

options:
  -h, --help            show this help message and exit
  -f FILENAME, --filename FILENAME
                        The name of the json file in the <Wikipedia articles> directory
  -r RAW_DIR, --raw_dir RAW_DIR
                        The directory containing Wikipedia json files to be run
  -b BATCH_DIR, --batch_dir BATCH_DIR
                        The directory where the batch file will land (as .jsonl)
  -m MODEL_NAME, --model_name MODEL_NAME
                        The model version applied in the experiment, like gpt-4o-mini.
  -t MAX_TOKENS, --max_tokens MAX_TOKENS
                        The max tokens of the model selected
  -e API_ENV, --api_env API_ENV
                        The env file that contains the API keys.
  -d DESCRIPTION, --description DESCRIPTION
                        The description of the experiment
  -p PROMPT_CATEGORY, --prompt_category PROMPT_CATEGORY
                        The prompt category of the experiment, can only choose from impact, basic, and all
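
For readers who want to see how such a help message comes about, the following is a minimal `argparse` sketch that would produce a usage line of the same shape. It is not the actual contents of `run_prompts.py`: the flag names and help strings are copied from the output above, while the argument types, the absence of defaults, and the use of `choices` for the prompt category are assumptions. The sample values passed to `parse_args` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser mirroring the quoted --help output (types are assumed)."""
    parser = argparse.ArgumentParser(prog="run_prompts.py")
    parser.add_argument("-f", "--filename", type=str,
                        help="The name of the json file in the <Wikipedia articles> directory")
    parser.add_argument("-r", "--raw_dir", type=str,
                        help="The directory containing Wikipedia json files to be run")
    parser.add_argument("-b", "--batch_dir", type=str,
                        help="The directory where the batch file will land (as .jsonl)")
    parser.add_argument("-m", "--model_name", type=str,
                        help="The model version applied in the experiment, like gpt-4o-mini.")
    parser.add_argument("-t", "--max_tokens", type=int,
                        help="The max tokens of the model selected")
    parser.add_argument("-e", "--api_env", type=str,
                        help="The env file that contains the API keys.")
    parser.add_argument("-d", "--description", type=str,
                        help="The description of the experiment")
    # Assumption: enforce the three allowed categories; metavar keeps the usage line unchanged.
    parser.add_argument("-p", "--prompt_category", type=str,
                        choices=["impact", "basic", "all"], metavar="PROMPT_CATEGORY",
                        help="The prompt category of the experiment, "
                             "can only choose from impact, basic, and all")
    return parser


# Parse a hypothetical invocation instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-f", "my_experiment.json", "-m", "gpt-4o-mini", "-t", "1000", "-p", "all"]
)
print(args.filename, args.model_name, args.max_tokens, args.prompt_category)
```

Flags that are not passed, such as `--batch_dir` here, come back as `None` in this sketch, which is one way to treat the batch file location as optional.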
Thanks for the suggestion, I will check them out after I fix the visualization!
@@ -28,42 +30,137 @@ pre-commit installed at .git/hooks/pre-commit
git lfs install
```

## Quickstart
## Development
As per the suggestion from @koffiworou, I've moved the dev doc section further to the top so that users can make sure they have all the basics and dependencies set up before developing.
This PR is meant to do two things:
(1) format any left-over files using the pre-commit hook (done automatically)
(2) improve the README, especially after a large number of changes were made to the pipeline