Normalization issue in the raw output #26
Comments
In Event ID 87, the article says the damage is "Moderate". |
Do we normalize "several communities" and "three towns" (Event ID 82)? |
We can handle this in #19 |
"Moderate" is a bit tricky for me... but I think "minimal" and "little" could be interpreted as zero. We can also handle these fixes in #19 |
More cases:
|
For these, I revised all of them to NULL in the gold data, because we don't assume a number there. Is that workable? |
I think that's okay from my end :) we have a big list of expressions that evaluate to None, I can add these all in #19 later on |
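A minimal sketch of the idea discussed above, assuming two small lookup sets: expressions read as zero ("minimal", "little") and expressions that carry no usable number and become None ("moderate", "several communities"). The set contents and the function name `normalize_vague_amount` are illustrative assumptions, not the project's actual list or API.

```python
from typing import Optional

# Assumed, illustrative sets; the real list of expressions is maintained in the project.
ZERO_EXPRESSIONS = {"minimal", "little"}
NULL_EXPRESSIONS = {"moderate", "several communities"}


def normalize_vague_amount(text: str) -> Optional[int]:
    """Map a vague damage expression to 0 or None (hypothetical helper)."""
    cleaned = text.strip().lower()
    if cleaned in ZERO_EXPRESSIONS:
        return 0
    if cleaned in NULL_EXPRESSIONS:
        return None
    # Unknown expressions are surfaced instead of being silently guessed.
    raise ValueError(f"no normalization rule for {text!r}")
```

Raising on unknown input is a design choice in this sketch: it makes new vague expressions visible so they can be added to one of the sets deliberately.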
Hi @i-be-snek, when the raw value for this column is "No", the output should stay "No" instead of being converted to 0. Could you check? Example for Event_ID 1.0:
raw: "Total_Economic_Damage_Inflation_Adjusted": "No",
output: "Total_Damage_Inflation_Adjusted": 0
The same happens for this column:
raw: "Total_Insured_Damage_Inflation_Adjusted": "No",
output: "Total_Insured_Damage_Inflation_Adjusted": 0
I found 53 events with this issue for Total_Economic_Damage_Inflation_Adjusted and 6 for Total_Insured_Damage_Inflation_Adjusted. |
During the evaluation, same thing for Yes, etc. Although I agree, that should probably be normalized properly in the gold data to "True" and "False". The result below:
Is this what you get after running |
@i-be-snek Yes, I ran parse_events.py on the file below:
https://github.com/VUB-HYDR/Wikimpacts/blob/prompt_GPT4o/Database/raw/ESSD_2024/Wiki_dev_set_GPT4o_20240715_full_text_feeding_74single_events_main_col_name_updated_ni_0719.json
The output file is:
https://github.com/VUB-HYDR/Wikimpacts/blob/prompt_GPT4o/Database/output/ESSD_2024/test_dev/Wiki_dev_set_GPT4o_20240718_full_text_feeding_74single_events_main_col_name_matched_fixed.parquet
I searched for Total_Damage_Inflation_Adjusted in the output and found this: "Total_Damage_Inflation_Adjusted":0, |
This will have to have a low priority now since it won't affect the flow. We can add it to your list of "improvements" when we have the core features out. |
I think at this point we can agree that 0 is evaluated as False in Python (and vice versa) and that this has not been causing issues. Closing. |
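For reference, a small sketch of why 0 and False compare equal in Python, and how a stricter check could tell them apart if that ever matters. The helper names `loosely_equal` and `strictly_equal` are hypothetical and only illustrate the two comparison styles; they are not taken from the evaluation code.

```python
def loosely_equal(a: object, b: object) -> bool:
    """Plain equality: bool is a subclass of int, so 0 == False and 1 == True."""
    return a == b


def strictly_equal(a: object, b: object) -> bool:
    """Equality that also requires the exact same type (hypothetical stricter check)."""
    return type(a) is type(b) and a == b


# A normalized 0 matches a gold False under plain equality...
print(loosely_equal(0, False))
# ...but not when the types must match as well.
print(strictly_equal(0, False))
```

With plain equality the first call prints True and the second prints False, which is consistent with the observation that the 0 output has not affected the evaluation.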
From the GPT4o experiment, in Event ID 60.0, total_damage is not normalized well:
the raw value is "Rs14,000 crore (US$2.18 billion)", and the normalized output is NULL.
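One possible way to handle a value like this, sketched under the assumption that the US dollar figure in parentheses is the one to keep: pull out the "US$" amount with a regular expression and apply the scale word. The function name `parse_usd_amount`, the pattern, and the scale table are illustrative assumptions, not the project's normalizer.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumed scale words; extend as needed.
SCALES = {"thousand": 10**3, "million": 10**6, "billion": 10**9, "trillion": 10**12}

USD_PATTERN = re.compile(
    r"US\$\s*(?P<number>\d[\d,]*(?:\.\d+)?)\s*"
    r"(?P<scale>thousand|million|billion|trillion)?",
    re.IGNORECASE,
)


def parse_usd_amount(text: str) -> Optional[Decimal]:
    """Return the first 'US$' amount in text as a Decimal, or None if absent."""
    match = USD_PATTERN.search(text)
    if match is None:
        return None
    # Decimal avoids binary floating-point rounding on values like 2.18.
    number = Decimal(match.group("number").replace(",", ""))
    scale = match.group("scale")
    factor = SCALES[scale.lower()] if scale else 1
    return number * factor


print(parse_usd_amount("Rs14,000 crore (US$2.18 billion)"))
```

A string with no "US$" amount, such as "Rs14,000 crore" alone, returns None in this sketch; converting rupees or crore would need its own rule.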