-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP :update the annotation sql file #196
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR needs significant changes to pass. Please implement the changes as requested and re-tag me to review it again.
I don't really think we agreed on how exactly the new annotations would look like, let alone the annotation table. I'm a bit surprised since I'm on this project for my knowledge in data engineering yet I don't really get consulted much about major design choices that are related to databases. My job and expertise have been reduced to reviewing code (which I'm happy to do, but it's not the only thing I should be doing in this labor division). In my professional opinion, this is NOT a good recipe for future annotations based on all the issues we faced in V1, and it has NOT been ironed out properly.
import sqlite3 | ||
|
||
# Connect to SQLite database (or create it if it doesn't exist) | ||
conn = sqlite3.connect("/home/nl/Wikimpacts/Annotation_V2/impactDB_gold_V2.db") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not okay to have these paths hard-coded. You need to use the relative path where the repository is the root directory (so starting from Wikimpacts/
).
conn = sqlite3.connect("/home/nl/Wikimpacts/Annotation_V2/impactDB_gold_V2.db") | ||
cursor = conn.cursor() | ||
|
||
# Create tables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is redundant.
""" | ||
) | ||
|
||
# Commit changes and close the connection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The choice of placing this database in a new folder called "Annotation_V2" is unhelpful and will clutter the repository. Gold annotations have always landed in Database/gold
, so there is no reason for them to have a new directory. You need to reconsider where you want to place this data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When programming with a large group of people, it's absolutely vital to try to avoid repetitive code as much as possible. Here, you are basically making a new "copy" of the database but without any of the neat validation rules that the script above implements.
Also, all database related schemas have always been placed in Database/schema
so there is no justification for a new directory.
These changes need to be implemented in Database/schema
in a fashion similar to poetry run python3 Database/schema/create_db.py -db <DB-NAME>
and with proper validation rules.
We now have two files called create_db.py
. This kind of repetition is confusing and dangerous down the line.
Event_ID TEXT, | ||
Country TEXT NOT NULL, | ||
Country_GID TEXT, | ||
Administrative_Areas TEXT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to be an array or TEXT?
@i-be-snek , sure, thanks for your comments on this pr, I will share the tables with the group, and get a confirmed message for this, definitely, we can't decide the structure just by our two, I will double check the tables and email the group, thanks! |
@koffiworou, do you have further comments on this table, and I can update and share to the group for further check, thanks! |
@i-be-snek , could you have a look of the db file, and if the tables are correct, we can ask Camila start working on this branch, thanks!