Update normalization of ranges to match new guidelines #87

i-be-snek · 2024-08-28T14:17:00Z

i-be-snek · 2024-08-29T11:46:51Z

Update:

The failing tests here cover the range formulas we have not fully agreed on yet. Once those are settled, this should be ready for review.

i-be-snek · 2024-08-29T18:19:09Z

@koffiworou @liniiiiii

I think this PR should be almost ready. There are two small things I would like to bring to your attention:

we need to upload the final version of the annotation guidelines for range normalization, but I can see you are still working on other (unrelated) sections of it. Should we upload a version now with this information and then upload the final version later when it's complete? It would be a PDF

we have two interesting test cases that fail (in essence they both mean the same thing). Do you think these make sense? They could be an edge case that we could handle with an exception, or we can leave it as is if you don't think it's a problem.

less than 1 is normalized as (1, 1)... should it not be (0, 0) or at least (0, 1)?
no more than 1... same meaning

That's because the code uses max(1, ...), which clamps the result to a minimum of 1 whenever the computed number would be smaller than 1.

                    if "under" in k:
                        inc = 0 if "inclusive" in k else 1
                        return (
                            max(1, n - scale - inc) * lower_mod,
                            max(1, n - inc) * upper_mod,
                        )

but if you instead had max(0, n - scale - inc), then we get results that may be easier to interpret (at least I feel like they sound more natural)

# if we use max(1, ...)
###############
For N = 1, scale = 1
 over, inclusive (at least) 1 --> [1, 2]
 over, exclusive (more than) 1 --> [2, 3]
 under, inclusive (at most) 1 --> [1, 1]
 under, exclusive (less than) 1 --> [1, 1]
 about 1 --> [1, 2]


# if we use max(0, ...)
###############
For N = 1, scale = 1
 over, inclusive (at least) 1 --> [1, 2]
 over, exclusive (more than) 1 --> [2, 3]
 under, inclusive (at most) 1 --> [0, 1]
 under, exclusive (less than) 1 --> [0, 0]
 about 1 --> [0, 2]
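
To make the comparison above easy to reproduce, here is a minimal sketch of the "under" branch only. The function name `normalize_under` and the `floor` parameter are hypothetical and are not part of the code in this PR; `lower_mod` and `upper_mod` are assumed to be 1 and are left out.

```python
def normalize_under(n, scale, inclusive, floor=0):
    """Sketch of the "under" branch with a configurable lower clamp.

    `floor` stands in for the first argument of max() in the snippet above:
    floor=1 mirrors max(1, ...), floor=0 mirrors max(0, ...).
    """
    # Exclusive phrases ("less than") shift both bounds down by one.
    inc = 0 if inclusive else 1
    lower = max(floor, n - scale - inc)
    upper = max(floor, n - inc)
    return (lower, upper)


for floor in (1, 0):
    at_most = normalize_under(1, 1, inclusive=True, floor=floor)
    less_than = normalize_under(1, 1, inclusive=False, floor=floor)
    print(f"floor={floor}: at most 1 -> {at_most}, less than 1 -> {less_than}")
```

With `floor=1` both phrases collapse to (1, 1); with `floor=0` they become (0, 1) and (0, 0), matching the two listings above.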

Let me know if these issues are resolved with a comment + check in the checkbox :)

liniiiiii · 2024-08-30T08:39:53Z

Hi @i-be-snek, for the first point, I think we need to upload the final version. It's mostly done: I will double-check the rule for the NLP paper today, save it as a PDF, and upload it. The other sections I think we can freeze. For the second point, I don't think we will have this extreme case in our database, because the numerical data is about people or money. If there are no deaths, the text will say 0 or none rather than "less than 1 person dead", and for monetary damage it would not say "less than 1 euro damage" either. @koffiworou, what do you think about this case?

i-be-snek · 2024-08-30T08:45:18Z

Thanks. That's a good case you are making but what about extreme climate events where there is one potential death or injury but it's not confirmed? In that case, I see how the range would be (0, 1).

Of course I understand this is not a standard case, but I also feel like it makes more sense, logically, to say that "at most 1" means (0, 1) and not (1, 1).

i-be-snek · 2024-09-02T16:14:49Z

@koffiworou any comments on the exceptional cases I listed here? Otherwise we are ready to squash+merge.

koffiworou · 2024-09-03T16:00:26Z

@i-be-snek, sorry, I was not following the comments in real time. It is true that your example of "one potential death" suggests a narrative from someone who is not sure whether there was one death, zero, or more. If they were sure that there was at least one death, they would have used the term "at least". So it makes sense to convert "one potential death" to the interval [0, 1]. Again, this is one point of view :)

🔨 Move some phrases from zero-type to unknown-type

0ebd3b8

i-be-snek linked an issue Aug 28, 2024 that may be closed by this pull request

Small changes to the annotation rules for numbers (affecting the code) #83

Closed

3 tasks

i-be-snek changed the title ~~🔨 Move some phrases from zero-type to unknown-type~~ WIP: Update normalization of ranges to match new guidelines Aug 28, 2024

i-be-snek added 2 commits August 28, 2024 21:50

👔 Fix complex ranges due to guideline updates

f0513e3

✅ Update expected inputs in tests for complex ranges

9c611a5

i-be-snek added 3 commits August 29, 2024 13:48

✅ Update more test cases for complex ranges

02e6860

🐛 Fix incorrect range normalization ("at least")

f5c1ecb

✅ Update test cases for complex ranges

d8b24e0

👔 Set min limit to 0 (instead of 1)

0cd6a25

i-be-snek self-assigned this Aug 30, 2024

i-be-snek changed the title ~~WIP: Update normalization of ranges to match new guidelines~~ Update normalization of ranges to match new guidelines Sep 5, 2024

i-be-snek merged commit 880f27f into main Sep 5, 2024
1 check passed

i-be-snek deleted the 83-guideline-update-to-number-ranges branch October 28, 2024 15:14
