- Fixed the phrase 'In our we filtering we required all scores to be below the threshold…' by removing the extra 'we'. This correction clarifies the sentence and ensures proper grammar.
- Fixed the phrase 'for each filters as well as the captions from Florence-2' by changing 'filters' to 'filter'. 'Each' should be followed by a singular noun, which improves the accuracy of the text.
-
In our we filtering we required all scores to be below the threshold, in this case using the aesthetic score from the first frame only would be a more effective strategy.
+
In our filtering we required all scores to be below the threshold, in this case using the aesthetic score from the first frame only would be a more effective strategy.
If we review [`finetrainers/crush-smol`](https://huggingface.co/datasets/finetrainers/crush-smol) we can notice that many of the objects being crushed are round or rectangular and colorful which is similar to our findings in the example frames. Aesthetic scores can be useful yet have a bias that will potentially filter out good data when used with extreme thresholds like > 5.5. It may be more effective as a filter for bad content than good with a minimum threshold of around 4.25 - 4.5.
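
The two strategies discussed in this hunk (checking every sampled frame versus only the first frame, against a minimum aesthetic score) can be sketched roughly as follows. This is an illustrative sketch only, not the pipeline's actual code: the function name `passes_aesthetic_filter`, the example clips, and their scores are hypothetical, and the default of 4.25 is simply the lower end of the range suggested above.

```python
from typing import Sequence


def passes_aesthetic_filter(
    frame_scores: Sequence[float],
    min_score: float = 4.25,  # assumed default, taken from the 4.25 - 4.5 range in the text
    first_frame_only: bool = True,
) -> bool:
    """Return True if a clip should be kept under a minimum aesthetic score.

    Hypothetical helper: `frame_scores` holds one aesthetic score per sampled frame.
    """
    if not frame_scores:
        # No scores available, so the clip cannot be judged; drop it.
        return False
    if first_frame_only:
        # Only the first sampled frame decides.
        return frame_scores[0] >= min_score
    # Stricter variant: every sampled frame must reach the minimum.
    return all(score >= min_score for score in frame_scores)


# Made-up scores for three hypothetical clips.
clips = {
    "clip_a": [4.8, 4.1, 3.9],
    "clip_b": [5.6, 5.7, 5.5],
    "clip_c": [4.0, 4.6, 4.9],
}

kept_first = [name for name, s in clips.items() if passes_aesthetic_filter(s)]
kept_all = [
    name for name, s in clips.items()
    if passes_aesthetic_filter(s, first_frame_only=False)
]
print(kept_first)  # ['clip_a', 'clip_b']
print(kept_all)    # ['clip_b']
```

With these made-up numbers, the all-frames rule drops `clip_a` because of its later frames, while the first-frame rule keeps it. Which behavior is preferable depends on how much the aesthetic score drifts over the course of a clip.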
### OCR/Caption
-
Here we provide some visual examples for each filters as well as the captions from Florence-2.
+
Here we provide some visual examples for each filter as well as the captions from Florence-2.