Hey!
I have a large dataset of mixed-language entries (100k+) on which I want to run cld3's language detection to find non-English snippets. However, the R session aborts with a fatal error as soon as it reaches certain entries. I isolated the problem: as soon as the call hits an empty entry (""), it fails and takes the whole session down with it. cld2::detect_language_mixed() and cld3::detect_language() do not seem to have that issue, so I assume it would be an easy fix to catch these entries and return NA. It took me a while to figure this out, so handling it in the next update might save others quite a bit of heartache. I'm running the latest cld3 release from CRAN (1.4.1).
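In the meantime, a guard along these lines might work as a workaround. This is only a minimal sketch: `safe_detect` and `fake_detect` are hypothetical names that are not part of cld3, and the commented-out usage assumes that `cld3::detect_language_mixed()` is the call that crashes on empty input.

```r
# Hypothetical helper (not part of cld3): return NA for entries that are
# NA or empty ("") instead of passing them on to the detector.
safe_detect <- function(texts, detect_fun) {
  lapply(texts, function(txt) {
    if (is.na(txt) || !nzchar(txt)) {
      return(NA)
    }
    detect_fun(txt)
  })
}

# Stand-in detector so the sketch runs without cld3 installed.
fake_detect <- function(txt) toupper(txt)
res <- safe_detect(c("hello", "", NA), fake_detect)

# With cld3 installed (assumption: this is the affected function):
# res <- safe_detect(my_texts, cld3::detect_language_mixed)
```

The empty and missing entries never reach the detector, so the result keeps one element per input and the problematic rows simply come back as NA.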
Also, thanks for the great package! It's really helpful, since it seems to handle multi-language entries better than cld2.