Doesn't return index of keywords found in text #13

Open
issamemari opened this issue Nov 10, 2021 · 2 comments

Comments

@issamemari

I've noticed that the implementation doesn't return the index at which each keyword was found in the text. This forces the user to search for the keyword again to find its index, even though the Aho-Corasick algorithm should be able to provide this information at no extra cost.

I've made several modifications to the implementation in my fork, https://github.com/issamemari/ahocorasick, one of which makes the algorithm return the index. I'm happy to submit a PR that includes only the changes related to this.
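To illustrate why the index comes at no extra cost: when the automaton reports a keyword at text position i, the keyword's start is simply i minus its length plus one. Below is a minimal, unoptimized sketch of that idea. It is not the code of this repository or of the fork; the names `Matcher`, `Match`, `NewMatcher`, and `FindAll` are hypothetical, and the map-based trie was chosen for brevity rather than speed.

```go
package main

// Match records which pattern matched and where it starts.
// (Hypothetical type, not part of the library discussed here.)
type Match struct {
	Pattern int // index into the patterns slice
	Start   int // byte offset of the first byte of the match
}

type node struct {
	next map[byte]int // trie edges
	fail int          // failure link (longest proper suffix that is a trie prefix)
	out  []int        // patterns ending here, including those reached via failure links
}

// Matcher is a hypothetical, map-based Aho-Corasick automaton.
type Matcher struct {
	nodes    []node
	patterns []string
}

// NewMatcher builds the trie and then the failure links in BFS order.
func NewMatcher(patterns []string) *Matcher {
	m := &Matcher{nodes: []node{{next: map[byte]int{}}}, patterns: patterns}
	for i, p := range patterns {
		if p == "" {
			continue // empty patterns are skipped in this sketch
		}
		s := 0
		for j := 0; j < len(p); j++ {
			t, ok := m.nodes[s].next[p[j]]
			if !ok {
				t = len(m.nodes)
				m.nodes = append(m.nodes, node{next: map[byte]int{}})
				m.nodes[s].next[p[j]] = t
			}
			s = t
		}
		m.nodes[s].out = append(m.nodes[s].out, i)
	}

	// Depth-1 states keep fail = 0; deeper states are resolved level by level.
	var queue []int
	for _, t := range m.nodes[0].next {
		queue = append(queue, t)
	}
	for len(queue) > 0 {
		s := queue[0]
		queue = queue[1:]
		for c, t := range m.nodes[s].next {
			f := m.nodes[s].fail
			for {
				if u, ok := m.nodes[f].next[c]; ok {
					m.nodes[t].fail = u
					break
				}
				if f == 0 {
					break // no suffix matches; fail stays at the root
				}
				f = m.nodes[f].fail
			}
			// Inherit the outputs of the failure target so suffix matches are reported too.
			m.nodes[t].out = append(m.nodes[t].out, m.nodes[m.nodes[t].fail].out...)
			queue = append(queue, t)
		}
	}
	return m
}

// FindAll scans text once and returns every match with its start offset.
func (m *Matcher) FindAll(text string) []Match {
	var matches []Match
	s := 0
	for i := 0; i < len(text); i++ {
		c := text[i]
		for {
			if t, ok := m.nodes[s].next[c]; ok {
				s = t
				break
			}
			if s == 0 {
				break
			}
			s = m.nodes[s].fail
		}
		for _, p := range m.nodes[s].out {
			// The match ends at i, so it starts len(pattern)-1 bytes earlier.
			matches = append(matches, Match{Pattern: p, Start: i - len(m.patterns[p]) + 1})
		}
	}
	return matches
}
```

With the patterns "he", "she", "his", "hers" and the text "ushers", `FindAll` reports "she" at offset 1, "he" at offset 2, and "hers" at offset 2, all from a single pass over the text.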

@dbolkensteyn

Agreed that's a very useful feature to have.

I've just tried out your fork, @issamemari, and unfortunately it was 10x slower than this one on my test data (330ms vs 30ms). The baseline solution of looping over strings.Index() came in at 180ms.
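For reference, a baseline of this kind might look roughly like the sketch below. The exact benchmark code is not shown in this thread, so the function name `indexAll`, the `Hit` type, and the choice to report every (including overlapping) occurrence are assumptions made for illustration.

```go
package main

import "strings"

// Hit is a hypothetical result type: a keyword and the byte offset where it starts.
type Hit struct {
	Keyword string
	Start   int
}

// indexAll finds every occurrence of every keyword by calling strings.Index
// repeatedly. The text is rescanned once per keyword, so the cost grows with
// the number of keywords, unlike a single Aho-Corasick pass.
func indexAll(text string, keywords []string) []Hit {
	var hits []Hit
	for _, kw := range keywords {
		if kw == "" {
			continue // skip empty keywords to avoid an endless loop
		}
		from := 0
		for {
			i := strings.Index(text[from:], kw)
			if i < 0 {
				break
			}
			hits = append(hits, Hit{Keyword: kw, Start: from + i})
			from += i + 1 // advance one byte so overlapping occurrences are found
		}
	}
	return hits
}
```

Results come back grouped by keyword rather than ordered by position in the text, which is another difference from a single-pass automaton.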

@mrclmr

mrclmr commented Nov 13, 2024

> Agreed that's a very useful feature to have.
>
> I've just tried out your fork, @issamemari, and unfortunately it was 10x slower than this one on my test data (330ms vs 30ms). The baseline solution of looping over strings.Index() came in at 180ms.

Hey @dbolkensteyn

I created a fork and added the index to the match. Hopefully it is not slower than the original implementation.

https://github.com/meyermarcel/ahocorasick
