@@ -21,14 +21,16 @@ composer require toflar/state-set-index
``` php
namespace App;
- use Toflar\StateSetIndex\Alphabet\InMemoryAlphabet;
+ use Toflar\StateSetIndex\Alphabet\Utf8Alphabet;
+ use Toflar\StateSetIndex\DataStore\InMemoryDataStore;
use Toflar\StateSetIndex\StateSet\InMemoryStateSet;
use Toflar\StateSetIndex\StateSetIndex;
$stateSetIndex = new StateSetIndex(
new Config(6, 4),
- new InMemoryAlphabet(),
- new InMemoryStateSet()
+ new Utf8Alphabet(),
+ new InMemoryStateSet(),
+ new InMemoryDataStore()
);
$stateSetIndex->index(['Mueller', 'Müller', 'Muentner', 'Muster', 'Mustermann']);
@@ -44,15 +46,20 @@ you want to index and or search.
## Customization
This library ships with the algorithm readily prepared for you to use. The main customization areas will be
- the alphabet (both the way it maps characters to labels) as well as the state set storage, if you want to make the index
+ the alphabet (the way it maps characters to labels) and the state set storage, if you want to make the index
persistent. Hence, there are two interfaces that allow you to implement your own logic:
* The `AlphabetInterface` is very straight-forward. It only consists of a `map(string $char, int $alphabetSize)` method
which the library needs to map characters to an internal label. Whether you load/store the alphabet in some
- database is up to you. The library ships with an `InMemoryAlphabet` for reference and simple use cases.
- * The `StateSetInterface` is more complex but is essentially responsible to load and store information about the
- state set of your index. Again, whether you load/store the state set in some
- database is up to you. The library ships with an `InMemoryStateSet` for reference and simple use cases.
+ database is up to you. The library ships with an `InMemoryAlphabet` for reference and simple use cases. You don't
+ even need to store an alphabet: the UTF-8 code points already provide one, which is what `Utf8Alphabet` is for.
+ If you don't want to customize the labels, use `Utf8Alphabet`.
+ * The `StateSetInterface` is responsible for loading and storing information about the state set of your index. Again,
+ how you load/store the state set (for example, in a database) is up to you. The library ships with an `InMemoryStateSet`
+ for reference, simple use cases, and tests.
+ * The `DataStoreInterface` is responsible for storing the strings you index alongside their assigned states. If you
+ want to handle storage entirely yourself, you can use the `NullDataStore` and rely only on the
+ assignments you get as a return value from calling `$stateSetIndex->index()`.
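
As a rough illustration of the alphabet customization described above, the following sketch maps characters to labels case-insensitively. It is not part of the library: `CaseInsensitiveAlphabet` is a hypothetical class, the interface is redeclared locally as a stand-in so the snippet runs on its own, and both the `int` return type and the label range `1..$alphabetSize` are assumptions. It requires the `mbstring` extension.

```php
<?php

declare(strict_types=1);

// Stand-in for the library's AlphabetInterface so this sketch is self-contained.
// Assumption: map() returns an int label; the text only documents the parameters.
interface AlphabetInterface
{
    public function map(string $char, int $alphabetSize): int;
}

// Hypothetical alphabet: lowercases the character, then folds its Unicode
// code point into the range 1..$alphabetSize (the range is an assumption).
final class CaseInsensitiveAlphabet implements AlphabetInterface
{
    public function map(string $char, int $alphabetSize): int
    {
        if ($char === '' || $alphabetSize < 1) {
            throw new InvalidArgumentException('Expected a non-empty character and a positive alphabet size.');
        }

        // mb_ord() returns the code point of the first character of the string.
        $codePoint = mb_ord(mb_strtolower($char, 'UTF-8'), 'UTF-8');

        if ($codePoint === false) {
            throw new InvalidArgumentException('Invalid UTF-8 input.');
        }

        return ($codePoint % $alphabetSize) + 1;
    }
}

$alphabet = new CaseInsensitiveAlphabet();

// "M" and "m" end up with the same label, so case differences no longer matter.
var_dump($alphabet->map('M', 4) === $alphabet->map('m', 4)); // bool(true)
```

With the real library you would implement its `AlphabetInterface` instead of the stand-in and pass an instance as the alphabet argument of `StateSetIndex`, in place of `Utf8Alphabet`.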
You can not only ask for the final matching results using `$stateSetIndex->findMatchingStates('Mustre', 2)` which is
already filtered using a multibyte implementation of the Levenshtein algorithm, but you can also access intermediary