Commit 2e96e9a

AntonEliatra, kolchfa-aws, and natebower authored
Adding search analyzer mapping parameters docs (#9576)
* adding search_analyzer mapping parameters docs
  Signed-off-by: Anton Rubin <[email protected]>
* adding search_analyzer mapping parameters docs
  Signed-off-by: Anton Rubin <[email protected]>
* fixing vale errors
  Signed-off-by: Anton Rubin <[email protected]>
* adding link to existing docs on search analyzer
  Signed-off-by: Anton Rubin <[email protected]>
* adding a note about performance
  Signed-off-by: Anton Rubin <[email protected]>
* addressing the PR comments
  Signed-off-by: Anton Rubin <[email protected]>
* Apply suggestions from code review
  Co-authored-by: kolchfa-aws <[email protected]>
  Signed-off-by: AntonEliatra <[email protected]>
* adding further details
  Signed-off-by: Anton Rubin <[email protected]>
* Apply suggestions from code review
  Co-authored-by: kolchfa-aws <[email protected]>
  Signed-off-by: AntonEliatra <[email protected]>
* Update search-analyzers.md
  Signed-off-by: AntonEliatra <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Nathan Bower <[email protected]>
  Signed-off-by: AntonEliatra <[email protected]>

---------

Signed-off-by: Anton Rubin <[email protected]>
Signed-off-by: AntonEliatra <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
1 parent 3697333 commit 2e96e9a

2 files changed: +158 -15 lines changed

_analyzers/search-analyzers.md

Lines changed: 102 additions & 15 deletions
Original file line number, diff line number, diff line change
@@ -22,14 +22,12 @@ To determine which analyzer to use for a query string at query time, OpenSearch
2222
In most cases, specifying a search analyzer that is different from the index analyzer is not necessary and could negatively impact search result relevance or lead to unexpected search results.
2323
{: .warning}
2424

25-
For information about verifying which analyzer is associated with which field, see [Verifying analyzer settings]({{site.url}}{{site.baseurl}}/analyzers/index/#verifying-analyzer-settings).
25+
## Specifying a search analyzer at query time
2626

27-
## Specifying a search analyzer for a query string
28-
29-
Specify the name of the analyzer you want to use at query time in the `analyzer` field:
27+
You can override the default analyzer behavior by explicitly setting the analyzer in the query. The following query uses the `english` analyzer to stem the input terms:
3028

3129
```json
32-
GET shakespeare/_search
30+
GET /shakespeare/_search
3331
{
3432
"query": {
3533
"match": {
@@ -43,16 +41,16 @@ GET shakespeare/_search
4341
```
4442
{% include copy-curl.html %}
4543

46-
For more information about supported analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/supported-analyzers/index/).
44+
## Specifying a search analyzer in the mappings
4745

48-
## Specifying a search analyzer for a field
46+
When defining mappings, you can provide both the `analyzer` (used at index time) and `search_analyzer` (used at query time) for any [`text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) field.
4947

50-
When creating index mappings, you can provide the `search_analyzer` parameter for each [text]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) field. When providing the `search_analyzer`, you must also provide the `analyzer` parameter, which specifies the [index analyzer]({{site.url}}{{site.baseurl}}/analyzers/index-analyzers/) to be used at indexing time.
48+
### Example: Different analyzers for indexing and search
5149

52-
For example, the following request specifies the `simple` analyzer as the index analyzer and the `whitespace` analyzer as the search analyzer for the `text_entry` field:
50+
The following configuration allows different tokenization strategies for indexing and querying:
5351

5452
```json
55-
PUT testindex
53+
PUT /testindex
5654
{
5755
"mappings": {
5856
"properties": {
@@ -67,14 +65,100 @@ PUT testindex
6765
```
6866
{% include copy-curl.html %}
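
To make the effect of mismatched analyzers concrete, the following Python sketch approximates two built-in analyzers outside of OpenSearch. It assumes the mapping pairs the `simple` analyzer (index time) with the `whitespace` analyzer (search time), as the earlier wording of this section described. The function names and the sample sentence are hypothetical, and the tokenization rules are simplified stand-ins rather than the actual analyzer implementations:

```python
import re


def simple_analyze(text):
    """Approximate the `simple` analyzer: split on non-letters, then lowercase."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def whitespace_analyze(text):
    """Approximate the `whitespace` analyzer: split on whitespace, keep case."""
    return text.split()


# Terms stored at index time for a hypothetical `text_entry` value.
indexed_terms = set(simple_analyze("To be, or not to be"))


def matches(query):
    """Return True if every search-time token exists among the indexed terms."""
    return all(token in indexed_terms for token in whitespace_analyze(query))


print(matches("not"))  # lowercase query token equals an indexed term
print(matches("Not"))  # capitalized token is not lowercased at search time
```

In this sketch, the capitalized query fails to match because only the index-time analyzer lowercases tokens, which is one way a differing search analyzer can lead to unexpected results.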
6967

70-
## Specifying the default search analyzer for an index
68+
### Example: Using the edge n-gram analyzer for indexing and the standard analyzer for search
7169

72-
If you want to analyze all query strings at search time with the same analyzer, you can specify the search analyzer in the `analysis.analyzer.default_search` setting. When providing the `analysis.analyzer.default_search`, you must also provide the `analysis.analyzer.default` parameter, which specifies the [index analyzer]({{site.url}}{{site.baseurl}}/analyzers/index-analyzers/) to be used at indexing time.
70+
The following configuration enables [autocomplete]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/autocomplete/)-like behavior, where you can type the beginning of a word and still receive relevant matches:
7371

74-
For example, the following request specifies the `simple` analyzer as the index analyzer and the `whitespace` analyzer as the search analyzer for the `testindex` index:
72+
```json
73+
PUT /articles
74+
{
75+
"settings": {
76+
"analysis": {
77+
"analyzer": {
78+
"edge_ngram_analyzer": {
79+
"tokenizer": "edge_ngram_tokenizer",
80+
"filter": ["lowercase"]
81+
}
82+
},
83+
"tokenizer": {
84+
"edge_ngram_tokenizer": {
85+
"type": "edge_ngram",
86+
"min_gram": 2,
87+
"max_gram": 10,
88+
"token_chars": ["letter", "digit"]
89+
}
90+
}
91+
}
92+
},
93+
"mappings": {
94+
"properties": {
95+
"title": {
96+
"type": "text",
97+
"analyzer": "edge_ngram_analyzer",
98+
"search_analyzer": "standard"
99+
}
100+
}
101+
}
102+
}
103+
```
104+
{% include copy-curl.html %}
105+
106+
The `edge_ngram_analyzer` is applied at index time, breaking each word into lowercase prefixes (edge n-grams) of 2 to 10 characters, so the index stores fragments such as "se", "sea", and "sear".
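
The following Python sketch imitates this behavior outside of OpenSearch to show which terms end up in the index and why a prefix query can match them. It is a simplified approximation, not the actual tokenizer: the function names are hypothetical, the `min_gram` and `max_gram` values are copied from the settings above, and the stand-in for the `standard` analyzer only lowercases word tokens:

```python
import re

MIN_GRAM, MAX_GRAM = 2, 10  # copied from the edge_ngram_tokenizer settings


def edge_ngram_analyze(text):
    """Approximate index-time analysis: letter/digit runs -> lowercase prefixes."""
    tokens = []
    for word in re.findall(r"[^\W_]+", text):  # runs of letters and digits
        longest = min(MAX_GRAM, len(word))
        for size in range(MIN_GRAM, longest + 1):
            tokens.append(word[:size].lower())
    return tokens


def standard_analyze(text):
    """Rough stand-in for the `standard` analyzer: lowercase whole words."""
    return [word.lower() for word in re.findall(r"[^\W_]+", text)]


# Terms stored for the title used in this example.
index_terms = set(edge_ngram_analyze("Search Analyzer in Action"))


def matches(query):
    """Return True if any search-time token equals a stored prefix."""
    return any(token in index_terms for token in standard_analyze(query))


print(edge_ngram_analyze("Search"))
print(matches("sear"))   # "sear" is a stored prefix of "search"
print(matches("earch"))  # not a prefix, so nothing in the index matches
```

Because the query is analyzed with the `standard` analyzer, it stays a single whole token, which is then compared against the stored prefixes. Applying the edge n-gram analyzer at search time as well would also split the query into prefixes and broaden the matches.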
107+
Use the following request to index a document:
75108

76109
```json
77-
PUT testindex
110+
PUT /articles/_doc/1
111+
{
112+
"title": "Search Analyzer in Action"
113+
}
114+
```
115+
{% include copy-curl.html %}
116+
117+
Use the following request to search for the partial word `sear` in the `title` field:
118+
119+
```json
120+
POST /articles/_search
121+
{
122+
"query": {
123+
"match": {
124+
"title": "sear"
125+
}
126+
}
127+
}
128+
```
129+
{% include copy-curl.html %}
130+
131+
The response demonstrates that the query containing "sear" matches the document "Search Analyzer in Action" because the n-gram tokens generated at index time include that prefix. This mirrors the [autocomplete functionality]({{site.url}}{{site.baseurl}}/search-plugins/searching-data/autocomplete/), in which typing a prefix can retrieve full matches:
132+
133+
```json
134+
{
135+
...
136+
"hits": {
137+
"total": {
138+
"value": 1,
139+
"relation": "eq"
140+
},
141+
"max_score": 0.2876821,
142+
"hits": [
143+
{
144+
"_index": "articles",
145+
"_id": "1",
146+
"_score": 0.2876821,
147+
"_source": {
148+
"title": "Search Analyzer in Action"
149+
}
150+
}
151+
]
152+
}
153+
}
154+
```
155+
156+
## Setting a default search analyzer for an index
157+
158+
Specify `analysis.analyzer.default_search` to define a search analyzer for all fields unless overridden:
159+
160+
```json
161+
PUT /testindex
78162
{
79163
"settings": {
80164
"analysis": {
@@ -89,6 +173,9 @@ PUT testindex
89173
}
90174
}
91175
}
92-
93176
```
94177
{% include copy-curl.html %}
178+
179+
This configuration applies the same search-time analysis to every field that does not define its own `search_analyzer`, which keeps behavior consistent across fields, especially when using custom analyzers.
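
How an index-level default interacts with the other settings can be sketched as a simple lookup. The order used here (query `analyzer`, then the field's `search_analyzer`, then `analysis.analyzer.default_search`, then the field's `analyzer`, then `standard`) is an assumption based on the precedence list at the top of this page, which is only partially visible in this diff. The function and its arguments are hypothetical and are not an OpenSearch API:

```python
def resolve_search_analyzer(query_analyzer=None, field_mapping=None, index_settings=None):
    """Pick the analyzer for a query string using the assumed precedence order."""
    field_mapping = field_mapping or {}
    index_settings = index_settings or {}
    candidates = [
        query_analyzer,                         # `analyzer` set in the query
        field_mapping.get("search_analyzer"),   # field-level search analyzer
        index_settings.get("default_search"),   # analysis.analyzer.default_search
        field_mapping.get("analyzer"),          # field-level index analyzer
    ]
    for name in candidates:
        if name is not None:
            return name
    return "standard"  # fallback when nothing is configured


# The index default applies to a field that has no search analyzer of its own.
print(resolve_search_analyzer(
    field_mapping={"analyzer": "simple"},
    index_settings={"default_search": "whitespace"},
))

# A field-level search_analyzer overrides the index default.
print(resolve_search_analyzer(
    field_mapping={"analyzer": "simple", "search_analyzer": "standard"},
    index_settings={"default_search": "whitespace"},
))
```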
180+
181+
For more information about supported analyzers, see [Analyzers]({{site.url}}{{site.baseurl}}/analyzers/supported-analyzers/index/).
Lines changed: 56 additions & 0 deletions
Original file line number, diff line number, diff line change
@@ -0,0 +1,56 @@
1+
---
2+
layout: default
3+
title: Search analyzer
4+
parent: Mapping parameters
5+
grand_parent: Mapping and field types
6+
nav_order: 160
7+
has_children: false
8+
has_toc: false
9+
---
10+
11+
# Search analyzer
12+
13+
The `search_analyzer` mapping parameter specifies the analyzer to be used at search time for a [`text`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/text/) field. This allows the analyzer used for indexing to differ from the one used for search, offering greater control over how search terms are interpreted and matched.
14+
15+
By default, the same analyzer is used for both indexing and search. However, using a custom `search_analyzer` can be helpful when you want to apply looser or stricter matching rules during search, such as applying [stemming]({{site.url}}{{site.baseurl}}/analyzers/stemming/) or removing stopwords only at search time. For more information and use cases, see [Search analyzers]({{site.url}}{{site.baseurl}}/analyzers/search-analyzers/).
16+
{: .note}
17+
18+
## Example
19+
20+
The following example creates a field that uses an `edge_ngram_analyzer` configured with an [`edge_ngram_tokenizer`]({{site.url}}{{site.baseurl}}/analyzers/tokenizers/edge-n-gram/) for indexing and a [`standard` analyzer]({{site.url}}{{site.baseurl}}/analyzers/supported-analyzers/standard/) for search:
21+
22+
```json
23+
PUT /articles
24+
{
25+
"settings": {
26+
"analysis": {
27+
"analyzer": {
28+
"edge_ngram_analyzer": {
29+
"tokenizer": "edge_ngram_tokenizer",
30+
"filter": ["lowercase"]
31+
}
32+
},
33+
"tokenizer": {
34+
"edge_ngram_tokenizer": {
35+
"type": "edge_ngram",
36+
"min_gram": 2,
37+
"max_gram": 10,
38+
"token_chars": ["letter", "digit"]
39+
}
40+
}
41+
}
42+
},
43+
"mappings": {
44+
"properties": {
45+
"title": {
46+
"type": "text",
47+
"analyzer": "edge_ngram_analyzer",
48+
"search_analyzer": "standard"
49+
}
50+
}
51+
}
52+
}
53+
```
54+
{% include copy-curl.html %}
55+
56+
For a full explanation of how search analyzers work as well as more examples, see [Search analyzers]({{site.url}}{{site.baseurl}}/analyzers/search-analyzers/).