Elasticsearch Character Filter

In this post, I am going to explain how the Elasticsearch character filter works. The steps are as follows.
Step-1: Set the mapping for your index: Suppose our index name is ‘testindex’ and the type is ‘testtype’. Now we are going to set up the analyzers and the character filter.


curl -XPUT 'localhost:9200/testindex' -d '
{
  "settings": {
    "analysis": {
      "char_filter": {
        "quotes": {
          "type": "mapping",
          "mappings": [
            "&=>and"
          ]
        }
      },
      "analyzer": {
        "gramAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "char_filter": ["quotes"],
          "filter": [
            "lowercase"
          ]
        },
        "whitespaceAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "testtype": {
      "_all": {
        "analyzer": "gramAnalyzer",
        "search_analyzer": "gramAnalyzer"
      },
      "properties": {
        "Name": {
          "type": "string",
          "include_in_all": true,
          "analyzer": "gramAnalyzer",
          "search_analyzer": "gramAnalyzer",
          "store": true
        }
      }
    }
  }
}'

“Character filters are used to preprocess the string of characters before it is passed to the tokenizer. A character filter may be used to strip out HTML markup, or to convert ‘&’ characters to the word ‘and’.”

As you have seen above, we have set a character filter to convert ‘&’ to ‘and’.

Step-2: Index your data: Now we are going to index data containing the ‘&’ character, but we want to be able to fetch it by searching with ‘and’.


curl -XPOST 'localhost:9200/testindex/testtype/1' -d '{
  "Name": "karra&john"
}'

Step-3: Check how the analyzer works on your indexed data:


curl 'http://localhost:9200/testindex/_analyze?analyzer=gramAnalyzer' \
  -d 'karra&john'
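
To build some intuition for what this request exercises, the chain configured for gramAnalyzer in Step 1 (mapping character filter, then whitespace tokenizer, then lowercase filter) can be roughly imitated with standard shell tools. This is only a local sketch under that assumption, not Elasticsearch itself, and the function name simulate_gram_analyzer is made up for illustration.

```shell
# Rough local imitation of the gramAnalyzer chain; this is NOT Elasticsearch.
# simulate_gram_analyzer is a hypothetical helper name used only here.
simulate_gram_analyzer() {
  printf '%s\n' "$1" |
    sed 's/&/and/g' |             # char filter: map "&" to "and" before tokenizing
    tr -s ' \t' '\n\n' |          # tokenizer: split on spaces/tabs, one token per line
    tr '[:upper:]' '[:lower:]'    # token filter: lowercase every token
}

simulate_gram_analyzer 'karra&john'    # a single token
simulate_gram_analyzer 'Tom & Jerry'   # three tokens, one per line
```

Because the replacement happens before tokenizing, ‘karra&john’ stays one token, while a standalone ‘&’ between spaces becomes its own ‘and’ token.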

Step-4: Fetch data with a match query: Now I want to fetch the data with a match query. As you have seen above, the indexed data is ‘karra&john’, but now we will fetch it by searching for ‘karraandjohn’. See the query below.


curl -XGET 'http://localhost:9200/testindex/testtype/_search' -d '
{
  "query": {
    "match": {
      "Name": {
        "query": "karraandjohn",
        "analyzer": "gramAnalyzer"
      }
    }
  }
}'

You can also leave the analyzer out of the query, because the search_analyzer of the ‘Name’ field is already gramAnalyzer:


curl -XGET 'http://localhost:9200/testindex/testtype/_search' -d '
{
  "query": {
    "match": {
      "Name": {
        "query": "karraandjohn"
      }
    }
  }
}'

Step-5: Fetch data with a filter:

curl -XGET 'http://localhost:9200/testindex/testtype/_search' -d '
{
  "query": {
    "filtered": {
      "filter": {
        "term": { "Name": "karraandjohn" }
      }
    }
  }
}'

The result would look like this:


{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "testindex",
      "_type" : "testtype",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : {
        "Name" : "karra&john"
      }
    } ]
  }
}
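
A term filter does not analyze its input, so the query in Step 5 matches only because the value it sends equals the token stored in the index. Assuming the index holds the single token ‘karraandjohn’ for the ‘Name’ field (what gramAnalyzer as configured in Step 1 should produce for ‘karra&john’), that exact-match behaviour can be imitated locally as follows. The names INDEXED_TOKENS and term_matches are hypothetical and exist only for this sketch.

```shell
# Tokens ASSUMED to be stored in the index for the Name field of document 1.
INDEXED_TOKENS='karraandjohn'

# term_matches is a hypothetical helper: an exact, unanalyzed comparison
# against the stored tokens, which is how a term filter treats its input.
term_matches() {
  printf '%s\n' "$INDEXED_TOKENS" | grep -Fqx -- "$1"
}

term_matches 'karraandjohn' && echo 'karraandjohn: match'
term_matches 'karra&john'   || echo 'karra&john: no match'
```

Note that the ‘_source’ in the result still shows the original ‘karra&john’: the character filter changes what is indexed for searching, not the stored document.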

So we have learned how a character filter works.

This is just the start with Elasticsearch; from next week onwards we will be working on new topics. If you have any suggestions, feel free to share them with us 🙂 Stay tuned.
