Elasticsearch’s bool query gives you an elegant way to express complex boosting conditions. I’ve used boosts on fields inside match clauses before. On their own, though, those clauses make the match mandatory: a document has to match the clause to be returned at all, not just to be boosted.

The bool query, however, lets you combine a set of mandatory must clauses with optional should clauses. The more should clauses a document matches, the higher it scores. In addition, you can set a boost on each clause.
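
As a minimal sketch of that structure, here is how a bool query with one mandatory clause and two optional clauses could be assembled in Python. The helper names `build_bool_query` and `boosted_match` and the sample search text are hypothetical, chosen only for illustration; the code builds the request body and does not contact a cluster.

```python
import json


def boosted_match(field, text, boost=1):
    """Build a match clause with a per-clause boost (hypothetical helper)."""
    return {"match": {field: {"query": text, "boost": boost}}}


def build_bool_query(must_clauses, should_clauses):
    """Assemble a bool query body.

    Every clause in `must` has to match. When `must` is present, clauses in
    `should` are optional and only add to the score of documents they match.
    """
    return {
        "query": {
            "bool": {
                "must": list(must_clauses),
                "should": list(should_clauses),
            }
        }
    }


query = build_bool_query(
    must_clauses=[
        {"query_string": {"query": "budget report", "default_operator": "AND"}}
    ],
    should_clauses=[
        boosted_match("from", "alice", boost=2),  # counts double when it matches
        boosted_match("has_attach", "Y"),         # default boost of 1
    ],
)

print(json.dumps(query, indent=2))
```

A document that matches only the must clause is still returned; matching either should clause just moves it up the ranking.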

I index my Outlook email in Elasticsearch and recently started using a boosted bool query that has dramatically increased the accuracy of my searches and reduced the time it takes to find what I’m looking for. This is actually my second attempt at indexing my email in the last five years. Over those years, I learned that I’m typically looking for emails:

  • with several AND’d keywords
  • from a specific person
  • to a specific person
  • that I’ve searched for and looked at before
  • with attachments
  • that are recent, usually within the last month but at least within the last six months

Here is the query that matches my searching habits:


    {
      "query": {
        "filtered": {
          "query": {
            "bool": {
              "must": [
                { "query_string": { "query": "--query-- +_type:email", "default_operator": "AND" } }
              ],
              "should": [
                { "query_string": { "fields": ["from"], "query": "--query--", "boost": 2 } },
                { "query_string": { "fields": ["to"], "query": "--query--", "boost": 2 } },
                { "match": { "opened": { "query": "Y", "boost": 2 } } },
                { "match": { "attachment": "--query--" } },
                { "match": { "dateYear": "--query--" } },
                { "match": { "dateMonth": "--query--" } },
                { "match": { "has_attach": "Y" } },
                { "range": { "@timestamp": { "gt": "now-30d", "lt": "now" } } },
                { "range": { "@timestamp": { "gt": "now-180d", "lt": "now" } } }
              ]
            }
          }
        }
      },
      "highlight": {
        "fields": {
          "subject": {},
          "body-stem": {},
          "attachment": {}
        },
        "fragment_size": 50,
        "number_of_fragments": 1,
        "pre_tags": [" "],
        "post_tags": ["\n"]
      },
      "size": 20,
      "fields": ["subject", "_id", "@timestamp", "has_attach", "from", "body-stem"],
      "sort": [
        {
          "_score": {
            "order": "desc",
            "ignore_unmapped": true
          }
        }
      ]
    }
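
The `--query--` token in the query above stands in for the search text. The sketch below shows one way that substitution could be done in Python, assuming the query is held as a parsed structure rather than as raw text. The function name `fill_placeholder`, the trimmed-down template, and the sample search text are illustrative assumptions, not the code of the actual client.

```python
PLACEHOLDER = "--query--"


def fill_placeholder(template, user_text, placeholder=PLACEHOLDER):
    """Return a copy of `template` with the placeholder replaced in every string value.

    Dict keys are left untouched; only values are rewritten.
    """
    if isinstance(template, str):
        return template.replace(placeholder, user_text)
    if isinstance(template, dict):
        return {
            key: fill_placeholder(value, user_text, placeholder)
            for key, value in template.items()
        }
    if isinstance(template, list):
        return [fill_placeholder(item, user_text, placeholder) for item in template]
    return template  # numbers, booleans and None pass through unchanged


# A shortened version of the bool part of the query, for illustration only.
template = {
    "bool": {
        "must": [
            {"query_string": {"query": "--query-- +_type:email",
                              "default_operator": "AND"}}
        ],
        "should": [
            {"query_string": {"fields": ["from"], "query": "--query--", "boost": 2}},
            {"match": {"has_attach": "Y"}},
        ],
    }
}

filled = fill_placeholder(template, "budget report")
print(filled["bool"]["must"][0]["query_string"]["query"])
# prints: budget report +_type:email
```

Working on the parsed structure rather than on the raw JSON text means that quotes in the search text cannot break the request body. Characters that are special in query_string syntax are still interpreted by Elasticsearch as query operators.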

I created a simple test client for a research project and decided to re-purpose it as a simple email search client. It’s not pretty, but it’s very effective for finding lost email needles in the daily onslaught of emails. I like the simplicity of just typing search words and getting very relevant results back. I had been using Kibana to find emails but got frustrated that I kept having to manually add filters when I felt the search should have just been smarter. I’ve been very happy with the results and impressed with the elegance of the simple, yet powerful, query syntax.
