ElasticSearch 0.90 – Sorting Inside Nested Objects

In addition to the ability to control which value is taken into consideration when sorting on multi-valued fields, ElasticSearch 0.90.0.Beta1 introduced another sorting improvement: the ability to sort on fields inside nested objects. We can choose a path, a filter that narrows down the nested objects, and a sorting mode that works just like the sort_mode property described in the post about sorting on multi-valued fields.

Nested Objects Sorting

This time let’s start with the options introduced in ElasticSearch 0.90.0. In addition to the standard sorting options, we can sort on values calculated from nested objects. To show how that works, let’s assume that we have a book object and that each book can have multiple prices in different regions. What we would like to do is sort the books that match our query on the minimum of their regional prices.

The Mappings

In order to store our documents we can use the following simple mappings (we store them in the library.json file):

{
 "mappings" : {
  "book" : {
   "properties" : {
    "id": { "type": "string", "store": "no", "index": "not_analyzed" },
    "title": { "type" : "string", "store" : "no", "index" : "analyzed" },
    "prices": {
     "type": "nested",
     "properties" : {
      "price": { "type": "float", "store": "no" },
      "region": { "type" : "string", "store" : "no", "index": "not_analyzed" }
     }
    }
   }
  }
 }
}

In order to create our library index we use the following command:

$ curl -XPOST 'localhost:9200/library' -d @library.json

The Data

Now let’s index two sample documents by running the following two commands:

$ curl -XPOST 'localhost:9200/library/book/1' -d '{
 "id": 1,
 "title": "Book one",
 "prices" : [
  {
   "price": 13.27,
   "region": "Europe"
  },
  {
   "price": 12.70,
   "region": "USA"
  },
  {
   "price": 11.99,
   "region": "Asia"
  }
 ]
}'

And the second document:

$ curl -XPOST 'localhost:9200/library/book/2' -d '{
 "id": 2,
 "title": "Book two",
 "prices" : [
  {
   "price": 11.00,
   "region": "Europe"
  },
  {
   "price": 10.99,
   "region": "USA"
  },
  {
   "price": 14.32,
   "region": "Asia"
  }
 ]
}'

Our Query

In order to see how it all works let’s find all the documents with the term book in the title field and let’s sort them on the basis of minimum price from all the regions. The query that will do all that looks like this:

{
 "fields" : ["_id"],
 "query" : {
  "term" : {
   "title" : "book"
  }
 },
 "sort" : [
  {
   "price" : {
    "mode" : "min",
    "order" : "asc",
    "nested_path" : "prices"
   }
  },
  "_score"
 ]
}

And the response from ElasticSearch would be as follows:

{
 "took" : 2,
 "timed_out" : false,
 "_shards" : {
  "total" : 5,
  "successful" : 5,
  "failed" : 0
 },
 "hits" : {
  "total" : 2,
  "max_score" : null,
  "hits" : [ {
   "_index" : "library",
   "_type" : "book",
   "_id" : "2",
   "_score" : 1.058217,
   "sort" : [ 10.99, 1.058217 ]
  }, {
   "_index" : "library",
   "_type" : "book",
   "_id" : "1",
   "_score" : 1.058217,
   "sort" : [ 11.99, 1.058217 ]
  } ]
 }
}

As we can see, the book with the identifier 2 comes first, because its lowest price is 10.99 (for the region named USA), while the lowest price of book 1 is 11.99 (Asia). So it works as intended. The sort definition is not complicated: we’ve used price as the name of the sort entry to tell ElasticSearch that we want to sort on the price field. The nested_path property defines which nested object the sort field is taken from, which in our case is the prices nested object. Of course, we can also set the sorting order to asc or desc by using the order property.
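The post shows the query body but not how it is sent. As a minimal sketch, assuming ElasticSearch listens on localhost:9200 (as in the curl commands above) and that the index is called library, the request could be built with Python's standard library as shown below. The helper names build_search_request and run_search are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Assumption: ElasticSearch is reachable here, as in the curl commands above.
ES_URL = "http://localhost:9200"

# The query from the post: match "book" in the title field and sort on the
# minimum price found in the nested "prices" objects, then on the score.
QUERY = {
    "fields": ["_id"],
    "query": {"term": {"title": "book"}},
    "sort": [
        {"price": {"mode": "min", "order": "asc", "nested_path": "prices"}},
        "_score",
    ],
}


def build_search_request(index, body):
    """Build (but do not send) a POST request to the index's _search endpoint."""
    return urllib.request.Request(
        url="%s/%s/_search" % (ES_URL, index),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(index, body):
    """Send the request and return the decoded JSON response.

    This needs a running ElasticSearch node, so nothing calls it here.
    """
    with urllib.request.urlopen(build_search_request(index, body)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running and the two sample books indexed, calling run_search("library", QUERY) and reading the hits.hits list of the result gives access to the sort array of every hit.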

Sorting Mode

In the above example we’ve set the sorting mode to min, which means that the minimum value from the matching nested objects is taken into consideration. We can change that behavior by setting the mode parameter to one of the following values:

  • min – ElasticSearch will choose the minimum value,
  • max – ElasticSearch will choose the maximum value,
  • avg – ElasticSearch will use the calculated average value for sorting (only for numeric fields),
  • sum – ElasticSearch will use the calculated sum value for sorting (only for numeric fields).
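To get a feeling for what each mode does to our two sample books, we can reproduce the calculation locally. The snippet below is only a plain Python simulation of the idea (collapse the nested values of each document into a single sort key), not the way ElasticSearch implements it. The names BOOKS, MODES and sort_ids are made up for this sketch.

```python
# Nested prices of the two sample books from the post, keyed by document id.
BOOKS = {
    "1": [13.27, 12.70, 11.99],
    "2": [11.00, 10.99, 14.32],
}

# One reducer per sort mode; each collapses a document's nested values
# into the single value that the document is sorted on.
MODES = {
    "min": min,
    "max": max,
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
}


def sort_ids(books, mode, order="asc"):
    """Return document ids ordered by the reduced value of their nested prices."""
    reduce_values = MODES[mode]
    return sorted(
        books,
        key=lambda doc_id: reduce_values(books[doc_id]),
        reverse=(order == "desc"),
    )


for mode in ("min", "max", "avg", "sum"):
    print(mode, sort_ids(BOOKS, mode))
```

With this data only the max mode changes the ascending order: the highest price of book 1 is 13.27 and the highest price of book 2 is 14.32, so book 1 comes first. For min, avg and sum book 2 stays on top.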

Nested Filter

In addition to the properties described above, we can filter the nested documents we sort on by putting a regular filter inside the nested_filter property. For example, let’s sort on the price field again, but take into account only those nested objects that are in the Europe region. Our query would be modified to look like this:

{
 "fields" : ["_id"],
 "query" : {
  "term" : {
   "title" : "book"
  }
 },
 "sort" : [
  {
   "price" : {
    "mode" : "min",
    "order" : "asc",
    "nested_path": "prices",
    "nested_filter" : {
     "term" : {
      "region" : "Europe"
     }
    }
   }
  },
  "_score"
 ]
}

The response to the above query would be as follows:

{
 "took" : 3,
 "timed_out" : false,
 "_shards" : {
  "total" : 5,
  "successful" : 5,
  "failed" : 0
 },
 "hits" : {
  "total" : 2,
  "max_score" : null,
  "hits" : [ {
   "_index" : "library",
   "_type" : "book",
   "_id" : "2",
   "_score" : 1.058217,
   "sort" : [ 11.0, 1.058217 ]
  }, {
   "_index" : "library",
   "_type" : "book",
   "_id" : "1",
   "_score" : 1.058217,
   "sort" : [ 13.27, 1.058217 ]
  } ]
 }
}

As you can see, it’s working as intended: only the Europe prices (11.0 for book 2 and 13.27 for book 1) are used as sort values.
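The effect of the nested filter can also be checked by hand. The sketch below is a local Python simulation that assumes the filter simply removes the non-matching nested objects before the minimum is taken; it is not ElasticSearch code, and min_price and sorted_hits are hypothetical helper names. How documents without any matching nested object are ordered is an assumption made only for this sketch.

```python
# Sample documents from the post, reduced to their nested price objects.
BOOKS = {
    "1": [
        {"price": 13.27, "region": "Europe"},
        {"price": 12.70, "region": "USA"},
        {"price": 11.99, "region": "Asia"},
    ],
    "2": [
        {"price": 11.00, "region": "Europe"},
        {"price": 10.99, "region": "USA"},
        {"price": 14.32, "region": "Asia"},
    ],
}


def min_price(nested_prices, region=None):
    """Minimum price among nested objects, optionally restricted to one region."""
    candidates = [
        entry["price"]
        for entry in nested_prices
        if region is None or entry["region"] == region
    ]
    # Assumption for this sketch only: a document with no matching
    # nested object gets an infinite key and therefore sorts last.
    return min(candidates) if candidates else float("inf")


def sorted_hits(books, region=None):
    """Return (document id, sort value) pairs in ascending order of the sort value."""
    pairs = [(doc_id, min_price(prices, region)) for doc_id, prices in books.items()]
    return sorted(pairs, key=lambda pair: pair[1])


print(sorted_hits(BOOKS))            # no filter: every region counts
print(sorted_hits(BOOKS, "Europe"))  # only the Europe prices count
```

Both calls put book 2 first, with the same sort values as in the two responses shown in this post. Passing "Asia" instead would flip the order, because the Asia price of book 1 (11.99) is lower than that of book 2 (14.32).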

4 thoughts on “ElasticSearch 0.90 – Sorting Inside Nested Objects”

  1. Bastiaan says:

    Think the “type”: “nested” should be placed within the “properties” in above mappings file library.json

    As explained here: http://www.elasticsearch.org/guide/reference/mapping/nested-type/

    Took me some time to find out.
    Otherwise this was of great help to me, Thanks.

    • Rafał Kuć says:

      Try running the following command:
      curl -XPOST 'localhost:9200/test' -d '{
       "mappings" : {
        "book" : {
         "properties" : {
          "id": { "type": "string", "store": "no", "index": "not_analyzed" },
          "title": { "type" : "string", "store" : "no", "index" : "analyzed" },
          "prices": {
           "type": "nested",
           "properties" : {
            "price": { "type": "float", "store": "no" },
            "region": { "type" : "string", "store" : "no", "index": "not_analyzed" }
           }
          }
         }
        }
       }
      }'

      The mappings that you’ll get can then be retrieved as follows:
      curl -XGET 'localhost:9200/test/_mapping?pretty'

      And the response:
      {
       "test" : {
        "book" : {
         "properties" : {
          "id" : {
           "type" : "string",
           "index" : "not_analyzed",
           "omit_norms" : true,
           "index_options" : "docs"
          },
          "prices" : {
           "type" : "nested",
           "properties" : {
            "price" : {
             "type" : "float"
            },
            "region" : {
             "type" : "string",
             "index" : "not_analyzed",
             "omit_norms" : true,
             "index_options" : "docs"
            }
           }
          },
          "title" : {
           "type" : "string"
          }
         }
        }
       }
      }

      So I think there is nothing wrong with the mappings, is there? We’ve specified two non-nested fields and one nested field, and this is how our type looks in the index.

      And great that this entry was helpful 🙂

  2. Daniel says:

    What if I want to sort by a second-level nested document like this:

    {
     "id": 2,
     "title": "Book two",
     "prices" : [
      {
       "price": 11.00,
       "region": "Europe",
       "member": [
        {"name": "gold", "score": 1.1},
        {"name": "silver", "score": 2.3},
        {"name": "bronze", "score": 3.2}
       ]
      },
      {
       "price": 10.99,
       "region": "USA",
       "member": [
        {"name": "gold", "score": 1.1},
        {"name": "silver", "score": 2.1},
        {"name": "bronze", "score": 3.2}
       ]
      },
      {
       "price": 14.32,
       "region": "Asia",
       "member": [
        {"name": "gold", "score": 1.2},
        {"name": "silver", "score": 2.2},
        {"name": "bronze", "score": 3.1}
       ]
      }
     ]
    }

    how can I filter by
    price.name=Asia
    and
    price.member.name=silver

    and sort by price.member.score
