stored fields

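When defining an index mapping, you can set the store attribute to true on selected fields. A search request can then carry a stored_fields parameter that names some fields, and those fields are included in the returned hits. Requested fields that were not stored are ignored.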

When store is not set (fields are not stored by default)

PUT blog_index
{
  "mappings":{
      "properties":{
        "title":{
          "type":"text",
          "fields":{
            "keyword":{
              "type":"keyword",
              "ignore_above":100
            }
          }
        },
        "publish_date":{
          "type":"date"
        },
        "author":{
          "type":"keyword",
          "ignore_above":100
        },
        "abstract":{
          "type":"text"
        },
        "url":{
          "enabled":false
        },
        "content":{
          "type":"text"
        }
      }
    }
}

Index a document:

PUT blog_index/_doc/1
{
  "title":"blog title",
  "content":"blog content"
}

A default search (GET blog_index/_search) returns all of the document's fields under _source:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "blog_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "title" : "blog title",
          "content" : "blog content"
        }
      }
    ]
  }
}

When store is set (delete the earlier blog_index first, because PUT on an existing index fails)

PUT blog_index
{
  "mappings": {
      "_source": {
        "enabled": false
      },
      "properties": {
        "title": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 100
            }
          },
          "store": true
        },
        "publish_date": {
          "type": "date",
          "store": true
        },
        "author": {
          "type": "keyword",
          "ignore_above": 100, 
          "store": true
        },
        "abstract": {
          "type": "text",
          "store": true
        },
        "content": {
          "type": "text"
        },
        "url": {
          "type": "keyword",
          "doc_values":false,
          "norms":false,
          "ignore_above": 100, 
          "store": true
        }
      }
    }
}
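
As a quick sanity check before querying, it can help to list which fields in a mapping are actually stored. The sketch below is only illustrative: stored_field_names is a hypothetical helper (not part of any Elasticsearch client), it inspects top-level properties only, and blog_mapping is a trimmed copy of the mapping above.

```python
# Hypothetical helper: list the top-level fields whose mapping sets "store": true.
def stored_field_names(mapping):
    properties = mapping.get("mappings", {}).get("properties", {})
    return [name for name, spec in properties.items() if spec.get("store") is True]


# Trimmed copy of the blog_index mapping shown above.
blog_mapping = {
    "mappings": {
        "_source": {"enabled": False},
        "properties": {
            "title": {"type": "text", "store": True},
            "publish_date": {"type": "date", "store": True},
            "author": {"type": "keyword", "ignore_above": 100, "store": True},
            "abstract": {"type": "text", "store": True},
            "content": {"type": "text"},  # no "store", so not retrievable via stored_fields
            "url": {"type": "keyword", "store": True},
        },
    }
}

print(stored_field_names(blog_mapping))
# ['title', 'publish_date', 'author', 'abstract', 'url']
```

Only these names will produce values when listed in stored_fields; content will be silently skipped.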

Index the document again and search. Each hit now has a section named fields and no longer has _source. Field values fetched this way are returned as arrays.

GET blog_index/_search
{
  "stored_fields": ["title","content"],   // content is not stored, so it is ignored when stored fields are fetched
  "highlight":{
    "fields": {"content": {}}
  }
}

-------- Result ------------------
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "blog_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "fields" : {
          "title" : [
            "blog title"
          ]
        }
      }
    ]
  }
}
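
The shape of that fields section can be mimicked with a few lines of Python. This is a simplified model for illustration, not how Elasticsearch is implemented: simulate_stored_fields and its arguments are hypothetical names, and stored stands for the values that were written as stored fields for document 1.

```python
def simulate_stored_fields(stored_values, requested):
    """Build a dict shaped like the "fields" section of a hit.

    Requested names with no stored value are skipped, and every
    returned value is wrapped in a list.
    """
    fields = {}
    for name in requested:
        if name in stored_values:
            value = stored_values[name]
            fields[name] = value if isinstance(value, list) else [value]
    return fields


# Document 1 has a title and content, but only title is mapped with store: true.
stored = {"title": "blog title"}

hit_fields = simulate_stored_fields(stored, ["title", "content"])
print(hit_fields)  # {'title': ['blog title']}
```

The unstored content field simply drops out, which matches the result above.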

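When store is false (the default), a field's value is kept only inside the _source field.
When store is true, the value is also written to a separate stored field that sits alongside _source. If _source is enabled, the value exists in both places, so there are two copies.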

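Scenario: _source is very large.

Extracting a single field's value from _source is expensive, because the whole _source has to be loaded and parsed (source filtering can of course be used to reduce the network overhead). For example, suppose a document holds an entire book, with fields such as title, date, and content. If you only want the book's title and date and do not want to parse the whole (very large) _source, consider setting store to true on the title and date fields.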
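
The two options mentioned above differ only in the search request body. The sketch below builds both bodies as Python dicts and prints them as JSON. It assumes the blog_index mapping from earlier, so publish_date stands in for the book's date field, and the match_all query is just a placeholder.

```python
import json

# Option 1: source filtering. Only the listed fields are returned,
# but they are still extracted from _source.
source_filtering_request = {
    "_source": ["title", "publish_date"],
    "query": {"match_all": {}},
}

# Option 2: stored fields. The listed fields are read from their own
# stored copies, which requires "store": true in the mapping.
stored_fields_request = {
    "stored_fields": ["title", "publish_date"],
    "query": {"match_all": {}},
}

for body in (source_filtering_request, stored_fields_request):
    print(json.dumps(body, indent=2))
```

Either body can be sent to GET blog_index/_search; the first returns values under _source, the second under fields.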

posted on 2022-11-27 14:35 溪水静幽