Elastic Learning Journey (4): CRUD Operations on ES Documents

Hi everyone, I'm Edison.

Previous post: An Overview of Essential ES Concepts

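Introduction to ES Document CRUD

As with MongoDB, document CRUD is one of the first things to learn in ES. The figure below summarizes the CRUD operations on ES documents: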

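As the figure shows, ES has an Index operation in addition to the usual CRUD operations. It is similar to Create, with one difference:

Create - if the ID already exists, the operation fails; otherwise the document is added;
Index - if the ID already exists, the existing document is deleted and a new one is created, and the version number is incremented; otherwise the document is simply added.

In other words, Index behaves like an AddOrReplace.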
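The difference between the two operations can be sketched with a small in-memory model. This is only a toy under stated assumptions: `ToyIndex` and `VersionConflict` are hypothetical names invented for illustration, and nothing here talks to a real cluster.

```python
class VersionConflict(Exception):
    """Stands in for the error ES reports when Create hits an existing ID."""


class ToyIndex:
    """Hypothetical in-memory stand-in for one index: doc_id -> (version, source)."""

    def __init__(self):
        self._docs = {}

    def create(self, doc_id, source):
        # Create: fail if the ID is already taken.
        if doc_id in self._docs:
            raise VersionConflict(f"document {doc_id!r} already exists")
        self._docs[doc_id] = (1, dict(source))
        return {"_id": doc_id, "_version": 1, "result": "created"}

    def index(self, doc_id, source):
        # Index: add if missing, otherwise replace the whole document
        # and increment the version.
        if doc_id not in self._docs:
            self._docs[doc_id] = (1, dict(source))
            return {"_id": doc_id, "_version": 1, "result": "created"}
        version = self._docs[doc_id][0] + 1
        self._docs[doc_id] = (version, dict(source))
        return {"_id": doc_id, "_version": version, "result": "updated"}

    def get(self, doc_id):
        version, source = self._docs[doc_id]
        return {"_id": doc_id, "_version": version, "_source": dict(source)}


users = ToyIndex()
users.create("1", {"user": "Andy", "message": "Trying to use ElasticSearch"})
users.index("1", {"user": "Mike"})   # replaces the document, version becomes 2
print(users.get("1"))
```

Calling `users.create("1", ...)` a second time would raise `VersionConflict`, while `users.index("1", ...)` always succeeds.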

Creating a Document

A document can be created either with an auto-generated ID or with an ID you specify:

Auto-generated ID

Calling POST {index}/_doc makes the system generate the document ID automatically.

As the screenshot above shows, the system generated an ID.

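Specified ID

Calling PUT {index}/_create/1 creates the new document with ID 1. If that ID already exists, the operation fails.

In the screenshot above we passed ID=1 explicitly. What happens if we run the same statement again?

The second call fails with a version conflict (HTTP 409), because ES detects that a document with this ID already exists.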
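The two Create variants differ only in HTTP method and path. The sketch below builds (but does not send) the corresponding requests with Python's standard library. The base URL `http://localhost:9200`, the helper name `create_request` and the sample documents are assumptions for illustration.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local, unsecured test node


def create_request(index, doc, doc_id=None):
    """Build (but do not send) the HTTP request for a Create operation."""
    if doc_id is None:
        # POST {index}/_doc lets ES generate the ID.
        method, path = "POST", f"/{index}/_doc"
    else:
        # PUT {index}/_create/{id} fails if the ID already exists.
        method, path = "PUT", f"/{index}/_create/{doc_id}"
    return urllib.request.Request(
        ES_URL + path,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


auto = create_request("users", {"user": "Edison", "message": "Trying to use Kibana"})
fixed = create_request("users", {"user": "Andy"}, doc_id="1")
print(auto.get_method(), auto.full_url)
print(fixed.get_method(), fixed.full_url)
```

Passing either request to `urllib.request.urlopen` would actually send it. For the second form, a repeated call against an existing ID would surface as an `urllib.error.HTTPError`, since `urlopen` raises on 4xx responses.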

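Getting a Document

GET {index}/_doc/{id} fetches a single document by ID. If the document is not found, HTTP 404 is returned.

The actual content of the document is in the _source field of the response.

The response also contains the document's metadata:

_index / _type: the index the document belongs to and its type
_version: version information; for a given ID the version number keeps increasing, even if the document has been deleted
_source: by default contains the complete original content of the document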
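To make the split between metadata and content concrete, here is a small sketch that separates the two in a parsed Get response. The response body is a hypothetical sample shaped like the ES 7.x responses shown later in this post, and `split_get_response` is an illustrative helper, not part of any client library.

```python
import json

# Hypothetical Get response body, modelled on the mget output in this post.
raw = """
{
  "_index": "users", "_type": "_doc", "_id": "1",
  "_version": 1, "_seq_no": 5, "_primary_term": 1,
  "found": true,
  "_source": {
    "user": "Andy",
    "postDate": "2024-01-20T15:00:00",
    "message": "Trying to use ElasticSearch"
  }
}
"""


def split_get_response(body):
    """Return (metadata, source); source is None when the document was not found."""
    meta = {key: value for key, value in body.items() if key != "_source"}
    source = body.get("_source") if body.get("found") else None
    return meta, source


meta, source = split_get_response(json.loads(raw))
print(meta["_index"], meta["_id"], meta["_version"])
print(source["user"])
```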

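Indexing a Document

As mentioned earlier, Index differs from Create as follows:

If the document does not exist, the new document is indexed.
If the document already exists, the old document is deleted first, the new document is indexed, and the version number is incremented by 1.

That is why Index is better thought of as an “AddOrReplace”.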

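PUT {index}/_doc/{id} performs an Index operation. Building on the document we just fetched, we change the user name of the document with id=1. Because a document with id=1 already exists, the old document is deleted first and the new one is then indexed:

As the screenshot above shows, the version number went from 1 to 2 after the Index operation.

If we now query the document with id=1 again, we find that it has been overwritten by the new document and contains only the user field.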

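Updating a Document

Update performs a true partial update: it does not discard the existing document, and the fields you send are merged into it.

POST {index}/_update/{id} performs an Update; the fields to change are wrapped in a doc object in the request body.

Fetching the document again gives:

The new fields have been added to the document content, and the version has increased by one again.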
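The contrast between Index (replace) and Update (merge) can be sketched on plain dictionaries. This is only an illustration of the observable result, not of how ES implements it; `apply_index` and `apply_update` are hypothetical helpers and the field values are made up.

```python
import copy


def apply_index(existing, new_doc):
    """Index semantics: the new document replaces the old one entirely."""
    return copy.deepcopy(new_doc)


def apply_update(existing, partial_doc):
    """Update semantics: fields in `partial_doc` are merged into the existing document.

    Nested objects are merged recursively; any other value overwrites the old one.
    """
    merged = copy.deepcopy(existing)
    for key, value in partial_doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


current = {"user": "Mike"}
# The body of POST users/_update/1 wraps the changed fields in "doc".
update_body = {"doc": {"postDate": "2024-01-20T16:00:00", "message": "Trying to use ES"}}

print(apply_update(current, update_body["doc"]))  # keeps "user", adds the new fields
print(apply_index(current, update_body["doc"]))   # "user" is gone
```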

Deleting a Document

DELETE {index}/_doc/{id} deletes a document.

A result of deleted in the response means the deletion succeeded.

Querying the document again now reports that it cannot be found:

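Bulk Operations (Bulk API)

ES provides a Bulk API that performs operations of different types (Index, Create, Update, Delete) on different indices in a single API call, which effectively reduces the overhead of repeated network round trips.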

POST _bulk
{ "index": { "_index":"test", "_id":"1" } }
{ "field1": "value1" }
{ "delete": { "_index":"test", "_id":"2" } }
{ "create": { "_index":"test2", "_id":"3"  } }
{ "field1": "value3" }
{ "update": { "_index":"test", "_id":"1" } }
{ "doc": { "field2":"value2" } }
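A body like the one above is newline-delimited JSON: one action line, followed by a payload line for every action except delete. As a minimal sketch, it can be assembled from Python data like this; `build_bulk_body` is a hypothetical helper and the operations mirror the example request.

```python
import json


def build_bulk_body(operations):
    """Serialise (action, metadata, payload) triples into the body of POST _bulk.

    `payload` is None for delete, which has no second line.
    """
    lines = []
    for action, meta, payload in operations:
        lines.append(json.dumps({action: meta}))
        if payload is not None:
            lines.append(json.dumps(payload))
    # A bulk body ends with a trailing newline.
    return "\n".join(lines) + "\n"


ops = [
    ("index", {"_index": "test", "_id": "1"}, {"field1": "value1"}),
    ("delete", {"_index": "test", "_id": "2"}, None),
    ("create", {"_index": "test2", "_id": "3"}, {"field1": "value3"}),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field2": "value2"}}),
]
print(build_bulk_body(ops), end="")
```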

The response contains the result of each operation, in the same order as the request:

{
  "took" : 854,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "delete" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 1,
        "result" : "not_found",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 404
      }
    },
    {
      "create" : {
        "_index" : "test2",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "update" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 200
      }
    }
  ]
}

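Note that if a single operation in the batch fails, the other operations are not affected. Also, keep each bulk request reasonably small; very large payloads can cause performance problems.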
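Because one failed operation does not fail the whole call, the caller has to inspect the items individually. The sketch below collects the items that carry an `error` object and also shows one simple way to split a large list of actions into smaller batches. The sample response is hypothetical (the response shown above has no failures), and both helper names are made up for illustration.

```python
def failed_items(bulk_response):
    """Return (position, action, status, error) for every item that has an `error` object."""
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item has exactly one key: the action name.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((position, action, result["status"], result["error"]))
    return failures


def chunked(actions, size):
    """Yield successive slices of `actions` so that each bulk call stays small."""
    for start in range(0, len(actions), size):
        yield actions[start:start + size]


# Hypothetical response in which the create fails because the ID already exists.
sample = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201, "result": "created"}},
        {"delete": {"_id": "2", "status": 404, "result": "not_found"}},
        {"create": {"_id": "3", "status": 409,
                    "error": {"type": "version_conflict_engine_exception",
                              "reason": "document already exists"}}},
    ],
}
for position, action, status, error in failed_items(sample):
    print(position, action, status, error["type"])
```

The check is keyed on the `error` object rather than on the status code, because a delete of a missing document reports 404 with `"result": "not_found"` while `errors` stays `false`, as in the response above.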

Batch Reads (mget)

Like the Bulk API, mget reads several documents in a single call, which reduces network overhead and improves read performance.

It is called with GET /_mget:

GET /_mget
{
  "docs":[
    {
      "_index":"users",
      "_id":1
    },
    {
      "_index":"users",
      "_id":2
    }
  ]
}

The response contains one entry per requested document:

{
  "docs" : [
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 5,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "user" : "Andy",
        "postDate" : "2024-01-20T15:00:00",
        "message" : "Trying to use ElasticSearch"
      }
    },
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "_seq_no" : 6,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "user" : "Wings",
        "postDate" : "2024-01-20T15:00:00",
        "message" : "Trying to use EFK"
      }
    }
  ]
}
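As a rough sketch of working with mget from code, the helpers below build the request body from a list of IDs and turn a response into an ID-to-content mapping. Both function names are hypothetical, and the sample response is a trimmed copy of the one above plus a made-up missing document with ID 3.

```python
def build_mget_body(index, ids):
    """Build the body of GET /_mget for several IDs in one index."""
    return {"docs": [{"_index": index, "_id": str(doc_id)} for doc_id in ids]}


def sources_by_id(mget_response):
    """Map _id -> _source; documents that were not found map to None."""
    return {
        doc["_id"]: (doc.get("_source") if doc.get("found") else None)
        for doc in mget_response["docs"]
    }


sample = {
    "docs": [
        {"_index": "users", "_id": "1", "found": True,
         "_source": {"user": "Andy", "message": "Trying to use ElasticSearch"}},
        {"_index": "users", "_id": "2", "found": True,
         "_source": {"user": "Wings", "message": "Trying to use EFK"}},
        {"_index": "users", "_id": "3", "found": False},  # hypothetical miss
    ]
}
print(build_mget_body("users", [1, 2]))
for doc_id, source in sources_by_id(sample).items():
    print(doc_id, source["user"] if source else "not found")
```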

Batch Search (msearch)

Similarly, msearch runs several searches in a single call via POST {index}/_msearch:

POST users/_msearch
{}
{"query":{"match_all":{}},"size":3}
{"index":"movies"}
{"query":{"match_all":{}},"size":2}

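In the request above, the empty header {} makes the first search run against users, the index named in the URL, while the header {"index":"movies"} redirects the second search to movies. The result is 3 documents from users and 2 documents from movies: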

{
  "took" : 7,
  "responses" : [
    {
      "took" : 7,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 3,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "users",
            "_type" : "_doc",
            "_id" : "5-46K40BoVgALGyCI5vL",
            "_score" : 1.0,
            "_source" : {
              "user" : "Edison",
              "postDate" : "2024-01-20T14:00:00",
              "message" : "Trying to use Kibana"
            }
          },
          {
            "_index" : "users",
            "_type" : "_doc",
            "_id" : "1",
            "_score" : 1.0,
            "_source" : {
              "user" : "Andy",
              "postDate" : "2024-01-20T15:00:00",
              "message" : "Trying to use ElasticSearch"
            }
          },
          {
            "_index" : "users",
            "_type" : "_doc",
            "_id" : "2",
            "_score" : 1.0,
            "_source" : {
              "user" : "Wings",
              "postDate" : "2024-01-20T15:00:00",
              "message" : "Trying to use EFK"
            }
          }
        ]
      },
      "status" : 200
    },
    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 9743,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "movies",
            "_type" : "_doc",
            "_id" : "3687",
            "_score" : 1.0,
            "_source" : {
              "year" : 0,
              "@version" : "1",
              "title" : "Light Years",
              "id" : "3687",
              "genre" : [
                "Adventure",
                "Animation",
                "Fantasy",
                "Sci-Fi"
              ]
            }
          },
          {
            "_index" : "movies",
            "_type" : "_doc",
            "_id" : "3688",
            "_score" : 1.0,
            "_source" : {
              "year" : 1982,
              "@version" : "1",
              "title" : "Porky's",
              "id" : "3688",
              "genre" : [
                "Comedy"
              ]
            }
          }
        ]
      },
      "status" : 200
    }
  ]
}
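The msearch body follows the same newline-delimited layout as bulk: a header line followed by a search body line for each search. A minimal sketch, assuming hypothetical helpers `build_msearch_body` and `hit_counts` and a heavily trimmed copy of the response above:

```python
import json


def build_msearch_body(searches):
    """Serialise (header, search_body) pairs into the body of POST {index}/_msearch.

    An empty header means the search uses the index given in the URL.
    """
    lines = []
    for header, search_body in searches:
        lines.append(json.dumps(header))
        lines.append(json.dumps(search_body))
    return "\n".join(lines) + "\n"


def hit_counts(msearch_response):
    """Return (total matches, hits returned) for each search, in request order."""
    return [
        (resp["hits"]["total"]["value"], len(resp["hits"]["hits"]))
        for resp in msearch_response["responses"]
    ]


searches = [
    ({}, {"query": {"match_all": {}}, "size": 3}),
    ({"index": "movies"}, {"query": {"match_all": {}}, "size": 2}),
]
print(build_msearch_body(searches), end="")

# Trimmed stand-in for the response above: only the fields used by hit_counts.
trimmed = {
    "responses": [
        {"hits": {"total": {"value": 3}, "hits": [{}, {}, {}]}},
        {"hits": {"total": {"value": 9743}, "hits": [{}, {}]}},
    ]
}
print(hit_counts(trimmed))
```

The second pair of numbers shows that `total` reports every match in movies, while `size` limits how many hits are actually returned.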