Join datatype: modeling parent-child relationships
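Modeling with nested objects has a drawback: the nested objects are stored together with their parent in a single document, much like redundant (denormalized) data. Changing any nested object means re-indexing the whole document, so the maintenance cost is high.
Elasticsearch provides something similar to a join in a relational database. With the join datatype, two kinds of objects can be kept separate and linked through a parent/child relation.
- The parent document and the child document are two independent documents.
- Updating a parent document does not require re-indexing its child documents. Adding, changing, or deleting a child document affects neither the parent nor the other children.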
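# Create the index with a join field.
# In "relations", the key ("blog") is the parent and the value ("comment") is the child.
PUT my_blogs
{
  "mappings": {
    "properties": {
      "blog_comments_relation": {
        "type": "join",
        "relations": {
          "blog": "comment"
        }
      },
      "content": {
        "type": "text"
      },
      "title": {
        "type": "keyword"
      }
    }
  }
}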
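The direction of the "relations" entry is easy to get backwards, so here is a minimal Python sketch of how it reads. The helper name parent_of is hypothetical and not part of any Elasticsearch client; it only inspects a plain dict shaped like the mapping above. It also accepts a list of children per parent, which the join field allows.

```python
def parent_of(relations, name):
    """Return the parent relation name for `name`, or None if `name` is a root."""
    for parent, children in relations.items():
        # A parent may declare one child (a string) or several (a list).
        if isinstance(children, str):
            children = [children]
        if name in children:
            return parent
    return None


relations = {"blog": "comment"}  # same shape as "relations" in the mapping
print(parent_of(relations, "comment"))  # blog
print(parent_of(relations, "blog"))     # None
```

A document whose relation name has a parent (here "comment") is the kind that must carry a "parent" ID when it is indexed.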
Create the parent documents
# Index a parent document
PUT my_blogs/_doc/blog1
{
  "title": "Learning Elasticsearch",
  "content": "learning ELK @ geektime",
  "blog_comments_relation": {
    "name": "blog"
  }
}

# Index a second parent document
PUT my_blogs/_doc/blog2
{
  "title": "Learning Hadoop",
  "content": "learning Hadoop",
  "blog_comments_relation": {
    "name": "blog"
  }
}
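Index the child documents
A parent document and its child documents must live on the same shard. This is what keeps join queries efficient.
When indexing a child document, you must specify the ID of its parent, and use the routing parameter so that the child is allocated to the same shard as the parent.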
# Index a child document.
# routing=blog1 ensures the child is indexed on the same shard as its parent;
# "parent" holds the parent document ID.
PUT my_blogs/_doc/comment1?routing=blog1
{
  "comment": "I am learning ELK",
  "username": "Jack",
  "blog_comments_relation": {
    "name": "comment",
    "parent": "blog1"
  }
}

# Index a child document
PUT my_blogs/_doc/comment2?routing=blog2
{
  "comment": "I like Hadoop!!!!!",
  "username": "Jack",
  "blog_comments_relation": {
    "name": "comment",
    "parent": "blog2"
  }
}

# Index a child document
PUT my_blogs/_doc/comment3?routing=blog2
{
  "comment": "Hello Hadoop",
  "username": "Bob",
  "blog_comments_relation": {
    "name": "comment",
    "parent": "blog2"
  }
}
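The two things a child request must get right are the "parent" value in the join field and a routing value that matches the parent. Below is a minimal Python sketch that derives both from a single parent ID so they cannot drift apart. The function name child_index_request is hypothetical; it only builds the method, path, and body, and does not talk to a cluster.

```python
import json
from urllib.parse import quote, urlencode


def child_index_request(index, doc_id, parent_id, join_field, child_name, source):
    """Build (method, path, body) for indexing a child document."""
    body = dict(source)  # copy so the caller's dict is left untouched
    body[join_field] = {"name": child_name, "parent": parent_id}
    # Route by the parent ID so parent and child land on the same shard.
    path = f"/{quote(index)}/_doc/{quote(doc_id)}?" + urlencode({"routing": parent_id})
    return "PUT", path, body


method, path, body = child_index_request(
    "my_blogs", "comment1", "blog1",
    "blog_comments_relation", "comment",
    {"comment": "I am learning ELK", "username": "Jack"},
)
print(method, path)
print(json.dumps(body, indent=2))
```

The sketch assumes a two-level relation, as in this article, where the parent is routed by its own ID.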
Queries
# Fetch a parent document by ID
GET /my_blogs/_doc/blog2

# parent_id query: returns the child documents of the given parent
POST /my_blogs/_search
{
  "query": {
    "parent_id": {
      "type": "comment",
      "id": "blog2"
    }
  }
}
--------------------- Result ---------------------
{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.5389965,
    "hits" : [
      {
        "_index" : "my_blogs",
        "_type" : "_doc",
        "_id" : "comment2",
        "_score" : 0.5389965,
        "_routing" : "blog2",
        "_source" : {
          "comment" : "I like Hadoop!!!!!",
          "username" : "Jack",
          "blog_comments_relation" : {
            "name" : "comment",
            "parent" : "blog2"
          }
        }
      },
      {
        "_index" : "my_blogs",
        "_type" : "_doc",
        "_id" : "comment3",
        "_score" : 0.5389965,
        "_routing" : "blog2",
        "_source" : {
          "comment" : "Hello Hadoop",
          "username" : "Bob",
          "blog_comments_relation" : {
            "name" : "comment",
            "parent" : "blog2"
          }
        }
      }
    ]
  }
}
------------ Result (both parent documents are returned) ------------
{
  "took" : 70,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_blogs",
        "_type" : "_doc",
        "_id" : "blog1",
        "_score" : 1.0,
        "_source" : {
          "title" : "Learning Elasticsearch",
          "content" : "learning ELK @ geektime",
          "blog_comments_relation" : {
            "name" : "blog"
          }
        }
      },
      {
        "_index" : "my_blogs",
        "_type" : "_doc",
        "_id" : "blog2",
        "_score" : 1.0,
        "_source" : {
          "title" : "Learning Hadoop",
          "content" : "learning Hadoop",
          "blog_comments_relation" : {
            "name" : "blog"
          }
        }
      }
    ]
  }
}
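The second result above lists parent documents, which is what a has_child query returns: parents that have at least one matching child. The request for it does not appear in this article, so the following Python sketch is only a guess at its shape. The helper name has_child_query and the inner match on username "Jack" are assumptions chosen because Jack commented on both blogs in the sample data.

```python
import json


def has_child_query(child_type, inner_query):
    """Build a search body that returns parents with a matching child."""
    return {
        "query": {
            "has_child": {
                "type": child_type,    # relation name of the child
                "query": inner_query,  # condition evaluated on the children
            }
        }
    }


# Assumed inner query: comments written by Jack.
body = has_child_query("comment", {"match": {"username": "Jack"}})
print("POST /my_blogs/_search")
print(json.dumps(body, indent=2))
```

Note that has_child names the child relation with "type", whereas has_parent uses "parent_type".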
# has_parent query: returns the child documents whose parent matches the inner query
POST my_blogs/_search
{
  "query": {
    "has_parent": {
      "parent_type": "blog",
      "query": {
        "match": {
          "title": "Learning Hadoop"
        }
      }
    }
  }
}
------------------- Result -------------------
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_blogs",
        "_type" : "_doc",
        "_id" : "comment2",
        "_score" : 1.0,
        "_routing" : "blog2",
        "_source" : {
          "comment" : "I like Hadoop!!!!!",
          "username" : "Jack",
          "blog_comments_relation" : {
            "name" : "comment",
            "parent" : "blog2"
          }
        }
      },
      {
        "_index" : "my_blogs",
        "_type" : "_doc",
        "_id" : "comment3",
        "_score" : 1.0,
        "_routing" : "blog2",
        "_source" : {
          "comment" : "Hello Hadoop",
          "username" : "Bob",
          "blog_comments_relation" : {
            "name" : "comment",
            "parent" : "blog2"
          }
        }
      }
    ]
  }
}
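To make the direction of the join queries concrete, here is a small in-memory Python model of the sample documents. It is not how Elasticsearch evaluates these queries; the functions has_parent and has_child are hypothetical stand-ins that mimic only which documents come back, using exact equality in place of the match query.

```python
JOIN = "blog_comments_relation"

# Sample documents from this article, keyed by _id (unused fields omitted).
docs = {
    "blog1": {"title": "Learning Elasticsearch", JOIN: {"name": "blog"}},
    "blog2": {"title": "Learning Hadoop", JOIN: {"name": "blog"}},
    "comment1": {"username": "Jack", JOIN: {"name": "comment", "parent": "blog1"}},
    "comment2": {"username": "Jack", JOIN: {"name": "comment", "parent": "blog2"}},
    "comment3": {"username": "Bob", JOIN: {"name": "comment", "parent": "blog2"}},
}


def has_parent(parent_type, predicate):
    """IDs of children whose parent has relation parent_type and satisfies predicate."""
    return sorted(
        doc_id
        for doc_id, doc in docs.items()
        if "parent" in doc[JOIN]
        and docs[doc[JOIN]["parent"]][JOIN]["name"] == parent_type
        and predicate(docs[doc[JOIN]["parent"]])
    )


def has_child(child_type, predicate):
    """IDs of parents with at least one child of child_type satisfying predicate."""
    return sorted({
        doc[JOIN]["parent"]
        for doc in docs.values()
        if doc[JOIN]["name"] == child_type and "parent" in doc[JOIN] and predicate(doc)
    })


children = has_parent("blog", lambda d: d["title"] == "Learning Hadoop")
parents = has_child("comment", lambda d: d["username"] == "Jack")
print(children)  # ['comment2', 'comment3']
print(parents)   # ['blog1', 'blog2']
```

The first list matches the hits of the has_parent result above: the condition is checked on the parent, but the children are what the query returns.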
# Access a child document by ID together with its routing value
GET my_blogs/_doc/comment3?routing=blog2
-------------------- Result --------------------
{
  "_index" : "my_blogs",
  "_type" : "_doc",
  "_id" : "comment3",
  "_version" : 1,
  "_seq_no" : 4,
  "_primary_term" : 1,
  "_routing" : "blog2",
  "found" : true,
  "_source" : {
    "comment" : "Hello Hadoop",
    "username" : "Bob",
    "blog_comments_relation" : {
      "name" : "comment",
      "parent" : "blog2"
    }
  }
}
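Join type constraints
- Only one join field mapping is allowed per index.
- Parent and child documents must be indexed on the same shard. This means the same routing value has to be provided when deleting, updating, or getting a child document.
- A document can have multiple child documents, but only one parent document.
- New relations can be added to an existing join field.
- A child can be added to an existing element, but only if that element is already a parent.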
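The same-shard constraint follows from how a document is assigned to a shard: in simplified form, the shard number is a hash of the routing value modulo the number of primary shards, and the routing value defaults to the document's own ID. The Python sketch below illustrates that idea only. CRC32 is a stand-in chosen because it is in the standard library (Elasticsearch uses Murmur3 internally), and the shard count of 5 is an arbitrary assumption.

```python
import zlib


def shard_for(routing, num_primary_shards):
    """Simplified shard choice: hash(routing) % number of primary shards."""
    # Stand-in hash for illustration; Elasticsearch does not use CRC32 here.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


num_shards = 5  # hypothetical index setting

parent_shard = shard_for("blog2", num_shards)        # parent routed by its own _id
child_routed = shard_for("blog2", num_shards)        # child sent with ?routing=blog2
child_unrouted = shard_for("comment3", num_shards)   # child routed by its own _id

print(parent_shard == child_routed)  # True
print(parent_shard, child_unrouted)  # may or may not be the same shard
```

Because the result depends only on the routing value, a child sent with the parent's ID as routing always lands with its parent, while a child routed by its own ID lands there only by chance. The same reasoning explains why get, update, and delete requests for a child need that routing value too.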
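Retrieve all documents
# Return every document in the index, parents and children alike, sorted by _id
GET my_blogs/_search
{
  "query": {
    "match_all": {}
  },
  "sort": ["_id"]
}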