Implementing Paginated Queries in Elasticsearch

1. Set up the mapping

PUT /t_order
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }, 
   "mappings" : {
      "properties" : {
        "cancel_reason" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "cancel_time" : {
          "type" : "date"
        },
        "create_time" : {
          "type" : "date"
        },
        "create_user" : {
          "type" : "long"
        },
        "delivery_type" : {
          "type" : "byte"
        },
        "discount_amount" : {
          "type" : "integer"
        },
        "expired_time" : {
          "type" : "date"
        },
        "id" : {
          "type" : "long"
        },
        "is_deleted" : {
          "type" : "byte"
        },
        "is_pay" : {
          "type" : "byte"
        },
        "is_postsale" : {
          "type" : "byte"
        },
        "order_amount" : {
          "type" : "integer"
        },
        "order_code" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "order_remark" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "order_status" : {
          "type" : "byte"
        },
        "pay_amount" : {
          "type" : "integer"
        },
        "pay_time" : {
          "type" : "date"
        },
        "pay_type" : {
          "type" : "byte"
        },
        "postage" : {
          "type" : "integer"
        },
        "product_amount" : {
          "type" : "integer"
        },
        "serial_code" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "shop_id" : {
          "type" : "long"
        },
        "update_time" : {
          "type" : "date"
        },
        "update_user" : {
          "type" : "long"
        },
        "user_id" : {
          "type" : "long"
        }
      }
   }
}
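
The numeric types chosen in the mapping above (byte, integer, long) have fixed signed ranges, so it can be worth checking sample values before indexing them. The following is a minimal Python sketch under stated assumptions: the helper name `out_of_range_fields` and the trimmed `properties` dict are hypothetical and exist only for illustration; the ranges are the signed 8-, 32- and 64-bit integer ranges.

```python
# Signed integer ranges of the Elasticsearch numeric types used in the mapping.
RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "integer": (-2**31, 2**31 - 1),
    "long": (-2**63, 2**63 - 1),
}

def out_of_range_fields(properties, doc):
    """Return the names of fields in doc whose value does not fit the mapped type."""
    bad = []
    for name, value in doc.items():
        field_type = properties.get(name, {}).get("type")
        if field_type in RANGES and isinstance(value, int) and not isinstance(value, bool):
            low, high = RANGES[field_type]
            if not low <= value <= high:
                bad.append(name)
    return bad

# Trimmed-down copy of the t_order mapping (illustrative subset).
properties = {
    "id": {"type": "long"},
    "order_status": {"type": "byte"},
    "order_amount": {"type": "integer"},
}

# Illustrative document: id is one past the largest value a long can hold.
doc = {"id": 2**63, "order_status": 0, "order_amount": 1}
print(out_of_range_fields(properties, doc))  # ['id']
```

A check like this only covers integer ranges; date formats and text fields are left to Elasticsearch itself.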

POST /t_order/_search
{
  "query": {
    "match_all": {}
  }
}

2. Add test data

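# The _bulk API expects newline-delimited JSON: an action line, then the whole document on a single line.
# Note: the id and user_id values below have 21 digits and exceed the range of the long type (maximum 9223372036854775807), so indexing is expected to fail as written; substitute ids that fit in a long, or map these fields as keyword.
POST /t_order/_bulk
{"index":{}}
{"id":202208780570360889300,"order_code":"20222379790329301675","order_amount":1,"pay_amount":1,"discount_amount":0,"product_amount":1,"order_status":0,"is_deleted":0,"user_id":202208761977967681500,"shop_id":117979,"expired_time":1648558094000,"postage":0,"cancel_time":1648556595000,"cancel_reason":"Order automatically cancelled by the system: payment overdue","order_remark":"","delivery_type":2,"pay_time":null,"pay_type":1,"is_pay":0,"is_postsale":0,"create_time":1648556294000,"create_user":1508761977967681500,"update_time":1648556595000,"update_user":999999}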
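
Because a _bulk body is newline-delimited JSON rather than one pretty-printed object, it is usually generated by a script instead of typed by hand. The sketch below shows one way to do that in Python; the function name `to_bulk_body` is hypothetical, and the sample documents are abridged, illustrative stand-ins for real order records.

```python
import json

def to_bulk_body(docs):
    """Build an NDJSON body for POST /t_order/_bulk from a list of dicts."""
    lines = []
    for doc in docs:
        # Action line: index the document and let Elasticsearch generate the _id.
        lines.append(json.dumps({"index": {}}))
        # Source line: the whole document must sit on one line.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body has to end with a newline character.
    return "\n".join(lines) + "\n"

# Abridged sample documents (illustrative values, not real orders).
docs = [
    {"order_code": "20222379790329301675", "order_status": 0, "create_time": 1648556294000},
    {"order_code": "20222379790329301676", "order_status": 0, "create_time": 1648556295000},
]
body = to_bulk_body(docs)
print(body, end="")
```

Each document contributes exactly two lines, so two documents produce a four-line body.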

3. Demo (show me the code):

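### from + size [from + size is not recommended for deep paging]
# The from and size parameters together define which slice of the results a page shows.
# from: defaults to 0 (not 1) when unspecified; it is the offset of the first hit returned for the current page.
# size: defaults to 10 when unspecified; it is the number of hits returned for the current page.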

POST /t_order/_search
{
  "from": 0,
  "size": 20, 
  "query": {
    "match_all": {}
  }
}
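
An application normally works with page numbers rather than offsets, so the from value has to be derived, and the request has to stay inside index.max_result_window (10,000 by default). A minimal Python sketch of that arithmetic follows; `page_request` is a hypothetical helper name, and the 1-based page convention is an assumption of this example.

```python
MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window

def page_request(page, size=10, max_result_window=MAX_RESULT_WINDOW):
    """Build a from + size search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    offset = (page - 1) * size  # from is 0-based
    if offset + size > max_result_window:
        raise ValueError(
            f"from + size = {offset + size} exceeds max_result_window "
            f"({max_result_window}); use search_after instead"
        )
    return {"from": offset, "size": size, "query": {"match_all": {}}}

print(page_request(1, 20))    # first page, from = 0
print(page_request(500, 20))  # last page that still fits: from = 9980
```

With 20 hits per page, page 501 would need from + size = 10,020 and is rejected by the sketch, mirroring the limit Elasticsearch enforces.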

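### search_after [The official documentation stresses that the scroll API is no longer recommended for deep pagination. To page through more than the top 10,000 results, PIT + search_after is recommended.]
# part 1: create a PIT (point in time). This is a prerequisite and cannot be skipped.
POST /t_order/_pit?keep_alive=5m

# part 2: build the base query; the sort that drives paging is set here. The pit id is the one returned by part 1.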

POST /_search
{
  "size": 20, 
  "track_total_hits": true,
  "query": {
    "match_all": {}
  },
  "pit": {
    "id": "l9G1AwEQZHMtdHJhZGVfdF9vcmRlchY2M3VFTm9uZ1RrT1ltbWx5RDZvQllnABZaeDFQbHhSMVJGNktBZm5kakxqYTZBAAAAAAAAIhFCFml1Uy1Kb21pU2Zxdlc4OHhfWE1aSkEAARY2M3VFTm9uZ1RrT1ltbWx5RDZvQllnAAA="
  },
  "sort": [
    {
      "create_time": {
        "order": "desc"
      }
    }
  ]
}

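# part 3: subsequent pages: every later page passes the sort values of the last document on the previous page in search_after. With a PIT, each hit's sort array also carries an implicit _shard_doc tiebreaker, which is why two values appear below.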

POST /_search
{
  "size": 20, 
  "track_total_hits": true,
  "query": {
    "match_all": {}
  },
  "pit": {
    "id": "l9G1AwEQZHMtdHJhZGVfdF9vcmRlchY2M3VFTm9uZ1RrT1ltbWx5RDZvQllnABZaeDFQbHhSMVJGNktBZm5kakxqYTZBAAAAAAAAIhFCFml1Uy1Kb21pU2Zxdlc4OHhfWE1aSkEAARY2M3VFTm9uZ1RrT1ltbWx5RDZvQllnAAA="
  },
  "sort": [
    {
      "create_time": {
        "order": "desc"
      }
    }
  ],
  "search_after": [
    1648557674000,
    7
  ]
}
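
The step from one page to the next is mechanical: copy the previous request, set search_after to the sort array of the last hit, and carry over the most recent PIT id from the response. The Python sketch below illustrates this on plain dicts without contacting a cluster; `next_page_body` is a hypothetical helper, and the PIT ids and the response are hand-written placeholders in the shape Elasticsearch returns.

```python
import copy

def next_page_body(prev_body, response):
    """Return the request body for the next page, or None when there are no more hits."""
    hits = response["hits"]["hits"]
    if not hits:
        return None
    body = copy.deepcopy(prev_body)
    # Continue after the sort values of the last hit on the previous page.
    body["search_after"] = hits[-1]["sort"]
    # The response may carry an updated PIT id; use the most recent one.
    if "pit_id" in response:
        body["pit"]["id"] = response["pit_id"]
    return body

first_body = {
    "size": 2,
    "query": {"match_all": {}},
    "pit": {"id": "pit-id-1"},  # placeholder, not a real PIT id
    "sort": [{"create_time": {"order": "desc"}}],
}
# Hand-written response (illustrative values only).
response = {
    "pit_id": "pit-id-2",
    "hits": {"hits": [
        {"_id": "a", "sort": [1648557680000, 6]},
        {"_id": "b", "sort": [1648557674000, 7]},
    ]},
}
second_body = next_page_body(first_body, response)
print(second_body["search_after"], second_body["pit"]["id"])
```

Working on a deep copy keeps the first request unchanged, which makes it easy to restart paging from the top.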

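### scroll [for traversing the full result set or very large volumes of data, not for paginated queries.]
# part 1: run the search and set how long the scroll context is kept alive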

POST /t_order/_search?scroll=3m
{
  "size": 20,
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "create_time": {
        "order": "desc"
      }
    }
  ]
}

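# part 2: fetch the next batch with the scroll_id returned by the previous response, renewing the scroll keep-alive; repeat until no hits come back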

POST /_search/scroll
{
  "scroll":"3m",
  "scroll_id":"FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFml1Uy1Kb21pU2Zxdlc4OHhfWE1aSkEAAAAAACIXbhZaeDFQbHhSMVJGNktBZm5kakxqYTZB"
}
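
The two scroll requests above are typically wrapped in a loop that keeps fetching until a batch comes back empty and then releases the scroll context. The following Python sketch shows that loop; `drain_scroll` and the three callables it takes are hypothetical names, and the in-memory `fake_start` / `fake_next` functions stand in for the real HTTP calls so the example runs without a cluster.

```python
def drain_scroll(start, fetch_next, clear):
    """Yield every hit of a scroll, then release the scroll context.

    start()               -> first response (POST /t_order/_search?scroll=3m)
    fetch_next(scroll_id) -> next response  (POST /_search/scroll)
    clear(scroll_id)      -> frees the context (DELETE /_search/scroll)
    """
    response = start()
    scroll_id = response["_scroll_id"]
    try:
        while response["hits"]["hits"]:
            yield from response["hits"]["hits"]
            response = fetch_next(scroll_id)
            # The scroll id can change between calls; keep the latest one.
            scroll_id = response["_scroll_id"]
    finally:
        clear(scroll_id)

# In-memory stand-in for the cluster (illustrative data).
batches = [[{"_id": "1"}, {"_id": "2"}], [{"_id": "3"}], []]
cleared = []

def fake_start():
    return {"_scroll_id": "s0", "hits": {"hits": batches[0]}}

def fake_next(scroll_id):
    index = int(scroll_id[1:]) + 1
    return {"_scroll_id": f"s{index}", "hits": {"hits": batches[index]}}

ids = [hit["_id"] for hit in drain_scroll(fake_start, fake_next, cleared.append)]
print(ids, cleared)
```

Clearing the context in a finally block frees it as soon as the traversal ends instead of waiting for the keep-alive to run out, which matters given the cap on open scroll contexts.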

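Summary:

from + size: for random jumps between pages (as in mainstream search engines) and for paging within the top 10,000 results.

search_after: for forward-only paging, and for paging beyond the top 10,000 results.

scroll: for traversing the full data set, not for paging (used for paging, it is capped at 500 open scroll contexts by default).

max_result_window: raising it treats the symptom rather than the cause; setting it very high is not recommended.

PIT: in essence a view of the index as it was when the PIT was opened.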
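
The rules of thumb in the summary can be written down as a small decision function. This is only a sketch of the guidance above, not an official rule; `choose_pagination` and its parameters are hypothetical names introduced for illustration.

```python
def choose_pagination(random_access, deepest_hit, full_export=False,
                      max_result_window=10_000):
    """Pick a paging approach following the rules of thumb in the summary."""
    if full_export:
        return "scroll"            # walk through every document once
    if random_access and deepest_hit <= max_result_window:
        return "from + size"       # jump to arbitrary pages within the window
    # Forward-only paging, or results beyond the window. Random jumps beyond
    # the window are not covered by any of the three and also land here.
    return "PIT + search_after"

print(choose_pagination(random_access=True, deepest_hit=200))
print(choose_pagination(random_access=False, deepest_hit=50_000))
print(choose_pagination(random_access=False, deepest_hit=0, full_export=True))
```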

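In addition, some observations from practical experience:
1. search_after with a PIT does not suit product-list paging (such as the product lists on JD.com or Tmall): a user opens a product's detail page from the list, lingers there for a long time, then returns to the list and keeps paging; by then keep_alive has expired and paging fails with an error.
2. scroll has the same problem. It also runs into the limit on open scroll contexts, so it does not suit paging either.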
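
To make the expiry problem concrete, a client can estimate whether a PIT has outlived its keep_alive before it tries to fetch the next page. The Python sketch below does this arithmetic; `keep_alive_seconds` and `pit_probably_expired` are hypothetical helpers, only the s, m, h and d units are handled, and the estimate assumes that later searches did not extend the keep-alive. The cluster remains the source of truth.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def keep_alive_seconds(value):
    """Convert a keep_alive string such as "5m" into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported keep_alive value: {value!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]

def pit_probably_expired(opened_at, keep_alive, now):
    """Client-side estimate; timestamps are in seconds since the epoch."""
    return now - opened_at > keep_alive_seconds(keep_alive)

opened_at = 1_000_000  # illustrative timestamp
print(pit_probably_expired(opened_at, "5m", opened_at + 120))  # back after 2 minutes
print(pit_probably_expired(opened_at, "5m", opened_at + 900))  # back after 15 minutes
```

When the estimate says the PIT is gone, an application could open a new PIT and restart from the first page rather than surface the error to the user.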

To prevent against issues caused by having too many scrolls open, the user is not allowed to open scrolls past a certain limit. By default, 
the maximum number of open scrolls is 500. This limit can be updated with the search.max_open_scroll_context cluster setting.

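References

Copyright notice: this is an original article by CSDN blogger 「铭毅天下」, released under the CC 4.0 BY-SA license. When reposting, please include the link to the original article and this notice.
Original link: https://blog.csdn.net/laoyang360/article/details/116472697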

posted @ 2022-04-01 14:05 下午喝什么茶
