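Elasticsearch Queries

Two Ways to Query Elasticsearch

1. Query string search: a simple search in which the query is passed the same way as URL parameters. It is known as simple search or query string search.

2. Query DSL search: the query is written in the Query DSL, a rich and flexible query language provided by Elasticsearch. The query takes the form of a JSON request body and is sent to Elasticsearch through RESTful requests.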
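
To make the contrast concrete, here is a minimal Python sketch that builds the same search both ways: once as a URL with a q parameter and once as a JSON body. The helper names query_string_url and dsl_body are made up for this illustration, and the two forms are only roughly equivalent, since the query string syntax has its own parsing rules.

```python
import json
from urllib.parse import urlencode


def query_string_url(index, field, value):
    """Build the path of a query string search, e.g. yang_night/_search?q=from:gu."""
    # safe=":" keeps the colon between field and value readable in the URL.
    return f"{index}/_search?" + urlencode({"q": f"{field}:{value}"}, safe=":")


def dsl_body(field, value):
    """Build the JSON request body of a comparable Query DSL search."""
    return json.dumps({"query": {"match": {field: value}}})


print(query_string_url("yang_night", "from", "gu"))
print(dsl_body("from", "gu"))
```

The first form fits in a browser address bar; the second is sent as the body of a request to the _search endpoint and is the one that scales to complex conditions.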

Preparing the Data

PUT yang_night/_doc/1
{
  "name":"顾老二",
  "age":30,
  "from": "gu",
  "desc": "皮肤黑、武器长、性格直",
  "tags": ["黑", "长", "直"]
}

PUT yang_night/_doc/2
{
  "name":"大娘子",
  "age":18,
  "from":"sheng",
  "desc":"肤白貌美，娇憨可爱",
  "tags":["白", "富","美"]
}

PUT yang_night/_doc/3
{
  "name":"龙套偏房",
  "age":22,
  "from":"gu",
  "desc":"mmp，没怎么看，不知道怎么形容",
  "tags":["造数据", "真","难"]
}


PUT yang_night/_doc/4
{
  "name":"石头",
  "age":29,
  "from":"gu",
  "desc":"粗中有细，狐假虎威",
  "tags":["粗", "大","猛"]
}

PUT yang_night/_doc/5
{
  "name":"魏行首",
  "age":25,
  "from":"广云台",
  "desc":"仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp，最后竟然没有嫁给顾老二！",
  "tags":["闭月","羞花"]
}

Query String Search

# Use GET with the _search endpoint to find everyone whose from field is gu.
GET yang_night/_search?q=from:gu

# Use GET with the _search endpoint to find everyone whose age field is 30.
GET yang_night/_search?q=age:30

Structured Queries (Query DSL)

# The query is built up step by step: the condition goes inside match, and every document whose from field matches gu is returned.
GET yang_night/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  }
}

# Same approach for age. On a numeric field, match behaves as an exact comparison, so only documents whose age is 30 are returned.
GET yang_night/_search
{
  "query": {
    "match": {
      "age": 30
    }
  }
}

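term, match, and terms Queries

Difference Between term and match

match: analyzes the search keywords (splits them into tokens) and searches by those tokens

term: does not analyze the search keyword; it looks the keyword up as-is, as an exact match

"""
1. Analyzed (full-text) search is the approach used most often
2. The granularity of tokenization is determined by the analyzer
"""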
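
The difference is easier to see on a toy inverted index. The sketch below is only a model, not how Elasticsearch is implemented: analyze is a stand-in analyzer that lowercases and splits on whitespace, and term_query and match_query are hypothetical helpers named after the two query types.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (real analyzers do more)."""
    return text.lower().split()


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, value):
    """term: look the value up as-is, with no analysis."""
    return sorted(index.get(value, set()))


def match_query(index, text):
    """match: analyze the text, then return documents containing any token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return sorted(hits)


docs = {1: "Sun and Beach", 2: "I like basking on the beach"}
index = build_index(docs)

print(match_query(index, "Sun Beach"))  # analyzed into "sun" and "beach"
print(term_query(index, "Sun Beach"))   # no such single token in the index
print(term_query(index, "beach"))       # exact token lookup
```

Because the index stores tokens produced at indexing time, a term lookup only succeeds when the search value is itself one of those tokens.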

term Query

GET yang_night/_search
{
  "query": {
    "term": {
      "age": 18
    }
  }
}

terms Query

# Two values: a document is returned if it matches either one
GET yang_night/_search
{
  "query": {
    "terms": {
      "age": [30,18]
    }
  }
}

match Query

# The condition goes inside match. The text 武器很长 is analyzed into tokens, and every document whose desc field contains at least one of those tokens is returned.
GET yang_night/_search
{
  "query": {
    "match": {
      "desc": "武器很长"
    }
  }
}

match_all: Match All Documents

# match_all takes an empty object: there is no condition, so every document is returned
GET yang_night/_search
{
  "query": {
    "match_all": {
    }
  }
}

match_phrase: Phrase Query

# Add data
PUT t1/_doc/1
{
  "title": "中国是世界上人口最多的国家"
}
PUT t1/_doc/2
{
  "title": "美国是世界上军事实力最强大的国家"
}
PUT t1/_doc/3
{
  "title": "北京是中国的首都"
}

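"""
Find the documents that contain 中国
"""

# match is an analyzed query. With the default standard analyzer, 中国 is split into the single characters 中 and 国, and a document matches if it contains either one, so all three documents are returned
GET t1/_search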
{
  "query": {
    "match": {
      "title": "中国"
    }
  }
}

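# match_phrase searches the documents for the given phrase, and 中国 is exactly such a phrase. slop is the allowed gap between the terms of the phrase; it defaults to 0 and is set to 4 here
GET t1/_search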
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "中国",
        "slop": 4
      }
    }
  }
}

match_phrase_prefix: Leftmost Prefix Query

# Add data
PUT t3/_doc/1
{
  "title": "maggie",
  "desc": "beautiful girl you are beautiful so"
}
PUT t3/_doc/2
{
  "title": "sun and beach",
  "desc": "I like basking on the beach"
}

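# Find the documents containing beautiful. The input is not a complete word, so neither match nor match_phrase fits well; that is what match_phrase_prefix is for
"""
A prefix query is similar to a phrase query, but it goes one step further: the last term of the phrase is matched as a prefix (for example, a search for you are bea).
It is best to set max_expansions to cap the number of terms the prefix can expand to, because the expansion can produce a very large set and, left unlimited, hurts query performance.

"""
GET t3/_search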
{
  "query": {
    "match_phrase_prefix": {
      "desc": "bea"
    }
  }
}

# With max_expansions
GET t3/_search
{
  "query": {
    "match_phrase_prefix": {
      "desc": {
        "query": "bea",
        "max_expansions": 1
      }
      
    }
  }
}

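"""
If you test with max_expansions added, you may find that it does not return just one document as you might expect, but several.
max_expansions is not an edit distance: it caps how many indexed terms the last word of the query (the prefix) may be expanded into.

Edit (Levenshtein) distance is a separate concept, used by fuzzy queries, which accept a max_expansions parameter as well. It is a string metric that measures how different two strings are: the minimum number of single-character edits (substitutions, insertions, deletions) needed to turn one string into the other. The Russian scientist Vladimir Levenshtein introduced the concept in 1965.

A passage from the official Elasticsearch site describes the parameter for fuzzy queries: the max_expansions setting defines the maximum number of terms the fuzzy query will match before it stops searching, and it can have a significant effect on the performance of fuzzy queries. Cutting down the query terms has a downside, however, because terminating the query early may miss some valid results. It is important to understand that the max_expansions limit works at the shard level, which means that even when it is set to 1, multiple terms may match, each coming from a different shard. This behavior can make it look as if max_expansions has no effect, so be aware that counting the unique terms returned is not a valid way to determine whether max_expansions is working.

That passage is dense, but all we need to know is that the parameter works at the shard level, that is, in the Lucene layer, which is beyond the scope of this article.
The practical takeaway: prefix queries are expensive, and this parameter is how you limit the expansion.
"""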
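
The shard-level behavior can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the two documents of t3 are assumed to sit on different shards, terms are assumed to be visited in sorted order, and expand_prefix is a hypothetical helper, not an Elasticsearch API.

```python
def expand_prefix(shard_terms, prefix, max_expansions):
    """Return at most max_expansions terms of one shard that start with prefix."""
    matches = sorted(t for t in shard_terms if t.startswith(prefix))
    return matches[:max_expansions]


# Hypothetical layout: each shard holds the desc terms of one document.
shards = {
    "shard_0": {"beautiful", "girl", "you", "are", "so"},
    "shard_1": {"i", "like", "basking", "on", "the", "beach"},
}

# The limit is applied per shard, so each shard contributes its own term.
expanded = {name: expand_prefix(terms, "bea", 1) for name, terms in shards.items()}
print(expanded)

# On a single shard holding every term, the same limit keeps only one term.
all_terms = set().union(*shards.values())
print(expand_prefix(all_terms, "bea", 1))
```

With two shards the prefix expands to one term on each, so both documents can come back even though max_expansions is 1; with a single shard only one term survives.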

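Multi-Condition Queries

# Multi-condition query: a single match clause takes only one field, so several clauses are combined with bool

# Find members of the gu family who are 30 years old
GET yang_night/_search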
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "from": "gu"
          }
        },
        {
          "match": {
            "age": "30"
          }
        }
      ]
    }
  }
}

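Sorting

 """
 1. desc is descending, asc is ascending
 2. Not every data type can be sorted: numeric and date fields work out of the box (as do keyword fields), whereas analyzed text fields cannot be sorted by default
 """

# Query members of the gu family

1. desc: descending
GET yang_night/_search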
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

2. asc: ascending
GET yang_night/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}

Pagination

"""
"from": 2, the offset of the first hit to return (counting from 0)
"size": 1  the number of hits to return
"""
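
Since this section lists the two parameters without a request, here is a small Python sketch that turns a page number into a search body. page_body is a hypothetical helper, and the 1-based page numbering is an assumption of this example, not something Elasticsearch prescribes.

```python
import json


def page_body(page, size, query=None):
    """Build a search body for a 1-based page number using from/size."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": query or {"match_all": {}},
        "from": (page - 1) * size,  # offset of the first hit, counting from 0
        "size": size,               # number of hits on the page
    }


body = page_body(page=3, size=1)
print(json.dumps(body))
```

Page 3 with one hit per page yields "from": 2 and "size": 1, the values shown in the note above. The resulting JSON can be sent as the body of GET yang_night/_search.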

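Boolean Queries

"""
must (and): AND condition
should (or): OR condition
must_not (not): negation
filter: filters the results by a condition, without affecting the relevance score
"""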
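
As a rough mental model, the four clauses can be mimicked in plain Python over the sample people. This sketch ignores relevance scoring entirely, and bool_filter together with its predicate arguments is a hypothetical helper written for this illustration.

```python
people = [
    {"name": "顾老二", "age": 30, "from": "gu"},
    {"name": "大娘子", "age": 18, "from": "sheng"},
    {"name": "龙套偏房", "age": 22, "from": "gu"},
    {"name": "石头", "age": 29, "from": "gu"},
    {"name": "魏行首", "age": 25, "from": "广云台"},
]


def bool_filter(docs, must=(), should=(), must_not=(), filter=()):
    """Toy bool query: every clause is a predicate that takes a document."""
    hits = []
    for doc in docs:
        # must and filter clauses all have to match (filter just skips scoring).
        if not all(clause(doc) for clause in (*must, *filter)):
            continue
        # must_not excludes a document as soon as one clause matches.
        if any(clause(doc) for clause in must_not):
            continue
        # Without must or filter, at least one should clause has to match.
        if should and not must and not filter:
            if not any(clause(doc) for clause in should):
                continue
        hits.append(doc["name"])
    return hits


is_gu = lambda d: d["from"] == "gu"
is_18 = lambda d: d["age"] == 18
is_30 = lambda d: d["age"] == 30
older_than_25 = lambda d: d["age"] > 25

print(bool_filter(people, must=[is_gu, is_30]))
print(bool_filter(people, should=[is_gu, is_18]))
print(bool_filter(people, must_not=[is_gu, is_18]))
print(bool_filter(people, must=[is_gu], filter=[older_than_25]))
```

The four calls correspond to the four requests in this section: AND, OR, NOT, and a must clause narrowed by a range filter.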

# Find members of the gu family who are 30 years old
GET yang_night/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "from": "gu"
          }
        },
        {
          "match": {
            "age": 30
          }
        }
      ]
    }
  }
}

# Find people who are in the gu family or are 18 years old
GET yang_night/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "from": "gu"
          }
        },
        {
          "match": {
            "age": 18
          }
        }
      ]
    }
  }
}

# Find people who are neither in the gu family nor 18 years old
GET yang_night/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "from": "gu"
          }
        },
        {
          "match": {
            "age": 18
          }
        }
      ]
    }
  }
}


# Find members of the gu family who are older than 25
"""
gt: greater than
gte: greater than or equal to
lt: less than
lte: less than or equal to
"""
GET yang_night/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "from": "gu"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gt": 25
          }
        }
      }
    }
  }
}

Filtering Result Fields

# Return only the name and age fields

GET yang_night/_search
{
  "query": {
    "match_all": {
    }
  },
  "_source": ["name", "age"]
}

Highlighting

### Default highlight style: matches are wrapped in <em> tags
GET yang_night/_search
{
  "query": {
    "match": {
      "name": "石头"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    }
  }
}


### Custom highlight style: red and bold
GET yang_night/_search
{
  "query": {
    "match": {
      "desc": "貌美"
    }
  },
  
  "highlight": {
    "pre_tags": "<b class='key' style='color:red'>",
    "post_tags": "</b>",
    "fields": {
      "desc": {}
    }
  }
}

Aggregations

# avg max min sum
# SQL equivalent: select max(age) as my_max

GET yang_night/_search
{
  "query": {
    "match_all": {
    }
  },
  "aggs": {
    "my_max": {
      "max": {
        "field": "age"
      }
    }
  },
  "_source": ["name", "age"]
}



# Maximum age within the gu family; "size": 0 returns only the aggregation, not the hits
GET yang_night/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "aggs": {
    "my_max": {
      "max": {
        "field": "age"
      }
    }
  },
  "size": 0
}

# Group the ages into range buckets; from is inclusive and to is exclusive
GET yang_night/_search
{
  "size": 0, 
  "query": {
    "match_all": {}
  },
  "aggs": {
    "age_group": {
      "range": {
        "field": "age",
        "ranges": [
  
          {
            "from": 0,
            "to": 26
          },
          {
            "from": 26,
            "to": 31
          }
        ]
      }
    }
  }
}
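
To see what the range aggregation counts, the bucketing can be reproduced in plain Python over the five sample ages. range_buckets is a hypothetical helper, and the key format is only meant to resemble the bucket keys of the response.

```python
def range_buckets(values, ranges):
    """Count values per bucket; from is inclusive, to is exclusive."""
    buckets = {}
    for low, high in ranges:
        key = f"{float(low)}-{float(high)}"
        buckets[key] = sum(1 for v in values if low <= v < high)
    return buckets


ages = [30, 18, 22, 29, 25]  # ages of the five sample documents
print(range_buckets(ages, [(0, 26), (26, 31)]))
```

Ages 18, 22, and 25 fall into the first bucket, and 29 and 30 into the second. An age of exactly 26 would land in the second bucket, because the upper bound of each range is excluded.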

posted @ 2022-03-14 23:14 yang_night