Elasticsearch cross_fields queries

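1. Searching with most_fields also has some problems

  • It cannot use the operator or minimum_should_match parameters to reduce the long tail of less relevant results.

2. With cross_fields and "operator": "and", the terms peter and smith must both appear, but each of them may appear in any field.

3. The cross_fields type first analyzes the query string to produce a list of terms, and then searches for each term in any of the fields. This term-centric way of searching naturally solves two of the three problems of field-centric queries. The remaining one, term statistics that differ from field to field, is handled by blending those statistics across the fields.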
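
The difference between the two approaches can be sketched in a few lines of Python. This is only a toy model, not how Elasticsearch is implemented: documents are plain dicts whose fields are already split into lowercase terms, and the helper names field_centric_and and term_centric_and are made up for this illustration.

```python
def field_centric_and(doc, terms, fields):
    """Field-centric AND: a single field must contain every query term."""
    return any(all(t in doc.get(f, []) for t in terms) for f in fields)


def term_centric_and(doc, terms, fields):
    """Term-centric AND (the cross_fields idea): every query term must
    appear in at least one of the fields."""
    return all(any(t in doc.get(f, []) for f in fields) for t in terms)


# Toy document: each field is already a list of terms.
doc = {"first_name": ["peter"], "last_name": ["smith"]}
terms = "peter smith".lower().split()  # stand-in for real query analysis
fields = ["first_name", "last_name"]

print(field_centric_and(doc, terms, fields))  # False
print(term_centric_and(doc, terms, fields))   # True
```

No single field holds both peter and smith, so the field-centric check fails, while the term-centric check finds each term in some field.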

4. Classic example

GET /_validate/query?explain
{
    "query": {
        "multi_match": {
            "query":       "peter smith",
            "type":        "cross_fields", 
            "operator":    "and",
            "fields":      [ "first_name", "last_name" ]
        }
    }
}

Reference: https://www.elastic.co/guide/cn/elasticsearch/guide/current/_cross_fields_queries.html

---------------------------------------------------------------------------------------------------------------------------

1. Regular expressions combined with cross_fields

PUT /addressbook/_doc/2
{
  "name":"test url",
  "mobile":"123/456/url"
}
# Removing the .* from the query string gives the same result
GET /addressbook/_search
{
  "query": {
    "multi_match": {
      "query": ".*456.*",
      "fields": ["name","mobile"]
    }
  }
}
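
A likely explanation for why dropping the .* changes nothing: multi_match analyzes the query string instead of interpreting it as a regular expression, so the punctuation is discarded before any term is looked up. Also, since the request sets no type, it runs as the default best_fields rather than cross_fields. The Python sketch below imitates this with a very rough stand-in for the standard analyzer (lowercase, keep runs of ASCII letters and digits). The function rough_standard_analyze is hypothetical, and the real analyzer follows Unicode text segmentation rules that this does not reproduce.

```python
import re


def rough_standard_analyze(text):
    """Very rough stand-in for the standard analyzer:
    lowercase the text and keep runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


query_terms = rough_standard_analyze(".*456.*")
plain_terms = rough_standard_analyze("456")
mobile_terms = rough_standard_analyze("123/456/url")

print(query_terms)   # ['456']
print(plain_terms)   # ['456']
print(mobile_terms)  # ['123', '456', 'url']

# Both query strings reduce to the same single term, which is one of the
# terms produced for the mobile field of document 2.
print(any(t in mobile_terms for t in query_terms))  # True
```

The _analyze API shows what the real analyzer produces for a given string. Actual pattern matching would need a different query type, such as regexp.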

 ---------------------------------------------------------------------------------------------------------

3. Chinese search with cross_fields

3.1 Define the mapping

PUT yanbao072702
{
  "mappings": {
    "_doc": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "author": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "institution": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "industry": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "grade": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "doc_type": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "time": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        },
        "doc_uri": {
          "type": "text",
          "index": false
        },
        "doc_size": {
          "type": "integer",
          "index": false
        },
        "market": {
          "type": "byte"
        }
      }
    }
  }
}

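3.2 Insert data

Note: the request below writes to yanbao0727, not to the yanbao072702 index created in 3.1. Under the 3.1 mapping, the values "10M" (doc_size, integer) and "cn" (market, byte) would fail to parse, so yanbao0727 is presumably mapped differently.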

PUT /yanbao0727/_bulk
{"index":{"_id":"4"}}
{"title":"香港-报告","author":"中信证券","institution":"中信中国","industry":"testindustry","grade":"testgrade","doc_type":"testdoc_type","time":"2019-07-27","doc_uri":"www.baidu.com","doc_size":"10M","market":"cn"}
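
The mapping in 3.1 declares doc_size as integer and market as byte, while the document above carries "10M" and "cn" for those fields. If it were indexed into an index with that mapping, those values would be expected to be rejected as unparseable numbers. The sketch below is a simplified pre-flight check written for this note, not Elasticsearch's own coercion logic. The helper numeric_problems is hypothetical, and the ranges are those of Java's byte and int.

```python
# Value ranges of Java's byte and int, which back the byte and integer types.
RANGES = {"byte": (-128, 127), "integer": (-2**31, 2**31 - 1)}


def numeric_problems(doc, field_types):
    """Return the fields whose values cannot be read as an in-range integer."""
    bad = []
    for field, es_type in field_types.items():
        if field not in doc:
            continue  # a missing field is not a problem
        low, high = RANGES[es_type]
        try:
            value = int(doc[field])
        except (TypeError, ValueError):
            bad.append(field)
            continue
        if not low <= value <= high:
            bad.append(field)
    return bad


doc = {"doc_size": "10M", "market": "cn"}
print(numeric_problems(doc, {"doc_size": "integer", "market": "byte"}))
# ['doc_size', 'market']
```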

3.3 Test the analyzer

POST _analyze
{
  "analyzer": "ik_smart",
  "text":"test报告"
}

3.4 Search for "报告 君安"

POST /yanbao0727/_search
{
  "query": {
    "multi_match": {
      "query": "报告 君安",
      "type": "cross_fields", 
      "fields": ["author","title"]
    }
  }
}

3.5 Search result

{
        "_index" : "yanbao0727",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 2.6205368,
        "_source" : {
          "title" : "test报告",
          "author" : "国泰君安",
          "institution" : "君安证券",
          "industry" : "testindustry",
          "grade" : "testgrade",
          "doc_type" : "testdoc_type",
          "time" : "2019-07-27",
          "doc_uri" : "www.baidu.com",
          "doc_size" : "10M",
          "market" : "cn"
        }
      }
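
The query in 3.4 sets no operator, so multi_match uses its default, or: a document qualifies as soon as one query term is found in any of the listed fields. The sketch below counts which query terms a document contains across fields. The token lists are assumptions made for illustration (real ik_max_word output depends on the installed dictionary), and matched_terms is a hypothetical helper.

```python
def matched_terms(doc_tokens, terms, fields):
    """Query terms found in at least one of the given fields (term-centric view)."""
    return [t for t in terms if any(t in doc_tokens.get(f, []) for f in fields)]


# Assumed token lists; real ik_max_word output depends on its dictionary.
hit_2 = {"title": ["test", "报告"], "author": ["国泰君安", "国泰", "君安"]}
doc_4 = {"title": ["香港", "报告"], "author": ["中信", "证券"]}

terms = ["报告", "君安"]
fields = ["author", "title"]

for name, doc in (("hit_2", hit_2), ("doc_4", doc_4)):
    found = matched_terms(doc, terms, fields)
    # "or" needs at least one term, "and" needs all of them.
    print(name, found, "or:", len(found) >= 1, "and:", len(found) == len(terms))
```

Under these assumed tokens, the hit shown in 3.5 contains both terms, one in title and one in author. The document inserted in 3.2 contains only 报告, so it would satisfy or but not and.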

Reference: https://www.cnblogs.com/dxf813/p/8447196.html

posted @ 2019-07-27 08:54  littlevigra