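ES7 (Elasticsearch 7) study notes

These notes are split into indexing documents (insert) and querying documents (select).

1 An index has only one type

When indexing a document, use _doc in place of the type name.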

PUT /megacorp/_doc/3
{
  "first_name": "Douglas",
  "last_name": "Fir",
  "age": 35,
  "about": "I like to build cabinets",
  "interests": [ "forestry" ]
}

Fetch a single document by ID

GET /megacorp/_doc/3

Find people whose last name is Smith

GET /megacorp/_search?q=last_name:Smith

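2 Find people whose last name is Smith and who are older than 30, using the query DSL. Two approaches: (1) a AND b, with both conditions in must; (2) query for a, then filter by b.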

POST /megacorp/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "last_name": "Smith"
          }
        },
        {
          "range": {
            "age": {
              "gt": 30
            }
          }
        }
      ]
    }
  }
}

POST /megacorp/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "last_name": "Smith"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gt": 30
          }
        }
      }
    }
  }
}
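
The two request bodies above differ only in where the range clause sits. As a rough sketch (the helper names here are made up for illustration, and no Elasticsearch client library is assumed), both bodies can be built as plain Python dicts and sent with any HTTP client. A clause under must contributes to the relevance score, while a clause under filter only includes or excludes documents.

```python
import json


def smith_over_30_must(last_name="Smith", min_age=30):
    """Approach 1: both conditions go in `must` (a AND b, both scored)."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"last_name": last_name}},
                    {"range": {"age": {"gt": min_age}}},
                ]
            }
        }
    }


def smith_over_30_filter(last_name="Smith", min_age=30):
    """Approach 2: query on the name, then filter on age (filter is not scored)."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"last_name": last_name}}],
                "filter": {"range": {"age": {"gt": min_age}}},
            }
        }
    }


if __name__ == "__main__":
    # Print the body that would be POSTed to /megacorp/_search.
    print(json.dumps(smith_over_30_filter(), indent=2))
```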

3 Phrase search: matches documents that contain every token of the search phrase, adjacent and in the same order

https://blog.csdn.net/sinat_29581293/article/details/81486761

GET /megacorp/_search
{
    "query" : {
        "match_phrase": {
            "about" : "rock climbing"
        }
    }
}
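
To make the difference from a plain match query concrete, here is a toy sketch of phrase matching. It assumes a simplified analyzer that lowercases and splits on whitespace, which is not what Elasticsearch actually runs; the point is only that every token must appear, consecutively and in order.

```python
def tokenize(text):
    # Toy analyzer (an assumption for this sketch): lowercase, split on whitespace.
    return text.lower().split()


def phrase_matches(field_text, phrase):
    """Return True if the phrase tokens occur consecutively and in order."""
    doc_tokens = tokenize(field_text)
    wanted = tokenize(phrase)
    n = len(wanted)
    if n == 0:
        return False
    # Slide a window of n tokens over the document tokens.
    return any(doc_tokens[i:i + n] == wanted for i in range(len(doc_tokens) - n + 1))


if __name__ == "__main__":
    print(phrase_matches("I love to go rock climbing", "rock climbing"))
    print(phrase_matches("I like climbing on a rock", "rock climbing"))
```

The second call contains both words but not as a phrase, so it does not match.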

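4 Inspect how a keyword is tokenized. The standard analyzer splits Chinese text into single characters and English text into words. The IK analyzer offers ik_smart and ik_max_word.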

GET /megacorp/_analyze
{
   "text": ["康师傅","rock climbing"],
   "analyzer": "standard"
}

{
  "tokens" : [
    {
      "token" : "康",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "师",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "傅",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "<IDEOGRAPHIC>",
      "position" : 2
    },
    {
      "token" : "rock",
      "start_offset" : 4,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 103
    },
    {
      "token" : "climbing",
      "start_offset" : 9,
      "end_offset" : 17,
      "type" : "<ALPHANUM>",
      "position" : 104
    }
  ]
}
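
The jump from position 2 to position 103 in the response above comes from passing an array of texts: a position gap is inserted between array elements. The sketch below is a toy imitation of that behavior, not the real standard analyzer. The gap of 100 positions and 1 offset character are assumptions chosen to match the response shown above, and the regular expression only handles CJK ideographs and ASCII alphanumerics.

```python
import re

POSITION_GAP = 100  # assumed position gap between array elements
OFFSET_GAP = 1      # assumed offset gap between array elements

# One CJK ideograph is one token; a run of ASCII letters/digits is one token.
TOKEN_RE = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def toy_standard_analyze(texts):
    tokens = []
    position = -1
    offset_base = 0
    for index, text in enumerate(texts):
        if index > 0:
            position += POSITION_GAP
        for match in TOKEN_RE.finditer(text):
            position += 1
            tokens.append({
                "token": match.group().lower(),
                "start_offset": offset_base + match.start(),
                "end_offset": offset_base + match.end(),
                "position": position,
            })
        offset_base += len(text) + OFFSET_GAP
    return tokens


if __name__ == "__main__":
    for tok in toy_standard_analyze(["康师傅", "rock climbing"]):
        print(tok)
```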

5 Inspect how a specific field is tokenized at index time

GET /test/_analyze
{
  "field": "t_name",
  "text": ["康师傅","rock climbing"]
}

6 Inspect the field mapping. The t_name field is analyzed with ik_max_word at index time and with ik_smart at search time.

https://segmentfault.com/a/1190000012553894?utm_source=tag-newest

http://localhost:9200/test/_mapping

"t_name": {
  "type": "text",
  "similarity": "BM25",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  },
  "analyzer": "ik_max_word",
  "search_analyzer": "ik_smart"
},
"t_pyname": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},
7 Highlight matched keywords

GET /megacorp/_search
{
    "query" : {
        "match_phrase": {
            "about" : "rock climbing"
        }
    },
    "highlight": {
      "fields": {
        "about": {}
      }
    }
}

8 The Elasticsearch counterpart of GROUP BY: aggregations, used for analysis and statistics

GET /megacorp/_search
{
  "aggs": {
    "all_inter": {
      "terms": {
        "field": "interests.keyword"
      }
    }
  }
}

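9 Error when aggregating. Aggregating on a text field needs fielddata, which can use a lot of memory, so it is disabled by default. Before aggregating, either enable fielddata on the field, or use the .keyword sub-field as in the request above.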

Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead

PUT megacorp/_mapping
{
  "properties": {
    "interests": { 
      "type":     "text",
      "fielddata": true
    }
  }
}

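10 If a terms aggregation takes a long time, try "execution_hint": "map". This mainly helps when the query matches only a small number of documents.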

https://blog.csdn.net/laoyang360/article/details/79253294

GET /megacorp/_search
{
  "query": {
    "match": {
      "last_name": "smith"
    }
  }, 
  "aggs": {
    "all_inter": {
      "terms": {
        "field": "interests",
        "execution_hint": "map"
      }
    }
  }
}

11 Query one field for several keywords (several search terms against the same field): documents whose interests field contains music OR sports

GET /megacorp/_search
{
   "query": {
     "terms": {
       "interests": [
         "music",
         "sports"
       ]
     }
   }
}

12 Query one field that must contain several keywords: documents whose interests field contains music AND sports

GET /megacorp/_search
{
   "query": {
     "bool": {
       "must": [
         {
           "term": {
             "interests": {
               "value": "music"
             }
           }
         },
         {
           "term": {
             "interests": {
               "value": "sports"
             }
           }
         }
       ]
     }
   }
}
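
The OR and AND behavior of the two queries above (terms versus bool/must with several term clauses) can be pictured with sets. This is a simplified sketch: it treats each document as the set of already-indexed terms in its interests field and ignores scoring. The function names are made up for illustration.

```python
def matches_any(doc_terms, wanted):
    """Like the `terms` query: at least one wanted term is present (OR)."""
    return bool(set(doc_terms) & set(wanted))


def matches_all(doc_terms, wanted):
    """Like `bool.must` with one `term` clause per value: all are present (AND)."""
    return set(wanted) <= set(doc_terms)


if __name__ == "__main__":
    wanted = ["music", "sports"]
    print(matches_any(["music"], wanted), matches_all(["music"], wanted))
    print(matches_any(["sports", "music"], wanted), matches_all(["sports", "music"], wanted))
```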

12 Query several fields for one keyword (the same search term against several fields)

https://blog.csdn.net/dm_vincent/article/details/41820537

GET /megacorp/_search
{
   "query": {
     "multi_match": {
       "query": "Smith",
       "fields": ["last_name","first_name"]
     }
   }
}

13 Multi-level aggregation: compute statistics for each bucket of an aggregation by nesting aggs inside aggs

GET /megacorp/_search
{
  "size":0,
  "aggs": {
    "all_inter": {
      "terms": {
        "field": "interests",
        "execution_hint": "map"
      },
      "aggs": {
        "avg_age": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}
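
What the nested request above computes can be sketched in plain Python: group documents by each value of interests, then average age inside each group. The three sample documents are hypothetical, not data from these notes, and the tie-break by key is an assumption about how equally sized buckets are ordered.

```python
from collections import defaultdict

# Hypothetical sample documents, used only for illustration.
SAMPLE_DOCS = [
    {"last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"last_name": "Smith", "age": 32, "interests": ["music"]},
    {"last_name": "Fir", "age": 35, "interests": ["forestry"]},
]


def terms_with_avg_age(documents):
    """Outer `terms` bucket per interest, inner `avg` of age per bucket."""
    ages_by_interest = defaultdict(list)
    for doc in documents:
        # A document falls into one bucket per distinct interest value.
        for interest in set(doc["interests"]):
            ages_by_interest[interest].append(doc["age"])
    buckets = [
        {"key": key, "doc_count": len(ages), "avg_age": sum(ages) / len(ages)}
        for key, ages in ages_by_interest.items()
    ]
    # Largest buckets first; ties broken by key (assumed ordering).
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets


if __name__ == "__main__":
    for bucket in terms_with_avg_age(SAMPLE_DOCS):
        print(bucket)
```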

14 Multi-field search, e.g. one keyword that should also match homophones, synonyms, similar-looking characters, and so on

https://blog.csdn.net/questiontoomuch/article/details/48493991

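For homophones, add an extra field, e.g. t_pyname, which holds the pinyin of t_name.

For synonyms, add an extra field, t_shinglesname.

Use a stemmer to index jumps, jumping, and jumped as their root form, jump. Then a search for jumped can still match documents containing jumping.
Include synonyms, such as jump, leap, and hop.
Remove diacritics or tone marks: for example, ésta, está, and esta are all indexed as esta.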
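
The diacritics and synonym ideas above can be sketched with the Python standard library. This is a toy normalizer, not an Elasticsearch token filter, and the synonym table is a made-up example. In Elasticsearch the equivalent work would be done by token filters configured on the extra field's analyzer.

```python
import unicodedata

# Hypothetical synonym table: map each variant to one canonical token.
SYNONYMS = {"leap": "jump", "hop": "jump"}


def strip_diacritics(text):
    """Decompose characters, then drop the combining marks (accents)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_token(token):
    """Lowercase, remove accents, then collapse synonyms to one form."""
    folded = strip_diacritics(token.lower())
    return SYNONYMS.get(folded, folded)


if __name__ == "__main__":
    print([normalize_token(t) for t in ["ésta", "está", "esta"]])
    print([normalize_token(t) for t in ["jump", "Leap", "hop"]])
```

Applying the same normalization at index time and at search time is what lets the variants match each other.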

posted @ 2019-11-14 18:09 jackduan1
