Post category - lucene

lucene
Summary: During DefaultIndexingChain.flush.writeDocValues, the fields are iterated and each field's DocValuesWriter.flush is called. For example, during SortedDocValuesWriter.flush.addSortedField, the field's DocValuesCon Read more
posted @ 2021-03-23 14:40 vsop_479 Views (156) Comments (0) Recommendations (0)
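Summary: Update operations are buffered in DocumentsWriterDeleteQueue, and the deletes are processed at flush time. DocumentsWriterDeleteQueue stores deletes using a global DeleteSlice and DWPT DeleteSlices. The DWPT DeleteSlice is used to Read more
posted @ 2021-02-28 19:58 vsop_479 Views (185) Comments (0) Recommendations (0)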
Summary: BooleanQuery: "must" : [ { "term" : { "like" : "cooking" } }, { "term" : { "property" : "bike" } } ] TermInSetQuery: { "terms": {"like": [ "cooking", "fi Read more
posted @ 2020-12-27 17:03 vsop_479 Views (369) Comments (0) Recommendations (0)
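The two query shapes in the summary above differ in how they combine postings: "must" clauses all have to match, while a "terms" list matches a document containing any of the listed terms. The sketch below illustrates only those matching semantics over a toy in-memory inverted index. It is not Lucene's implementation; the function names, the sample index, and the term "hiking" are hypothetical (the second term in the summary is truncated and is not reproduced here).

```python
def must_terms(index, clauses):
    """Return docs matching every (field, term) clause (AND semantics)."""
    result = None
    for field, term in clauses:
        docs = set(index.get(field, {}).get(term, ()))
        result = docs if result is None else result & docs
    return sorted(result or ())


def terms_in_set(index, field, terms):
    """Return docs whose field contains any of the given terms (OR semantics)."""
    matched = set()
    for term in terms:
        matched |= set(index.get(field, {}).get(term, ()))
    return sorted(matched)


# Hypothetical toy index: field -> term -> list of docIDs.
toy_index = {
    "like": {"cooking": [1, 2, 4], "hiking": [3, 4]},
    "property": {"bike": [2, 4, 5]},
}

both = must_terms(toy_index, [("like", "cooking"), ("property", "bike")])
either = terms_in_set(toy_index, "like", ["cooking", "hiking"])
```

Here `both` holds the documents that like cooking and own a bike, and `either` holds the documents that like at least one of the listed terms.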
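Summary: Storing, reading, and caching postings. Storing one term's postings list: 1: sort; 2: delta; 3: store every 128 docIDs as a block, where the block records bits per value (the bits needed for the block's largest value, like fdx); 4: skipper (for boolea Read more
posted @ 2020-12-22 15:03 vsop_479 Views (218) Comments (0) Recommendations (0)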
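The first three storage steps listed above (sort, delta, 128-docID blocks with a recorded bits-per-value) can be sketched in a few lines. This is a minimal Python illustration of the idea rather than Lucene's Java code: the function names are hypothetical, the deltas are kept as plain integers instead of being bit-packed, and a short final block is handled like any other block, which is a simplification.

```python
BLOCK_SIZE = 128  # docIDs per block, as stated in the summary


def encode_postings(doc_ids):
    """Sort, delta-encode, and split docIDs into blocks.

    Returns a list of (bits_per_value, deltas) tuples, one per block.
    """
    deltas = []
    previous = 0
    for doc_id in sorted(doc_ids):
        deltas.append(doc_id - previous)  # gap to the previous docID
        previous = doc_id

    blocks = []
    for start in range(0, len(deltas), BLOCK_SIZE):
        block = deltas[start:start + BLOCK_SIZE]
        # Bits needed to represent the largest value in this block.
        bits_per_value = max(block).bit_length()
        blocks.append((bits_per_value, block))
    return blocks


def decode_postings(blocks):
    """Rebuild the sorted docIDs by summing the deltas block by block."""
    doc_ids = []
    previous = 0
    for _bits_per_value, block in blocks:
        for delta in block:
            previous += delta
            doc_ids.append(previous)
    return doc_ids
```

Delta encoding keeps the stored values small when docIDs are dense, so each block needs fewer bits per value than it would for the raw docIDs.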
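Summary: Information about the committed segments is read from the segments_N file with the largest N. Specifically, the name, id, etc. of each committed segment are read from segments_N; then metadata such as docCount, files, and attributes is read from each segment's si file. Relevant code: lucene 8.7.0 Seg Read more
posted @ 2020-12-07 14:45 vsop_479 Views (126) Comments (0) Recommendations (0)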
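Picking the commit file with the largest N, as described above, amounts to parsing the generation suffix of each file name and keeping the maximum. The sketch below assumes the suffix is a base-36 number, which is how Lucene formats the generation; the function name and the sample directory listing are hypothetical, and this is a Python illustration, not the Lucene code the post refers to.

```python
SEGMENTS_PREFIX = "segments_"


def latest_segments_file(file_names):
    """Return the segments_N name with the largest generation N, or None."""
    best_generation = -1
    best_name = None
    for name in file_names:
        if not name.startswith(SEGMENTS_PREFIX):
            continue  # not a commit file, e.g. "_0.si"
        try:
            # The generation suffix is written in base 36 (0-9, a-z).
            generation = int(name[len(SEGMENTS_PREFIX):], 36)
        except ValueError:
            continue  # suffix is not a valid base-36 number
        if generation > best_generation:
            best_generation = generation
            best_name = name
    return best_name


# Hypothetical directory listing.
listing = ["_0.si", "segments_9", "segments_a", "segments_z", "segments_10"]
latest = latest_segments_file(listing)
```

Comparing the names as strings would pick `segments_z` here; parsing the suffix numerically selects `segments_10`, whose generation is 36.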
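Summary: The main files involved in stored fields are fdt and fdx. fdt stores the data in chunks, and fdx indexes those chunks. fdt analysis: writing fdt is implemented by CompressingStoredFieldsWriter. Its main fields are as follows. chunkSize: 16K (1 << 14), Lucene50Stor Read more
posted @ 2020-11-29 22:05 vsop_479 Views (283) Comments (0) Recommendations (0)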
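The split described above, a data file written in chunks plus an index that points at those chunks, can be sketched as follows. This is a simplified Python illustration and not CompressingStoredFieldsWriter itself: the class and method names are hypothetical, documents are opaque byte strings, nothing is compressed, and a chunk is flushed purely on buffered size.

```python
import bisect

CHUNK_SIZE = 1 << 14  # 16K, the chunkSize quoted in the summary


class ChunkedStoreSketch:
    """Toy stored-fields writer: `data` plays fdt, `index` plays fdx."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.data = bytearray()  # chunked document data
        self.index = []          # (first docID in chunk, chunk start offset)
        self._pending = []       # documents buffered for the next chunk
        self._pending_bytes = 0
        self._next_doc = 0
        self._first_pending_doc = 0

    def add_document(self, payload):
        """Buffer one document; flush a chunk once enough bytes pile up."""
        self._pending.append(payload)
        self._pending_bytes += len(payload)
        self._next_doc += 1
        if self._pending_bytes >= self.chunk_size:
            self.flush()

    def flush(self):
        """Write buffered documents as one chunk and index its start."""
        if not self._pending:
            return
        self.index.append((self._first_pending_doc, len(self.data)))
        for payload in self._pending:
            # Each document is stored as a 4-byte length plus its bytes.
            self.data += len(payload).to_bytes(4, "big") + payload
        self._pending = []
        self._pending_bytes = 0
        self._first_pending_doc = self._next_doc

    def read_document(self, doc_id):
        """Find the chunk through the index, then scan to the document."""
        if doc_id < 0 or doc_id >= self._first_pending_doc:
            raise IndexError("document not flushed or out of range")
        first_docs = [first for first, _offset in self.index]
        position = bisect.bisect_right(first_docs, doc_id) - 1
        first_doc, offset = self.index[position]
        for _ in range(doc_id - first_doc):  # skip earlier docs in the chunk
            length = int.from_bytes(self.data[offset:offset + 4], "big")
            offset += 4 + length
        length = int.from_bytes(self.data[offset:offset + 4], "big")
        return bytes(self.data[offset + 4:offset + 4 + length])
```

A reader only needs the small index to locate the right chunk, then scans within that chunk for the requested document.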