A Lucene segment contains all the index files, such as .tim and .tip, and can be thought of as a mini standalone index
Published: 2019-06-22


A Lucene index segment can be viewed as a "mini" index or a shard. Each segment is a collection of all the files an index needs, including .tim and .tip. If you list your Lucene index directory, you'll see that files belonging to the same segment share the same name and differ only in their extension (file type). In fact, if you force a merge down to one segment, you'll get an index consisting of a single segment.
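To make the naming convention concrete, here is a minimal sketch that groups a directory listing by segment. The listing below is hypothetical, and the helper `group_by_segment` is an illustrative name, not part of Lucene. It assumes per-segment files start with an underscore followed by the segment name, and that commit files such as `segments_N` and `write.lock` do not belong to any one segment.

```python
from collections import defaultdict

def group_by_segment(file_names):
    """Group index file names by the segment they appear to belong to."""
    segments = defaultdict(set)
    for name in file_names:
        # Assumption: per-segment files start with "_"; skip commit/lock files.
        if not name.startswith("_"):
            continue
        stem, _, ext = name.partition(".")
        # "_0_Lucene50_0" -> "_0": take the first "_"-delimited token as the segment name.
        segment = "_" + stem[1:].split("_", 1)[0]
        segments[segment].add(ext)
    return {seg: sorted(exts) for seg, exts in segments.items()}

# Hypothetical directory listing, for illustration only.
listing = ["_0.fdt", "_0.fdx", "_0.tim", "_0.tip",
           "_1.fdt", "_1.tim", "_1.tip",
           "segments_2", "write.lock"]
grouped = group_by_segment(listing)
for segment, extensions in sorted(grouped.items()):
    print(segment, extensions)
```

Each segment ends up with its own set of file types, which is what makes it usable as a standalone mini index.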

Each segment contains an index of a subset of your document collection. Lucene usually creates a new segment when new documents are added to a working index, which avoids (or rather delays and batches for later) the cost of reindexing.
When a search is executed, Lucene fans the query out over all segments, and all the index-wide statistics required for relevance ranking (such as idf) are combined. From the client's perspective, the ranking is therefore the same as when searching an index of one segment. Note that the other famous statistic, tf, is per-document, so it is already available at the segment reader layer.
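The effect of combining statistics can be sketched in a few lines. This is a toy model, not Lucene's code: documents are plain lists of terms, `segment_stats` and `combined_idf` are hypothetical helpers, and the idf formula is the textbook ln(N / df) rather than the exact formula of any Lucene similarity.

```python
import math
from collections import Counter

def segment_stats(docs):
    """Return (doc_count, doc_freq) for one segment; each doc is a list of terms."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term at most once per document
    return len(docs), doc_freq

def combined_idf(term, segments):
    """Textbook idf = ln(N / df), with N and df summed over all segments.

    Assumes the term occurs in at least one document.
    """
    total_docs = sum(n for n, _ in segments)
    total_df = sum(df[term] for _, df in segments)
    return math.log(total_docs / total_df)

# Toy documents, for illustration only.
seg_a = [["lucene", "index"], ["lucene", "segment"]]
seg_b = [["search", "index"], ["merge", "segment"], ["lucene", "merge"]]

multi = [segment_stats(seg_a), segment_stats(seg_b)]   # two segments
single = [segment_stats(seg_a + seg_b)]                # same docs, one segment

idf_multi = combined_idf("lucene", multi)
idf_single = combined_idf("lucene", single)
print(idf_multi == idf_single)
```

Because document counts and document frequencies simply add up across segments, the combined idf matches what a single merged segment would give.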
Things get more interesting when your Lucene indexes are spread across machines (as is the case in Solr Cloud, one of the distributed search services built on Lucene). For performance and complexity reasons, Solr Cloud does not (yet) aggregate global statistics across the cluster, so each machine uses its own statistics for the index it holds (which may itself consist of multiple segments :).
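A rough sketch of why this matters: when a term is unevenly distributed across machines, each machine computes a different idf for it. The shard names and counts below are made up, and the formula is again the textbook ln(N / df), not the exact formula Solr or Lucene uses.

```python
import math

def idf(doc_count, doc_freq):
    """Textbook idf = ln(N / df); not Lucene's exact similarity formula."""
    return math.log(doc_count / doc_freq)

# Hypothetical shards: (documents on the shard, documents containing the term).
shards = {"shard1": (1000, 500), "shard2": (1000, 10)}

# What each machine computes from its own index only.
local_idf = {name: idf(n, df) for name, (n, df) in shards.items()}

# What a cluster-wide aggregation of the same statistics would give.
global_idf = idf(sum(n for n, _ in shards.values()),
                 sum(df for _, df in shards.values()))

for name, value in local_idf.items():
    print(name, round(value, 3))
print("global", round(global_idf, 3))
```

With local statistics, the same term weighs much more on the shard where it is rare, so scores coming from different machines are not strictly comparable.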

 

Excerpted from: https://www.quora.com/Are-the-individual-tim-and-tip-files-term-dictionaries-of-a-Lucene-index-segment-updated-when-a-new-segment-is-added-to-Lucene

This article is reposted from the cnblogs blog of 张昺华-sky. Original link: http://www.cnblogs.com/bonelee/p/6668774.html . To republish it, please contact the original author.
