What is the most efficient way to retrieve all matching documents from a query in Lucene, unsorted?

2023-09-04 10:17:02 Author: 落安

I want to run a query for the purpose of maintaining internal integrity; for example, removing all traces of a particular field/value from the index. It is therefore important that I find all matching documents (not just the top n docs), but the order in which they are returned is irrelevant.

According to the docs, it looks like I need to use the Searcher.Search( Query, Collector ) method, but there's no built-in Collector class that does what I need.

Should I derive my own Collector for this purpose? What do I need to keep in mind when doing that?

Recommended answer

Turns out this was a lot easier than I expected. I just used the example implementation at http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/Collector.html and recorded the doc numbers passed to the Collect() method in a List, exposing this as a public Docs property.

I then simply iterate over this property, passing each number back to the Searcher to get the corresponding Document:

var searcher = new IndexSearcher( reader );
var collector = new IntegralCollector(); // my custom Collector
searcher.Search( query, collector ); // Collect() is called for every match, not just the top n

// Resolve each collected doc number to its stored Document.
var result = new Document[ collector.Docs.Count ];
for ( int i = 0; i < collector.Docs.Count; i++ )
    result[ i ] = searcher.Doc( collector.Docs[ i ] );

searcher.Close(); // this is probably not needed
reader.Close();

So far it seems to be working fine in preliminary tests.

Update: Here's the code for IntegralCollector:

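using System.Collections.Generic;

internal class IntegralCollector: Lucene.Net.Search.Collector {
    // Offset of the current reader's first document within the whole index.
    private int _docBase;

    // Index-wide doc numbers of every match seen so far.
    private readonly List<int> _docs = new List<int>();
    public List<int> Docs {
        get { return _docs; }
    }

    // Result order is irrelevant here, so out-of-order collection is fine.
    public override bool AcceptsDocsOutOfOrder() {
        return true;
    }

    // 'doc' is relative to the current reader; add the base to get the index-wide number.
    public override void Collect( int doc ) {
        _docs.Add( _docBase + doc );
    }

    public override void SetNextReader( Lucene.Net.Index.IndexReader reader, int docBase ) {
        _docBase = docBase;
    }

    // Scores are not used, so the scorer is ignored.
    public override void SetScorer( Lucene.Net.Search.Scorer scorer ) {
    }
}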
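
One thing to keep in mind with a custom Collector is why Collect() adds _docBase: the doc number it receives is relative to the reader most recently passed to SetNextReader(), not to the whole index. The following is only a minimal sketch of that arithmetic. DocBaseDemo is a hypothetical stand-in that does not reference Lucene.Net, and the segment layout (a first segment of 5 documents, so the second starts at base 5) is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for IntegralCollector; it does not depend on Lucene.Net.
class DocBaseDemo {
    private int _docBase;
    public readonly List<int> Docs = new List<int>();

    // Mirrors SetNextReader: called once per segment, before that segment's hits.
    public void SetNextReader( int docBase ) {
        _docBase = docBase;
    }

    // Mirrors Collect: 'doc' is relative to the current segment.
    public void Collect( int doc ) {
        Docs.Add( _docBase + doc );
    }

    static void Main() {
        var demo = new DocBaseDemo();

        // Assumed layout: the first segment holds 5 documents (base 0).
        demo.SetNextReader( 0 );
        demo.Collect( 1 );
        demo.Collect( 3 );

        // The second segment therefore starts at index-wide number 5.
        demo.SetNextReader( 5 );
        demo.Collect( 0 );
        demo.Collect( 3 );

        Console.WriteLine( string.Join( ", ", demo.Docs ) ); // prints "1, 3, 5, 8"
    }
}
```

Without the base, the hits from the second segment would be recorded as 0 and 3 and would resolve to the wrong documents when passed back to the searcher.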
