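How do I customize Lucene.NET so that searches for terms containing symbols (e.g. "C#" or ".NET") are case-insensitive?
Tags: customization, case sensitivity, symbols, Lucene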

2023-09-04 02:39:47 Author: 喜你已久

The StandardAnalyzer does not work for this. From what I understand, it strips the symbols and turns the query into a search for c and net.

The WhitespaceAnalyzer would work, but it is case-sensitive.

As a general rule, search should work like Google. Since .NET and C# have been around for a while, I am hoping this is a configuration setting or that there is a known workaround.

Per the suggestions below, I tried a custom WhitespaceAnalyzer, but keywords separated by commas with no spaces are not handled correctly. For example, a document containing

java,.net,c#,oracle 

is not returned when searching for one of those keywords, which is incorrect.
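To make the comma problem concrete, here is a small library-independent model in Python. It is not Lucene.NET code. The hypothetical function `whitespace_lowercase_tokens` only imitates splitting on whitespace and then lowercasing, which is the behaviour assumed for the custom analyzer described above.

```python
from typing import List


def whitespace_lowercase_tokens(text: str) -> List[str]:
    """Model of 'split on whitespace, then lowercase' (hypothetical helper)."""
    # str.split() with no argument splits on runs of whitespace only,
    # so commas and symbols such as '#' and '.' stay inside the token.
    return [token.lower() for token in text.split()]


spaced = whitespace_lowercase_tokens("Java .NET C# Oracle")
comma_separated = whitespace_lowercase_tokens("java,.net,c#,oracle")

print(spaced)           # four separate tokens
print(comma_separated)  # one token containing the whole comma-separated string
```

In this model a query term `c#` matches the space-separated text but not the comma-separated one, because the latter is indexed as a single token.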

I came across PatternAnalyzer, which is used to split tokens, but I cannot figure out how to use it in this scenario.

I am using Lucene.Net 3.0.3 on .NET 4.0.

Recommended answer

For others who might be looking for an answer as well:

The final answer turned out to be a custom TokenFilter plus a custom Analyzer that uses that token filter along with WhitespaceTokenizer, LowerCaseFilter, etc., about 30 lines of code in all. I will write a blog post and link it here once I do; I have to create a blog first!
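The answerer's code was never posted, so the following is only a sketch of how such a chain might behave, written as a library-independent Python model and not as Lucene.NET code. It assumes the custom filter's job is to break tokens at commas. The names `whitespace_tokenizer`, `comma_split_filter`, `lowercase_filter`, and `analyze` are hypothetical stand-ins for the tokenizer, the custom TokenFilter, the lowercase filter, and the custom Analyzer.

```python
from typing import Iterable, Iterator, List


def whitespace_tokenizer(text: str) -> Iterator[str]:
    """Emit runs of non-whitespace characters, symbols included."""
    yield from text.split()


def comma_split_filter(tokens: Iterable[str]) -> Iterator[str]:
    """Assumed custom filter: break each token at commas, dropping empty pieces."""
    for token in tokens:
        for piece in token.split(","):
            if piece:
                yield piece


def lowercase_filter(tokens: Iterable[str]) -> Iterator[str]:
    """Lowercase every token so matching is case-insensitive."""
    for token in tokens:
        yield token.lower()


def analyze(text: str) -> List[str]:
    """Chain tokenizer -> comma filter -> lowercase filter, like an analyzer would."""
    return list(lowercase_filter(comma_split_filter(whitespace_tokenizer(text))))


print(analyze("java,.net,c#,oracle"))  # ['java', '.net', 'c#', 'oracle']
print(analyze("C# and .NET"))          # ['c#', 'and', '.net']
```

For matching to work, the same analysis has to be applied both when indexing documents and when parsing queries, so that `C#` in a query and `c#` in the index reduce to the same term.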

 