Is there an algorithm that helps detect the "primary topic" of an English sentence? Tags: English, topic, sentence, algorithm

2023-09-11 00:26:33 Author: 缺心

I'm trying to find out if there is a known algorithm that can detect the "key concept" of a sentence.

The use case is as follows:

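A user inputs a sentence as a query (Does chicken taste like turkey?). Our system identifies the concepts in the sentence (chicken, turkey) and then runs a search of our corpus content.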

The area where we're lacking is identifying what the core "topic" of the sentence is really about. The sentence "Does chicken taste like turkey" has a primary topic of "chicken", because the user is asking about the taste of chicken, while "turkey" is a helper topic of less importance.

So... I'm trying to find out if there is an algorithm that will help me identify the primary topic of a sentence. Let me know if you are aware of any!

Recommended answer

I actually did a research project on this, won two competitions with it, and am now competing in nationals.

There are two steps to the method:

1. Parse the sentence with a context-free grammar (CFG).
2. In the resulting parse trees, find all nouns that are subordinate only to noun-phrase-like (NP-like) constituents.

For example, "I ate pie" has two nouns: "I" and "pie". Looking at the parse tree, "pie" is inside a verb phrase, so it cannot be the subject. "I", however, is only inside NP-like constituents. Being the only subject candidate, it is the subject. You can find an early copy of this program at http://www.candlemind.com. Note that the vocabulary is limited to basic singular words and there are no verb conjugations, so it has "man" but not "men", and "eat" but not "ate". Also, the CFG I used was hand-made and limited. I will be updating this program shortly.
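The two steps can be sketched in a few lines of Python. This is only an illustration, not the program linked above: the toy grammar, the lexicon, the `ALLOWED` set, and the function names `parse`, `parse_sentence`, and `topic_candidates` are all assumptions made for this example. In particular, the root `S` is treated as transparent so that a noun directly under the subject NP still qualifies, and the toy lexicon includes "ate" even though the original program does not.

```python
# Toy CFG (an assumption for this sketch, not the author's grammar).
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "N"], ["N"]],
    "VP": [["V", "NP"], ["V", "PP"], ["V"]],
    "PP": [["P", "NP"]],
}
LEXICON = {
    "N": {"i", "pie", "man", "chicken", "turkey"},
    "V": {"ate", "eat", "taste"},
    "Det": {"the", "a"},
    "Aux": {"does"},
    "P": {"like"},
}
# Labels a noun may sit under and still count as a topic candidate.
ALLOWED = {"S", "NP"}


def parse(symbol, tokens, start):
    """Yield (tree, end) for every way `symbol` can cover tokens[start:end]."""
    if symbol in LEXICON:
        if start < len(tokens) and tokens[start].lower() in LEXICON[symbol]:
            yield (symbol, tokens[start]), start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from _expand(symbol, rhs, tokens, start, [])


def _expand(symbol, rhs, tokens, pos, children):
    """Match the remaining right-hand-side symbols one by one, with backtracking."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, nxt in parse(rhs[0], tokens, pos):
        yield from _expand(symbol, rhs[1:], tokens, nxt, children + [child])


def parse_sentence(sentence):
    """Step 1: return every parse tree that covers the whole sentence."""
    tokens = sentence.rstrip("?.!").split()
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]


def topic_candidates(tree, ancestors=()):
    """Step 2: yield nouns whose ancestors are all in ALLOWED."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):  # word-level node
        if label == "N" and all(a in ALLOWED for a in ancestors):
            yield children[0]
        return
    for child in children:
        yield from topic_candidates(child, ancestors + (label,))


for text in ("I ate pie", "Does chicken taste like turkey?"):
    tree = parse_sentence(text)[0]
    print(text, "->", list(topic_candidates(tree)))
# I ate pie -> ['I']
# Does chicken taste like turkey? -> ['chicken']
```

With this toy grammar, "pie" and "turkey" are rejected because each has a `VP` among its ancestors, which matches the reasoning in the example above. A real system would need a much broader grammar or an off-the-shelf parser.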

Anyway, there are limitations to this program. My mentor pointed out that, in its current state, it cannot recognize sentences whose subjects are "real" NPs (what grammar actually calls NPs). For example, take "That the moon is flat is not a debate any longer." The subject is actually "that the moon is flat." However, the program would recognize "moon" as the subject. I will be fixing this shortly.
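The limitation can be reproduced on a hand-built parse tree. The tree below, its labels (`SBAR`, `C`, `Neg`, `AdvP`, and so on), and the helper `clausal_subject` are hypothetical choices for this sketch. It assumes the embedded clause is treated as transparent by the noun heuristic, which is what makes "moon" surface as the answer. The `clausal_subject` helper is just one possible way to recover the full clause and is not the fix the author had in mind.

```python
# Hand-built tree for "that the moon is flat is not a debate any longer"
# (labels are assumptions for illustration).
TREE = (
    "S",
    ("SBAR",
        ("C", "that"),
        ("S",
            ("NP", ("Det", "the"), ("N", "moon")),
            ("VP", ("V", "is"), ("Adj", "flat")))),
    ("VP",
        ("V", "is"),
        ("Neg", "not"),
        ("NP", ("Det", "a"), ("N", "debate")),
        ("AdvP", ("Adv", "any"), ("Adv", "longer"))),
)
# Assumption: the clause wrapper SBAR does not block a noun below it.
ALLOWED = {"S", "NP", "SBAR"}


def is_word_node(tree):
    """True for nodes of the form (tag, word)."""
    return len(tree) == 2 and isinstance(tree[1], str)


def topic_candidates(tree, ancestors=()):
    """Yield nouns whose ancestors are all in ALLOWED (the noun heuristic)."""
    label, *children = tree
    if is_word_node(tree):
        if label == "N" and all(a in ALLOWED for a in ancestors):
            yield children[0]
        return
    for child in children:
        yield from topic_candidates(child, ancestors + (label,))


def words(tree):
    """Return the words under a node, left to right."""
    if is_word_node(tree):
        return [tree[1]]
    return [w for child in tree[1:] for w in words(child)]


def clausal_subject(tree):
    """Hypothetical check: if the first child of the root is a clause, return it whole."""
    first = tree[1]
    return " ".join(words(first)) if first[0] == "SBAR" else None


print(list(topic_candidates(TREE)))  # ['moon']
print(clausal_subject(TREE))         # that the moon is flat
```

The noun heuristic returns only "moon", while the whole clause is the grammatical subject, so handling this case means checking for a clausal subject before falling back to nouns.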

Anyway, this is good enough for most sentences...

My research paper can be found there too. Go to page 11 to read the methods.

Hope this helps.