Algorithm to generate a unique serial number for each English word

2023-09-11 06:23:41 Author: 伱唇毁俄纯

For an application, I need to generate a unique serial number for each English word.

What is the best way to do this?

One constraint: the serial number generation algorithm should be very efficient on an ordinary desktop computer.

Thanks.

Recommended answer

Do you have a list of all possible words? If so, assign 0 to the first word and increment the serial by 1 for each subsequent word.
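As a minimal sketch of the word-list approach, assuming the words are already loaded into a Python list: the function name `assign_serials` and the sample words are hypothetical, chosen only for illustration.

```python
def assign_serials(words):
    """Map each distinct word to a serial number, starting at 0.

    Repeated words keep the serial they were given the first time.
    """
    serials = {}
    for word in words:
        if word not in serials:
            # The next free serial equals the number of words seen so far.
            serials[word] = len(serials)
    return serials


# Hypothetical sample list; a real one would come from a dictionary file.
word_list = ["apple", "banana", "cherry", "apple"]
serials = assign_serials(word_list)
print(serials["banana"])  # 1
```

Each lookup is a single dictionary access, so this should be cheap on an ordinary desktop; the cost is that the whole table has to be kept in memory or stored somewhere.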

If not, a simple way to guarantee uniqueness is to use the word itself as the serial, reading its character codes as one big number. For example, in ASCII, ABC = 0x41 0x42 0x43 = 0x414243 = 4276803. As suggested in the comments, there are other ways (which, however, require more work), such as compressing the words first with, for example, Huffman coding.
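The "word as its own serial" idea might look like the sketch below. It assumes ASCII-only words and big-endian byte order (the order that reproduces the ABC example); the function names `word_to_serial` and `serial_to_word` are hypothetical.

```python
def word_to_serial(word):
    """Interpret the word's ASCII bytes as one big-endian integer."""
    return int.from_bytes(word.encode("ascii"), byteorder="big")


def serial_to_word(serial):
    """Invert word_to_serial by turning the integer back into bytes."""
    length = (serial.bit_length() + 7) // 8  # number of bytes needed
    return serial.to_bytes(length, byteorder="big").decode("ascii")


print(word_to_serial("ABC"))    # 4276803
print(serial_to_word(4276803))  # ABC
```

Because the mapping is reversible, two different words can never share a serial. Note that it is case-sensitive, so "abc" and "ABC" get different numbers unless the words are normalised first.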

This of course gets awkward with long words: the serial of Pneumonoultramicroscopicsilicovolcanoconiosis (45 letters), for example, would require more than 100 decimal digits.
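To get a feel for how fast such serials grow, here is a small check, assuming the word's ASCII bytes are read as one big-endian integer. Python integers have arbitrary precision, so no extra library is needed.

```python
word = "Pneumonoultramicroscopicsilicovolcanoconiosis"

# One byte per letter, read as a single big-endian integer.
serial = int.from_bytes(word.encode("ascii"), byteorder="big")

print(len(word))         # 45 letters
print(len(str(serial)))  # 108 decimal digits
```

A number of this size does not fit in a 64-bit machine integer, so in most languages it would need a big-integer type or a string representation.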

Otherwise you can use a hash, but there is no guarantee that it will be unique across all English words.
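A hash-based variant could be sketched as follows. The choice of SHA-256 truncated to 8 bytes is an assumption made for illustration, not something the answer prescribes, and the names `hashed_serial` and `find_collisions` are hypothetical. Since uniqueness is not guaranteed, the second function checks a concrete word list for collisions.

```python
import hashlib


def hashed_serial(word, num_bytes=8):
    """Return a fixed-size serial taken from the first bytes of SHA-256."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:num_bytes], byteorder="big")


def find_collisions(words, num_bytes=8):
    """Return pairs of different words that share the same hashed serial."""
    seen = {}
    collisions = []
    for word in words:
        serial = hashed_serial(word, num_bytes)
        if serial in seen and seen[serial] != word:
            collisions.append((seen[serial], word))
        else:
            seen[serial] = word
    return collisions


# Hypothetical sample list; an empty result means no collisions in this list.
print(find_collisions(["apple", "banana", "cherry"]))
```

The serials have a fixed size regardless of word length, but the check only covers the words it was given; a word added later could still collide with an existing one.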