How can I elegantly compute the anagram signature of a word in Ruby?

2023-09-11 04:59:56 Author: 其实这只是一场游戏╮

Arising out of this question, I'm looking for an elegant (Ruby) way to compute the word signature suggested in this answer.

The idea suggested is to sort the letters in the word, and also run length encode repeated letters. So, for example "mississippi" first becomes "iiiimppssss", and then could be further shortened by encoding as "4impp4s".
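The two steps can be spelled out explicitly. This is only a rough sketch: the method name `run_length_signature` is made up for illustration, and it assumes that runs shorter than three letters are left as they are, which is what the "pp" in the example suggests.

```ruby
# Hypothetical helper name; not taken from the original post.
def run_length_signature(word)
  # Step 1: sort the letters, e.g. "mississippi" -> "iiiimppssss".
  sorted = word.chars.sort

  # Step 2: group consecutive identical letters and encode runs of 3 or more
  # as "<count><letter>". Shorter runs are kept verbatim (assumption).
  sorted.chunk_while { |a, b| a == b }.map do |run|
    run.length >= 3 ? "#{run.length}#{run.first}" : run.join
  end.join
end

puts run_length_signature("mississippi")  # prints 4impp4s
```

Two words are anagrams of each other exactly when their sorted letters match, so they also end up with the same encoded signature.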

I'm relatively new to Ruby, and though I could hack something together, I'm sure this is a one-liner for somebody with more Ruby experience. I'd be interested to see people's approaches and improve my Ruby knowledge.

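Edit: To clarify, the performance of computing the signature doesn't matter much for my application. I want to compute the signature so that I can store it alongside each word in a large database of words (450K words), and then query for the words that have the same signature (i.e. all anagrams of a given word that are actual English words). Hence the focus on space. The "elegant" part is just to satisfy my curiosity.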
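To make the store-and-query idea concrete, here is a minimal sketch that uses an in-memory Hash as a stand-in for the database. The helper names (`signature`, `build_index`, `anagrams_of`) and the tiny word list are assumptions for illustration only, and the three-letter threshold for encoding runs is inferred from the "4impp4s" example.

```ruby
# Hypothetical helper: sorted letters, with runs of 3+ identical letters
# replaced by "<count><letter>".
def signature(word)
  word.chars.sort.join.gsub(/(.)\1{2,}/) { |run| "#{run.length}#{run[0]}" }
end

# Map each signature to the list of words that share it.
def build_index(words)
  words.group_by { |w| signature(w) }
end

# Look up every stored word with the same signature as the given word.
def anagrams_of(index, word)
  index.fetch(signature(word), [])
end

# Tiny stand-in for the 450K-word list.
index = build_index(%w[listen silent enlist tinsel google banana])
puts anagrams_of(index, "inlets").inspect  # prints ["listen", "silent", "enlist", "tinsel"]
```

In a real database the signature would presumably be an indexed column, so the lookup becomes a single equality query.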

Recommended answer

I'm not much of a Ruby person either, but as I noted in the other comment, this seems to work for the algorithm described.

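s = "mississippi"
# Sort the letters, then replace each run of three or more identical letters
# with the run's length followed by the letter.
s.split('').sort.join.gsub(/(.)\1{2,}/) { |s| s.length.to_s + s[0,1] }  # => "4impp4s"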

Of course, you'll want to make sure the word is lowercase, doesn't contain numbers, etc.
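One possible way to enforce that is sketched below. The wrapper name `safe_signature` and the exact rule (ASCII letters a-z only, after downcasing) are assumptions, not something from the original answer; the point is that digits in the input would be indistinguishable from the run-length counts.

```ruby
# Hypothetical wrapper; the letters-only rule is an assumption.
def safe_signature(word)
  normalized = word.downcase

  # Reject anything that is not purely a-z, since digits would be
  # ambiguous with the run-length counts in the signature.
  unless normalized.match?(/\A[a-z]+\z/)
    raise ArgumentError, "expected letters only: #{word.inspect}"
  end

  normalized.split('').sort.join.gsub(/(.)\1{2,}/) { |s| s.length.to_s + s[0, 1] }
end

puts safe_signature("Mississippi")  # prints 4impp4s
```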

As requested, I'll try to explain the code. Please forgive me if I don't get all of the Ruby or regex terminology correct, but here goes.

I think the split/sort/join part is pretty straightforward. The interesting part for me starts at the call to gsub. This replaces each substring that matches the regular expression with the return value of the block that follows it. The regex matches any character and captures it in a group so it can be referred to later. That's the "(.)" part. Then we continue the matching process using the backreference "\1", which evaluates to whatever character was captured by the first part of the match. We want that character to be found at least two more times, for a minimum of three occurrences in total. This is done using the quantifier "{2,}".

If a match is found, the matching substring is passed to the block as an argument thanks to the "|s|" part. Inside the block, we take the length of the matching substring as a string, append whatever character makes up that substring (they should all be the same), and return the concatenated value. The returned value replaces the original matching substring. The whole process continues until nothing is left to match, since it's a global substitution on the original string.
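To watch this happen, it can help to look at the matches separately from the replacement. The sketch below is just an illustration; the method names `matched_runs` and `encode_runs` are made up. It relies on the fact that gsub called with a pattern but no replacement and no block returns an enumerator over the matched substrings.

```ruby
RUN_PATTERN = /(.)\1{2,}/  # one character followed by 2+ copies of itself

# Hypothetical helper: list the substrings the pattern matches.
def matched_runs(sorted)
  sorted.gsub(RUN_PATTERN).to_a
end

# Hypothetical helper: the same substitution as in the answer's one-liner.
def encode_runs(sorted)
  sorted.gsub(RUN_PATTERN) { |s| s.length.to_s + s[0, 1] }
end

sorted = "mississippi".split('').sort.join
puts sorted                        # prints iiiimppssss
puts matched_runs(sorted).inspect  # prints ["iiii", "ssss"]
puts encode_runs(sorted)           # prints 4impp4s
```

The "pp" run never reaches the block because two letters are not enough to satisfy "{2,}" after the first captured character.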

I apologize if that's confusing. As is often the case, it's easier for me to visualize the solution than to explain it clearly.