Compressing a base62 (0-9a-zA-Z) encoded string

2023-09-11 06:11:05 Author: 为迩独战天下⌒

I need to compress a base62 (0-9a-zA-Z) encoded string with a length of 20 characters into a 15-16 character string in order to squeeze some other information in. The tricky part is the compressed output should also be base62 encoded. Can this be done? Any suggestion is greatly appreciated.

Thanks!

Recommended answer

See the Pigeonhole principle - if you try to put 100 pigeons into 10 holes, some holes will have multiple pigeons. In the same way, for your problem, there will have to be occurrences of two strings compressing to the same string. In these cases, you won't know which string to decompress the compressed string to.

So no, you cannot losslessly compress 20 characters to 16 characters (or even 20 to 19 characters) in the same encoding for all possible inputs.
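The counting behind this can be checked directly. The following is a minimal sketch that only compares how many distinct strings exist at each length over a 62-character alphabet; the variable names are illustrative.

```python
# Count how many distinct strings each length allows over a 62-symbol alphabet.
ALPHABET_SIZE = 62

inputs = ALPHABET_SIZE ** 20   # all possible 20-character strings
outputs = ALPHABET_SIZE ** 16  # all possible 16-character strings

# On average, each 16-character output would have to stand for this many inputs.
inputs_per_output = inputs // outputs  # equals 62 ** 4

print(inputs > outputs)    # True: more pigeons than holes
print(inputs_per_output)   # 14776336
```

Even shortening by a single character leaves 62 times more inputs than outputs, so some collisions are unavoidable.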

If the input were to have some defining characteristics, such as that the only uppercase character will be the first character, that digits appear only in the last 3 characters, etc., then it will be more compressible and it may be possible.

If you had such characteristics (or if you want to convert to a different encoding that has enough space), you could easily convert a string in any encoding to a unique number and then this number into a string in a different encoding. The way to do this would be to:

For each character position, assign a number, starting at 0, to each possible character allowed in that position.

So if "A" to "Z" and "a" to "z" are allowed in the first position, you could assign 0-25 to "A" through "Z" and 26-51 to "a" through "z". So "B", for example, will be 1.

Iterate through the string, multiplying the total by the number of allowable values for the current position and then adding the number assigned to the character at that position to the total.
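The two steps above can be sketched as follows. This is only an illustration: the function name `string_to_number`, the `alphabets` parameter (one ordered string of allowed characters per position) and the sample format are assumptions, not part of the original answer.

```python
import string


def string_to_number(text, alphabets):
    """Fold a string into one integer, one position at a time.

    alphabets[i] is the ordered set of characters allowed at position i;
    the number assigned to a character is its index in that alphabet.
    """
    if len(text) != len(alphabets):
        raise ValueError("text length must match the number of positions")
    total = 0
    for char, allowed in zip(text, alphabets):
        # Multiply by the number of allowable values, then add this
        # character's assigned number (index raises ValueError if the
        # character is not allowed at this position).
        total = total * len(allowed) + allowed.index(char)
    return total


# Hypothetical format: one uppercase letter followed by two digits.
UPPER_THEN_TWO_DIGITS = [string.ascii_uppercase, string.digits, string.digits]

print(string_to_number("Z35", UPPER_THEN_TWO_DIGITS))  # 2535
```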

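To get a string in a different encoding, just repeat the following for each position:

Set the current position to the character corresponding to the remainder of the total divided by the number of allowable values for that position.

Set the total to the result of that division (rounded down).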

It doesn't matter whether you go from left to right or right to left in either of the above cases, as long as you pick one way and stick to it.
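The reverse direction can be sketched the same way. The function name `number_to_string`, the `alphabets` parameter and the target format below are illustrative assumptions; this version fills positions from left to right.

```python
import string


def number_to_string(number, alphabets):
    """Spread an integer over the positions of a target format.

    alphabets[i] is the ordered set of characters allowed at position i.
    Positions are filled from left to right.
    """
    chars = []
    total = number
    for allowed in alphabets:
        # The remainder selects the character; the quotient carries on.
        total, remainder = divmod(total, len(allowed))
        chars.append(allowed[remainder])
    if total != 0:
        raise ValueError("number does not fit in the target format")
    return "".join(chars)


# Hypothetical format: one lowercase letter, then two letters of either
# case, numbered 0-25 for A-Z and 26-51 for a-z.
LETTERS = string.ascii_uppercase + string.ascii_lowercase
LOWER_THEN_TWO_LETTERS = [string.ascii_lowercase, LETTERS, LETTERS]

print(number_to_string(2535, LOWER_THEN_TWO_LETTERS))  # ntB
```

Because this sketch fills the leftmost position with the first remainder, turning its output back into the number means applying multiply-and-add from right to left, which is the "stick to one way" point above.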

You could also easily determine if such a conversion is possible by calculating the maximum possible value for each encoding (by taking the largest value for each character) - if the target has a smaller largest possible value, the conversion is not possible.
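A small sketch of that check, assuming each format is described as a list of allowed-character strings, one per position; `largest_value`, `conversion_possible` and the sample formats are hypothetical names chosen for illustration.

```python
import string


def largest_value(alphabets):
    """Largest number a format can represent: take the last character
    (the highest assigned number) at every position."""
    total = 0
    for allowed in alphabets:
        total = total * len(allowed) + (len(allowed) - 1)
    return total


def conversion_possible(source_alphabets, target_alphabets):
    """A lossless conversion needs the target to reach at least as high."""
    return largest_value(target_alphabets) >= largest_value(source_alphabets)


LETTERS = string.ascii_uppercase + string.ascii_lowercase
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

# Hypothetical formats: uppercase letter + 2 digits -> lowercase letter + 2 letters.
source = [string.ascii_uppercase, string.digits, string.digits]
target = [string.ascii_lowercase, LETTERS, LETTERS]

print(largest_value(source), largest_value(target))     # 2599 70303
print(conversion_possible(source, target))              # True
print(conversion_possible([BASE62] * 20, [BASE62] * 16))  # False
```

The last line is the case from the question: 20 unrestricted base62 characters cannot fit into 16.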

Note that the above only works when each position has its own fixed set of allowed characters. You can, to some extent, extend it to work for other constraints (such as having at most 1 digit anywhere in the string), but this gets a bit more complex.

Example:

Input format: 1 uppercase letter (A-Z), then 2 digits (0-9)
Output format: 1 lowercase letter (a-z), then 2 upper-/lowercase letters (A-Z or a-z)

Input: "Z35"
Number: 10*(10*(26*0 + 25) + 3) + 5 = 2535

Explanation: We start with "Z". The total is 0 to start, which we multiply by the number of uppercase letters (26) and then add the value for "Z" (25). We then move on to "3", where we multiply this total by the number of digits (10) and add the value for "3" (3), and so on.

Output calculation:
2535 / 26 = 97 (rounded down)
2535 % 26 = 13, so 1st character = "n" (13+1 = 14th letter of the alphabet)
97 / 52 = 1 (rounded down)
97 % 52 = 45, so 2nd character = "t" (45-26+1 = 20th letter of the alphabet)
1 % 52 = 1, so 3rd character = "B"

Output: "ntB"

Largest possible value for input format: 10*(10*(26*0 + 25) + 9) + 9 = 2599
Largest possible value for output format: 52*(52*(26*0 + 25) + 51) + 51 = 70303
Is conversion possible? Yes, because 70303 >= 2599.