如何找到其中包含给定字符串中的所有字符最小的子?字符串、字符、最小、其中包含

2023-09-11 01:50:56 作者:爱你无果i

我最近遇到一个有趣的问题上的字符串。假设您将得到如下:

I have recently come across an interesting question on strings. Suppose you are given following:

Input string1: "this is a test string"
Input string2: "tist"
Output string: "t stri"

因此​​,上面给出的,我怎么能进场,争取找到字符串1的最小的子字符串,其中包含从字符串2的所有字符?

So, given above, how can I approach towards finding smallest substring of string1 that contains all the characters from string 2?

推荐答案

您可以做一个直方图横扫 O(N + M)时间和 O(1)的空间, N 是在第一个字符串中的字符数和 M 是字符在第二数

You can do a histogram sweep in O(N+M) time and O(1) space where N is the number of characters in the first string and M is the number of characters in the second.

它的工作原理是这样的:

It works like this:

请的第二个字符串的字符的直方图(按键操作 HIST2 [S2 [I]] ++ )。 请的第一个字符串的字符的累积直方图,直到直方图包含每个字符的第二个字符串的直方图包含(我称之为直方图条件)。 然后向前移动的第一串,从直方图中减去,直到它不能满足直方图条件。马克的第一个字符串的该位(最后移动之前)作为暂定子。 将子串的前部向前直至你再次见面直方图条件。移动端向前,直到再次失败。如果这是一个更短的子串比第一,标记作为暂定的子串。 重复,直到你完成整个第一串已经通过。 的显着子是你的答案。 Make a histogram of the second string's characters (key operation is hist2[ s2[i] ]++). Make a cumulative histogram of the first string's characters until that histogram contains every character that the second string's histogram contains (which I will call "the histogram condition"). Then move forwards on the first string, subtracting from the histogram, until it fails to meet the histogram condition. Mark that bit of the first string (before the final move) as your tentative substring. Move the front of the substring forwards again until you meet the histogram condition again. Move the end forwards until it fails again. If this is a shorter substring than the first, mark that as your tentative substring. Repeat until you've passed through the entire first string. The marked substring is your answer.

请注意,通过改变你的直方图条件下使用的支票,你可以选择有相同的字符集的作为第二个字符串,或至少尽可能多的每一个字符键入的。 (它只是在差[1] - GT; 0安培;和B [1] - GT; 0 A [1]> = B [I]

Note that by varying the check you use on the histogram condition, you can choose either to have the same set of characters as the second string, or at least as many characters of each type. (Its just the difference between a[i]>0 && b[i]>0 and a[i]>=b[i].)

如果你保持跟踪哪些条件不满足,当你试图去满足它,并检查只是你当你试图打破它递减的事情,你可以加快直方图检查。 (在初始积累,你算你有多少项目满意,增量计算每次添加一个新特性需要从虚假到真实的情况。)

You can speed up the histogram checks if you keep a track of which condition is not satisfied when you're trying to satisfy it, and checking only the thing that you decrement when you're trying to break it. (On the initial buildup, you count how many items you've satisfied, and increment that count every time you add a new character that takes the condition from false to true.)