Finding common blocks

2023-09-11 05:00:08 Author: 往北的地方海未眠。

I have two files (f1 and f2) containing some text (or binary data). How can I quickly find common blocks?

e.g.
f1: ABC DEF
f2: XXABC XEF

Output:

common blocks (0-based offsets):
length 4: "ABC " in f1@0 and f2@2
length 2: "EF" in f1@5 and f2@7

Recommended answer

Wikipedia has some pseudocode for finding the longest common substring of two sequences of data. In your case, you simply extract from the table all common substrings that are not prefixes of other common substrings (i.e. the maximal common substrings).
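A minimal Python sketch of that idea, not the Wikipedia pseudocode itself: it fills the usual dynamic-programming table one row at a time and reports a run as soon as it can no longer be extended to the right. The function name maximal_common_blocks and the min_len parameter are made up for illustration. It takes O(len(a) * len(b)) time, so for large files a suffix-array or suffix-tree approach would probably be a better fit.

```python
def maximal_common_blocks(a, b, min_len=1):
    """Return (length, offset_in_a, offset_in_b) for every maximal common block.

    Works on str or bytes. Offsets are 0-based.
    """
    n, m = len(a), len(b)
    # prev[j] = length of the common run ending at a[i - 1] and b[j - 1]
    prev = [0] * (m + 1)
    blocks = []
    for i in range(n):
        cur = [0] * (m + 1)
        for j in range(m):
            if a[i] != b[j]:
                continue
            length = prev[j] + 1
            cur[j + 1] = length
            # The run is maximal if it cannot be extended one step to the right.
            # It is already maximal to the left, because `length` counts the
            # whole run of matches ending at (i, j).
            at_end = i + 1 == n or j + 1 == m
            if (at_end or a[i + 1] != b[j + 1]) and length >= min_len:
                blocks.append((length, i - length + 1, j - length + 1))
        prev = cur
    return blocks


if __name__ == "__main__":
    for length, pos1, pos2 in maximal_common_blocks("ABC DEF", "XXABC XEF"):
        print(f"length {length}: f1@{pos1} and f2@{pos2}")
```

Only one previous row of the table is kept, so memory use stays proportional to the length of the second input. Raising min_len filters out short accidental matches, which is usually what you want for binary data.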
