Creating a function for given inputs and outputs

2023-09-11 00:01:52 Author: 花開須相惜、

Imagine there are two same-sized sets of numbers.

Is it possible, and if so how, to create a function (an algorithm or a subroutine) which exactly maps the input items to the output items? Like:

Input = 1, 2, 3, 4
Output = 2, 3, 4, 5

And the function is:

def f(x): return x + 1

And by "function" I mean something slightly more complex than [1]:

def f(x):
    if x == 1: return 2
    if x == 2: return 3
    if x == 3: return 4
    if x == 4: return 5

This would be useful for creating special hash functions or function approximations.

Update:

What I am trying to find out is whether there is a way to compress the trivial mapping example above [1].

Recommended answer

Finding the shortest program that outputs some string (sequence, function etc.) is equivalent to finding its Kolmogorov complexity, which is undecidable.

If "impossible" is not a satisfying answer, you have to restrict your problem. In all appropriately restricted cases (polynomials, rational functions, linear recurrences) finding an optimal algorithm will be easy as long as you understand what you're doing. Examples:

polynomials - Lagrange interpolation

rational functions - Padé approximation

Boolean formulas - Karnaugh maps

approximate solutions - regression (linear case: linear regression)

general packing of data - data compression; some techniques, like run-length encoding, are lossless, some are not.
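As a minimal sketch of the first restricted case above (not part of the original answer): Lagrange interpolation recovers the rule behind the question's example mapping. Given the four points (1, 2) ... (4, 5), the interpolating polynomial collapses to f(x) = x + 1.

```python
# Plain Lagrange interpolation, no external dependencies.

def lagrange(xs, ys):
    """Return the function interpolating the points (xs[i], ys[i])."""
    def f(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    # Classic Lagrange basis: 1 at xi, 0 at every other xj.
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return f

f = lagrange([1, 2, 3, 4], [2, 3, 4, 5])
print(f(1), f(4), f(10))  # 2.0 5.0 11.0 -- the data lies on f(x) = x + 1
```

Note that interpolation always succeeds on distinct inputs, but it only "compresses" the mapping when the data really follows a low-degree polynomial; on arbitrary data it just reproduces the lookup table in another form.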

In the case of polynomial sequences, it often helps to consider the sequence b_n = a_{n+1} - a_n; this reduces a quadratic relation to a linear one, a linear one to a constant sequence, etc. But there is no silver bullet. You might build some heuristics (e.g. Mathematica has FindSequenceFunction - check that page to get an impression of how complex this can get) using genetic algorithms, random guesses, checking many built-in sequences and their compositions and so on. No matter what, any such program is - in theory - infinitely far from perfection due to the undecidability of Kolmogorov complexity. In practice, you might get satisfactory results, but this requires many man-years.
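The difference trick above can be sketched in a few lines (an illustration, not code from the answer): repeatedly taking b_n = a_{n+1} - a_n until the sequence becomes constant reveals the degree of a polynomial sequence.

```python
def differences(seq):
    """One round of the difference trick: b_n = a_{n+1} - a_n."""
    return [b - a for a, b in zip(seq, seq[1:])]

seq = [1, 4, 9, 16, 25]      # a_n = n^2, a quadratic sequence
degree = 0
while len(set(seq)) > 1:     # loop until the sequence is constant
    seq = differences(seq)
    degree += 1
print(degree)  # 2 -- the data comes from a degree-2 polynomial
```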

See also another SO question. You might also implement a wrapper around OEIS in your application.

Limits:

Mostly, the limits of what can be done are described in:

complexity theory - describing which problems can be solved "fast", like finding the shortest path in a graph, and which cannot, like playing a generalized version of checkers (it's EXPTIME-complete).

information theory - describing how much "information" is carried by a random variable. For example, take coin tossing. Normally, it takes 1 bit to encode one result, and n bits to encode n results (using a long 0-1 sequence). Suppose now that you have a biased coin that gives tails 90% of the time. Then it is possible to find another way of describing n results that on average gives a much shorter sequence. The number of bits per toss needed for optimal coding (less than 1 in that case!) is called entropy; the plot in that article shows how much information is carried (1 bit for 1/2-1/2, less than 1 for a biased coin, 0 bits if the coin always lands on the same side).
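The entropy figure mentioned above is easy to reproduce (a sketch, not from the answer): the standard binary-entropy formula H(p) = -p*log2(p) - (1-p)*log2(1-p) gives the bits per toss for a coin with tails-probability p.

```python
import math

def entropy(p):
    """Shannon entropy, in bits, of a coin that gives tails with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a deterministic coin carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(entropy(0.5))            # 1.0   -- a fair coin needs a full bit per toss
print(round(entropy(0.9), 3))  # 0.469 -- the 90/10 coin from the text
```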

algorithmic information theory - which attempts to join complexity theory and information theory. Kolmogorov complexity belongs here. You may consider a string "random" if it has large Kolmogorov complexity: aaaaaaaaaaaa is not a random string, f8a34olx probably is. So a random string is incompressible (Volchan's "What is a random sequence?" is a very readable introduction). Chaitin's algorithmic information theory book is available for download. Quote: "[...] we construct an equation involving only whole numbers and addition, multiplication and exponentiation, with the property that if one varies a parameter and asks whether the number of solutions is finite or infinite, the answer to this question is indistinguishable from the result of independent tosses of a fair coin." (In other words, no algorithm can guess that result with probability > 1/2.) I haven't read that book, however, so I can't rate it.

Strongly related to information theory is coding theory, which describes error-correcting codes. Example result: it is possible to encode 4 bits into 7 bits such that it will be possible to detect and correct any single error, or detect two errors (Hamming(7,4)).
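A compact sketch of that Hamming(7,4) result (an illustration under standard conventions, not code from the answer): bit positions run 1..7, parity bits sit at positions 1, 2, 4, and the syndrome of a corrupted word is exactly the position of the flipped bit.

```python
def encode(d):
    """Hamming(7,4): encode data bits [d1, d2, d3, d4] into 7 bits."""
    p1 = d[0] ^ d[1] ^ d[3]          # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]          # covers positions 2, 3, 6, 7
    p3 = d[1] ^ d[2] ^ d[3]          # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def correct(c):
    """Fix up to one flipped bit in a 7-bit codeword, return the 4 data bits."""
    s = 0                            # syndrome = XOR of positions of 1-bits
    for i, bit in enumerate(c, start=1):
        if bit:
            s ^= i
    if s:                            # nonzero syndrome = position of the error
        c = c[:]
        c[s - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]  # data bits live at positions 3, 5, 6, 7

data = [1, 0, 1, 1]
code = encode(data)
code[4] ^= 1                         # corrupt one bit "in transit"
print(correct(code) == data)  # True -- the single error was corrected
```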

On the positive side:

symbolic algorithms for Lagrange interpolation and Padé approximation are part of computer algebra/symbolic computation; von zur Gathen, Gerhard, "Modern Computer Algebra" is a good reference.

data compression - here you'd better ask someone else for references :)