Is there a fast algorithm to determine the Godel number of a term of a context-free language?

2023-09-11 22:49:01 Author: FALSE

Suppose we have a simple grammar specification. There is a way to enumerate the terms of that grammar, by iterating over it diagonally, that guarantees any finite term will appear at a finite position. For example, for the following grammar:

S      ::= add
add    ::= mul | add + mul
mul    ::= term | mul * term
term   ::= number | ( S )
number ::= digit | digit number
digit  ::= 0 | 1 | ... | 9

You can enumerate terms like this:

0
1
0+0
0*0
0+1
(0)
1+0
0*1
0+0*0
00
... etc

My question is: is there a way to do the opposite? That is, to take a valid term of that grammar, say 0+0*0, and find its position in such an enumeration (in this case, 9)?

Solution

For this specific problem, we can cook up something fairly simple, if we allow ourselves to choose a different enumeration ordering. The idea is basically the one in Every Bit Counts, which I also mentioned in the comments. First, some preliminaries: some imports/extensions, a data type representing the grammar, and a pretty-printer. For the sake of simplicity, my digits only go up to 2 (big enough to not be binary any more, but small enough not to wear out my fingers and your eyes).

{-# LANGUAGE TypeSynonymInstances #-}
import Control.Applicative
import Data.Universe.Helpers

type S      = Add
data Add    = Mul    Mul    | Add :+ Mul       deriving (Eq, Ord, Show, Read)
data Mul    = Term   Term   | Mul :* Term      deriving (Eq, Ord, Show, Read)
data Term   = Number Number | Parentheses S    deriving (Eq, Ord, Show, Read)
data Number = Digit  Digit  | Digit ::: Number deriving (Eq, Ord, Show, Read)
data Digit  = D0 | D1 | D2                     deriving (Eq, Ord, Show, Read, Bounded, Enum)

class PP a where pp :: a -> String
instance PP Add where
    pp (Mul m) = pp m
    pp (a :+ m) = pp a ++ "+" ++ pp m
instance PP Mul where
    pp (Term t) = pp t
    pp (m :* t) = pp m ++ "*" ++ pp t
instance PP Term where
    pp (Number n) = pp n
    pp (Parentheses s) = "(" ++ pp s ++ ")"
instance PP Number where
    pp (Digit d) = pp d
    pp (d ::: n) = pp d ++ pp n
instance PP Digit where pp = show . fromEnum

Now let's define the enumeration order. We'll use two basic combinators, +++ for interleaving two lists (mnemonic: the middle character is a sum, so we're taking elements from either the first argument or the second) and +*+ for the diagonalization (mnemonic: the middle character is a product, so we're taking elements from both the first and second arguments). More information on these in the universe documentation. One invariant we'll maintain is that our lists -- with the exception of digits -- are always infinite. This will be important later.
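
To make the two combinators concrete, here is a minimal sketch using only base. The names interleave2 and diagonal are made up for this illustration and are not the universe package's own definitions; diagonal additionally assumes both of its arguments are infinite. The order within each diagonal is chosen to agree with the pairing function used further down.

```haskell
-- Hypothetical stand-in for +++ : alternate between the two lists,
-- starting with the first one.
interleave2 :: [a] -> [a] -> [a]
interleave2 (x:xs) ys = x : interleave2 ys xs
interleave2 []     ys = ys

-- Hypothetical stand-in for +*+ : walk the anti-diagonals s = m + n,
-- and within each diagonal let the index m into the first list grow.
-- ASSUMPTION: both lists are infinite (!! would fail on a short list).
diagonal :: [a] -> [b] -> [(a, b)]
diagonal xs ys = [ (xs !! m, ys !! (s - m)) | s <- [0 ..], m <- [0 .. s] ]
```

In GHCi, take 6 (diagonal [0..] [0..]) lists the index pairs in the order they are visited. The repeated !! makes this quadratic, so treat it as a specification of the order rather than something to run on long prefixes.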

ss    = adds
adds  = (Mul    <$> muls   ) +++ (uncurry (:+) <$> adds +*+ muls)
muls  = (Term   <$> terms  ) +++ (uncurry (:*) <$> muls +*+ terms)
terms = (Number <$> numbers) +++ (Parentheses <$> ss)
numbers = (Digit <$> digits) ++ interleave [[d ::: n | n <- numbers] | d <- digits]
digits  = [D0, D1, D2]

Let's see a few terms:

*Main> mapM_ (putStrLn . pp) (take 15 ss)
0
0+0
0*0
0+0*0
(0)
0+0+0
0*(0)
0+(0)
1
0+0+0*0
0*0*0
0*0+0
(0+0)
0+0*(0)
0*1

Okay, now let's get to the good bit. Let's assume we have two infinite lists a and b. There are two things to notice. First, in a +++ b, all the even indices come from a, and all the odd indices come from b. So we can look at the last bit of an index to see which list to look in, and use the remaining bits as an index into that list. Second, in a +*+ b, we can use the standard bijection between pairs of numbers and single numbers to translate between indices in the big list and pairs of indices into the a and b lists. Nice! Let's get to it. We'll define a class for Godel-able things, that is, things that can be translated back and forth to numbers (indices into the infinite list of inhabitants). Later we'll check that this translation matches the enumeration we defined above.
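
The first observation, in isolation, looks roughly like this. It is only a sketch: toSumIndex, fromSumIndex and lookupSum are names invented for the illustration, and Either stands in for "which of the two lists".

```haskell
-- Index into the interleaving, given an index into one of the two lists:
-- the low bit is the tag, the remaining bits are the original index.
toSumIndex :: Either Integer Integer -> Integer
toSumIndex (Left  i) = 2 * i       -- even positions come from the first list
toSumIndex (Right j) = 2 * j + 1   -- odd positions come from the second list

-- Inverse direction: peel off the low bit again.
fromSumIndex :: Integer -> Either Integer Integer
fromSumIndex k = case quotRem k 2 of
  (q, 0) -> Left q
  (q, _) -> Right q

-- Look up position k of the interleaving without ever building it.
lookupSum :: [a] -> [a] -> Integer -> a
lookupSum as bs k = case fromSumIndex k of
  Left  i -> as !! fromInteger i
  Right j -> bs !! fromInteger j
```

For instance, map (lookupSum "ace" "bdf") [0..5] walks the two strings alternately, exactly as an interleaving would.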

type Nat = Integer -- bear with me here
class Godel a where
    to :: a -> Nat
    from :: Nat -> a

instance Godel Nat where to = id; from = id

instance (Godel a, Godel b) => Godel (a, b) where
    to (m_, n_) = (m + n) * (m + n + 1) `quot` 2 + m where
        m = to m_
        n = to n_
    from p = (from m, from n) where
        isqrt    = floor . sqrt . fromIntegral
        base     = (isqrt (1 + 8 * p) - 1) `quot` 2
        triangle = base * (base + 1) `quot` 2
        m = p - triangle
        n = base - m
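
One caveat about the code above: floor . sqrt . fromIntegral goes through Double, so for very large indices the square root may come out slightly off. Here is a sketch of the same bijection on plain Integers with an exact integer square root (Newton's method). The names isqrtExact, pair and unpair are introduced for the illustration only.

```haskell
-- Exact floor of the square root, by Newton's iteration on Integers.
isqrtExact :: Integer -> Integer
isqrtExact n
  | n < 0     = error "isqrtExact: negative argument"
  | n < 2     = n
  | otherwise = go n
  where
    -- Starting above the root, the iterates decrease until they hit it.
    go x =
      let y = (x + n `quot` x) `quot` 2
      in if y >= x then x else go y

-- Same formula as the Godel (a, b) instance, specialised to Integers.
pair :: Integer -> Integer -> Integer
pair m n = (m + n) * (m + n + 1) `quot` 2 + m

-- Inverse of pair: find the diagonal (base), then the offset within it.
unpair :: Integer -> (Integer, Integer)
unpair p = (m, base - m)
  where
    base = (isqrtExact (1 + 8 * p) - 1) `quot` 2
    m    = p - base * (base + 1) `quot` 2
```

Swapping isqrtExact in for isqrt in the instance should leave small indices unchanged while keeping huge ones exact.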

The instance for pairs here is the standard Cantor diagonal. It's just a bit of algebra: use the triangle numbers to figure out where you're going/coming from. Now building up instances for this class is a breeze. Numbers are just represented in base 3, with a small offset so that digit strings with leading zeros (such as 00) get an index of their own:

-- this instance is a lie! there aren't infinitely many Digits
-- but we'll be careful about how we use it
instance Godel Digit where
    to = fromIntegral . fromEnum
    from = toEnum . fromIntegral

instance Godel Number where
    to (Digit d) = to d
    to (d ::: n) = 3 + to d + 3 * to n
    from n
        | n < 3     = Digit (from n)
        | otherwise = let (q, r) = quotRem (n-3) 3 in from r ::: from q

For the remaining three types, we will, as suggested above, check the tag bit to decide which constructor to emit, and use the remaining bits as indices into a diagonalized list. All three instances necessarily look very similar.

instance Godel Term where
    to (Number n) = 2 * to n
    to (Parentheses s) = 1 + 2 * to s
    from n = case quotRem n 2 of
        (q, 0) -> Number (from q)
        (q, 1) -> Parentheses (from q)

instance Godel Mul where
    to (Term t) = 2 * to t
    to (m :* t) = 1 + 2 * to (m, t)
    from n = case quotRem n 2 of
        (q, 0) -> Term (from q)
        (q, 1) -> uncurry (:*) (from q)

instance Godel Add where
    to (Mul m) = 2 * to m
    to (m :+ t) = 1 + 2 * to (m, t)
    from n = case quotRem n 2 of
        (q, 0) -> Mul (from q)
        (q, 1) -> uncurry (:+) (from q)

And that's it! We can now "efficiently" translate back and forth between parse trees and their Godel numbering for this grammar. Moreover, this translation matches the above enumeration, as you can verify:

*Main> map from [0..29] == take 30 ss
True
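
As a worked example, the index of 0+0*0 can be followed by hand through the instances above. The sketch below just replays that arithmetic on Integers; tag0, tag1 and the zero... names are made up for the illustration, and pair is the pairing formula from the (a, b) instance.

```haskell
pair :: Integer -> Integer -> Integer
pair m n = (m + n) * (m + n + 1) `quot` 2 + m

-- First constructor of a type: even index. Second constructor: odd index.
tag0, tag1 :: Integer -> Integer
tag0 i = 2 * i
tag1 i = 1 + 2 * i

zeroAsNumber, zeroAsTerm, zeroAsMul, zeroAsAdd :: Integer
zeroAsNumber = 0                  -- to (Digit D0)
zeroAsTerm   = tag0 zeroAsNumber  -- Number (Digit D0)
zeroAsMul    = tag0 zeroAsTerm    -- Term ...
zeroAsAdd    = tag0 zeroAsMul     -- Mul ...

-- 0*0 is (m :* t) with m = 0 as a Mul and t = 0 as a Term.
zeroTimesZero :: Integer
zeroTimesZero = tag1 (pair zeroAsMul zeroAsTerm)

-- 0+0*0 is (a :+ m) with a = 0 as an Add and m = 0*0 as a Mul.
example :: Integer
example = tag1 (pair zeroAsAdd zeroTimesZero)
```

Here example comes out as 3, which agrees with the listing earlier: counting from zero, 0+0*0 is the entry at index 3. (The 9 in the question refers to the question's own enumeration order, not this one.)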

We did abuse many nice properties of this particular grammar -- non-ambiguity, the fact that almost all the nonterminals had infinitely many derivations -- but variations on this technique can get you quite far, especially if you are not too strict on requiring every number to be associated with something unique.

Also, by the way, you might notice that, except for the instance for (Nat, Nat), these Godel numberings are particularly nice in that they look at/produce one bit (or trit) at a time. So you could imagine doing some streaming. But the (Nat, Nat) one is pretty nasty: you have to know the whole number ahead of time to compute the sqrt. You actually can turn this into a streaming guy, too, without losing the property of being dense (every Nat being associated with a unique (Nat, Nat)), but that's a topic for another answer...
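
For the curious, one well-known pairing that does work a bit at a time is to interleave the binary digits of the two numbers. This is not necessarily the construction hinted at above, and it changes the enumeration order (so it would no longer match +*+), but it is dense in the same sense. A minimal sketch, with pairBits and unpairBits as made-up names:

```haskell
-- Interleave bits, least significant first: m0, n0, m1, n1, ...
-- Each step emits one bit of output and swaps the roles of the arguments.
pairBits :: Integer -> Integer -> Integer
pairBits 0 0 = 0
pairBits m n = (m `rem` 2) + 2 * pairBits n (m `quot` 2)

-- Inverse: consume one bit, then undo the swap.
unpairBits :: Integer -> (Integer, Integer)
unpairBits 0 = (0, 0)
unpairBits p =
  let (n, m') = unpairBits (p `quot` 2)
  in (p `rem` 2 + 2 * m', n)
```

Both directions only ever look at the lowest bit of their input, so no square root (and no knowledge of the whole number up front) is needed.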