Parsing Lisp S-Expressions with a known schema in C#

2023-09-03 05:31:10 Author: 陌上初安

I'm working with a service that provides data as a Lisp-like S-Expression string. This data is arriving thick and fast, and I want to churn through it as quickly as possible, ideally directly on the byte stream (it's only single-byte characters) without any backtracking. These strings can be quite lengthy and I don't want the GC churn of allocating a string for the whole message.

My current implementation uses Coco/R with a grammar, but it has a few problems. Due to the backtracking, it assigns the whole stream to a string. It's also a bit fiddly for users of my code to change if they have to. I'd rather have a pure C# solution. Coco/R also does not allow for the reuse of parser/scanner objects, so I have to recreate them for each message.

Conceptually the data stream can be thought of as a sequence of S-Expressions:

(item 1 apple)(item 2 banana)(item 3 chainsaw)

Parsing this sequence would create three objects. The type of each object can be determined by the first value in the list, in the above case "item". The schema/grammar of the incoming stream is well known.
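
As a rough illustration of "the type is determined by the first value", here is a minimal Java sketch (C# would look much the same) that walks the byte stream once, without backtracking, and collects the head symbol of each top-level list. `HeadSymbolScanner` and `headSymbols` are hypothetical names, and the sketch assumes single-byte characters and whitespace-separated tokens as described above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class HeadSymbolScanner {

    /** Returns the head symbol of every top-level list in the stream, in order. */
    static List<String> headSymbols(InputStream in) throws IOException {
        List<String> heads = new ArrayList<>();
        StringBuilder head = new StringBuilder(); // reused for every expression
        int depth = 0;
        boolean readingHead = false;
        int b;
        while ((b = in.read()) != -1) {
            boolean paren = b == '(' || b == ')';
            boolean delimiter = paren || b == ' ' || b == '\t' || b == '\n' || b == '\r';
            if (readingHead) {
                if (!delimiter) {
                    head.append((char) b); // assumption: single-byte characters
                    continue;
                }
                if (head.length() > 0) {
                    heads.add(head.toString());
                }
                if (head.length() > 0 || paren) {
                    readingHead = false; // head finished, or the list has no head symbol
                }
            }
            if (b == '(') {
                if (++depth == 1) { // a new top-level list starts here
                    head.setLength(0);
                    readingHead = true;
                }
            } else if (b == ')') {
                if (--depth < 0) {
                    throw new IOException("unbalanced ')'");
                }
            }
        }
        if (depth != 0) {
            throw new IOException("input ended inside a list");
        }
        return heads;
    }

    /** Convenience overload for in-memory text. */
    static List<String> headSymbols(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        try {
            return headSymbols(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints [item, item, item]
        System.out.println(headSymbols("(item 1 apple)(item 2 banana)(item 3 chainsaw)"));
    }
}
```

Only the short head symbol is materialised as a string; the rest of each message is consumed byte by byte. For a real socket, wrapping the stream in a `BufferedInputStream` would likely matter, since `read()` is called once per byte.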

Before I start coding I'd like to know if there are libraries out there that do this already. I'm sure I'm not the first person to have this problem.

EDIT

Here's a little more detail on what I want as I think the original question may have been a little vague.

Given some S-Expressions, such as:

(Hear 12.3 HelloWorld)
(HJ LAJ1 -0.42)
(FRP lf (pos 2.3 1.7 0.4))

I want a list of objects equivalent to this:

{
    new HearPerceptorState(12.3, "HelloWorld"),
    new HingeJointState("LAJ1", -0.42),
    new ForceResistancePerceptorState("lf", new Polar(2.3, 1.7, 0.4))
}
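
One way to get from a generically parsed list to such typed objects is a dispatch on the head symbol. The sketch below is in Java for illustration and assumes the parsed form is a `List` whose elements are `String` symbols, `Number` values, or nested lists. The class names are taken from the example above, but their fields, the `PerceptorMapper` wrapper and the `toPerceptor` method are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Class names come from the question; fields and constructors are assumptions.
final class PerceptorMapper {

    static final class Polar {
        final double c0, c1, c2; // component meaning is not stated in the question
        Polar(double c0, double c1, double c2) { this.c0 = c0; this.c1 = c1; this.c2 = c2; }
    }

    static final class HearPerceptorState {
        final double value;
        final String message;
        HearPerceptorState(double value, String message) { this.value = value; this.message = message; }
    }

    static final class HingeJointState {
        final String name;
        final double value;
        HingeJointState(String name, double value) { this.name = name; this.value = value; }
    }

    static final class ForceResistancePerceptorState {
        final String name;
        final Polar point;
        ForceResistancePerceptorState(String name, Polar point) { this.name = name; this.point = point; }
    }

    private static double num(Object o) {
        return ((Number) o).doubleValue();
    }

    /** Maps one parsed list, e.g. [HJ, LAJ1, -0.42], to its typed object. */
    static Object toPerceptor(List<?> expr) {
        String head = (String) expr.get(0);
        switch (head) {
            case "Hear":
                return new HearPerceptorState(num(expr.get(1)), (String) expr.get(2));
            case "HJ":
                return new HingeJointState((String) expr.get(1), num(expr.get(2)));
            case "FRP": {
                List<?> pos = (List<?>) expr.get(2); // nested list: [pos, 2.3, 1.7, 0.4]
                Polar point = new Polar(num(pos.get(1)), num(pos.get(2)), num(pos.get(3)));
                return new ForceResistancePerceptorState((String) expr.get(1), point);
            }
            default:
                throw new IllegalArgumentException("unknown expression type: " + head);
        }
    }

    public static void main(String[] args) {
        Object state = toPerceptor(Arrays.asList("HJ", "LAJ1", -0.42));
        System.out.println(state.getClass().getSimpleName()); // prints HingeJointState
    }
}
```

Because the schema is known, a faster variant could skip the intermediate list and build each object directly while reading; the dispatch on the head symbol would stay the same.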

The actual data set I'm working on is a list of perceptors from a robot model in the RoboCup 3D simulated soccer league. I may potentially also need to deserialise another set of related data with a more complex structure.

Recommended answer

In my opinion a parser generator is unnecessary for parsing simple S-expressions consisting only of lists, numbers and symbols. A hand-written recursive descent parser is probably simpler and at least as fast. The general pattern would look like this (in Java; C# should be very similar):

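import java.io.IOException;
import java.io.PushbackReader;
import java.util.ArrayList;
import java.util.List;

// The helpers isNumber, isSymbolStart, isSymbol, isWhiteSpace, skipWhiteSpace,
// readNumber and error are left to the reader. error(ch) is assumed to throw.

// Reads one datum, starting with the next character of the stream.
Object readDatum(PushbackReader in) throws IOException {
    int ch = in.read();
    return readDatum(in, ch);
}
// Dispatches on the first character of the datum, which is already consumed.
Object readDatum(PushbackReader in, int ch) throws IOException {
    if (ch == '(') {
        return readList(in, ch);
    } else if (isNumber(ch)) { // should accept a leading '-' for values like -0.42
        return readNumber(in, ch);
    } else if (isSymbolStart(ch)) {
        return readSymbol(in, ch);
    } else {
        error(ch);
        return null; // not reached: error() is assumed to throw
    }
}
// Reads the elements of a list whose opening '(' has just been consumed.
List<Object> readList(PushbackReader in, int lookAhead) throws IOException {
    if (lookAhead != '(') {
        error(lookAhead);
    }
    List<Object> result = new ArrayList<>();
    while (true) {
        int ch = in.read();
        if (ch == ')') {
            break;
        } else if (ch == -1) {
            error(ch); // end of input before the closing ')'
        } else if (isWhiteSpace(ch)) {
            skipWhiteSpace(in); // consumes the rest of the whitespace run
        } else {
            result.add(readDatum(in, ch));
        }
    }
    return result;
}
// Reads a symbol whose first character is ch.
String readSymbol(PushbackReader in, int ch) throws IOException {
    StringBuilder result = new StringBuilder();
    result.append((char) ch);
    while (true) {
        int ch2 = in.read();
        if (isSymbol(ch2)) {
            result.append((char) ch2);
        } else if (isWhiteSpace(ch2) || ch2 == ')') {
            in.unread(ch2); // leave the delimiter for the caller
            break;
        } else if (ch2 == -1) {
            break;
        } else {
            error(ch2);
        }
    }
    return result.toString();
}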
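
The pattern above leaves its helper functions undefined. As a minimal self-contained sketch of the same recursive descent idea, the version below fills them in under a few assumptions: tokens are delimited by whitespace or parentheses, any token that parses as a `double` is a number, and everything else is a symbol. `SExprReader`, `readAll`, `readToken` and `parse` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; a compact variant of the pattern, not a library API.
final class SExprReader {

    /** Reads every top-level datum until end of input. */
    static List<Object> readAll(PushbackReader in) throws IOException {
        List<Object> data = new ArrayList<>();
        int ch;
        while ((ch = in.read()) != -1) {
            if (!isWhiteSpace(ch)) {
                data.add(readDatum(in, ch));
            }
        }
        return data;
    }

    /** Reads one datum whose first character, ch, is already consumed. */
    static Object readDatum(PushbackReader in, int ch) throws IOException {
        if (ch == '(') {
            return readList(in);
        }
        String token = readToken(in, ch);
        if (Character.isDigit(ch) || ch == '-' || ch == '+' || ch == '.') {
            try {
                return Double.valueOf(token);
            } catch (NumberFormatException e) {
                // not a number after all, fall through and treat it as a symbol
            }
        }
        return token;
    }

    /** Reads list elements up to the matching ')'; the '(' is already consumed. */
    static List<Object> readList(PushbackReader in) throws IOException {
        List<Object> result = new ArrayList<>();
        while (true) {
            int ch = in.read();
            if (ch == -1) {
                throw new IOException("input ended inside a list");
            }
            if (ch == ')') {
                return result;
            }
            if (!isWhiteSpace(ch)) {
                result.add(readDatum(in, ch));
            }
        }
    }

    /** Reads characters up to the next delimiter, which is pushed back. */
    static String readToken(PushbackReader in, int first) throws IOException {
        if (first == ')') {
            throw new IOException("unexpected ')'");
        }
        StringBuilder sb = new StringBuilder().append((char) first);
        int ch;
        while ((ch = in.read()) != -1) {
            if (isWhiteSpace(ch) || ch == '(' || ch == ')') {
                in.unread(ch); // one character of pushback is enough
                break;
            }
            sb.append((char) ch);
        }
        return sb.toString();
    }

    static boolean isWhiteSpace(int ch) {
        return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r';
    }

    /** Convenience wrapper for in-memory text. */
    static List<Object> parse(String text) {
        try {
            return readAll(new PushbackReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints [[FRP, lf, [pos, 2.3, 1.7, 0.4]]]
        System.out.println(parse("(FRP lf (pos 2.3 1.7 0.4))"));
    }
}
```

The reader never looks more than one character ahead, so the default one-character pushback buffer of `PushbackReader` suffices and no backtracking over the input is needed.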
 