使用塞西 - 乌尔曼算法EX pressions code发电机乌尔、发电机、算法、EX

2023-09-11 22:58:23 作者:他是我年少时最蔚蓝的海

举一个 AST树,我要生成一个汇编语言一样。我试图用塞西 - 乌尔曼的算法,但我有一些问题该算法实现。

Give a AST tree, I want to generate an assembly-like language. I'm trying using Sethi-Ullman algorithm but I have some questions in the algorithm implemetation.

目前我做到以下几点:

发出推REG ,其中 REG 是右子树的登记,评估左子树,送一寄存器指定为右子树的寄存器中,然后发出 POP REG 操作,其中 REG 是右子树的寄存器了。

Emit a push REG where REG is the register of right subtree, evaluate left subtree, get one free register assign as register of right subtree and then emit a POP REG operation where REG is the register of right subtree too.

enum Reg { Reg_r0, Reg_r1 };
Reg regs[] = { Reg_r0, Reg_r1 };
    Reg getreg() {
             static int c;
        if(c == sizeof(regs) / sizeof(int))
        c = 0;
        return regs[c++];
    }

下面是一个伪code(从C语言)如何从我unsertood实现它(包括标签()功能)

Here's a pseudo-code (from C-language) how to implement it from what I unsertood(including the label() function)

// K is the number of registers available
int K = 2;
void gen(AST ast) {
    if(ast.left != null && ast.right != null) {
        int l = ast.left.n;
        int r = ast.right.n;

        if(l >= K && r >= K) {
            gen(ast.right);
            ast.n -= 1;
            emit_operation(PUSH, ast.right.reg);
            gen(ast.left);
            ast.reg = getreg();
            emit_operation(POP, ast.right.reg);
        } else if(l >= r) {
            gen(ast.left);
            gen(ast.right);
            ast.n -= 1;
        } else if(l < r) {
            gen(ast.right);
            gen(ast.left);
            ast.n -= 1;
        }

        ast.reg = getreg();
        Reg r1 = ast.left.reg;
        Reg r2 = ast.right.reg;
        emit_operation(ast.type, r1, r2);
    } else if(ast.type == Type_id || ast.type == Type_number) {
        ast.n += 1;
        ast.reg = getreg();
        emit_load(ast);
    } else {
        print("gen() error");
        // error
    }
}

// ershov numbers
void label(AST ast) {
    if(ast == null)
        return;

    label(ast.left);
    label(ast.right);

    if(ast.type == Type_id || ast.type == Type_number)
        ast.n = 1;
    // ast has two childrens
    else if(ast.left not null && ast.right not null) {      
        int l = ast.left.n;
        int r = ast.right.n;

        if(l == r)
            ast.n = 1 + l;
        else
            ast.n = max(1, l, r);
    }
    // ast has one child
    else if(ast.left not null && ast.right is null)
        ast.n = ast.left.n;
    else
        print("label() error!");
}

编辑:请告诉我,如果更多的情况下是需要了解这种

Tell me please if more context is needed to understand this.

推荐答案

塞西 - 乌尔曼是真的只是一个调度算法,不是一个寄存器分配算法,所以它只是告诉你的序的在其中做的操作,以减少所需要的寄存器的数量;它不会告诉你的这的注册使用的地方。

Sethi-Ullman is really just a scheduling algorithm, not a register allocation algorithm, so it just tells you the order in which to do operations to minimize the number of registers needed; it does not tell you which registers to use where.

所以,你需要将它与寄存器分配策略相结合 - 通常只是一个贪婪的分配。再有就是,如果你用完了寄存器做什么的问题 - 你插入泄漏内联,或放弃做别的事情

So you need to combine it with a register allocation strategy -- usually just a greedy allocator. Then there's the question of what to do if you run out of registers -- do you insert spills inline, or abort and do something else?

为了做简单的贪心分配内嵌的安排和指令生成(你似乎与你简单的递归程序做),你需要跟踪哪些寄存器都在使用中,在任何给定的时间。最简单的方法是通过增加一个额外 IN_USE 参数的发电机功能:

In order to do simple greedy allocation inline with your scheduling and instruction generation (what you seem to be doing with your simple gen recursive procedure), you'll need to keep track of which registers are in use at any given time. The easiest way is by adding an extra in_use argument to your gen function:

typedef unsigned RegSet; /* using a simple bitmask for a set -- assuming that
                          * unsigned is big enough to have a bit per register */

void gen(AST *ast, RegSet in_use) {
    if(ast->left != 0 && ast->right != 0) {
        if (ast->left->n >= ast->right->n) {
            gen(ast->left, in_use);
            gen(ast->right, in_use | (1 << ast->left->reg));
        } else {
            gen(ast->right, in_use);
            gen(ast->left, in_use | (1 << ast->right->reg)); }
        ast->reg = ast->left->reg
        emit_operation(ast->type, ast->left->reg, ast->right->reg);
    } else if(ast->type == Type_id || ast->type == Type_number) {
        ast->reg = pick_unused_register(in_use);
        emit_load(ast);
    } else ....

请注意,你需要一个单独的递归通计算 N 为每个节点(塞西 - 乌尔曼本身需要两个遍历,第一次遍历计算自下而上的 N 价值,为第二遍历使用自上而下)。

Note that you need a separate recursive pass to calculate n for each node (Sethi-Ullman inherently requires two traversals, with the first traversal computing bottom up the n value for the second traversal to use top-down).

现在上面的code不处理用完的寄存器都没有。要做到这一点,你需要插入一些泄漏。一种方法是,以检测所有寄存器都在使用中使得递归调用和溢出然后之前,之后恢复

Now the above code doesn't deal with running out of registers at all. To do that, you need to insert some spills. One way is to detect that all registers are in use before making a recursive call and spill then, restoring afterwards:

void gen(AST *ast, RegSet in_use) {
    if(ast->left != 0 && ast->right != 0) {
        Reg spill = NoRegister; /* no spill yet */
        AST *do1st, *do2nd;     /* what order to generate children */
        if (ast->left->n >= ast->right->n) {
            do1st = ast->left;
            do2nd = ast->right;
        } else {
            do1st = ast->right;
            do2nd = ast->left; }
        gen(do1st, in_use);
        in_use |= 1 << do1st->reg;
        if (all_used(in_use)) {
            spill = pick_register_other_than(do1st->reg);
            in_use &= ~(1 << spill);
            emit_operation(PUSH, spill); }
        gen(do2nd, in_use);
        ast->reg = ast->left->reg
        emit_operation(ast->type, ast->left->reg, ast->right->reg);
        if (spill != NoRegister)
            emit_operation(POP, spill);
    } else ...

当然,这个结果是不是非常有效 - 它通常是更好地蔓延更快,更高填充,但只有当你知道你将要用完寄存器

Of course, this turns out to be not terribly efficient -- its usually better to spill sooner and refill later, but only when you know you're going to run out of registers.