井字极大极小算法不与4×4板工作极小、不与、算法、工作

2023-09-11 02:28:41 作者:㈠★段情ゞ

所以,我一直在做这个项目,现在在过去的3周。我设法得到了极大极小功能工作早在一个3×3板,但问题的开始,当我试图使用它的一个4×4板,即Java堆空间错误引起的。此后,随着阿尔法贝塔剪枝的帮助下,我已经成功地从aprox的搞垮极大极小函数内的所需最小最大呼叫数。 59000 16000 11000,最后以8000电话(这是假设初始极大极小要求董事会有一个插槽已填写)。现在的问题却是,该方法只是不断跑4×4场比赛。它只是一直自称不停止,没有错误,没有结果,什么都没有。从理论上讲,我看到它的样子,我的函数应该适用于任意的电路板尺寸,唯一的问题就是内存。现在,因为我已经减少了我的函数的内存贪婪很大,我希望它的工作。那么,它的3×3。但是,它没有为4×4。 功能做什么的简要说明: 该函数返回尺寸2含在所有可能的下一步行动以及预期将比分从这一举动获得最优惠的下一步行动的数组。该评分系统很简单。 +10的一个O胜利,-10为平局的X胜0。该功能当然是递归的。在它,你会发现一定的快捷方式,减少所需调用自身的次数。例如,如果是X的转弯,返回得分为-10(这是最好的得分X),然后退出循环,即停止观察其他从这种状态可能的举动。这里的code类国家:

So I've been working on this project for the past 3 weeks now. I managed to get the minimax function to work early on for a 3x3 board, however problems started arising when I tried using it for a 4x4 board, namely Java heap space errors. Since then, with the help of Alpha beta pruning, I've managed to bring down the number of required minimax calls within the minimax function from aprox. 59000 to 16000 to 11000 and then finally to 8000 calls(This is assuming an initial minimax call for a board with one slot already filled). The problem now however, is that the method just keeps running for 4x4 games. It simply keeps calling itself non stop, no errors, no result, nothing. Theoretically, the way I see it, my function should work for arbitrary board sizes, the only problem was memory. Now, since I've reduced the memory greed of my function greatly, I'd expected it to work. Well, it does for the 3x3. However, it doesn't for the 4x4. A brief explanation of what the function does: The function returns an array of size 2 containing the most favorable next move from amongst all the possible next moves along with the score expected to be achieved from that move. The scoring system is simple. +10 for an O win, -10 for an X win and 0 for a draw. The function is of course recursive. Within it you will find certain shortcuts that reduce the number of required calls to itself. For example if it's X's turn and the returned score is -10(which is the best possible score for X) then exit the loop, i.e. stop observing other potential moves from this state. Here's the code for class State:

private String [] state;    //Actual content of the board
private String turn;    //Whose turn it is
private static int n;   //Size of the board
public State(int n){
    state=new String[n*n];
    for(int i = 0; i<state.length; i++){
        state[i]="-";
    }
    State.n=n;
}


public int[] newminimax47(int z){
    int bestScore = (turn == "O") ? +9 : -9;    //X is minimizer, O is maximizer
    int bestPos=-1;
    int currentScore;
    int lastAdded=z;
    if(isGameOver()!="Not Gameover"){
        bestScore= score();
    }
    else{
        int i=0;
        for(int x:getAvailableMoves()){
            if(turn=="X"){  //X is minimizer
                setX(x);
                currentScore = newminimax47(x)[0];
                if(i==0){
                    bestScore = currentScore;
                    bestPos=x;
                    if(bestScore==-10)
                        break;
                }
                else if(currentScore<bestScore){
                    bestScore = currentScore;
                    bestPos=x;
                    if(bestScore==-10)
                        break;
                }
            }
            else if(turn=="O"){ //O is maximizer
                setO(x);
                currentScore = newminimax47(x)[0];
                if(i==0){
                    bestScore = currentScore;
                    bestPos=x;
                    if(bestScore==10)
                        break;
                }

                else if(currentScore>bestScore){
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore==10)
                        break;
                }
            }
            i++;
        }
    }
    revert(lastAdded);
    return new int [] {bestScore, bestPos};
}

使用newminimax47辅助功能():

Complementary functions used by newminimax47():

isGameOver():

isGameOver():

public String isGameOver(){
    if(n==3){
        //Rows 1 to 3
        if((state[0]!="-")&&(state[0]==state[1])&&(state[1]==state[2]))
            return (state[0]=="X") ? "X Won" : "O Won";
        else if((state[3]!="-")&&(state[3]==state[4])&&(state[4]==state[5]))
            return (state[3]=="X") ? "X Won" : "O Won";
        else if((state[6]!="-")&&(state[6]==state[7])&&(state[7]==state[8]))
            return (state[6]=="X") ? "X Won" : "O Won";

        //Columns 1 to 3
        else if((state[0]!="-")&&(state[0]==state[3])&&(state[3]==state[6]))
            return (state[0]=="X") ? "X Won" : "O Won";
        else if((state[1]!="-")&&(state[1]==state[4])&&(state[4]==state[7]))
            return (state[1]=="X") ? "X Won" : "O Won";
        else if((state[2]!="-")&&(state[2]==state[5])&&(state[5]==state[8]))
            return (state[2]=="X") ? "X Won" : "O Won";

        //Diagonals
        else if((state[0]!="-")&&(state[0]==state[4])&&(state[4]==state[8]))
            return (state[0]=="X") ? "X Won" : "O Won";
        else if((state[6]!="-")&&(state[6]==state[4])&&(state[4]==state[2]))
            return (state[6]=="X") ? "X Won" : "O Won";

        //Checking if draw
        else if((state[0]!="-")&&(state[1]!="-")&&(state[2]!="-")&&(state[3]!="-")&&
                (state[4]!="-")&&(state[5]!="-")&&(state[6]!="-")&&(state[7]!="-")&&
                (state[8]!="-"))
            return "Draw";
        else
            return "Not Gameover";
    }
    else {
        //Rows 1 to 4
        if((state[0]!="-")&&(state[0]==state[1])&&(state[1]==state[2])&&(state[2]==state[3]))
            return (state[0]=="X") ? "X Won" : "O Won";
        else if((state[4]!="-")&&(state[4]==state[5])&&(state[5]==state[6])&&(state[6]==state[7]))
            return (state[4]=="X") ? "X Won" : "O Won";
        else if((state[8]!="-")&&(state[8]==state[9])&&(state[9]==state[10])&&(state[10]==state[11]))
            return (state[8]=="X") ? "X Won" : "O Won";
        else if((state[12]!="-")&&(state[12]==state[13])&&(state[13]==state[14])&&(state[14]==state[15]))
            return (state[12]=="X") ? "X Won" : "O Won";

        //Columns 1 to 4
        else if((state[0]!="-")&&(state[0]==state[4])&&(state[4]==state[8])&&(state[8]==state[12]))
            return (state[0]=="X") ? "X Won" : "O Won";
        else if((state[1]!="-")&&(state[1]==state[5])&&(state[5]==state[9])&&(state[9]==state[13]))
            return (state[1]=="X") ? "X Won" : "O Won";
        else if((state[2]!="-")&&(state[2]==state[6])&&(state[6]==state[10])&&(state[10]==state[14]))
            return (state[2]=="X") ? "X Won" : "O Won";
        else if((state[3]!="-")&&(state[3]==state[7])&&(state[7]==state[11])&&(state[11]==state[15]))
            return (state[3]=="X") ? "X Won" : "O Won";

        //Diagonale
        else if((state[0]!="-")&&(state[0]==state[5])&&(state[5]==state[10])&&(state[10]==state[15]))
            return (state[0]=="X") ? "X Won" : "O Won";
        else if((state[12]!="-")&&(state[12]==state[9])&&(state[9]==state[6])&&(state[6]==state[3]))
            return (state[0]=="X") ? "X Won" : "O Won";

        //Pruefe ob Gleichstand
        else if((state[0]!="-")&&(state[1]!="-")&&(state[2]!="-")&&(state[3]!="-")&&
                (state[4]!="-")&&(state[5]!="-")&&(state[6]!="-")&&(state[7]!="-")&&
                (state[8]!="-")&&(state[9]!="-")&&(state[10]!="-")&&(state[11]!="-")&&
                (state[12]!="-")&&(state[13]!="-")&&(state[14]!="-")&&(state[15]!="-"))
            return "Draw";
        else
            return "Not Gameover";
    }   
}

请原谅isGameOver()方法的直率,它只是检查板的状态(即赢,抽奖,游戏没有过)

Please excuse the bluntness of the isGameOver() method, it merely checks the state of the board( i.e. Win, Draw, Game not Over)

在getAvailableMoves()方法:

The getAvailableMoves() method:

public int[] getAvailableMoves(){
    int count=0;
    int i=0;
    for(int j=0; j<state.length; j++){
        if(state[j]=="-")
            count++;
    }
    int [] availableSlots = new int[count];
    for(int j=0; j<state.length; j++){
        if(state[j]=="-")
            availableSlots[i++]=j;      
    }
    return availableSlots;
}

此方法只返回一个int数组所有可用的下一步行动(关于当前状态)或返回空数组,如果没有移动的可用或游戏就结束了。

This method merely returns an int array with all available next moves(regarding the current state) or returns empty array if no moves are available or if game is over.

评分()方法:

public int score(){
    if(isGameOver()=="X Won")
        return -10;
    else if(isGameOver()=="O Won")
        return +10;
    else 
        return 0;
}

濑(),setX的()和恢复():

setO(), setX() and revert():

public void setX(int i){
    state[i]="X";
    turn="O";
    lastAdded=i;
}
public void setO(int i){
    state[i]="O";
    turn="X";
    lastAdded=i;
}
public void revert(int i){
    state[i]="-";
    if(turn=="X")
        turn = "O";
    else
        turn = "X";
}

我的主要方法如下所示为一个3x3的游戏:

My main method looks like this for a 3x3 game:

public static void main(String args[]){
    State s=new State(3);
    int [] ScoreAndRecommendedMove=new int[2];
    s.setX(8);
    ScoreAndRecommendedMove=s.newminimax47(8);
    System.out.println("Score: "+ScoreAndRecommendedMove[0]+" Position: "+ ScoreAndRecommendedMove[1]);
}

在这个游戏中,X已经开始了游戏与移动在位置8在这种情况下,该方法将返回

In this game, X has started the game with a move at position 8. The method in this case will return

Score: 0 Position: 4  

含义是O公司最有前途的举动是在第4位,在最坏的情况下,将产生0(即平局)的分数。

Meaning that O's most promising move is at position 4 and in the worst case will yield a score of 0(i.e a draw).

下面的图片是为了给如何newminimax47()的作品的想法。在这种情况下,我们开始状态(板)给出的数字1。注意:该数字表示precedence创作的视状态。 1 2之前创建的,2 3之前创建的,3 4之前创建等。

The following image is meant to give an idea of how newminimax47() works. In this case our starting state(board) is given the number 1. Note: The numbers indicate the precedence in creation of the regarded states. 1 was created before 2, 2 was created before 3, 3 was created before 4 and so on.

在这种情况下,得分和位置最终返回到状态1将是

In this scenario, the score and position eventually returned to state 1 will be

Score: 0 Position: 6

从国家8的到来。

coming from state 8.

注:code,你看到的是实际状态类的只是片段。对自己的这些片段应该让你重新创建,并与中,newminimax47功能玩耍没有问题(至少3×3)。任何错误,你可能会发现不是真正的错误,他们只是不包括在我这里复制的片断和code应该没有他们。在濑户和setX的功能,例如lastAdded变量不包括在该片段在这里,但我只是意识到你不需要它能够用极小功能工作,所以你可以评论了这一点。

Note: The code you see is just snippets of the actual State class. These snippets on their own should allow you to recreate, and play around with, the newminimax47 function without problems(at least for 3x3). Any bugs you may find are not really bugs, they were simply not included in the snippets I copied here and the code should work without them. The lastAdded variable in the setO and setX functions for example is not included in the snippets here but I just realized you don't need it to be able to work with the minimax function, so you can just comment that out.

推荐答案

我研究了一下你的code和有相当多的说

I played around with your code and there is quite a bit to say

错误

首先有一个错误。我不认为你的code实际工作了3×3板。问题是你恢复添加到板上移动的位置。你这样做的 newminimax47 方法的末尾只出现一次,即使在该方法中添加移动到电路板内的循环。这意味着,调用该方法不仅计算的东西,但也改变了电路板的状态,以及code中的其余部分预计它不是。

first of all there is a bug. I don't think your code actually works for a 3x3 board. The problem is the location you revert the move you add to the board. You do this at the end of the newminimax47 method exactly once, even though in the method you add moves to the board inside a for loop. This means that calling the method does not only compute something but also changes the board state, and the rest of the code expects it not to.

所以删除恢复地方是现在和还原只要你可以:

So remove the revert where it is now and in revert as soon as you can:

setX(x);                                                                                                                                                                                                                                             
currentScore = newminimax47(x)[0];                           
revert(x);       

这也意味着你不需要 lastAdded 变量。

this also implies you don't need the lastAdded variable.

播放

这是一个更容易地看到发生了什么,如果你真的对自己的算法玩。添加一个方法到你的状态类

It's a lot easier to see what is happening if you actually play against your own algorithm. Add a method to your State class

public void dump() {                                                        
    for (int y = 0; y < n; y++) {                                           
        for (int x = 0; x < n; x++) {                                       
            System.out.print(state[y * n + x]);                             
        }                                                                   
        System.out.println();                                               
    }                                                                       
}

在你主像

public void play() {                                                        
    State s=new State(3);                                                   
    Scanner in = new Scanner (System.in);                                   
    while (s.isGameOver().equals("Not Gameover")) {                         
        int[] options = s.getAvailableMoves();                              
        s.dump();                                                           
        System.out.println ("Your options are " + Arrays.toString(options));
        int move = in.nextInt();                                            
        s.setX(move);                                                       
        int [] ScoreAndRecommendedMove=new int[2];                          
        ScoreAndRecommendedMove=s.newminimax47(0);                           
        System.out.println("Score: "+ScoreAndRecommendedMove[0]+" Position: "+ ScoreAndRecommendedMove[1]);
        s.setO(ScoreAndRecommendedMove[1]);                                 
    }                                                                       
    s.dump();                                                               
}

和你可以真正反对它玩。在一个3×3板这工作得很好,我。不幸的是我估计计算一个4x4的第一个举动把我的电脑约48小时。

and you can actually play against it. On a 3x3 board this works just fine for me. Unfortunately I estimated computing the first move on a 4x4 takes my computer approximately 48 hours.

数据类型

您选择的数据类型往往是有点...奇怪。如果你想记住一个字符,使用字符而不是字符串。如果你想返回一个是/否的决定,尽量使用布尔。也有程序的某些部分,可以通过更少的code做同样被取代。但是,这是不是你的问题,所以上...

Your choice of data types is often a bit... strange. If you want to remember a single character, use char instead of String. If you want to return a yes/no decision, try use boolean. There are also some parts of the program that could be replaces by less code doing the same. But that wasn't your question, so on to...

算法

好了,这有什么错极小来解决这个问题? 假设前四个动作是X5,O8,X6 O7。另一种可能性是先从X5,O7,X6,O8游戏。然而,另外一个是X6,O7,X 5,O8。最后还有X6,O8,X5,O7。

Ok, so what's wrong with minimax to solve this problem? Suppose the first four moves are X5, O8, X6 O7. Another possibility is to start the game with X5, O7, X6, O8. Yet another one is X6, O7, X5, O8. And finally there's X6, O8, X5, O7.

所有四个这些可能性的前四个动作游戏引到完全相同的游戏状态。但极小不会承认它们是相同的(基本上没有并联支路无记忆),所以它会计算它们全部四个。和次数,如果你搜索更深层次的每块板的状态计算将迅速增加。

All four of those possibilities for the first four moves of the game lead to the exact same gamestate. But minimax will not recognise they're the same (basically there is no memory of parallel branches) so it will compute them all four. And the number of times each board state is computed will increase rapidly if you search deeper.

可能的游戏人数大幅租税可能的板状态的数量。为了估计游戏的数量:在第一有16种可能的移动,然后在15,然后14,13,...等。粗略的估计是16!虽然极小不必计算所有的人,因为很多人会16日移动之前已经完成。

The number of possible games vastly outnumbers the number of possible board states. To estimate the number of games: at first there are 16 possible moves, then 15, then 14, 13, ... and so on. A rough estimation is 16!, although minimax won't have to compute all of them, because many of them will have finished before the 16th move.

这是估计游戏状态数为:董事会上的每平方可以是空的,或X或O。所以,这3 ^ 16板。不是所有的人实际上是有效的董事会,因为两个X的主板​​上的数量最多只能多一个,然后O数量,但仍然是接近3 ^ 16。

An estimation for the number of game states is: every square on the board can be either empty, or an X, or an O. So that's 3^16 boards. Not all of them are actually valid boards, because the number of Xs on the board can be at most one more then the number of Os, but still it's close to 3^16.

16!可能的游戏是大约50万次以上,然后3 ^ 16的电路板的状态。这意味着我们大约计算每块板半畅想倍的只有一次代替。

16! possible games is about half a million times more then 3^16 possible board states. Which means we're approximately computing every board half a milion times in stead of just once.

解决的办法是先记住每一个游戏状态,你计算。每次递归函数被调用首先检查,如果你已经知道答案,如果是刚刚返回旧的答案。这是一种被称为记忆化。

The solution is to start remembering every gamestate you compute. Every time the recursive function is called first check if you already know the answer, and if so just return the old answer. It's a technique called memoization.

记忆化

我将描述如何在使用数据结构,你已经选择(尽管我不同意他们的观点),以增加记忆化。要做到记忆化,你需要的,你可以做快速的集合添加和快速查找。列表(例如的ArrayList )将不会对我们有什么好的。它的快速增加值,但做一个查找是在长的列表非常缓慢。有一些选项,但一个最简单的使用的HashMap 。为了使用的HashMap 你需要创建的东西,重新presents你的状态,你可以作为一个按键使用。最直接的是,只是做一个字符串与所有的X / O / - 即重新present您的主板在它的符号

I'll describe how to add memoization while using the data structures you already choose (even though I don't agree with them). To do memoization you need a collection on which you can do quick adds and quick lookups. A list (for example ArrayList) won't do us any good. It's fast to add values but to do a lookup is very slow in long lists. There are some options but the easiest one to use is HashMap. In order to use HashMap you need to create something that represents your state and you can use as a key. The most straight-forward is to just make a String with all the X/O/- symbols in it that represent your board.

所以添加

Map<String,int[]> oldAnswers = new HashMap<String,int[]>();                                                                  

你的国家对象。

然后,在你的 newminimax47 方法开始创建重新presents国家String和检查,如果我们已经知道答案:

Then at the start of your newminimax47 method create the String that represents the State and check if we already know the answer:

    String stateString = "";                                                
    for (String field : state) stateString += field;                        
    int[] oldAnswer = oldAnswers.get(stateString);                          
    if (oldAnswer != null) return oldAnswer;

最后,当你计算出一个新的答案 newminimax47 结束时,你不仅要归还,而且其存储在图:

Finally when you compute a new answer the end of newminimax47 you should not only return it, but also store it in the map:

    int[] answer = {bestScore, bestPos};                                    
    oldAnswers.put (stateString, answer);                                   
    return answer;

使用记忆化的地方,我能对你的code打一个4x4的游戏。第一个举措依然很慢(20秒),但在那之后的一切计算和它的速度非常快。如果您想进一步加快步伐,你可以看看α+β剪枝。但改善不会靠近多达记忆化。另一种选择是使用更有效的数据类型。它不会降低你的算法的理论秩序,但仍然可以很容易地使之加快5倍。

With memoization in place I was able to play a 4x4 game against your code. The first move is still slow (20 seconds) but after that everything is computed and it's very fast. If you want to speed it up further, you could look into alpha beta pruning. But the improvement won't be near as much as memoization. Another option is using more efficient data types. It won't reduce the theoretical order of your algorithm, but could still easily make it 5 times faster.