3D Tic Tac Toe with Minimax & Alpha-Beta pruning chooses suboptimal moves

I'm trying to implement Minimax with Alpha-Beta pruning for a 3D Tic Tac Toe game. However, the algorithm seems to choose suboptimal paths.

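For example, you can win simply by running straight through the middle of the cube or across a single board. The AI seems to pick the cells that would be optimal on the next turn rather than on the current one.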
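For readers unfamiliar with the 3D variant, the winning lines mentioned above can be enumerated with a short sketch. It assumes a 3x3x3 cube addressed by (x, y, z) coordinates; the helper name `winning_lines` is hypothetical and is not taken from the linked code.

```python
from itertools import product


def winning_lines(n=3):
    """Return every straight line of n cells in an n x n x n cube.

    Hypothetical helper for illustration; cells are (x, y, z) tuples.
    """
    lines = set()
    # All 26 non-zero step directions (axis-aligned, face and space diagonals).
    directions = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    for start in product(range(n), repeat=3):
        for d in directions:
            cells = [tuple(s + k * step for s, step in zip(start, d))
                     for k in range(n)]
            # Keep the line only if every cell stays inside the cube.
            if all(0 <= c < n for cell in cells for c in cell):
                # frozenset removes the duplicate found from the opposite end.
                lines.add(frozenset(cells))
    return lines


lines = winning_lines()
centre = (1, 1, 1)
through_centre = [line for line in lines if centre in line]
print(len(lines), len(through_centre))  # 49 13
```

Of the 49 lines, 13 pass through the centre cell, which is why lines through the middle of the cube are such an easy way to win.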

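I've tried rewriting the search and experimenting with the heuristic the algorithm returns, but I haven't made much progress. It seems to have the same problem regardless of the ply (search depth).

The code is here.

The relevant parts are computers_move and think_ahead (and the '2' variants, which are just me experimenting with a slightly different approach).

I'm hoping it's something simple that I've overlooked, but as far as I can tell I don't know what the problem is. If anyone could shed some light on this, I would greatly appreciate it.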

def computers_move2(self):
    best_score = -1000
    best_move = None
    h = None
    win = False

    for move in self.allowed_moves:
        self.move(move, self.ai)
        if self.complete:
            win = True
            break
        else:
            h = self.think_ahead2(self.human, -1000, 1000)
        self.depth_count = 0
        if h >= best_score:
            best_score = h
            best_move = move
            self.undo_move(move)
        else:
            self.undo_move(move)

    if not win:
        self.move(best_move, self.ai)
    self.human_turn = True

def think_ahead2(self, player, a, b):
    if self.depth_count <= self.difficulty:
        self.depth_count += 1
        if player == self.ai:
            h = None
            for move in self.allowed_moves:
                self.move(move, player)
                if self.complete:
                    self.undo_move(move)
                    return 1000
                else:
                    h = self.think_ahead2(self.human, a, b)
                    if h > a:
                        a = h
                        self.undo_move(move)
                    else:
                        self.undo_move(move)
                if a >= b:
                    break
            return a
        else:
            h = None
            for move in self.allowed_moves:
                self.move(move, player)
                if self.complete:
                    self.undo_move(move)
                    return -1000
                else:
                    h = self.think_ahead2(self.ai, a, b)
                    if h < b:
                        b = h
                        self.undo_move(move)
                    else:
                        self.undo_move(move)
                if a >= b:
                    break
            return b
    else:
        diff = self.check_available(self.ai) - self.check_available(self.human)
        return diff

Source

2016-03-01 Battleroid

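It turns out my algorithm seems to work fine. The problem was caused by my helper functions move and undo_move, and the root cause was my set of allowed moves.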

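I noticed that while exploring the tree, the number of moves in the outermost loop of computer_plays dropped sharply. During the first sweep, the number of allowed moves per turn for both the computer and the human player would fall from a total of 27 to 20, then to 10, and finally to 5.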
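A shrinkage like this can be caught with a simple invariant check: the allowed moves should be identical before and after a search. The sketch below is only an illustration under stated assumptions; `TinyBoard`, `explore`, `moves_lost_by_search` and the `restore_on_undo` flag are hypothetical names that simulate an undo which fails to put a move back, not the code from the question.

```python
class TinyBoard:
    """Hypothetical stand-in for the game class; it only tracks allowed moves."""

    def __init__(self, cells, restore_on_undo=True):
        self.allowed_moves = set(range(cells))
        # False simulates an undo_move that forgets to restore the move.
        self.restore_on_undo = restore_on_undo

    def move(self, m):
        self.allowed_moves.discard(m)

    def undo_move(self, m):
        if self.restore_on_undo:
            self.allowed_moves.add(m)


def explore(board, depth):
    """Try every move sequence up to `depth` plies, undoing each move."""
    if depth == 0:
        return
    # Iterate over a sorted snapshot so mutation cannot disturb the loop.
    for m in sorted(board.allowed_moves):
        board.move(m)
        explore(board, depth - 1)
        board.undo_move(m)


def moves_lost_by_search(board, depth):
    """Return the moves that were allowed before the search but not after."""
    before = set(board.allowed_moves)
    explore(board, depth)
    return before - board.allowed_moves


print(len(moves_lost_by_search(TinyBoard(27), 2)))                         # 0
print(len(moves_lost_by_search(TinyBoard(27, restore_on_undo=False), 2)))  # 27
```

Running a check of this kind after each top-level move would flag the problem as soon as the search leaves the board in a different state than it found it.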

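It turned out that the moves being temporarily tested were not being put back. So I swapped the set for a standard list and now sort the list after each move/undo, which solved my problem altogether.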
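A minimal sketch of that fix, assuming cells are indexed 0 to 26 and that move and undo_move own the bookkeeping of allowed_moves. The `Board` class and `count_leaves` function are hypothetical; iterating over a copy of the list is an extra precaution of this sketch and is not something the original post mentions.

```python
class Board:
    """Hypothetical minimal board: 27 cells indexed 0..26."""

    def __init__(self, size=27):
        self.cells = [None] * size
        self.allowed_moves = list(range(size))  # kept in sorted order

    def move(self, position, player):
        self.cells[position] = player
        self.allowed_moves.remove(position)
        self.allowed_moves.sort()  # mirrors "sort after each move/undo"

    def undo_move(self, position):
        self.cells[position] = None
        self.allowed_moves.append(position)
        self.allowed_moves.sort()  # the move returns to its original slot


def count_leaves(board, player, depth):
    """Walk the move tree to `depth` plies and count the leaf positions."""
    if depth == 0 or not board.allowed_moves:
        return 1
    other = "O" if player == "X" else "X"
    total = 0
    # Loop over a copy so that move/undo_move cannot disturb the iteration.
    for position in list(board.allowed_moves):
        board.move(position, player)
        total += count_leaves(board, other, depth - 1)
        board.undo_move(position)
    return total


board = Board()
leaves = count_leaves(board, "X", 2)
print(leaves, board.allowed_moves == list(range(27)))  # 702 True
```

Two plies from an empty board give 27 * 26 = 702 leaves, and the allowed moves come back unchanged, which is the behaviour the search relies on.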

Source

2016-03-02 04:30:13 Battleroid
