Problem-Solving Agents
A problem can be defined formally by five components:
- The initial state that the agent starts in.
- A description of the possible actions available to the agent. Given a particular state , returns the set of actions that can be executed in .
- A description of what each action does; the formal name for this is the transition model, specified by a function that returns the state that results from doing action in state .
Note
The initial state, actions, and transition model implicitly define the state space of the problem. The state space forms a directed network or graph in which the nodes are states and the links between nodes are actions. A path in the state space is a sequence of states connected by a sequence of actions.
- The goal test, which determines whether a given state is a goal state.
- A path cost function that assigns a numeric cost to each path. The step cost of taking action in state to reach state is denoted by .
Example Problems
Toy problems
- vacuum world
- 8-puzzle
Real-world problems
- Touring problems
- Traveling salesperson problem (TSP)
Searching for Solutions
Infrastructure for searching algorithms
Search algorithms require a date structure to keep track of the search tree that is being constructed. For each node of the tree, we have a structure that contains four components:
- : the state in the state space to which the node corresponds
- : the node in the search tree that generated this node;
- : the action that was applied to the parent to generate the node
- : the cost, traditionally denoted by , of the path from the initial state to the node
Measuring problem-solving performance
- Completeness: Is the algorithm guaranteed to fina a solution when there is one?
- Optimality: Can the algorithm find the solution that has the lowest path cost among all solutions?
- Time complexity
- Space complexity
Uninformed Search Strategies
Breadth-first Search (BFS)
Depth-first Search (DFS)
Uniform-cost search
Uniform-cost search expands the node with the lowest path cost . This is done by storing the frontier as a priority queue ordered by .
If , then in this case BFS and uniform-cost search are identical.
Depth-limited search
Nodes at depth are treated as if they have no successors.
Iterative deepening depth-first search (IDS)
For from to , conduct depth limited search with the depth limit .
In general, iterative deepening is the preferred uninformed search method when the search space is large and the depth of the solution is not known.
Bidirectional search
The idea behind bidirectional search is to run two simultaneous searches—one forward from the initial state and the other backward from the goal—hoping that the two searches meet in the middle.
The motivation is that is much less than .
Bidirectional search is implemented by replacing the goal test with a check to see whether the frontiers of the two searches intersect; if they do, a solution has been found.
Informed (Heuristic) Search Strategies
Greedy best-first search
Greedy best-first search tries to expand the node that is closest to the goal, on the grounds that this is likely to lead to a solution quickly. Thus, it evaluates nodes by using just the heuristic function .
- Incomplete: can stick into a dead end, thus never finding the solution
- Not optimal
A-star search
The most widely know form of best-first search is called A* search. It evaluates node by combing , the cost to reach the node, and , the cost to get from the node to the goal.
Thus, if we are trying to find the cheapest solution, a reasonable thing to try first is the node with the lowest value of . It turns out that this strategy is more than just reasonable: provided that the heuristic function satisfies certain conditions, A* search is both complete and optimal.
Tip
The algorithm is identical to uniform-cost-search except that A* uses instead of .
Conditions for optimality
Admissibility The first condition we require for optimality is that be an admissible heuristic. An admissible heuristic is one that never overestimates the cost to reach the goal. In this case, never overestimate the true cost of a solution along the current path through .
Admissible heuristics should think the cost of solving the problem is less that it actually is. An obvious example of an admissible heuristic is the straight-line distance.
Consistency (monotonicity)
Note
This second condition is required only for applications of A* to graph search
A heuristic is consistent if, for every node and every successor of generated by any action , the estimated cost of reaching the goal from is no greater than the step cost of getting to plus the estimated cost of reaching the goal from :
Consistence is a stronger condition than admissibility
An consistent heuristic must be admissible: if there were a route form to via that was cheaper than , that would violate the property that is a lower bound on the cost to reach
Optimality of A*
The tree-search version of A* is optimal if is admissible, while the graph-search version is optimal if is consistent.
Drawbacks of A* Because it keeps all generated nodes in memory (as do all GRAPH-SEARCH algorithms), A* usually runs out of space long before it runs out of time. For this reason, A* is not practical for many large-scale problems.
Memory-bounded heuristic search
The simplest way to reduce memory requirements for A* is to adapt the idea of iterative deepening to the heuristic search context, resulting in the iterative-deepening A (IDA) algorithm**. The main difference between IDA* and standard iterative deepening is that the cutoff used is the -cost () rather than the depth; at each iteration, the cutoff value is the smallest -cost of any node that exceeded the cutoff on the previous iteration.
Recursive best-first search (RBFS) is a memory-efficient variant of A* that uses a best-first search strategy recursively. It maintains an f-cost limit and explores the most promising node (lowest f-cost) while keeping track of the best alternative f-cost for backtracking. When the f-cost of a node exceeds the limit, it backtracks to explore other options, updating the limit dynamically based on the best alternative paths.
import heapq
def ida_star(initial_state, goal_state, heuristic_function, get_neighbors):
"""
Iterative Deepening A* (IDA*) algorithm.
Args:
initial_state: The starting state.
goal_state: The target state.
heuristic_function: A function that takes a state and returns the estimated cost to reach the goal.
get_neighbors: A function that takes a state and returns a list of (neighbor_state, cost) tuples.
Returns:
A list representing the path to the goal state, or None if no path is found.
"""
def search(current_state, g, cost_bound, path):
f = g + heuristic_function(current_state)
if f > cost_bound:
return f, None
if current_state == goal_state:
return -1, path
min_next_cost = float('inf')
for neighbor, cost in get_neighbors(current_state):
if neighbor not in path: # Avoid cycles
next_path = path + [neighbor]
result_cost, result_path = search(neighbor, g + cost, cost_bound, next_path)
if result_cost == -1:
return -1, result_path
if result_cost > 0:
min_next_cost = min(min_next_cost, result_cost)
return min_next_cost, None
cost_bound = heuristic_function(initial_state)
path = [initial_state]
while True:
result_cost, result_path = search(initial_state, 0, cost_bound, path)
if result_cost == -1:
return result_path
if result_cost == float('inf'):
return None # No solution found
cost_bound = result_cost
def rbfs(initial_state, goal_state, heuristic_function, get_neighbors):
"""
Recursive Best-First Search (RBFS) algorithm.
Args:
initial_state: The starting state.
goal_state: The target state.
heuristic_function: A function that takes a state and returns the estimated cost to reach the goal.
get_neighbors: A function that takes a state and returns a list of (neighbor_state, cost) tuples.
Returns:
A list representing the path to the goal state, or None if no path is found.
"""
def search(current_state, g, f_limit):
f = g + heuristic_function(current_state)
if f > f_limit:
return None, f
if current_state == goal_state:
return [current_state], f
successors = []
for neighbor, cost in get_neighbors(current_state):
successors.append((neighbor, g + cost + heuristic_function(neighbor)))
if not successors:
return None, float('inf')
while True:
successors.sort(key=lambda x: x[1])
best_successor, best_f = successors[0]
if best_f > f_limit:
return None, best_f
if len(successors) > 1:
alternative_f = successors[1][1]
else:
alternative_f = float('inf')
result, new_f_limit = search(best_successor, g + get_cost(current_state, best_successor, get_neighbors), min(f_limit, alternative_f))
if result is not None:
return [current_state] + result, new_f_limit
successors.pop(0)
if not successors:
return None, float('inf')
def get_cost(state1, state2, get_neighbors_func):
"""Helper function to get the cost between two states."""
for neighbor, cost in get_neighbors_func(state1):
if neighbor == state2:
return cost
return 1 # Default cost if not explicitly defined
result_path, _ = search(initial_state, 0, float('inf'))
return result_path
if __name__ == '__main__':
# Example Usage (8-Puzzle)
def misplaced_tiles_heuristic(state):
goal = (1, 2, 3, 8, 0, 4, 7, 6, 5)
misplaced = 0
for i in range(9):
if state[i] != 0 and state[i] != goal[i]:
misplaced += 1
return misplaced
def get_puzzle_neighbors(state):
neighbors = []
empty_index = state.index(0)
row, col = divmod(empty_index, 3)
moves = [(0, 1), (0, -1), (1, 0), (-1, 0)] # Right, Left, Down, Up
for dr, dc in moves:
new_row, new_col = row + dr, col + dc
if 0 <= new_row < 3 and 0 <= new_col < 3:
new_index = new_row * 3 + new_col
new_state_list = list(state)
new_state_list[empty_index], new_state_list[new_index] = new_state_list[new_index], new_state_list[empty_index]
neighbors.append((tuple(new_state_list), 1)) # Cost is always 1 for a move
return neighbors
initial_puzzle = (1, 2, 3, 8, 0, 4, 7, 6, 5)
goal_puzzle = (1, 2, 3, 8, 0, 4, 7, 6, 5) # Already at goal
initial_puzzle_harder = (2, 8, 3, 1, 6, 4, 7, 0, 5)
goal_puzzle_harder = (1, 2, 3, 8, 0, 4, 7, 6, 5)
print("IDA* Example:")
path_ida_star = ida_star(initial_puzzle_harder, goal_puzzle_harder, misplaced_tiles_heuristic, get_puzzle_neighbors)
if path_ida_star:
print("Path found (IDA*):")
for state in path_ida_star:
print(state)
else:
print("No path found (IDA*)")
print("\nRBFS Example:")
path_rbfs = rbfs(initial_puzzle_harder, goal_puzzle_harder, misplaced_tiles_heuristic, get_puzzle_neighbors)
if path_rbfs:
print("Path found (RBFS):")
for state in path_rbfs:
print(state)
else:
print("No path found (RBFS)")