Planning problem modeling & Search solver



Model a problem scenario by addressing the information by regarding the world as a state, the current entity state, the entity goals, and action outcome. After modeling the scenario as a planning problem, it is possible to solve it with different algorithms to find the best strategy that finds a solution.

A general approach to modeling a planning problem

  • State – represents the world in a specific time
  • Action – a move between states
  • Action outcome – how much action costs
  • Initial state – how the world looks at first
  • Goal state – the desired state
  • Solution – a path between initial to a goal state
  • Finding the solution: an AI algorithm that finds the best solution

The assumption regarding the Planning model

  • Environment
    • Static / Dynamic – is the environment changes at any given time
  • Perception
    • Fully / partial observable – is the machine that can observe the entire environment at any given time or only parts of it
    • Perfect / noisy information – if the observed data is perfect or not
  • Actions
    • Deterministic / stochastic – what is the possibility of the action to take place
    • Discrete / continue outcome – is there an outcome (reward) for an action, how it is calculated
planning Assumption
planning Assumption

Example of a planning problem in the cyber world: the automated pen-testing:

For penetration testing reasons, we want to model how a hacker will attack a network, by finding the best attack on the network we can appoint the security teams to put extra mitigation on this path. The automated pen-testing example can be modeled easily as a planning problem based on Simulated Penetration Testing: From “Dijkstra” to “Turing Test++” [1]

  • State: computers accessed
  • Action: an exploit performed from one computer to another
  • Outcome: if the exploit succeeds the new state has another computer
  • Initial state: no computers compromised
  • Goal state: the most sensitive computer
  • Solution: what is the best way for an attacker to attack the network


POMDP solving for auto pentesting
POMDP solving for auto pen-testing

There are many possible ways to state assumptions regarding this problem. Are the network configuration and connections static or dynamic and change all the time? Does the hacker know what the entire network looks like or only for those computers he has hacked? Is an attack always successful? How much does an attack cost? What much each usage of exploit cost?

All these questions must be addressed as part of the assumption building; on one hand, we want to model the problem as closely as possible from reality; but on the other hand, make the problem simpler (=relaxing it) might also end up with a more practical and compatible solution. In the next AI topics, we will explain how to solve this problem after we model it as a planning problem.

Everything You Always Wanted to Know About Planning (But Were Afraid to Ask)-Jorge Hoffmann



The machine examines a large space of possible solutions and finds the best one, the researchers are trying to find new and more efficient algorithms to search this space. For each search algorithm developed, the researcher must  answer the following characteristics:

  • Complete – the algorithm returns a result
  • Optimal – the found solution is the best one
  • Space complexity – the maximum possible state the algorithm needs to store in memory
  • Runtime complexity – the maximum time (=count of actions) the algorithm will run

Types of search algorithms:

  • Uniform cost search – the algorithm uses the level of search (Dijkstra)
  • Informed search – the algorithm uses heuristic function along with the level (A*)

In the automated pen-testing example, if we state our assumption very loosely in such a way that the environment is static, fully observable, and has perfect information and that the outcome is deterministic and discrete, with a very naïve and easy to solve model, we could solve the problem using a search algorithm. If we could calculate a heuristic of how much a computer is “far” from the goal, we could use an informed search.

  1. “Dijkstra” to “Turing Test++”, ICAPS, International Conference on Automated Planning and Scheduling, 2015, Jorg Hoffmann


Leave a Reply

Your email address will not be published. Required fields are marked *