designates my notes. / designates important. / designates very important.
Chapter 2 talks about neurology. This is used as an analogy for how signals propagate within the body and how they can propagate within your game's AI. Interesting to note that the speed of signals within the body - your reflexes - is actually WAY slower than you might think (compared to, for example, internet speeds).
The book touches on many topics but doesn't go deeply into any. That isn't a bad thing. It is actually a good introduction to many AI concepts and can act as a sort of idea catalyst, giving you a jumping-off point for deciding which AI branch you want to learn more about (or which AI might work in your game).
Some of the info is a little dated - they talk about hundreds of pipes in a GPU as a big deal - but nothing bad and nothing that makes the concepts any less relevant.
Part 1 - General Wisdom
Part 2 - Architecture
Part 3 - Movement and Pathfinding
Part 4 - Strategy and Tactics
Part 5 - Agent Awareness and Knowledge Representation
Part 6 - Racing
Part 7 - Odds and Ends
Brian Kernighan, codeveloper of Unix and the C programming language, is believed to have said, “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it”
Apparently the scenario AI for the original Warcraft would simply wait a fixed amount of time, and then start spawning a wave of units to attack you. It would spawn these units at the edge of the gray fog—which is to say, just outside of the visibility range of your units. It would continue spawning units until your defenses were nearly overwhelmed—and then it would stop, leaving you to mop up those units that remained and allowing you to win the fight.
This approach seems to cross the line fairly cleanly into the realm of “cheating.” The AI doesn’t have to worry about building buildings, or saving money, or recruiting units—it just spawns whatever it needs. On the other hand, think about the experience that results. No matter how good or bad you are at the game, it will create an epic battle for you — one which pushes you to the absolute limit of your ability, but one in which you will ultimately, against all odds, be victorious.
An illustration of action potential propagation through nerve fibers over time. The resulting swing in repolarization and the delay in membrane channels reopening prevent previously excited sites from firing again.
Schmitt trigger and hysteresis
$$y = \begin{cases} 1 & \text{if } v_i \cdot v_w > 0 \\ 0 & \text{otherwise} \end{cases}$$
Rosenblatt’s perceptron has two phases, a learning pass and a prediction pass. In the learning pass, a definition of $v_i$ is passed in as well as a desired output, $y_{ideal}$ , normalized to a range of $0 \dots 1$. On each presentation of $v_i$ and $y_{ideal}$ a learning rule is applied in a similar form to Hebb’s.
Here, each weight is adjusted by the difference between the overall output and the expected output $y_{ideal}$, multiplied by the contributing input and the learning rate.
$$\omega_{i,t+1} = \omega_{i,t} + \mu(y_{ideal} - y)\,i_i$$
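As a minimal sketch of this update rule (my own illustration, not the book's code; `predict` and `learn` are assumed names, with the threshold activation from the equation above):

```cpp
#include <vector>

// Threshold activation: output 1 if the weighted sum of the inputs is positive.
float predict(const std::vector<float>& weights, const std::vector<float>& inputs)
{
    float sum = 0.0f;
    for (size_t i = 0; i < weights.size(); ++i)
        sum += weights[i] * inputs[i];
    return sum > 0.0f ? 1.0f : 0.0f;
}

// One learning presentation: nudge each weight by
// learningRate * (yIdeal - y) * input, as in the rule above.
void learn(std::vector<float>& weights, const std::vector<float>& inputs,
           float yIdeal, float learningRate)
{
    const float y = predict(weights, inputs);
    for (size_t i = 0; i < weights.size(); ++i)
        weights[i] += learningRate * (yIdeal - y) * inputs[i];
}
```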
For example, here is a utility equation from a recent talk [Mark 10]:
$$coverChance = 0.2 + reloadNeed \times 1.0 + healNeed \times 1.5 + threatRating \times 1.3$$
$$coverChance = 1 \times w_0 + reloadNeed \times w_1 + healNeed \times w_2 + threatRating \times w_3$$
If we present the player with a number of scenarios and ask them whether they would take cover, we can easily gather values that, after enough questions, remain easy to interpret and therefore let us make better initial guesses at the equation we might want driving our AI.
A final nice property is that we don't necessarily need to answer the false side of the equation if we don't want to; we could just supply random values to represent false. Imagine that each time the player goes into cover we measure these values and record true, following it with random values that record false. Even if we happen to get a lucky random true cover condition, over time it will just represent noise. If we then clamp all weights to sensible ranges, potentially found by training in test circumstances, we now have a quick-to-compute runtime predictor of when the player may think of going into cover!
Normal distributions (also known as Gaussian distributions or bell curves)
There is randomness in these distributions, but they are not uniformly random. For example, the chance of a man growing to be 6 feet tall is not the same as the chance of him growing to a final height of 5 feet tall or 7 feet tall. If the chance were the same, then the distribution would be uniformly random.
Some random things in life do show a uniform distribution, such as the chance of giving birth to a boy or a girl. However, the large majority of distributions in life are closer to a normal distribution than a uniform distribution. But why?
The answer is quite simple and is explained by the central limit theorem. Basically, when many random variables are added together, the resulting sum will follow a normal distribution. This can be seen when you roll three 6-sided dice. While there is a uniform chance of a single die landing on any face, the chance of rolling three dice and their sum equaling the maximum of 18 is not uniform with regard to other outcomes. For example, the odds of three dice adding up to 18 is 0.5% while the odds of three dice adding up to 10 is 12.5%. Figure 3.1 shows that the sum of rolling three dice actually follows a normal distribution.
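A quick sketch of this (mine, not from the book): roll 3d6 many times and the histogram of sums comes out roughly bell-shaped, with 3 and 18 rare and 10–11 most common.

```cpp
#include <array>
#include <cstdio>
#include <random>

int main()
{
    std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<int> die(1, 6);   // each individual die is uniform
    std::array<int, 19> counts{};                   // sums 3..18

    const int trials = 100000;
    for (int i = 0; i < trials; ++i)
        counts[die(rng) + die(rng) + die(rng)]++;

    // The printed percentages approximate a normal distribution.
    for (int sum = 3; sum <= 18; ++sum)
        std::printf("%2d: %5.2f%%\n", sum, 100.0 * counts[sum] / trials);
    return 0;
}
```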
Randomness is too random (for many uses in games).
small runs of randomness don’t look random
In one-dimension, Perlin noise is constructed by first deciding how many octaves to use. Each octave contributes to the signal detail at a particular scale, with higher octaves adding more fine-grained detail. Each octave is computed individually and then they are added to each other to produce the final signal. Figure 3.5 shows one-dimensional Perlin noise, constructed with four octaves.
In order to explain how each octave signal is produced, let’s start with the first one. The first octave is computed by starting and ending the interval with two different uniform random numbers, in the range [0, 1]. The signal in the middle is computed by applying a mathematical function that interpolates between the two. The ideal function to use is the S-curve function $6t^5 - 15t^4 + 10t^3$ because it has many nice mathematical properties, such as being smooth in the first and second derivatives [Perlin 02]. This is desirable so that the signal contained within higher octaves is smooth.
For the second octave, we choose three uniform random numbers, place them equidistant from each other, and then interpolate between them using the same S-curve function. Similarly, for the third octave, we choose five uniform random numbers, place them equidistant from each other, and then interpolate between them. The number of uniform random numbers for a given octave $n$ is equal to $2^{n-1} + 1$. Figure 3.5 shows four octaves with randomly chosen numbers within each octave.
Once we have the octaves, the next step is to scale each octave with an amplitude. This will cause the higher octaves to progressively contribute to the fine-grained variance in the final signal. Starting with the first octave, we multiply the signal by an amplitude of 0.5, as shown in Figure 3.5. The second octave is multiplied by an amplitude of 0.25, and the third octave is multiplied by an amplitude of 0.125, and so on. The formula for the amplitude at a given octave is $p^i$, where $p$ is the persistence value and $i$ is the octave (our example used a persistence value of 0.5). The persistence value controls how much influence higher octaves have, with high values of persistence giving more weight to higher octaves (producing more high-frequency noise in the final signal).
Now that the octaves have been appropriately scaled, we can add them together to get our final one-dimensional Perlin noise signal, as shown at the bottom right of Figure 3.5. While this is all fine and good, it is important to realize that for the purposes of game AI, you are not going to compute and store the entire final signal, since there is no need to have the whole thing at once. Instead, given a particular time along the signal, in the range $[0, 1]$ along the x-axis, you’ll just compute that particular point as needed for your simulation. So if you want the point in the middle of the final signal, you would compute the individual signal in each octave at time 0.5, scale each octave value with their correct amplitude, and add them together to get a single value. You can then run your simulation at any rate by requesting the next point at 0.500001, 0.51, or 0.6, for example.
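Here is a simplified sketch of sampling such a signal at a single time t (my own approximation of the approach described above; the hash-based LatticeValue is a stand-in for the per-octave uniform random numbers):

```cpp
#include <cstdint>

// Deterministic stand-in for the per-octave uniform random lattice values.
static float LatticeValue(int octave, int index)
{
    uint32_t h = 73856093u * static_cast<uint32_t>(octave) ^ 19349663u * static_cast<uint32_t>(index);
    h ^= h >> 13; h *= 0x85ebca6bu; h ^= h >> 16;
    return (h & 0xFFFFFF) / static_cast<float>(0xFFFFFF);   // map to [0, 1]
}

// Sample the summed signal at time t in [0, 1]. Octave i has 2^(i-1)+1 lattice
// points, is interpolated with the 6t^5 - 15t^4 + 10t^3 S-curve, and is scaled
// by persistence^i.
float PerlinNoise1D(float t, int octaves, float persistence)
{
    float total = 0.0f;
    float amplitude = persistence;                 // persistence^1 for the first octave
    for (int i = 1; i <= octaves; ++i) {
        int segments = 1 << (i - 1);               // segments between lattice points
        float x = t * segments;
        int left = static_cast<int>(x);
        float frac = x - left;
        float s = frac * frac * frac * (frac * (frac * 6 - 15) + 10);   // S-curve
        float a = LatticeValue(i, left);
        float b = LatticeValue(i, left + 1);
        total += (a + (b - a) * s) * amplitude;
        amplitude *= persistence;
    }
    return total;
}
```

With four octaves and a persistence of 0.5, the amplitudes come out to 0.5, 0.25, 0.125, and 0.0625, matching the example above.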
Controlling Perlin Noise
As alluded to in the previous section, there are several controls that will allow you to customize the randomness of the noise. The following list is a summary.
Number of octaves: Lower octaves offer larger swings in the signal while higher octaves offer more fine-grained noise. This can be randomized within a population as well, so that some individuals have more octaves than others when generating a particular behavior trait.
Range of octaves: You can have any range, for example octaves 4 through 8. You do not have to start with octave 1. Again, the ranges can be randomized within a population.
Amplitude at each octave: The choice of amplitude at each octave can be used to control the final signal. The higher the amplitude, the more that octave will influence the final signal. Simply ensure that the sum of amplitudes across all octaves does not exceed 1.0 if you don’t want the final signal to exceed 1.0.
Choice of interpolation: The S-curve function is commonly used in Perlin noise, with original Perlin noise using $3t^2 - 2t^3$ [Perlin 85] and improved Perlin noise using $6t^5 - 15t^4 + 10t^3$ (smooth in the second derivative) [Perlin 02]. However, you might be able to get other interesting effects by choosing a different formula [Komppa 10].
#include <list>

class FSMTransition;   // forward declaration (defined below)

class FSMState
{
public:
    virtual void onEnter();
    virtual void onUpdate();
    virtual void onExit();
    std::list<FSMTransition> transitions;
};

class FSMTransition
{
public:
    virtual bool isValid();
    virtual FSMState* getNextState();
    virtual void onTransition();
};

class FiniteStateMachine
{
public:
    void update();
    std::list<FSMState> states;
    FSMState* initialState;
    FSMState* activeState;
};
The FiniteStateMachine class contains a list of all states in our FSM, as well as the initial state and the current active state. It also contains the central update() function, which is called each tick and is responsible for running our behavioral algorithm as follows:
Call isValid() on each transition in activeState.transitions until isValid() returns true or there are no more transitions.
If a valid transition is found, then:
Call activeState.onExit()
Set activeState to validTransition.getNextState()
Call activeState.onEnter()
If a valid transition is not found, then call activeState.onUpdate()

With this structure in place, it's a matter of setting up transitions and filling out the onEnter(), onUpdate(), onExit(), and onTransition() functions to produce the desired AI behavior.
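A minimal sketch of what update() might look like, following the steps above (my own filling-in using the classes from the earlier listing, not the book's code):

```cpp
void FiniteStateMachine::update()
{
    // Check each outgoing transition of the active state for the first valid one.
    FSMTransition* validTransition = nullptr;
    for (FSMTransition& transition : activeState->transitions) {
        if (transition.isValid()) {
            validTransition = &transition;
            break;
        }
    }

    if (validTransition) {
        // Leave the old state, perform the transition, and enter the new state.
        activeState->onExit();
        validTransition->onTransition();
        activeState = validTransition->getNextState();
        activeState->onEnter();
    } else {
        // No transition fired this tick; keep running the current state.
        activeState->onUpdate();
    }
}
```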
Hierarchical Finite-State Machines
Adding the second, third, or fourth state to an NPC’s FSM is usually structurally trivial, as all that’s needed is to hook up transitions to the few existing required states. However, if you’re nearing the end of development and your FSM is already complicated with 10, 20, or 30 existing states, then fitting your new state into the existing structure can be extremely difficult and error-prone.
You want to add a conversation state, but want to return to the correct patrol "direction" after the conversation? You'll need two distinct conversation states, one per direction.
The reason this works is that the HFSM structure adds additional hysteresis that isn’t present in an FSM. With a standard FSM, we can always assume that the state machine starts off in its initial state, but this is not the case with a nested state machine in an HFSM. Note the circled “H” in Figure 4.4, which points to the “history state.” The first time we enter the nested Watch Building state machine, the history state indicates the initial state, but from then on it indicates the most recent active state of that state machine.
Our example HFSM starts out in Watch Building (indicated by the solid circle and arrow as before), which chooses Patrol to Safe as the initial state. If our NPC reaches the safe and transitions into Patrol to Door, then the history state switches to Patrol to Door. If the NPC’s phone rings at this point, then our HFSM exits Patrol to Door and Watch Building, transitioning to the Conversation state. After Conversation ends, the HFSM will transition back to Watch Building which resumes in Patrol to Door (the history state), not Patrol to Safe (the initial state).
For a solid detailed implementation, check out Section 5.3.9 in the book Artificial Intelligence for Games by Ian Millington and John Funge [Millington and Funge 09].
The algorithm to execute a behavior tree is as follows:
Make root node the current node
While current node exists,
  Run current node's precondition
  If precondition returns true,
    Add current node to the execute list
    Make current node's first child the current node
  Else,
    Make current node's next sibling the current node
Run all behaviors on the execute list
Since trees are stateless, the algorithm doesn't need to remember what behaviors were previously running in order to determine what behaviors should execute on a given frame. Further, behaviors can (and should) be written to be completely unaware of each other, so adding or removing behaviors from a character's behavior tree does not affect the running of the rest of the tree. This alleviates the problem common with FSMs, where every state must know the transition criteria for every other state.
Extensibility is also an advantage with behavior trees. It is easy to start from the base algorithm as described and start adding extra functionality. Common additions are behavior on_start/on_finish functions that are run the first time a behavior begins and when it completes. Different behavior selectors can be implemented as well. For example, a parent behavior could specify that instead of choosing one of its children to run, each of its children should be run once in turn, or that one of its children should be chosen randomly to run. Indeed, a child behavior could be run based on a utility system-type selector (see below) if desired. Preconditions can be written to fire in response to events as well, giving the tree flexibility to respond to agent stimuli. Another popular extension is to specify individual behaviors as nonexclusive, meaning that if their precondition is run, the behavior tree should keep checking siblings at that level.
Since behaviors themselves are stateless, care must be taken when creating behaviors that appear to apply memory. For example, imagine a citizen running away from a battle. Once well away from the area, the “run away” behavior may stop executing, and the highest-priority behavior that takes over could take the citizen back into the combat area, making the citizen continually loop between two behaviors. While steps can be taken to prevent this sort of problem, traditional planners can tend to deal with the situation more easily.
Another caveat to using utility-based architecture is that all the subtlety and responsiveness that you gain often comes at a price. While the core architecture is often relatively simple to set up, and new behaviors can be added simply, they can be somewhat challenging to tune. Rarely does a behavior sit in isolation in a utility-based system. Instead, it is added to the pile of all the other potential behaviors with the idea that the associated mathematical models will encourage the appropriate behaviors to “bubble to the top.” The trick is to juggle all the models to encourage the most reasonable behaviors to shine when it is most appropriate. This is often more art than science. As with art, however, the results that are produced are often far more engaging than those generated by using simple science alone.
For more on utility-based systems, see the article in this book, An Introduction to Utility Theory [Graham 13] and the book Behavioral Mathematics for Game AI [Mark 09].
Goal-Oriented Action Planning (GOAP) is a technique pioneered by Monolith’s Jeff Orkin for the game F.E.A.R. in 2005, and has been used in a number of games since, most recently for titles such as Just Cause 2 and Deus Ex: Human Revolution. GOAP is derived from the Stanford Research Institute Problem Solver (STRIPS) approach to AI which was first developed in the early 1970s. In general terms, STRIPS (and GOAP) allows an AI system to create its own approaches to solving problems by being provided with a description of how the game world works—that is, a list of the actions that are possible, the requirements before each action can be used (called “preconditions”), and the effects of the action.
Backwards chaining search works in the following manner:
Add the goal to the outstanding facts list
For each outstanding fact
  Remove this outstanding fact
  Find the actions that have the fact as an effect
  If the precondition of the action is satisfied,
    Add the action to the plan
  Otherwise,
    Add the action's preconditions to the outstanding facts list
As opposed to backward planners like GOAP, which start with a desired world state and move backwards until they reach the current world state, HTN is a forward planner, meaning that it will start with the current world state and work towards a desired solution.
The following pseudocode shows how a plan is built.
Add the root compound task to our decomposing list
For each task in our decomposing list
  Remove task
  If task is compound
    Find a method whose conditions are satisfied by the current world state
    If a method is found, add its subtasks to the decomposing list
    If no method is found, roll back to the last decomposed compound task
  If task is primitive
    Apply the task's effects to the working world state
    Add the task to the final plan
HTN planners start with a very high-level root task and continuously decompose it into smaller and smaller tasks.
Because of increasing complexity as you add more configuration, break the decision making up hierarchically. That is, have a high-level reasoner that makes the big, overarching decisions, and then one or more lower-level reasoners that handle implementation of the higher-level reasoner's decisions.
The advantage here comes from the fact that the cost of AI configuration scales worse than linearly with the number of options in a particular reasoner. To give a sense of the relevance, imagine that the cost of configuring the AI is $O(n^2)$ on the number of options (as it is for FSMs). If we have 25 options, then the cost of configuring the AI is on the order of $25^2 = 625$. On the other hand, if we have five reasoners, each with five options, then the cost of configuring the AI is only $5 \times 5^2 = 125$.
One handy trick is to use option stacks to handle your hit reaction. If a character is hit by an enemy attack (e.g., a bullet), we typically want them to play a visible reaction. We also want the character to stop whatever it was doing while it reacts. For instance, if an enemy is firing their weapon when we hit them, they should not fire any shots while the hit reaction plays. It just looks wrong if they do. Thus, we push an Is Hit option onto the option stack, which suspends all previously running options while the reaction plays, and then pop it back off when the reaction is done.
In the academic AI community, blackboard architectures typically refer to a specific approach in which multiple reasoners propose potential solutions (or partial solutions) to a problem, and then share that information on a blackboard [Wikipedia 12, Isla et al. 02]. Within the game community, however, the term is often used simply to refer to a shared memory space which various AI components can use to store knowledge that may be of use to more than one of them, or may be needed multiple times.
Line-of-sight and path are examples that could be placed on a blackboard.
One trick is to put the intelligence in the world, rather than in the character. This technique was popularized by The Sims, though earlier examples exist. In The Sims (and its sequels), objects in the world not only advertise the benefits that they offer (for example, a TV might advertise that it’s entertaining, or a bed might advertise that you can rest there), they also contain information about how to go about performing the associated actions [Forbus et al. 01].
Another advantage of this approach is that it greatly decreases the cost of expansion packs. In the Zoo Tycoon 2 franchise, for example, every other expansion pack was “ content only.” Because much of the intelligence was built into the objects, we could create new objects that would be used by existing animals, and even entirely new animals, without having to make any changes to the source code.
process_behavior_node(node)
if (node.precondition returns true) {
node.action()
if (node.child exists)
process_behavior_node(node.child)
} else {
if (node.sibling exists)
process_behavior_node(node.sibling)
}
While the flexibility of planners is a great strength, it can also be a great weakness. Often, designers will want to have more control over the sequences of actions that an AI can perform. While it is cool that your AI can create a jump kick plan on its own, it could also create a sequence of 27 consecutive jumps. This breaks the illusion of intelligence our AI should produce
This is the classic tradeoff that AI designers and programmers have to deal with constantly: the choice between the fully designed (brittle) AI that behavior trees provide and the fully autonomous (unpredictable) AI that planners provide.
Rather than add these special cases to each Action, we added a special Action called a FakeSim. The FakeSim Action is a special type of Composite Action called a decorator, which wraps another Action to add extra functionality to it. The FakeSim was responsible for adding incorrect information to the wrapped Action’s simulation step by modifying the world state directly. For example, there are some enemies that have a shield which makes them invulnerable to lightsaber attacks. If we want a Jedi to attack the enemy to demonstrate that the enemy is invulnerable while the shield is up, we can wrap the SwingSaber Action with a FakeSim Decorator which lowers the victim’s shield during the simulation step. Then, the SwingSaber simulation will think that the Jedi can damage the enemy and give it a good simulation result. This would allow SwingSaber to be chosen, even though it won’t actually be beneficial.
It’s important to note that utility is not the same as value. Value is a measurable quantity (such as the prices above). Utility measures how much we desire something. This can change based on personality or the context of the situation.
Using normalized scores (values that go from 0 to 1) provides a reasonable starting point.
It’s important to note that any value range will work, as long as there is consistency across the different variables. If an AI agent scores an action with a value of 15, you should know immediately what that means in the context of the whole system. For instance, does that 15 mean 15 out of 25 or 15%?
The key to decision making using utility-based AI is to calculate a utility score (sometimes called a weight) for every action the AI agent can take and then choose the action with the highest score.
$$EU = \sum_{i=1}^n D_i P_i$$
$D$ is the desire for that outcome (i.e., the utility), and $P$ is the probability that the outcome will occur. This probability is normalized so that the sum of all the probabilities is 1. This is applied to every possible action that can be chosen, and the action with the highest expected utility is chosen. This is called the principle of maximum expected utility [Russell et al. 09].
For example, an enemy AI in an RPG attacking the player has two possible outcomes— either the AI hits the player or it misses. If the AI has an 85% chance to hit the player, and successfully hitting the player has a calculated utility score of 0.6, the adjusted utility would be 0.85 × 0.6 = 0.51. (Note that, in this case, missing the player has a utility of zero, so there’s no need to factor it in.) Taking this further, if this attack were to be compared to attacking with a different weapon, for example, with a 60% chance of hitting but a utility score of 0.9 if successful, the adjusted utility would be 0.60 × 0.9 = 0.54. Despite having a lesser chance of hitting, the second option provides a greater overall expected utility.
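A small sketch of maximum expected utility selection (my own illustration; Outcome, Action, and ChooseAction are assumed names):

```cpp
#include <vector>

struct Outcome { float desire; float probability; };   // D_i and P_i from the equation
struct Action  { std::vector<Outcome> outcomes; };

// Principle of maximum expected utility: score each action as the sum of
// desire * probability over its outcomes and return the best-scoring index.
int ChooseAction(const std::vector<Action>& actions)
{
    int best = -1;
    float bestEU = -1.0f;
    for (size_t i = 0; i < actions.size(); ++i) {
        float eu = 0.0f;
        for (const Outcome& o : actions[i].outcomes)
            eu += o.desire * o.probability;
        if (eu > bestEU) { bestEU = eu; best = static_cast<int>(i); }
    }
    return best;
}
```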
Let’s say we have an ant simulation game where the AI must determine whether to expand the colony or whether to breed. There are three different factors we want to consider for these decisions. The first is the overall crowdedness of the colony. If there are too many ants, we need to expand to make room for more. The second is the health of the colony, which we’ll say is based on how full the food stores are. Ant eggs need to be kept at a specific temperature; so there are specially built nurseries that house the eggs where they are taken care of. The amount of room in these nurseries is the third decision factor. These decision factors are based on game statistics that determine the score for each factor. The population and max population determine how many ants are in the colony and how many can exist based on the current colony size. The food stat represents how full the food stores are and is measured as a number from 0 to 100. The nursery space stat is also measured from 0 to 100 and represents how much space there is in the nursery. You can think of these last two stats as percentages.
In the ant example above, we chose to represent health as a linear ratio by dividing the current amount of food by the maximum amount of food. This probably isn’t a very realistic calculation, since the colony shouldn’t care about food when the stores are mostly full. Some kind of quadratic curve is closer to what we want.
The key to utility theory is to understand the relationship between the input and the output, and being able to describe that resulting curve [Mark 09]. This can be thought of as a conversion process, where you are converting one or more values from the game to utility. Coming up with the proper function is really more art than science and is usually where you’ll spend most of your time. There are a huge number of different formulas you could use to generate reasonable utility curves, but a few of them crop up often enough that they warrant some discussion.
A linear curve forms a straight line with a constant slope. The utility value is simply a multiplier of the input. Equation 9.2 shows the formula for calculating a normalized utility score for a given value and Figure 9.2 shows the resulting curve.
$$U =\frac{x}{m}$$
In Equation 9.2, $x$ is the input value and $m$ is the maximum value for that input. This is really just a normalization function, which is all we need for a linear output.
A quadratic function is one that forms a parabolic curve, causing it to start slow and then curve upwards very quickly. The simplest way to achieve this is to add an exponent to Equation 9.2. Equation 9.3 shows an example of this.
$$U = \left(\frac{x}{m}\right)^k$$
The logistic function is another common formula for creating utility curves. It’s one of several sigmoid functions that place the largest rate of change in the center of the input range, trailing off at both ends as they approach 0 and 1. The input range for the logistic function can be just about anything, but it is effectively limited to $[-10 \dots 10]$. There really isn’t much point generating a curve larger than that, and the range is often clamped down even further. For example, when $x$ is 6, $U$ is already 0.9975.
Equation 9.4 shows the formula for the logistic function and Figure 9.5 shows the resulting curve. Note the use of the constant $e$. This is Euler’s number—the base of the natural logarithm—which is approximately 2.718281828. This value can be adjusted to affect the shape of the curve. As the number goes up, the curve will sharpen and begin to resemble a square wave. As the number goes down, it will soften.
$$U=\frac{1}{1+e^{-x}}$$
A piecewise linear curve is just a custom-built curve. The idea is that you hand-tune a bunch of 2D points that represent the thresholds you want.
You might want hunger to NEVER be selectable when it is below a certain threshold.
There are many other types of custom curves. For example, the curve in Figure 9.6 could be changed so that the values from 15 to 60 are calculated with a quadratic curve, while the rest are linear. There’s no limit to the number of combinations you can have.
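For reference, here is a sketch of the curve types above as plain functions (my own; all names are assumptions, not from the book):

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Normalized utility curves discussed above; x is the raw input, m its maximum.
float LinearUtility(float x, float m)             { return x / m; }
float QuadraticUtility(float x, float m, float k) { return std::pow(x / m, k); }
float LogisticUtility(float x)                    { return 1.0f / (1.0f + std::exp(-x)); } // x roughly in [-10, 10]

// Piecewise linear curve: hand-tuned (input, utility) points in ascending input
// order, linearly interpolated between neighbors and clamped at the ends.
float PiecewiseUtility(const std::vector<std::pair<float, float>>& points, float x)
{
    if (x <= points.front().first) return points.front().second;
    for (size_t i = 1; i < points.size(); ++i) {
        if (x <= points[i].first) {
            float t = (x - points[i - 1].first) / (points[i].first - points[i - 1].first);
            return points[i - 1].second + t * (points[i].second - points[i - 1].second);
        }
    }
    return points.back().second;
}
```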
Once the utility has been calculated for each action, the next step is to choose one of those actions. There are a number of ways you can do this. The simplest is to just choose the highest scoring option. For some games, this may be exactly what you want. A chess AI should definitely choose the highest scoring move. A strategy game might do the same. For some games (like The Sims), choosing the absolute best action can feel very robotic, due to the likelihood that the same action will always be selected in that situation. Another solution is to use the utility scores as weights and randomly choose one of the actions based on those weights. This can be accomplished by dividing each score by the sum of all scores to get the percentage chance that the action will be chosen. Then you generate a random number and select the action that number corresponds to. This tends to have the opposite problem, however. Your AI agents will behave reasonably well most of the time, but every now and then, they’ll choose something utterly stupid.
You can get the best of both worlds by taking a subset of the highest scoring actions and choosing one of those with a weighted random. This can either be a tuned value, such as choosing from among the top five scoring actions, or it can be percentile based where you take the highest score and also consider things that scored within, say, 10% of it.
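A sketch of that "weighted random among the top scorers" idea (mine, not the book's code; the tolerance parameter is just an example knob):

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Weighted random pick among the best-scoring actions: keep anything within
// 'tolerance' (e.g., 0.1 for 10%) of the top score, then choose proportionally.
// Assumes 'scores' is non-empty.
int ChooseWeighted(const std::vector<float>& scores, float tolerance, std::mt19937& rng)
{
    float best = *std::max_element(scores.begin(), scores.end());
    float cutoff = best * (1.0f - tolerance);

    std::vector<int> candidates;
    std::vector<float> weights;
    for (size_t i = 0; i < scores.size(); ++i) {
        if (scores[i] >= cutoff) {
            candidates.push_back(static_cast<int>(i));
            weights.push_back(scores[i]);
        }
    }

    std::discrete_distribution<int> pick(weights.begin(), weights.end());
    return candidates[pick(rng)];   // index of the chosen action
}
```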
There could also be times when some set of actions is just completely inappropriate. You may not even want to score them. For example, say you’re making an FPS and have a guard AI. You might have some set of actions for him to consider, like getting some coffee, chatting with his fellow guard, checking for lint, etc. If the player shoots at him, he shouldn’t even consider any of those actions.
The most straightforward way to solve this is with bucketing, also known as dual utility AI [Dill 11]. All actions are categorized into buckets and each bucket is given a weight. The higher priority buckets are always processed first.
In Figure 9.7, you can see that there are two buckets, one for Hunger and one for Fun. Hunger has scored 0.8 while Fun has scored 0.4. The Sim will walk through all possible actions in the Hunger bucket and, assuming any of those actions are valid, will choose one. The Sim will not consider anything in the Fun bucket
The buckets themselves are scored based on a response curve created by designers.
One issue that’s worth bringing up in any AI system is the concept of inertia. If your AI agent is attempting to decide something every frame, it’s possible to run into oscillation issues, especially if you have two things that are scored similarly. For example, say you have an FPS where the AI realizes it’s in a bad spot. The enemy soldier starts scoring both “attack the player” and “run away” at 0.5. If the AI was making a new decision every frame, it could start appearing very frantic. The AI might shoot the player a couple times, start to run away, then shoot again, then repeat. Oscillations in behavior such as this look very bad.
One solution is to add a weight to any action that you are already currently engaged in. This will cause the AI to tend to remain committed until something truly better comes along. Another solution is to use cooldowns. Once an AI agent makes a decision, they enter a cooldown stage where the weighting for remaining in that action is extremely high. This weight can revert at the end of the cooldown period, or it can gradually drop as well. Another solution is to stall making another decision—either for a period of time or until such time as the current action is finished. This really depends on the type of game you’re making and how your decision/action process works, however. On The Sims Medieval, a Sim would only attempt to make a decision when their interaction queue was empty. Once they chose an action, they would commit to performing that action. Once the Sim completed (or failed to complete) their action, they would choose a new action.
$$U = \max\left( \min\left( \left( 1-\frac{hp-minDmg}{maxDmg-minDmg} \right) \times (1-a) + a,\ 1 \right),\ 0 \right)$$
Figure 9.8 shows the resulting curve from Equation 9.5 where a is set to 0.6.
The second decision factor is the threat. This is a curve that measures what percentage of the actor’s current hit points will be taken away if the player hits for maximum damage. It has a shape similar to a quadratic curve and is generated with Equation 9.6.
$$U = \min\left( \frac{maxDmg}{hp},\ 1 \right)$$
Figure 9.9 shows the resulting curve for Threat.
The third decision factor is the actor’s desire for health. This uses a variation of the logistic function in Equation 9.4. As the actor’s hit points are reduced, its desire to heal will rise. Equation 9.7 shows the formula for this decision factor.
$$U = 1 - \frac{1}{1+(e\times0.68)^{-\left(\frac{hp}{maxHp}\times12\right)+6}}$$
$$U=1-\left(\frac{hp}{maxHp}\right)^{\frac{1}{(p+1)^4} \times 0.25}$$
References
[Dill 11] K. Dill. “A game AI approach to autonomous control of virtual characters.” Interservice/ Industry Training, Simulation, and Education Conference, 2011, pp. 4–5. Available online (http://www.iitsec.org/about/PublicationsProceedings/Documents/11136_Paper.pdf).
[Mark 09] D. Mark. Behavioral Mathematics for Game AI. Reading, MA: Charles River Media, 2009, pp 229–240.
[Russell et al. 09] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Reading, MA: Prentice Hall, 2009, pp. 480–509.
In a standard behavior tree, priority is static. It is baked right into the tree. The simplicity is welcome, but in practice it can be frustratingly limiting. The same behavior may require different relative priorities, depending on the context. Ensuring our Monster Hunter’s primary weapon has a full clip should always be a consideration, even if we’re casually patrolling the jungle. But if we’re engaged with a savage monster, it’s absolutely necessary that we continue to deal damage. Behavior tree authors often deal with this conundrum by duplicating sections of the tree at different branches, with different conditions and/or priorities. Even with slick sub-tree instancing or referencing, this still becomes inefficient, verbose, and potentially fragile.
Decisions are rarely binary, and many behaviors simply do not have priorities we can comfortably establish offline. Let’s start with a simple example behavior tree (Figure 10.1). Having no ability to shoot is a precondition for the Seek Medic behavior, forcing us to duplicate the behavior, as seen in Figure 10.2. We could start by giving Seek Medic stricter conditions and prioritizing it over Shoot, but this will likely create the opposite problem where the Monster Hunter immediately takes the Seek Medic action the instant conditions pass. This is the sort of fundamental problem we want to address with the integration of utility.
$$RawUtility = HealthGained – HealthLost$$
$$Value = HealthGained – (HealthLost × 2.0)$$
$$Value = HealthGained – (pow(HealthLost,1.2))$$
$$Utility = \frac{Heal\times HealPower - Delay \times DelayPower}{HealPower + DelayPower}$$
The tree already features a component for selecting which branches are taken during execution, namely the selector. To introduce utility-based selection, we’ll simply create a new specialized type of selector that considers not just the binary validity of its children, but their relative utility as well. We’ll cleverly dub the new node type the utility selector.
we can address our problem with Seek Medic by switching Combat to a utility selector, as we’ve done in Figure 10.3.
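As a sketch of what such a utility selector node might look like (my own; BehaviorNode, isValid(), utility(), and run() are assumed names, not the chapter's actual interface):

```cpp
#include <algorithm>
#include <vector>

// Assumed node interface; a real tree would share this with regular selectors.
struct BehaviorNode {
    virtual ~BehaviorNode() = default;
    virtual bool  isValid() = 0;
    virtual float utility() = 0;
    virtual void  run()     = 0;
};

// Utility selector: instead of running the first valid child, run the valid
// child with the highest utility.
struct UtilitySelector : BehaviorNode {
    std::vector<BehaviorNode*> children;

    bool isValid() override {
        for (BehaviorNode* c : children)
            if (c->isValid()) return true;
        return false;
    }
    float utility() override {
        float best = 0.0f;
        for (BehaviorNode* c : children)
            if (c->isValid()) best = std::max(best, c->utility());
        return best;   // report the best child's utility upward
    }
    void run() override {
        BehaviorNode* best = nullptr;
        for (BehaviorNode* c : children)
            if (c->isValid() && (best == nullptr || c->utility() > best->utility()))
                best = c;
        if (best) best->run();
    }
};
```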
Compound Task [AttackEnemy]
Method 0 [WsHasTreeTrunk == true]
Subtasks [NavigateTo(EnemyLoc), DoTrunkSlam()]
Method 1 [WsHasTreeTrunk == false]
Subtasks [LiftBoulderFromGround(), ThrowBoulderAt(EnemyLoc)]
By understanding how compound tasks work, it’s easy to imagine how we could have a large hierarchy that may start with a BeTrunkThumper compound task that is broken down into sets of smaller tasks—each of which are then broken into smaller tasks, and so on. This is how HTN forms a hierarchy that describes how our troll NPC is going to behave. It’s important to understand that compound tasks are really just containers for a set of methods that represent different ways to accomplish some high level task. There is no compound task code running during plan execution.
A domain is the term used to describe the entire task hierarchy.
We start with a compound task called BeTrunkThumper. This root task encapsulates the “main idea” of what it means to be a Trunk Thumper.
Compound Task [BeTrunkThumper]
Method [WsCanSeeEnemy == true]
Subtasks [NavigateToEnemy(), DoTrunkSlam()]
Method [true]
Subtasks [ChooseBridgeToCheck(), NavigateToBridge(), CheckBridge()]
Primitive Task [DoTrunkSlam]
Operator [AnimatedAttackOperator(TrunkSlamAnimName)]
Primitive Task [NavigateToEnemy]
Operator [NavigateToOperator(EnemyLocRef)]
Effects [WsLocation = EnemyLocRef]
Primitive Task [ChooseBridgeToCheck]
Operator [ChooseBridgeToCheckOperator]
Primitive Task [NavigateToBridge]
Operator [NavigateToOperator(NextBridgeLocRef)]
Effects [WsLocation = NextBridgeLocRef]
Primitive Task [CheckBridge]
Operator [CheckBridgeOperator(SearchAnimName)]
With a domain made up of compound and primitive tasks, we are starting to form an image of how these are put together to represent an NPC. Combine that with the world state and we can talk about the work horse of our HTN, the planner.
There are three conditions that will force the planner to find a new plan: the NPC finishes or fails the current plan, the NPC does not have a plan, or the NPC’s world state changes via a sensor.
To do this, the planner starts with a root compound task that represents the problem domain we are trying to plan for. Using our earlier example, this root task would be the BeTrunkThumper task. This root task is pushed onto the TasksToProcess stack. Next, the planner creates a copy of the world state. The planner will be modifying this working world state to “simulate” what will happen as tasks are executed. After these initialization steps are taken, the planner begins to iterate on the tasks to process. On each iteration, the planner pops the next task off the TasksToProcess stack. If it is a compound task, the planner tries to decompose it, first by searching through its methods looking for the first set of conditions that are valid. If a method is found, that method’s subtasks are added onto the TasksToProcess stack. If a valid method is not found, the planner’s state is rolled back to the last compound task that was decomposed.
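A compact sketch of that decomposition (mine; it uses recursion instead of the explicit TasksToProcess stack described above, and all type names are assumptions):

```cpp
#include <functional>
#include <vector>

struct WorldState { /* game facts such as WsCanSeeEnemy, WsTrunkHealth, ... */ };

struct Task;
struct Method {
    std::function<bool(const WorldState&)> conditions;    // e.g., WsCanSeeEnemy == true
    std::vector<Task*> subtasks;
};

struct Task {
    bool isCompound = false;
    std::vector<Method> methods;                           // compound tasks only
    std::function<bool(const WorldState&)> preconditions;  // primitive tasks only (optional)
    std::function<void(WorldState&)> effects;              // primitive tasks only (optional)
};

// Decompose 'task' against the working world state, appending primitive tasks
// to 'plan'. On failure inside a method, roll the working state and the plan
// back and try the task's next method.
bool Decompose(Task* task, WorldState& working, std::vector<Task*>& plan)
{
    if (!task->isCompound) {
        if (task->preconditions && !task->preconditions(working)) return false;
        if (task->effects) task->effects(working);         // simulate the effects
        plan.push_back(task);
        return true;
    }
    for (Method& method : task->methods) {
        if (method.conditions && !method.conditions(working)) continue;
        WorldState savedState = working;                   // snapshot for rollback
        std::vector<Task*> savedPlan = plan;
        bool ok = true;
        for (Task* sub : method.subtasks)
            if (!Decompose(sub, working, plan)) { ok = false; break; }
        if (ok) return true;
        working = savedState;                              // roll back, try next method
        plan = savedPlan;
    }
    return false;
}
```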
Compound Task [BeTrunkThumper]
Method [ WsCanSeeEnemy == true]
Subtasks [AttackEnemy()]// using the new compound task
Method [true]
Subtasks [ChooseBridgeToCheck(), NavigateToBridge(), CheckBridge()]
Compound Task [AttackEnemy]//new compound task
Method [WsTrunkHealth > 0]
Subtasks [NavigateToEnemy(), DoTrunkSlam()]
Method [true]
Subtasks [FindTrunk(), NavigateToTrunk(), UprootTrunk(), AttackEnemy()]
Primitive Task [DoTrunkSlam]
Operator [DoTrunkSlamOperator]
Effects [WsTrunkHealth += -1]
Primitive Task [UprootTrunk]
Operator [UprootTrunkOperator]
Effects [WsTrunkHealth = 3]
Primitive Task [NavigateToTrunk]
Operator [NavigateToOperator(FoundTrunk)]
Effects [WsLocation = FoundTrunk]
Compound Task [BeTrunkThumper]
Method [ WsCanSeeEnemy == true]
Subtasks [AttackEnemy()]
Method [ WsHasSeenEnemyRecently == true]//New method
Subtasks [NavToLastEnemyLoc(), RegainLOSRoar()]
Method [true]
Subtasks [ChooseBridgeToCheck(), NavigateToBridge(), CheckBridge()]
Primitive Task [NavToLastEnemyLoc]
Operator [NavigateToOperator(LastEnemyLocation)]
Effects [WsLocation = LastEnemyLocation]
Primitive Task [RegainLOSRoar]
Preconditions[WsCanSeeEnemy == true]
Operator [RegainLOSRoar()]
Primitive Task [NavToLastEnemyLoc]
Operator [NavigateToOperator(LastEnemyLocation)]
Effects [WsLocation = LastEnemyLocation]
ExpectedEffects [WsCanSeeEnemy = true]
Compound Task [AttackEnemy]
Method [WsTrunkHealth > 0, AttackedRecently == false, CanNavigateToEnemy == true]
Subtasks [NavigateToEnemy(), DoTrunkSlam(), RecoveryRoar()]
Method [WsTrunkHealth == 0]
Subtasks [FindTrunk(), NavigateToTrunk(), UprootTrunk(), AttackEnemy()]
Method [true]
Subtasks [PickupBoulder(), ThrowBoulder()]
Primitive Task [DoTrunkSlam]
Operator [DoTrunkSlamOperator]
Effects [WsTrunkHealth += -1, AttackedRecently = true]
Primitive Task [RecoveryRoar]
Operator [PlayAnimation(TrunkSlamRecoverAnim)]
Primitive Task [PickupBoulder]
Operator [PickupBoulder()]
Primitive Task [ThrowBoulder]
Operator [ThrowBoulder()]
Partial planning is one of the most powerful features of HTN. In simplest terms, it allows the planner the ability to not fully decompose a complete plan. HTN is able to do this because it uses forward decomposition or forward search to find plans. That is, the planner starts with the current world state and plans forward in time from that. This allows the planner to only plan ahead a few steps.
GOAP and STRIPS planner variants, on the other hand, use a backward search [Orkin 04]. This means the search makes its way from a desired goal state toward the current world state. Searching this way means the planner has to complete the entire search in order to know what first step to take.
BIR = Break in Reality
An unrealistic state (US) BIR is the most immediate and obvious type of BIR, where a character’s immediately observable simulation is wrong. A character eating from an empty plate, or running in place against a wall
A fundamental discontinuity (FD) BIR is a little more subtle, but not by much: it occurs when a character’s current state is incompatible with the player’s memory of his past state. A character disappearing while momentarily around a corner, or having been frozen in place for hours while the player was away
An unrealistic long-term behavior (ULTB) BIR is the subtlest: It occurs only when an extended period of observation reveals problems with a character’s behavior. A character wandering randomly instead of having goal-driven behaviors
a tool which will be used in a lot of them: the exponential moving average (EMA). The EMA is a method for smoothing and averaging an ongoing sequence of measurements. Given an input function $F(t)$ we produce the output function $G(t)$. We initialize $G(0) = F(0)$, and then at each time $t$ we update $G(t)$ as $G(t) = (1 - \alpha)F(t)+ \alpha G(t- \Delta t)$, where $\Delta t$ is the timestep since the last measurement. The $\alpha$ in that equation is calculated as $\alpha = e^{-k \Delta t}$, where $k$ is the convergence rate (higher values lead to faster changes in the average). You can tune $k$ to change the smoothness of the EMA, and how closely it tracks the input function. We’re going to use the EMA a lot in these models, so it’s a good idea to familiarize yourself with it (Figure 14.2).
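A tiny sketch of the EMA as described (my own wrapper; the names are assumptions):

```cpp
#include <cmath>

// Exponential moving average: alpha = e^(-k * dt), G = (1 - alpha) * F + alpha * G_prev.
// Higher k makes the average track the input more quickly.
struct EMA {
    float k = 1.0f;           // convergence rate
    float value = 0.0f;       // G(t)
    bool  initialized = false;

    void update(float sample, float dt)
    {
        if (!initialized) { value = sample; initialized = true; return; }
        float alpha = std::exp(-k * dt);
        value = (1.0f - alpha) * sample + alpha * value;
    }
};
```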
The most successful approaches to scripting will generally fall into one of two camps. First is the “scripts as master” perspective, wherein scripts control the high-level aspects of agent decision making and planning. The other method sees “scripts as servant,” where some other architecture controls the overall activity of agents, but selectively deploys scripts to attain specific design goals or create certain dramatic effects.
In general, master-style systems work best in one of two scenarios. In the optimal case, a library of ready-to-use tools already exists, and scripting can become the “glue” that combines these techniques into a coherent and powerful overarching model for agent behavior.
servant scripts are most effective when design requires a high degree of specificity in agent behavior. This is the typical sense in which interactions are thought of as “scripted”; a set of possible scenarios is envisioned by the designers, and special-case logic for reacting to each scenario is put in place by the AI implementation team. Servant scripts need not be entirely reactive, however; simple scripted loops and behavioral patterns can make for excellent ambient or “background” AI.
Last, but certainly not least, integrated architectures provide an illustrative method for writing almost any large-scale code. The layered approach has been heavily encouraged for decades, with notable proponents including Fred Brooks and the SICP course from MIT. Learning to structure code in this way can be a powerful force multiplier for creating clean, well separated modules for the rest of the project, even well outside the scope of AI systems.
1986 MIT course: https://www.youtube.com/playlist?list=PL8FE88AA54363BC46
When using scripting, this often simply boils down to keeping a trace log of the steps that have been performed by the agent, and, where applicable, what branches have been selected and how often loops have been repeated. Being able to select an agent and view a debug listing of its complete script state is also an invaluable tool.
Scripts should be seen as a sort of glue that attaches various decision-making, planning, and knowledge representation systems into a cohesive and powerful whole.
A* and pathfinding
Precompute Every Single Path (Roy–Floyd–Warshall)
While at first glance it seems ridiculous, it is possible to precompute every single path in a search space and store it in a look-up table. The memory implications are severe, but there are ways to temper the memory requirements and make it work for games. The algorithm is known in English-speaking circles as the Floyd–Warshall algorithm, while in Europe it is better known as Roy–Floyd.
Roy–Floyd–Warshall is the absolute fastest way to generate a path at runtime. It should routinely be an order of magnitude faster than the best A* implementation.
The look-up table is calculated offline before the game ships.
The look-up table requires $O(n^2)$ entries, where n is the number of nodes. For example, for a 100 by 100 grid search space, there are 10,000 nodes. Therefore, the memory required for the look-up table would be 100,000,000 entries (with 2 bytes per entry, this would be ~200 MB).
Path generation is as simple as looking up the answer. The time complexity is $O(p)$, where p is the number of nodes in the final path.
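For illustration (my sketch, assuming a precomputed "next node" table and a reachable goal), extracting a path really is just repeated lookups:

```cpp
#include <vector>

// next[from][to] holds the node to step to on the optimal path from 'from' to
// 'to'. Assumes the goal is reachable from the start.
std::vector<int> ExtractPath(const std::vector<std::vector<int>>& next, int start, int goal)
{
    std::vector<int> path;
    path.push_back(start);
    int current = start;
    while (current != goal) {
        current = next[current][goal];   // one table lookup per path node
        path.push_back(current);
    }
    return path;                         // O(p) total work
}
```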
Another approach to reducing the memory requirement is to compress the Roy–Floyd–Warshall data. Published work [Botea 11] has shown the effectiveness of compressing the data, and this approach fared very well in the 2012 Grid-Based Path Planning competition (http://www.movingai.com/GPPC/), when sufficient memory was available.
An alternate way to compress the Roy–Floyd–Warshall data is to take advantage of the structure of the environment. In many maps, but not all maps, there are relatively few optimal paths of significant length through the state space, and most of these paths overlap. Thus, it is possible to find a sparse number of “transit nodes” through which optimal paths cross [Bast et al. 07]. If, for every state in the state space, we store the path to all transit nodes for that state, as well as the optimal paths between all transit nodes, we can easily reconstruct the shortest path information between any two states, using much less space than when storing the shortest path between all pairs of states. This is one of several methods which have been shown to be highly effective on highway road maps [Abraham et al. 10].
The full Roy–Floyd–Warshall data results in very fast pathfinding queries, at the cost of memory overhead. In many cases you might want to use less memory and more CPU, which suggests building strong, but not perfect heuristics.
Imagine if we store just a few rows/columns of the Roy–Floyd–Warshall data. This corresponds to keeping the shortest paths from a few select nodes. Fortunately, improved distance estimates between all nodes can be inferred from this data. If $d(x, y)$ is the distance between node x and y, and we know $d(p, z)$ for all $z$, then the estimated distance between $x$ and $y$ is $h(x, y) = |d(p, x) – d(p, y)|$, where $p$ is a pivot node that corresponds to a single row/column in the Roy–Floyd–Warshall data. With multiple pivot nodes, we can perform multiple heuristic lookups and take the maximum. The improved estimates will reduce the cost of A* search. This approach has been developed in many contexts and been given many different names [Ng and Zhang 01, Goldberg and Harrelson 05, Goldenberg et al. 11, Rayner et al. 11]. We prefer the name Euclidean embedding, which we will justify shortly. First, we summarize the facts about this approach:
Euclidean embeddings can be far more accurate than the default heuristics for a map, and in some maps are nearly as fast as Roy-Floyd-Warshall.
The look-up table can be calculated before the game ships or at runtime, depending on the size and dynamic nature of the maps.
The heuristic requires $O(kn)$ entries, where $n$ is the number of nodes and $k$ is the number of pivots.
Euclidean embeddings provide a heuristic for guiding A* search. Given multiple heuristics, A* should usually take the maximum of all available heuristics.
Why do we call this a Euclidean embedding? Consider a map that is wrapped into a spiral, such as in Figure 17.2. Points A and B are quite close in the coordinates of the map, but quite far when considering the minimal travel distance between A and B. If we could just unroll the map into a straight line, the distance estimates would be more accurate. Thus, the central problem is that the coordinates used for aesthetic and gameplay purposes are not the best for A* search purposes. That is, they do not provide accurate heuristic estimates. If we could provide a different set of coordinates optimized for A* search, we could use these coordinates to estimate distances between nodes and have a higher quality heuristic. This process of transforming a map into a new state space where distance estimates are (hopefully) more accurate is called an embedding. A single-source shortest-path search from a pivot node is equivalent to performing a one-dimensional embedding, as each node gets a single coordinate, and the heuristic in this embedding is the distance between the embedded points. Other types of embeddings are possible, just not yet well understood.
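A sketch of the pivot-based heuristic lookup (mine; pivotDist[p][n] is assumed to hold the precomputed shortest-path distance from pivot p to node n):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Differential ("Euclidean embedding") heuristic: the estimate is the maximum
// over pivots of |d(p, x) - d(p, y)|, combined with whatever default heuristic
// the map already provides (A* should take the max of available heuristics).
float PivotHeuristic(const std::vector<std::vector<float>>& pivotDist,
                     int x, int y, float defaultHeuristic)
{
    float h = defaultHeuristic;
    for (const std::vector<float>& d : pivotDist)   // one row per pivot
        h = std::max(h, std::fabs(d[x] - d[y]));
    return h;
}
```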
In order for A* to guarantee an optimal path, the heuristic must be admissible, meaning that the heuristic guess of the cost from the current node to the goal node must never overestimate the true cost. However, by using an overestimating heuristic, you can get a tremendous speed-up at the possible expense of a slightly nonoptimal path. While this sounds like a terrible trade-off initially, it turns out that a small amount of overestimating has large benefits with very little noticeable nonoptimality. In the world of search algorithms this once might have been seen as heresy, but in the video game industry it’s a shrewd and worthwhile optimization.
In order to understand how to overestimate the heuristic, let’s first look at Equation 17.1, which is the classic A* cost formula. As you can see, the final cost, $f(x)$, is the sum of the given cost, $g(x)$, and the heuristic cost, $h(x)$. Each node added to the Open List gets this final cost assigned to it and this is how the Open List is sorted.
$$ f(x) = g(x) + h(x)$$
$$ f(x) = g(x) + (h(x) \times weight)$$
Bad Idea #1: Simultaneous Searches
Supporting many simultaneous searches is fraught with disaster. The primary problem is that you’ll need to support separate Open Lists for each request. The memory implications are severe, and the subsequent thrashing in the cache can be devastating.
But what is to be done about a single search that holds up all other searches? On one hand, this might be a false concern because your pathfinding engine should be blindingly fast for all searches. If it isn’t, then that’s an indication that you chose the wrong search space representation or should be using hierarchical pathfinding.
However, if we concede that a single search might take a very long time to calculate, then one solution is to learn from supermarkets. The way supermarkets deal with this problem is to create two types of check-out lanes. One is for customers with very few items (10 items or less) and one for the customers with their cart overflowing with groceries. We can do a similar thing with pathfinding by allowing up to two searches at a time. One queue is for requests deemed to be relatively fast (based on distance between the start and goal) and one queue for requests deemed to take a long time (again based on distance between the start and goal).
The pros (of grids) are:
Grids are one of the simplest possible representations and are easy to implement. A working implementation can be completed in a few hours.
A grid representation can be easily edited externally with a text editor. This can save significant tool-building efforts [Van Dongen 10].
Terrain costs in grids are easy to dynamically update. For example, player-detected traps in Dragon Age: Origins are easily marked with a few bits in the relevant grid cells. It is easy for A* to account for these costs when planning, although the cost of planning will be increased if too many cells are re-weighted.
Passable cells can be quickly modified in a grid in a similar way to terrain costs being updated.
Localization in a grid is easy, simply requiring the coordinates to be divided by the grid resolution to return the localized grid cell.
The cons (of grids) are:
Grids are memory-intensive in large worlds. Note that a sparse representation can be used when the world is large, but the walkable space is relatively small [Sturtevant 11].
Path smoothing usually must be performed to remove the characteristic 45° and 90° angles that are found in grid-based movement, although any-angle planning approaches can also be used [Nash et al. 07].
Path planning in grids can be expensive due to the fine-grain representation of the world. This can be addressed using some form of abstraction [Rabin 00, Sturtevant 07].
Grid worlds often contain many symmetric paths, which can increase the cost of path planning. Some techniques can be used to avoid this (e.g., [Harabor and Grastien 11]), but this can also be avoided with different state representations.
Waypoint graphs represent the world as an abstract graph.
The pros (of waypoint graphs) are:
Waypoint graphs are relatively easy to implement.
Waypoint graphs are easy to modify if the changes are known ahead of time. For instance, if a door in the world closes and is locked, it is easy for the developer to mark the edges in the graph that cross the opening of the door and block them when the door is shut.
Waypoint graphs represent only a small fraction of the points found in a grid. This sparse representation of walkable space is both cheap to store and leads to inexpensive path planning requests.
The cons (of waypoint graphs) are:
Path quality can suffer if there are not enough walkable edges in the graph, but too many walkable edges will impact storage and planning complexity.
Waypoint graphs may require manual placement of nodes to get good path quality.
Localization on waypoint graphs requires mapping between game space and the graph. If a character is knocked off of the graph, it may be unclear where the character should actually be within the waypoint graph.
Because there is no explicit representation of the underlying state space, smoothing off the waypoint graph can result in characters getting stuck on physics or other objects.
Dynamic changes are difficult when they aren’t known ahead of time. If a character can create an unexpected hole in a wall, new connections on the waypoint graph are needed. However, it can be expensive to check all nearby connections to verify if they have become passable due to the changes in the map.
Navigation meshes represent the world using convex polygons [Tozour 04]. A special case of navigation meshes are constrained Delaunay triangulations [Chen 09], for which the world is only represented by triangles. Note that grids can also be seen as a special case of navigation meshes, as both representations use convex polygons, but their usage is significantly different in practice.
The pros (of nav meshes) are:
Polygons can represent worlds more accurately than grids, as they can represent non-grid-aligned worlds.
With the accurate representation of a polygon it is easier to correctly perform smoothing both before and during movement. This accuracy can also be used for tighter animation constraints.
Path planning on navigation meshes is usually fast, as the representation of the world is fairly coarse. But, this does not impact path quality, as characters are free to walk at any angle.
Navigation meshes are not as memory-intensive as grids as they can represent large spaces with just a few polygons.
The cons (of nav meshes) are:
The time required to implement a navigation mesh is significant, although good open-source implementations are available [Mononen 11].
Navigation meshes often require geometric algorithms, which may fail in special cases such as parallel lines, meaning that implementation is much more difficult [Chen 09].
Changes to navigation meshes can be difficult or expensive to implement, especially when contrasted with changes to grid worlds.
Localization on navigation meshes can be expensive if poorly implemented. Good implementations will use additional data structures like grids to speed up the process [Demyen 06].
Grids are most useful when the terrain is fundamentally 2D, when implementation time is limited, when the world is dynamic, and when sufficient memory is available. They are not well suited for very large open-world games, or for games where the exact bounds of walkable spaces are required for high-quality animation.
Waypoint graphs are most useful when implementation time is limited, when fast path planning is needed, and when an accurate representation of the world is not necessary.
Navigation meshes are best when there is adequate time for testing and implementation. They are the most flexible of the possible implementations when implemented well, but can be overkill for smaller projects.
By: D. Hunter Hale and G. Michael Youngblood
This chapter was very interesting, but I will (for now) stick with baking nav meshes in Godot.
Precomputed solutions for pathfinding were common on old generation consoles, but have rarely been used on current hardware. These solutions give the best results in terms of computation cost, as all path request results are precomputed in a lookup table. However, they have two drawbacks: memory cost and loss of flexibility. Currently most games use dynamic algorithms like A* for navigation.
In the context of MMO games, however, precomputed solutions are still used. While the corresponding servers are typically equipped with ample memory, they have very few CPU cycles available for each request.
Node: a polygon or a sublayer component.
Component: a group of connected nodes. There is always a path between any two nodes in a component.
Tile: a cell in the grid partitioned layer. It can have any number of components or nodes.
For each 10 × 10 m grid sector there are three different 10 × 10 m 2D arrays, or fields of data, used by this algorithm. These three field types are cost fields, integration fields, and flow fields. Cost fields store predetermined “path cost” values for each grid square and are used as input when building an integration field. Integration fields store integrated “cost to goal” values per grid location and are used as input when building a flow field. Finally, flow fields contain path goal directions. The following sections go over each field in more detail.
A cost field is an 8-bit field containing cost values in the range 0–255, where 255 is a special case used to represent walls, and 1–254 represent the path cost of traversing that grid location. Varying costs can be used to represent slopes or difficult-to-traverse areas, such as swamps. Cost fields have at least a cost of one for each grid location; if there is extra cost associated with that location, then it’s added to one.
If a 10 × 10 m sector is clear of all cost, then a global static “clear” cost field filled with ones is referenced instead. In this way, you only spend memory on cost fields that contain unique data. In an RTS game, there are a surprising number of clear sectors.
The integration field is a 24-bit field where the first 16 bits are the total integrated cost and the remaining 8 bits are used for integration flags such as “active wave front” and “line of sight.” You can optionally spend more memory for better flow results by using a 32-bit float for the integrated cost, making it a 40-bit field.
Flow fields are 8-bit fields with the first four bits used as an index into a direction lookup table and the second four bits as flags, such as “pathable” and “has line of sight.” The flow field holds all the primary directions and flags used by the agent’s steering pipeline for steering around hills and walls to flow toward the path goal.
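To make the bit layouts above concrete, here is a rough sketch of the three field types. The names, the eight-entry direction table, and the 0xFFFF “unreached” sentinel are illustrative assumptions rather than the chapter’s actual data structures (which would be tightly packed native arrays).

```python
SECTOR = 10              # 10 x 10 cells per sector

COST_WALL = 255          # special case: impassable
COST_CLEAR = 1           # minimum traversal cost

# Cost field: one byte per cell, 1-254 traversal cost, 255 = wall.
# A single shared "clear" field is referenced by sectors with no extra cost.
clear_cost_field = bytes([COST_CLEAR] * SECTOR * SECTOR)

def make_cost_field() -> bytearray:
    return bytearray(clear_cost_field)   # mutable copy for sectors with unique data

# Integration field: 16-bit integrated cost plus 8 bits of flags per cell.
FLAG_ACTIVE_WAVEFRONT = 1 << 0
FLAG_LINE_OF_SIGHT = 1 << 1

def make_integration_field():
    # (cost, flags) per cell; 0xFFFF used here as an assumed "unreached" sentinel.
    return [(0xFFFF, 0)] * (SECTOR * SECTOR)

# Flow field: 4-bit direction index plus 4 bits of flags per cell.
FLOW_FLAG_PATHABLE = 1 << 0
FLOW_FLAG_LOS = 1 << 1
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]   # illustrative 8-way table

def pack_flow_cell(direction_index: int, flags: int) -> int:
    return (direction_index & 0x0F) | ((flags & 0x0F) << 4)
```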
Units move through the grid following a static vector flow field. The flow field represents the optimal path direction at every cell in the grid, and is an approximation of a continuous flow function. Given a set of destination points, the flow function defines a vector field of normalized vectors, indicating the direction of the optimal path to the nearest destination. The flow function is similar to common methods for describing flows in fluid dynamics [Cabral and Leedom 93], with the difference that all flow vectors are normalized.
Flow fields guide units to the nearest destination in the same manner as a standard pathfinding system; however, the units’ pathing information is encoded in a flow field, removing the need for units to compute paths individually.
For example, if a bridge across a river is destroyed, the flow field only needs to be recomputed once to account for the change to pathable areas. Units following that flow field will implicitly change their respective paths in response to the change in the game world.
Units in Fieldrunners 2 use a limited, greedy, prioritized summation of five steering behaviors (four of which are shown in Figure 24.2). The five behaviors listed in descending order of priority include flow-field following, obstacle avoidance, separation, alignment, and cohesion. In each simulation step, a unit is only influenced by a specified total magnitude of steering forces. The forces resulting from steering behaviors are added to the running total in priority order until the maximum magnitude has been reached. Any steering forces that have not been added to the total are ignored.
Obstacle avoidance helps faster units maneuver intelligently around slower units. The implementations of the obstacle avoidance and separation behaviors differ slightly from Reynolds’ original implementation [Reynolds 99]. The obstacle avoidance steering behavior generates a “side stepping” force perpendicular to the unit’s velocity, and proportional to the position and relative velocity of the neighbor. The force generated by the separation steering behavior is scaled by the ratio of the kinetic energy of the neighbor to the kinetic energy of the unit the force is applied to. Units with smaller masses and velocities (presumably being more nimble) will more readily yield to larger, less maneuverable units. Finally, flow-field following moves the unit in the direction specified by the flow field. The flow-field direction at the position of the unit is computed by linearly interpolating the four closest flow vectors.
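The “limited, greedy, prioritized summation” can be read in more than one way; the sketch below shows one common interpretation, where each force (already sorted by priority) spends part of a fixed magnitude budget and anything left over is ignored. The budget value and function name are assumptions.

```python
import math

MAX_STEERING = 10.0  # illustrative cap on the total steering magnitude per step

def combine_prioritized(forces):
    """Add steering forces (already sorted by priority) to a running total
    until the magnitude budget is spent; leftover forces are ignored."""
    total_x = total_y = 0.0
    remaining = MAX_STEERING
    for fx, fy in forces:
        mag = math.hypot(fx, fy)
        if mag < 1e-6:
            continue
        if mag > remaining:
            scale = remaining / mag          # take only what still fits in the budget
            fx, fy, mag = fx * scale, fy * scale, remaining
        total_x += fx
        total_y += fy
        remaining -= mag
        if remaining <= 0.0:
            break
    return total_x, total_y
```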
The key attributes of a successful query language for position selection are:
Abstraction—We gain a lot of power if we can reapply existing criteria (keywords) in new ways.
Readability—The intent of a query should be understood as easily as possible by developers.
Extensibility—It should be easy to add new criteria to the language.
Efficiency—The language itself should not add overhead to the evaluation process.
Listing 26.1 shows a query that is made up of two subqueries, called “options.” The first option is preferred but, should it fail, subsequent options will be tried in the order that they are specified before failure is reported. In this example, the first option will collect hidespots effective against the current attack target within a radius of 15 m around the agent, discarding any that are less than 5 m from the agent and discarding any point closer to the attack target than the agent himself. Of the points that remain, it will prefer any available hard cover over soft cover and prefer the closest point. The second option (the fallback) generates a grid of points to move sideways or away from the target and prefers to block Line-of-Sight (LOS). Both options share the goal of moving to a new, safe location at least 5 m away. Thus, this query might be used in reaction to a grenade landing at the agent’s feet.
The Generation section of each option specifies the source of our candidate points. The Conditions section contains the filters that should be applied during evaluation, which must all pass if a point is to be valid. The Weights section tells the AI how to score the fitness of those valid points.
Each line in those sections is a Lua table entry comprising a string and a value. The string, in the syntax of our DSL, is formed from a sequence of keywords joined by underscores, which is trivially parsed back into the keywords themselves. Usually the most important keyword comes first—for example, hidespots or distance—and specifies the evaluation or generation method that we are going to apply. This can be referred to as the criterion.
If an agent runs towards a hidespot only to have an opponent occupy it before him, it can hurt us twice: first by the agent appearing to lack any anticipation of his opponent, and second by leaving him stranded in the open and abruptly changing direction towards other cover.
This is easy to prevent with a couple of provisions. The simplest is to focus on our current primary opponent and discard any point closer to our enemy than to us, effectively assuming that both agents will move with the same speed. We can implement this as a simple condition, such as:
Conditions = {canReachBefore_the_target = true}
The simplest thing we can do to make a big difference to performance in general is to pay attention to the order in which we evaluate our criteria (a sketch follows the list below). The rule of thumb for evaluation order is:
Cheap filters first, to discard points early
Weights, to allow us to sort into order
Expensive filters, evaluating from highest scoring down, until a point passes
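A hedged sketch of that ordering, assuming hypothetical condition and weight callables:

```python
def select_point(points, cheap_conditions, weights, expensive_conditions):
    # 1. Cheap filters first, to discard points early.
    survivors = [p for p in points
                 if all(cond(p) for cond in cheap_conditions)]

    # 2. Weights, to allow us to sort into order.
    survivors.sort(key=lambda p: sum(w(p) for w in weights), reverse=True)

    # 3. Expensive filters (e.g., raycasts), from highest scoring down,
    #    until a point passes.
    for p in survivors:
        if all(cond(p) for cond in expensive_conditions):
            return p
    return None   # no valid point found
```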
Requiring that opponents attack the player one at a time is a technique known as the Kung-Fu Circle, named after classic scenes from martial arts movies in which the protagonist faces off against dozens of foes who launch their attacks one at a time.
At a high level, the Belgian AI algorithm is built around the idea of a grid carried around with every creature in the game. While every NPC has a grid of its own, in practice the player is the game entity we are most concerned about, so we will use the player as our example throughout this article. The grid is world-space aligned and centered on the player, with eight empty slots for attacking creatures.
In addition to the physical location of those slots, the grid stores two variables: grid capacity and attack capacity. Grid capacity will work to place a limit on the number of creatures that can attack the player at once, while attack capacity will limit the number and types of attacks that they can use.
Every creature in the game is assigned a grid weight, which is the cost for that creature to be assigned a spot on someone’s grid. The total grid weight of the creatures attacking a character must be less than that character’s grid capacity. Similarly, every attack has an attack weight, and the total weight of all attacks being used against a character at any point in time must be less than that character’s attack capacity.
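A minimal sketch of those two capacity checks, with hypothetical names. The chapter phrases the limits as “less than” the capacity; the comparisons below allow totals up to and including the capacity, so tighten them if you want the strict reading.

```python
class BelgianGrid:
    """Hypothetical sketch of the per-character grid bookkeeping."""
    def __init__(self, grid_capacity: int, attack_capacity: int):
        self.grid_capacity = grid_capacity
        self.attack_capacity = attack_capacity
        self.attackers = {}        # creature -> grid weight
        self.active_attacks = {}   # creature -> attack weight

    def request_slot(self, creature, grid_weight: int) -> bool:
        # Reject if the creature's grid weight would exceed the grid capacity.
        if sum(self.attackers.values()) + grid_weight > self.grid_capacity:
            return False
        self.attackers[creature] = grid_weight
        return True

    def request_attack(self, creature, attack_weight: int) -> bool:
        if creature not in self.attackers:
            return False
        # Reject if the attack's weight would exceed the attack capacity.
        if sum(self.active_attacks.values()) + attack_weight > self.attack_capacity:
            return False
        self.active_attacks[creature] = attack_weight
        return True
```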
In order to tell whether an AI agent is able to see something, most games tend to use a raycast from the agent to the target. While this does test whether there’s a clear line of sight between the two, it can start to get expensive very quickly. For example, in our games we try to limit the number of active AI agents to 16. This means that every agent can potentially see 15 other AI characters. In the worst-case scenario, this could mean 120 raycasts being requested to check visibility between all the AI agents, even before the player is considered! In the AI system used for Crackdown 2, each agent could register how hostile a target needed to be before visibility checks should be done. These hostility levels were hostile, neutral, and friendly. Because AI agents registered an interest only in hostile targets, any nonhostile targets became invisible to the agent, dramatically reducing the number of raycasts required. Should a target change hostility (for example, going from being a neutral to a hostile target), the AI system will start or stop visibility tests as required.
Further optimizations can be made to the generation of visual stims. In the CryAISystem, every agent has a view distance and a field of view. A lot of unnecessary raycasts can be avoided by doing these much cheaper tests to see if potential visual targets are even within an agent’s view cone before requesting a raycast.
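Putting those cheap pre-checks together, a hedged sketch might look like the following; the field names (hostility, view_distance, half_fov in radians, and a unit-length facing vector) are assumptions for illustration.

```python
import math

def should_raycast(agent, target, min_hostility: int) -> bool:
    """Cheap rejection tests run before paying for a line-of-sight raycast."""
    if target.hostility < min_hostility:        # agent only cares about hostile-enough targets
        return False
    dx, dy = target.x - agent.x, target.y - agent.y
    dist = math.hypot(dx, dy)
    if dist > agent.view_distance:              # cheap distance test
        return False
    if dist > 1e-6:
        # facing_x/facing_y assumed to be a unit vector; compare against cone half-angle.
        forward_dot = (dx * agent.facing_x + dy * agent.facing_y) / dist
        if forward_dot < math.cos(agent.half_fov):
            return False                        # outside the view cone
    return True   # only now request the expensive raycast
```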
The remaining few stims tend to be events that you want to make your AI aware of, but don’t fit under the normal categories of sight or sound. For some of these events, you can effectively treat them as a dog whistle—create a sound stim without playing any audio and send that to the perception manager. These stims are then just handled in whatever way you need them to be.
By: Brook Miles
I find this a very interesting chapter. Nothing particularly complex, but these concepts fit into the kind of games I am interested in writing.
We determined that fundamentally there were two broad categories of interest sources our agents needed to detect in the world: things they could see and things they could hear. This gave us our two senses: sight and sound.
Detection by the sight test involves a series of checks including these questions: Is the interest source within one of the agent’s vision cones? Is the game object associated with the interest source currently lit by a light source, or does the agent have the night vision flag, which removes this requirement? Is there any collision blocking line of sight between the agent’s eye position and that of the interest source?
A vision cone, as shown in Figure 32.1, is typically defined by an offset and direction from the agent’s eye position, an angle defining how wide it is, and a maximum distance. Other vision geometry is possible as well; we have some which are simply a single ray, an entire circle, or a square or trapezoid for specific purposes.
For both performance and gameplay reasons, sight and sound interest sources define a maximum radius, and only agents within that radius are tested to determine whether they can detect the interest source.
From our two core senses, we can now allow the designers to create an interest source representing whatever object or event they want, so long as it can be detected via the sight or sound tests. Designers can specify a “gunshot” sound, or a “footstep” sound, a “corpse” sight, or a “suspect” sight. The agent’s behavioral scripts can use the specified interest source type to determine any special behavior, but the backend only needs to know how to determine whether the agent can see or hear the interest source.
You can also bend the definition of “sight” and “sound” somewhat. One special case of agent in Mark of the Ninja is guard dogs, who we want to be able to “smell” the player in the dark but, for gameplay reasons, only over a very short distance. In this case, instead of needing to create an entirely new smell test, we can simply define a sight interest source which is attached to the player game object, and require that any agent noticing it have the “dog” tag as shown in Listing 32.1. Voila, we have a “smell” interest.
An interest is just the record in the agent’s brain of what he’s interested in right this moment. It may have a reference to the interest source that created it (if there was one), but even if it doesn’t, it still has a copy of all of the necessary information, the sense for the interest, the source type, its position, priority, and so on. When an agent is determined to have detected an interest source, an interest record is added to its brain, and this is the information that the agent uses from that point on.
While sight and sound are the only available sense types for interest sources in Mark of the Ninja, interests of any arbitrarily defined sense can be added directly to an agent’s brain by the designer through a script call, as no additional testing needs to be done against them; the designer or script writer has already determined that this agent should be interested in whatever it is.
For example, a “touch” interest may be added to an agent’s brain when he is struck with a dart projectile, or a “missing partner” interest can be added if the agent’s partner goes off to investigate a sound and fails to return.
A question that arose early on was how groups of agents should respond when they all sense an interest simultaneously. At first it was every man for himself; each agent did its own test and upon sensing an interest would react, most likely by running to investigate it. This typically resulted in entire groups of agents running towards the slightest noise or converging on the player en masse. This wasn’t the kind of gameplay we were looking for. We want the player to be able to manipulate the agents, distract them, split them apart, and dispatch them on the player’s own terms.
By driving the detection of interest sources from the interest source itself, instead of from each agent individually, we can easily collect all of the information we need in order to determine who should be reacting, and how they should react.
The sensory manager update loop tests each interest source against all possible “detection candidates.” Given this list, it makes some decisions based mainly on group size, but possibly also by location or distance from the interest source. If only a single agent can detect the interest, our work is done, the agent is notified, and he goes to investigate. If more than one agent can detect the interest, we can assign roles to each agent, which are stored along with the interest record in the agent’s brain. Roles only have meaning within the context of a specific interest, and when that interest is forgotten or replaced, the role associated with it goes away too. If multiple agents can detect the interest, one is chosen as the “sentry” or “group leader” and he plays audio dialog telling the other agents nearby to go check out the interest and then hangs back waiting. One or more agents are given the “investigate” role and will go and investigate, seemingly at the command of the “group leader.” Any remaining agents will get the “bystander” role, and may indicate they’ve seen or heard the interest but otherwise hold position and decrease the priority of the interest in their mind so they are more likely to notice new interests for which they might be chosen as leader or investigator. The key is that once the roles are assigned, and the sensory update is complete, there is no “group” to manage. Each agent is acting independently, but due to the roles that were assigned, they behave differently from each other in a way that implies group coordination.
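A rough sketch of that role assignment, assuming hypothetical agent methods. The heuristic here (the closest agent calls it out, the next-closest investigates) is my own; the game’s actual selection is based mainly on group size and is not spelled out in these notes.

```python
def assign_roles(detectors, interest, max_investigators=1):
    """Assign roles to every agent that detected the interest this update;
    afterwards each agent acts independently on its stored role."""
    if not detectors:
        return
    if len(detectors) == 1:
        detectors[0].add_interest(interest, role="investigate")
        return
    by_distance = sorted(detectors, key=lambda a: a.distance_to(interest))
    leader, *others = by_distance                 # assumption: closest agent calls it out
    investigators = others[:max_investigators]
    for agent in by_distance:
        if agent is leader:
            agent.add_interest(interest, role="group_leader")
        elif agent in investigators:
            agent.add_interest(interest, role="investigate")
        else:
            agent.add_interest(interest, role="bystander")
```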
If you are investigating a broken light and come across a dead body, should you stop and investigate the body, or continue to look at the light? What if you hear your partner being stabbed by a Ninja, and then discover a broken light on your way to help him? Should you stop to investigate?
Interest sources, and by extension interest entries in agents’ brains, contain a simple integer priority value, where higher priority interests can replace lower or equal priority interests, and the agent will change his focus accordingly. If the agent currently holds a high priority interest, lower-priority interests are discarded and never enter the agent’s awareness (see the sketch after Listing 32.2).
Listing 32.2. Mark of the Ninja’s priority definitions for interests.
INTEREST_PRIORITY_LOWEST = 0
INTEREST_PRIORITY_BROKEN = 1
INTEREST_PRIORITY_MISSING = 2
INTEREST_PRIORITY_SUSPECT = 4
INTEREST_PRIORITY_SMOKE = 4
INTEREST_PRIORITY_CORPSE = 4
INTEREST_PRIORITY_NOISE_QUIET = 4
INTEREST_PRIORITY_NOISE_LOUD = 4
INTEREST_PRIORITY_BOX = 5
INTEREST_PRIORITY_SPIKEMINE = 5
INTEREST_PRIORITY_DISTRACTIONFLARE = 10
INTEREST_PRIORITY_TERROR = 20
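A minimal sketch of that replacement rule, assuming each agent’s brain tracks a single current interest; the names are hypothetical.

```python
def offer_interest(brain, new_interest) -> bool:
    """A new interest only enters the agent's awareness if its priority is
    greater than or equal to the current interest's priority."""
    current = brain.current_interest
    if current is not None and new_interest.priority < current.priority:
        return False   # lower-priority interests are discarded outright
    brain.current_interest = new_interest
    return True
```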
SocialObjectComponent - This component performs the task of coordinating much of the group formation aspect of the system. At its core, it is a component that handles set membership, allowing characters to request access to the group, removing characters that are no longer participating, and allocating resources and/or positions in the group structure. This system is also responsible for advertising the availability of the social interaction as well as organizing flow control for when resources are limited, which as an example is useful to control how many people are talking at once during a group discussion.
This SocialObjectComponent is usually either added to the world during instantiation of an object or it is added dynamically during the update of a character that is receptive to a social encounter. An example of the former is a hot dog stand, where the SocialObjectComponent is instantiated to control the behavior of the characters as they use the stand to buy hot dogs. An example of the latter would be when a character has true conditions for <idle>, <wants_social>, <sees_friend>, and <friend_also_wants_social>. When the conditions for social activity are met, the character spawns a GameObject, which has a SocialObjectComponent added. This becomes a proposal for social interaction and the SocialObjectComponent begins its role in coordinating the interaction.
The intracharacter coordination of the various social dynamics components is controlled by the SocialComponent. This component is responsible for querying the world as to available potential interactions, forming requests to participate, controlling the focus of attention, etc. Much of the work of this component is involved in handling events propagated through the game and sending events to the different components of the social dynamics system to handle. For instance, the social component sends an event to its parent GameObject to indicate that the attention of the character has changed.
There are three key components to this schedule system: schedules, schedule entries, and actions. Together, these concepts form the core of the background AI system.
Schedules are linked to NPCs with a pointer, an ID, or anything else that’s appropriate to the system. The schedule is the top-level interface for accessing any schedule data on the NPC and manipulating the schedule during runtime. Schedules contain a number of schedule entries, each of which represents a slice of time (see the following).
A schedule entry is a single slice of time within the schedule. It manages the lifecycle of the schedule entry and determines when it’s time to move to the next entry. Schedule entries also encapsulate the decision-making an NPC performs when it is looking for something to do. For this reason, schedule entries are typically implemented with a strategy pattern, or other design pattern that enables you to easily swap one entry for another [Gamma et al. 01].
It’s usually not enough to have an NPC interact with a targeted object. It’s much more common to tell an NPC to sleep in its own bed or to work at its particular forge. We want the best of both worlds, so the action system needs to deal with the concept of object ownership as well.
All game objects that can be interacted with should have a type ID, which defines the type of object it is. The object type is really just metadata for the designer to label groups of similar objects. For example, all beds could be grouped under the “bed” type. This allows NPCs to own an object of a particular type, which greatly simplifies the scheduling data. For example, a schedule could have an action that tells the NPC to go to bed. The NPC will look in its map of owned objects and check to see if it has a “bed” object. If it does, it will use that bed. If not, some default behavior can be defined. Perhaps the NPC will choose a random bed, or perhaps it will fail the action and choose a new one. This same action could be applied to multiple NPCs without modification and it would work just fine. Each NPC would go to its own appropriate bed.
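A small sketch of that ownership lookup with hypothetical names; the fallback policy (random unowned object of the same type, else fail) is just one of the defaults the chapter suggests.

```python
import random

def find_object_for_action(npc, object_type, world_objects):
    """Prefer an owned object of the requested type; otherwise fall back
    to a random unowned object of that type, or fail the action."""
    owned = npc.owned_objects.get(object_type)     # e.g., this NPC's own "bed"
    if owned is not None:
        return owned
    candidates = [o for o in world_objects
                  if o.type_id == object_type and o.owner is None]
    return random.choice(candidates) if candidates else None   # None = fail the action
```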
Another thing to consider is the rigid nature of schedules. If you have a schedule that sends NPCs to an inn for dinner and drinks at 6:00pm every day, you could easily have a pile of NPCs all stopping whatever they’re doing and immediately heading to the inn at the exact same time. A better solution is to randomize the schedule update time with a Gaussian distribution function centered on the end time for that schedule entry.
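A one-liner sketch of that jitter; the standard deviation is a made-up value.

```python
import random

def jittered_end_time(entry_end_minutes: float, std_dev_minutes: float = 10.0) -> float:
    """Stagger schedule transitions with a Gaussian centered on the entry's end time."""
    return random.gauss(entry_end_minutes, std_dev_minutes)
```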
Remember, the underlying structure behind this schedule system is a hierarchical finite-state machine.
$$e(t) = Y(t) - R(t)$$
$$u(t) = K_p \cdot e(t) + K_i \cdot \int e(t)dt + K_d \cdot \frac{d}{dt}e(t) $$
Set all gains (K) to 0 and increase Kp until the system behaves in a desirable manner with no overshoot or oscillation. Tuning this value first is advisable, as it will generally have the biggest effect on the output.
Increase Ki to eliminate the steady-state error.
Adjust Kd to reduce any overshoot or reduce the settling time as required.
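A minimal PID sketch following the equation above, using naive rectangular integration for the integral term and a finite difference for the derivative term; the class and variable names are mine.

```python
class PID:
    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error: float, dt: float) -> float:
        """Return the control output u(t) for the current error e(t)."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt if dt > 0.0 else 0.0
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```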
Position: It is not possible to drive in the same place as an observed vehicle, so write a large amount of heat at this position.
Block: If the observed vehicle is behind the player, it may make a good target to block so remove heat at this position.
Draft: If the observed vehicle is in a good position for drafting, remove heat based upon a drafting cone produced by this vehicle.
for each agent
    for each neighbor within radius
        calculate agent separation force away from neighbor
        calculate center of mass for agent cohesion force
        calculate average heading for agent alignment force
    calculate wander force
    sum prioritized weighted forces and apply to agent velocity
for each agent
    apply velocity to position
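A compact, self-contained sketch of that loop; all weights, radii, and the force cap are illustrative, and a prioritized truncated summation (as described earlier for Fieldrunners 2) could be substituted for the simple capped sum used here.

```python
import math
import random

# Illustrative parameters; tune per game.
SEPARATION_W, COHESION_W, ALIGNMENT_W, WANDER_W = 1.5, 1.0, 1.0, 0.5
NEIGHBOR_RADIUS = 5.0
MAX_FORCE = 2.0

class Agent:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.vx, self.vy = 0.0, 0.0

def update_flock(agents, dt):
    # First pass: compute steering forces and apply them to velocities.
    for a in agents:
        sep_x = sep_y = 0.0
        com_x = com_y = 0.0        # center-of-mass accumulator (cohesion)
        head_x = head_y = 0.0      # average-heading accumulator (alignment)
        count = 0
        for n in agents:
            if n is a:
                continue
            dx, dy = a.x - n.x, a.y - n.y
            dist = math.hypot(dx, dy)
            if dist >= NEIGHBOR_RADIUS or dist < 1e-6:
                continue
            count += 1
            sep_x += dx / dist     # push away from this neighbor
            sep_y += dy / dist
            com_x += n.x
            com_y += n.y
            head_x += n.vx
            head_y += n.vy
        fx = sep_x * SEPARATION_W
        fy = sep_y * SEPARATION_W
        if count:
            fx += (com_x / count - a.x) * COHESION_W   # steer toward neighbors' center
            fy += (com_y / count - a.y) * COHESION_W
            fx += (head_x / count) * ALIGNMENT_W       # match neighbors' heading
            fy += (head_y / count) * ALIGNMENT_W
        angle = random.uniform(0.0, 2.0 * math.pi)     # simple wander impulse
        fx += math.cos(angle) * WANDER_W
        fy += math.sin(angle) * WANDER_W
        mag = math.hypot(fx, fy)
        if mag > MAX_FORCE:                            # cap the combined steering force
            fx, fy = fx / mag * MAX_FORCE, fy / mag * MAX_FORCE
        a.vx += fx * dt
        a.vy += fy * dt
    # Second pass: apply velocities to positions.
    for a in agents:
        a.x += a.vx * dt
        a.y += a.vy * dt
```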
We have two basic types of events that can be posted. Information-only events notify the ASD of changes in the game state, which may not be directly related to a sound emitting action. For example, if a game object has been spawned or despawned, or if the game is paused or resumed, then we can use an information-only event to pass this knowledge to the AI. These events may not cause sounds to be played directly, but they can affect the way in which we play other sounds.
The second type of event is the play-request event. These events are used when we want to play a specific type of sound, such as a gunshot. These events replace where previously we would have called directly into the sound system. For example, we now post an event at the point when an explosion occurs, or when the animation system causes feet to strike the ground (so that we can play footstep sounds). By using play-request events, we give the ASD the opportunity to make decisions about which sound sample to play and how loud to play it (e.g., should the volume of gunshot events be reduced so that we can hear enemy speech?). The ASD can evaluate the event and use the player’s current context, and the state of the game objects involved, to drive its decisions.
$$P(\text{event } e) = \frac{\text{number of ways } e \text{ can happen}}{\text{number of all possible outcomes}}$$
$$P(\text{event } e) = \frac{\text{number of times } e \text{ occurred}}{\text{number of times any event occurred}}$$
This is useful for deriving probability from a sequence of past events. From this history, we can predict that the next event is most likely the event that has occurred most often.
In this sequence: “Jump, Jump, Dodge, Attack, Dodge, Attack, Jump, Jump, Dodge,” P(Jump) = 4/9 = 44%, P(Attack) = 2/9 = 22%, and P(Dodge) = 3/9 = 33%. Based on raw probability, the player will jump next. Do you agree with this prediction? Probably not, since you’ve likely picked up on a pattern. N-grams are used to find patterns in sequences of events. An N-gram could predict that a “Dodge, Attack” pattern was being executed. N-grams therefore provide more accurate predictions than raw probability alone.
Each N-gram has an order, which is the N value. 1-grams, 2-grams, and 3-grams are called unigrams, bigrams, and trigrams. Above that, we use 4-gram, 5-gram, and so on. The N value is the length of patterns that the N-gram will recognize (the N-tuples). N-grams step through event sequences in order and count each pattern of N events they find. As new events are added to the sequence, N-grams update their internal pattern counts.
Consider a game where we need to predict the player’s next move, Left or Right. So far, the player has moved in the following sequence: “R, R, L, R, R”. From this, we can compute the following N-grams:
There isn’t enough data yet for an N-gram with order greater than 5. Notice that we store occurrence counts rather than probabilities, which requires less computation. We can perform the divisions later to calculate probabilities when we need them. With patterns learned, our N-grams can now predict the next event.
An N-gram predicts by picking out an observed pattern that it thinks is being executed again. To do so, it considers the most recent events in the sequence, and matches them against previously observed patterns. This set of recent events we’ll call the window. (See Figure 48.1.)
The length of the window is always N-1, so each pattern includes one more event than the window length. Patterns “match” the window if their first N-1 events are the same. All matching patterns will have a unique Nth event. The pattern with the most occurrences has the highest probability, so its Nth event becomes the prediction. No division is required.
Let’s see what our N-grams will predict next in the same sequence, “R, R, L, R, R.” (See Table 48.1.)
Using N-gram statistics, we can make the following predictions:
The unigram predicts Right, since it uses raw probability. Its window size is N – 1 = 0.
The bigram has observed two patterns that match its window, RR and RL. Since RR occurred once more than RL, the bigram chooses RR and predicts Right.
The trigram finds that only RRL matches its window, so it uses RRL to predict Left.
The 4-gram has not observed any patterns that start with LRR, so it cannot make a prediction.
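A small sketch of an N-gram predictor along these lines; the class name and API are assumptions, and the final lines reproduce the trigram prediction from the example above.

```python
from collections import defaultdict

class NGram:
    def __init__(self, n: int):
        self.n = n
        self.counts = defaultdict(int)   # pattern (tuple of N events) -> occurrences
        self.history = []

    def observe(self, event: str) -> None:
        """Append an event and count the newest length-N pattern, if any."""
        self.history.append(event)
        if len(self.history) >= self.n:
            pattern = tuple(self.history[-self.n:])
            self.counts[pattern] += 1

    def predict(self):
        """Match the window (last N-1 events) against stored patterns and
        return the most frequent continuation, or None if nothing matches."""
        window = tuple(self.history[-(self.n - 1):]) if self.n > 1 else ()
        best_event, best_count = None, 0
        for pattern, count in self.counts.items():
            if pattern[:-1] == window and count > best_count:
                best_event, best_count = pattern[-1], count
        return best_event

# Reproducing the example: "R, R, L, R, R"
trigram = NGram(3)
for move in ["R", "R", "L", "R", "R"]:
    trigram.observe(move)
print(trigram.predict())   # -> "L", since only RRL matches the window (R, R)
```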
The probability that an event $e$ will occur next is equal to the probability of the matching pattern containing $e$ as its $N$th event, as shown in Equation 48.3.
$$P(\text{event } e \text{ is next}) = P(\text{matching pattern ending in } e \text{ is next})$$
$$P(\text{matching pattern } mp) = \frac{\text{number of } mp \text{ occurrences}}{\text{sum of all matching pattern occurrences}}$$
During prediction, you’ll likely check the occurrences for all matching patterns to find the max. You can sum them while you’re at it, and find the denominator in the same pass.
If needed, you can calculate the probability that an event pattern will occur again anytime in the future, as shown in Equation 48.5. This is a nonconditional probability, as it doesn’t use the current window.
$$P(\text{event pattern } ep) = \frac{\text{number of } ep \text{ occurrences}}{\text{sum of all pattern occurrences so far}}$$
$$\text{Total pattern occurrences } T = L - (N - 1)$$
where $L$ is the length of the event sequence. For example, with the sequence “R, R, L, R, R” ($L = 5$) and a bigram ($N = 2$), $T = 5 - 1 = 4$ pattern occurrences.