Needs-based AI: part 3, action performance

(This note is a part of the needs-based AI series)

Action Performance

Having chosen something to do, we push the advertisement’s actions onto the agent’s action queue, to be performed in order. Each action would typically be a complete mini-script. For example, the stove’s “clean” action might be a small script that:

  • Animates the agent getting out a sponge and starting to scrub the stove
  • Runs the animation loop, and an animated stove condition meter
  • Grants the promised reward

It’s important that the actual reward be granted manually during the action, and not be awarded automatically. This gives us two benefits:

  1. Interrupted actions will not be rewarding, and
  2. Objects can falsely advertise, and not actually grant the rewards they promised
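
As a rough illustration of granting the reward from inside the script, here is a minimal Python sketch. The names `Agent`, `CleanStoveAction`, `run`, and the “tidiness” need are all hypothetical, and the tick counter merely stands in for the animation loop. The point is only that the reward is one explicit step in the script, so an interrupted action never reaches it.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Need levels; higher means more satisfied. "tidiness" is a made-up need.
    needs: dict = field(default_factory=lambda: {"tidiness": 0.0})


@dataclass
class CleanStoveAction:
    """Hypothetical mini-script for the stove's "clean" action."""
    promised_reward: float = 10.0  # what the advertisement promised
    ticks_needed: int = 3          # how long the scrubbing loop runs
    ticks_done: int = 0
    finished: bool = False

    def tick(self, agent: Agent) -> None:
        """Advance the script by one step."""
        if self.finished:
            return
        self.ticks_done += 1  # stands in for one pass of the animation loop
        if self.ticks_done >= self.ticks_needed:
            # The reward is granted here, explicitly, as the script's last step.
            agent.needs["tidiness"] += self.promised_reward
            self.finished = True


def run(action, agent, interrupt_after=None):
    """Tick an action to completion, or stop early to simulate an interruption."""
    ticks = 0
    while not action.finished:
        if interrupt_after is not None and ticks >= interrupt_after:
            return False  # interrupted: the reward step was never reached
        action.tick(agent)
        ticks += 1
    return True
```

Calling `run(CleanStoveAction(), agent, interrupt_after=1)` leaves the agent’s need unchanged, while an uninterrupted run adds the promised reward. An object that wanted to advertise falsely would simply omit the reward step.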

False advertisement is an especially powerful but dangerous option. For example, suppose that we have a food item that advertises a hunger reward, but doesn’t actually award it. A hungry agent would be likely to pick that action, but since it got no reward, at the next selection point it would likely pick the same action again, and then again, and again. This quickly leads to very intriguing “addictive” behaviors.

This may seem like a useful way to force agents to perform an action. But it’s just as hard to make them stop once they’ve started. False advertisements create action loops that are very difficult to tune. In practice, forcing an action is easier done by just pushing the desired action on the agent’s action queue.
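
The loop described above can be reproduced with a toy simulation. This is only a sketch under simplifying assumptions: the selection rule (pick the largest promised reward while hunger is above a threshold), the `Advert` type, and the numbers are all invented for illustration and are not the scoring method from the earlier parts of this series.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Advert:
    name: str
    promised: float  # hunger relief the object advertises
    actual: float    # hunger relief the action really grants


FAKE = Advert("fake cake", promised=50.0, actual=0.0)   # false advertisement
REAL = Advert("sandwich", promised=30.0, actual=30.0)   # honest advertisement


def select(adverts, hunger, threshold=25.0):
    """Toy selection: if hungry enough, pick whichever advert promises the most."""
    if hunger < threshold:
        return None
    return max(adverts, key=lambda ad: ad.promised)


def simulate(adverts, hunger=80.0, selection_points=5, forced=None):
    """Return the names of the actions performed and the final hunger level."""
    queue = deque()
    if forced is not None:
        queue.append(forced)  # forcing an action: push it directly on the queue
    performed = []
    for _ in range(selection_points):
        if not queue:
            choice = select(adverts, hunger)
            if choice is None:
                performed.append("idle")
                continue
            queue.append(choice)
        action = queue.popleft()
        hunger = max(0.0, hunger - action.actual)  # only the actual reward counts
        performed.append(action.name)
    return performed, hunger
```

With both adverts available, `simulate([FAKE, REAL])` picks the fake cake at every selection point and hunger never drops. Passing `forced=REAL` shows the alternative: the sandwich runs once because it was pushed on the queue, regardless of what the adverts promise.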

 

Action Chaining

Performing a complex action such as cooking a meal usually involves several steps (such as prep and cooking), and several objects (a fridge, a cutting board, a stove). This sequence must not be atomic – steps can be interrupted, or they can fail due to some external factors.

Complex sequences are implemented by chaining multiple actions together. For example, the “eat dinner” action might involve several steps:

  1. Take a food item from the fridge
  2. Prepare food item on a counter
  3. Cook food item on the stove
  4. Sit down and eat, getting a hunger reward

Of course we don’t want to implement this as an atomic action; there is too much variability in the world for it to always work out perfectly.

We can add dynamism in a couple of ways. The simpler way is to represent this as a sequence of small atomic actions, and push each of them on the agent’s queue. This is straightforward, and has the nice effect of interruptibility: if something important comes up, we can load it on the front of the queue, and when it’s done, the agent will just get back to whatever it was doing.

Of course these steps can fail, in which case the entire chain should be aborted, potentially with interesting results. For example, a failed “cook food” action, in addition to interrupting cooking, might create a new “burned food” object that needs to be cleaned up.
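
A minimal sketch of this simpler approach might look like the following. Everything here is an assumption made for illustration: the `Step` type, the `chain` tag used to decide which queued steps belong together, and the `debris` field standing in for objects such as “burned food” that a failure leaves behind.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    name: str
    chain: str                    # which chain this step belongs to
    succeeds: bool = True         # stand-in for whatever makes a step fail
    debris: Optional[str] = None  # object left in the world on failure


def eat_dinner_chain(cooking_succeeds=True):
    """The whole chain is built up front, one small action per step."""
    return [
        Step("take food from fridge", "dinner"),
        Step("prepare food on counter", "dinner"),
        Step("cook food on stove", "dinner",
             succeeds=cooking_succeeds, debris="burned food"),
        Step("sit down and eat", "dinner"),
    ]


def run(queue, world, max_steps=None):
    """Run queued steps in order; a failed step aborts the rest of its chain."""
    log = []
    while queue and (max_steps is None or len(log) < max_steps):
        step = queue.popleft()
        if step.succeeds:
            log.append(step.name)
            continue
        log.append(step.name + " (failed)")
        if step.debris is not None:
            world.append(step.debris)  # the failure leaves something behind
        # Drop the remaining steps of the failed chain, keep everything else.
        survivors = [s for s in queue if s.chain != step.chain]
        queue.clear()
        queue.extend(survivors)
    return log
```

An interruption is just `queue.appendleft(Step("answer the phone", "phone"))`: the new step runs first, and the dinner steps that were already queued follow it untouched.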

The second method, more powerful but more difficult, is to implement action chaining by “lazy evaluation.” In this approach, only one action step is created and run at a time, and when it ends, it creates the next action and front-loads it on the queue.

For an example of how that might look, consider the “eat dinner” action again. The advertisement would specify only one action: take food. Once that step is done, it would find the nearest kitchen counter object, ask it for the “prepare food” action, and load that on the queue. Once “prepare food” was done, it would find the nearest stove, ask it for a new “cook food” action, and so on.

Doing action chaining this way has two benefits. First, it makes it possible to modify the chain based on what objects are available to the agent. For example, a microwave oven might create a different “cook food” action than a stove would, providing more variety and surprise for the player. Second, it makes interesting failures easier. For example, the stove can look up some internal variable (e.g. repair level) to decide whether it fails, and randomly push a “create a kitchen fire” action instead.
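
Here is one way the lazy variant could be sketched. The `Action` type, the `next_step` callback, the `nearest` lookup (which assumes the world is a list already sorted by distance), and the failure rule based on `repair_level` are all hypothetical choices made to keep the example small.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    name: str
    # Called when this action finishes; returns the follow-up action, if any.
    next_step: Optional[Callable[[], Optional["Action"]]] = None


class Counter:
    def prepare_food(self, then):
        return Action("prepare food on counter", then)


class Microwave:
    def cook_food(self, then):
        return Action("microwave food", then)


class Stove:
    def __init__(self, repair_level, rng=None):
        self.repair_level = repair_level  # 1.0 = never fails, 0.0 = always fails
        self.rng = rng or random.Random()

    def cook_food(self, then):
        if self.rng.random() >= self.repair_level:
            return Action("create a kitchen fire")  # the chain ends here
        return Action("cook food on stove", then)


def nearest(world, capability):
    """Assumes `world` is ordered by distance; returns the first capable object."""
    return next(obj for obj in world if hasattr(obj, capability))


def eat_dinner(world):
    """The advertisement specifies only the first step; the rest is created lazily."""
    def eat():
        return Action("sit down and eat")

    def cook():
        return nearest(world, "cook_food").cook_food(then=eat)

    def prep():
        return nearest(world, "prepare_food").prepare_food(then=cook)

    return Action("take food from fridge", next_step=prep)


def run_queue(queue):
    log = []
    while queue:
        action = queue.popleft()
        log.append(action.name)
        if action.next_step is not None:
            follow_up = action.next_step()
            if follow_up is not None:
                queue.appendleft(follow_up)  # front-load the next step
    return log
```

Because each object is looked up only when the previous step finishes, swapping the stove for a microwave changes the chain without touching `eat_dinner`, and a badly maintained stove can replace the cooking step with a fire.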

In either case, using an action queue provides nice modularity. Sequences of smaller action components are more loosely coupled, and arguably more maintainable, than standard state machines.

 

Action Chain State Saving

When an action chain is interrupted, we might want to be able to save its state somehow, so that it gets picked up later.

Since all actions are done on objects, one way to do this is to mutate the state of the object in question. For example, the progress of “cleaning” can be stored as a separate numeric cleanness value on an object, which gets continuously increased while the action is running.
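
As a small sketch of storing progress on the object itself, assume a hypothetical `Stove` with a `cleanness` value between 0.0 and 1.0 and a `clean` function that scrubs for a limited number of ticks:

```python
from dataclasses import dataclass


@dataclass
class Stove:
    cleanness: float = 0.0  # 0.0 = filthy, 1.0 = spotless


def clean(stove, ticks, rate=0.25):
    """Scrub for up to `ticks` steps; returns True once the stove is clean."""
    for _ in range(ticks):
        if stove.cleanness >= 1.0:
            break
        # Progress lives on the stove, not on the action.
        stove.cleanness = min(1.0, stove.cleanness + rate)
    return stove.cleanness >= 1.0
```

If the agent is interrupted after two ticks, the stove stays at 0.5, and a later “clean” action on the same stove continues from there instead of starting over.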

But sometimes actions involve multiple objects, or the state is more complicated. Another way to implement this is by adding state objects. An intuitive example is food from the original Sims: the action of prepping food creates a “prepped food” object, which cooking then turns into a pot of “cooked food”, which can be plated and turned into a “serving”. The state of preparation is then embedded right in the world: if you interrupt prepping, your cut-up food will just sit there, until you pick it up later and put it on the stove.
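
The state-object idea can be sketched as follows. This is not how The Sims implements it; the `FoodItem` type, the stage names, and the `make_dinner` helper are assumptions chosen to mirror the example above.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    stage: str  # "raw", "prepped", "cooked" or "serving"


# Each step consumes an object at one stage and produces the next one.
NEXT_STAGE = {"raw": "prepped", "prepped": "cooked", "cooked": "serving"}


def advance(world, item):
    """Perform one step: replace `item` with a new object at the next stage."""
    successor = FoodItem(NEXT_STAGE[item.stage])
    world[world.index(item)] = successor
    return successor


def make_dinner(world, max_steps):
    """Work on whatever food is lying around; stop after `max_steps` steps."""
    for _ in range(max_steps):
        pending = [food for food in world if food.stage in NEXT_STAGE]
        if not pending:
            break
        advance(world, pending[0])
    return [food.stage for food in world]
```

Nothing about the chain is saved on the agent. If `make_dinner` is interrupted after one step, the world simply contains a “prepped” item, and a later call picks it up from that stage.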

 
