At an abstract level, basic actions involve manipulating spatial or
geometric relations between objects (or possibly between the agent and
objects). Thus, part of the spatiotemporal specification of an action
can be a spatial goal, which consists of the type of the
manipulation (establishing, terminating, maintaining, or modifying) and
the relation that is being manipulated (see the figure below). The
relation is expressed as queries, which will
refer to spatial predicates on objects and agents. For instance, the
action in ``Put the block on the table'' would have the spatial goal of
establishing the relation of the block being on the table. The relation
could be expressed as a query to the block, ``block.on(table)'', which
would return a ``yes'' or ``no'' answer. The relation would be
considered established when the query returned a ``yes''. In order to
convert these abstract spatial goals into the lower level kinematics and
dynamics, an algorithm is needed that can understand the queries (e.g.,
what it means for something to be ``on'' something else) and convert
them into motions for establishing, terminating, etc., the relations they
express.
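As a minimal sketch of how such a goal might be represented, the fragment below pairs a manipulation type with a yes/no query. The names `Manipulation`, `Thing`, `SpatialGoal`, and the `supporter` attribute are hypothetical and chosen only for illustration; the text prescribes only the four manipulation types and a query such as ``block.on(table)''. The conversion of a goal into actual motions is not attempted here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Manipulation(Enum):
    # The four manipulation types named in the text.
    ESTABLISH = "establish"
    TERMINATE = "terminate"
    MAINTAIN = "maintain"
    MODIFY = "modify"


@dataclass
class Thing:
    """Hypothetical scene object; `supporter` is whatever it currently rests on."""
    name: str
    supporter: Optional["Thing"] = None

    def on(self, other: "Thing") -> bool:
        # Query in the spirit of block.on(table): returns a yes/no answer.
        return self.supporter is other


@dataclass
class SpatialGoal:
    manipulation: Manipulation
    relation: Callable[[], bool]  # zero-argument query returning yes/no

    def satisfied(self) -> bool:
        holds = self.relation()
        if self.manipulation in (Manipulation.ESTABLISH, Manipulation.MAINTAIN):
            return holds
        if self.manipulation is Manipulation.TERMINATE:
            return not holds
        # MODIFY would need a target value to compare against, which the
        # text does not specify, so it is left unimplemented in this sketch.
        raise NotImplementedError("MODIFY requires a target not modelled here")


# "Put the block on the table": establish block.on(table).
table = Thing("table")
block = Thing("block")
goal = SpatialGoal(Manipulation.ESTABLISH, lambda: block.on(table))
before = goal.satisfied()   # the query answers "no" before the action
block.supporter = table     # stand-in for the motion that carries out the action
after = goal.satisfied()    # the query answers "yes" once the relation holds
```

Keeping the relation as a callable query means the goal never stores the state of the scene itself; it only asks the objects, which matches the query-based formulation above.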
Figure: The spatiotemporal specification type
The spatial goals are provided by the lexical terms used in the input or, conversely, are formed by recognizing certain changes in directions or relationships in the object state. In either case we consider a set of terms that includes those defined in Badler's earlier work [Bad75].

The semantics of these terms for animation synthesis is to be implemented through PaT-Nets that transform the given parameters (e.g., the objects involved) into movement paths, vectors, or directions that are eventually processed by the primitive actions sent to the agent. For textual synthesis, PaT-Net ``recognizers'' as defined in [Bad75] can be executed to vote for the semantic term that most closely describes the changing situation being presented. Although many terms are recognized simply from directions relative to object labels (e.g., ``forward'' from movement with the front of the object in the lead), others require a multi-node PaT-Net to determine the sub-steps and completion of the activity (e.g., across, around, back-and-forth).
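The voting idea can be illustrated with a small sketch in which plain functions stand in for PaT-Net recognizers. This is not the PaT-Net machinery itself: the function names, the 2D vectors, the scoring by cosine, and the reversal-counting test for ``back-and-forth'' are all assumptions made for the example.

```python
import math


def _cos(a, b):
    """Cosine of the angle between 2D vectors a and b (0.0 if either is zero)."""
    na, nb = math.hypot(*a), math.hypot(*b)
    if na == 0 or nb == 0:
        return 0.0
    return (a[0] * b[0] + a[1] * b[1]) / (na * nb)


# Single-step recognizers: each returns a vote in [0, 1] for its term.
def vote_forward(front, displacement):
    return max(0.0, _cos(front, displacement))


def vote_backward(front, displacement):
    return max(0.0, -_cos(front, displacement))


def vote_sideways(front, displacement):
    if not any(displacement):
        return 0.0
    return 1.0 - abs(_cos(front, displacement))


RECOGNIZERS = {
    "forward": vote_forward,
    "backward": vote_backward,
    "sideways": vote_sideways,
}


def best_term(front, displacement):
    """Let every recognizer vote and return the term with the highest vote."""
    votes = {term: fn(front, displacement) for term, fn in RECOGNIZERS.items()}
    return max(votes, key=votes.get)


def is_back_and_forth(front, displacements, min_reversals=2):
    """Multi-step recognizer: counts reversals of direction along `front`."""
    reversals, last = 0, 0
    for d in displacements:
        c = _cos(front, d)
        sign = (c > 0) - (c < 0)
        if sign and last and sign != last:
            reversals += 1
        if sign:
            last = sign
    return reversals >= min_reversals
```

For an object facing along the x axis, `best_term((1, 0), (2, 0))` selects "forward", while a sequence of displacements that alternates along that axis is accepted by `is_back_and_forth`. The latter needs memory across steps, which is the reason the text assigns such terms to multi-node networks.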
Mathematically, a motion can be represented in any one of three domains: kinematic, dynamic, and frequency. The motion-generating primitive functions take parameterized inputs to generate the exact motions. The modifiers impose additional constraints on the motion. At any time, one or more of the constraints are active. The constraints can be specified in any of the three domains. The mathematical components of the representation are designed to facilitate the conversion of movements defined or specified in one domain into another. Conversion procedures exist to change kinematic data into dynamic information (inverse dynamics) and vice versa (forward dynamics) [KMB96].
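To make the two conversions concrete, the following sketch treats the simplest possible case, a point mass moving along one axis. It is not the procedure of [KMB96]; the function names, the central-difference estimate of acceleration, and the semi-implicit Euler integrator are assumptions chosen for brevity.

```python
def inverse_dynamics(positions, dt, mass):
    """Kinematics -> dynamics: forces implied by uniformly sampled positions.

    Acceleration is estimated with a central difference, so forces are
    returned only for the interior samples.
    """
    forces = []
    for i in range(1, len(positions) - 1):
        accel = (positions[i + 1] - 2 * positions[i] + positions[i - 1]) / dt ** 2
        forces.append(mass * accel)  # Newton's second law, F = m a
    return forces


def forward_dynamics(forces, dt, mass, x0=0.0, v0=0.0):
    """Dynamics -> kinematics: integrate forces into positions.

    Uses semi-implicit Euler: update velocity first, then position.
    """
    xs, x, v = [x0], x0, v0
    for f in forces:
        v += (f / mass) * dt
        x += v * dt
        xs.append(x)
    return xs


# A mass of 3 moving as x(t) = t^2 has constant acceleration 2,
# so inverse dynamics should report a force close to 6 at every sample.
dt = 0.1
positions = [(i * dt) ** 2 for i in range(5)]
forces = inverse_dynamics(positions, dt, mass=3.0)
path = forward_dynamics(forces, dt, mass=3.0)
```

Because the integrator is only first-order accurate, `path` approximates rather than reproduces the original samples; an articulated figure would additionally require the joint-level equations of motion.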
The basic parameters for modifying a motion in the kinematic domain are position, time, velocity, acceleration, and path. These parameters are in general relative, but they could also be global. In the dynamic domain, given a motion and the agent or object involved, it is possible to compute the forces and torques at the joints required to generate the given motion. If an object is able to rotate about an axis, then a force applied perpendicular to that axis, along a line that does not pass through it, produces a torque that causes the object to rotate. The angular acceleration is governed by the magnitude of this torque. So, to modify the motion in the dynamic domain, we can specify relative forces and torques. In the frequency domain, we can represent the motion as a function of time using a Fourier series expansion. The parameters to vary the motion in this domain could then be mapped to period and amplitude.
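The frequency-domain case can be sketched as a truncated Fourier series, q(t) = a0 + sum over k of (a_k cos(2 pi k t / T) + b_k sin(2 pi k t / T)), in which modifying the motion amounts to scaling the coefficients (amplitude) or T (period). The class name `FourierMotion` and its interface are hypothetical and serve only to illustrate this mapping.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FourierMotion:
    """One motion parameter (e.g., a joint angle) as a truncated Fourier series."""
    period: float
    a0: float
    harmonics: List[Tuple[float, float]]  # (a_k, b_k) for k = 1, 2, ...

    def value(self, t: float) -> float:
        w = 2 * math.pi / self.period  # fundamental angular frequency
        return self.a0 + sum(
            a * math.cos(k * w * t) + b * math.sin(k * w * t)
            for k, (a, b) in enumerate(self.harmonics, start=1)
        )

    def scaled(self, amplitude: float = 1.0, period: float = 1.0) -> "FourierMotion":
        """Return a modified motion: harmonics scaled by `amplitude`,
        period stretched by `period`. The offset a0 is left unchanged."""
        return FourierMotion(
            self.period * period,
            self.a0,
            [(a * amplitude, b * amplitude) for a, b in self.harmonics],
        )


# A simple swing with period 2: q(t) = sin(pi t).
swing = FourierMotion(period=2.0, a0=0.0, harmonics=[(0.0, 1.0)])
# Twice as wide and twice as slow: q(t) = 2 sin(pi t / 2).
slow_wide = swing.scaled(amplitude=2.0, period=2.0)
```

Scaling all harmonics by the same factor preserves the shape of the motion while changing its extent, and stretching the period changes its tempo without altering the path.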