Chapter 4. TECHNIQUES TO AID MOTION SPECIFICATION

4.1 Interpolation
4.2 Ease-In/Ease-Out
4.3 Orientation Representation and Interpolation
4.4 Camera Path Following
4.5 Simple Key Frame System
4.6 A Simple Animation Language
4.7 Hierarchical Structure Animation
4.8 More Sophisticated Animation Languages
4.9 Key Frame and Track-Based Animation Systems
4.10 Shape Interpolation, Metamorphosis
4.11 Implicit Surfaces

Any technique short of requiring the animator to specify each and every pixel value at each and every frame attempts, to some extent, to abstract out the important features of the animation. Of course, selecting the important feature of an animation is a very subjective thing to do. Consequently, the appropriate feature for one kind of animation will not be the appropriate feature for another kind of animation. Thus, this section refers to the various techniques in which the animator is responsible for specifying most of the information. That is, the animator is working at a fairly low level of abstraction. These techniques differ from model-based techniques in that there is a more direct relationship between the information provided by the animator and the resulting motion. In contrast, in model-based animation, the animator works at a high level of animation and, as a result, gives up more control of the specific motion, but less information is required from the animator.

4.1 Interpolation

At the foundation of almost all animation is the interpolation of values. The simplest case is interpolating the position of a point in space. Even this is non-trivial to do correctly and requires some discussion of several issues: the appropriate parameterization of position, the appropriate interpolating function, and maintaining the desired control of the interpolation over time.

Often, an animator has a list of values associated with a given parameter at specific frames (called key frames or keys) of the animation. The question to be answered is how best to generate the values of the parameter for the frames between the key frames. The parameter to be interpolated may be a coordinate of the position of an object, or a joint angle of an appendage of a robot, or the transparency attribute of an object, or any other parameter used in the display and manipulation of computer graphic elements.

Appendix A contains a detailed discussion of specific interpolation techniques. Here, we will discuss the issues that determine how to choose the most appropriate interpolation technique and apply it in the production of an animated sequence. One of the first decisions to make is whether the given values represent actual values that the parameter should have at the key frames (interpolation), or whether they are meant merely to control the interpolating function and do not represent actual values the parameter will assume (approximation). Other issues that influence which interpolation technique to use include how smooth the resulting function needs to be (i.e. continuity), how much computation you can afford to do (order of interpolating polynomial, and whether local or global control of the interpolating function is required. See Appendix A for a discussion of all of these issues. Suffice to say at this point, the Catmull-Romm spline is often used in animation path movement because it is an interpolating spline and requires no additional information from the user besides the points that the path is to pass through. Parabolic blending is an often-overlooked technique that also affords local control and is interpolating. See Mortenson's book or Rogers and Adams' book for discussion of such curves.

We will assume that a interpolating technique has been chosen and that a function P(t) has been selected which, for a given value of t, will produce a value, i.e., p = P(t). If position is being interpolated then three functions are used in the following manner. The x, y and z coordinates for the positions at the key frames are specified. Key frames are associated with specific values of the time (t) parameter. The x, y and z coordinates are considered independently. For example, the points (x,t) are used as control points for the interpolating curve so that X=Px(t) results, where P denotes an interpolating function and the subscript x is used to denote that this specific curve was formed using the x coordinates of the key positions. Similarly, Y=Py(t) and Z=Pz(t) are formed so that at any time, t, a position (x,y,z) can be produced.

Varying the parameter of interpolation, in this case t, by a constant amount, does not mean that the resulting values, in this case Euclidean position, will vary by a constant amount. This means that just because t changes at a constant rate, the interpolated value will not necessarily, in fact seldom, have a constant speed.

In order to ensure a constant speed for the interpolated value, the interpolating function has to be parameterized by arc length, i.e., distance along the curve of interpolation. Some type of reparameterization by arc length should be performed for most applications. Usually this reparameterization can be approximated without adding undue overhead or complexity to the interpolating function.

Reparameterization by Arc Length

Assume that we have a spline defined by P(t). As noted above, we have no guarantee that the variance in t is directly related to the distance traveled along the curve. That is to say, that the distance traveled from P(0) to P(.2) is not necessarily twice as far as the distance traveled along the curve from P(.2) to P(.3). In order to establish this relationship we have to reparameterize the interpolating spline by arc length or some scalar of arc length. There are various ways to approach this. Analytically, arc length is defined thusly:

    L = integral from u1 to u2 of sqrt(pu . pu) du

Gassian quadrature to reduce the integra to

   L = summation from 1 to n of w-sub-i times f(evaluated at u-sub-i)

where n is the number of sample points, w-sub-i are the weight values, and u-sub-i are the sampled values. See Mortenson's book, pp. 299-300. The u's can be normalized to the range zero to one and tables of weights and sample values can be found in tables. See Numerical Recipies for a discussion of Gaussian Quadrature.

Most of the time, the curves that arise in computer animation applications are not analytically reparameterizable by arc length. They must be reparameterized numerically. A simple, but somewhat inaccurate, approach to this reparameterization is to precompute a table of values which relates the original parameter with an arc length parameter. One the number of entries in the table depends on the accuracy with which the arc length must be computed. This is determined by the application. The function is evaluated at n equidistant parameter values (equidistant in parameter space, e.g., t = 0.0, 0.01, 0.02, 0.03, etc.). N should be sufficiently large to ensure that the resulting arc lengths are within tolerance; this will become clear as the technique is described.

Using the parameter values and corresponding values of the function, a table is built which records accumulated linear distances between the computed points. Entries are then made in the table which record normalized distances along the curve. These entries are monotonically increasing from zero to one and represent the arc length parameterization (a scaled version of the arc length). This table can be precomputed before being used in the animation. This table can be used to determine the functional parameter needed to produce a point along the curve that corresponds to arc length.

An alternative technique which is computationally more efficient but programmatically more difficult. This technique uses adaptive Gaussian quadrature to determine the value of the function at the point along the curve that corresponds to the given arc length. See Brian Guenter's article.

4.2 Ease-In/Ease-Out

Note the difference between interpolating position along a curve and the speed along which the interpolation proceeds (consider the interpolating parameter to be 'time'). For example you can do linear interpolation in space but cubic interpolation of the distance with respect to the time parameter. If t is a parameter that goes between 0 and 1 then ease-in/ease-out can be implemented by a function t'=ease(t) so that if t varies uniformly between 0 and 1, t' will start at 0 slowly increasing, gaining speed until the middle values and then decelerating at it approaches 1. t' is then used as the interpolation parameter in whatever function produces spatial values.

Sin Interpolation

One way to implement ease-in/ease-out is to use the shape of the section of the sin curve as the ease(t) function:

t' = ease(t) = (sin(t*PI-PI/2)+1)/2

sin ease-in/ease-out

In this function, t is the input which is to be varied uniformly from zero to one (e.g., 0.0, 0.1, 0.2, 0.3, ...). The t' is the output which also goes from zero to one, but does so by starting off slow, speeding up, and then slowing down. For example, at t = .25, t' = .1465, and at t = .75, t' = .8535. With the sin/cosine ease-in/ease-out function presented here, the 'speed' of t' with respect to t is never constant for an interval but, rather, is always accelerating or decelerating.

Another method is to have the user specify times t1 and t2. A sinusoidal curve is used for velocity to implement an acceleration from time 0 to t1. A sinusoidal curve is also used for velocity to implement deceleration from time t2 to 1. Between times t1 and t2, a constant velocity is used. This is done by taking a parameter t in the range 0 to 1 and remapping it into that range according to the above velocity curves to get a new parameter rt. So as t varies uniformly from 0 to 1, rt will accelerate from 0, then maintain a constant parametric velocity and then decelerate back to 1.

  double ease(t,t1,t2)
  double   t,t1,t2;
  {
    double  rt,s,e1,e2,nt,rt;

    e1 = t1*2/PI;               /* distance after t1 acceleration */
    e2 = (1-t2)*2/PI;           /* distance after (1-t2) deceleration */

    if (tt2) {
      nt = (t-t2)/(1.0-t2);     /* fraction into deceleration */
      s = sin(nt*PI/2);         /* use sin quadrant for acceleration */
      rt = e1 + (t2-t1) + s*e2; /* distance after s */
    }
    else {		
      rt = e1 + t - t1;         /* distance after t */
    }

    rt = rt/(e1+(t2-t1)+e2);    /* scale back into 0:1 range */
    return(rt);
  }

Constant Acceleration: Parabolic Ease-In/Ease-Out

An alternative approach and one that avoids the transcendental function evaluation, or corresponding table look-up and interpolation, is to establish basic assumptions about the acceleration and, from there, integrate to get the resulting interpolation function.

The default case of no ease-in/ease-out would produce a velocity curve that is a horizontal straight line of v0 as t goes from 0 to 1. The distance covered would be ease(1) = v0*1. To implement an ease-in/ease-out function, assume constant acceleration and deceleration at the beginning and end of the motion, and zero acceleration during the middle of the motion.

acc1

Integrate to get velocity. This produces a linear ramp for accelerating and decelerating velocities

acc2

and a distance function with parabolic sections at both ends.

acc3

(Distance in this case is in parametric space, or t-space, and is the distance covered by the output value of t. For a given range of t = [0,1], the user specifies times to control acceleration and deceleration: a and b. Acceleration occurs from time 0 to time a. Deceleration occurs from time b to time 1.

  t' = ease(t)
  v(t)  = v0*t/a                for t = [0,a]
        = v0                        t = [a,b]
        = v0 - (t-b)*v0/(1-b)       t = [b,1]
        = (1-t)*v0/(1-b)

where the distance covered must be equal to the distance covered in the default case. Of course the time can be renormalized into whatever range the user is interested in. For now, we will use a normalized distance of 1 to solve for v0 with the familiar distance = average-speed*time:

  1 = a*v0/2 + (b-a)*v0 + (1-b)*v0/2;
or
  v0 = 2/(b-a+1)

Now, integrating v(t) to get distance as a function of t:

  d = v0*t**2/(2*a)                                 for t = [0,a]
    = v0*a/2+(t-a)*v0                                   t = [a,b]
    = v0*a/2+(b-a)*v0 + (t-t**2/2-b+b**2/2)*v0/(1-b)    t = [b,1]

  double ease(t,a,b)
  double  t,t1,t2;
  {
    double  v0,a1,a2;

    v0 = 2/(1+b-a);          /* constant velocity attained */
    if (t&lta) {
      d = v0*t*t/(2*a);
    }
    else {
      d = v0*a/2;
      if (t&ltb) {
        d += (t-a)*v0;
      }
      else {
        d += (b-a)*vo
        d += (t-t*t/2-b+b*b/2)*vo/(1-b);
      }
    }
    return(d);
  }

Sin Acceleration/Deceleration with Intermediate Constant Speed

Another method is to have the user specify times a and b, similar to the constant acceleration technique described above. However, in this case a segment of the sin curve is used to control the velocity acceleration and deceleration. Between times a and b, a constant velocity is used.

The development of this is similar to that for constant acceleration. A velocity curve is constructed from sin segments with interior constant velocity. The constant velocity, v0, is unknown.

sinALt

The distance covered can be computed as a function of v0. A normalized distance of one can be used and v0 can be solved for. From this the equations are integrated and equations for the distance covered can be derived (and is left as an exercise for the reader).

4.3 Orientation Representation and Interpolation

A common problem to address in computer animation is how to represent the position and orientation of an object in space, and how to change this representation over time to produce animation. A typical scenario is one in which the user specifies two such representations and the computer is used to interpolate intermediate positions thus producing animated motion. We will assume that there is no scaling involved, non-uniform or otherwise, so we are referring to rigid body transformations.

The first obvious choice for representing the orientation and position of an object is by the 4x4 transformation matrix. A user can interactively rotate and translate an object to produce a compound 4x4 transformation matrix. In such a matrix the upper left 3x3 matrix represents a rotation to apply to the object and the first three elements of the fourth row represent the translation. (Assuming points are represented by row vectors which are post-multipied by the transformation matrix.) No matter how the 4x4 transformation matrix was formed (no matter what order the transformations were given by the user, such as rotate, translate, rotate, rotate, translate, rotate), the final 4x4 transformation matrix produced by concatenating all of the individual transformation matrix in the specified order will result in a 4x4 matrix which specifies the final position of the object by a 3x3 rotation matrix followed by a translation.

orientation1

The conclusion is that the rotation can be interpolated independent from the translation. (For now, we will just consider linear interpolation although higher order interpolations are possible. See Appendix A.)

Now consider two such transformations that the user has specified as key positions that the computer should interpolate between. While it should be obvious that interpolating the translations is straightforward, it is not clear at all the interpolating the rotations will produce intuitive results. In fact, it is the objective of this introduction to show that interpolation of orientations can be a problem. A property of 3x3 rigid-body rotation matrices is that the rows and columns are orthonormal (unit length and perpendicular to each other). Simple linear interpolation between the nine pairs of numbers which make up the two 3x3 rotation matrices to be interpolated will not produce intermediate 3x3 matrices which are orthonormal and are, therefore, not rigid-body rotations. It should be pretty easy to see that interpolating a rotation of +90 degrees about the y-axis with a rotation of -90 degrees about the y-axis results in transformations which are nonsense. There are alternative representations which are more useful that transformation matrices in performing such interpolations.

Another problem with using transformation matrices to represent orientations is that, if incremental rotations are accumulated, errors can arise in such a way that an invalid representation results. A fixed angle representation avoids the invalid representation problem (although it, too, is subject to errors as shown below.)

Fixed Angle Representation

Fixed angle representation really means 'angles about fixed axes'. A fixed order of three rotations is implied, such as x-y-z. This means that orientation is given by a set of three ordered angles, first around x, then around y, then around z. There are many possible orderings of the rotations and, in fact, it is not even necessary to use all three coordinate axes. For example, x-y-x is a feasible set of rotations. The only ones that don't make sense are those orderings in which an axis immediately follows itself such as x-x-y.

In any case, the main point is that the orientation of an object is given by three angles, such as (10,45,90). In this example, the orientation represented is the one obtained by rotating the object first about the x-axis by 10 degrees, then about the y-axis by 45 degrees, then about the z-axis by 90 degrees. We will use the following notation to represent such a sequence of rotations: Rx(10) Ry(45) Rz(90) (points are post-multiplied by transformation matrices).

The problem with using this scheme is that two of the axes of rotation can essentially line up on top of each other when you consider what a slight change in value does from a given representation. For example, consider the orientation represented by (0,90,0) and how slight changes in the parameters will effect the object from this orientation.

fixed1

A slight change of the third parameter will rotate the object slightly about the z-axis. However, note that a slight change of the first parameter will also rotate the object slightly about the z-axis because the 90 degree y-axis rotation has essentially made the first axis of rotation align with the third axis of rotation. This is what is called gimbal lock. From the initial position (0,90,0) the object can no longer be rotated about the global x-axis by a simple change in its orientation representation (actually, the representation that will effect such an orientation is (90,90+epsilon,90) - not very intuitive!).

This same problem makes interpolation between key positions problematic in some cases. Consider the key orientations of (0,90,0) and (90,45,90).

fixed1 fixed2

The second orientation is a 45 degree x-axis rotation from the first position. However, as discussed above, the object can no longer directly rotate about the x-axis from the first key orientation because of the 90-degree y-axis rotation. Direct interpolation of the key orientation representations would produce (45,67.5,45) which is very different from the (90,22.5,90) orientation that is desired). Below is the half-way image produced when viewing from the negative z-axis toward the origin.

In its favor, the Fixed Angle Representation is compact, intuitive and easy to work with.

Euler Angle Representation

In Euler angle representation, the axes of rotation are fixed to the object as opposed to being global axes. Consider x-y-z Euler angle representation:. Rx(alpha) followed by Ry(beta), but around the local, rotated coordinate system. Using a prime symbol to represent rotation about a rotated frame, we have: Rx(alpha)R'y(beta). A y rotation around the rotated frame can be effected by Rx(-alpha)Ry(beta)Rx(alpha). So the result after the first two rotations is:

Rx(alha)R'y(beta) = Rx(alpha) Rx(-alpha) Ry(beta) Rx(alpha) = Ry(beta) Rx(alpha)

The third rotation, Rz(gamma), is around the now twice rotated frame which can be effected again by undoing the previous stuff by Rx(-alph)Ry(-beta) followed by Rz(gamma) and then redoing the previous rotations. Putting all three rotation together, and using a double prime to denote rotation about a twice rotated frame, gives us:

Rx (alpha) R'y (beta) R''z (gamma) = Ry(beta) Rx(alph) Rx(-alph) Ry(-beta) Rz(gamma) Ry(beta) Rx(alpha) = Rz(gamma) Ry(beta) Rx(alph)

This system of Euler angles is really equivalent to the fixed angle system in reverse order. For example, z-y-x Euler angles are equivalent to x-y-z fixed angles. Therefore, the Euler Angle Representation has the same advantages and disadvantages that the Fixed Angle Representation has.

Euler Parameters

Euler showed that one orientation can be derived from another by a single rotation about an axis. Thus an orientation can be represented by a four-tuple of angle and (x,y,z) vector. The problem, however is that this representation cannot be used easily to interpolate between orietnations or to form a representation that represents concatenated rotations.

Quaternions

As shown, the above methods have problems when one tries to interpolate positions in those representations. A better approach is to use quaternions to represent orientation. Quaternions were actually developed by Euler in XXXX(?) for YYYY(?). A quaternion is a four-tuple, [s,(x,y,z)] or [s,v], consisting of a scalar part ,s, and a vector part, v. Hence, the quaternion is an alternate representation to the Euler parameter representation. Importantly, it's in a form that can be interpolated and concatenated. Quaternion multiplication is defined as:

[s1,v1] * [s2,v2] = [(s1*s2-v1.v2),(s1*v2+s2*v1+v1xv2)]

The inverse of a quaternion is obtained by negating its vector part and dividing both parts by the magnitude squared:

q-1 = (1/||q||2)*[s,-v] where ||q||2 = s2 + v.v

To rotate a vector, v, by a quaternion, q, treat the vector as [0,v] and:

v' = Rot(v) = q-1*v*q

Notice how this guarantees that vector cross products are preserved:

Rot(v1) x Rot(v2) = Rot(v1 x v2)

Also notice that because all effects of magnitude are divided out, any scalar multiple of a quaternion gives the same rotation, similar to the homogeneous representation of points.

The interpretation of the quaternion representation is that of an axis of rotation and an angle. Euler in the 1600s (?) showed that any orientation can be represented by a single axis of rotation and a single angle. The unit quaternion representation of an orientation is in the form of

q = Rot((x,y,z),�) = [cos(�/2),sin(�/2)(x,y,z)]

where � is the angle and (x,y,z) is the axis of rotation. Notice that as far as the rotational interpretation is concerned, q = -q. In fact, q = k*q where k is an arbitrary non-zero constant.

The sphere of unit quaternions forms a subgroup, S3, of the quaternion group. We can rotate without speeding up by interpolating on the sphere, specifically along the great arc between the two quaternion points. A formula for spherical linear interpolation from q1 to q2 with parameter u varying from 0 to 1 can be obtained in two different ways. From the group structure we find:

Slerp(q1,q2,u) = q1(q1-1q2)u

while from 4D geometry comes:

Slerp(q1,q2,u) = (sin((1-u)*�)/sin(�))*q1 + (sin(u*�)/sin(�))*q2

where q1.q2 = cos(�). The first is simpler for analysis while the second is more practical for applications. Notice that in the second one in the case u = 1/2, Slerp(q1,q2,u) = q1 + q2.

When interpolating between a series of orientations, slerping has the same problems that linear interpolation does between points in Euclidean space - first order discontinuities (see Appendix on interpolation). In Schumaker's paper [ref], he suggests using cubic Bezier interpolation to smooth the interpolation between orientations. He uses a technique which automatically calculates reasonable interior control points to define cubic segments between each pair of orientations.

Assume for now that we need to interpolated between two-dimensional points in Euclidean space. In order to calculate the control point following any particular point, qn, take the vector defined by qn-1 to qn and add it to the point qn. Now take this point and find the average of it and qn+1. This point becomes an. Take the vector defined by an to qn and add it to qn to get bn. bn and an are the control points immediately before and after the point qn. A cubic Bezier curve segment is then defined by the point qn, an, bn+1, qn+1.

It should be easy to see how this procedure can be converted into the 4D spherical world of quaternions. Instead of adding vectors, rotations are concatenated. Averaging of orientations can easily be done by slerping to the half-way orientation.

Once the internal control points are computed, the De Castenue algorithm can be applied to interpolate points along the curve. In the Euclidean case of Bezier curve interpolation, the De Castenue construction procedure looks like the following. [include diagram of De Castenue algorithm]

The same procedure can be used to construct the Bezier curve in four-dimensional spherical space. To obtain an orientation corresponding to the u = 1/4 position alogn the curve, for example, the following orientations are computed:

	p1 = slerp(qn,an,1/4)
	p2 = slerp(an,bn+1,1/4)
	p3 = slerp(bn+1,qn+1,1/4)
	p12 = slerp(p1,p2,1/4)
	p23 = slerp(p2,p3,1/4)
	p = slerp(p12,p23,1/4)

where p is the orientation 1/4 along the cubic spline. [include spherical De Castenue algorithm]

4.4 Camera Path Following

One of the simplest types of animation is that in which everything remains static except the camera. These are commonly referred to as walk throughs or flybys. Essentially the camera can be considered just as any other object as far as orientation and positioning is concerned. The user needs to construct a path through space for the observer to follow along with orientation information. The path is usually specified by key frame positioning followed by interpolation of the inbetween frames. There are various ways to deal with the view direction. For example, the center of interest can be held constant while observer position is interpolated along a curve (the view direction is the vector between the observer position and the center of interest). This is useful when the observer is flying over an environment inspecting a specific location such as a building. A path for the center of interest can be constructed, say from a series of buildings in an environment. Often, the animator will want the center of interest to stay focused on one building for a few frames before it goes to the next building. Alternatively, the center of interest can be controlled by other points along the observer path. For example, observer position for the next frame can be used to determine the view direction for the current frame. Sometimes this is too jerky. Some nth frame beyond the current frame can be used to produce a smoother view direction. The center of interest can also be attached to objects in the animation.

The specific method used to control view direction depends a lot on what the animation is trying to show. It can be handled in variety of ways.

Full orientation control is a bit more involved; controlling the observer position and view direction leaves one degree of freedom still to be determined: the observer tilt. One option is to just interpolate the view direction and then calculate the head up orientation and apply any tilt to that. The other is to specify another spline that controls the head-up position, but this can be hard to control.

[Need more explanation about path following and control of orientation] [Probably include some bias and tension curve control parameters]

4.5 Simple Key Frame System

The earliest computer animation systems were derivative of conventional hand-drawn animation systems in which positions of objects and the observer were specified at certain key frames. The positions and orientations were then interpolated to produce the frames between these keys. For now, we will only consider interpolation of positions; interpolating orientations involves issues particular to that application and are discussed as a separate topic in Section 1.3. Marcelli Wein and Hunger interpolated between 2-1/2D key frames.

Key frame animation is based on traditional animation techniques. Master animators would define and draw key frames that would essentially provide foundation pillars for the motion. Apprentice animators would then draw in the intermediate frames based on the key frames on either side of the intermediate sequence. This same idea was one of the earliest techniques implemented for two-dimensional computer animation. The 'master' animator would draw key images and provide point-to-point correspondence information for interpolation programs which would then produce the intermediate images. The computer was an automated version of the apprentice animators. Although the computer could do the interpolation faster and more accurately than apprentices, it had to be provided with extensive information concerning what corresponded with what in the different key frames. These computer programs process two-dimensional information and therefore could not readily handle the three-dimensionality implied by animated figures the way the apprentices could.

4.6 A Simple Animation Language

Animation languages have the advantage of documenting motion, providing reproducible motions, providing algorithmic motion specification and being compact representations for motion. However, because they are essentially special purpose programming languages, in order to work with them, the animator needs to have a programmer's mentality. Also, specifying complex motion with an animation language can be awkward.

Most are based on setting or changing object transformations over time. Light sources and the observer are usually animated the same way geometric objects are. Essentially an animation language provides the basic expression evaluation of any typical programming language along with some special constructions that have animation semantics.

principles
	basic - transformations over time
	programmability
		extension to languages, e.g., LISP, C
			buy into variables, expressions, indirect referencing, subroutines, macros
		readability, modifiability
		modular? graceful under complexity?
	interactivity
		script/batch based
		interactive setup of transformations
		interactive animation
	reactive
		conditional animation - reaction to circumstances that may arise during animation
		if (cond) then 

miscellaneous
	instancing
	camera/view handling

Set/Change Commands

The fundamental facility that an animation language provides is a direct way to modify an object's transformation. Usually an object and its transformation representation are special data types that can be manipulated by these special commands. The most straightforward transformation manipulation commands are the ability to absolutely set or incrementally change something about the object's transformation. For example, simple commands may be of the form:

SET