Developing Humanoid Robot Animations in Motion Capture

Info: 5428 words (22 pages) Dissertation
Published: 12th Dec 2019

Tagged: Animation, Film Studies

Introduction (Chapter 1)

This research describes a framework in which different human movements are recorded with motion capture and the resulting data are animated, which sets a direction for studying digital character models and their locomotion in a virtual environment. It also offers a feasible approach to understanding walking gait patterns in that environment. The framework further leads to the study of issues related to safety engineering.


The analysis of human locomotion and its research area have changed since it began with the cave drawings of the Paleolithic Era. At its early stages, the study of human locomotion was driven by the need to survive by moving resourcefully from place to place, dodging predators and hunting for food (Alexander, 2000). Modern-day human locomotion studies have contributed to a wide range of applications, including military use, sport, ergonomics and health care. According to (Hall, 1991), the term biomechanics became accepted during the early 1970s as the internationally recognized descriptor of the field concerned with the mechanical study of living organisms. In sport, human locomotion studies are carried out to extend the limits of an athlete when even the smallest improvement in performance is pursued eagerly (J. A., 1984). However, the development of human locomotion studies remains reliant on the improvement of new tools for observation. According to (Alexander, 2000), instrumentation and computer technology have lately provided opportunities for improving the study of human locomotion.

(J. A., 1984) illustrates frequent techniques for measuring motion and mentions the co-ordinate analyzer (motion capture device) as a major advance in movement study. According to (Furniss, 2000), motion capture, or mocap, was initially created for military use before it was adopted by the entertainment industry in the mid-1980s. (Dyer, 1995) defines motion capture as measuring an object’s location and direction in physical space and then recording that sequence in a computer-usable form. According to (Micheal, 2003) and (Suddha Basu, 2005), motion capture is the fastest way to produce rich, realistic animation data. (James F O’Brien, 2000) illustrates that mocap can also be useful in several additional fields such as music, fine art, dance, sign language, motion recognition, rehabilitation with medicine, biomechanics, special effects for live-action films and computer animation of all types, as well as in defense and athletic analysis and training.

There are basically three types of motion capture system available: mechanical, electromagnetic and optical. All three go through the same basic process, shown in the figure. The first step is the input, where the movement of live actors, either human or animal, is recorded using a method that depends on the type of motion capture system used. Next, the information is processed to identify the corresponding markers of the live actor and is then transferred into virtual space using specialized computer software. Finally, the output is where the information is translated into 3D trajectory data containing translation and rotation information, known as a motion capture file.


Producing realistic character animation remains one of the great challenges in computer graphics. At present, there are three methods by which this animation can be produced. The first is key framing, in which the animator specifies important key poses for the character at specific frames. The second uses physical simulation to drive the character’s motion; its results can be good, but because of the lack of control it is difficult to use, it is costly, and it has not been very successful with characters. The last, motion capture, has been widely used to animate characters. It uses sensors placed on a person to collect data that describe their motion while they perform the desired movement. As the technology for motion capture has improved and its cost has decreased, interest in using this approach for character animation has increased. The main challenge confronting an animator is to generate character animation with a realistic appearance.

Human pose reconstruction for humanoid robots is a popular research area because it can be used in various applications, including the emerging field of robotics and other digital animation fields, although most current methods work only in controlled environments. Motion capture and motion synthesis are expensive and time-consuming tasks for articulated figures such as humans. Human pose estimation based on computer vision principles is an inexpensive and widely applicable approach. In the computer vision literature, the term human motion capture is usually used in connection with large-scale body analysis that ignores the fingers, hands and facial muscles, which is also the case in this research. Motion capture is fairly involved, because it must calculate a 3D skeletal representation of the motion of sufficient quality to be useful for animation. Animation generation is an application of motion capture in which the required accuracy is not as high as in some other applications, such as medicine (Ferrier, June 2002).

Problem Context

1) Even though motion capture is applied in many fields to create physically accurate motions, it has a few significant weaknesses. According to (Lee, MCML: Mocap, 2004), firstly, it has low flexibility; secondly, the captured data can have different formats depending on the motion capture system that was employed; and thirdly, commercially available motion capture libraries are difficult to use because they often include hundreds of examples. (Shih-Pin Chao, 2003) states that motion capture sessions are not only costly but also labor-intensive, which makes it important to promote the usability of the motion data.

2) In the animation and gaming industries, it is common for motion information to be captured for a particular project or stored as mocap data. This data can be used either as a whole motion sequence or as part of a motion synthesis. In sport science, mocap data is used for analyzing and perfecting the sequencing mechanics of premier athletes, as well as for monitoring the recovery progress of physical therapies. In practice, this means that large collections of motion capture data models are each limited to a particular set of uses. Currently, motion data are often stored in small clips to allow easy hand sequencing for describing behavior (Jernej Barbic, 2004) (Tanco L. M., 2000). However, according to (Lee, MCML: Mocap, 2004) (Morales, 2001) (Tanco L. M., 2000), motion capture data models lack interoperability. This calls for tools that synchronize these datasets (Feng Liu, 2003).

3) In light of the recent surge of interest in virtual environment applications, much research has been devoted to solving the problems of manipulating humans in 3-D simulated worlds, and especially to human locomotion. However, most of the animation approaches based on these studies can generate only limited motion and lack locomotion capabilities such as walking, so their application in virtual environments is inevitably limited.

Project Objective

The objective of this project is to create a framework, based on motion capture data techniques, that sets a direction for studying 3D articulated figures and humanoid robot locomotion in a virtual environment by understanding human walking gait patterns. This framework also leads to the study of issues related to safety engineering.

The other objective of this project is to capture, process and examine the feasibility of locomotion in a virtual environment and to analyze different tasks in that environment.

The system overview diagram describes all the steps. The process starts with the mocap suit worn by the subject; data from the subject’s random movements are transferred to the computer, where motion analysis is performed. After motion analysis, the motion is retargeted and, together with an avatar model, the final output scene is created. A program is then written with a software development kit to handle the different information in that scene.

Project Scope

The scope of the project is to capture human motion with motion capture technology, use the captured data to animate the different motions, and then refine the animated data. Using the software MotionBuilder, the effects of walking and falling can be simulated and studied in the virtual environment. After the captured data are mapped onto the animated character, which is called the digital humanoid robot, an application is built to study the nature of the animated scene; this application is called the enhanced framework. The other technology used is Mathematica, which serves to study the factors in mathematical terms, because MotionBuilder is a simulation technology whereas Mathematica is a dynamic solver engine. Under some assumptions, this leads towards the study of a digital humanoid robot walking and falling in virtual environments.


This part outlines the overall structure of the thesis, with a short explanation of each chapter:

Chapter 1: introduces the research and sets out the problem context, objectives and scope.

Chapter 2: introduces human motion capture techniques and existing work on the animation of human walking in virtual environments, and summarizes the related work in this area.

Chapter 3: deals with the system structure, describing the hardware and software technologies involved in the research, and illustrates the framework model, which helps exploit the behavior of the humanoid and sets up the framework.

Chapter 4: describes the framework analysis, based on the study of articulated animation models in a virtual environment and of walking gait patterns with a Bézier curve algorithm.

Chapter 5: describes the techniques taken from the different software packages and how they are used to set up the whole framework, and evaluates the results in three phases: the application that represents the coordinate system and structure, the walking gait patterns obtained with Bézier curves, and the falling effect shown by visual aid.

Chapter 6: is the conclusion, which summarizes the outcome of the project and discusses future work.


This chapter has introduced motion capture and how it will be used to improve the study of human locomotion. The project scope and objectives have also been set out.

Literature Review (Chapter 2)

Motion capture system

Motion capture is an attractive way of creating the motion parameters for computer animation because it can provide realistic motion parameters. It permits an actor and a director to work together to create a desired pose that may be difficult to describe with enough specificity for an animator to recreate manually (Ferrier, June 2002). The application areas of motion capture techniques can be summarized as follows (Perales, 2001):

Virtual reality: interactive virtual environments, games, virtual studios, character animation, film, advertising.

Smart surveillance systems: access control, parking lots, supermarkets, vending machines, traffic.

Advanced user interfaces: advanced user interfaces.

Motion analysis and synthesis: annotations of videos, personalized training, clinical studies of medicine.

Understanding how a humanoid robot works has always depended on the study of human locomotion. This literature review discusses human motion control techniques, general and advanced motion capture techniques, non-vision-based motion capture techniques, vision-based motion capture techniques with and without markers, and other enhanced techniques, which are covered in detail so that the framework can be understood easily.

Properties of Tracking Systems

This section lists properties of tracking systems and discusses the relationships between the various properties.


Accuracy

Accuracy can be defined as the agreement between the results measured by a tracking technology and the actual position of the object. Because the true value is unknown, tracking technologies can only be evaluated in terms of relative accuracy. For any one tracking system, the accuracy is limited by its operating principle and affected by noise and interference from the environment. The sources of noise depend on the tracking technology used, so the influencing factors differ between tracking principles. For example, optical motion tracking suffers interference from lighting and AC current, whereas in magnetic tracking ferrous objects distort the magnetic field and cause errors. If the model or mechanism of the noise is quantitatively known, it is a systematic error and can be compensated by post-processing after tracking or eliminated by pre-filtering before tracking.


Robustness

Robustness defines the system’s ability to continue to function in adverse conditions or with missing or incorrect measurements. Some systems make assumptions about the surrounding environment during operation. Also, a system may be unable to take a measurement at a particular time. Related to robustness is the repeatability of the reported data. If the reported values are consistent over time and across operating conditions and environments, then measuring the accuracy (or the lack thereof) is possible, and corrective algorithms can be applied.

Tracking range

The range is the space in which the system can measure sufficient and accurate data for the application. For some systems, the range can be reduced by noise from the environment or limited by the hardware of the system itself. For example, a magnetic system cannot track accurately when the tracked object is at the margin of the magnetic field, because the field is inhomogeneously distributed there.

Tracking speed

Tracking speed is the frequency at which the measurement system can obtain updated tracking data. Two numbers are significant for the system: the update rate and the latency. The update rate is the frequency at which the tracking system generates tracking data; the latency is the delay, in real-time mode, between the moment the tracking data are generated and the moment the host computer receives them.
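As a minimal sketch of how these two numbers could be estimated from logged timestamps, the Python fragment below assumes that each sample carries the time at which the tracker generated it and the time at which the host received it. The function names and the sample values are hypothetical and are not taken from any particular tracking system.

```python
from statistics import mean


def update_rate_hz(generated_times_s):
    """Average update rate: the inverse of the mean interval between samples."""
    intervals = [b - a for a, b in zip(generated_times_s, generated_times_s[1:])]
    return 1.0 / mean(intervals)


def mean_latency_s(generated_times_s, received_times_s):
    """Average delay between generating a sample and receiving it on the host."""
    return mean(r - g for g, r in zip(generated_times_s, received_times_s))


# Hypothetical log: one sample every 10 ms, each arriving 4 ms after generation.
generated = [0.00, 0.01, 0.02, 0.03, 0.04]
received = [t + 0.004 for t in generated]

print("update rate (Hz):", update_rate_hz(generated))
print("latency (s):", mean_latency_s(generated, received))
```

With this made-up log the update rate comes out at roughly 100 Hz and the latency at roughly 4 ms, which shows that the two quantities are independent: a system can update often and still deliver each sample late.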


Hardware

The hardware is the physical realization of the components of the tracking system. It includes the number of components and their size and weight, especially for those that the user is required to carry or wear. Some systems may have a significant amount of hardware that must be set up in the environment, although it may need no further attention from the user once in position. Ideally, the application would give the user complete freedom of movement. Some devices tether the user to a fixed object. Some systems have a heavy or unwieldy device that the user must manipulate in order to move. Some devices have a tendency to pull the user back to a “resting position” for the device. The hardware also accounts for the largest part of the cost and is therefore very often a decisive factor in the choice of motion tracking system.

Non-vision Based Motion Capture

In non-vision-based systems, sensors are attached to the human body to collect movement information. Some of them have such a small sensing footprint that they can detect small changes such as finger or toe movement (Hu, A survey – human movement tracking and stroke rehabilitation, 1996). Each kind of sensor has advantages and limitations (Hu, A survey – human movement tracking and stroke rehabilitation, 1997).

Advantages of magnetic trackers:

  • real-time data output can provide immediate feedback
  • no post processing is required
  • they are less expensive than optical systems
  • no occlusion problem is observed
  • multiple performers are possible

Disadvantages of magnetic trackers:

  • the trackers are sensitive to metal objects
  • cables restrict the performers
  • they provide lower sampling rate than some optical systems
  • the marker configurations are difficult to change

Advantages of electromechanical body suits:

  • they are less expensive than optical and magnetic systems
  • real-time data is possible
  • no occlusion problem is observed
  • multiple performers are possible

Disadvantages of electromechanical body suits:

  • they provide lower sampling rate
  • they are difficult to use due to the amount of hardware
  • configuration of sensors is fixed

Vision-Based Motion Capture with Markers

In 1973, Johansson carried out his famous Moving Light Display (MLD) psychological experiment on the perception of biological motion (Johansson). In the experiment, small reflective markers were attached to the joints of the human performers. When the patterns of movement were observed, the integration of the signals coming from the markers resulted in the recognition of actions. Although the method faces challenges such as errors, lack of robustness and expensive computation due to environmental constraints, mutual occlusion and complicated processing, many marker-based tracking systems are available on the market. This technique uses optical sensors, e.g. cameras, to track human movements, which are captured by placing markers on the human body. The human skeleton is a highly articulated structure that moves in three dimensions. For this reason, each body part continuously moves in and out of occlusion from the view of the cameras, resulting in inconsistent and unreliable motion data for the human body. A further major drawback of optical sensors and markers is that they cannot sense joint rotation accurately, which is a serious limitation in representing a real 3D model (Hu, A survey – human movement tracking and stroke rehabilitation, 1997). Optical systems have advantages and limitations (Perales, 2001).

Advantages of optical systems are as follows:

  • they are more accurate
  • a larger number of markers is possible
  • no cables restrict the performers
  • they produce more samples per second

Disadvantages of optical systems:

  • they require post-processing
  • they are expensive (between 100,000 and 250,000)
  • occlusion is a problem in these systems
  • the capture environment must be free of yellow light and reflective noise

Vision-Based Motion Capture without Markers

As a less restrictive motion capture technique, markerless systems are capable of overcoming the mutual occlusion problem because they are concerned only with boundaries or features on human bodies. This has been an active and promising but also challenging research area over the last decade, and research in it is still ongoing (Hu, A survey – human movement tracking and stroke rehabilitation, 1996). The markerless motion capture technique exploits external sensors such as cameras to track the movement of the human body. A camera can have a resolution of a million pixels, which is one of the main reasons that optical sensors have attracted attention. However, such vision-based techniques require intensive computational power (Bryson, 1993). As a commonly used framework, 2D motion tracking concerns only the human movement in an image plane, although sometimes a 3D structure is projected into its image plane for processing purposes. This approach can be catalogued as with or without explicit shape models (Hu, A survey – human movement tracking and stroke rehabilitation, 1996). Creating motion capture data from a single video stream seems like a plausible idea. People are able to watch a video and understand the motion, but computing human motion parameters from a video stream is clearly a challenging task (Ferrier, June 2002). Vision-based motion capture techniques usually include initialization and tracking steps.


Initialization

A system starts its operation with a correct interpretation of the current scene. Initialization requires camera calibration, adaptation to scene characteristics and model initialization. Camera calibration determines the parameters required for translating a point in a 3D scene to its position in the image. Some systems find an initial pose and update it from frame to frame, whereas in other systems the user specifies the pose in every single frame. Some systems have a special initialization phase in which the start pose is found automatically, whereas in others the same algorithm is used for both initialization and pose estimation (Granum, 2001).
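To illustrate what translating a 3D point to its image position involves, the following Python sketch uses the ideal pinhole camera model. It is an assumption made for illustration only: the camera sits at the origin looking along the positive z axis, lens distortion and the camera’s position in the scene are ignored, and the function name and parameter values are hypothetical rather than taken from the systems reviewed here.

```python
def project_point(point_3d, focal_length_px, principal_point_px):
    """Project a 3D point in camera coordinates onto the image plane.

    Ideal pinhole model (assumed): u = f * x / z + cx, v = f * y / z + cy.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point_px
    u = focal_length_px * x / z + cx
    v = focal_length_px * y / z + cy
    return (u, v)


# Hypothetical calibration: 800 px focal length, image centre at (320, 240).
marker = (0.5, -0.25, 2.0)  # metres, in camera coordinates
print(project_point(marker, 800.0, (320.0, 240.0)))
```

In this simplified model the focal length and the principal point are the parameters that calibration would have to estimate; a full calibration would also recover the camera’s pose and distortion terms.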


Tracking

The tracking phase extracts specific information, either low level, such as edges, or high level, such as the head and hands. Tracking consists of three parts (Granum, 2001):

  • Figure-ground segmentation: the human figure is extracted from the rest of the image.
  • Representation: segmented images are converted to another representation to reduce the amount of information.
  • Tracking over time: how the subject should be tracked from frame to frame.


Mechanical Tracking

Mechanical measurement is the oldest form of location; rulers and tape measures provide a simple method of locating one item with reference to another. More sophisticated mechanical techniques have since been developed. Nowadays, measuring the angles of the body joints with potentiometers or shaft encoders, combined with knowledge of the dimensions of the rigid components, allows accurate calculation of the position of different body parts (Beresford, 2005).
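The calculation described above, from measured joint angles and known segment lengths to the positions of body parts, can be sketched as planar forward kinematics. The Python example below is a simplified illustration, assuming a chain of rigid segments moving in one plane with each joint angle measured relative to the previous segment; the function name and the limb dimensions are hypothetical.

```python
import math


def planar_chain_positions(segment_lengths_m, joint_angles_rad, origin=(0.0, 0.0)):
    """Return the position of every joint along a planar chain of rigid segments.

    Each joint angle is measured relative to the previous segment (assumed
    convention), so the absolute heading is the running sum of the angles.
    """
    x, y = origin
    heading = 0.0
    positions = [(x, y)]
    for length, angle in zip(segment_lengths_m, joint_angles_rad):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


# Hypothetical two-segment limb: 0.45 m and 0.40 m segments,
# first joint rotated +90 degrees, second joint bent -90 degrees.
for joint in planar_chain_positions([0.45, 0.40], [math.pi / 2, -math.pi / 2]):
    print(joint)
```

A real body-based system works in three dimensions and with more joints, but the principle is the same: errors in the measured angles or in the assumed segment lengths propagate directly into the computed positions.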

Today mechanical position tracking devices can be separated into body-based and ground-based systems.

Body-based systems are those that are mounted on, or carried on, the body of the user and are used to sense either the relative positions of various parts of the user’s body or the position of an instrument relative to a fixed point on the user’s body. They are typically used to determine either the user’s joint angles, for reproduction of their body in the synthetic environment, or the position of the user’s hand or foot relative to some point on the user’s body. Since body-based systems are used to determine the relative position between two of the user’s body parts, the devices must somehow be attached to the user’s body. This raises several questions. How can the device be attached to the body in a way that minimizes relative motion between the attachment and the soft body part? How can the joints of the device be aligned with the user’s joints to minimize the difference in the centers of rotation? Other problems associated with body-based tracking systems are caused specifically by the device being attached to the user’s body. These systems are typically very obtrusive and encumbering and therefore do not allow the user complete freedom of movement. Body-based systems are, however, quite accurate and do not experience problems such as measurement drift (the tendency of the device’s output to change over time with no change in the sensed quantity), interference from external electromagnetic signals or metallic devices in the vicinity, or shadowing (loss of sight of the tracked object due to physical interference of another object) (Frey, 1996).

Ground-based systems are not carried by the user but are mounted on some fixed surface (e.g. the user’s desk or the floor) and are used to sense the position of an implement relative to that fixed surface. They are typically used to determine the position and orientation of an implement manipulated by the user relative to some fixed point that is not on the user’s body. Like body-based mechanical systems, they are very accurate and are not plagued by measurement drift errors, interference or shadowing. Ground-based systems do suffer from one problem that body-based systems do not: they confine the user to the space allowed by the device. Usually this means that the user is confined to a space the size of a large desk. If the application does not require the user to move around much during the task (i.e. the user remains seated), this is not considered a problem.

Mechanical tracking systems are the best choice for force-feedback (haptic) devices since they are rigidly mounted to either the user or a fixed object. Haptic devices give the user a ‘sense of touch’: the user can feel surfaces in the synthetic environment or feel the weight of an object. The device can apply forces to the user’s body so that the user experiences a sense of exertion. Mechanical tracking systems also typically have low latencies (the time required to receive useful information about a sensed quantity) and high update rates (the rate at which the system can provide useful information). These systems have therefore found a good commercial niche as measurement devices and hand tracking systems.


Advantages of mechanical tracking systems:

  • high update rate
  • low latency
  • accurate
  • no blocking problem, no interference from the environment
  • best choice for force feedback


Disadvantages of mechanical tracking systems:

  • restricted movement from the mounted device


Acoustic Tracking

Acoustic tracking systems utilize high frequency sound waves to track objects by either the triangulation of several receivers (time-of-flight method) or by measuring the signal’s phase difference between transmitter and receiver (phase-coherence method).

Generally the user carries the transmitter, and a series of sensors around the room determine the linear distance to the transmitter. Some systems have the user carry a receiver and listen to a series of transmitters positioned around the volume.

The ‘time-of-flight’ method of acoustic tracking uses the speed of sound through air to calculate the distance between the transmitter of an acoustic pulse and the receiver of that pulse. The use of one transmitter on a tracked object and a minimum of three receivers at stationary positions in the vicinity allow an acoustic system to determine the relative position of the object via triangulation. This method limits the number of objects tracked by the system to one. An alternative method has been devised in which several transmitters are mounted at stationary positions in the room and each object being tracked is fitted with a receiver. Using this method, the positions of numerous objects may be determined simultaneously. Note that the use of one transmitter (or one receiver) attached to an object can resolve only position. The use of two transmitter (receiver) sets with the same object can be used to determine the position and orientation (6 DOF) of the object. The desire to track more than just the position of an object suggests that the second method (multiple stationary transmitters with body mounted receivers) may be preferable.
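The time-of-flight calculation can be sketched in a few lines of Python. The example is deliberately simplified and rests on stated assumptions: it works in two dimensions, takes the speed of sound as 343 m/s (dry air at about 20 °C), and fixes the position from three measured distances, a distance-based calculation usually called trilateration. The function names and receiver layout are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value: dry air at about 20 degrees Celsius


def distance_from_time_of_flight(time_s):
    """Distance travelled by an acoustic pulse in the given time."""
    return SPEED_OF_SOUND_M_S * time_s


def locate_transmitter_2d(receivers, times_of_flight_s):
    """Position of a transmitter from three receivers (2D simplification).

    Subtracting the first circle equation from the other two gives two
    linear equations in x and y, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = receivers
    d1, d2, d3 = (distance_from_time_of_flight(t) for t in times_of_flight_s)
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("receivers must not lie on one line")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Hypothetical layout: receivers at three corners, transmitter at (1, 2) metres.
receivers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
true_position = (1.0, 2.0)
times = [math.dist(true_position, r) / SPEED_OF_SOUND_M_S for r in receivers]
print(locate_transmitter_2d(receivers, times))
```

Because the result depends on the assumed speed of sound, a change in air temperature shifts every computed distance, which is one way environmental conditions can degrade an acoustic tracker.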

The other method of acoustic tracking is phase-coherent tracking, which may be used to achieve better accuracy than the time-of-flight method. The system senses the phase difference between the signal sent by the transmitter and the signal detected by the receiver. If the object being tracked moves farther than one-half of the signal wavelength in any direction during the period of one update, errors will result in the position determination. Since phase-coherent tracking is an incremental form of position determination, small errors in position determination will result in larger errors over time (drift errors), which may be the reason why only a few phase-coherent systems have been implemented successfully.
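The half-wavelength limit can be made concrete with a small Python sketch. It assumes a 40 kHz signal travelling at 343 m/s, models the received phase as growing linearly with distance, and recovers the change in range between two updates from the wrapped phase difference. The function names, the sign convention and the numbers are illustrative assumptions, not details of any particular system.

```python
import math

WAVELENGTH_M = 343.0 / 40000.0  # assumed: 40 kHz signal in air at 343 m/s


def phase_at(distance_m, wavelength_m=WAVELENGTH_M):
    """Phase of the received signal for a given transmitter-receiver distance."""
    return 2 * math.pi * distance_m / wavelength_m


def wrap_phase(phase_rad):
    """Wrap a phase difference into the interval [-pi, pi)."""
    return (phase_rad + math.pi) % (2 * math.pi) - math.pi


def range_change(previous_phase_rad, current_phase_rad, wavelength_m=WAVELENGTH_M):
    """Incremental change in range implied by two successive phase readings."""
    delta = wrap_phase(current_phase_rad - previous_phase_rad)
    return delta / (2 * math.pi) * wavelength_m


start = 1.0  # metres
small_move = range_change(phase_at(start), phase_at(start + 0.002))
large_move = range_change(phase_at(start), phase_at(start + 0.006))
print("moved 2 mm, measured:", small_move)
print("moved 6 mm, measured:", large_move)
```

With these assumed values the half wavelength is a little over 4 mm, so the 2 mm move is recovered correctly while the 6 mm move is misread as a small move in the opposite direction. Since each update adds to the previous estimate, such an error stays in the position until the system is re-initialized.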

Some problems associated with both acoustic tracking methods result from the line-of-sight required between transmitter and receiver. This line of sight requirement obviously plagues the devices with shadowing problems. It also limits their effective tracking range, although they have better tracking ranges than electromagnetic systems. Unlike electromagnetic systems, they do not suffer from metallic interference, but they are susceptible to interference caused by ambient noise sources, by reflections of the acoustic signals from hard surfaces, and environmental interference (e.g. temperature variations).


Advantages of acoustic tracking systems:

  • Very high freedom of movement



Disadvantages of acoustic tracking systems:

  • Line-of-sight problems
  • Either high range or high accuracy (not both)
  • Environmental interference (e.g. temperature variations, other noise sources)
  • Drift errors (phase-coherent)
  • High latency, low update rates


Electromagnetic Tracking

Electromagnetic tracking systems are currently the most widely used systems for human body tracking applications. They use artificially generated electromagnetic fields to induce voltages in detectors attached to the tracked object. Both the fixed transmitter and the sensors consist of three coils mounted in mutually orthogonal directions. The sensors vary in size but tend to be around a few cubic centimeters. The transmitters vary in size with the power of the field they are expected to generate, and range from several cubic inches to a cubic foot. There are four magnetic fields that have to be measured: the environmental field (including the Earth’s magnetic field) and three orthogonal fields in the transmitter’s coordinate directions, as shown in the figure. Each of these fields is measured in the sensor’s three coordinate dimensions, for a total of twelve measurements for each sensor. From this information, the position and orientation of the sensor with respect to the transmitter can be computed.

These tracking systems are robust, fast and fairly inexpensive, and can be used to track numerous objects (body parts) with acceptable position and orientation accuracy (on the order of 0.1 inches and 0.5 degrees). Unlike electric fields, magnetic fields are unaffected by the presence or absence of human bodies and other non-metallic objects in the environment. This offers a tremendous opportunity, because it enables magnetic trackers to overcome the line-of-sight requirement that plagues acoustic, optical and externally connected mechanical tracking systems. On the other hand, magnetic systems suffer from sensitivity to background magnetic fields and from interference caused by ferrous metal devices in the vicinity, and are therefore inaccurate in practical environments. Because of this and the limited range of the generated magnetic field, magnetic tracking systems are restricted to a small special area.


Advantages of electromagnetic tracking systems:

  • High update rates
  • Very low latency
  • High robustness
  • No shadowing
  • Rather cheap
  • Acceptable accuracy in an artificial environment


Disadvantages of electromagnetic tracking systems:

  • High sensitivity to background magnetic fields
  • Inaccurate in practical environments due to interference caused by ferrous metal devices
  • Low range of the magnetic field
  • Limited tracking scope due to cables


Inertial Tracking

An inertial sensor contains three gyroscopes, to determine the angular rate, and three accelerometers, to determine linear acceleration. Originally, they were mounted on orthogonal axes on a gimbaled platform, as can be seen in the figure. After the effect of gravity has been removed from the vertical accelerometer, the data has to be double-integrated to provide a measure of the offset between the initial and the current position. In fact, this combination of sensors has been used successfully for inertial navigatio
