Motion capture (sometimes referred to as mocap or mo-cap, for short) is the process of recording high-resolution movement of objects or people into a computer system. It is used in military, entertainment, sports, and medical applications, and for validation of computer vision and robots. In films, television shows and video games, motion capture refers to recording actions of human actors and using that information to animate digital character models in 2D or 3D computer animation. When it includes face and fingers or captures subtle expressions, it is often referred to as performance capture. In many fields, motion capture is sometimes called motion tracking, but in filmmaking and games, motion tracking usually refers more to match moving.

In motion capture sessions, movements of one or more actors are sampled many times per second. Whereas early techniques used images from multiple cameras to calculate 3D positions, often the purpose of motion capture is to record only the movements of the actor, not their visual appearance. This ''animation data'' is mapped to a 3D model so that the model performs the same actions as the actor. This process may be contrasted with the older technique of rotoscoping.

Camera movements can also be motion captured so that a virtual camera in the scene will pan, tilt or dolly around the stage driven by a camera operator while the actor is performing. At the same time, the motion capture system can capture the camera and props as well as the actor's performance. This allows the computer-generated characters, images and sets to have the same perspective as the video images from the camera. A computer processes the data and displays the movements of the actor, providing the desired camera positions in terms of objects in the set. Retroactively obtaining camera movement data from the captured footage is known as ''match moving'' or ''camera tracking''.

The first virtual actor animated by motion capture was produced in 1993 by Didier Pourcel and his team at Gribouille. It involved "cloning" the body and face of French comedian Richard Bohringer, and then animating it with still-nascent motion-capture tools.

Advantages

Motion capture offers several advantages over traditional computer animation of a 3D model:

* Low latency, close to real-time results can be obtained. In entertainment applications, this can reduce the costs of keyframe-based animation. The Hand Over technique is an example of this.
* The amount of work does not vary with the complexity or length of the performance to the same degree as when using traditional techniques. This allows many tests to be done with different styles or deliveries, giving a distinct personality that is only limited by the talent of the actor.
* Complex movement and realistic physical interactions such as secondary motions, weight, and exchange of forces can be easily recreated in a physically accurate manner.
* The amount of animation data that can be produced within a given time is extremely large when compared to traditional animation techniques. This contributes to both cost-effectiveness and meeting production deadlines.
* Potential for free software and third-party solutions reducing its costs.

Disadvantages

* Specific hardware and special software programs are required to obtain and process the data.
* The cost of the software, equipment and personnel required can be prohibitive for small productions.
* The capture system may have specific requirements for the space in which it is operated, depending on camera field of view or magnetic distortion.
* When problems occur, it is easier to shoot the scene again rather than trying to manipulate the data. Only a few systems allow real-time viewing of the data to decide if the take needs to be redone.
* The initial results are limited to what can be performed within the capture volume without extra editing of the data.
* Movement that does not follow the laws of physics cannot be captured.
* Traditional animation techniques, such as added emphasis on anticipation and follow through, secondary motion or manipulating the shape of the character, as with squash and stretch animation techniques, must be added later.
* If the computer model has different proportions from the capture subject, artifacts may occur. For example, if a cartoon character has large, oversized hands, these may intersect the character's body if the human performer is not careful with their physical motion.

Applications

There are many applications of motion capture. The most common are video games, movies, and movement capture; the technology also has a research application in robotics development at Purdue University.

Video games

Video games often use motion capture to animate athletes, martial artists, and other in-game characters. As early as 1988, an early form of motion capture was used to animate the 2D player characters of Martech's video game ''Vixen'' (performed by model Corinne Russell) and Magical Company's 2D arcade fighting game ''Last Apostle Puppet Show'' (to animate digitized sprites). Motion capture was later notably used to animate the 3D character models in the Sega Model arcade games ''Virtua Fighter'' (1993) and ''Virtua Fighter 2'' (1994). In mid-1995, developer/publisher Acclaim Entertainment had its own in-house motion capture studio built into its headquarters. Namco's 1995 arcade game ''Soul Edge'' used passive optical system markers for motion capture. Motion-captured athletes have also served as the basis for animation in games such as Naughty Dog's ''Crash Bandicoot'', Insomniac Games' ''Spyro the Dragon'', and Rare's ''Dinosaur Planet''.

Robotics

Indoor positioning is another application for optical motion capture systems. Robotics researchers often use motion capture systems when developing and evaluating control, estimation, and perception algorithms and hardware. In outdoor spaces, it is possible to achieve centimeter-level accuracy by using a Global Navigation Satellite System (GNSS) together with Real-Time Kinematic (RTK) positioning. However, this accuracy degrades significantly when there is no line of sight to the satellites, such as in indoor environments. The majority of vendors selling commercial optical motion capture systems provide accessible open source drivers that integrate with the popular Robot Operating System (ROS) framework, allowing researchers and developers to effectively test their robots during development.

In the field of aerial robotics research, motion capture systems are widely used for positioning as well. Regulations on airspace usage limit the feasibility of outdoor experiments with Unmanned Aerial Systems (UAS). Indoor tests can circumvent such restrictions. Many labs and institutions around the world have built indoor motion capture volumes for this purpose. Purdue University houses the world's largest indoor motion capture system, inside the Purdue UAS Research and Test (PURT) facility. PURT is dedicated to UAS research, and provides a tracking volume of 600,000 cubic feet using 60 motion capture cameras. The optical motion capture system is able to track targets in its volume with millimeter accuracy, effectively providing the true position of targets, the "ground truth" baseline in research and development. Results derived from other sensors and algorithms can then be compared to the ground truth data to evaluate their performance.
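As a rough illustration of comparing another sensor's output against motion capture ground truth, the sketch below computes a root-mean-square position error. It assumes the two streams are already time-aligned and expressed in the same coordinate frame; the function name `position_rmse` and the sample data are hypothetical and do not come from any particular vendor or ROS driver.

```python
import math

def position_rmse(estimates, ground_truth):
    """Root-mean-square 3D position error between paired samples.

    Both arguments are equally long lists of (x, y, z) tuples in the
    same units and coordinate frame (an assumption of this sketch).
    """
    if len(estimates) != len(ground_truth) or not estimates:
        raise ValueError("need two non-empty, equally long sample lists")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimates, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical data: mocap positions (metres) and an onboard estimator's output.
mocap = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
onboard = [(0.0, 0.0, 1.0), (1.0, 0.3, 1.4), (2.0, 0.0, 1.0)]
rmse = position_rmse(onboard, mocap)
print(f"RMSE against ground truth: {rmse:.3f} m")
```

In practice the harder steps are usually clock synchronisation and frame alignment between the two systems, which this sketch leaves out.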

Movies

Movies use motion capture for CGI effects, in some cases replacing traditional cel animation, and for completely CGI creatures, such as Gollum, The Mummy, King Kong, Davy Jones from ''Pirates of the Caribbean'', the Na'vi from the film ''Avatar'', and Clu from ''Tron: Legacy''. The Great Goblin, the three Stone-trolls, many of the orcs and goblins in the 2012 film ''The Hobbit: An Unexpected Journey'', and Smaug were created using motion capture.

The film ''Batman Forever'' (1995) used some motion capture for certain visual effects. Warner Bros. had acquired motion capture technology from arcade video game company Acclaim Entertainment for use in the film's production. Acclaim's 1995 video game of the same name also used the same motion capture technology to animate the digitized sprite graphics.

The 1999 film ''Star Wars: Episode I – The Phantom Menace'' was the first feature-length film to include a main character created using motion capture (Jar Jar Binks, played by Ahmed Best). The 2000 Indian-American film ''Sinbad: Beyond the Veil of Mists'' was the first feature-length film made primarily with motion capture, although many character animators also worked on the film, which had a very limited release. 2001's ''Final Fantasy: The Spirits Within'' was the first widely released movie to be made with motion capture technology. Despite its poor box-office intake, supporters of motion capture technology took notice. ''Total Recall'' had already used the technique, in the scene of the x-ray scanner and the skeletons.

''The Lord of the Rings: The Two Towers'' was the first feature film to utilize a real-time motion capture system. This method streamed the actions of actor Andy Serkis into the computer-generated imagery skin of Gollum / Smeagol as it was being performed.

Storymind Entertainment, an independent Ukrainian studio, created a neo-noir third-person shooter video game called ''My Eyes On You'', using motion capture to animate its main character, Jordan Adalien, along with non-playable characters.

Of the three nominees for the 2006 Academy Award for Best Animated Feature, two (''Monster House'' and the winner ''Happy Feet'') used motion capture, and only Disney/Pixar's ''Cars'' was animated without it. In the ending credits of Pixar's film ''Ratatouille'', a stamp appears labelling the film as "100% Genuine Animation – No Motion Capture!"

Since 2001, motion capture has been used extensively to simulate or approximate the look of live-action theater, with nearly photorealistic digital character models. ''The Polar Express'' used motion capture to allow Tom Hanks to perform as several distinct digital characters (for which he also provided the voices). The 2007 adaptation of the saga ''Beowulf'' animated digital characters whose appearances were based in part on the actors who provided their motions and voices. James Cameron's highly popular ''Avatar'' used this technique to create the Na'vi that inhabit Pandora. The Walt Disney Company has produced Robert Zemeckis's ''A Christmas Carol'' using this technique. In 2007, Disney acquired Zemeckis' ImageMovers Digital (which produced motion capture films), but then closed it in 2011, after the box office failure of ''Mars Needs Moms''.

Television series produced entirely with motion capture animation include ''Laflaque'' in Canada, ''Sprookjesboom'' and ' in the Netherlands, and ''Headcases'' in the UK.

Movement capture

Virtual reality and augmented reality providers, such as uSens and Gestigon, allow users to interact with digital content in real time by capturing hand motions. This can be useful for training simulations, visual perception tests, or performing virtual walk-throughs in a 3D environment. Motion capture technology is frequently used in digital puppetry systems to drive computer-generated characters in real time.

Gait analysis is one application of motion capture in clinical medicine. Techniques allow clinicians to evaluate human motion across several biomechanical factors, often while streaming this information live into analytical software. One innovative use is pose detection, which can empower patients during post-surgical recovery or rehabilitation after injuries. This approach enables continuous monitoring, real-time guidance, and individually tailored programs to enhance patient outcomes. Some physical therapy clinics utilize motion capture as an objective way to quantify patient progress.

During the filming of James Cameron's ''Avatar'', all of the scenes involving motion capture were directed in real time using Autodesk MotionBuilder software to render a screen image which allowed the director and the actor to see what they would look like in the movie, making it easier to direct the movie as it would be seen by the viewer. This method allowed views and angles not possible from a pre-rendered animation. Cameron was so proud of his results that he invited Steven Spielberg and George Lucas on set to view the system in action.

In Marvel's ''The Avengers'', Mark Ruffalo used motion capture so he could play his character the Hulk, rather than have him be only CGI as in previous films, making Ruffalo the first actor to play both the human and the Hulk versions of Bruce Banner.

FaceRig software uses facial recognition technology from ULSee.Inc to map a player's facial expressions, and body tracking technology from Perception Neuron to map the body movement, onto a 2D or 3D character's motion on-screen.

During ''Game Developers Conference'' 2016 in San Francisco, ''Epic Games'' demonstrated full-body motion capture live in Unreal Engine. The whole scene, from the upcoming game ''Hellblade'' about a woman warrior named Senua, was rendered in real time. The keynote was a collaboration between ''Unreal Engine'', ''Ninja Theory'', ''3Lateral'', ''Cubic Motion'', ''IKinema'' and ''Xsens''.

In 2020, the two-time Olympic figure skating champion Yuzuru Hanyu graduated from Waseda University. In his thesis, using data provided by 31 sensors placed on his body, he analysed his jumps. He evaluated the use of technology both in order to improve the scoring system and to help skaters improve their jumping technique. In March 2021, a summary of the thesis was published in an academic journal.

Methods and systems

Motion tracking or motion capture started as a photogrammetric analysis tool in biomechanics research in the 1970s and 1980s, and expanded into education, training, sports and recently computer animation for television, cinema, and video games as the technology matured. Since the 20th century, performers have had to wear markers near each joint so that motion can be identified from the positions of, or angles between, the markers. Acoustic, inertial, LED, magnetic or reflective markers, or combinations of any of these, are tracked, optimally at a rate of at least twice the frequency of the desired motion. Both the spatial and the temporal resolution of the system are important, as motion blur causes almost the same problems as low resolution.

Since the beginning of the 21st century, and because of the rapid growth of technology, new methods have been developed. Most modern systems can extract the silhouette of the performer from the background. Afterwards, all joint angles are calculated by fitting a mathematical model to the silhouette. For movements that produce no visible change in the silhouette, hybrid systems are available that use both markers and the silhouette, but with fewer markers. In robotics, some motion capture systems are based on simultaneous localization and mapping.
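The rule of thumb that markers should be tracked at no less than twice the frequency of the desired motion can be written as a small check. This is only a sketch: the function names and the `margin` parameter are hypothetical, and the 15 Hz figure is an illustrative assumption rather than a measured property of any movement.

```python
def min_sample_rate_hz(max_motion_hz, margin=1.0):
    """Lowest tracking rate that samples the motion at least twice per cycle.

    `margin` (>= 1.0) is an optional safety factor on top of the
    factor of two; it is an assumption of this sketch.
    """
    if max_motion_hz <= 0 or margin < 1.0:
        raise ValueError("max_motion_hz must be positive and margin >= 1.0")
    return 2.0 * max_motion_hz * margin

def is_fast_enough(system_hz, max_motion_hz, margin=1.0):
    """True if the capture system meets the twice-the-frequency guideline."""
    return system_hz >= min_sample_rate_hz(max_motion_hz, margin)

# Illustrative values: motion content up to 15 Hz, systems at 120 Hz and 25 Hz.
required = min_sample_rate_hz(15.0)
print(required, is_fast_enough(120.0, 15.0), is_fast_enough(25.0, 15.0))
```

The factor of two is a lower bound; the text's wording "optimally at least" suggests that higher rates are preferred where the hardware allows.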

Optical systems

''Optical systems'' utilize data captured from image sensors to triangulate the 3D position of a subject between two or more cameras calibrated to provide overlapping projections. Data acquisition is traditionally implemented using special markers attached to an actor; however, more recent systems are able to generate accurate data by tracking surface features identified dynamically for each particular subject. Tracking a large number of performers or expanding the capture area is accomplished by the addition of more cameras. These systems produce data with three degrees of freedom for each marker, and rotational information must be inferred from the relative orientation of three or more markers; for instance, shoulder, elbow and wrist markers provide the angle of the elbow. Newer hybrid systems combine inertial sensors with optical sensors to reduce occlusion, increase the number of users and improve the ability to track without having to manually clean up data.
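The elbow example can be made concrete with a little vector arithmetic. The sketch below takes three marker positions, each with only three positional degrees of freedom, and derives the angle at the middle marker. The function name `joint_angle_deg` and the coordinates are hypothetical, and real pipelines typically fit a full skeleton rather than a single angle.

```python
import math

def joint_angle_deg(proximal, joint, distal):
    """Angle at `joint` (degrees) between the segments to the two other markers."""
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("markers must not coincide with the joint marker")
    cosine = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))

# Hypothetical marker positions in metres: upper arm along x, forearm along y.
shoulder = (0.0, 0.0, 0.0)
elbow = (0.3, 0.0, 0.0)
wrist = (0.3, 0.25, 0.0)
elbow_angle = joint_angle_deg(shoulder, elbow, wrist)
print(f"elbow angle: {elbow_angle:.1f} degrees")
```

A single angle like this says nothing about rotation around the limb's own axis, which is one reason additional markers per segment are commonly used.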

Passive markers

''Passive optical'' systems use markers coated with a retroreflective material to reflect light that is generated near the camera's lens. The camera's threshold can be adjusted so only the bright reflective markers will be sampled, ignoring skin and fabric. The centroid of the marker is estimated as a position within the two-dimensional image that is captured. The grayscale value of each pixel can be used to provide sub-pixel accuracy by finding the centroid of the Gaussian.

An object with markers attached at known positions is used to calibrate the cameras and obtain their positions, and the lens distortion of each camera is measured. If two calibrated cameras see a marker, a three-dimensional fix can be obtained. Typically a system will consist of around 2 to 48 cameras. Systems of over three hundred cameras exist to try to reduce marker swap. Extra cameras are required for full coverage around the capture subject and multiple subjects.

Vendors have constraint software to reduce the problem of marker swapping, since all passive markers appear identical. Unlike active marker systems and magnetic systems, passive systems do not require the user to wear wires or electronic equipment. Instead, hundreds of rubber balls covered in reflective tape are attached, and the tape needs to be replaced periodically. The markers are usually attached directly to the skin (as in biomechanics), or they are velcroed to a performer wearing a full-body spandex/lycra suit designed specifically for motion capture. This type of system can capture large numbers of markers at frame rates usually around 120 to 160 fps, although by lowering the resolution and tracking a smaller region of interest they can track as high as 10,000 fps.
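To show how grayscale values can yield a sub-pixel marker position, the sketch below thresholds a small image and takes the intensity-weighted centroid of the remaining pixels. This is a simpler stand-in for fitting a Gaussian, not the method of any specific vendor; the function name, the threshold value and the tiny test image are all hypothetical, and the sketch assumes a single marker blob in view.

```python
def weighted_centroid(image, threshold):
    """Intensity-weighted centroid (x, y) of pixels at or above `threshold`.

    `image` is a list of rows of grayscale values. Returns None when no
    pixel passes the threshold.
    """
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value >= threshold:  # ignore dim pixels such as skin and fabric
                total += value
                sum_x += x * value
                sum_y += y * value
    if total == 0.0:
        return None
    return (sum_x / total, sum_y / total)

# Hypothetical 4x4 patch: the blob is brighter on its right-hand column.
blob = [
    [0, 0, 0, 0],
    [0, 100, 200, 0],
    [0, 100, 200, 0],
    [0, 0, 0, 0],
]
cx, cy = weighted_centroid(blob, threshold=50)
print(f"centroid at x={cx:.3f}, y={cy:.3f}")
```

Because the brighter column pulls the estimate to the right, the x coordinate falls between pixel centres, which is the sub-pixel effect described above.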

Active marker

Active optical systems triangulate positions by illuminating one LED at a time very quickly or multiple LEDs with software to identify them by their relative positions, somewhat akin to celestial navigation. Rather than reflecting light back that is generated externally, the markers themselves are powered to emit their own light. Since the inverse square law provides one quarter of the power at two times the distance, this can increase the distances and volume for capture. This also enables a high signal-to-noise ratio, resulting in very low marker jitter and a resulting high measurement resolution (often down to 0.1 mm within the calibrated volume).

The TV series ''Stargate SG1'' produced episodes using an active optical system for the VFX, allowing the actor to walk around props that would make motion capture difficult for other non-active optical systems. ILM used active markers in ''Van Helsing'' to allow capture of Dracula's flying brides on very large sets, similar to Weta's use of active markers in ''Rise of the Planet of the Apes''.

The power to each marker can be provided sequentially in phase with the capture system, providing a unique identification of each marker for a given capture frame at a cost to the resultant frame rate. The ability to identify each marker in this manner is useful in real-time applications. The alternative method of identifying markers is to do it algorithmically, which requires extra processing of the data. It is also possible to find the position by using colored LED markers. In these systems, each color is assigned to a specific point of the body. One of the earliest active marker systems in the 1980s was a hybrid passive-active mocap system with rotating mirrors and colored glass reflective markers, which used masked linear array detectors.
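The frame-rate cost of powering markers sequentially can be estimated with simple arithmetic. The sketch below assumes the simplest possible scheme, in which exactly one marker is lit per camera frame in a fixed round-robin order; real systems may group or interleave markers differently, and the function names and numbers here are hypothetical.

```python
def per_marker_rate_hz(camera_fps, marker_count):
    """Update rate of each marker when one marker is lit per camera frame."""
    if camera_fps <= 0 or marker_count <= 0:
        raise ValueError("camera_fps and marker_count must be positive")
    return camera_fps / marker_count

def marker_lit_in_frame(frame_index, marker_count):
    """Index of the marker that is lit in a given frame (round-robin order)."""
    return frame_index % marker_count

# Illustrative values: a 960 fps camera strobing 8 markers in turn.
rate = per_marker_rate_hz(960, 8)
print(rate, [marker_lit_in_frame(i, 8) for i in range(10)])
```

Under this assumption the marker identity follows directly from the frame index, which is why no extra algorithmic identification step is needed.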

Time modulated active marker

Active marker systems can further be refined by strobing one marker on at a time, or tracking multiple markers over time and modulating the amplitude or pulse width to provide marker ID. 12-megapixel spatial resolution modulated systems show more subtle movements than 4-megapixel optical systems by having both higher spatial and temporal resolution. Directors can see the actor's performance in real-time, and watch the results on the motion capture-driven CG character. The unique marker IDs reduce the turnaround, by eliminating marker swapping and providing much cleaner data than other technologies. LEDs with onboard processing and radio synchronization allow motion capture outdoors in direct sunlight while capturing at 120 to 960 frames per second due to a high-speed electronic shutter. Computer processing of modulated IDs allows less hand cleanup or filtered results for lower operational costs. This higher accuracy and resolution requires more processing than passive technologies, but the additional processing is done at the camera to improve resolution via subpixel or centroid processing, providing both high resolution and high speed. These motion capture systems typically cost $20,000 for an eight-camera, 12-megapixel spatial resolution 120-hertz system with one actor.
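One way to picture marker identification by modulation over time is a marker that blinks a fixed-length binary code across consecutive frames. The sketch below decodes such a code from per-frame brightness samples. The encoding (on/off amplitude, most significant bit first, already synchronised to the start of the code) is an assumption made for illustration and does not describe any particular commercial system.

```python
def decode_marker_id(brightness, threshold, bits):
    """Decode a marker ID from per-frame brightness samples.

    Each of the first `bits` samples is read as 1 if it is at or above
    `threshold`, else 0, most significant bit first (assumed scheme).
    """
    if bits <= 0 or len(brightness) < bits:
        raise ValueError("need at least `bits` brightness samples")
    marker_id = 0
    for sample in brightness[:bits]:
        marker_id = (marker_id << 1) | (1 if sample >= threshold else 0)
    return marker_id

# Hypothetical normalised brightness of one tracked marker over four frames.
samples = [0.9, 0.1, 0.8, 0.85]  # read as the bit pattern 1, 0, 1, 1
decoded = decode_marker_id(samples, threshold=0.5, bits=4)
print(decoded)
```

A practical decoder would also need frame synchronisation and some tolerance for dropped or occluded frames, which are omitted here.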

Semi-passive imperceptible marker

One can reverse the traditional approach based on high-speed cameras. Systems such as Prakash use inexpensive multi-LED high-speed projectors. The specially built multi-LED IR projectors optically encode the space. Instead of retro-reflective or active light-emitting diode (LED) markers, the system uses photosensitive marker tags to decode the optical signals. By attaching tags with photo sensors to scene points, the tags can compute not only their own locations, but also their own orientation, incident illumination, and reflectance. These tracking tags work in natural lighting conditions and can be imperceptibly embedded in attire or other objects. The system supports an unlimited number of tags in a scene, with each tag uniquely identified to eliminate marker reacquisition issues. Since the system eliminates a high-speed camera and the corresponding high-speed image stream, it requires significantly lower data bandwidth. The tags also provide incident illumination data, which can be used to match scene lighting when inserting synthetic elements. The technique appears ideal for on-set motion capture or real-time broadcasting of virtual sets but has yet to be proven.

Underwater motion capture system

Motion capture technology has been available for researchers and scientists for a few decades, which has given new insight into many fields.

Underwater cameras

The vital part of the system, the underwater camera, has a waterproof housing. The housing has a finish that withstands corrosion and chlorine, which makes it well suited for use in basins and swimming pools. There are two types of cameras. Industrial high-speed cameras can also be used as infrared cameras. Infrared underwater cameras come with a cyan light strobe instead of the typical IR light, for minimum fall-off underwater, and high-speed cameras come with an LED light or with the option of using image processing.

Motion tracking by using image processing

Measurement volume

An underwater camera is typically able to measure 15–20 meters, depending on the water quality, the camera and the type of marker used. Unsurprisingly, the best range is achieved when the water is clear, and, as always, the measurement volume also depends on the number of cameras. A range of underwater markers is available for different circumstances.

Tailored

Different pools require different mountings and fixtures. Therefore, all underwater motion capture systems are uniquely tailored to suit each specific pool installation. For cameras placed in the center of the pool, specially designed tripods with suction cups are provided.

Markerless

Emerging techniques and research in computer vision are leading to the rapid development of the markerless approach to motion capture. Markerless systems such as those developed at Stanford University, the University of Maryland, MIT, and the Max Planck Institute do not require subjects to wear special equipment for tracking. Special computer algorithms are designed to allow the system to analyze multiple streams of optical input and identify human forms, breaking them down into constituent parts for tracking. ESC Entertainment, a subsidiary of Warner Brothers Pictures created especially to enable virtual cinematography, used a technique called Universal Capture that utilized a seven-camera setup and tracked the optical flow of all pixels over all the 2-D planes of the cameras for motion, gesture and facial expression capture, leading to photorealistic results.

Traditional systems

Traditionally, markerless optical motion tracking is used to keep track of various objects, including airplanes, launch vehicles, missiles and satellites. Many such optical motion tracking applications occur outdoors, requiring differing lens and camera configurations. High-resolution images of the target being tracked can thereby provide more information than just motion data. The image obtained from NASA's long-range tracking system on the space shuttle Challenger's fatal launch provided crucial evidence about the cause of the accident. Optical tracking systems are also used to identify known spacecraft and space debris, although they have a disadvantage compared to radar in that the objects must be reflecting or emitting sufficient light.

An optical tracking system typically consists of three subsystems: the optical imaging system, the mechanical tracking platform and the tracking computer. The optical imaging system is responsible for converting the light from the target area into a digital image that the tracking computer can process. Depending on the design of the optical tracking system, the optical imaging system can vary from as simple as a standard digital camera to as specialized as an astronomical telescope on the top of a mountain. The specification of the optical imaging system determines the upper limit of the effective range of the tracking system. The mechanical tracking platform holds the optical imaging system and is responsible for manipulating it in such a way that it always points to the target being tracked. The dynamics of the mechanical tracking platform combined with the optical imaging system determine the tracking system's ability to keep the lock on a target that changes speed rapidly. The tracking computer is responsible for capturing the images from the optical imaging system, analyzing the image to extract the target position and controlling the mechanical tracking platform to follow the target.
There are several challenges. First, the tracking computer has to be able to capture the image at a relatively high frame rate. This places a requirement on the bandwidth of the image-capturing hardware. The second challenge is that the image processing software has to be able to extract the target image from its background and calculate its position. Several textbook image-processing algorithms are designed for this task. This problem can be simplified if the tracking system can expect certain characteristics that are common to all the targets it will track. The next problem down the line is controlling the tracking platform to follow the target. This is a typical control system design problem rather than a challenge, and involves modeling the system dynamics and designing controllers for it. It will, however, become a challenge if the tracking platform the system has to work with is not designed for real-time operation. The software that runs such systems is also customized for the corresponding hardware components. One example of such software is OpticTracker, which controls computerized telescopes to track moving objects at great distances, such as planes and satellites. Another option is the software SimiShape, which can also be used in a hybrid mode in combination with markers.
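The target-extraction and platform-control steps described above can be sketched in a few lines. The example assumes the target is simply brighter than its background (one of the "common characteristics" that simplify the problem) and uses a plain proportional controller. The names `find_target` and `pointing_correction`, the gain and the toy frame are all hypothetical, and none of this represents OpticTracker or SimiShape.

```python
def find_target(image, threshold):
    """Return the (x, y) centroid of pixels brighter than threshold, or None."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # target lost: nothing stands out from the background
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def pointing_correction(target, image_size, gain=0.5):
    """Proportional control: command is gain times the offset from image centre."""
    width, height = image_size
    centre_x, centre_y = (width - 1) / 2, (height - 1) / 2
    return (gain * (target[0] - centre_x), gain * (target[1] - centre_y))


# Hypothetical 5x5 frame with a bright target right of and below the centre.
frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 0, 0],
]
target = find_target(frame, threshold=5)
command = pointing_correction(target, image_size=(5, 5))
```

A real platform would convert the pixel-space command into motor units and would typically need a controller tuned to the platform dynamics. The proportional gain here only shows the structure of the loop.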

RGB-D cameras

RGB-D cameras such as Kinect capture both color and depth images. By fusing the two images, 3D colored voxels can be captured, allowing motion capture of 3D human motion and the human surface in real time. Because a single-view camera is used, the captured motions are usually noisy. Machine learning techniques have been proposed to automatically reconstruct such noisy motions into higher-quality ones, using methods such as lazy learning and Gaussian models. Such methods generate motion accurate enough for serious applications like ergonomic assessment.
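Fusing a depth image with a color image can be sketched with the standard pinhole camera model, which back-projects each pixel into a colored 3D point. This sketch assumes the two images are already aligned pixel for pixel. The intrinsics `fx`, `fy`, `cx`, `cy` and the tiny 2x2 images are made-up values, not those of any real sensor.

```python
def depth_to_points(depth, color, fx, fy, cx, cy):
    """Back-project a depth image into a list of ((x, y, z), rgb) points.

    depth[v][u] is the distance along the optical axis for pixel (u, v);
    color[v][u] is the RGB value of the same pixel.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # zero depth is treated as a missing measurement
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append(((x, y, z), color[v][u]))
    return points


# Hypothetical 2x2 aligned depth (meters) and color images.
depth = [[0.0, 2.0], [2.0, 4.0]]
color = [[(0, 0, 0), (255, 0, 0)], [(0, 255, 0), (0, 0, 255)]]
cloud = depth_to_points(depth, color, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

The result is a colored point cloud. Quantizing the coordinates onto a regular grid would give the colored voxels the text refers to.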

Non-optical systems

Inertial systems

Inertial motion capture technology is based on miniature inertial sensors, biomechanical models and sensor fusion algorithms. The motion data of the inertial sensors (inertial guidance system) is often transmitted wirelessly to a computer, where the motion is recorded or viewed. Most inertial systems use inertial measurement units (IMUs) containing a combination of gyroscope, magnetometer, and accelerometer to measure rotational rates. These rotations are translated to a skeleton in the software. As with optical markers, the more IMU sensors are used, the more natural the data. No external cameras, emitters or markers are needed for relative motions, although they are required to give the absolute position of the user if desired. Inertial motion capture systems capture the full six-degrees-of-freedom body motion of a human in real time and can give limited direction information if they include a magnetic bearing sensor, although these are much lower resolution and susceptible to electromagnetic noise. Benefits of using inertial systems include capturing in a variety of environments including tight spaces, no solving, portability, and large capture areas. Disadvantages include lower positional accuracy and positional drift, which can compound over time. These systems are similar to the Wii controllers but are more sensitive and have greater resolution and update rates. They can accurately measure the direction to the ground to within a degree. The popularity of inertial systems is rising amongst game developers, mainly because of the quick and easy setup resulting in a fast pipeline. A range of suits is now available from various manufacturers, and base prices range from US$1,000 to US$80,000.
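The interplay between drift and sensor fusion can be illustrated with a complementary filter. This is a common textbook fusion scheme chosen here for illustration, not necessarily what any commercial suit uses. The bias value, sample rate, blend factor `alpha` and axis convention below are all assumptions.

```python
import math


def accel_pitch(ax, az):
    """Pitch (radians) implied by gravity on the x and z axes (sketch convention)."""
    return math.atan2(ax, az)


def complementary_filter(pitch, gyro_rate, ax, az, dt, alpha=0.98):
    """Blend the integrated gyroscope rate with the accelerometer's gravity reference."""
    gyro_estimate = pitch + gyro_rate * dt
    return alpha * gyro_estimate + (1 - alpha) * accel_pitch(ax, az)


# Simulate a stationary sensor whose gyroscope reports a constant bias.
dt = 0.01    # 100 Hz sample period (assumed)
bias = 0.05  # rad/s of spurious rotation (assumed)
integrated = 0.0  # gyroscope-only estimate: drifts without bound
fused = 0.0       # complementary-filter estimate: drift stays bounded
for _ in range(2000):  # 20 seconds of samples
    integrated += bias * dt
    fused = complementary_filter(fused, bias, ax=0.0, az=9.81, dt=dt)
```

After 20 simulated seconds the gyroscope-only estimate has drifted by about one radian, while the fused estimate settles at a small constant offset. The accelerometer can only correct tilt relative to gravity, which is consistent with heading and position remaining prone to drift.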

Mechanical motion

Mechanical motion capture systems directly track body joint angles and are often referred to as exoskeleton motion capture systems, due to the way the sensors are attached to the body. A performer attaches the skeleton-like structure to their body, and as they move so do the articulated mechanical parts, measuring the performer's relative motion. Mechanical motion capture systems are real-time, relatively low-cost, free from occlusion, and wireless (untethered) systems that have unlimited capture volume. Typically, they are rigid structures of jointed, straight metal or plastic rods linked together with potentiometers that articulate at the joints of the body. These suits tend to be in the $25,000 to $75,000 range, plus an external absolute positioning system. Some suits provide limited force feedback or haptic input.
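How potentiometer readings become relative motion can be sketched as a linear reading-to-angle mapping followed by forward kinematics along a planar chain of rods. The 10-bit reading range, 270-degree sweep and segment lengths are illustrative assumptions, and real suits work in three dimensions.

```python
import math


def pot_to_angle(reading, max_reading=1023, sweep_degrees=270.0):
    """Map a raw potentiometer reading linearly onto a joint angle in degrees."""
    return reading * sweep_degrees / max_reading


def planar_chain(joint_angles_deg, segment_lengths):
    """Forward kinematics of a planar chain: positions of each joint in turn.

    Each angle is measured relative to the previous segment, so the result
    is relative to the chain's root, not an absolute position in the room.
    """
    x = y = 0.0
    heading = 0.0
    positions = [(x, y)]
    for angle, length in zip(joint_angles_deg, segment_lengths):
        heading += math.radians(angle)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


# Hypothetical arm: shoulder reading 0, elbow reading 341 (a 90-degree bend).
shoulder = pot_to_angle(0)
elbow = pot_to_angle(341)
arm = planar_chain([shoulder, elbow], segment_lengths=[0.30, 0.25])
```

Since every position is computed relative to the root joint, an external absolute positioning system is still needed to place the performer in the room.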

Magnetic systems

Magnetic systems calculate position and orientation by the relative magnetic flux of three orthogonal coils on both the transmitter and each receiver. The relative intensity of the voltage or current of the three coils allows these systems to calculate both range and orientation by meticulously mapping the tracking volume. The sensor output is six degrees of freedom (6DOF), which provides useful results with two-thirds the number of markers required in optical systems: one on the upper arm and one on the lower arm for elbow position and angle. The markers are vulnerable to magnetic and electrical interference from metal objects in the environment, like rebar (steel reinforcing bars in concrete) or wiring, which affect the magnetic field, and from electrical sources such as monitors, lights, cables and computers. The sensor response is nonlinear, especially toward the edges of the capture area. The wiring from the sensors tends to preclude extreme performance movements. With magnetic systems, it is possible to monitor the results of a motion capture session in real time. The capture volumes for magnetic systems are dramatically smaller than they are for optical systems. With magnetic systems, there is a distinction between alternating-current (AC) and direct-current (DC) systems: DC systems use square pulses, while AC systems use sine waves.

Stretch sensors

Stretch sensors are flexible parallel-plate capacitors that measure stretch, bend, shear, or pressure and are typically produced from silicone. When the sensor stretches or squeezes, its capacitance value changes. This data can be transmitted via Bluetooth or direct input and used to detect minute changes in body motion. Stretch sensors are unaffected by magnetic interference and are free from occlusion. The stretchable nature of the sensors also means they do not suffer from positional drift, which is common with inertial systems. Stretchable sensors, on the other hand, due to the material properties of their substrates and conducting materials, suffer from a relatively low signal-to-noise ratio, requiring filtering or machine learning to make them usable for motion capture. These solutions result in higher latency when compared to alternative sensors.
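The trade-off between filtering and latency can be seen with a simple exponential moving average, used here as a stand-in for whatever filter a real system applies. The step input, the smoothing factors and the 90 percent threshold are illustrative assumptions.

```python
def ema(samples, alpha):
    """Exponential moving average; smaller alpha smooths more but responds slower."""
    smoothed = []
    state = samples[0]
    for sample in samples:
        state += alpha * (sample - state)
        smoothed.append(state)
    return smoothed


def first_index_at_least(values, level):
    """Index of the first value that reaches `level`, or None if none does."""
    for index, value in enumerate(values):
        if value >= level:
            return index
    return None


# Idealized sensor signal: a sudden stretch at sample 5 (no noise, for clarity).
step_start = 5
step = [0.0] * step_start + [1.0] * 60

# Samples elapsed after the stretch before the output reaches 90% of it.
lag_heavy = first_index_at_least(ema(step, alpha=0.1), 0.9) - step_start
lag_light = first_index_at_least(ema(step, alpha=0.5), 0.9) - step_start
```

The heavier filter (`alpha=0.1`), which would suppress more noise, takes several times longer to follow the same movement than the lighter one.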

Related techniques

Facial motion capture

Most traditional motion capture hardware vendors provide some type of low-resolution facial capture utilizing anywhere from 32 to 300 markers with either an active or passive marker system. All of these solutions are limited by the time it takes to apply the markers, calibrate the positions and process the data. Ultimately, the technology also limits their resolution and raw output quality levels. High-fidelity facial motion capture, also known as performance capture, is the next generation of fidelity and is utilized to record the more complex movements in a human face in order to capture higher degrees of emotion. Facial capture is currently arranging itself in several distinct camps, including traditional motion capture data, blend-shape-based solutions, capturing the actual topology of an actor's face, and proprietary systems. The two main techniques are stationary systems with an array of cameras capturing the facial expressions from multiple angles and using software such as the stereo mesh solver from OpenCV to create a 3D surface mesh, or using light arrays as well to calculate the surface normals from the variance in brightness as the light source, camera position or both are changed. These techniques tend to be limited in feature resolution only by the camera resolution, apparent object size and number of cameras. If the user's face is 50 percent of the working area of the camera and the camera has megapixel resolution, then submillimeter facial motions can be detected by comparing frames. Recent work is focusing on increasing the frame rates and using optical flow to allow the motions to be retargeted to other computer-generated faces, rather than just making a 3D mesh of the actor and their expressions.
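The submillimeter claim can be checked with rough arithmetic. The sketch below assumes a square 1000 by 1000 pixel sensor for "megapixel resolution" and a face about 160 mm across. Both figures are illustrative assumptions rather than values from the text.

```python
import math


def mm_per_pixel(sensor_px, area_fraction, face_size_mm):
    """Physical size covered by one pixel on a face filling part of the frame.

    A face covering `area_fraction` of a square image spans the square root
    of that fraction along each axis.
    """
    face_px = sensor_px * math.sqrt(area_fraction)
    return face_size_mm / face_px


# Assumed values: 1000-pixel-wide sensor, face fills half the image area.
scale = mm_per_pixel(sensor_px=1000, area_fraction=0.5, face_size_mm=160.0)
```

Under these assumptions each pixel covers roughly a quarter of a millimeter, so a feature that shifts by a single pixel between frames has moved by less than a millimeter.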

Radio frequency positioning

Radio frequency positioning systems are becoming more viable as higher-frequency radio devices allow greater precision than older technologies such as radar. The speed of light is 30 centimeters per nanosecond (billionth of a second), so a 10 gigahertz (billion cycles per second) radio frequency signal enables an accuracy of about 3 centimeters. By measuring amplitude to a quarter wavelength, it is possible to improve the resolution down to about 8 mm. To achieve the resolution of optical systems, frequencies of 50 gigahertz or higher are needed, which are almost as dependent on line of sight and as easy to block as optical systems. Multipath and reradiation of the signal are likely to cause additional problems, but these technologies will be ideal for tracking larger volumes with reasonable accuracy, since the required resolution at 100-meter distances is not likely to be as high. Many scientists believe that radio frequency will never produce the accuracy required for motion capture. Researchers at the Massachusetts Institute of Technology said in 2015 that they had made a system that tracks motion by radio frequency signals.
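The figures above follow from the relation between wavelength and frequency, wavelength = c / f. The short calculation below reproduces them; the helper names are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # roughly 30 cm per nanosecond


def wavelength_m(frequency_hz):
    """Wavelength in meters of a radio signal at the given frequency."""
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz


def quarter_wavelength_mm(frequency_hz):
    """A quarter of the wavelength, in millimeters."""
    return wavelength_m(frequency_hz) / 4 * 1000


wavelength_10ghz_cm = wavelength_m(10e9) * 100  # about 3 cm
quarter_10ghz_mm = quarter_wavelength_mm(10e9)  # about 7.5 mm
quarter_50ghz_mm = quarter_wavelength_mm(50e9)  # about 1.5 mm
```

At 10 gigahertz the wavelength is about 3 cm and a quarter of it is about 7.5 mm, which the text rounds to 8 mm. At 50 gigahertz the quarter wavelength drops to about 1.5 mm.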

Non-traditional systems

An alternative approach was developed in which the actor is given an unlimited walking area through the use of a rotating sphere, similar to a hamster ball, which contains internal sensors recording the angular movements, removing the need for external cameras and other equipment. Even though this technology could potentially lead to much lower costs for motion capture, the basic sphere is only capable of recording a single continuous direction. Additional sensors worn on the person would be needed to record anything more. Another alternative is using a 6DOF (six degrees of freedom) motion platform with an integrated omnidirectional treadmill and high-resolution optical motion capture to achieve the same effect. The captured person can walk in an unlimited area, negotiating different uneven terrains. Applications include medical rehabilitation for balance training, biomechanical research and virtual reality.

3D pose estimation

In 3D pose estimation, an actor's pose can be reconstructed from an image or depth map.

External links

The fascination for motion capture, an introduction to the history of motion capture technology