In the context of human–computer interaction, a modality is the classification of a single independent channel of input/output between a computer and a human. Such channels may differ based on sensory nature (e.g., visual vs. auditory), or other significant differences in processing (e.g., text vs. image).
A system is designated unimodal if it has only one modality implemented, and
multimodal if it has more than one.
When multiple modalities are available for some tasks or aspects of a task, the system is said to have overlapping modalities. If multiple modalities are available for a task, the system is said to have redundant modalities. Multiple modalities can be used in combination to provide complementary methods that may be redundant but convey information more effectively. Modalities can generally be defined in two forms: computer–human and human–computer modalities.
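The unimodal/multimodal distinction above reduces to counting distinct channels. A minimal sketch follows; the function name `classify_system` and the modality labels are illustrative assumptions, not terms from any standard library.

```python
def classify_system(modalities):
    """Return 'unimodal' or 'multimodal' for a collection of modality names.

    Hypothetical helper: duplicates are collapsed, since a modality is a
    single independent channel and listing it twice adds nothing.
    """
    distinct = set(modalities)
    if not distinct:
        raise ValueError("a system needs at least one modality")
    return "unimodal" if len(distinct) == 1 else "multimodal"


print(classify_system(["vision"]))              # one channel
print(classify_system(["vision", "audition"]))  # more than one channel
```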
Computer–human modalities
Computers utilize a wide range of technologies to communicate and send information to humans:
* Common modalities
** Vision – computer graphics typically through a screen
** Audition – various audio outputs
** Tactition – vibrations or other movement
* Uncommon modalities
** Gustation (taste)
** Olfaction (smell)
** Thermoception (heat)
** Nociception (pain)
** Equilibrioception (balance)
Any human sense can be used as a computer–human modality. However, seeing and hearing are the most commonly employed, since they can transmit information faster than other modalities: 250 to 300 and 150 to 160 words per minute (wpm), respectively. Though not commonly implemented as a computer–human modality, tactition can achieve an average of 125 wpm through the use of a refreshable Braille display. Other, more common forms of tactition are smartphone and game controller vibrations.
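To make the rates above concrete, the sketch below converts them into the time needed to convey a message of a given length. The rates are the ones quoted in the text; the names `RATES_WPM`, `seconds_to_convey`, and `time_ranges` are hypothetical, and the calculation assumes a constant rate, which real reading or listening will not match exactly.

```python
# Rates quoted in the text, in words per minute, as (low, high).
RATES_WPM = {
    "vision": (250, 300),
    "audition": (150, 160),
    "tactition (Braille display)": (125, 125),  # the text gives one average
}


def seconds_to_convey(word_count, wpm):
    """Seconds needed to convey word_count words at a constant wpm rate."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return word_count * 60 / wpm


def time_ranges(word_count):
    """Map each modality to (fastest_seconds, slowest_seconds)."""
    return {
        name: (seconds_to_convey(word_count, high),
               seconds_to_convey(word_count, low))
        for name, (low, high) in RATES_WPM.items()
    }


for name, (fastest, slowest) in time_ranges(600).items():
    print(f"{name}: {fastest:.0f} to {slowest:.0f} s for 600 words")
```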
Human–computer modalities
Computers can be equipped with various types of input devices and sensors to allow them to receive information from humans. Common input devices are often interchangeable if they have a standardized method of communication with the computer and afford practical adjustments to the user. Certain modalities can provide a richer interaction depending on the context, and having options for implementation allows for more robust systems.
* Simple modalities
** Keyboard
** Pointing device
** Touchscreen
* Complex modalities
** Computer vision
** Speech recognition
** Motion
** Orientation
With the increasing popularity of smartphones, the general public is becoming more comfortable with the more complex modalities. Motion and orientation are commonly used in smartphone mapping applications. Speech recognition is widely used with virtual assistant applications. Computer vision is now common in camera applications that are used to scan documents and QR codes.
Using multiple modalities
Having multiple modalities in a system gives more affordance to users and can contribute to a more robust system. Having more modalities also allows for greater accessibility for users who work more effectively with certain modalities. Multiple modalities can be used as backup when certain forms of communication are not possible. This is especially true in the case of redundant modalities, in which two or more modalities are used to communicate the same information. Certain combinations of modalities can add to the expression of a computer–human or human–computer interaction, because each modality may be more effective than the others at expressing one form or aspect of information.
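The backup role described above can be sketched as a simple fallback: try the preferred modality first, and move down the list when it is unavailable. The function name `choose_output_modality` and the modality labels are assumptions made for illustration.

```python
def choose_output_modality(preferred, available):
    """Return the first preferred modality that is currently available.

    preferred: modality names in order of preference.
    available: modality names the system can use right now.
    Returns None when no preferred modality is available.
    """
    for modality in preferred:
        if modality in available:
            return modality
    return None


# Audio is unavailable (for example, the device is muted), so the
# same information falls back to the visual channel.
print(choose_output_modality(["audition", "vision", "tactition"],
                             {"vision", "tactition"}))
```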
There are six types of cooperation between modalities, and they help define how a combination or fusion of modalities works together to convey information more effectively.
* Equivalence: information is presented in multiple ways and can be interpreted as the same information
* Specialization: when a specific kind of information is always processed through the same modality
* Redundancy: multiple modalities process the same information
* Complementarity: multiple modalities take separate information and merge it
* Transfer: a modality produces information that another modality consumes
* Concurrency: multiple modalities take in separate information that is not merged
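The six types above can be written down as an enumeration, and two of them are easy to illustrate in a few lines. This is a minimal sketch: the enum values paraphrase the list, while the helper names and the speech-plus-pointing scenario are hypothetical examples chosen for illustration.

```python
from enum import Enum


class Cooperation(Enum):
    """The six cooperation types, summarised from the list above."""
    EQUIVALENCE = "same information presented in multiple ways"
    SPECIALIZATION = "one kind of information always uses the same modality"
    REDUNDANCY = "multiple modalities process the same information"
    COMPLEMENTARITY = "separate information is taken in and merged"
    TRANSFER = "one modality produces what another consumes"
    CONCURRENCY = "separate information is taken in and not merged"


def merge_complementary(speech, pointing):
    """Complementarity: each modality supplies a different part of one command."""
    return {"action": speech["action"], "target": pointing["position"]}


def read_redundant(readings):
    """Redundancy: every modality carries the same information.

    readings maps modality name to the information it conveyed; the shared
    value is returned, and a disagreement raises ValueError.
    """
    values = set(readings.values())
    if len(values) != 1:
        raise ValueError("redundant modalities disagree")
    return values.pop()


# A spoken verb and a pointed-at location merge into one command.
print(merge_complementary({"action": "move"}, {"position": (3, 4)}))
# The same alert arrives on two channels; either one suffices.
print(read_redundant({"vision": "low battery", "audition": "low battery"}))
```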
Complementary-redundant systems are those that use multiple sensors to form one understanding or dataset; the more effectively the information can be combined without duplicating data, the more effectively the modalities cooperate. Having multiple modalities for communication is common, particularly in smartphones, and their implementations often work together towards the same goal. For example, gyroscopes and accelerometers work together to track movement.
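One common way to combine gyroscope and accelerometer readings is a complementary filter. The text does not name a specific method, so the sketch below is an assumed illustration: it tracks a single tilt angle, and the blend weight `alpha=0.98`, the function names, and the sample format are all hypothetical choices.

```python
import math


def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """Blend one gyroscope step with one accelerometer estimate.

    angle:       previous tilt estimate, in radians
    gyro_rate:   angular velocity from the gyroscope, in radians per second
    accel_angle: tilt implied by gravity on the accelerometer, in radians
    dt:          time since the previous sample, in seconds
    alpha:       weight on the integrated gyroscope estimate (assumed value)
    """
    gyro_angle = angle + gyro_rate * dt  # responsive, but drifts over time
    return alpha * gyro_angle + (1 - alpha) * accel_angle


def track_tilt(samples, dt, alpha=0.98):
    """Run the filter over (gyro_rate, accel_x, accel_z) samples from angle 0."""
    angle = 0.0
    for gyro_rate, accel_x, accel_z in samples:
        accel_angle = math.atan2(accel_x, accel_z)  # tilt from gravity direction
        angle = complementary_filter(angle, gyro_rate, accel_angle, dt, alpha)
    return angle


# A device lying flat and not rotating: the estimate stays at zero.
print(track_tilt([(0.0, 0.0, 1.0)] * 100, dt=0.01))
```

The gyroscope term follows fast changes, while the accelerometer term slowly corrects accumulated drift, so each sensor covers the other's weakness.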