HOME

TheInfoList



OR:

Data migration is the process of selecting, preparing, extracting, and transforming
data In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpret ...
and permanently transferring it from one
computer storage Computer data storage is a technology consisting of computer components and recording media that are used to retain digital data. It is a core function and fundamental component of computers. The central processing unit (CPU) of a compute ...
system to another. Additionally, the validation of migrated data for completeness and the decommissioning of legacy data storage are considered part of the entire data migration process. Data migration is a key consideration for any system implementation, upgrade, or consolidation, and it is typically performed in such a way as to be as automated as possible, freeing up human resources from tedious tasks. Data migration occurs for a variety of reasons, including server or storage equipment replacements, maintenance or upgrades, application migration, website consolidation, disaster recovery, and
data center A data center (American English) or data centre (British English)See spelling differences. is a building, a dedicated space within a building, or a group of buildings used to house computer systems and associated components, such as telecommun ...
relocation.


The standard phases

, "nearly 40 percent of data migration projects were over time, over budget, or failed entirely." As such, to achieve an effective data migration, proper planning is critical. While the specifics of a data migration plan may vary—sometimes significantly—from project to project, the computing company IBM suggests there are three main phases to most any data migration project: planning, migration, and post-migration. Each of those phases has its own steps. During planning, dependencies and requirements are analyzed, migration scenarios get developed and tested, and a project plan that incorporates the prior information is created. During the migration phase, the plan is enacted, and during post-migration, the completeness and thoroughness of the migration is validated, documented, closed out, including any necessary decommissioning of legacy systems. For applications of moderate to high complexity, these data migration phases may be repeated several times before the new system is considered to be fully validated and deployed. Planning: The data, applications, etc. that will be migrated are selected based on business, project, and technical requirements and dependencies. Hardware and bandwidth requirements are analyzed. Feasible migration and back-out scenarios are developed, as well as the associated tests, automation scripts, mappings, and procedures. Data cleansing and transformation requirements are also gauged for data formats to improve
data quality Data quality refers to the state of qualitative or quantitative pieces of information. There are many definitions of data quality, but data is generally considered high quality if it is "fit for tsintended uses in operations, decision making and ...
and to eliminate redundant or obsolete information. Migration architecture is decided on and developed, any necessary software licenses are obtained, and change management processes are started. Migration: Hardware and software requirements are validated, and migration procedures are customized as necessary. Some sort of pre-validation testing may also occur to ensure requirements and customized settings function as expected. If all is deemed well, migration begins, including the primary acts of data extraction, where data is read from the old system, and data loading, where data is written to the new system. Additional verification steps ensure the developed migration plan was enacted in full. Post-migration: After data migration, results are subjected to data verification to determine whether data was accurately translated, is complete, and supports processes in the new system. During verification, there may be a need for a parallel run of both systems to identify areas of disparity and forestall erroneous
data loss Data loss is an error condition in information systems in which information is destroyed by failures (like failed spindle motors or head crashes on hard drives) or neglect (like mishandling, careless handling or storage under unsuitable conditions ...
. Additional documentation and reporting of the migration project is conducted, and once the migration is validated complete, legacy systems may also be decommissioned. Migration close-out meetings will officially end the migration process.


Project versus process

There is a difference between data migration and
data integration Data integration involves combining data residing in different sources and providing users with a unified view of them. This process becomes significant in a variety of situations, which include both commercial (such as when two similar companies ...
activities. Data migration is a project by means of which data will be moved or copied from one environment to another, and removed or decommissioned in the source. During the migration (which can take place over months or even years), data can flow in multiple directions, and there may be multiple migrations taking place simultaneously. The ETL (
extract, transform, load In computing, extract, transform, load (ETL) is a three-phase process where data is extracted, transformed (cleaned, sanitized, scrubbed) and loaded into an output data container. The data can be collated from one or more sources and it can also ...
) actions will be necessary, although the means of achieving these may not be those traditionally associated with the ETL acronym. Data integration, by contrast, is a permanent part of the IT architecture, and is responsible for the way data flows between the various applications and data stores—and is a process rather than a project activity. Standard ETL technologies designed to supply data from operational systems to data warehouses would fit within the latter category.


Categories

Data is stored on various media in files or
databases In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spa ...
, and is generated and consumed by
software applications Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work. At the lowest programming level, executable code consists ...
, which in turn support
business processes A business process, business method or business function is a collection of related, structured activities or tasks by people or equipment in which a specific sequence produces a service or product (serves a particular business goal) for a parti ...
. The need to transfer and convert data can be driven by multiple business requirements, and the approach taken to the migration depends on those requirements. Four major migration categories are proposed on this basis.


Storage migration

A business may choose to rationalize the physical media to take advantage of more efficient storage technologies. This will result in having to move physical blocks of data from one tape or disk to another, often using
virtualization In computing, virtualization or virtualisation (sometimes abbreviated v12n, a numeronym) is the act of creating a virtual (rather than actual) version of something at the same abstraction level, including virtual computer hardware platforms, stor ...
techniques. The data format and content itself will not usually be changed in the process and can normally be achieved with minimal or no impact to the layers above.


Database migration

Similarly, it may be necessary to move from one
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases ...
vendor to another, or to upgrade the version of database software being used. The latter case is less likely to require a physical data migration, but this can happen with major upgrades. In these cases a physical transformation process may be required since the underlying data format can change significantly. This may or may not affect behavior in the applications layer, depending largely on whether the data manipulation language or protocol has changed. However, some modern applications are written to be almost entirely agnostic to the database technology, so a change from
Sybase Sybase, Inc. was an enterprise software and services company. The company produced software to manage and analyze information in relational databases, with facilities located in California and Massachusetts. Sybase was acquired by SAP in 2010; ...
,
MySQL MySQL () is an open-source relational database management system (RDBMS). Its name is a combination of "My", the name of co-founder Michael Widenius's daughter My, and "SQL", the acronym for Structured Query Language. A relational database ...
,
IBM Db2 Db2 is a family of data management products, including database servers, developed by IBM. It initially supported the relational model, but was extended to support object–relational features and non-relational structures like JSON a ...
or SQL Server to
Oracle An oracle is a person or agency considered to provide wise and insightful counsel or prophetic predictions, most notably including precognition of the future, inspired by deities. As such, it is a form of divination. Description The word ...
should only require a testing cycle to be confident that both functional and non-functional performance has not been adversely affected.


Application migration

Changing application vendor—for instance a new CRM or ERP platform—will inevitably involve substantial transformation as almost every application or suite operates on its own specific data model and also interacts with other applications and systems within the
enterprise application integration Enterprise application integration (EAI) is the use of software and computer systems' architectural principles to integrate a set of enterprise computer applications. Overview Enterprise application integration is an integration framework comp ...
environment. Furthermore, to allow the application to be sold to the widest possible market, commercial off-the-shelf packages are generally configured for each customer using
metadata Metadata is "data that provides information about other data", but not the content of the data, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive metadata – the descriptive ...
.
Application programming interfaces An application programming interface (API) is a way for two or more computer programs to communicate with each other. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how ...
(APIs) may be supplied by vendors to protect the integrity of the data they have to handle. It is also possible to script the web interfaces of vendors to automatically migrate data.


Business process migration

Business processes A business process, business method or business function is a collection of related, structured activities or tasks by people or equipment in which a specific sequence produces a service or product (serves a particular business goal) for a parti ...
operate through a combination of human and application systems actions, often orchestrated by
business process management Business process management (BPM) is the discipline in which people use various methods to discover, model, analyze, measure, improve, optimize, and automate business processes. Any combination of methods used to manage a company's business p ...
tools. When these change they can require the movement of data from one store, database or application to another to reflect the changes to the organization and information about customers, products and operations. Examples of such migration drivers are mergers and acquisitions, business optimization, and reorganization to attack new markets or respond to competitive threat. The first two categories of migration are usually routine operational activities that the IT department takes care of without the involvement of the rest of the business. The last two categories directly affect the operational users of processes and applications, are necessarily complex, and delivering them without significant business downtime can be challenging. A highly adaptive approach, concurrent synchronization, a business-oriented audit capability, and clear visibility of the migration for stakeholders—through a project management office or data governance team—are likely to be key requirements in such migrations.


Migration as a form of digital preservation

Migration, which focuses on the digital object itself, is the act of transferring, or rewriting data from an out-of-date medium to a current medium and has for many years been considered the only viable approach to long-term preservation of digital objects. Reproducing brittle newspapers onto
microfilm Microforms are scaled-down reproductions of documents, typically either films or paper, made for the purposes of transmission, storage, reading, and printing. Microform images are commonly reduced to about 4% or of the original document size. ...
is an example of such migration.


Disadvantages

* Migration addresses the possible obsolescence of the data carrier, but does not address the fact that certain technologies which run the data may be abandoned altogether, leaving migration useless. * Time-consuming – migration is a continual process, which must be repeated every time a medium reaches obsolescence, for all data objects stored on a certain media. * Costly – an institution must purchase additional data storage media at each migration.


See also

* Data conversion * Data curation * Data preservation *
Data transformation In computing, data transformation is the process of converting data from one format or structure into another format or structure. It is a fundamental aspect of most data integrationCIO.com. Agile Comes to Data Integration. Retrieved from: htt ...
*
Digital preservation In library and archival science, digital preservation is a formal endeavor to ensure that digital information of continuing value remains accessible and usable. It involves planning, resource allocation, and application of preservation methods and ...
*
Extract, transform, load In computing, extract, transform, load (ETL) is a three-phase process where data is extracted, transformed (cleaned, sanitized, scrubbed) and loaded into an output data container. The data can be collated from one or more sources and it can also ...
(ETL) *
System migration System migration involves moving a set of instructions or programs, e.g., PLC (programmable logic controller) programs, from one platform to another, minimizing reengineering. Migration of systems can also involve downtime, while the old syste ...


References


External links

* {{Data Data management