Aggregate data

Aggregate data are high-level data acquired by combining individual-level data. For instance, the output of an industry is an aggregate of the individual outputs of the firms within that industry. Aggregate data are applied in statistics, data warehouses, and economics.

There is a distinction between aggregate data and individual data. Aggregate data are individual data that have been averaged by geographic area, by year, by service agency, or by other means. Individual data are disaggregated individual results and are used to conduct analyses for estimation of subgroup differences.

Aggregate data are mainly used by researchers and analysts, policymakers, banks and administrators for multiple reasons. They are used to evaluate policies, recognise trends and patterns of processes, gain relevant insights, and assess current measures for strategic planning. Aggregate data collected from various sources are used in different areas of study, such as comparative political analysis and APD scientific meta-analyses. Aggregate data are also used for medical and educational purposes.

Aggregate data are widely used, but they also have limitations, including the risk of drawing inaccurate inferences and false conclusions, a problem termed the ‘ecological fallacy’. The ecological fallacy means that it is invalid to draw conclusions about the relationship between two quantitative variables at the individual level from a relationship observed at the aggregate level.


Applications

In statistics, aggregate data are data combined from several measurements. When data are aggregated, groups of observations are replaced with summary statistics based on those observations.

In a data warehouse, the use of aggregate data dramatically reduces the time needed to query large sets of data. Developers pre-summarise queries that are regularly used, such as weekly sales across several dimensions, for example by item hierarchy or geographical hierarchy.

In economics, aggregate data or data aggregates are high-level data composed from a multitude or combination of other, more individual data, such as:
* in macroeconomics, data such as the overall price level or overall inflation rate; and
* in microeconomics, data of an entire sector of an economy composed of many firms, or of all households in a city or region.
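
The statistical and data-warehouse uses described above can be illustrated with a minimal Python sketch. The sales records, the `aggregate` helper and the choice of summary statistics (count, total, mean) are assumptions made for illustration; they are not taken from any particular warehouse product.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual-level sales records: (week, region, item, amount).
sales = [
    (1, "North", "widget", 120.0),
    (1, "North", "gadget", 80.0),
    (1, "South", "widget", 200.0),
    (2, "North", "widget", 100.0),
    (2, "South", "gadget", 50.0),
    (2, "South", "widget", 150.0),
]

def aggregate(records, key):
    """Replace each group of records with summary statistics of its amounts."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[3])
    return {
        group: {"count": len(amounts), "total": sum(amounts), "mean": mean(amounts)}
        for group, amounts in groups.items()
    }

# Pre-summarised "weekly sales by region" table: built once, queried many times.
weekly_by_region = aggregate(sales, key=lambda r: (r[0], r[1]))

# A query now reads one summary row instead of scanning every record.
print(weekly_by_region[(1, "North")])
```

Changing the `key` function (for example to group by item instead of region) gives a different aggregate over the same individual records, which is the sense in which sales can be summarised "across several dimensions".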


Major users


Researchers and analysts

Researchers use aggregate data to understand the prevalent ethos, evaluate the essence of social realities and of a social organisation, stipulate primary issues of concern in research, and supply projections in relation to the nature of social issues. Aggregate data are useful for researchers who are interested in investigating the relationship between two distinct variables at the aggregate level, or the connection between an aggregate variable and a characteristic at the individual level. Researchers have also used aggregate data to critically evaluate the policies, practices and precepts of systems, and to investigate their relevance and efficacy.


Policymakers

Governments use aggregate data to develop more effective policies, because such data serve as a measure of how capable a government is of being aware of the demands and needs of its citizens, and of how effectively it maintains social order. For example, governments around the world have used aggregate mobile location data for analysis in response to COVID-19. Aggregate mobile location data can provide insights into the effectiveness of social distancing measures launched by governments. Governments also use aggregate data to identify possible “hot spots” and the potential for transmission.

As well as projecting the effectiveness of government policies, aggregate data analyses are undertaken to evaluate the nature, assess the extent, recognise the trend and study the pattern of a specific phenomenon or process, with the aim of devising strategies, preparing short- or long-term policies, and taking efficacious and relevant measures for control or prevention.

Policymakers also use financial aggregates data in evaluating the economic and financial activities of companies and households, because these data help to identify risks to financial stability. Policymakers can employ aggregate data to better understand developments in a country’s economic and financial conditions.


Banks

Banks collect aggregated data from a significant number of customers and then anonymise the data by eliminating personal information. The main reasons for banks to use aggregate data are to estimate economic trends and to gain insights into customer clusters. Banks are not permitted to share customers’ personal data, but aggregate data can be shared with banks’ business customers and can be accessed by other partners who use the same platform. In Australia, the Commonwealth Bank provides its business clients with anonymised data about their customers, derived from card transactions. ANZ also provides its business customers with anonymised data gathered from millions of merchant terminal transactions and ANZ card transactions.

In the UK, the Integrated Urgent Care Aggregate Data Collection (IUC ADC) provides comprehensive information about IUC activity, performance and service demand. Its data are sourced from the lead data providers responsible for offering integrated urgent care services in England. The National Health Service (NHS), under the Department of Health and Social Care (DHSC) in England, stated that this collection of aggregate data will replace the NHS 111 minimum dataset. It will also be used as a formal source for IUC statistics, and to monitor the Key Performance Indicators (KPIs) of the IUC ADC.


Administrators

Empirical data available at the national or regional level are used as sources of reference by administrators and intellectuals, as well as by people who are concerned about the welfare of a region or society. In particular, administrators use aggregate data to assess the current political, religious, social, or other atmosphere of a nation, to track gaps in social responses across time and space, and to dictate priorities for action. These assessments help administrators to evaluate current measures, which is useful in future strategic planning, and provide indicators of effective corrective measures.


Sources and collection methods

Aggregate data can be composed from various types of writings and records, including biography, autobiography, descriptive accounts and correspondence. For example, a researcher collects, collates, or compiles aggregate data using multiple mechanisms of social research, including an inventory, an interview, an opinionnaire, and a questionnaire or schedule. Official or non-official agencies also collect and compile aggregate data on an ongoing basis, using infrastructure available within a department at the field level.

Sources of aggregate data can also be regarded as tools for discovering data. In the US, some data are presented in the form of tables. Examples of sources for US aggregate data include the United States Census Bureau, the Statistical Abstract of the United States, and Social Explorer. International Monetary Fund data, World DataBank, and the Penn World Table are examples of transnational and international aggregate data sources.


Use of aggregate data


Comparative political analysis

Aggregate data are used in comparative political analysis because analysts do not focus only on the behaviour of individuals. They also focus on the behaviour of areal units, including electoral constituencies and nations. In analyses of political activity, significant data such as those related to industrialisation, urbanisation and mass communication networks are not readily expressed at the individual level. They are expressed in per capita terms in order to control for variations in the population size of the areal units. Aggregate data are widely available because demographic, socio-economic, and political data are collected and published by nations. This helps researchers and analysts to carry out longer trend studies and allows them to bring changes and developments into sharper focus.
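
As a small illustration of why per capita terms are used, the following Python sketch compares two areal units of different size. The constituency names, populations and circulation figures are invented for the example.

```python
# Hypothetical areal units: (name, population, total newspaper circulation).
units = [
    ("Constituency A", 50_000, 10_000),
    ("Constituency B", 200_000, 30_000),
]

def per_capita(total, population):
    """Express an aggregate total per person so units of different size compare."""
    return total / population

rates = {name: per_capita(total, population) for name, population, total in units}

# B has the larger raw total, but A has the higher per capita rate.
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} copies per person")
```

Comparing the raw totals alone would rank Constituency B first simply because it has more residents; dividing by population removes that size effect.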


APD scientific meta-analyses

Factors including the need for time, considerable resources and wide international cooperation have impeded the use of individual patient data (IPD) meta-analysis, which has led most published meta-analyses to rely on aggregate patient data (APD). To acquire data on all patients in all trials, aggregate patient data are collected from completed studies that were presented at professional meetings, published in the medical literature, or directly supplied by individual investigators. Aggregate patient data are used by the Cochrane Collaboration, the United States Preventive Services Task Force, and multiple professional societies to support clinical practice guidelines. Aggregate patient data are also used in meta-analyses of time-to-event studies, as the results can inform investors whether it is worthwhile to proceed to further meta-analyses based on resource-intensive individual patient data.


Other uses


Health care

In a health information system, aggregate data are the integration of data concerning numerous patients. A particular patient cannot be traced from aggregate data. These aggregated data are only counts, such as counts of cases of tuberculosis, malaria, or other diseases. Health facilities use this type of aggregated statistics to generate reports and indicators, and to undertake strategic planning in their health systems. By contrast, patient data are individual data related to a single patient, including the patient’s name, age, diagnosis and medical history. Patient-based data are mainly used to track the progress of a patient over time, such as how the patient responds to a particular treatment.

The COVID-19 Data Archive, also called COVID-ARC, aggregates data from studies around the globe. Researchers can access the discoveries of international colleagues and forge collaborations to facilitate the fight against the disease. More generally, using aggregated healthcare data allows health care providers to unlock actionable clinical insights, for instance when thorough views of clinical data or continuous patient records become possible.


Education

Aggregate data such as aggregate school-level demographic data and aggregate school-level achievement data are used in experimental analysis to assess the relationships between student achievement and school-level interventions. Aggregate data can also be used in non-experimental analyses such as regression discontinuity analysis and interrupted time-series analysis, which do not require individual-level data. For example, interrupted time-series analysis estimates the impact of a school-level program by comparing a school’s achievement before and after the program is launched.
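
One simple way to make the before-and-after comparison concrete is sketched below in Python. It assumes invented school-level mean scores and a hypothetical program start year. It fits a straight-line trend to the pre-program years and measures how far the post-program scores sit above that projected trend. This is only one basic form of interrupted time-series analysis; the original text does not specify a model.

```python
# Hypothetical school-level mean scores by year; the program starts in year 5.
pre_years, pre_scores = [0, 1, 2, 3, 4], [60.0, 61.0, 62.0, 63.0, 64.0]
post_years, post_scores = [5, 6, 7], [70.0, 71.0, 72.0]

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x

# Project the pre-program trend into the post-program years.
slope, intercept = fit_line(pre_years, pre_scores)
projected = [intercept + slope * year for year in post_years]

# Estimated impact: average gap between observed and projected scores.
gaps = [observed - expected for observed, expected in zip(post_scores, projected)]
estimated_impact = sum(gaps) / len(gaps)
print(estimated_impact)
```

Only one aggregate score per school-year is needed here, which is why individual student records are not required for this kind of analysis.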


Limitations

When units within some cluster or within a country are averaged, information is lost, which increases the probability of drawing inaccurate inferences. Information loss occurs because aggregation ignores individual variation, treating it as if it were only a type of statistical noise or measurement error. Inferences also differ depending on whether individual firm data or aggregated data are used for analysis. For instance, calculation of country averages does not account for firm-specific variables, such as firm size, firm age, or firm-ownership concentration, but calculation of individual averages does. Results generated from aggregate data therefore differ from those generated from individual data.

There is also the problem of the ‘ecological fallacy’, a concept introduced by Robinson (1950). The term reflects the fact that the variability around individual-level means can differ significantly from the variability around aggregate means. Aggregate measures express something other than the individual-level equivalents of the data, which means that individual-level conclusions cannot be drawn from them.

Although aggregate data have wider applicability than individual-level data, it is more challenging for researchers to analyse subgroup results when aggregate data are used, and individual information may eventually be required. Growth modelling and longitudinal modelling based on aggregate data are also difficult because variables can vary over time.
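
The ecological fallacy can be illustrated with a small constructed example. In the Python sketch below, the three groups and their values are invented so that the group averages are perfectly positively correlated while the individuals inside every group are perfectly negatively correlated. The `pearson` helper is written out by hand for clarity.

```python
from math import sqrt

# Hypothetical individual-level (x, y) observations in three groups.
groups = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

# Aggregate level: one (mean x, mean y) point per group.
group_means = [
    (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
    for pts in groups.values()
]
aggregate_r = pearson(group_means)

# Individual level: the correlation inside each group.
within_r = {name: pearson(pts) for name, pts in groups.items()}

print("aggregate:", aggregate_r)
print("within groups:", within_r)
```

An analyst who saw only the group means would find a strong positive relationship, yet inside each group higher x goes with lower y, so the aggregate result says nothing reliable about individuals.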


Other types of aggregate data


Financial aggregates data

Financial aggregates data are a type of aggregate data about credit and the money supply in Australia, used by policymakers in evaluating the economic and financial activities of both households and companies.


Credit aggregates

Credit aggregates are measurements of the borrowings of households and businesses from financial intermediaries. The amounts of funds borrowed by businesses for purposes including project investment, asset purchases, or cash flow management are also measured using credit aggregates.


Monetary aggregates

Monetary aggregates are measurements of the money or ‘money-like’ instruments of the banking system that are owed to businesses and households. An example of a ‘money-like’ instrument is a deposit in a bank account.


Census aggregate data

In the UK, census aggregate data are data generated as outputs from the United Kingdom censuses. They provide information about the socio-economic and demographic characteristics of the country’s population. They are a compilation of aggregated, or summarised, counts of the number of individuals, household residents, or families in particular geographic areas with specific characteristics, or combinations of characteristics, drawn from the themes of people and places, populations, families, health, ethnicity and religion, housing and work. Aggregate data form part of the outputs of the UK censuses and are obtained from analysis of the information given in the census returns. Census aggregate data are used to compare and describe population characteristics across various locations in the UK because they provide comparable information at a range of geographical levels over the entire UK. Census aggregate data are also used in the academic sector for teaching and research, and in the private sector for site location and marketing.


See also

* AggregateIQ

