Educational data mining (EDM) is a research field concerned with the application of data mining, machine learning and statistics to information generated from educational settings (e.g., universities and intelligent tutoring systems). At a high level, the field seeks to develop and improve methods for exploring this data, which often has multiple levels of meaningful hierarchy, in order to discover new insights about how people learn in the context of such settings. In doing so, EDM has contributed to theories of learning investigated by researchers in educational psychology and the learning sciences (R. Baker (2010) Data Mining for Education. In McGaw, B., Peterson, P., Baker, E. (Eds.) International Encyclopedia of Education (3rd edition), vol. 7, pp. 112-118. Oxford, UK: Elsevier). The field is closely tied to that of learning analytics, and the two have been compared and contrasted.


Definition

Educational data mining refers to techniques, tools, and research designed for automatically extracting meaning from large repositories of data generated by or related to people's learning activities in educational settings. Quite often, this data is extensive, fine-grained, and precise. For example, several learning management systems (LMSs) track information such as when each student accessed each learning object, how many times they accessed it, and how many minutes the learning object was displayed on the user's computer screen. As another example, intelligent tutoring systems record data every time a learner submits a solution to a problem. They may collect the time of the submission, whether or not the solution matches the expected solution, the amount of time that has passed since the last submission, the order in which solution components were entered into the interface, etc. The precision of this data is such that even a fairly short session with a computer-based learning environment (e.g., 30 minutes) may produce a large amount of process data for analysis. In other cases, the data is less fine-grained. For example, a student's university transcript may contain a temporally ordered list of courses taken by the student, the grade that the student earned in each course, and when the student selected or changed their academic major. EDM leverages both types of data to discover meaningful information about different types of learners and how they learn, the structure of domain knowledge, and the effect of instructional strategies embedded within various learning environments. These analyses provide new information that would be difficult to discern by looking at the raw data. For example, analyzing data from an LMS may reveal a relationship between the learning objects that a student accessed during the course and their final course grade. Similarly, analyzing student transcript data may reveal a relationship between a student's grade in a particular course and their decision to change their academic major. Such information provides insight into the design of learning environments, which allows students, teachers, school administrators, and educational policy makers to make informed decisions about how to interact with, provide, and manage educational resources.
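
The LMS example above can be made concrete with a small sketch. It assumes a hypothetical export in which each row holds a student identifier, the number of times that student accessed a learning object, and the final course grade; the data values and the `pearson` helper are illustrative and do not come from any particular LMS.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical LMS export: (student id, learning-object accesses, final grade).
lms_log = [
    ("s1", 2, 55.0),
    ("s2", 5, 68.0),
    ("s3", 9, 74.0),
    ("s4", 12, 81.0),
    ("s5", 15, 90.0),
]

accesses = [row[1] for row in lms_log]
grades = [row[2] for row in lms_log]

r = pearson(accesses, grades)
print(f"Pearson r between accesses and final grade: {r:.3f}")
```

A coefficient close to 1 or -1 would only indicate an association in this sample; it says nothing by itself about whether accessing the material causes higher grades.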


History

While the analysis of educational data is not itself a new practice, recent advances in educational technology, including the increase in computing power and the ability to log fine-grained data about students' use of a computer-based learning environment, have led to an increased interest in developing techniques for analyzing the large amounts of data generated in educational settings. This interest translated into a series of EDM workshops held from 2000 to 2007 as part of several international research conferences (C. Romero, S. Ventura. Educational Data Mining: A Review of the State-of-the-Art. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews. 40(6), 601-618, 2010). In 2008, a group of researchers established what has become an annual international research conference on EDM, the first of which took place in Montreal, Quebec, Canada. As interest in EDM continued to increase, EDM researchers established an academic journal in 2009, the Journal of Educational Data Mining, for sharing and disseminating research results. In 2011, EDM researchers established the International Educational Data Mining Society to connect EDM researchers and continue to grow the field. With the introduction of public educational data repositories in 2008, such as the Pittsburgh Science of Learning Center's (PSLC) DataShop and the National Center for Education Statistics (NCES), public data sets have made educational data mining more accessible and feasible, contributing to its growth.


Goals

Ryan S. Baker and Kalina Yacef identified the following four goals of EDM:
# Predicting students' future learning behavior – With the use of student modeling, this goal can be achieved by creating student models that incorporate the learner's characteristics, including detailed information such as their knowledge, behaviours and motivation to learn. The user experience of the learner and their overall satisfaction with learning are also measured.
# Discovering or improving domain models – Through the various methods and applications of EDM, new models can be discovered and existing models improved. Examples include illustrating the educational content to engage learners and determining optimal instructional sequences to support the student's learning style.
# Studying the effects of educational support that can be achieved through learning systems.
# Advancing scientific knowledge about learning and learners by building and incorporating student models, the field of EDM research, and the technology and software used.


Users and stakeholders

There are four main groups of users and stakeholders involved with educational data mining:
* Learners – Learners are interested in understanding student needs and methods to improve the learner's experience and performance. For example, learners can benefit from the discovered knowledge by using EDM tools to suggest activities and resources based on their interactions with the online learning tool and on insights from past or similar learners. For younger learners, educational data mining can also inform parents about their child's learning progress. It is also necessary to group learners effectively in an online environment. The challenge is to use the complex data to learn and interpret these groups by developing actionable models.
* Educators – Educators attempt to understand the learning process and the methods they can use to improve their teaching. Educators can use the applications of EDM to determine how to organize and structure the curriculum, the best methods to deliver course information, and the tools to use to engage their learners for optimal learning outcomes. In particular, the distillation of data for human judgment technique gives educators an opportunity to benefit from EDM because it enables them to quickly identify behavioural patterns, which can support their teaching during the course or help improve future courses. Educators can determine indicators that show student satisfaction and engagement with course material, and also monitor learning progress.
* Researchers – Researchers focus on the development and the evaluation of data mining techniques for effectiveness. A yearly international conference for researchers began in 2008. Topics in EDM range from using data mining to improve institutional effectiveness to improving student performance.
* Administrators – Administrators are responsible for allocating the resources for implementation in institutions. As institutions are increasingly held responsible for student success, the administering of EDM applications is becoming more common in educational settings. Faculty and advisors are becoming more proactive in identifying and addressing at-risk students. However, it is sometimes a challenge to get the information to the decision makers so that the application can be administered in a timely and efficient manner.


Phases

As research in the field of educational data mining has continued to grow, a myriad of data mining techniques have been applied to a variety of educational contexts. In each case, the goal is to translate raw data into meaningful information about the learning process in order to make better decisions about the design and trajectory of a learning environment. Thus, EDM generally consists of four phases:
# The first phase of the EDM process (not counting pre-processing) is discovering relationships in data. This involves searching through a repository of data from an educational environment with the goal of finding consistent relationships between variables. Several algorithms for identifying such relationships have been utilized, including classification, regression, clustering, factor analysis, social network analysis, association rule mining, and sequential pattern mining.
# Discovered relationships must then be validated in order to avoid overfitting.
# Validated relationships are applied to make predictions about future events in the learning environment.
# Predictions are used to support decision-making processes and policy decisions.

During phases 3 and 4, data is often visualized or in some other way distilled for human judgment. A large amount of research has been conducted on best practices for visualizing data.
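
The four phases can be sketched end to end on a toy data set. The example below is only an illustration under stated assumptions: the data (hours of tutor use and exam scores), the choice of least-squares regression for phase 1, the held-out check with a tolerance of twice the training error for phase 2, and the pass mark of 60 used in phase 4 are all hypothetical.

```python
# Hypothetical data: (hours of tutor use, exam score) per student.
train = [(1, 52), (2, 58), (3, 61), (4, 67), (5, 72), (6, 75)]
held_out = [(2.5, 60), (4.5, 70), (5.5, 73)]

def fit_line(points):
    """Phase 1: discover a relationship with least-squares regression."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, points):
    """Average absolute difference between predicted and observed scores."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)

model = fit_line(train)

# Phase 2: validate on data the model has not seen, to guard against overfitting.
train_mae = mean_abs_error(model, train)
held_out_mae = mean_abs_error(model, held_out)
generalizes = held_out_mae <= 2 * train_mae  # assumed tolerance

# Phase 3: apply the validated relationship to predict a future outcome.
slope, intercept = model
predicted_score = slope * 1.5 + intercept  # a new student with 1.5 hours of use

# Phase 4: use the prediction to support a decision.
PASS_MARK = 60  # hypothetical threshold
needs_support = generalizes and predicted_score < PASS_MARK

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"train MAE={train_mae:.2f}, held-out MAE={held_out_mae:.2f}")
print(f"predicted score={predicted_score:.1f}, flag for support={needs_support}")
```

In practice the validation step would typically use cross-validation at the student level on far more data; a single small held-out set is used here only to keep the sketch short.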


Main approaches

Of the general categories of methods mentioned, prediction, clustering and relationship mining are considered universal methods across all types of data mining; however, discovery with models and distillation of data for human judgment are considered more prominent approaches within educational data mining.


Discovery with models

In the discovery with models method, a model is developed via prediction, clustering, or human-reasoning knowledge engineering, and then used as a component in another analysis, namely prediction or relationship mining. In the prediction use, the created model's predictions are used to predict a new variable. In the relationship mining use, the created model enables the analysis of relationships between its predictions and additional variables in the study. In many cases, discovery with models uses validated prediction models that have proven generalizability across contexts. Key applications of this method include discovering relationships between student behaviors, characteristics and contextual variables in the learning environment. Broad and specific research questions across a wide range of contexts can also be explored using this method.
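
As a rough illustration of the relationship mining use, the sketch below first applies a hand-built (knowledge-engineered) model and then treats its output as a variable in a second analysis. Everything here is hypothetical: the rule that an incorrect answer given in under 3 seconds counts as a "rapid guess", the log format, and the post-test scores.

```python
from math import sqrt

# Hypothetical tutor log: per student, a list of (response time in seconds, correct?).
logs = {
    "s1": [(1.2, False), (2.0, False), (8.0, True), (1.5, False)],
    "s2": [(6.0, True), (2.5, False), (7.1, True), (9.0, False)],
    "s3": [(5.5, True), (6.2, True), (4.8, False), (7.0, True)],
    "s4": [(1.0, False), (1.1, True), (2.2, False), (6.0, True)],
}
post_test = {"s1": 48, "s2": 71, "s3": 85, "s4": 60}

def is_rapid_guess(seconds, correct, threshold=3.0):
    """Step 1 model: a knowledge-engineered rule (the threshold is an assumption)."""
    return seconds < threshold and not correct

def guess_rate(actions):
    """Proportion of a student's actions that the model labels as rapid guesses."""
    return sum(is_rapid_guess(s, c) for s, c in actions) / len(actions)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mean_x) ** 2 for x in xs)
                      * sum((y - mean_y) ** 2 for y in ys))

# Step 2: the model's output becomes a new variable in a relationship analysis.
students = sorted(logs)
rates = [guess_rate(logs[s]) for s in students]
scores = [post_test[s] for s in students]
r = pearson(rates, scores)

print("rapid-guess rate per student:", dict(zip(students, rates)))
print(f"correlation with post-test score: {r:.3f}")
```

The second analysis is only as trustworthy as the first model, which is why the text stresses using prediction models that have already been validated.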


Distillation of data for human judgment

Humans can make inferences about data that may be beyond the scope of what an automated data mining method provides. In educational data mining, data is distilled for human judgment for two key purposes, identification and classification. For the purpose of identification, data is distilled to enable humans to identify well-known patterns, which may otherwise be difficult to interpret. For example, the learning curve, classic to educational studies, is a pattern that clearly reflects the relationship between learning and experience over time. Data is also distilled for the purpose of classifying features of data, which in educational data mining is used to support the development of the prediction model. Classification can greatly expedite the development of the prediction model. The goal of this method is to summarize and present the information in a useful, interactive and visually appealing way in order to understand the large amounts of education data and to support decision making. In particular, this method is beneficial to educators in understanding usage information and effectiveness in course activities. Key applications for the distillation of data for human judgment include identifying patterns in student learning and behavior, identifying opportunities for collaboration, and labeling data for future use in prediction models.
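
The learning curve mentioned above is a simple case of distillation: many individual attempts are reduced to one error rate per practice opportunity, which a person can then inspect. The sketch below assumes hypothetical data in which each student has an ordered list of correct/incorrect outcomes on a single skill; the text bar chart stands in for a real visualization.

```python
# Hypothetical data: for each student, correctness on successive practice
# opportunities for one skill (True = correct).
attempts = {
    "s1": [False, False, True, True, True],
    "s2": [False, True, False, True, True],
    "s3": [False, False, False, True, True],
    "s4": [True, False, True, True, True],
}

def learning_curve(attempts_by_student):
    """Return the error rate at each practice opportunity, averaged over students."""
    errors = {}
    totals = {}
    for outcomes in attempts_by_student.values():
        for opportunity, correct in enumerate(outcomes, start=1):
            totals[opportunity] = totals.get(opportunity, 0) + 1
            errors[opportunity] = errors.get(opportunity, 0) + (0 if correct else 1)
    return [errors[k] / totals[k] for k in sorted(totals)]

curve = learning_curve(attempts)

# Distill the curve into a text chart for human inspection.
for opportunity, rate in enumerate(curve, start=1):
    bar = "#" * round(rate * 20)
    print(f"opportunity {opportunity}: {rate:.2f} {bar}")
```

A falling error rate across opportunities is the pattern a reader would look for; a flat or rising curve might suggest that the skill is poorly defined or the material is not working.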


Applications

A list of the primary applications of EDM is provided by Cristobal Romero and Sebastian Ventura. In their taxonomy, the areas of EDM application are:
* Analysis and visualization of data
* Providing feedback for supporting instructors
* Recommendations for students
* Predicting student performance
* Student modeling
* Detecting undesirable student behaviors
* Grouping students
* Social network analysis
* Developing concept maps
* Constructing courseware – EDM can be applied to course management systems such as the open-source Moodle. Moodle contains usage data covering various user activities, such as test results, amount of reading completed, and participation in discussion forums. Data mining tools can be used to customize learning activities for each user and adapt the pace at which the student completes the course. This is particularly beneficial for online courses with varying levels of competency.
* Planning and scheduling

New research on mobile learning environments also suggests that data mining can be useful. Data mining can be used to help provide personalized content to mobile users, despite the differences in managing content between mobile devices and standard PCs and web browsers. New EDM applications will focus on allowing non-technical users to use and engage with data mining tools and activities, making data collection and processing more accessible for all users of EDM. Examples include statistical and visualization tools that analyze social networks and their influence on learning outcomes and productivity.
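
One of the application areas listed above, grouping students, is often approached with clustering. The sketch below uses a one-dimensional k-means on hypothetical average quiz scores; the choice of k-means, the two groups, and the starting centers (the lowest and highest score) are assumptions made for illustration, not a method prescribed by the taxonomy.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Group values around the nearest center, updating centers until stable."""
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Hypothetical average quiz scores, one per student.
scores = [35, 40, 42, 78, 82, 85, 90]

centers, groups = kmeans_1d(scores, centers=[min(scores), max(scores)])

for center, group in zip(centers, groups):
    print(f"group around {center:.2f}: {group}")
```

With real data the grouping would usually draw on several features (for example, forum participation and time on task), and the resulting groups would need to be interpreted by an educator before being acted upon.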


Courses

# In October 2013, Coursera offered a free online course on "Big Data in Education" that taught how and when to use key methods for EDM. This course moved to edX in the summer of 2015, and has continued to run on edX annually since then. A course archive is now available online.
# Teachers College, Columbia University offers an MS in Learning Analytics.


Publication venues

Considerable amounts of EDM work are published at the peer-reviewed International Conference on Educational Data Mining, organized by the International Educational Data Mining Society.
* 1st International Conference on Educational Data Mining (2008) – Montreal, Canada
* 2nd International Conference on Educational Data Mining (2009) – Cordoba, Spain
* 3rd International Conference on Educational Data Mining (2010) – Pittsburgh, PA, USA
* 4th International Conference on Educational Data Mining (2011) – Eindhoven, Netherlands
* 5th International Conference on Educational Data Mining (2012) – Chania, Greece
* 6th International Conference on Educational Data Mining (2013) – Memphis, TN, USA
* 7th International Conference on Educational Data Mining (2014) – London, UK
* 8th International Conference on Educational Data Mining (2015) – Madrid, Spain
* 9th International Conference on Educational Data Mining (2016) – Raleigh, NC, USA
* 10th International Conference on Educational Data Mining (2017) – Wuhan, China
* 11th International Conference on Educational Data Mining (2018) – Buffalo, NY, USA
* 12th International Conference on Educational Data Mining (2019) – Montréal, QC, Canada
* 13th International Conference on Educational Data Mining (2020) – Virtual
* 14th International Conference on Educational Data Mining (2021) – Paris, France

EDM papers are also published in the Journal of Educational Data Mining (JEDM). Many EDM papers are routinely published in related conferences, such as Artificial Intelligence and Education, the International Conference on Intelligent Tutoring Systems, and User Modeling, Adaptation, and Personalization. In 2011, Chapman & Hall/CRC Press (Taylor and Francis Group) published the first Handbook of Educational Data Mining. This resource was created for those who are interested in participating in the educational data mining community.


Contests

In 2010, the Association for Computing Machinery's KDD Cup was conducted using data from an educational setting. The data set was provided by the DataShop, and it consisted of over 1,000,000 data points from students using a cognitive tutor. Six hundred teams competed for over US$8,000 in prize money (which was donated by Facebook). The goal for contestants was to design an algorithm that, after learning from the provided data, would make the most accurate predictions from new data. The winners submitted an algorithm that utilized feature generation (a form of representation learning), random forests, and Bayesian networks.
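The general shape of such a task (learn from logged tutor data, then predict outcomes on held-out data) can be illustrated with a minimal sketch. This is not the winning entry's method: the log rows, the field layout, and the function names `build_rates`, `predict`, and `rmse` are hypothetical, the two generated features are simple smoothed success rates, and averaging them is a far simpler stand-in for random forests or Bayesian networks. Root-mean-square error is used here as one common accuracy measure.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical tutor log rows: (student_id, skill, correct_first_attempt).
train = [
    ("s1", "fractions", 1),
    ("s1", "fractions", 1),
    ("s1", "decimals", 0),
    ("s2", "fractions", 0),
    ("s2", "decimals", 0),
    ("s2", "decimals", 1),
]
# Held-out rows standing in for the "new data" to be predicted.
test = [
    ("s1", "fractions", 1),
    ("s2", "decimals", 0),
]


def build_rates(rows, key_index, prior=0.5, weight=1.0):
    """Generate one feature: a smoothed success rate per student or per skill."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
        counts[row[key_index]] += 1
    # Smoothing pulls rates based on few observations toward the prior.
    return {k: (totals[k] + prior * weight) / (counts[k] + weight) for k in counts}


def predict(row, student_rate, skill_rate, default=0.5):
    """Average the two generated features; unseen keys fall back to the default."""
    s = student_rate.get(row[0], default)
    k = skill_rate.get(row[1], default)
    return (s + k) / 2


def rmse(rows, predictions):
    """Root-mean-square error between predictions and observed outcomes."""
    return sqrt(sum((p - r[2]) ** 2 for p, r in zip(predictions, rows)) / len(rows))


student_rate = build_rates(train, 0)  # feature generated per student
skill_rate = build_rates(train, 1)    # feature generated per skill
predictions = [predict(r, student_rate, skill_rate) for r in test]
score = rmse(test, predictions)
print(predictions, score)
```

In a fuller pipeline, generated features like these would typically be fed to a learned model rather than averaged directly.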


Costs and challenges

Technological advancements bring costs and challenges for implementing EDM applications. These include the cost of storing logged data and the cost of hiring staff dedicated to managing data systems. Moreover, data systems may not always integrate seamlessly with one another, and even with the support of statistical and visualization tools, creating one simplified version of the data can be difficult. Choosing which data to mine and analyze can also be challenging, which makes the initial stages very time-consuming and labor-intensive. From beginning to end, an EDM strategy and its implementation must uphold privacy and ethics for all stakeholders involved.


Criticisms

* Generalizability – Research in EDM may be specific to the particular educational setting and time in which the research was conducted, and as such, may not be generalizable to other institutions. Research also indicates that the field of educational data mining is concentrated in Western countries and cultures; consequently, other countries and cultures may not be represented in the research and findings. Development of future models should consider applications across multiple contexts.
* Privacy – Individual privacy is a continuing concern in the application of data mining tools. With free, accessible and user-friendly tools on the market, students and their families may be put at risk by the information that learners provide to a learning system in the hope of receiving feedback that will benefit their future performance. As users become savvier in their understanding of online privacy, administrators of educational data mining tools need to be proactive in protecting the privacy of their users and be transparent about how the information will be used and with whom it will be shared. Development of EDM tools should consider protecting individual privacy while still advancing the research in this field.
* Plagiarism – Plagiarism detection is an ongoing challenge for educators and faculty, whether in the classroom or online. However, due to the complexities associated with detecting and preventing digital plagiarism in particular, educational data mining tools are not currently sophisticated enough to accurately address this issue. Thus, the development of predictive capability in plagiarism-related issues should be an area of focus in future research.
* Adoption – It is unknown how widespread the adoption of EDM is and to what extent institutions have applied or considered implementing an EDM strategy. As such, it is unclear whether there are any barriers that prevent users from adopting EDM in their educational settings.


See also

* Big data
* Data mining
* Education
* Educational technology
* Glossary of education terms
* Learning analytics
* Machine learning
* Statistics

