Data Mining In Agriculture

Data mining in agriculture is a research topic consisting of the application of data mining and data science techniques to agriculture. Recent technologies are able to provide extensive data on agriculture-related activities, which can then be analyzed to extract useful information.


Applications


Relationship between sprays and fruit defects

Fruit defects are often recorded for a multitude of reasons, sometimes for insurance purposes when exporting fruit overseas. Recording may be done manually or through computer vision, which detects surface defects when grading fruit. Spray diaries are a legal requirement in many countries and at the very least record the date of the spray and the product name. It is known that spraying can affect different fruit defects in different fruit. Fungicidal sprays are often used to prevent rots from being expressed on fruit. It is also known that some sprays can cause russeting on apples. Currently much of this knowledge is anecdotal; however, some efforts have been made to apply data mining in horticulture.


Prediction of problematic wine fermentations

The fermentation process of wine impacts the productivity of wine-related industries as well as the quality of the wine. Data science techniques, such as the k-means algorithm and classification techniques based on the concept of biclustering, have been used to study the process of fermentation in order to predict problematic wine fermentations. These methods differ from techniques where a classification of different kinds of wine is performed; see Classification of wine for more details.
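
As an illustration of the clustering step only, the following minimal sketch groups fermentations by their sugar readings with a plain k-means (Lloyd's) loop. The two-feature profiles, the variable names and the choice of seeding the centers with the first k points are all assumptions made for this example; they are not taken from the studies mentioned above, which may use different measurements and initialization.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm. The first k points seed the centers,
    an assumption made here so the example is deterministic."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical data: (sugar in g/L on day 3, sugar in g/L on day 6).
profiles = [
    (120.0, 40.0),   # sugar drops quickly
    (180.0, 150.0),  # sugar barely drops
    (115.0, 35.0),
    (175.0, 160.0),
    (125.0, 45.0),
]
labels, centers = kmeans(profiles, k=2)
print(labels)
print(centers)
```

In this toy data set the fermentations whose sugar level barely drops end up in their own cluster, which is the kind of group an analyst might then inspect as potentially problematic.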


Predicting metabolizable energy of poultry feed using group method of data handling-type neural network

A group method of data handling-type neural network (GMDH-type network) combined with an evolutionary genetic algorithm was used to predict the metabolizable energy of feather meal and poultry offal meal based on their protein, fat, and ash content. Published data samples were collected from the literature and used to train the GMDH-type network model. This combination of a GMDH-type network with a genetic algorithm can be used to predict the metabolizable energy of poultry feed samples based on their chemical content. It is also reported that the GMDH-type network may be used to accurately estimate poultry performance from dietary nutrients such as dietary metabolizable energy, protein and amino acids.


Detection of diseases from sounds issued by animals

The detection of diseases on farms can positively impact the productivity of the farm by reducing the spread of contamination to other animals. Moreover, early detection allows the farmer to treat and isolate an animal as soon as the disease appears. Sounds issued by pigs, such as coughs, can be analyzed for the detection of diseases. A computational system is under development that monitors pig sounds through microphones installed on the farm and discriminates among the different sounds that can be detected.


Growth of sheep from genes polymorphism using artificial intelligence

The polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) method was used to determine the growth hormone (GH), leptin, calpain, and calpastatin polymorphism in Iranian Baluchi male sheep. An artificial neural network (ANN) model was developed to describe average daily gain (ADG) in lambs from input parameters of GH, leptin, calpain, and calpastatin polymorphism, birth weight, and birth type. The results revealed that the ANN model is an appropriate tool for recognizing patterns in the data and predicting lamb growth in terms of ADG given specific gene polymorphisms, birth weight, and birth type. The combination of the PCR-SSCP approach and ANN-based model analyses may be used in molecular marker-assisted selection and breeding programs to design schemes that enhance the efficacy of sheep production.


Sorting apples by watercores

Before going to market, apples are checked and the ones showing defects are removed. However, there are also invisible defects that can spoil the flavor and look of an apple. One example is watercore, an internal disorder that can affect the longevity of the fruit. Apples with slight or mild watercore are sweeter, but apples with a moderate to severe degree of watercore cannot be stored for any length of time. Moreover, a few fruits with severe watercore could spoil a whole batch of apples. For this reason, a computational system is under study that takes X-ray photographs of the fruit while they run on conveyor belts, analyzes the pictures with data mining techniques, and estimates the probability that the fruit contains watercore.


Optimizing pesticide use by data mining

Recent studies by agriculture researchers in Pakistan showed that attempts at cotton crop yield maximization through pro-pesticide state policies have led to dangerously high pesticide use. These studies have reported a negative correlation between pesticide use and crop yield in Pakistan. Hence excessive use (or abuse) of pesticides is harming farmers, with adverse financial, environmental and social impacts. By data mining the cotton pest scouting data along with the meteorological recordings, it was shown how pesticide use can be optimized (reduced). Clustering of the data revealed interesting patterns of farmer practices along with pesticide use dynamics, and hence helped identify the reasons for this pesticide abuse.
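
To make the notion of a negative correlation concrete, the sketch below computes the Pearson correlation coefficient between spray counts and yield. The numbers are invented for illustration and are not the Pakistani survey data; the function and variable names are likewise hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical data: sprays per season and yield per field (arbitrary units).
sprays_per_season = [2, 4, 6, 8, 10]
yield_per_field = [30, 28, 27, 24, 21]

r = pearson(sprays_per_season, yield_per_field)
print(round(r, 3))
```

A coefficient close to -1 means that fields with more sprays tend to have lower yield in the sample; on its own it says nothing about cause, which is why the studies combine it with clustering and meteorological data.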


Explaining pesticide abuse by data mining

To monitor cotton growth, different government departments and agencies in Pakistan have been recording pest scouting, agriculture and meteorological data for decades. Coarse estimates put the recorded cotton pest scouting data alone at around 1.5 million records, and growing. The primary agro-met data recorded has never been digitized, integrated or standardized to give a complete picture, and hence cannot support decision making; this calls for an Agriculture Data Warehouse. After a novel Pilot Agriculture Extension Data Warehouse was created and analyzed through querying and data mining, some interesting discoveries were made, such as pesticides sprayed at the wrong time, wrong pesticides used for the right reasons, and a temporal relationship between pesticide usage and day of the week.
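
A relationship between pesticide usage and day of the week can be looked for with a simple tally once spray dates are available in digital form. The sketch below assumes a hypothetical list of spray dates; it does not reproduce the warehouse schema or the queries used in the pilot project.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def sprays_per_weekday(spray_dates):
    """Count spray events per weekday, keeping days with zero events."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter(WEEKDAYS[d.weekday()] for d in spray_dates)
    return {day: counts.get(day, 0) for day in WEEKDAYS}

# Hypothetical spray diary entries.
spray_dates = [
    date(2024, 1, 1),
    date(2024, 1, 3),
    date(2024, 1, 8),
    date(2024, 1, 12),
    date(2024, 1, 15),
]
tally = sprays_per_weekday(spray_dates)
print(tally)
```

A strongly uneven tally would suggest that spraying follows the calendar rather than pest pressure, which is the kind of pattern the text describes.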


Analyzing chicken performance data by neural network models

A platform of artificial neural network-based models with sensitivity analysis and optimization algorithms was used successfully to integrate published data on the responses of broiler chickens to threonine. Analyses of the artificial neural network models for weight gain and feed efficiency from a compiled data set suggested that the dietary protein concentration was more important than the threonine concentration. The results revealed that a diet containing 18.69% protein and 0.73% threonine may lead to optimal weight gain, whereas optimal feed efficiency may be achieved with a diet containing 18.71% protein and 0.75% threonine.


Literature

There are a few precision agriculture journals, such as Springer's Precision Agriculture or Elsevier's Computers and Electronics in Agriculture, but those are not exclusively devoted to data mining in agriculture.

