CumFreq

In statistics and data analysis, the application software CumFreq is a tool for cumulative frequency analysis of a single variable and for probability distribution fitting. Originally the method was developed for the analysis of hydrological measurements of spatially varying magnitudes (e.g. hydraulic conductivity of the soil) and of magnitudes varying in time (e.g. rainfall, river discharge) to find their return periods. However, it can be used for many other types of phenomena, including those that contain negative values.


Software features

CumFreq uses the plotting position approach to estimate the ''cumulative frequency'' of each of the observed magnitudes in a data series of the variable (''Frequency and Regression Analysis'', Chapter 6 in: H.P. Ritzema (ed., 1994), ''Drainage Principles and Applications'', Publ. 16, pp. 175–224, International Institute for Land Reclamation and Improvement (ILRI), Wageningen, The Netherlands; available as a free PDF download from the ILRI website).

The computer program allows determination of the best fitting probability distribution. Alternatively, it provides the user with the option to select the probability distribution to be fitted. The following probability distributions are included:
normal, lognormal, logistic, loglogistic, exponential, Cauchy, Fréchet, Gumbel, Pareto, Weibull, generalized extreme value distribution, Laplace distribution, Burr distribution (Dagum mirrored), Dagum distribution (Burr mirrored), Gompertz distribution, Student distribution, and others.

Another characteristic of CumFreq is that it provides the option to use two different probability distributions, one for the lower data range and one for the higher. The ranges are separated by a break-point. Such composite (discontinuous) probability distributions can be useful when the data of the phenomenon studied were obtained under different conditions.

During the input phase, the user can select the number of intervals needed to determine the histogram. The user may also define a threshold to obtain a truncated distribution.

The output section provides a calculator to facilitate interpolation and extrapolation. It also gives the option to see the Q–Q plot in terms of calculated and observed cumulative frequencies.

ILRI provides examples of application to magnitudes like crop yield, watertable depth, soil salinity, hydraulic conductivity, rainfall, and river discharge (''Drainage research in farmers' fields: analysis of data'', 2002. Contribution to the project "Liquid Gold" of the International Institute for Land Reclamation and Improvement (ILRI), Wageningen, The Netherlands).
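The plotting position approach named at the start of this section can be illustrated with a short Python sketch. The article does not state which plotting position formula CumFreq applies, so the formula r / (n + 1), with r the rank of the value in ascending order and n the number of observations, is an assumption chosen for illustration. The function name and the sample data are hypothetical.

```python
def plotting_positions(values):
    """Return (sorted values, cumulative frequencies) using r / (n + 1).

    Assumption: r / (n + 1) is only one of several plotting position
    formulas; the one used by CumFreq is not given in the article.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Rank r runs from 1 (smallest) to n (largest); dividing by n + 1
    # keeps every estimate strictly between 0 and 1.
    freqs = [rank / (n + 1) for rank in range(1, n + 1)]
    return ordered, freqs


rainfall = [42.0, 55.5, 38.2, 61.0, 47.3]  # hypothetical annual maxima (mm)
ordered, freqs = plotting_positions(rainfall)
for x, f in zip(ordered, freqs):
    print(f"{x:6.1f}  F = {f:.3f}")
```

Each printed pair is an observed value with its estimated frequency of non-exceedance, which is the input a distribution fit would then try to reproduce.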


Generalizing distributions

The program can produce generalizations of the normal, logistic, and other distributions by transforming the data using an exponent that is optimized to obtain the best fit. This feature is not common in other distribution-fitting software, which normally includes only a logarithmic transformation of the data, yielding distributions like the lognormal and loglogistic. Generalization of symmetrical distributions (like the normal and the logistic) makes them applicable to data obeying a distribution that is skewed to the right (using an exponent <1) as well as to data obeying a distribution that is skewed to the left (using an exponent >1). This enhances the versatility of symmetrical distributions.
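A minimal Python sketch of this idea follows. It raises the data to a trial exponent, fits a normal distribution to the transformed values, and keeps the exponent that gives the smallest error. The error measure (sum of squared differences against r / (n + 1) plotting positions), the grid search, and all names and data are assumptions made for illustration; the article does not describe how CumFreq optimizes the exponent.

```python
import math
import statistics


def normal_cdf(y, mu, sigma):
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def fit_error(data, exponent):
    """Sum of squared differences between plotting-position frequencies
    and the normal CDF fitted to the transformed data x**exponent.
    Requires positive data and a positive exponent."""
    y = sorted(x ** exponent for x in data)
    n = len(y)
    mu, sigma = statistics.mean(y), statistics.stdev(y)
    return sum((r / (n + 1) - normal_cdf(v, mu, sigma)) ** 2
               for r, v in enumerate(y, start=1))


def best_exponent(data, candidates):
    """Pick the candidate exponent with the smallest fit error."""
    return min(candidates, key=lambda e: fit_error(data, e))


# Hypothetical right-skewed sample.
data = [1.2, 1.9, 2.3, 2.8, 3.1, 3.9, 4.8, 6.5, 9.7, 15.2]
candidates = [k / 10 for k in range(1, 21)]  # exponents 0.1 .. 2.0
e = best_exponent(data, candidates)
print("best exponent:", e, "error:", fit_error(data, e))
```

An exponent of 1 corresponds to the plain normal distribution, so the selected exponent can never fit worse than the untransformed case under this error measure.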


Inverting distributions

Skew distributions can be mirrored by distribution inversion (see survival function, or complementary distribution function) to change the skewness from positive to negative and vice versa. This increases the number of applicable distributions and the chance of finding a better fit. CumFreq makes use of that opportunity.
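One way to picture mirroring is sketched below in Python for the Gumbel distribution. The mirrored function is taken as the complement (survival function) of the original CDF evaluated at the point reflected about the location parameter. This construction is an assumption for illustration and is not confirmed by the article as the exact procedure in CumFreq; the function names are hypothetical.

```python
import math


def gumbel_cdf(x, loc, scale):
    """Gumbel CDF, which is skewed to the right."""
    return math.exp(-math.exp(-(x - loc) / scale))


def mirrored_gumbel_cdf(x, loc, scale):
    """Mirror image about x = loc: the complement (survival function)
    of the Gumbel CDF evaluated at the reflected point 2*loc - x.
    The result is a valid CDF that is skewed to the left."""
    reflected = 2.0 * loc - x
    return 1.0 - gumbel_cdf(reflected, loc, scale)


loc, scale = 10.0, 2.0  # hypothetical parameters
for x in (4.0, 8.0, 10.0, 12.0, 16.0):
    print(x, gumbel_cdf(x, loc, scale), mirrored_gumbel_cdf(x, loc, scale))
```

The median of the original lies above the location parameter, while the median of the mirrored version lies the same distance below it.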


Shifting distributions

When the data contain negative values that a probability distribution does not support, the program shifts the distribution to the positive side and, after fitting, shifts it back.
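The shift can be sketched in Python with a lognormal distribution, which is defined only for positive values. The size of the shift (here chosen so that the smallest value becomes `margin`), the fitting by mean and standard deviation of the logarithms, and all names and data are assumptions for illustration; the article does not say how CumFreq chooses the shift.

```python
import math
import statistics


def fit_shifted_lognormal(data, margin=1.0):
    """Fit a lognormal to data that may contain negative values.

    The data are shifted so that the minimum equals `margin` (> 0), the
    logarithms are fitted with a normal distribution, and the shift is
    returned so that results can be moved back to the original scale.
    """
    shift = margin - min(data)
    logs = [math.log(x + shift) for x in data]
    return shift, statistics.mean(logs), statistics.stdev(logs)


def shifted_lognormal_cdf(x, shift, mu, sigma):
    """CDF on the original scale; values at or below -shift get 0."""
    if x + shift <= 0:
        return 0.0
    z = (math.log(x + shift) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


depths = [-3.2, -1.5, -0.4, 0.8, 2.1, 4.7, 9.3]  # hypothetical data
shift, mu, sigma = fit_shifted_lognormal(depths)
print("shift:", shift)
print("F(0) =", shifted_lognormal_cdf(0.0, shift, mu, sigma))
```

Because the CDF takes its argument on the original scale and adds the shift internally, the caller never has to work with the shifted values.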


Confidence belts

The software employs the binomial distribution to determine the confidence belt of the corresponding cumulative distribution function. The prediction of the return period, which is of interest in time series, is also accompanied by a confidence belt. The construction of confidence belts is not found in most other software.

The figure to the right shows the variation that may occur when obtaining samples of a variate that follows a certain probability distribution. The data were provided by Benson (Benson, M.A. 1960. Characteristics of frequency curves based on a theoretical 1000 year record. In: T. Dalrymple (ed.), Flood frequency analysis. U.S. Geological Survey Water Supply Paper 1543-A, pp. 51–71).

The confidence belt around an experimental cumulative frequency or return period curve gives an impression of the region in which the true distribution may be found. It also clarifies that the experimentally found best fitting probability distribution may deviate from the true distribution.


Goodness of fit

CumFreq produces a list of distributions ranked by goodness of fit.
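How such a ranking might be built is sketched below in Python for two candidate distributions. The goodness-of-fit measure (mean absolute difference between r / (n + 1) plotting positions and the fitted CDF), the fitting by moments, and all names and data are assumptions for illustration; the article does not state which measure CumFreq uses.

```python
import math
import statistics


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_normal(data):
    mu, sigma = statistics.mean(data), statistics.stdev(data)
    return lambda x: normal_cdf(x, mu, sigma)


def fit_lognormal(data):
    # Requires strictly positive data.
    logs = [math.log(x) for x in data]
    mu, sigma = statistics.mean(logs), statistics.stdev(logs)
    return lambda x: normal_cdf(math.log(x), mu, sigma)


def rank_fits(data, fitters):
    """Return (name, mean absolute error) pairs, best fit first."""
    ordered = sorted(data)
    n = len(ordered)
    scores = []
    for name, fitter in fitters.items():
        cdf = fitter(ordered)
        err = sum(abs(r / (n + 1) - cdf(x))
                  for r, x in enumerate(ordered, start=1)) / n
        scores.append((name, err))
    return sorted(scores, key=lambda item: item[1])


data = [1.2, 1.9, 2.3, 2.8, 3.1, 3.9, 4.8, 6.5, 9.7, 15.2]  # hypothetical
ranking = rank_fits(data, {"normal": fit_normal, "lognormal": fit_lognormal})
for name, err in ranking:
    print(f"{name:10s} {err:.4f}")
```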


Histogram and density function

From the cumulative distribution function (CDF) one can derive a histogram and the probability density function (PDF).
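Both derivations can be sketched in Python. The expected count in a histogram interval follows from the difference of the CDF at the interval bounds, and the density is approximated by a numerical derivative of the CDF. The normal distribution with the parameters shown, the interval bounds, and the function names are hypothetical and serve only as an example.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def expected_histogram(cdf, edges, n):
    """Expected number of observations per interval: n * (F(b) - F(a))."""
    return [n * (cdf(b) - cdf(a)) for a, b in zip(edges, edges[1:])]


def density(cdf, x, h=1e-5):
    """Numerical PDF as the central-difference derivative of the CDF."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)


def cdf(x):
    # Hypothetical fitted distribution: normal with mean 50, sd 10.
    return normal_cdf(x, 50.0, 10.0)


edges = [20, 30, 40, 50, 60, 70, 80]
counts = expected_histogram(cdf, edges, n=100)
print([round(c, 2) for c in counts])
print(density(cdf, 50.0))
```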


Calculator

The software offers the option to use a probability distribution calculator. The cumulative frequency and the return period are given as a function of the data value entered as input. In addition, the confidence intervals are shown. Conversely, the value is presented upon giving the cumulative frequency or the return period.
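The two directions of such a calculator can be sketched in Python for a Gumbel distribution. The distribution, its parameter values, the relation T = 1/(1 - F) between return period and cumulative frequency, and the function names are assumptions for illustration and do not reproduce the CumFreq calculator itself. Confidence intervals are left out of this sketch.

```python
import math

LOC, SCALE = 40.0, 8.0  # hypothetical fitted Gumbel parameters


def cumulative_frequency(x):
    """Frequency of non-exceedance of the value x."""
    return math.exp(-math.exp(-(x - LOC) / SCALE))


def return_period(x):
    """Average number of intervals between exceedances of x."""
    return 1.0 / (1.0 - cumulative_frequency(x))


def value_for_frequency(f):
    """Inverse of the CDF for 0 < f < 1."""
    return LOC - SCALE * math.log(-math.log(f))


def value_for_return_period(t):
    """Value exceeded on average once every t intervals (t > 1)."""
    return value_for_frequency(1.0 - 1.0 / t)


x100 = value_for_return_period(100.0)
print("value with a return period of 100:", x100)
print("check:", return_period(x100))
```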


See also

* Distribution fitting


