Operations research

operational research, operation research, operational analysis
Department of Labor Bureau of Labor Statistics.

Ranking

rank, ranked, rankings
Some kinds of statistical tests employ calculations based on ranks. The distribution of values in decreasing order of rank is often of interest when values vary widely in scale; this is the rank-size distribution (or rank-frequency distribution), for example for city sizes or word frequencies. These often follow a power law. Some ranks can have non-integer values for tied data values. For example, under fractional ranking, where tied values share the mean of the positions they occupy, an even number of copies of the same data value gives the tied data a rank ending in ½.
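
The tie behaviour described above can be illustrated with a minimal sketch of fractional ranking. The function name `fractional_ranks` and the choice of ranking in ascending order (rank 1 for the smallest value) are assumptions made for this illustration, not something the text specifies.

```python
def fractional_ranks(values):
    """Return 1-based fractional ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions start+1 .. end+1 (1-based) are averaged over the tie group.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


print(fractional_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

The two copies of 20 occupy positions 2 and 3, so both receive rank 2.5; a tie group of odd size would instead receive an integer rank.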

Cryptography

cryptographic, cryptographer, cryptology
It is theoretically possible to break such a system, but it is infeasible to do so by any known practical means. These schemes are therefore termed computationally secure; theoretical advances, e.g., improvements in integer factorization algorithms, and faster computing technology require these solutions to be continually adapted. There exist information-theoretically secure schemes that cannot be broken even with unlimited computing power—an example is the one-time pad—but these schemes are more difficult to use in practice than the best theoretically breakable but computationally secure mechanisms.
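
As a rough illustration of the one-time pad mentioned above, the sketch below XORs a message with a random key of the same length. The function names `otp_encrypt` and `otp_decrypt` are hypothetical, and the sketch assumes the key is used once and kept secret. Exchanging and storing a key as long as the message is part of what makes the scheme hard to use in practice.

```python
import secrets


def otp_encrypt(plaintext: bytes):
    """Encrypt with a fresh random key as long as the message; returns (key, ciphertext)."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key recovers the plaintext."""
    if len(key) != len(ciphertext):
        raise ValueError("key and ciphertext must have equal length")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


key, ct = otp_encrypt(b"attack at dawn")
print(otp_decrypt(key, ct))  # b'attack at dawn'
```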

Matrix (mathematics)

matrix, matrices, matrix theory
Online matrix calculators. A freeware package for matrix algebra and statistics. Operations with matrices in R (determinant, trace, inverse, adjoint, transpose). Matrix operations widget in Wolfram|Alpha.

Outlier

outliers, statistical outliers, conservative estimate
In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set. An outlier can cause serious problems in statistical analyses. Outliers can occur by chance in any distribution, but they often indicate either measurement error or that the population has a heavy-tailed distribution.

Moving average

exponential moving average, simple moving average, Weighted moving average
From a statistical point of view, the moving average, when used to estimate the underlying trend in a time series, is susceptible to rare events such as rapid shocks or other anomalies. A more robust estimate of the trend is the simple moving median over n time points, that is, the median of the n most recent values, found by, for example, sorting those values and taking the one in the middle. For larger values of n, the median can be efficiently computed by updating an indexable skiplist. Statistically, the moving average is optimal for recovering the underlying trend of the time series when the fluctuations about the trend are normally distributed.
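
A small sketch can make the contrast between the two estimators concrete. The helper `moving_window` and the sample series are illustrative assumptions; each window is simply re-sorted by `statistics.median`, rather than using the indexable skiplist suited to large n.

```python
from statistics import mean, median


def moving_window(values, n, stat):
    """Apply `stat` to each window of the n most recent values."""
    return [stat(values[i - n + 1:i + 1]) for i in range(n - 1, len(values))]


series = [10, 11, 10, 100, 11, 10, 11]  # one shock at index 3
print(moving_window(series, 3, mean))    # the shock inflates three windows
print(moving_window(series, 3, median))  # [10, 11, 11, 11, 11]
```

The single spike of 100 pulls every moving-average window that contains it above 40, while the moving median stays near the typical level.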

Forecasting

forecast, forecasts, projection
Growth curve (statistics). Recurrent neural network. Regression analysis includes a large group of methods for predicting future values of a variable using information about other variables. These methods include both parametric (linear or non-linear) and non-parametric techniques. Autoregressive moving average with exogenous inputs (ARMAX). Composite forecasts. Cooke's method. Delphi method. Forecast by analogy. Scenario building. Statistical surveys. Technology forecasting. Artificial neural networks. Group method of data handling. Support vector machines. Data mining. Machine learning. Pattern recognition. Simulation. Prediction market.

Anecdotal evidence

anecdotal, anecdotes, Misleading vividness
Similarly, psychologists have found that due to cognitive bias people are more likely to remember notable or unusual examples rather than typical examples. Thus, even when accurate, anecdotal evidence is not necessarily representative of a typical experience. Accurate determination of whether an anecdote is typical requires statistical evidence. Misuse of anecdotal evidence is an informal fallacy and is sometimes referred to as the "person who" fallacy ("I know a person who..."; "I know of a case where..." etc.) which places undue weight on experiences of close peers which may not be typical.

Array data structure

array, arrays, vector
The term is also used, especially in the description of algorithms, to mean associative array or "abstract array", a theoretical computer science model (an abstract data type or ADT) intended to capture the essential properties of arrays. The first digital computers used machine-language programming to set up and access array structures for data tables, vector and matrix computations, and for many other purposes. John von Neumann wrote the first array-sorting program (merge sort) in 1945, during the building of the first stored-program computer. Array indexing was originally done by self-modifying code, and later using index registers and indirect addressing.

Randomized controlled trial

randomized controlled trials, randomized clinical trial, randomized control trial
Good blinding may reduce or eliminate some sources of experimental bias. The randomness in the assignment of subjects to groups reduces selection bias and allocation bias, balancing both known and unknown prognostic factors, in the assignment of treatments. Blinding reduces other forms of experimenter and subject biases. A well-blinded RCT is often considered the gold standard for clinical trials. Blinded RCTs are commonly used to test the efficacy of medical interventions and may additionally provide information about adverse effects, such as drug reactions.
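
The random assignment step can be sketched as a simple 1:1 allocation. The function name `randomize`, the arm labels, and the optional `seed` parameter (included only so the illustration is reproducible) are assumptions. Real trials often use more elaborate schemes, such as blocked or stratified randomization.

```python
import random


def randomize(subjects, seed=None):
    """Shuffle subjects and split them into two arms of (near-)equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


arms = randomize(range(1, 11), seed=42)
print(arms)
```

Because allocation depends only on the shuffle, neither the investigator nor the subject influences who ends up in which arm.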

Neural coding

sparse coding, neural code, rate coding
Other models are based on matching pursuit, a sparse approximation algorithm which finds the "best matching" projections of multidimensional data, and dictionary learning, a representation learning method which aims to find a sparse matrix representation of the input data in the form of a linear combination of basic elements as well as those basic elements themselves. Sparse coding may be a general strategy of neural systems to augment memory capacity. To adapt to their environments, animals must learn which stimuli are associated with rewards or punishments and distinguish these reinforced stimuli from similar but irrelevant ones.
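
The matching pursuit idea mentioned above can be sketched in a few lines. This is a minimal illustration rather than a model of any neural system: the names `matching_pursuit` and `atoms` are assumptions, the dictionary atoms are assumed to have unit norm, and the toy dictionary is just the standard basis.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse approximation; `atoms` are assumed to have unit norm."""
    residual = list(signal)
    coeffs = {}
    for _ in range(n_iter):
        # Pick the atom whose projection onto the residual is largest in magnitude.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        if scores[best] == 0:
            break
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        # Remove the selected component from the residual.
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit([3.0, 0.0, -2.0], atoms, n_iter=2)
print(coeffs, residual)
```

With this toy dictionary, two iterations pick atoms 0 and 2 and leave a zero residual, so the signal is represented by two nonzero coefficients.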

Biophysics

biophysicist, biophysical, biological physics
Link archive of learning resources for students: biophysika.de (60% English, 40% German). Journal of Medicine, Physiology and Biophysics (IISTE), USA. Chief editor of the journal is Ignat Ignatov. Chief editor of all IISTE journals is Alexander Decker.

Social network

network, networking, networks
The social network is a theoretical construct useful in the social sciences to study relationships between individuals, groups, organizations, or even entire societies (social units, see differentiation). The term is used to describe a social structure determined by such interactions. The ties through which any given social unit connects represent the convergence of the various social contacts of that unit. This theoretical approach is, necessarily, relational.

Loss function

objective function, cost function, risk function
Statistical risk.

Polynomial regression

cubic regression, Polynomial fitting, regression
In statistics, polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth degree polynomial in x. Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y | x). Although polynomial regression fits a nonlinear model to the data, as a statistical estimation problem it is linear, in the sense that the regression function E(y | x) is linear in the unknown parameters that are estimated from the data. For this reason, polynomial regression is considered to be a special case of multiple linear regression.
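
To show why the problem is linear in the unknown parameters, the sketch below fits a polynomial by ordinary least squares on the columns 1, x, x², and so on. The function name `polyfit_ols` is hypothetical. Solving the normal equations by Gaussian elimination is chosen here only to keep the example dependency-free; it is less numerically stable than QR-based solvers and assumes the system is non-singular.

```python
def polyfit_ols(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X^T X) b = X^T y."""
    m = degree + 1
    # Design matrix columns are 1, x, x^2, ...: the model is linear in the coefficients.
    X = [[x ** j for j in range(m)] for x in xs]
    A = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    b = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * m
    for i in reversed(range(m)):
        s = sum(A[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (b[i] - s) / A[i][i]
    return coef  # coef[j] multiplies x**j


xs = [-2, -1, 0, 1, 2]
ys = [1 + 2 * x + 3 * x * x for x in xs]  # exact quadratic, no noise
print(polyfit_ols(xs, ys, 2))  # approximately [1.0, 2.0, 3.0]
```

The fit is an ordinary multiple linear regression in which the powers of x play the role of separate regressors.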

Information security

security, INFOSEC, information
Management, e.g., defining strategies, setting objectives and goals; planning and directing the work; allocating funds, people and other resources; prioritization relative to other activities; team building, leadership, control, motivation and coordination with other business functions and activities (e.g., IT, facilities, human resources, risk management, information risk and security, operations); monitoring the situation, checking and updating the arrangements when things change; maturing the approach through continuous improvement, learning and appropriate investment.

Learning to rank

Machine-learned ranking, Learn to Rank, machine-learned
Recently they have also sponsored a machine-learned ranking competition "Internet Mathematics 2009" based on their own search engine's production data. Yahoo announced a similar competition in 2010. As of 2008, Google's Peter Norvig denied that their search engine exclusively relies on machine-learned ranking. Cuil's CEO, Tom Costello, suggests that they prefer hand-built models because these can outperform machine-learned models when measured against metrics like click-through rate or time on landing page, since machine-learned models "learn what people say they like, not what people actually like".

Epidemiology

epidemiologist, epidemiological, epidemiologists
The statistic generated to measure association is the odds ratio (OR), which is the ratio of the odds of exposure in the cases (A/C) to the odds of exposure in the controls (B/D), i.e. OR = AD/BC. If the OR is significantly greater than 1, then the conclusion is "those with the disease are more likely to have been exposed," whereas if it is close to 1 then the exposure and disease are not likely associated. If the OR is far less than 1, then this suggests that the exposure is a protective factor in the causation of the disease. Case-control studies are usually faster and more cost-effective than cohort studies, but are sensitive to bias (such as recall bias and selection bias).
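
The odds ratio calculation can be written out as a short sketch. The cell labels follow the ratios given above (A exposed cases, C unexposed cases, B exposed controls, D unexposed controls); the function name `odds_ratio` and the counts in the example are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """OR = (a/c) / (b/d) = a*d / (b*c) for a 2x2 case-control table.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
print(odds_ratio(a=40, b=20, c=60, d=80))  # about 2.67
```

A point estimate alone does not establish that the OR is significantly different from 1; that judgment usually also requires a confidence interval or a test.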

Learning

associative learning, learn, learning process
Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data.

Parameter

parameters, parametric, argument
(Note that the sample standard deviation (S) is not an unbiased estimate of the population standard deviation; see Unbiased estimation of standard deviation.) It is possible to make statistical inferences without assuming a particular parametric family of probability distributions. In that case, one speaks of non-parametric statistics as opposed to the parametric statistics just described.
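
The bias of S noted above can be checked with a small simulation. This is a sketch under stated assumptions: samples are drawn from a standard normal distribution (population standard deviation 1), and the helper names, sample size, trial count, and seed are arbitrary choices for the illustration.

```python
import math
import random


def sample_sd(xs):
    """Sample standard deviation S with the n - 1 (Bessel) denominator."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def mean_sample_sd(n, trials, seed=0):
    """Average S over many samples of size n drawn from a standard normal (sigma = 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += sample_sd([rng.gauss(0.0, 1.0) for _ in range(n)])
    return total / trials


print(mean_sample_sd(n=5, trials=20000))  # typically a little below 1
```

Although S² with the n - 1 denominator is unbiased for the variance, taking the square root introduces a downward bias, which is most visible for small n.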

Crime

criminal, criminals, criminal offence
For example: as cultures change and the political environment shifts, societies may criminalise or decriminalise certain behaviours, which directly affects the statistical crime rates, influences the allocation of resources for the enforcement of laws, and (re-)influences general public opinion. Similarly, changes in the collection and/or calculation of data on crime may affect public perceptions of the extent of any given "crime problem". All such adjustments to crime statistics, allied with the experience of people in their everyday lives, shape attitudes on the extent to which the state should use law or social engineering to enforce or encourage any particular social norm.

Dependent and independent variables

dependent variable, independent variable, explanatory variable
In data mining tools (for multivariate statistics and machine learning), the dependent variable is assigned a role as target variable (or in some tools as label attribute), while an independent variable may be assigned a role as regular variable. Known values for the target variable are provided for the training data set and test data set, but should be predicted for other data. The target variable is used in supervised learning algorithms but not in unsupervised learning. In mathematical modeling, the dependent variable is studied to see if and how much it varies as the independent variables vary.

Fisher information

Fisher information matrix, information matrix, information
If the Fisher information matrix is positive definite for all θ, then the corresponding statistical model is said to be regular; otherwise, the statistical model is said to be singular. Examples of singular statistical models include the following: normal mixtures, binomial mixtures, multinomial mixtures, Bayesian networks, neural networks, radial basis functions, hidden Markov models, stochastic context-free grammars, reduced rank regressions, Boltzmann machines. In machine learning, if a statistical model is devised so that it extracts hidden structure from a random phenomenon, then it naturally becomes singular.

Self-selection bias

self-selection, self-selected, self selection
In statistics, self-selection bias arises in any situation in which individuals select themselves into a group, causing a biased sample with nonprobability sampling. It is commonly used to describe situations where the characteristics of the people that cause them to select themselves into the group create abnormal or undesirable conditions in the group. It is closely related to non-response bias, which describes the case where the group of people responding has different responses than the group of people not responding. Self-selection bias is a major problem in research in sociology, psychology, economics and many other social sciences.

Machine translation

translation software, automatic translation, translation
Rules post-processed by statistics: Translations are performed using a rule-based engine. Statistics are then used in an attempt to adjust/correct the output from the rules engine. Statistics guided by rules: Rules are used to pre-process data in an attempt to better guide the statistical engine. Rules are also used to post-process the statistical output to perform functions such as normalization. This approach offers considerably more power, flexibility and control when translating.