# Data set

**dataset, datasets, data, data sets, data series, classic data sets, set of data**

A data set (or dataset) is a collection of data.


### Big data

**big data analytics, big-data, big data analysis**

Data sets that are so large that traditional data processing applications are inadequate to deal with them are known as big data.

Big data refers to data sets that are too large or complex for traditional data-processing application software to adequately deal with.

### Standard deviation

**standard deviations, sample standard deviation, sigma**

These include the number and types of the attributes or variables, and various statistical measures applicable to them, such as standard deviation and kurtosis.

The standard deviation of a random variable, statistical population, data set, or probability distribution is the square root of its variance.
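The relationship between variance and standard deviation can be checked numerically. The following is a minimal sketch using Python's standard library; the `heights_cm` values are hypothetical, and the population (rather than sample) formulas are an assumption made for the illustration.

```python
import math
import statistics

# Hypothetical data set: five heights in centimeters (illustrative values only).
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]

# Population variance: the mean of the squared deviations from the mean.
variance = statistics.pvariance(heights_cm)

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

# The library's own population standard deviation should agree.
print(variance, std_dev, statistics.pstdev(heights_cm))
```

For a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` apply the same relationship with an `n - 1` denominator.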

### Open data

**Open Government Data, data, open**

In the open data discipline, the data set is the unit used to measure the information released in a public open data repository.

However, the lack of a license makes it difficult to determine the status of a data set and may restrict the use of data offered in an "Open" spirit.

### Data

**statistical data, scientific data, datum**

A data set (or dataset) is a collection of data.


### Iris flower data set

**Iris flower data set, Fisher's iris data, Iris Dataset**

Iris flower data set – Multivariate data set introduced by Ronald Fisher (1936).

The Iris flower data set, or Fisher's Iris data set, is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems" as an example of linear discriminant analysis.

### Anscombe's quartet

Anscombe's quartet – Small data set illustrating the importance of graphing the data to avoid statistical fallacies.

Anscombe's quartet comprises four datasets that have nearly identical simple descriptive statistics, yet appear very different when graphed.
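The claim about nearly identical descriptive statistics can be reproduced with a short sketch. The values below are the quartet as commonly reproduced from Anscombe's 1973 paper and are assumed to be transcribed accurately; the helper `pearson_r` and the `summary` dictionary are names introduced here for illustration.

```python
import math
import statistics

# Anscombe's quartet as commonly tabulated (assumed accurate transcription).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared x for sets I-III
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Per data set: (mean of y, sample variance of y, correlation of x and y).
summary = {
    name: (statistics.fmean(ys), statistics.variance(ys), pearson_r(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, (mean_y, var_y, r) in summary.items():
    print(f"{name}: mean(y)={mean_y:.2f} var(y)={var_y:.2f} r={r:.3f}")
```

The printed summaries are close to one another across all four sets, which is why plotting the points is needed to see how differently the sets behave.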

### Data blending

**blend, blended data**


Data blending is a process whereby big data from multiple sources are merged into a single data warehouse or data set.
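As a rough illustration of merging sources into a single data set, the sketch below combines two small in-memory sources on a shared key. The source names (`crm`, `orders`), the field names, and the key-based merge strategy are all assumptions made for the example; real blending tools handle schema matching and conflicts that are ignored here.

```python
# Two hypothetical sources keyed by a shared customer id.
crm = {101: {"name": "Ada"}, 102: {"name": "Grace"}}
orders = {101: {"total": 250.0}, 103: {"total": 99.0}}

# Blend into one data set: one record per key, with fields from every source.
blended = {}
for source in (crm, orders):
    for key, fields in source.items():
        # Create the record on first sight, then add this source's fields.
        blended.setdefault(key, {}).update(fields)

for key in sorted(blended):
    print(key, blended[key])
```

Keys that appear in only one source are kept with whatever fields that source provides, which is one of several possible design choices.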

### Statistics

**statistical, statistical analysis, statistician**

In statistics, data sets usually come from actual observations obtained by sampling a statistical population, and each row corresponds to the observations on one element of that population.

Statistical analysis of a data set often reveals that two variables (properties) of the population under consideration tend to vary together, as if they were connected.

### Robust statistics

**robust, breakdown point, robustness**

Robust statistics – Data sets used in Robust Regression and Outlier Detection (Rousseeuw and Leroy, 1986). Provided on-line at the University of Cologne.

The data sets for that book can be found via the Classic data sets page, and the book's website contains more information on the data.

### Data collection system

**automated data collection systems, data collection and processing**


A collection (used as a noun) is the topmost container for grouping related documents, data models, and datasets.

### Data (computing)

**data, computer data, data representation**


### Table (database)

**table, tables, database table**

Most commonly a data set corresponds to the contents of a single database table, or a single statistical data matrix, where every column of the table represents a particular variable, and each row corresponds to a given member of the data set in question.
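The row-and-column layout described above can be sketched with plain Python structures. The column names, the three records, and the `column` helper are hypothetical and chosen only to show one numeric and one nominal variable side by side.

```python
# Each column is a variable; each row is one member of the data set.
columns = ("name", "height_cm", "eye_color")
rows = [
    ("Ada", 165.0, "brown"),
    ("Grace", 180.5, "blue"),
    ("Alan", 172.0, "green"),
]


def column(name):
    """Return every row's value for the named variable."""
    idx = columns.index(name)
    return [row[idx] for row in rows]


print(column("height_cm"))  # numeric variable
print(column("eye_color"))  # nominal variable
```

Reading across a tuple in `rows` gives all observations for one member, while `column` reads down a single variable.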

### Design matrix

**data matrix, design matrices, data matrices**

Most commonly a data set corresponds to the contents of a single database table, or a single statistical data matrix, where every column of the table represents a particular variable, and each row corresponds to a given member of the data set in question.

### Column (database)

**columns, column, Attribute**

Most commonly a data set corresponds to the contents of a single database table, or a single statistical data matrix, where every column of the table represents a particular variable, and each row corresponds to a given member of the data set in question.

### Row (database)

**rows, row, record**

### Space probe

**probe, space probes, probes**

Less commonly used names for this kind of data set are data corpus and data stock. An example of this type is the data sets collected by space agencies performing experiments with instruments aboard space probes.

### Data processing

**data-processing, processing, processing of data**

Data sets that are so large that traditional data processing applications are inadequate to deal with them are known as big data.

### Statistical parameter

**parameters, parameter, parametrization**

These include the number and types of the attributes or variables, and various statistical measures applicable to them, such as standard deviation and kurtosis.

### Kurtosis

**excess kurtosis, leptokurtic, platykurtic**

These include the number and types of the attributes or variables, and various statistical measures applicable to them, such as standard deviation and kurtosis.

### Real number

**real, reals, real-valued**

The values may be numbers, such as real numbers or integers, for example representing a person's height in centimeters, but may also be nominal data (i.e., not consisting of numerical values), for example representing a person's ethnicity.

### Integer

**integers, integral, whole number**

The values may be numbers, such as real numbers or integers, for example representing a person's height in centimeters, but may also be nominal data (i.e., not consisting of numerical values), for example representing a person's ethnicity.

### Number

**number system, numerical, numeric**

The values may be numbers, such as real numbers or integers, for example representing a person's height in centimeters, but may also be nominal data (i.e., not consisting of numerical values), for example representing a person's ethnicity.

### Level of measurement

**quantitative, scale, interval scale**

### Missing data

**missing values, incomplete data, missing at random**

However, there may also be missing values, which must be indicated in some way.
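One common way to indicate a missing value is a dedicated marker that cannot be confused with real data. The sketch below assumes `None` as that marker and uses hypothetical height values; other conventions, such as NaN or a sentinel code, are equally possible.

```python
import statistics

# Hypothetical column with two missing observations marked as None.
heights_cm = [165.0, None, 172.0, 180.5, None]

# Separate the observed values from the missing ones before computing statistics.
observed = [h for h in heights_cm if h is not None]
n_missing = len(heights_cm) - len(observed)

mean_height = statistics.fmean(observed)
print(f"{n_missing} missing, mean of observed values = {mean_height}")
```

Dropping missing entries before averaging is only one option; whether it is appropriate depends on why the values are missing.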

### Sampling (statistics)

**sampling, random sample, sample**

In statistics, data sets usually come from actual observations obtained by sampling a statistical population, and each row corresponds to the observations on one element of that population.