Exploratory data analysis

explorative data analysis, exploratory, data analysis, data exploratory, EDA, explorative method, exploratory analysis, Graphical Exploratory Data Analysis
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. (Wikipedia)
135 Related Articles

Data analysis

data analytics, analysis, data analyst
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.
In statistical applications, data analysis can be divided into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA).

Statistical hypothesis testing

hypothesis testing, statistical test, statistical tests
These statistical developments, all championed by Tukey, were designed to complement the analytic theory of testing statistical hypotheses, particularly the Laplacian tradition's emphasis on exponential families.
Confirmatory data analysis can be contrasted with exploratory data analysis, which may not have pre-specified hypotheses.

Statistical model

model, probabilistic model, statistical modeling
A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.
Models can be compared to each other by exploratory data analysis or confirmatory data analysis.

Targeted projection pursuit

Targeted projection pursuit
Targeted projection pursuit is a type of statistical technique used for exploratory data analysis, information visualization, and feature selection.

Statistical graphics

graphical technique, graphical, graphical techniques
Typical graphical techniques used in EDA are:
Exploratory data analysis (EDA) relies heavily on such techniques.

Principal component analysis

principal components analysis, PCA, principal components
Principal component analysis (PCA)
PCA is mostly used as a tool in exploratory data analysis and for making predictive models.
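
As a rough illustration of how PCA can serve exploration, the sketch below centers the data and takes a singular value decomposition. It assumes NumPy is available; the function name `pca_sketch` and its return layout are hypothetical choices for this example, not a reference implementation.

```python
import numpy as np

def pca_sketch(X, n_components=2):
    """Project rows of X onto the leading principal components (SVD-based sketch)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each column
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # principal directions (rows)
    scores = Xc @ components.T                   # coordinates in the new basis
    explained = (s ** 2) / (s ** 2).sum()        # share of variance per component
    return scores, components, explained[:n_components]

# Points lying exactly on the line y = 2x: one direction carries all the variance.
X = [[t, 2.0 * t] for t in range(5)]
scores, components, explained = pca_sketch(X)
print(explained)
```

Plotting the first two columns of `scores` is a common first look at multivariate data before any formal model is chosen.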

Median polish

Median polish
The median polish is a simple and robust exploratory data analysis procedure proposed by the statistician John Tukey.
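
A minimal sketch of the procedure, assuming NumPy: row and column medians are swept out of a two-way table in turn until little changes, leaving an overall value, row effects, column effects, and residuals. The function name `median_polish`, the iteration cap, and the tolerance are assumptions made for this illustration.

```python
import numpy as np

def median_polish(table, max_iter=10, tol=1e-8):
    """Decompose a two-way table as overall + row effect + column effect + residual."""
    resid = np.asarray(table, dtype=float).copy()
    overall = 0.0
    row_eff = np.zeros(resid.shape[0])
    col_eff = np.zeros(resid.shape[1])
    for _ in range(max_iter):
        # Sweep row medians out of the residuals and into the row effects.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row_eff += row_med
        # Move the median of the column effects into the overall term.
        shift = np.median(col_eff)
        col_eff -= shift
        overall += shift
        # Sweep column medians out of the residuals and into the column effects.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        col_eff += col_med
        # Move the median of the row effects into the overall term.
        shift = np.median(row_eff)
        row_eff -= shift
        overall += shift
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return overall, row_eff, col_eff, resid

# A purely additive table with one gross outlier in the top-left cell.
table = np.add.outer([0.0, 1.0, 2.0], [10.0, 20.0, 30.0])
table[0, 0] += 50.0
overall, row_eff, col_eff, resid = median_polish(table)
print(resid)
```

Because medians are used instead of means, a single wild cell tends to end up in the residuals instead of distorting the row and column effects.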

John Tukey

Tukey; Tukey, John; John W. Tukey
Exploratory data analysis was promoted by John Tukey to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments.
He also contributed to statistical practice and articulated the important distinction between exploratory data analysis and confirmatory data analysis, believing that much statistical methodology placed too great an emphasis on the latter.

Ordination (statistics)

ordination, gradient analysis, ordination techniques
Ordination
Ordination or gradient analysis, in multivariate analysis, is a method complementary to data clustering, and used mainly in exploratory data analysis (rather than in hypothesis testing).

Stem-and-leaf display

Stem-and-Leaf Plot, stemplot, stem and leaf plot
Stem-and-leaf plot
They evolved from Arthur Bowley's work in the early 1900s, and are useful tools in exploratory data analysis.
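
To make the display concrete, here is a small sketch that builds a stemplot as text lines. It assumes non-negative integers, with the tens as the stem and the units digit as the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stemplot lines for non-negative integers (stem = tens, leaf = units)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem} | {row}".rstrip())
    return lines

for line in stem_and_leaf([34, 12, 27, 15, 21, 27, 51]):
    print(line)
```

Unlike a histogram, the display keeps every original value readable while still showing the shape of the distribution.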

Order statistic

order statistics, ordered, th-smallest of items
Francis Galton emphasized order statistics and quantiles.
A similar important statistic in exploratory data analysis that is simply related to the order statistics is the sample interquartile range.
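
The link between order statistics and the interquartile range can be sketched as follows. The code assumes one common quantile convention (linear interpolation between neighbouring order statistics); other conventions exist and give slightly different quartiles. The function names are hypothetical.

```python
import math

def quantile_from_order_stats(values, p):
    """Interpolate linearly between the two order statistics that bracket quantile p."""
    xs = sorted(values)                 # xs[k] is the (k+1)-th order statistic
    h = (len(xs) - 1) * p               # fractional position in the sorted sample
    lo, hi = math.floor(h), math.ceil(h)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

def sample_iqr(values):
    """Sample interquartile range: third quartile minus first quartile."""
    return quantile_from_order_stats(values, 0.75) - quantile_from_order_stats(values, 0.25)

clean = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]
print(sample_iqr(clean), sample_iqr(with_outlier))
```

In this small example the extreme value changes the range from 4 to 99 but leaves the interquartile range unchanged, which is why the statistic is favoured in exploratory work.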

Machine learning

learning, machine-learning, statistical learning
Orange, an open-source data mining and machine learning software suite.
Data mining is a field of study within machine learning, and focuses on exploratory data analysis through unsupervised learning.

Data visualization

visualization, data visualisation, data visualizations
GGobi is free software for interactive data visualization.
Data visualization is closely related to information graphics, information visualization, scientific visualization, exploratory data analysis and statistical graphics.

TinkerPlots

TinkerPlots, an EDA software package for upper elementary and middle school students.
TinkerPlots is exploratory data analysis and modeling software designed for use by students in grades 4 through university.

Testing hypotheses suggested by the data

post hoc, post-hoc, hypotheses suggested by the data
In particular, he held that confusing the two types of analyses and employing them on the same set of data can lead to systematic bias owing to the issues inherent in testing hypotheses suggested by the data.
Exploratory data analysis

Trimean

Trimean
The foundations of the trimean were part of Arthur Bowley's teachings, and were later popularized by the statistician John Tukey in his 1977 book, which has given its name to a set of techniques called exploratory data analysis.
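
The trimean is the weighted average (Q1 + 2·Q2 + Q3) / 4 of the three quartiles. The sketch below assumes NumPy and its default linear-interpolation quartiles, which is only one of several quartile conventions; the function name `trimean` is hypothetical.

```python
import numpy as np

def trimean(values):
    """Weighted average of the quartiles: (Q1 + 2 * median + Q3) / 4."""
    q1, q2, q3 = np.percentile(values, [25, 50, 75])
    return float((q1 + 2.0 * q2 + q3) / 4.0)

print(trimean([1, 2, 3, 4, 5]))
print(trimean([1, 2, 3, 4, 100]))
```

The statistic gives the median double weight while still using information from the quartiles, so it responds to the shape of the middle half of the data without being pulled by extreme values.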

GGobi

GGobi is free software for interactive data visualization.
GGobi is a program which allows exploratory data analysis to occur for multi-dimensional data.

Configural frequency analysis

Configural frequency analysis
Configural frequency analysis (CFA) is a method of exploratory data analysis, introduced by Gustav A. Lienert in 1969.

Box plot

boxplot, box and whisker plot, adjusted boxplots
Box plot
Exploratory data analysis

Arthur Lyon Bowley

Bowley, A. L. Bowley, Arthur Bowley
Arthur Lyon Bowley used precursors of the stemplot and five-number summary. Bowley actually used a "seven-figure summary" comprising the extremes, deciles and quartiles along with the median: in his Elementary Manual of Statistics (3rd edn., 1920), p. 62, he defines "the maximum and minimum, median, quartiles and two deciles" as the "seven positions".
Bowley's teaching presaged several of the EDA ideas later popularised by John Tukey, including stemplots, decile boxplots, the seven-figure summary and trimean.
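
A seven-figure summary of this kind might be computed as in the sketch below. It assumes that the "two deciles" are the first and ninth (10th and 90th percentiles) and relies on NumPy's default interpolation rule; both choices, like the function name `seven_figure_summary`, are assumptions made for illustration.

```python
import numpy as np

def seven_figure_summary(values):
    """Return min, 1st decile, Q1, median, Q3, 9th decile, max as a dict."""
    labels = ["min", "decile_1", "q1", "median", "q3", "decile_9", "max"]
    points = np.percentile(values, [0, 10, 25, 50, 75, 90, 100])
    return {label: float(p) for label, p in zip(labels, points)}

summary = seven_figure_summary(range(101))
for name, value in summary.items():
    print(f"{name:>8}: {value}")
```

Dropping the two decile entries leaves the five-number summary that underlies the box plot.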

Data dredging

p-hacking, data snooping
Data dredging
When neither approach is practical, one can make a clear distinction between data analyses that are confirmatory and analyses that are exploratory.

Descriptive statistics

descriptive, descriptive statistic, statistics
Descriptive statistics
More recently, a collection of summarisation techniques has been formulated under the heading of exploratory data analysis: an example of such a technique is the box plot.

Anscombe's quartet

Anscombe's quartet, on the importance of exploration
Exploratory data analysis

Statistics

statistical, statistical analysis, statistician
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.

Data set

dataset, datasets, data
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.