Genome-wide complex trait analysis

GCTA, genomic analysis, genomic relatedness
Genome-wide complex trait analysis (GCTA), or genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics. It quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants (typically limited to SNPs with MAF >1%, hence terms such as "chip heritability" or "SNP heritability").
Source: Wikipedia
57 Related Articles

Heritability

heritable, breeder's equation, heritabilities
Genome-wide complex trait analysis (GCTA), or genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics. It quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants (typically limited to SNPs with MAF >1%, hence terms such as "chip heritability" or "SNP heritability").
Today, heritability can be estimated from general pedigrees using linear mixed models and from genomic relatedness estimated from genetic markers.
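
The "genomic relatedness estimated from genetic markers" mentioned above is usually summarized as a genomic relationship matrix (GRM). The following is a minimal sketch in plain Python, assuming genotypes coded as 0, 1 or 2 copies of a reference allele and the commonly used per-SNP standardization by 2p(1-p). The function name and the input layout are illustrative choices, not taken from any particular software.

```python
def genomic_relationship_matrix(genotypes):
    """Estimate pairwise genomic relatedness from SNP genotypes.

    genotypes: list of individuals, each a list of allele counts (0, 1, 2),
    one entry per SNP. This layout is an assumption made for the sketch.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency of the counted allele at each SNP.
    freqs = [sum(ind[i] for ind in genotypes) / (2 * n) for i in range(m)]
    # Monomorphic SNPs carry no information and would divide by zero.
    used = [i for i, p in enumerate(freqs) if 0.0 < p < 1.0]
    grm = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j, n):
            total = 0.0
            for i in used:
                p = freqs[i]
                # Product of centred genotypes, scaled by the expected
                # genotype variance 2p(1-p) under Hardy-Weinberg proportions.
                total += ((genotypes[j][i] - 2 * p)
                          * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[j][k] = grm[k][j] = total / len(used)
    return grm
```

Off-diagonal entries near zero indicate pairs that are about as similar as two random members of the sample. Real analyses use hundreds of thousands of SNPs, for which a vectorized implementation would be preferable to these nested loops.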

Genetic correlation

Characters that are positively correlated, genetically correlated
It can also be extended to analyse bivariate genetic correlations between traits.
They can be estimated in breeding experiments on two traits of known heritability and selecting on one trait to measure the change in the other trait (allowing inferring the genetic correlation), family/adoption/twin studies (analyzed using SEMs or DeFries–Fulker extremes analysis), molecular estimation of relatedness such as GCTA, methods employing polygenic scores like LD score regression, BOLT-REML, CPBayes, or HESS, comparison of genome-wide SNP hits in GWASes (as a loose lower bound), and phenotypic correlations of populations with at least some related individuals.
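
Whatever the estimation method, the genetic correlation itself is conventionally defined as the additive genetic covariance between two traits divided by the geometric mean of their additive genetic variances. The sketch below simply evaluates that definition; the function and argument names are hypothetical, and the inputs are assumed to come from some bivariate variance-component analysis.

```python
import math


def genetic_correlation(cov_g, var_g1, var_g2):
    """Genetic correlation r_g = cov_g / sqrt(var_g1 * var_g2).

    cov_g: estimated additive genetic covariance between trait 1 and trait 2.
    var_g1, var_g2: estimated additive genetic variances of the two traits.
    """
    if var_g1 <= 0 or var_g2 <= 0:
        # With a zero or negative variance estimate the ratio is undefined.
        raise ValueError("genetic variances must be positive")
    return cov_g / math.sqrt(var_g1 * var_g2)
```

Because the inputs are estimates, the ratio can fall outside the range -1 to 1 in small samples; this sketch reports the raw value rather than truncating it.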

Twin study

twin studies, twin, studies of twins
GCTA heritability estimates are useful because they provide lower bounds for the genetic contributions to traits such as intelligence without relying on the assumptions used in twin studies and other family and pedigree studies, thereby corroborating them. Eric Turkheimer ("Still Missing", Turkheimer 2011) discusses the GCTA results in the context of the twin study debate: "Of the three reservations about quantitative genetic heritability that were outlined at the outset—the assumptions of twin and family studies, the universality of heritability, and the absence of mechanism—the new paradigm has put the first to rest, and before continuing to explain my skepticism about whether the most important problems have been solved, it is worth appreciating what a significant accomplishment this is.

Missing heritability problem

missing heritability, missing heritability problem, the same is true
GCTA estimates can be used to resolve the missing heritability problem and to design GWASes which will yield genome-wide statistically significant hits.

Variance

sample variance, population variance, variability
Genome-wide complex trait analysis (GCTA), or genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics. It quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants (typically limited to SNPs with MAF >1%, hence terms such as "chip heritability" or "SNP heritability").

Single-nucleotide polymorphism

single nucleotide polymorphism, SNP, SNPs
Genome-wide complex trait analysis (GCTA), or genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics. It quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants (typically limited to SNPs with MAF >1%, hence terms such as "chip heritability" or "SNP heritability").

Minor allele frequency

MAF, minor allele frequencies
Genome-wide complex trait analysis (GCTA), or genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics. It quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants (typically limited to SNPs with MAF >1%, hence terms such as "chip heritability" or "SNP heritability").

Population stratification

population structure, stratification, Stratification for prevention and therapy
It also suffers from serious methodological weaknesses, such as susceptibility to population stratification. The use of SNP or whole-genome data from unrelated participants bypasses many heritability criticisms: twins are often entirely uninvolved, there are no questions of equal treatment, relatedness is estimated precisely, and the samples are drawn from a broad variety of subjects. Participants who are too closely related (typically relatedness >0.025, roughly fourth-cousin level) are removed, and several principal components are included in the regression to control for population stratification.
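
The relatedness filter described above can be sketched as a greedy pruning step over a genomic relationship matrix. This is only an illustration under stated assumptions: the matrix is given as a nested list, the 0.025 cutoff is the value quoted in the text, and the greedy rule (repeatedly drop the individual with the most too-close relatives) is a simple heuristic chosen here, not necessarily what any particular software does.

```python
def prune_related(grm, cutoff=0.025):
    """Return indices of individuals kept after removing close relatives.

    grm: symmetric matrix (list of lists) of pairwise relatedness estimates.
    cutoff: pairs with relatedness above this value are considered too related.
    """
    keep = set(range(len(grm)))
    while True:
        # For every retained individual, count retained partners above cutoff.
        counts = {
            j: sum(1 for k in keep if k != j and grm[j][k] > cutoff)
            for j in keep
        }
        worst = max(counts, key=counts.get, default=None)
        if worst is None or counts[worst] == 0:
            break  # no remaining pair exceeds the cutoff
        # Dropping the most-connected individual breaks the most pairs at once.
        keep.remove(worst)
    return sorted(keep)
```

After pruning, no retained pair has estimated relatedness above the cutoff, so the remaining sample approximates a set of unrelated individuals.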

Heritability of IQ

inheritance of intelligence, Heritability of intelligence, genetics of intelligence
GCTA heritability estimates are useful because they provide lower bounds for the genetic contributions to traits such as intelligence without relying on the assumptions used in twin studies and other family and pedigree studies, thereby corroborating them. Eric Turkheimer ("Still Missing", Turkheimer 2011) discusses the GCTA results in the context of the twin study debate: "Of the three reservations about quantitative genetic heritability that were outlined at the outset—the assumptions of twin and family studies, the universality of heritability, and the absence of mechanism—the new paradigm has put the first to rest, and before continuing to explain my skepticism about whether the most important problems have been solved, it is worth appreciating what a significant accomplishment this is.

Genealogy

genealogist, genealogical, family history
GCTA heritability estimates are useful because they provide lower bounds for the genetic contributions to traits such as intelligence without relying on the assumptions used in twin studies and other family and pedigree studies, thereby corroborating them. Eric Turkheimer ("Still Missing", Turkheimer 2011) discusses the GCTA results in the context of the twin study debate: "Of the three reservations about quantitative genetic heritability that were outlined at the outset—the assumptions of twin and family studies, the universality of heritability, and the absence of mechanism—the new paradigm has put the first to rest, and before continuing to explain my skepticism about whether the most important problems have been solved, it is worth appreciating what a significant accomplishment this is.

Power (statistics)

statistical power, power, powerful
Like the validity of intelligence testing, the heritability of intelligence is no longer scientifically contentious." GCTA estimates also enable the design of well-powered genome-wide association studies (GWAS) to find the specific genetic variants involved. For example, a GCTA estimate of 30% SNP heritability is consistent with a larger total genetic heritability of 70%. However, if the GCTA estimate were ~0%, that would imply one of three things: a) there is no genetic contribution, b) the genetic contribution comes entirely from genetic variants not included in the analysis, or c) the genetic contribution is entirely in the form of non-additive effects such as epistasis or dominance. Running GCTA on individual chromosomes and regressing the estimated proportion of trait variance explained by each chromosome against that chromosome's length can reveal whether the responsible genetic variants cluster, are distributed evenly across the genome, or are sex-linked.

Genome-wide association study

genome-wide association studies, GWAS, Genome Wide Association Studies
Like the validity of intelligence testing, the heritability of intelligence is no longer scientifically contentious." GCTA estimates also enable the design of well-powered genome-wide association studies (GWAS) to find the specific genetic variants involved. For example, a GCTA estimate of 30% SNP heritability is consistent with a larger total genetic heritability of 70%. However, if the GCTA estimate were ~0%, that would imply one of three things: a) there is no genetic contribution, b) the genetic contribution comes entirely from genetic variants not included in the analysis, or c) the genetic contribution is entirely in the form of non-additive effects such as epistasis or dominance. Running GCTA on individual chromosomes and regressing the estimated proportion of trait variance explained by each chromosome against that chromosome's length can reveal whether the responsible genetic variants cluster, are distributed evenly across the genome, or are sex-linked.

Epistasis

epistatic, gene interaction, genetic interactions
Like the validity of intelligence testing, the heritability of intelligence is no longer scientifically contentious." GCTA estimates also enable the design of well-powered genome-wide association studies (GWAS) to find the specific genetic variants involved. For example, a GCTA estimate of 30% SNP heritability is consistent with a larger total genetic heritability of 70%. However, if the GCTA estimate were ~0%, that would imply one of three things: a) there is no genetic contribution, b) the genetic contribution comes entirely from genetic variants not included in the analysis, or c) the genetic contribution is entirely in the form of non-additive effects such as epistasis or dominance. Running GCTA on individual chromosomes and regressing the estimated proportion of trait variance explained by each chromosome against that chromosome's length can reveal whether the responsible genetic variants cluster, are distributed evenly across the genome, or are sex-linked.

Dominance (genetics)

autosomal recessive, recessive, autosomal dominant
Like the validity of intelligence testing, the heritability of intelligence is no longer scientifically contentious." GCTA estimates also enable the design of well-powered genome-wide association studies (GWAS) to find the specific genetic variants involved. For example, a GCTA estimate of 30% SNP heritability is consistent with a larger total genetic heritability of 70%. However, if the GCTA estimate were ~0%, that would imply one of three things: a) there is no genetic contribution, b) the genetic contribution comes entirely from genetic variants not included in the analysis, or c) the genetic contribution is entirely in the form of non-additive effects such as epistasis or dominance. Running GCTA on individual chromosomes and regressing the estimated proportion of trait variance explained by each chromosome against that chromosome's length can reveal whether the responsible genetic variants cluster, are distributed evenly across the genome, or are sex-linked.

Sex linkage

X-linked, sex-linked, X-linked trait
Like the validity of intelligence testing, the heritability of intelligence is no longer scientifically contentious." GCTA estimates also enable the design of well-powered genome-wide association studies (GWAS) to find the specific genetic variants involved. For example, a GCTA estimate of 30% SNP heritability is consistent with a larger total genetic heritability of 70%. However, if the GCTA estimate were ~0%, that would imply one of three things: a) there is no genetic contribution, b) the genetic contribution comes entirely from genetic variants not included in the analysis, or c) the genetic contribution is entirely in the form of non-additive effects such as epistasis or dominance. Running GCTA on individual chromosomes and regressing the estimated proportion of trait variance explained by each chromosome against that chromosome's length can reveal whether the responsible genetic variants cluster, are distributed evenly across the genome, or are sex-linked.
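
The per-chromosome regression described above amounts to an ordinary least-squares fit of variance explained on chromosome length. A minimal sketch follows, assuming the per-chromosome estimates have already been obtained elsewhere; the function names are hypothetical and any numbers passed in are the user's own.

```python
def fit_line(lengths, variance_explained):
    """Ordinary least-squares fit of variance explained on chromosome length.

    Returns (slope, intercept, r_squared).
    """
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(variance_explained) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    syy = sum((y - mean_y) ** 2 for y in variance_explained)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(lengths, variance_explained))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Share of between-chromosome variation accounted for by length alone.
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def residuals(lengths, variance_explained):
    """Observed minus fitted variance explained, one value per chromosome."""
    slope, intercept, _ = fit_line(lengths, variance_explained)
    return [y - (intercept + slope * x)
            for x, y in zip(lengths, variance_explained)]
```

A positive slope with a high r-squared is what an even spread of many small-effect variants would tend to produce, since longer chromosomes then carry more of them. A chromosome with a large positive residual explains more variance than its length predicts, which would point towards clustering of the responsible variants.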

Analysis of variance

ANOVA, analysis of variance (ANOVA), corrected the means
Estimation of variance components such as heritability, shared environment, and maternal effects in biology and animal breeding using standard ANOVA/REML methods typically requires individuals of known relatedness, such as parent and child. This information is often unavailable, or the pedigree data are unreliable, so the methods either cannot be applied or require strict laboratory control of all breeding (which threatens the external validity of all estimates). Several authors have noted that relatedness could be measured directly from genetic markers (and if individuals were reasonably related, economically few markers would have to be obtained for statistical power), leading Kermit Ritland to propose in 1996 that directly measured pairwise relatedness could be compared to pairwise phenotype measurements (Ritland 1996, "A Marker-based Method for Inferences About Quantitative Inheritance in Natural Populations").

Restricted maximum likelihood

REML, residual maximum likelihood
Genome-wide complex trait analysis (GCTA), or genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics. It quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants (typically limited to SNPs with MAF >1%, hence terms such as "chip heritability" or "SNP heritability"). Estimation of variance components such as heritability, shared environment, and maternal effects in biology and animal breeding using standard ANOVA/REML methods typically requires individuals of known relatedness, such as parent and child. This information is often unavailable, or the pedigree data are unreliable, so the methods either cannot be applied or require strict laboratory control of all breeding (which threatens the external validity of all estimates). Several authors have noted that relatedness could be measured directly from genetic markers (and if individuals were reasonably related, economically few markers would have to be obtained for statistical power), leading Kermit Ritland to propose in 1996 that directly measured pairwise relatedness could be compared to pairwise phenotype measurements (Ritland 1996, "A Marker-based Method for Inferences About Quantitative Inheritance in Natural Populations").

External validity

external, External validity (scientific studies), generalised
Estimation of variance components such as heritability, shared environment, and maternal effects in biology and animal breeding using standard ANOVA/REML methods typically requires individuals of known relatedness, such as parent and child. This information is often unavailable, or the pedigree data are unreliable, so the methods either cannot be applied or require strict laboratory control of all breeding (which threatens the external validity of all estimates). Several authors have noted that relatedness could be measured directly from genetic markers (and if individuals were reasonably related, economically few markers would have to be obtained for statistical power), leading Kermit Ritland to propose in 1996 that directly measured pairwise relatedness could be compared to pairwise phenotype measurements (Ritland 1996, "A Marker-based Method for Inferences About Quantitative Inheritance in Natural Populations").
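
The idea of comparing pairwise relatedness with pairwise phenotype measurements can be illustrated with a cross-product regression in the style of Haseman-Elston regression. This is a deliberately simple stand-in: it is not Ritland's exact estimator and it is not the REML procedure used by GCTA. It assumes a relatedness matrix is already available, and the function name is hypothetical.

```python
def pairwise_regression_h2(relatedness, phenotypes):
    """Regress phenotype cross-products on pairwise relatedness.

    relatedness: symmetric matrix (list of lists) of pairwise relatedness.
    phenotypes: one measurement per individual, in the same order.
    Returns the regression slope, interpreted as a rough heritability estimate.
    """
    n = len(phenotypes)
    mean = sum(phenotypes) / n
    sd = (sum((y - mean) ** 2 for y in phenotypes) / (n - 1)) ** 0.5
    z = [(y - mean) / sd for y in phenotypes]  # standardized phenotypes

    xs, ys = [], []
    for j in range(n):
        for k in range(j + 1, n):  # each distinct pair once
            xs.append(relatedness[j][k])
            ys.append(z[j] * z[k])  # phenotypic similarity of the pair

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

If more closely related pairs tend to have more similar phenotypes, the slope is positive. With only slightly related individuals the slope is very noisy, which is one reason large samples are needed.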

Twin

identical twin, twins, fraternal twin
However, twin and family studies have been criticized for their reliance on a number of assumptions that are difficult or impossible to verify, such as the equal environments assumption (that the environments of monozygotic and dizygotic twins are equally similar), that there is no misclassification of zygosity (mistaking identical for fraternal twins and vice versa), that twins are representative of the general population, and that there is no assortative mating.

Assortative mating

disassortative mating, assortive mating, mate assortatively
However, twin and family studies have been criticized for their reliance on a number of assumptions that are difficult or impossible to verify, such as the equal environments assumption (that the environments of monozygotic and dizygotic twins are equally similar), that there is no misclassification of zygosity (mistaking identical for fraternal twins and vice versa), that twins are representative of the general population, and that there is no assortative mating.

Principal component analysis

principal components analysis, PCA, principal components
The use of SNP or whole-genome data from unrelated participants bypasses many heritability criticisms: twins are often entirely uninvolved, there are no questions of equal treatment, relatedness is estimated precisely, and the samples are drawn from a broad variety of subjects. Participants who are too closely related (typically relatedness >0.025, roughly fourth-cousin level) are removed, and several principal components are included in the regression to control for population stratification.

Sampling bias

ascertainment bias, biased sample, bias
In addition to being more robust to violations of the twin study assumptions, SNP data can be easier to collect since it does not require rare twins; heritability can therefore also be estimated for rare traits (with due correction for ascertainment bias).
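
One commonly cited form of the ascertainment correction for case-control traits converts a heritability estimated on the observed 0/1 scale to the liability scale, using the population prevalence K and the proportion of cases in the sample P. The sketch below assumes a liability-threshold model with normally distributed liability; the formula is quoted from memory of the general literature rather than from the text above, and the function and argument names are hypothetical.

```python
from statistics import NormalDist


def liability_scale_h2(h2_observed, prevalence, sample_case_fraction):
    """Convert observed-scale heritability to the liability scale.

    Assumed formula: h2_l = h2_o * [K(1-K)]^2 / (z^2 * P(1-P)), where z is
    the standard normal density at the threshold that cuts off the upper
    tail of size K.
    """
    k = prevalence
    p = sample_case_fraction
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - k)  # liability threshold for being a case
    z = std_normal.pdf(threshold)          # density height at that threshold
    return h2_observed * (k * (1 - k)) ** 2 / (z ** 2 * p * (1 - p))
```

When the sample is not enriched for cases (P equal to K), the expression reduces to the classical observed-to-liability transformation h2_o * K(1-K) / z^2.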

Design of experiments

experimental design, design, Experimental techniques
GCTA estimates can be used to resolve the missing heritability problem and to design GWASes which will yield genome-wide statistically significant hits.

Polygene

polygenic, many genes, polygenic traits
If a GWAS of n=10k using SNP data fails to turn up any hits, but GCTA indicates a high heritability accounted for by SNPs, then a large number of variants must be involved (polygenicity), and much larger GWASes will be required to accurately estimate each SNP's effect and directly account for a fraction of the GCTA heritability.
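
The link between polygenicity and required sample size can be made concrete with an approximate power calculation. This is a sketch under stated assumptions: a single SNP explaining a fraction q2 of trait variance, a normal approximation to the test statistic, and the conventional genome-wide significance threshold of 5e-8 (a common choice that is not given in the text above). The function name is hypothetical.

```python
from statistics import NormalDist


def gwas_power(n, variance_explained, alpha=5e-8):
    """Approximate power to detect one SNP in a GWAS of n individuals.

    variance_explained: fraction of trait variance due to this SNP (q2).
    alpha: two-sided significance threshold (5e-8 is assumed by convention).
    """
    std_normal = NormalDist()
    z_crit = -std_normal.inv_cdf(alpha / 2)  # critical value of the z test
    # Expected value of the z statistic under the alternative hypothesis.
    shift = (n * variance_explained / (1 - variance_explained)) ** 0.5
    # Probability that the statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
```

As a usage illustration, if a SNP heritability of 0.3 were spread evenly over M causal variants, each would explain 0.3 / M of the variance, so `gwas_power(10_000, 0.3 / M)` shrinks rapidly as M grows while `gwas_power(n, 0.3 / M)` recovers only when n is made much larger.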