Assessment of Parametric and Resampling Methods for Prediction of Quantitative Traits
Dissertation, 10th Dec 2019
Tags: Genomics
Assessment of parametric and resampling methods for prediction of quantitative traits with non-additive genetic architecture
Abstract
Whole-genome evaluation of quantitative traits using suitable statistical techniques enables researchers to predict genomic breeding values (GEBVs) more accurately. Because the genetic architecture differs among quantitative traits, the choice of statistical method should be based on the genetic architecture of the trait. In this study, the performance of parametric methods (GBLUP and BayesB) and resampling methods (Bagging GBLUP and Random Forest, RF) was therefore compared for traits with different genetic architectures. Three scenarios of genetic architecture were considered: completely additive (A), additive-dominance-epistasis (ADI) and completely epistatic (I). To this end, an animal genome composed of five chromosomes, each harboring 1000 SNPs and 4 QTL, was simulated. The performance of these methods was assessed by the regression coefficient (b) and the correlation coefficient (r) between the true and predicted breeding values. The results showed that for the additive-dominance-epistasis (ADI) and epistasis (I) genetic architectures, the non-parametric method, RF, delivered better predictive performance than the other statistical methods. ….
Key words: Genetic architecture; Epistasis; Non-parametric methods; Genomic breeding values
Introduction
Whole-genome evaluation using suitable statistical techniques enables researchers to predict genomic breeding values (GEBVs) of quantitative traits more accurately. However, no single statistical method has been established as the most suitable for genomic evaluation, and the field faces a basic challenge: the number of parameters to be estimated is far larger than the number of observations. Consequently, when the simultaneous equations are solved with methods such as ordinary least squares (OLS), the coefficient matrix is singular and cannot be inverted directly. To overcome this problem, parametric, non-parametric and semi-parametric methods have been suggested (Gianola et al., 2006; De Los Campos et al., 2009). Several statistical criteria can be used to compare candidate evaluation methods. It has been claimed that parametric methods are more efficient than the corresponding non-parametric methods (reference needed). Although this difference in efficiency is often small, there are cases in which it matters which method is more efficient.
Because genetic architecture differs among quantitative traits, the choice of statistical method should be based on the genetic architecture of the trait. The genetic architecture of a quantitative trait consists of the parameters that explain genetic variation within and among populations: the number of effective quantitative trait loci (QTL), their location across the genome, the allele frequency distribution, the distribution of substitution effects, the pattern of linkage disequilibrium between loci, and the amount of additive and non-additive effects within and between loci (Abdollahi-Arpanahi et al., 2014). So far, most simulation studies on genomic selection (GS) methods have considered only additive genetic architectures, although non-additive genetic effects may contribute substantially to the total genetic variation.
High-throughput DNA sequencing technologies have also made it possible to estimate epistatic interactions in genomic selection studies. However, including interaction effects in prediction models increases the computational burden and can cause convergence problems. It is therefore important to identify the statistical models that cope best with such effects. So far, only a few studies have compared different statistical methods under non-additive genetic architectures (Gianola et al., 2006; Howard et al., 2014). Their results showed that non-parametric methods can accurately predict phenotypes that are based on genetic architectures with epistatic effects. They also reported that when the underlying genetic architecture is additive, parametric methods are slightly better than non-parametric methods. Our study differs from that of Howard et al. (2014) in two respects: 1) they investigated only additive and epistatic architectures, whereas here the joint additive, dominance and epistasis architecture was also examined; 2) Bagging GBLUP and Random Forest were not considered in that study, whereas these resampling methods are explored here.
The underlying genetic architecture of complex traits is not necessarily purely additive or purely epistatic; it may include a combination of additive, dominance and epistatic effects. Therefore, given the scarcity of papers on non-additive genetic architecture in the literature, the objective of this study was to assess the performance of two resampling methods (Bagging GBLUP and Random Forest) and two parametric methods (GBLUP and BayesB) for prediction of quantitative traits with additive and non-additive (dominance and epistasis) genetic architectures.
Methods
To compare the parametric and resampling methods, data were simulated with the hypred package (Technow, 2013), which is available in the R language/environment (R Development Core Team, 2014). A population was simulated for 50 generations at an effective size of 100. In generation 51, the population was expanded to 4000 individuals. Genotypes and phenotypes were generated for these 4000 individuals, which were then used as the training population to estimate SNP effects; these estimates were used to calculate the breeding values of generations 52 to 56 (the target population). Each generation consisted of 4000 individuals. The genome consisted of five chromosomes, each 100 cM long, with 1000 loci per chromosome and four QTL per chromosome distributed randomly over the genome. The simulation was replicated five times.
Three types of traits were considered, with purely additive (A), additive-dominance-epistasis (ADI) and purely epistatic (I) genetic architectures. Both the additive and the dominance effects were sampled from a standard normal distribution, and the epistatic effects were derived from the additive effects.
Epistatic effects were created by modifying the hypred package. These effects were calculated approximately as a particular Hadamard product. The program was written so that all scenarios of pairwise epistasis could be evaluated, including additive-by-additive, additive-by-dominance, dominance-by-additive and dominance-by-dominance interactions. In this study, only second-order epistasis (additive-by-additive interactions) was considered. The heritability was assumed to be 0.30. The parameters used in the simulation are presented in Table 1.
Table 1. Parameters used in the simulation
Parameter | Value |
Genome | |
Genome length | 5 Morgan |
Number of chromosomes | 5 |
Number of loci/chromosome | 1000 |
Number of QTL/chromosome | 4 |
Distribution of QTL effects | Normal |
Genetic architecture | additive (A), additive-dominance-epistasis (ADI) and epistasis (I) |
Population | |
Number of historical generations | 50 |
Effective population size | 100 |
Training population generation | 51 |
Target population generations | 52-56 |
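As a rough illustration of the simulation step described above, the sketch below builds additive-by-additive epistatic values as the element-wise (Hadamard) product of centered genotype codes at pairs of QTL, and then adds residual noise so that the heritability is close to 0.30. The function names, the pairing of QTL, and the choice to take each interaction effect as the product of the two additive effects are assumptions made for illustration only; this is not the modified hypred code used in the study.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def variance(values):
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def epistatic_values(genotypes, additive_effects, pairs):
    """Additive-by-additive genotypic value of each individual.

    genotypes: one list of QTL codes (0, 1 or 2) per individual.
    additive_effects: one additive effect per QTL.
    pairs: (k, l) index pairs of interacting QTL (hypothetical pairing).
    """
    n_qtl = len(additive_effects)
    # Center every QTL column so interaction terms are deviations from the mean.
    col_means = [mean(ind[k] for ind in genotypes) for k in range(n_qtl)]
    values = []
    for ind in genotypes:
        total = 0.0
        for k, l in pairs:
            # Assumption: interaction effect derived from the two additive effects.
            effect = additive_effects[k] * additive_effects[l]
            # Hadamard (element-wise) product of the two centered genotype codes.
            total += effect * (ind[k] - col_means[k]) * (ind[l] - col_means[l])
        values.append(total)
    return values


def add_residual(genetic_values, h2, rng):
    """Add normal residuals so that var(g) / var(y) is approximately h2."""
    var_e = variance(genetic_values) * (1.0 - h2) / h2
    return [g + rng.gauss(0.0, var_e ** 0.5) for g in genetic_values]


rng = random.Random(1)
n_ind, n_qtl = 2000, 4
# Codes drawn as (0, 1, 1, 2) mimic an allele frequency of 0.5.
genotypes = [[rng.choice((0, 1, 1, 2)) for _ in range(n_qtl)] for _ in range(n_ind)]
effects = [rng.gauss(0.0, 1.0) for _ in range(n_qtl)]
g = epistatic_values(genotypes, effects, pairs=[(0, 1), (2, 3)])
y = add_residual(g, h2=0.30, rng=rng)
realized_h2 = variance(g) / variance(y)
print(round(realized_h2, 2))
```

The residual variance is set to var(g)(1 - h2)/h2, so the realized ratio var(g)/var(y) only approximates 0.30 in a finite sample.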
Statistical models
Four statistical methods, GBLUP, BayesB, Bagging GBLUP and Random Forest, were used to estimate marker effects; each is described below.
GBLUP
The statistical model used for GBLUP was:
[1] | $y = 1\mu + Zu + e$ |
where $y$ is the (n × 1) vector of phenotypic data, $\mu$ is the overall mean and $1$ is an (n × 1) vector of ones, $u$ is a vector of breeding values considered as random effects, $Z$ is an incidence matrix that relates breeding values to the records, and $e$ is a vector of random residual effects. It is assumed that $u \sim N(0, G\sigma_a^2)$ and $e \sim N(0, I\sigma_e^2)$, where $\sigma_a^2$ and $\sigma_e^2$ are the additive genetic and residual variances, respectively, and the matrix $G$ describes the genomic relationships among all pairs of individuals in both the reference and validation populations based on the SNP genotypes. It was calculated following the formula of Yang et al. (2010):
[2] | $G_{jj'} = \frac{1}{m}\sum_{i=1}^{m}\frac{(x_{ij}-2p_i)(x_{ij'}-2p_i)}{2p_i(1-p_i)}$ |
where $m$ is the number of SNPs, $x_{ij}$ is coded as 0, 1, or 2 for genotypes AA, AB, and BB, respectively, and $p_i$ is the observed allele frequency at the $i$-th SNP in the reference and validation populations.
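The genomic relationship formula above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming genotypes are stored as one list of 0/1/2 codes per individual; the function name is hypothetical, the same expression is applied to diagonal and off-diagonal elements, and monomorphic SNPs (for which the scaling term is zero) are skipped, which is an assumption not stated in the text.

```python
def genomic_relationship(genotypes):
    """Genomic relationship matrix from SNP codes (0, 1 or 2).

    genotypes: one list of SNP codes per individual.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Observed frequency p_i of the counted allele at each SNP.
    p = [sum(ind[i] for ind in genotypes) / (2.0 * n) for i in range(m)]
    # Assumption: drop monomorphic SNPs, where 2 p (1 - p) would be zero.
    usable = [i for i in range(m) if 0.0 < p[i] < 1.0]
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j, n):
            total = 0.0
            for i in usable:
                centered_j = genotypes[j][i] - 2.0 * p[i]
                centered_k = genotypes[k][i] - 2.0 * p[i]
                total += centered_j * centered_k / (2.0 * p[i] * (1.0 - p[i]))
            G[j][k] = G[k][j] = total / len(usable)
    return G


# Toy example: 4 individuals genotyped at 3 SNPs.
toy = [[0, 1, 2], [1, 1, 0], [2, 1, 1], [1, 0, 1]]
G = genomic_relationship(toy)
```

Because the genotype codes are centered with allele frequencies computed from the same individuals, every row of the resulting matrix sums to zero, which is a quick sanity check on an implementation.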
BayesB
The following model was used to estimate SNP effects:
[3] | $y_j = \mu + \sum_{i=1}^{m} x_{ij} b_i + e_j$ |
where $j = 1, \dots, n$ indexes individuals and $i = 1, \dots, m$ indexes markers, $y_j$ is the phenotypic value of animal $j$, $\mu$ is the overall mean, $x_{ij}$ is the copy number of a given allele of SNP $i$ centered by its mean in the reference population, $b_i$ is the allele substitution effect of SNP $i$, and $e_j$ is the random residual effect for animal $j$. The prior for $b_i$ depends on the variance of the random substitution effects for all SNPs, $\sigma_a^2$, and on the prior probability $\pi$ that SNP $i$ has zero effect:
[4] | $b_i \mid \sigma_a^2 = 0$ with probability $\pi$; $b_i \mid \sigma_a^2 \sim N(0, \sigma_a^2)$ with probability $1-\pi$ |
The priors of all SNP effects have a common variance in BayesB, which follows a scaled inverted chi-square probability distribution with parameters $\nu_b$ (degrees of freedom) and $S_b^2$ (scale parameter). We implemented BayesB with the BGLR package (Pérez and de los Campos, 2014). The package is available at CRAN and at the R-forge website https://r-forge.r-project.org/projects/bglr/. We chose π = 0.90. BayesB uses Gibbs sampling to sample from the posterior distributions of the unknown model parameters. The Markov chain was run for 70,000 cycles, and the first 7000 cycles were discarded as burn-in.
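To make the prior structure concrete, the sketch below draws marker effects from a point-mass-plus-slab prior with π = 0.90 and 5000 markers (five chromosomes of 1000 SNPs). It is only an illustration of the prior, not of the Gibbs sampler in BGLR. The normal slab, the function name, and the degrees of freedom and scale values are assumptions chosen for the example.

```python
import random


def sample_bayesb_prior(m, pi, df, scale, rng):
    """Draw m marker effects: zero with probability pi, normal otherwise."""
    # Common variance of the non-zero effects, drawn from a scaled inverse
    # chi-square(df, scale): df * scale / X with X ~ chi-square(df).
    # A chi-square(df) variate is Gamma(shape=df / 2, scale=2).
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    sigma2 = df * scale / chi2
    effects = []
    for _ in range(m):
        if rng.random() < pi:
            effects.append(0.0)                              # point mass at zero
        else:
            effects.append(rng.gauss(0.0, sigma2 ** 0.5))    # assumed normal slab
    return effects


rng = random.Random(42)
# df and scale are hypothetical values; pi and m follow the text.
b = sample_bayesb_prior(m=5000, pi=0.90, df=5.0, scale=0.01, rng=rng)
share_zero = sum(1 for x in b if x == 0.0) / len(b)
print(round(share_zero, 2))
```

With π = 0.90, about nine in ten markers start with an effect of exactly zero, which is what gives BayesB its variable-selection behaviour.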
Bagging GBLUP
Bagging or bootstrap aggregation is a technique for reducing the variance of an estimated prediction function. Bagging seems to work especially well when standard predictions of response variables have large variance.
Prediction methods that use regularization, such as those applied in GBLUP and BayesB, are often stable because penalties on model complexity reduce the effective number of parameters drastically, thus lowering variance. A detailed discussion of bagging in the context of GBLUP is in Gianola et al. (2014). The bagging algorithm employed here was as follows:
Step 1. Sample with replacement B times from the training set; each sample has size $N_{train}$.
Step 2. Fit GBLUP in each bag sample and save the predicted genomic breeding values (GEBV) from each sample. Unlike Gianola et al. (2014), who used SNP-BLUP with all marker effects estimated in each bootstrap sample, we used GBLUP directly. Because the number of GEBV per individual is unbalanced after the B bootstrap samples, the genotypic values of the out-of-bag individuals in each bootstrap sample were predicted using the conditional expectation function:
[5] | $E(g_{Train}^{b} \mid \hat{g}_{bagged}^{b}) = G_{Train,bagged}\, G_{bagged}^{-1}\, \hat{g}_{bagged}^{b} = \hat{g}_{Train}^{b}$ |
where $\hat{g}_{bagged}^{b}$ is the vector of predicted genetic signals of the individuals in the $b$-th bag sample, $G_{Train,bagged}$ is a rectangular matrix of genomic relationships between individuals in the training and bagging samples, and $G_{bagged}$ is the genomic relationship matrix among individuals in the $b$-th sample.
Step 3. After obtaining GBLUP for all individuals in the 25 bootstrap samples, the bagged GBLUP (BGBLUP) was calculated as:
[6] | $\hat{\phi}_{Bagged} = \frac{1}{B}\sum_{b=1}^{B}\hat{g}_{Train}^{b}$ |
where $\hat{\phi}_{Bagged}$ is the fitted value of the genetic signal after averaging over B = 25 bagging samples.
Step 4. Predict the vector $g_{Test}$ in the testing set:
$\hat{g}_{Test} = G_{Test,Train}\, G_{Train}^{-1}\, \hat{\phi}_{Bagged}$
where $G_{Test,Train}$ is a matrix of genomic relationships between individuals in the testing and training sets. If $g_{Test}$ pertains to records on the same individuals, the predictor is $\hat{\phi}_{Test} = \hat{\phi}_{Bagged}$.
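Steps 1 to 3 of the bagging procedure can be sketched as follows. This is a minimal illustration under several assumptions: the variance ratio lam = σe²/σa² is treated as known, the toy relationship matrix and phenotypes are made up, and the projection onto all training individuals is written as G_{Train,bagged}(G_{bagged} + lam·I)⁻¹(y_bag − mean), which is algebraically the same as the conditional expectation in Step 2 when G_{bagged} is invertible but avoids inverting a matrix that becomes singular when an individual is drawn twice. Step 4 is not covered. All function names are hypothetical.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def bagged_gblup(G, y, lam, n_bags, rng):
    """Average GBLUP predictions of all training individuals over bootstrap bags."""
    n = len(y)
    bagged = [0.0] * n
    for _ in range(n_bags):
        # Step 1: bootstrap sample of the training individuals.
        idx = [rng.randrange(n) for _ in range(n)]
        mu = sum(y[i] for i in idx) / n
        # Step 2: GBLUP within the bag, (G_bag + lam * I) alpha = y_bag - mu.
        A = [[G[i][j] + (lam if r == c else 0.0) for c, j in enumerate(idx)]
             for r, i in enumerate(idx)]
        alpha = solve(A, [y[i] - mu for i in idx])
        # Project onto every training individual, in-bag and out-of-bag,
        # and accumulate the Step 3 average over bags.
        for t in range(n):
            bagged[t] += sum(G[t][j] * a for j, a in zip(idx, alpha)) / n_bags
    return bagged


# Toy data (assumed): 30 individuals, 40 centered marker codes.
rng = random.Random(7)
n, m = 30, 40
W = [[rng.choice((-1.0, 0.0, 0.0, 1.0)) for _ in range(m)] for _ in range(n)]
G = [[sum(a * b for a, b in zip(W[i], W[j])) / m for j in range(n)] for i in range(n)]
y = [sum(w[:5]) + rng.gauss(0.0, 1.5) for w in W]
phi = bagged_gblup(G, y, lam=0.7 / 0.3, n_bags=25, rng=rng)
```

The value lam = 0.7/0.3 corresponds to a heritability of 0.30 when the relationship matrix is scaled so that its average diagonal is about one; in practice the variance components would be estimated rather than fixed.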
Random Forest
Random Forest is a non-parametric method and a substantial modification of bagging: in addition to sampling observations in each bootstrap sample, a random subset of markers is sampled as each genomic prediction tree is built. A forest consists of hundreds or thousands of trees built from random samples of the data, and the relevance of predictors is judged through variable importance measures rather than probability values (p-values). Each tree consists of three parts, the root, the internal nodes and the terminal nodes (leaves), and is grown from top to bottom like an inverted tree (Pekelis, 2013).
For genomic evaluation, a set of training samples {x_{i}, y_{i}} is used to train every tree, so that the tree learns to which phenotypic class a new input sample (an animal with genetic information (x_{i}) but no phenotypic information (y_{i})) belongs. The most informative feature (SNP) is placed at the root, and each feature acts as an index on which the data are split. At each node of the tree, a test is performed on a SNP and the animals are directed to one of the branches according to the outcome; at the next node they are tested again on another SNP, and the tree continues to grow downward until the leaves are reached, where the animals are finally grouped according to their genotypes (animals with the same genotypes at the tested SNPs end up in the same leaf). Finally, these steps are repeated to construct a large number of trees, which together form the random forest.
The randomForest package in R was used to implement the RF method (Liaw and Wiener, 2002). The RF regression prediction for a new observation $x$, $\hat{f}_{rf}^{B}(x)$, is made by averaging the output of the ensemble of B trees $\{T(x; \Psi_b)\}_{1}^{B}$ as (Hastie et al. 2009):
[7] | $\hat{f}_{rf}^{B}(x) = \frac{1}{B}\sum_{b=1}^{B} T(x; \Psi_b)$ |
where $\Psi_b$ characterizes the $b$-th RF tree in terms of split variables, cut points at each node, and terminal node values.
Three main parameters can be tuned in random forest: the number of trees to grow (ntree), the number of SNPs randomly selected at each tree node (mtry), and the minimum size of the terminal nodes of the trees (nodesize) (Breiman 2001, Liaw and Wiener 2002). In our study, ntree = 1000, mtry = 1500 and nodesize = 10 were used.
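The roles of ntree and mtry, and the averaging over trees, can be illustrated with a deliberately small forest. The sketch below is not the randomForest implementation: each tree is only a depth-one regression tree (a stump), so nodesize plays no role, and the data, function names and parameter values are made up for the example.

```python
import random


def fit_stump(X, y, features):
    """Depth-one regression tree: best single split over the candidate SNPs."""
    best = None
    for f in features:
        for cut in (0.5, 1.5):                  # splits between codes 0|1 and 1|2
            left = [yi for xi, yi in zip(X, y) if xi[f] < cut]
            right = [yi for xi, yi in zip(X, y) if xi[f] >= cut]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, f, cut, ml, mr)
    if best is None:                            # no valid split: predict the mean
        overall = sum(y) / len(y)
        return lambda x: overall
    _, f, cut, ml, mr = best
    return lambda x: ml if x[f] < cut else mr


def fit_forest(X, y, ntree, mtry, rng):
    n, p = len(X), len(X[0])
    trees = []
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample of animals
        feats = rng.sample(range(p), mtry)              # random subset of SNPs
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return trees


def predict(trees, x):
    """Average the outputs of the B trees, as in the RF regression predictor."""
    return sum(tree(x) for tree in trees) / len(trees)


# Toy data (assumed): 200 animals, 10 SNPs, only SNP 0 affects the trait.
rng = random.Random(3)
n, p = 200, 10
X = [[rng.choice((0, 1, 2)) for _ in range(p)] for _ in range(n)]
y = [2.0 * x[0] + rng.gauss(0.0, 0.5) for x in X]
forest = fit_forest(X, y, ntree=200, mtry=3, rng=rng)
x_lo = [0] + [1] * (p - 1)
x_hi = [2] + [1] * (p - 1)
print(predict(forest, x_lo) < predict(forest, x_hi))
```

Because each stump has a single node, sampling mtry SNPs per tree here coincides with sampling them per node; in a full forest the subset is redrawn at every split.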
Results and discussion
To evaluate the performance of the statistical models, the prediction accuracy of genomic breeding values and the regression coefficient were calculated. The results are therefore presented in two parts: prediction accuracy and regression coefficients.
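Both criteria can be computed from the same sums of squares and cross-products. The sketch below assumes that accuracy is the Pearson correlation between true and predicted breeding values and that the regression coefficient is the slope from regressing true on predicted values, so that a slope below one signals predictions that are too dispersed. The direction of the regression and the function name are assumptions made for this example.

```python
import math


def accuracy_and_slope(true_bv, predicted_bv):
    """Return (r, b): correlation and slope of true on predicted breeding values."""
    n = len(true_bv)
    mean_true = sum(true_bv) / n
    mean_pred = sum(predicted_bv) / n
    s_tp = sum((t - mean_true) * (p - mean_pred) for t, p in zip(true_bv, predicted_bv))
    s_pp = sum((p - mean_pred) ** 2 for p in predicted_bv)
    s_tt = sum((t - mean_true) ** 2 for t in true_bv)
    r = s_tp / math.sqrt(s_pp * s_tt)
    b = s_tp / s_pp          # slope of the regression of true on predicted
    return r, b


# Predictions that rank animals perfectly but are twice as dispersed as the truth.
r, b = accuracy_and_slope([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(r, b)  # 1.0 0.5
```

In this toy case the ranking is perfect (r = 1) while b = 0.5 shows that the predictions are inflated, which is why the two criteria are reported separately.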
Prediction accuracy
Figure 1 shows the prediction accuracy of genomic breeding values under three genetic architectures, purely additive (A), additive-dominance-epistasis (ADI) and purely epistatic (I), using four statistical methods: GBLUP, BayesB, Bagging GBLUP and RF. When dominance and epistasis effects were present in the genetic architecture, the prediction accuracy decreased. These findings agree with the results of de Almeida Filho et al. (2016), who used both simulated and real data and found that as the contribution of dominance variance to total variation increased, prediction accuracy decreased regardless of the method used for model development. Nazarian and Gezan (2016) also pointed out that an increase in dominance variance leads to a decrease in prediction accuracy.
In another study of traits with non-additive genetic architecture, an increase of 20 percent in dominance variance arising from 30 QTL did not reduce the accuracy of genomic breeding values, because the dominance deviations of individual QTL are small and, taken together, act approximately like an independent error component (Meuwissen and Goddard, 2010). With 3 QTL, in contrast, there was a marked reduction of 13% in prediction accuracy, because the dominance deviations of individual QTL are large and are not modeled so well as part of the error variance. The authors also reported that these results may hold for epistatic effects (Meuwissen and Goddard, 2010).
Our results are consistent with those of Howard et al. (2014), who showed that when the genetic architecture is based on epistasis and the trait has low heritability, predictions are not very accurate for almost all genomic evaluation methods. However, when the heritability is high (i.e., 0.70), the highest mean prediction accuracy is 0.35. Other studies with real data have shown that when epistatic effects were added to the models, prediction accuracy did not improve or was even reduced (Lee et al., 2008; Lorenzana and Bernardo, 2009). These studies attributed the lack of improvement to a sample size too small to capture epistatic effects or, alternatively, to epistatic interactions contributing little to the genetic variance in their data sets.
In our study, one explanation for the decrease in prediction accuracy for traits with non-additive genetic architecture is that dominance and epistatic variances are nested within the additive variance. Thus, when dominance and epistatic variances are present, the contribution of additive variance to total genetic variance decreases while the error variance remains constant.
Figure 1. The prediction accuracy of genomic breeding values under three genetic architectures, purely additive (A), additive-dominance-epistasis (ADI) and purely epistatic (I), using four statistical methods: GBLUP, BayesB, Bagging GBLUP and Random Forest (RF).
Genetic architecture
Figure 2 shows the prediction accuracy of genomic breeding values across six generations for traits with a purely additive genetic architecture. The results showed that genetic architecture has a great impact on the prediction accuracy of genomic evaluation methods. When the genetic architecture was purely additive, the parametric and semi-parametric methods delivered better predictive performance than the non-parametric method, RF.
When the genetic structure of a trait is additive, the effects of causal variants are independent. Hence, the assumption of independent explanatory variables in parametric methods holds, and parametric methods show greater predictive power than non-parametric methods. A simulation study has also supported this statement (Howard et al., 2014).
Figure 2. The prediction accuracy of genomic breeding values during six generations for traits with only additive genetic architecture using four methods: GBLUP, BayesB, Bagging GBLUP and Random Forest (RF). The first generation is the reference population and generations 2 to 6 are validation populations. (x-axis: generation)
Among the parametric methods, BayesB outperformed the other methods for traits with additive genetic architecture. BayesB combines features of variable selection with those of shrinkage estimation, whereas GBLUP imposes only shrinkage on marker effects. Theoretically, a method (e.g., BayesB) that is flexible enough for any genetic architecture can perform well for a wide range of traits with different attributes (de los Campos et al., 2013). However, this is not always confirmed in empirical studies (Heslot et al., 2012). Although BayesB was more accurate than GBLUP in most simulation studies (Hayes and Goddard, 2001), empirical studies have reported that BayesB and GBLUP deliver similar prediction accuracy for most traits (Hayes et al., 2009).
When predictors (markers) are unstable or when there is high correlation between independent variables such as SNP markers, semi-parametric methods such as Bagging GBLUP are expected to provide more accurate predictions than GBLUP because they reduce the error variance (Abdollahi-Arpanahi et al., 2015). Our results showed little difference between GBLUP and Bagging GBLUP for traits with purely additive genetic architecture.
When only epistasis was present, the accuracy of the non-parametric method was higher than that of the parametric and semi-parametric methods (Figure 3); RF led to higher accuracy than the other methods. As mentioned above, parametric methods assume that the explanatory variables in the model are independent. When epistasis is present, however, the markers are dependent, which violates the assumptions of parametric methods. Non-parametric methods can capture epistatic effects without explicitly modeling the interactions. This superiority of RF persisted across generations.
Figure 3. The prediction accuracy of genomic breeding values during six generations for traits with only epistasis genetic architecture using four methods: GBLUP, BayesB, Bagging GBLUP and Random Forest (RF). The first generation is the reference population and generations 2 to 6 are validation populations. (x-axis: generation)
It has been reported that when the genetic architecture is based entirely on epistatic effects, non-parametric methods predict phenotypes better, and that when the genetic architecture is based entirely on additive effects, parametric methods are slightly better than non-parametric methods (Gianola et al., 2006). These findings are consistent with our study, as are those of Howard et al. (2014). In the presence of epistatic effects, BayesB had the best performance among the parametric methods.
Figure 4 shows the prediction accuracy of the different statistical methods for traits with additive-dominance-epistasis genetic architecture over six generations. When the genetic architecture was a combination of additive, dominance and epistatic effects, RF performed better than the other methods. Only an additive model was fitted in our study, even for traits with a combined additive, dominance and epistasis genetic architecture. Nevertheless, to achieve higher prediction accuracy it is essential to use a model that matches the genetic architecture of the trait, such as an additive-dominance-epistasis model. This requires further investigation.
Figure 4. The prediction accuracy of genomic breeding values during six generations for traits with additive-dominance-epistasis genetic architecture using four methods: GBLUP, BayesB, Bagging GBLUP and Random Forest (RF). The first generation is the reference population and generations 2 to 6 are validation populations. (x-axis: generation)
Regression coefficients
Figures 5-7 show the regression coefficients of true breeding values on predicted breeding values for the different statistical models across the three genetic architectures. The regression coefficients were less than one for all methods, indicating that all methods overestimated the marker variances.
Recently, a study on several egg production traits showed that all regression coefficients were less than one for both GBLUP and BayesC (Heidaritabar et al., 2016). In a study using a small dataset of genotype and phenotype records from Iranian Holstein cattle, the regression coefficient was less than one for all traits with GBLUP, but with BayesB it was larger than one for milk and protein yield (Mohammadi et al., 2016).
On the other hand, in a study of a Holstein dairy cattle population, the regression coefficient was larger than one for GBLUP (Charfeddine et al., 2013), whereas in a study of sheep it was less than one for GBLUP (Duchemin et al., 2012). These contradictory results may be due to differences in the data structure of the populations and in the statistical models used.
Figure 5. The regression coefficient of genomic prediction during six generations for traits with only additive genetic architecture using four methods: GBLUP, BayesB, Bagging GBLUP and Random Forest (RF). The first generation is the reference population and generations 2 to 6 are validation populations.
Figure 6. The regression coefficient of genomic prediction during six generations for traits with additive-dominance-epistasis genetic architecture using four methods: GBLUP, BayesB, Bagging GBLUP and Random Forest (RF). The first generation is the reference population and generations 2 to 6 are validation populations.
Figure 7. The regression coefficient of genomic prediction during six generations for traits with only epistasis genetic architecture using four methods: GBLUP, BayesB, Bagging GBLUP and Random Forest (RF). The first generation is the reference population and generations 2 to 6 are validation populations.
Conclusion
The results showed that the prediction accuracy of the statistical methods depends on the genetic architecture of quantitative traits. The accuracy of genomic breeding values from all methods decreased with increasing genetic distance from the reference population. For the purely additive genetic architecture, the parametric methods (BayesB and GBLUP) and the semi-parametric method (Bagging GBLUP) delivered better predictive performance than the non-parametric method (RF). However, when the genetic architecture was purely epistatic or a combination of additive, dominance and epistatic effects, RF outperformed the parametric and semi-parametric methods.
References
Abdollahi-Arpanahi, R., G. Morota, B. Valente, A. Kranis, and G. Rosa. 2015. Assessment of bagging GBLUP for whole genome prediction of broiler chicken traits. Journal of Animal Breeding and Genetics
Abdollahi-Arpanahi, R., A. Pakdel, A. Nejati-Javaremi, M. Moradi Shahrbabak, and F. Ghafouri-Kesbi. 2014. The relation between the genetic architecture of quantitative traits and long-term genetic response. Journal of Applied Genetics 55:373-381.
Buckler, E. S., J. B. Holland, P. J. Bradbury, C. B. Acharya, P. J. Brown, C. Browne, E. Ersoz, S. Flint-Garcia, A. Garcia, and J. C. Glaubitz. 2009. The genetic architecture of maize flowering time. Science 325(5941):714-718.
Charfeddine, N., S. T. Rodríguez-Ramilo, J. A. Jiménez-Montero, M. J. Carabaño, and O. González-Recio. 2013. Non parametric vs. gblup model for genomic evaluation with large reference population in holstein cattle. Interbull Bulletin (47)
de Almeida Filho, J., J. Guimarães, F. e Silva, M. de Resende, P. Muñoz, M. Kirst, and M. Resende. 2016. The contribution of dominance to phenotype prediction in a pine breeding and simulated population. Heredity 117(1):33-41.
de los Campos, G., J. M. Hickey, R. Pong-Wong, H. D. Daetwyler, and M. P. Calus. 2013. Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics 193(2):327-345.
De Los Campos, G., H. Naya, D. Gianola, J. Crossa, A. Legarra, E. Manfredi, K. Weigel, and J. M. Cotes. 2009. Predicting quantitative traits with regression models for dense molecular markers and pedigree. Genetics 182(1):375-385.
Duchemin, S., C. Colombani, A. Legarra, G. Baloche, H. Larroque, J.-M. Astruc, F. Barillet, C. Robert-Granié, and E. Manfredi. 2012. Genomic selection in the French Lacaune dairy sheep breed. Journal of dairy science 95(5):2723-2733.
Gianola, D., R. L. Fernando, and A. Stella. 2006. Genomic-assisted prediction of genetic value with semiparametric procedures. Genetics 173(3):1761-1776.
Hayes, B., P. Bowman, A. Chamberlain, and M. Goddard. 2009. Invited review: Genomic selection in dairy cattle: Progress and challenges. Journal of dairy science 92(2):433-443.
Hayes, B., and M. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157(4):1819-1829.
Heidaritabar, M., A. Wolc, J. Arango, J. Zeng, P. Settar, J. E. Fulton, N. P. O’Sullivan, J. W. Bastiaansen, R. L. Fernando, and D. J. Garrick. 2016. Impact of fitting dominance and additive effects on accuracy of genomic prediction of breeding values in layers. Journal of Animal Breeding and Genetics
Heslot, N., H.-P. Yang, M. E. Sorrells, and J.-L. Jannink. 2012. Genomic selection in plant breeding: a comparison of models. Crop Science 52(1):146-160.
Howard, R., A. L. Carriquiry, and W. D. Beavis. 2014. Parametric and nonparametric statistical methods for genomic selection of traits with additive and epistatic genetic architectures. G3: Genes, Genomes, Genetics 4(6):1027-1046.
Lee, S. H., J. H. van der Werf, B. J. Hayes, M. E. Goddard, and P. M. Visscher. 2008. Predicting unobserved phenotypes for complex traits from whole-genome SNP data. PLoS Genet 4(10):e1000231.
Liaw, A., and M. Wiener. 2002. Classification and regression by randomForest. R news 2(3):18-22.
Lorenzana, R. E., and R. Bernardo. 2009. Accuracy of genotypic value predictions for marker-based selection in biparental plant populations. Theoretical and applied genetics 120(1):151-161.
Meuwissen, T., and M. Goddard. 2010. Accurate prediction of genetic values for complex traits by whole-genome resequencing. Genetics 185(2):623-631.
Meuwissen, T., B. Hayes, and M. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:1819-1829.
Mohammadi, Y., M. Shariati, S. Zerehdaran, M. Razmkabir, M. Sayadnezhad, and M. Zandi. 2016. Comparison of accuracy genomic breeding values for production traits in Iranian Holstein Dairy Cattle using parametric and non-parametric methods. Journal of Animal Production
Nazarian, A., and S. A. Gezan. 2016. Integrating nonadditive genomic relationship matrices into the study of genetic architecture of complex traits. Journal of Heredity 107(2):153-162.
Pekelis, L. 2013. Classification And Regression Trees : A Practical Guide for Describing a Dataset. Bicoastal Datafest, Stanford University
Pérez, P., and G. de los Campos. 2014. Genome-wide regression & prediction with the BGLR statistical package. Genetics: genetics.114.164442.
Technow, F. 2013. hypred: Simulation of genomic data in applied genetics. R package
Table 2. The regression coefficient of genomic prediction under three genetic architectures, purely additive (A), additive-dominance-epistasis (ADI) and purely epistatic (I), using four methods: GBLUP, BayesB, Bagging GBLUP and RF.
Method | A | I | ADI |
GBLUP | 0.84 | 0.64 | 0.73 |
BayesB | 0.81 | 0.79 | 0.82 |
Bagging GBLUP | 0.85 | 0.65 | 0.73 |
RF | 0.70 | 0.69 | 0.73 |