EpiXcan online repository


This website provides a database of the data and results released with our EpiXcan manuscript, which describes the integration of epigenomic information into transcriptome prediction and a large-scale gene-trait association study (GTAS).

We propose the EpiXcan approach, which integrates epigenomic data into the prediction of gene expression from genotype data. Transcriptomes that have not been measured are ‘imputed’, or predicted, from genotypes using models trained on existing cohort data sets. The method incorporates epigenomic information into transcriptome prediction and also provides a framework for incorporating other types of profiles. The resulting predictors are used to analyze genome-wide association study (GWAS) summary results in a large-scale GTAS. In total, 58 GWAS summary results are analyzed with the fourteen trained predictor databases. The GTAS results reveal significant gene-trait associations, the tissue specificity of trait-correlated genes, trait-trait correlations based on the identified gene-trait associations, and gene regulation in disease. The EpiXcan source code is released in a repository on Bitbucket.

How to query

The EpiXcan database has three components. First, detailed association results are provided for each tissue and trait. These can be filtered using the "EpiXcan performance q-value" and "Adjusted association p-value" cut-offs. All genes in the database passed the performance q-value cut-off of 0.01. Second, the trait-associated gene heatmaps show each gene found to be significantly associated with a trait of interest at FDR < 5%. Third, correlation plots (opened by clicking the 'Correlations' button) show, for a given trait and tissue, the correlation for genes identified by EpiXcan, by PrediXcan, or by both (please see the introduction below). For a given trait and tissue, the genes identified uniquely by EpiXcan or by PrediXcan are indicated.
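
The filtering by the two cut-offs can be sketched in a few lines of Python. This is only an illustration, not the code behind the website: the field names performance_qvalue and adjusted_pvalue and the example rows are hypothetical, and the default cut-offs simply mirror the values quoted above (0.01 for the performance q-value and 5% for the FDR).

```python
def filter_associations(rows, max_performance_q=0.01, max_adjusted_p=0.05):
    """Keep the rows that pass both cut-offs.

    rows: iterable of dicts with the hypothetical keys
    'performance_qvalue' and 'adjusted_pvalue'.
    """
    return [
        row for row in rows
        if row["performance_qvalue"] <= max_performance_q
        and row["adjusted_pvalue"] < max_adjusted_p
    ]


# Made-up rows for illustration only; these are not results from the database.
example_rows = [
    {"gene": "GENE_A", "performance_qvalue": 0.001, "adjusted_pvalue": 0.003},
    {"gene": "GENE_B", "performance_qvalue": 0.008, "adjusted_pvalue": 0.20},
    {"gene": "GENE_C", "performance_qvalue": 0.050, "adjusted_pvalue": 0.01},
]

kept = filter_associations(example_rows)
print([row["gene"] for row in kept])
```

Tightening either argument, for example max_adjusted_p=0.01, narrows the list in the same way as lowering the corresponding cut-off on the query page.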

The results come from a large-scale GTAS performed with MetaXcan. pvalue is the original association significance from each GWAS result under the given tissue-specific model. The "Adjusted association p-value" (FDR) is adjusted across all the tissues and GWASs involved. Heatmaps show the genes most strongly associated with each trait across all related tissues. Because the study also compares the EpiXcan and PrediXcan methods, scatter plots of the z-scores from the two methods are provided. In these plots, grey dots denote genes that are not significantly associated with the trait in the tissue by either EpiXcan or PrediXcan; green dots (if any) denote genes that are significantly associated by both EpiXcan and PrediXcan; and blue and orange dots (if any) denote genes that are significant only by EpiXcan or only by PrediXcan, respectively. In the correlation plots, most genes are identified by both PrediXcan and EpiXcan, and the genes identified uniquely by one method are highlighted (the names of up to five genes with the largest absolute z-scores are shown). Furthermore, gene regulation plots show the trait-associated genes that are identified only by EpiXcan. Because these genes passed the FDR threshold of 0.01 and had high predictive performance (q-value ≤ 0.01), they are candidate trait-associated genes that could provide new insights into complex disease gene discovery. The predictor data are also available to users (see Download below).
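
The adjustment and the colour coding described above can be sketched as follows. Several points are assumptions rather than statements about our pipeline: the text only says "FDR", so the Benjamini-Hochberg procedure is used here as a common choice; the dictionary keys are hypothetical; the significance threshold is left as a required parameter; and unique genes are ranked by the z-score of the method that identified them.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: pvalues pools every test (all tissues and GWASs) in one list.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def dot_colour(fdr_epixcan, fdr_predixcan, threshold):
    """Colour of a gene in the correlation plot, following the legend above."""
    sig_epixcan = fdr_epixcan < threshold
    sig_predixcan = fdr_predixcan < threshold
    if sig_epixcan and sig_predixcan:
        return "green"   # significant by both methods
    if sig_epixcan:
        return "blue"    # significant only by EpiXcan
    if sig_predixcan:
        return "orange"  # significant only by PrediXcan
    return "grey"        # not significant by either method


def top_unique_genes(rows, colour, threshold, n=5):
    """Names of up to n method-specific genes with the largest |z|.

    rows: dicts with the hypothetical keys 'gene', 'z_epixcan',
    'z_predixcan', 'fdr_epixcan' and 'fdr_predixcan'.
    colour: 'blue' (EpiXcan only) or 'orange' (PrediXcan only).
    """
    z_key = {"blue": "z_epixcan", "orange": "z_predixcan"}[colour]
    unique = [
        row for row in rows
        if dot_colour(row["fdr_epixcan"], row["fdr_predixcan"], threshold) == colour
    ]
    unique.sort(key=lambda row: abs(row[z_key]), reverse=True)
    return [row["gene"] for row in unique[:n]]
```

Given one pooled list of p-values per method, bh_adjust would supply the fdr_epixcan and fdr_predixcan values, and top_unique_genes(rows, "blue", threshold) would return the EpiXcan-specific genes to label in a plot.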


You can download the complete PredictDBs that we generated for the different tissues. Click a link below to download the corresponding file. If you have any other questions, please feel free to contact us.

Cohort    Tissue                  Sample size   File size
CMC       Brain, DLPFC            467           506.5M
STARNET   Artery, aorta           508           397.2M
STARNET   Mammary, artery         524           408.1M
STARNET   Blood                   443           329.9M
STARNET   Adipose, subcutaneous   543           343.4M
STARNET   Adipose, visceral       503           342.3M
STARNET   Liver                   522           365.5M
STARNET   Muscle, skeletal        507           288.2M
GTEx      Artery, aorta           231           375.7M
GTEx      Blood                   307           328.4M
GTEx      Adipose, subcutaneous   320           390.4M
GTEx      Adipose, visceral       269           338.8M
GTEx      Liver                   130           256.2M
GTEx      Muscle, skeletal        413           374.7M
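
After downloading a file, a quick way to see what it contains is to list its tables and columns. The sketch below assumes that each PredictDB is a SQLite database, as is customary for PrediXcan-style prediction models; this page does not state the file format, so treat that as an assumption. No table or column names are presumed, and the file is opened read-only so that a mistyped path does not create an empty database.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return {table name: [column names]} for a SQLite file, opened read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        return {
            name: [col[1] for col in conn.execute(f'PRAGMA table_info("{name}")')]
            for name in names
        }
```

For example, list_tables("downloaded_model.db"), where the file name is hypothetical, returns a dictionary that maps each table in the file to its column names.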


If you use the EpiXcan method or databases in your research, please cite the paper:

Integrative transcriptome imputation reveals tissue-specific and shared biological mechanisms mediating susceptibility to complex traits. Wen Zhang, Georgios Voloudakis, Veera Manikandan Rajagopal, Ben Readhead, Joel Dudley, Eric E. Schadt, Johan L.M. Björkegren, Yungil Kim, John F. Fullard, Gabriel Hoffman, Panos Roussos*. manuscript.


If you have any further questions about using the EpiXcan databases or method, please contact us (Wen Zhang: weFightSomeSpamn.zhang1@mssm.edu, Panos Roussos: panFightSomeSpamagiotis.roussos@mssm.edu) for more information.