Data & analysis

Overview

HiGAP is divided into two parts. Part A assesses heterogeneity in quality control procedures and its effect on polygenic risk prediction. Part B focuses on post-GWAS pipelines and how different, potentially equally valid, choices affect the biological interpretation of GWAS results. Teams are strongly encouraged to participate in both parts of the study, although this is not a requirement. If you would like to participate in only one part, please indicate this on the sign-up page.

Part A: Polygenic risk prediction

Here we will provide you with instructions on how to access a dataset that includes raw genotype data (the target dataset) and several sets of GWAS summary statistics (the discovery datasets) that can be used for polygenic risk prediction. Your task will be to conduct quality control on the target dataset and then carry out polygenic risk score analyses. Part A can be conducted on a personal laptop or computer and is estimated to take a few days’ work at most. You will receive specific instructions on how to report back on your results.
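
For orientation, a polygenic risk score is commonly computed as a weighted sum of effect-allele dosages in the target dataset, with the weights taken from the discovery summary statistics. The sketch below is a minimal illustration under that assumption. The variant IDs, effect sizes, and function name are hypothetical, and it leaves out quality control, clumping, and p-value thresholding. It is not a prescribed HiGAP method; teams are expected to use their own pipelines.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages over variants present in both inputs.

    dosages: dict mapping variant ID -> effect-allele count (0, 1, or 2) for one individual
    weights: dict mapping variant ID -> effect size (e.g. beta) from a discovery GWAS
    """
    shared = dosages.keys() & weights.keys()  # ignore variants missing from either source
    return sum(dosages[v] * weights[v] for v in shared)


# Hypothetical toy inputs, for illustration only.
weights = {"rs1": 0.10, "rs2": -0.05, "rs3": 0.20}
individual = {"rs1": 2, "rs2": 1, "rs4": 0}

score = polygenic_score(individual, weights)
print(round(score, 4))
```

Only variants found in both the target and discovery data contribute to the score, which is one reason quality control choices on the target dataset can change the resulting prediction.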

Part B: Interpretation of GWAS results for ‘Disease X’

Here we will provide you with the results of a GWAS for ‘Disease X’. Your task will be to interpret these results, using your usual post-GWAS pipeline, and formulate a hypothesis on putative causal biological mechanisms. We do not require you to prove these causal mechanisms, as that would necessitate functional experiments. Part B can be conducted on a personal laptop or computer and is estimated to take a day’s work at most. You will receive specific instructions on how to report back on your results.

For both Part A and Part B, we can provide access to a computing cluster if needed.