Illumina genotyping arrays have powered thousands of large-scale genome-wide association studies over the last decade. Yet, because of the tremendous volume of Illumina genotyping data and the complicated genetic assumptions underlying it, processing and quality control of these data remain a challenge. Thorough quality control ensures the accurate identification of SNPs and is required for the correct interpretation of genetic association results. By processing genotyping data on more than 100,000 subjects from over ten major Illumina genotyping arrays, we have accumulated extensive experience in handling some of the most peculiar scenarios that arise in the processing and quality control of Illumina genotyping data. Here, we describe strategies for processing Illumina genotyping data from the raw data to an analysis-ready format, and we elaborate on the quality control procedures required at each processing step. High-quality Illumina genotyping datasets can be obtained by following our detailed QC strategies.
The detailed protocols can be found in the manuscript.