Towards a public breeding decision support system: Data analysis and management activities in CIMMYT's Global Wheat Program
The Global Wheat Program of CIMMYT is one of the largest public breeding programs in the world, comprising millions of lines/genotypes derived from thousands of crosses and evaluated using a shuttle breeding cycle and multi-environment testing. The germplasm is phenotyped for conventional traits (such as yield and grain quality) as well as non-conventional traits (physiological traits) under field and greenhouse conditions. The breeding germplasm is also screened with genome-wide markers (using Illumina SNP array, genotyping-by-sequencing, or DArTseq platforms) and/or multiple gene/QTL region-specific molecular markers (using the KASP platform). All genotyped samples are registered in the "DNA SampleTracker," a software system for tracking DNA samples developed at CIMMYT. In collaboration with the High Throughput Genotyping Platform project, the plant sample and data collection methods are optimized. Meanwhile, the extensive wheat genealogies and phenotypic information have been maintained in the International Wheat Information System and will be transferred to a new Enterprise Breeding System. Furthermore, several bioinformatics/statistical genetics methods for gene discovery and genomic prediction have been developed and used to optimize genomics-assisted selection. The wheat team is a member of the "Genomic Open-source Breeding Informatics Initiative (GOBII)," which aims to develop and implement genomic data management systems to enhance the capacity of breeding programs. Under this initiative, a new genomics database has been built and a pilot wheat version is being tested at CIMMYT. Several decision support tools are also under collaborative development, such as a Genomic Selection Pipeline based on Galaxy, Flapjack-based F1/line verification, and marker-assisted backcrossing tools. Additional tools are envisioned for the future, including a Cross-Assistor and a Selection-Assistor.
The ultimate aim is to seamlessly connect the genomic database, phenotypic database, and decision support tools to support the breeding selection process and to develop cultivars at an increased rate of genetic gain.