Human Pangenome bring your data (BYOD) analysis workshop
Overview
The human reference genome provides a universal coordinate system that specifies a standardized reference sequence for genes and their annotation. It is used to align sequence reads for variant calling in newly sequenced genomes. However, the current reference is composed of a handful of individual genomes, which do not necessarily represent the genetic diversity across world populations, and its use introduces reference allele bias. This is particularly pertinent for African populations, which are very genetically diverse. While the haploid, linear reference genome has formed the basis of all genetic variation studies, new technologies such as highly accurate long-read sequencing, coupled with novel computational tools that allow efficient de novo assembly of full human genomes, now make it possible to build a more representative human reference genome.
To capture the diversity that exists across genomes, the variation can be expressed not as a linear sequence but as a mathematical graph structure with multiple overlapping sequence paths, built from a collection of ethnically diverse genomes. Because variants observed within a group of individuals are represented as paths through the graph, they become part of the reference itself, allowing more accurate calling of both single nucleotide and structural variants.
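As a purely illustrative sketch (not the workshop's tooling or data), the idea of a graph with multiple overlapping sequence paths can be shown with a toy example in Python. The node IDs, sequences, and sample names below are all hypothetical; real pangenome graphs are built and stored with dedicated tools and file formats.

```python
# Toy variation graph: each node holds a sequence segment, and each genome
# is a path (an ordered walk) through the nodes.
# All node IDs, sequences, and sample names are invented for illustration.

nodes = {
    1: "ACGT",
    2: "A",        # one allele at a single nucleotide variant site
    3: "G",        # the other allele at the same site
    4: "TTC",
    5: "GATTACA",  # segment carried by only some genomes (a structural variant)
    6: "CC",
}

paths = {
    "linear_reference": [1, 2, 4, 6],
    "sample_A": [1, 3, 4, 6],      # differs from the reference at the SNV
    "sample_B": [1, 2, 4, 5, 6],   # carries the extra inserted segment
}


def spell(path):
    """Return the sequence obtained by walking the given node path."""
    return "".join(nodes[node_id] for node_id in path)


def edges(all_paths):
    """Return the set of directed edges used by at least one path."""
    return {
        (a, b)
        for walk in all_paths.values()
        for a, b in zip(walk, walk[1:])
    }


for name, walk in paths.items():
    print(name, spell(walk))
```

Nodes 1, 4, and 6 are shared by every path, so the three genomes overlap wherever they agree and branch only where they differ. Both the single nucleotide variant and the inserted segment are part of the graph itself rather than differences from a single linear sequence.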
We have used long-read sequencing data to generate high-quality de novo assemblies from a diverse set of African samples. We have combined these with other high-quality African genome assemblies based on long-read sequencing data to generate a genome graph structure that more accurately represents African genetic variation. While a population-specific genome reference graph is not representative of global genetic diversity, it can be useful for exploring population-specific genetic variation and can improve variant calling in samples from closely related populations.
We are currently inviting applications from African researchers interested in addressing specific scientific questions using pangenome graphs.
Training application
Competitive selection process
Skill level of training
Intermediate to advanced
Language
English
Type of training
Workshop
Venue
Southern Sun, Newlands, Cape Town, South Africa
Course date
21 October - 25 October 2024
Organisers
Tshinakaho Malesa, Karen Miga, Melissa Nel, Andrea Guarracino, Flavia Villani, Mohammed Farhat, Gerrit Botha, Kennedy Mwai Wambui, Chris Fields, Shaun Aron, Sumir Panji, Nicola Mulder.
Sponsors
Roche African Genomics Program, H3ABioNet, eLwazi Open Data Science Platform, Human Pangenome Reference Consortium.
Intended Audience
Africa-based researchers from funded African genomics projects, such as H3Africa, whose work focuses specifically on using NGS data for research and clinical applications within African populations.
Prerequisites
Nominees should meet the following requirements:
Have viewed the introductory lectures on pangenome graph building: https://www.youtube.com/playlist?list=PLcQ0XMykNhCSc8ucXrV1g70gRCXbdmiYU
Have viewed the second webinar series on practical applications of using a pangenome graph:
Be comfortable working on the command line in Unix/Linux and in an HPC-style environment
Be familiar with NGS file formats and tools, and with either variant calling or structural variant analysis, e.g. VCFtools, SAMtools
Have a research question they would like to learn how to address using the human pangenome graphs
Have a small human dataset they would like to use for the analysis workshop
Learning outcomes
Human Pangenome bring your data (BYOD) analysis workshop outcomes:
Be familiar with the methods used for building human pangenome graphs
Be able to visualise and work with human pangenome graphs
Be able to call variants against human pangenome graphs using tools such as Minigraph-Cactus, Giraffe and vg
Utilise the human pangenome graphs to do some preliminary analysis of their data
To apply, please Click HERE