
Training

Our training focuses on upskilling eLwazi project members and users in using the open data science platform and in any additional computational skills they may need. We work closely with the DS-I Africa consortium training working group, which is developing a data science curriculum.


The eLwazi Metadata Harmonisation Tool

The exponential growth of scientific data across disciplines has created a pressing need for robust, flexible tools that enable harmonisation and standardisation of metadata to ensure data interoperability and reusability. A major barrier to effective data sharing lies in the heterogeneity of data formats, vocabularies, and annotation practices, which can hinder discovery, integration, and…

Date: 11 June 2025


Leveraging publicly available DNA and RNA sequencing data for hypothesis-driven science in cancer biology and beyond

This seminar will focus on using publicly available DNA and RNA sequencing datasets to investigate biological hypotheses, taking breast cancer as a case example. To start, we will formulate the hypothesis for this case study, then discuss how to tailor the publicly available datasets for a preliminary exploration of the chosen topic and briefly review the relevant papers. We will walk through…

Date: 21 May 2025


Lessons from 20 years of Open Source Development

Computational biology has undergone a significant transformation since the advent of high-throughput sequencing, a pivotal breakthrough that democratized large-scale genomic analyses for researchers. However, even prior to this technological advancement, several of the most widely utilized tools in the field prioritized reliable software development practices, with data reproducibility being a…

Date: 23 April 2025


Pangenome-based structure deconvolution of the amylase locus

The adoption of agriculture triggered a rapid shift towards starch-rich diets in human populations. Amylase genes facilitate starch digestion, and increased amylase copy number has been observed in some modern human populations with high-starch intake, although evidence of recent selection is lacking. Here, using 94 long-read haplotype-resolved assemblies and short-read data from approximately…

Date: 13 September 2024


Exploring the application of pangenome reference graphs to rare disease diagnosis

Although the CHM13 reference represents a complete human genome, it lacks the full diversity of human haplotypes present in Africa. Analysis pipelines which map sequencing reads to a single linear reference may suffer from “reference genome bias”, where unmapped reads bias downstream analysis. The impact of reference genome bias in the clinical evaluation of genome sequencing data from African…

Date: 10 September 2024


Introduction to Dockstore.org: An “app” store for bioinformatics workflows

Dockstore.org is a repository of reusable bioinformatics analyses shared by the research community. Tools and workflows registered in Dockstore are written using languages that combine containers to reproduce the exact compute environment with a precise description of each step of a computational analysis. Dockstore currently supports several workflow…

Date: 28 June 2022


Introduction to the Gen3 Data Commons system

This workshop will provide an overview of the Gen3 Data Commons Framework. We will describe the core microservices used to create a data service that can harmonize data, facilitate access, and help new research projects identify relevant data. We will also give a demonstration using two of our current data commons.

Date: 23 June 2022


Introduction to Terra: A scalable platform for biomedical research

Terra is a cloud-native platform for biomedical researchers to access data, run analysis tools, and collaborate. This interactive workshop on Terra will teach you the skills you need to start working and collaborating securely in Terra. Specifically, you’ll learn about the architecture of Terra as it relates to cloud-based data sets, tools, and…

Date: 12 April 2022 - 13 April 2022
