Description
This course distils expert knowledge and skills honed by professionals in Health Big Data Science and Bioinformatics. You will learn about human biology and chemistry, genetics, and medicine, alongside the Big Data science skills needed to harness the avalanche of openly available data that we are only beginning to make sense of. We'll walk through the steps required to carry out Big Data analytics on real datasets, including Next Generation Sequencing data, in a healthcare and biological context: preparing the data, completing the analysis, interpreting and visualising the results, and sharing them.
Syllabus:
1. Genes and Data
- Introduction to the Course
- Introduction to Module
- DNA and Genes
- RNA and Proteins
- Transcription Process
- Transcription Animation
- Translation Process
- Translation Animation
- Data, Variables, and Big Datasets
- Working with cBioPortal - Genetic Data Analysis
- Working with cBioPortal - Gene Networks
2. Preparing Datasets for Analysis
- Introduction to Module
- Datasets and Files
- Data Sources
- Importance of Data Preprocessing
- Data Preprocessing Tasks
- Replacing Missing Values
- Data Normalization
- Data Discretization
- Feature Selection
- Data Sampling
- Principles of R
- R Language
- Jupyter Notebooks 101
3. Finding Differentially Expressed Genes
- Introduction to Module
- Overview of Feature Selection Methods
- Filter Methods
- Wrapper Methods
- Evaluation Schemes
- Selecting Differentially Expressed Genes
- Heatmaps
- R Scripts for Feature Selection
- Jupyter Notebooks 101
4. Predicting Diseases from Genes
- Introduction to Module
- Overview of Classification and Prediction Methods
- Classification Methods Based on Analogy
- Classification Methods Based on Rules
- Classification Methods Based on Neural Networks
- Classification Methods Based on Statistics
- Classification Methods Based on Probabilities
- Prediction Methods
- Evaluation Schemes
- Prediction Workflow
- R Scripts for Prediction
- Jupyter Notebooks 101
5. Determining Gene Alterations
- Introduction to Module
- Overview of Gene Alterations
- Genetic Mutations
- Finding Genetic Mutations
- Methylation
- Copy Number Alterations
- Genomic Alterations and Gene Expressions
- R Scripts for Gene Alterations
- Jupyter Notebooks 101
6. Clustering and Pathway Analysis
- Introduction to Module
- Overview of Clustering Methods
- Similarity Assessment
- Clustering with K-Means
- Density-Based Clustering
- Hierarchical Clustering
- Pathway Analysis
- Pathway Discovery
- Pathway Visualization
- R Scripts for Clustering and Pathway Analysis
- Jupyter Notebooks 101
- Concluding Remarks