Intro to Biomedical Machine Learning in Python
Welcome to the Bioinformatics Bootcamp: Intro to Biomedical Machine Learning in Python!
​
This two-part workshop is designed to prepare biomedical scientists to use the Python programming language for machine learning. Part I teaches fundamental Python programming and basic Python data science. Part II teaches machine learning fundamentals, building towards capstone projects involving real-world patient datasets.
​
This page contains all the lectures and materials needed to participate.
For any questions, please contact Henry Miller.
How to participate
​
Enrollment link: HERE
​
Asynchronously: You can watch the videos and complete the activities at your own speed. No enrollment is necessary.
​
Semi-synchronously: You watch the videos and complete the activities, keeping pace with the workshop. You will also get access to DataCamp. This requires that you enroll.
​
Synchronously: Same as semi-synchronous, except that you are invited to participate in the live online sessions. This requires that you enroll AND do the following:
​
Complete the (1) Introduction to Python and (2) Intermediate Python courses. You may skip these courses if you score above 60% on both the Python Programming assessment and the Data Manipulation with Python assessment. Once you have completed either the courses or the assessments, let Henry know and you will receive the invite link to the live sessions. Before the live sessions, please also download and install Anaconda. NOTE: You can complete these requirements and join synchronously until Module #5 (July 6th) or until the maximum capacity (100 people) is reached.
​
NOTE: Before attending office hours or emailing instructors for assistance, you must have already attempted the DataCamp assignments.
​
PART I: Python for Data Science
Module #1: Orientation and Introductory Python
In this lecture, we give an overview of the workshop and recap the introductory Python concepts covered in Introduction to Python.
​
Lecturer: Henry Miller
Date/time: June 8th, 2021 (5PM CST)
​
Materials:
- Slides: Here
- Code (download the Zip file): Here
​
Activity/Homework
​
Complete any remaining Module #1 challenge questions, practice on DataCamp, and complete the Python Programming assessment. ​
Module #2: Intermediate Python
In this lecture, we continue with intermediate Python concepts, such as lists, if...elif...else statements, functions, list comprehensions, dictionaries, and NumPy arrays.
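As a rough preview of how several of these concepts fit together, here is a minimal sketch using only the Python standard library. It is not taken from the lecture materials: the gene names, expression values, thresholds, and the function name `classify_expression` are all made up for illustration, and NumPy arrays are left out to keep the example dependency-free.

```python
# Hypothetical data: a dictionary mapping gene names to expression values.
expression = {"TP53": 8.2, "BRCA1": 3.1, "EGFR": 5.6}


def classify_expression(value, low=4.0, high=7.0):
    """Label a value as 'low', 'medium', or 'high' (thresholds are arbitrary)."""
    if value < low:
        return "low"
    elif value < high:
        return "medium"
    else:
        return "high"


# A list of the dictionary's keys, in insertion order.
genes = list(expression)

# A list comprehension that applies the function to every gene.
labels = [classify_expression(expression[gene]) for gene in genes]

# Pair each gene with its label in a new dictionary.
label_by_gene = dict(zip(genes, labels))
print(label_by_gene)
```

Changing the `low` and `high` arguments is a quick way to see how the if...elif...else branches respond to different cutoffs.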
​
Lecturer: Henry Miller
Date/time: June 15th, 2021 (5PM CST)
​
Materials:
- Code (download the Zip file): Here
​
Activity/Homework
​
Complete the first 6 challenge questions in the Module_2_challenge_problems.ipynb notebook in the Module 2 folder before the next lecture. Complete questions 7-10 before Module #4.
Module #3: Python for Data Science
In this lecture, we continue with data science libraries such as NumPy, pandas, Matplotlib, and SciPy. We finish the last part of Module #2 and cover all of Module #3.
​
Lecturer: Simon Levy
Date/time: June 22nd, 2021 (5PM CST)
​
Materials:
- Code (download the Zip file): Here
​
Activity/Homework
​
Complete the last 4 challenge problems in the Module_2_challenge_problems.ipynb notebook and all of the challenge problems in the Module_3_challenge_problems.ipynb notebook.
Module #4: Review Week
In this lecture, we wrap up Part I and finish going through the homework answers.
​
Lecturer: Simon Levy
Date/time: June 29th, 2021 (5PM CST)
​
Materials:
- Code (download the Zip file): Here
​
Activity/Homework
​
Begin the Supervised Learning with scikit-learn course on DataCamp and download Weka. ​
PART II: Intro to Biomedical Machine Learning
Module #5: Getting to Know Your Data
In this lecture, we begin Part II by discussing statistical considerations in ML and data tidying.

Lecturer: Daniel Montemayor, PhD
Date/time: July 6th, 2021 (5PM CST)

Materials:
- Code (download the Zip file): Here

Activity/Homework

Finish the Supervised Learning with scikit-learn course on DataCamp.
Module #6: Feature Selection and Parsimony
In this lecture, we discuss parsimony and feature selection in Python.
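To give a flavor of what feature selection means in code, the sketch below shows one simple filter approach: rank each feature by the absolute Pearson correlation with the outcome and keep the top k. This is only an illustration under stated assumptions, not the method used in the lecture; the feature names, values, and the helper names `pearson` and `top_k_features` are hypothetical, and only the standard library is used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise this divides by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def top_k_features(features, outcome, k):
    """Return the names of the k features most correlated with the outcome."""
    scores = {name: abs(pearson(values, outcome))
              for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up example data: three candidate features and one outcome.
features = {
    "age": [40, 50, 60, 70, 80],
    "marker_a": [1.0, 2.1, 2.9, 4.2, 5.0],
    "noise": [3, 1, 4, 1, 5],
}
outcome = [1.1, 2.0, 3.2, 3.9, 5.1]

selected = top_k_features(features, outcome, k=2)
print(selected)
```

Keeping fewer, more informative features is one way to favor a parsimonious model; with this toy data the weakly correlated `noise` column is the one dropped.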
​
Lecturer: Daniel Montemayor, PhD
Date/time: July 13th, 2021 (5PM CST)
​
Materials:
- Code (download the Zip file): Here
​
Activity/Homework
​
Finish the Supervised Learning with scikit-learn course on DataCamp and complete the Module #6 homework.
Module #7: Classification
In this lecture, we discuss classification models.
​
Lecturer: Daniel Montemayor, PhD
Date/time: July 20th, 2021 (5PM CST)
​
Materials:
- Code (download the Zip file): Here
​
Activity/Homework
​
Finish the Supervised Learning with scikit-learn course on DataCamp.​
Module #8: Regression
In this lecture, we discuss regression models.
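As a rough illustration of what the simplest regression model does, the sketch below fits a straight line to one predictor by ordinary least squares, using only the Python standard library. It is not taken from the lecture materials: the function name `fit_line` and the data are hypothetical, and real analyses would more likely use a library estimator.

```python
from statistics import mean


def fit_line(x, y):
    """Fit y = slope * x + intercept by ordinary least squares.

    Assumes x contains at least two distinct values.
    """
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up data that lies exactly on the line y = 2x + 1.
dose = [1, 2, 3, 4, 5]
response = [3, 5, 7, 9, 11]

slope, intercept = fit_line(dose, response)
predicted = slope * 6 + intercept  # prediction for a new dose of 6
print(slope, intercept, predicted)
```

Because the toy data has no noise, the fitted slope and intercept recover the generating line; with real measurements the fit would only approximate the underlying trend.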
​
Lecturer: Daniel Montemayor, PhD
Date/time: July 27th, 2021 (5PM CST)
​
Materials:
- Code (download the Zip file): Here
​
Activity/Homework
​
Complete the Module #8 homework assignment.