Dr Ben Goudey

WORKSHOP: Applying machine learning in life sciences: what does it mean and how to avoid common traps

This workshop will provide a high-level introduction to machine learning: what it is, its advantages and disadvantages compared to traditional modelling approaches, and how its use may affect experimental design. We’ll contrast a few commonly used algorithms for constructing predictive models to give a flavour of the different trade-offs. Most of the workshop will focus on common pitfalls in machine learning model construction and evaluation that lead to overly optimistic results. We will discuss how and why such errors arise and strategies to avoid them.

Participants should come away understanding when machine learning should and should not be used, knowing how to implement a very basic machine learning pipeline in Python using scikit-learn, and better able to critically evaluate the use of machine learning in the life sciences literature.
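As a taste of the kind of pitfall the workshop covers, the sketch below is illustrative only and is not taken from the workshop materials. It assumes pure-noise simulated data (so there is nothing real to learn) and compares two ways of evaluating the same model: selecting features on the full dataset before cross-validation, which leaks label information into the evaluation, versus placing the selection step inside a scikit-learn pipeline so it is refit on each training fold. The dataset sizes, the choice of `SelectKBest` and `LogisticRegression`, and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Simulated data with no real signal: random features and random labels.
# (Sizes are arbitrary; many features and few samples mimic genomics-style data.)
rng = np.random.default_rng(0)
n_samples, n_features = 100, 2000
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, 2, size=n_samples)

# Pitfall: feature selection sees every label, including those of samples
# later used for testing, so the cross-validated accuracy is inflated.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky_acc = cross_val_score(
    LogisticRegression(max_iter=1000), X_leaky, y, cv=5
).mean()

# Safer: scaling and feature selection live inside the pipeline, so they are
# refit on the training portion of each fold only.
pipeline = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=20),
    LogisticRegression(max_iter=1000),
)
honest_acc = cross_val_score(pipeline, X, y, cv=5).mean()

print(f"Selection before cross-validation: {leaky_acc:.2f}")
print(f"Selection inside the pipeline:     {honest_acc:.2f}")
```

Because the labels are random, an honest estimate should sit near chance (about 0.5), while the leaky estimate typically looks much better than the data can justify.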

Requirements: Internet-enabled laptop with Python installed. Participants are assumed to have some programming experience in Python. No prior machine learning or statistical knowledge is required.

Relevance: Relevant to those with some computing experience wanting to learn how to implement a machine learning pipeline. This workshop will make use of simulated and publicly available genomics and demographics data. However, the content of the workshop will be directly applicable across a wide range of application areas.

Dr Benjamin Goudey

Research Fellow, School of Computing and Information Systems, The University of Melbourne

Dr Benjamin Goudey is a Research Fellow in the School of Computing and Information Systems at the University of Melbourne, focusing on genomics and predictive modelling. Dr Goudey completed his PhD on novel methods for analysing genome-wide association studies in 2016 at the University of Melbourne. From 2013 to 2021, Dr Goudey was a Research Scientist at IBM Research, where he applied his expertise in machine learning and applied statistics to a range of challenges across the life sciences, from the prediction of cochlear implantation outcomes through to the development of polygenic risk scores for autoimmune conditions.