This project lies at the intersection of Data Science and Computational Biology. It aims to provide a data-driven computational method and a standalone tool for scoring cell identity that will be i) easy to use and ii) integrated into open-source development projects such as R/Bioconductor, making it accessible for further development and for integration into custom biological data analysis workflows.

You will explore, evaluate and compare both supervised and unsupervised learning methods for modelling high-dimensional gene expression data from lab-engineered and standard somatic cell types.

Duration and Type

  • 12-week summer scholarship over the New Zealand summer of 2021/2022


Prerequisites

  • At least basic skills in statistics, data mining and data visualisation; intermediate-level programming skills in R (or Python).

Supervisor and contact

  • Katerina Taskova
  • Send CV and transcript by mail to Katerina Taskova