Data science and machine learning are at the basis of many discoveries across various scientific domains. By employing machine learning models, we are able to discover and investigate more complicated patterns than is possible when relying on human expertise alone. However, machine learning techniques are not easy to wield: they require substantial data and training time to be applied adequately and for their results to be interpreted correctly.
In this talk, I will address several developments that aim to further automate the data science pipeline and assist data owners in applying appropriate models to their data, in particular AutoML, meta-learning and OpenML. The field of Automated Machine Learning (AutoML) develops tools that help domain scientists and experts apply machine learning to their data. The field of meta-learning develops techniques that leverage knowledge from previous experience and allow adequate models to be built from less data. During my PhD, we developed OpenML, an online experiment database for storing results from previous machine learning experiments. I will elaborate on the knowledge that we can gain from these stored results, and on how it can be applied to further automate the data science pipeline across research domains.