This repository contains my projects for the CAS in Applied Data Science. The learning goals of the programme are:
- being familiar with different data sources and data types, and being able to develop data management plans;
- being able to describe, extract and present scientific knowledge from data by application of statistical methods;
- being able to process data with machine learning tools and methods;
- being familiar with best practices for data management, analytics and science;
- being able to analyse and communicate data science challenges and use a wide range of data science tools and methods;
- being able to perform deep learning for a wide range of tasks.
This project examines the amount and frequency of drug and substance consumption in a representative sample of 1003 young adults around the age of 20. According to the self-medication hypothesis, substance use develops as a way of coping with stress in the absence of adequate solutions and meaningful social relationships. Accordingly, an SVM algorithm is used to predict the amount of consumption and co-consumption of a broad range of drugs (including psychotropic drugs) from the participants' reports of social and school problems during their development. This is an unpublished collaborative project with the Jacobs Centre for Productive Youth Development, which generously provided the data for the analysis. For data protection reasons, the data is not publicly available.
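Because the study data cannot be shared, the following is only a minimal sketch of how an SVM could be fit to this kind of problem. It assumes scikit-learn is installed, frames the task as binary classification (the project itself may have used a regression or multi-class setup), and uses randomly generated stand-in data. The number of predictors, the target definition and all variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data: NOT the study data, which is not public.
n_participants = 1003  # sample size taken from the project description
n_predictors = 5       # hypothetical number of social/school problem scores
X = rng.normal(size=(n_participants, n_predictors))

# Hypothetical binary target (e.g. low vs. high consumption), generated from
# the first two predictors plus noise so that there is some signal to learn.
noise = rng.normal(scale=0.5, size=n_participants)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)

# Hold out a quarter of the participants for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Standardise the predictors, then fit an SVM with an RBF kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {test_accuracy:.2f}")
```

The scaling step is included in the pipeline because SVMs are sensitive to the scale of their input features. The accuracy printed here says nothing about the real data.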
Machine learning algorithms can be powerful diagnostic tools that support doctors and nurses around the world in making appropriate treatment decisions. This project compares the accuracy of two neural networks in classifying four different tumor types. Specifically, a fully connected neural network and a convolutional neural network (CNN) were applied to MRI images generated from various types of sequences.
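One structural difference between the two network types can be illustrated by counting the trainable parameters of a first layer. The sketch below uses plain Python. The image size, number of hidden units, number of filters and kernel size are assumptions chosen for illustration, not the values used in the project.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


def conv2d_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights plus biases of one 2D convolutional layer with square kernels."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


# Assumed input: a 128 x 128 grayscale MRI slice (hypothetical size).
height, width, channels = 128, 128, 1

# Hypothetical first layers of the two architectures.
dense_count = dense_params(height * width * channels, n_units=64)
conv_count = conv2d_params(channels, out_channels=32, kernel_size=3)

print(f"Fully connected first layer: {dense_count:,} parameters")
print(f"Convolutional first layer:   {conv_count:,} parameters")
```

Under these assumptions the fully connected layer has 1,048,640 parameters and the convolutional layer has 320. The convolutional layer reuses the same small kernels across the whole image, which is one reason CNNs are a common choice for image classification.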
Prerequisites:
- A Jupyter Notebook App (e.g. provided by the Anaconda Python Distribution or by Docker)
