**AXIVU**

157 Burwood Rd.

Suite #314

Hawthorn, Victoria

This course will enable the student to:

- set up and install the Python environment on their computer
- program in Python
- perform data extract-transform-load operations using the pandas library
- perform statistical analysis and data visualisation using pandas, NumPy and SciPy
- perform machine learning using scikit-learn

For this course, students only need to have a background in high school mathematics.

Students are only required to bring their laptops.

This course will be delivered in three classes of three hours each.

*Class 1: 3 hours*

By the end of this class, the student

- Is able to install Python and the required software libraries, including pandas, scikit-learn, NumPy and matplotlib
- Can write a basic Python program that has the following elements:
  - Variables and operations on them
  - Loops
  - Conditional statements
  - Methods
  - Exception handling
  - File input/output operations
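
As a rough illustration of the program elements listed above, here is a minimal sketch of the kind of program a student might write in Class 1. It is not course material: the function name `mean_of_positives` and the file name `axivu_numbers.txt` are hypothetical and chosen only for this example.

```python
import os
import tempfile


def mean_of_positives(values):
    """Return the mean of the positive numbers in `values`, or None if there are none."""
    total = 0.0              # variables and operations on them
    count = 0
    for v in values:         # loop
        if v > 0:            # conditional statement
            total += v
            count += 1
    try:                     # exception handling
        return total / count
    except ZeroDivisionError:
        return None


# File input/output: write a few numbers to a temporary file, then read them back.
# The file name is hypothetical.
path = os.path.join(tempfile.gettempdir(), "axivu_numbers.txt")
with open(path, "w") as f:
    f.write("3\n-1\n5\n")

with open(path) as f:
    numbers = [float(line) for line in f]

result = mean_of_positives(numbers)
print(result)
```

The function ignores the negative value and averages the remaining two; passing a list with no positive numbers exercises the exception-handling branch.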

- Understands the following basic principles of machine learning:
  - Data: dimensionality, formats, sources
  - Descriptive statistics
  - Data input and output vectors
  - Data models, prediction, and qualifying prediction accuracy
  - Data visualisation

*Class 2: 3 hours*

By the end of this class, the student

- Understands the pandas DataFrame and how it can be used to access and transform data
- Can access a publicly available data source such as Google's public datasets or Kaggle
- Can use Python to perform the following tasks:
  - Access and manipulate a CSV file using pandas
  - Perform basic predictions using basic regression models
- Understands data outliers and can treat null values appropriately
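
The Class 2 tasks above could look something like the following sketch. It assumes a tiny made-up dataset (the columns `hours` and `score` are hypothetical) held in a string so the example runs without downloading anything, drops rows with null values as one simple treatment among several, and uses NumPy's least-squares line fit as a stand-in for a basic regression model.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data standing in for a downloaded CSV file.
# The third row has a missing score.
csv_text = """hours,score
1,52
2,54
3,
4,58
5,60
"""

# Access the CSV data as a pandas DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# One simple way to treat null values: drop the incomplete rows.
# (Filling with a mean or median is another common choice.)
clean = df.dropna()

# Fit a straight line score = slope * hours + intercept by least squares.
slope, intercept = np.polyfit(clean["hours"], clean["score"], deg=1)

# Use the fitted line to predict the score for an unseen value of hours.
predicted = slope * 6 + intercept
print(round(predicted, 1))
```

In a real exercise the string would be replaced by a path to a CSV file, and the choice between dropping and filling null values would depend on the data.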

*Class 3: 3 hours*

By the end of this class, the student

- Can use Python to perform the following tasks:
  - For a given DataFrame, perform dimensionality reduction using principal component analysis
  - Generate a new reduced-dimensionality dataset and split the data into a training and a test set
  - Train one of the machine learning models in scikit-learn and evaluate its accuracy
  - Visualise the outcome of the training experiment and draw conclusions
- Will create a new profile in Kaggle and submit a kernel that yields reasonably accurate predictions
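
The Class 3 workflow above might be sketched as follows. This is only an illustration under stated assumptions: it uses the iris dataset bundled with scikit-learn rather than a Kaggle dataset, picks logistic regression as the model, and keeps two principal components; none of these choices come from the course itself. The visualisation step is left out to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: the bundled iris dataset stands in for the student's own data.
X, y = load_iris(return_X_y=True)

# Dimensionality reduction: project the 4 original features onto 2 principal components.
# This follows the order in the outline; in practice PCA is often fitted on the
# training set only, to avoid leaking information from the test set.
X_reduced = PCA(n_components=2).fit_transform(X)

# Split the reduced dataset into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X_reduced, y, test_size=0.25, random_state=0, stratify=y
)

# Train one scikit-learn model (logistic regression is an arbitrary choice here).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate accuracy on the held-out test set.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```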

One-hour interactive preview session on data science and how to use Python to unravel the secrets of data.