AXIVU © 2022


157 Burwood Rd.

Suite #314

Hawthorn, Victoria

Learning Objectives


This course will enable the student to:

  1. set up and install the Python environment on their computer
  2. program in Python
  3. perform data extract-transform-load operations using the pandas library
  4. perform statistical analysis and data visualisation using pandas, NumPy and SciPy
  5. perform machine learning using scikit-learn




For this course, students only need to have a background in high school mathematics.




Students are only required to bring their laptops.


Course Outline


This course will be delivered in three classes, each lasting 3 hours.


Class 1: 3 hours


By the end of this class, the student

  1. Is able to install Python and the required software libraries, including pandas, scikit-learn, NumPy and Matplotlib
  2. Can write a basic Python program that has the following elements:
    • Variables and operations on them
    • Loops
    • Conditional statements
    • Methods
    • Exception handling
    • File input/output operations
  3. Understands the following basic principles of machine learning:
    • Data: dimensionality, formats, sources
    • Descriptive statistics
    • Data input and output vectors
    • Data models, prediction, and quantifying prediction accuracy
    • Data visualisation
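
As an illustration of the program elements listed for this class, the following is a minimal sketch that combines variables, a loop, a conditional statement, functions, exception handling and file input/output. The function names, the file name `scores.txt` and the sample scores are invented for this example and are not part of the course material.

```python
import os
import tempfile


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("cannot take the mean of an empty list")
    return sum(values) / len(values)


def classify(score, pass_mark=50):
    """Label a score as 'pass' or 'fail' using a conditional statement."""
    if score >= pass_mark:
        return "pass"
    return "fail"


def write_scores(path, scores):
    """Write one score per line to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for score in scores:
            handle.write(f"{score}\n")


def read_scores(path):
    """Read scores back, skipping any line that is not a valid number."""
    scores = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            try:
                scores.append(float(line))
            except ValueError:
                # Exception handling: ignore malformed lines and carry on.
                continue
    return scores


# Hypothetical sample data, written to a temporary directory.
scores = [72, 45, 88, 51]
path = os.path.join(tempfile.mkdtemp(), "scores.txt")
write_scores(path, scores)

# Append a malformed line to show the exception handler at work.
with open(path, "a", encoding="utf-8") as handle:
    handle.write("not a number\n")

loaded = read_scores(path)
labels = [classify(score) for score in loaded]
average = mean(loaded)
print(loaded, labels, average)
```

The malformed line is skipped rather than stopping the program, which is the usual reason for wrapping the conversion in `try`/`except`.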


Class 2: 3 hours


By the end of this class, the student

  1. Understands the pandas DataFrame and how it can be used to access and transform data
  2. Can access publicly available data sources such as the Google databases and Kaggle
  3. Can use Python to perform the following tasks:
    • Access and manipulate a CSV file using pandas
    • Perform basic predictions using basic regression models
  4. Understands data outliers and can treat null values appropriately
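
The tasks for this class can be sketched roughly as follows: read a CSV into a pandas DataFrame, drop rows with null values, filter outliers, and fit a simple linear regression. The column names `hours` and `score` and the data are made up for this example. The interquartile-range rule is only one common way to flag outliers, and dropping nulls is only one possible treatment. The CSV is read from an in-memory string so that the sketch runs without an external file.

```python
import io

import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical CSV content: one missing score and one obvious outlier.
csv_text = """hours,score
1,15
2,25
3,
4,45
5,55
6,650
"""

# Read the CSV into a DataFrame; the empty field becomes NaN.
df = pd.read_csv(io.StringIO(csv_text))

# Treat null values: here, simply drop rows with a missing score.
clean = df.dropna(subset=["score"])

# Flag outliers with the interquartile-range (IQR) rule.
q1 = clean["score"].quantile(0.25)
q3 = clean["score"].quantile(0.75)
iqr = q3 - q1
within_range = clean["score"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = clean[within_range]

# Fit a basic regression model: score as a linear function of hours.
model = LinearRegression()
model.fit(clean[["hours"]], clean["score"])

# Predict the score for the row whose value was missing.
prediction = model.predict(pd.DataFrame({"hours": [3]}))[0]
print(clean)
print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print(f"predicted score for 3 hours: {prediction:.2f}")
```

Whether a null value should be dropped, filled with a mean or median, or predicted as above depends on the dataset and on what the missing value means.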


Class 3: 3 hours


By the end of this class, the student

  1. Can use Python to perform the following tasks:
    • For a given DataFrame, perform dimensionality reduction using principal component analysis
    • Generate a new reduced-dimensionality dataset and split the data into a training and a test set
    • Train one of the machine learning models in scikit-learn and evaluate its accuracy
    • Visualise the outcome of the training experiment and draw conclusions
  2. Creates a new profile on Kaggle and submits a kernel that yields reasonably accurate predictions
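
The workflow for this class might look like the sketch below. It reduces a DataFrame to two principal components, splits the reduced data into training and test sets, trains a classifier and measures its accuracy. The iris dataset bundled with scikit-learn and the logistic regression model are assumptions chosen for illustration, since the outline names neither a dataset nor a model. The steps follow the order in the outline (reduce, then split), and the plotting step is left out to keep the sketch short.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load a small example dataset as a pandas DataFrame (assumed dataset).
iris = load_iris(as_frame=True)
features = iris.data      # DataFrame with four numeric columns
target = iris.target      # Series of class labels 0, 1, 2

# Standardise the features so each contributes equally to the PCA.
scaled = StandardScaler().fit_transform(features)

# Reduce four dimensions to two principal components.
pca = PCA(n_components=2)
reduced = pca.fit_transform(scaled)

# Split the reduced dataset into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(
    reduced, target, test_size=0.25, random_state=0, stratify=target
)

# Train a model (assumed choice) and evaluate it on the held-out data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

print("variance explained:", pca.explained_variance_ratio_.sum())
print("test accuracy:", accuracy)
```

In stricter practice the scaler and the PCA would be fitted on the training set only, for example inside a scikit-learn `Pipeline`, so that no information from the test set influences the transformation.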

A one-hour interactive preview session on data science and how to use Python to unravel the secrets of data
