Learning Objectives

 

This course will enable the student to:

  1. install and set up the Python environment on their computer, and program in Python
  2. perform data extract-transform-load operations using the pandas library
  3. perform statistical analysis and data visualisation using pandas, numpy and scipy
  4. perform machine learning using scikit-learn

 

Prerequisites

 

For this course, students only need to have a background in high school mathematics.

 

Requirements

 

Students are only required to bring their laptops.

 

Course Outline

 

This course will be delivered in three classes of 3 hours each.

 

Class 1: 3 hours

 

By the end of this class, the student

  1. Is able to install Python and the required software libraries, including pandas, scikit-learn, numpy and matplotlib
  2. Can write a basic Python program that has the following elements:
    • Variables and operations on them
    • Loops
    • Conditional statements
    • Methods
    • Exception handling
    • File input/output operations
  3. Understands the following basic principles of machine learning:
    • Data: dimensionality, formats, sources
    • Descriptive statistics
    • Data input and output vectors
    • Data models, prediction, and quantifying prediction accuracy
    • Data visualisation
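
As an illustration of the Python elements listed for this class, the following is a minimal sketch of the kind of program a student might write. The file name, the function names and the numbers are made up for the example and are not part of the course material.

```python
import os
import tempfile


def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:                       # conditional statement
        raise ValueError("no values to average")
    total = 0.0                          # variable
    for value in values:                 # loop
        total += value                   # operation on a variable
    return total / len(values)


def read_numbers(path):
    """Read one number per line from a text file, skipping unreadable lines."""
    numbers = []
    with open(path, encoding="utf-8") as handle:    # file input
        for line in handle:
            try:
                numbers.append(float(line))
            except ValueError:           # exception handling
                continue
    return numbers


# Write a small, made-up data file, then read it back.
path = os.path.join(tempfile.gettempdir(), "class1_numbers.txt")
with open(path, "w", encoding="utf-8") as handle:   # file output
    handle.write("2\n4\nnot a number\n6\n")

numbers = read_numbers(path)
mean = average(numbers)
print(f"Read {len(numbers)} numbers with mean {mean}")
```

The line that is not a number is skipped by the exception handler rather than stopping the program.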

 

Class 2: 3 hours

 

By the end of this class, the student

  1. Understands the pandas DataFrame and how it can be used to access and transform data
  2. Can access publicly available data sources such as the Google databases and Kaggle
  3. Can use Python to perform the following tasks:
    • Access and manipulate a CSV file using pandas
    • Perform basic predictions using basic regression models
  4. Understands data outliers and can treat null values appropriately
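
The tasks in this class can be sketched roughly as follows. This is only an illustration under stated assumptions: the CSV content and column names are invented, the data is read from a string instead of a downloaded file, null values are handled by dropping rows (one of several possible treatments), and the regression is a straight-line fit using `numpy.polyfit`.

```python
import io

import numpy as np
import pandas as pd

# Made-up data standing in for a downloaded CSV file.
csv_text = """hours_studied,exam_score
1,52
2,54
3,
4,58
5,60
"""

df = pd.read_csv(io.StringIO(csv_text))

# Count the null values in each column, then drop incomplete rows.
missing = df.isna().sum()
clean = df.dropna()

# Fit a straight line: exam_score ~ slope * hours_studied + intercept.
slope, intercept = np.polyfit(
    clean["hours_studied"].to_numpy(),
    clean["exam_score"].to_numpy(),
    deg=1,
)

# Use the fitted line to predict the score that was missing.
predicted = slope * 3 + intercept
print(f"Rows with a missing score: {missing['exam_score']}")
print(f"Predicted score after 3 hours: {predicted:.1f}")
```

Dropping rows is the simplest treatment; filling nulls with a mean or median is another common choice, and which is appropriate depends on the data.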

 

Class 3: 3 hours

 

By the end of this class, the student

  1. Can use Python to perform the following tasks:
    • For a given DataFrame, perform dimensionality reduction using principal component analysis
    • Generate a new reduced-dimensionality dataset and split the data into a training set and a test set
    • Train one of the machine learning models in scikit-learn and evaluate its accuracy
    • Visualize the outcome of the training experiment and draw conclusions
  2. Will create a new profile on Kaggle and submit a kernel that yields reasonably accurate predictions
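
The workflow for this class might look like the following minimal sketch. The choice of the iris dataset bundled with scikit-learn, the two principal components, the 25% test split and the logistic regression model are all assumptions made for illustration; the course may use different data and models. Plotting is left out to keep the example short.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small example dataset as a pandas DataFrame (4 numeric columns).
iris = load_iris(as_frame=True)
features = iris.data
labels = iris.target

# Reduce the four original columns to two principal components.
pca = PCA(n_components=2)
reduced = pd.DataFrame(pca.fit_transform(features), columns=["pc1", "pc2"])

# Split the reduced dataset into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(
    reduced, labels, test_size=0.25, random_state=0, stratify=labels
)

# Train a classifier and evaluate its accuracy on the held-out test set.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

This follows the order given in the outline, reducing the data first and splitting afterwards. In practice the PCA is often fitted on the training set only, so that no information from the test set influences the transformation.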

A one-hour interactive preview session on data science and how to use Python to unravel the secrets of data

1HourPrev: Data Science

 

AXIVU © 2020

AXIVU

34 Roger St.

Doncaster East

Victoria