High productivity analytics & data mining

October 11, 2013

Available course dates

  • 26-27 November 2013 - Canberra City - $2,350 + GST
  • 24-25 February 2014 - Canberra City - $2,350 + GST

Course Summary

This course introduces attendees to the Python programming language, with a focus on solving real-world business problems in analytics and data mining. The course takes a strongly practical approach: you learn by doing.

Python is one of the most popular programming languages in use today, with good reason: it is easier to learn and use than many comparable languages, it has a huge ecosystem of [modules that add advanced functionality](https://pypi.python.org/pypi), and it is becoming the language of choice for [interactive scientific analysis](http://ipython.org/).

Python was built with data manipulation and analysis in mind, so it is the perfect fit for anyone who needs to work with and understand data that’s not quite made for a spreadsheet.

Duration

2 days

Objectives

The course is designed so participants can immediately use what they learn upon returning to their work environment.

We cover the end-to-end workflow: data manipulation (getting your data out of the spreadsheet), data cleansing, analysis, and outputting the results back to a spreadsheet or a report.

Part 1. Introduction to data manipulation

Most of the data you’re used to working with is probably in an Excel spreadsheet. Excel is powerful, but there are some problems it just can’t help you with, like searching for a particular piece of text or pattern of words within your spreadsheet, or automating a simple process like duplicating a column.

Luckily, these types of problems are easy to solve in Python. But the first step is liberating your data from the confines of the spreadsheet and getting it into the right format.

We’ll cover how to set up dates and numbers and save them in CSV format for use in the next two parts of the course.
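As a rough illustration of the two spreadsheet problems mentioned above (finding a text pattern and duplicating a column), here is a minimal sketch using only the Python 3 standard library. The sample data, the column names, and the helper functions `find_rows` and `duplicate_column` are all hypothetical, invented for this example rather than taken from the course materials.

```python
import csv
import io
import re

# Hypothetical sample data standing in for a sheet saved from Excel as CSV.
SAMPLE_CSV = """date,customer,comment,amount
2013-10-01,Acme,Late payment reminder sent,1200.50
2013-10-02,Birch,Paid in full,300.00
2013-10-03,Cedar,Payment overdue - second reminder,87.25
"""


def find_rows(rows, column, pattern):
    """Return the rows whose `column` text matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [row for row in rows if regex.search(row[column])]


def duplicate_column(rows, source, target):
    """Return copies of the rows with `source` duplicated under `target`."""
    return [dict(row, **{target: row[source]}) for row in rows]


# Read the CSV text into a list of dictionaries, one per spreadsheet row.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Search the comment column for the whole word "reminder".
reminders = find_rows(rows, "comment", r"\breminder\b")

# Copy the amount column into a new column, leaving the original rows alone.
with_copy = duplicate_column(rows, "amount", "amount_original")

for row in reminders:
    print(row["date"], row["customer"])
```

With a real file you would pass `open("your_file.csv", newline="")` to `csv.DictReader` instead of the in-memory string; the file name here is only a placeholder.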

Part 2. Introduction to Python

def is_today_friday(day):
    # True only when the given day name is exactly 'Friday'
    return day == 'Friday'

print(is_today_friday('Monday'))  # prints False

If you can read the above snippet, congratulations: you already understand Python. As you can see, it is one of the most readable languages around, and as a result it is one of the easiest to learn.

This part of the course teaches participants how to build and run a Python program, how the language works, and how to translate the problem they’re trying to solve into a solution in Python.

Part 3. Data wrangling and mining

The final part of the course will introduce two concepts building on parts 1 and 2.

The first concept is unstructured data wrangling. How can you use Python to wrangle your data into a useful and usable format?

The second is data mining and analytics: once you’ve got your data into the right format, how do you mine it to extract insight?
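To make the two concepts concrete, here is a small sketch in Python 3, again using only the standard library. It first wrangles a few lines of free text into structured records, then computes a simple per-city average. The note format, the regular expression, and the function names `wrangle` and `mean_total_by_city` are assumptions made up for illustration; real data will need its own pattern and a more careful analysis.

```python
import re
import statistics

# Hypothetical free-text notes; the format is invented for illustration.
NOTES = """\
Order 1001 shipped to Canberra, total $250.00
order 1002 - shipped to Sydney; total $99.50
Order 1003 shipped to Canberra, total $410.25
(no order recorded)
"""

# Pull an order number, a destination city and a dollar total out of a line.
ORDER_PATTERN = re.compile(
    r"order\s+(?P<order_id>\d+).*?to\s+(?P<city>[A-Za-z]+)"
    r".*?\$(?P<total>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def wrangle(text):
    """Turn free-text lines into a list of structured records."""
    records = []
    for line in text.splitlines():
        match = ORDER_PATTERN.search(line)
        if match is None:
            continue  # skip lines that do not describe an order
        records.append({
            "order_id": int(match.group("order_id")),
            "city": match.group("city"),
            "total": float(match.group("total")),
        })
    return records


def mean_total_by_city(records):
    """Group the records by city and average the order totals."""
    totals = {}
    for record in records:
        totals.setdefault(record["city"], []).append(record["total"])
    return {city: statistics.mean(values) for city, values in totals.items()}


records = wrangle(NOTES)
summary = mean_total_by_city(records)

for city, mean_total in sorted(summary.items()):
    print(city, mean_total)
```

The wrangling step does the hard work here: once every line has become a dictionary with typed fields, the analysis is a short grouping loop.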

Audience

This course is suitable for professionals who need to work rapidly and accurately with complex data sets and spreadsheets to inform and resolve real-world business problems.

The course is designed for those seeking an introduction and hands-on training with analytics and data mining techniques or those looking to complement their existing analytical skill set with the high-productivity benefits delivered by Python.

Prerequisites

A minimal understanding of statistics (mean, median, variance, distribution, sampling) and spreadsheets (formula, date formatting, number formatting) is required.

Additional Notes

The course will be led by Felix Barbalet. Felix is well-known in the data-mining and analytics community in Canberra, and has worked for Department of Treasury, The Productivity Commission, and most recently, as a Senior Data Miner at the Australian Government’s Data Analytics Center of Excellence at the Australian Taxation Office.

His professional experience spans a range of technical disciplines including machine learning, computer science, math, statistics, business analytics and research economics. Felix is a Certified Scrum Master (CSM) and a Cloudera Certified Hadoop Developer and Administrator (CCDH, CCAH).

He is a regular speaker at analytics and data mining events such as IAPA and CeBIT Datacon.

Printed course materials are provided, and all software used is provided on USB, including installation instructions for managed environments.

While the course is confirmed to run, we reserve the right to cancel the course up to 1 week in advance due to insufficient bookings. In such an event, attendees will be notified and a refund provided.

Bookings

Bookings can be made by emailing training@pv.tl or through Eventbrite.