Machine Learning: Classification 1

The science of solving classification tasks

Ad-Hoc Course Registration:

  • Date: 11 – 14 January 2021
  • Time: 18.30 – 21.30
  • Venue: Menara Kadin Lantai 4, Jl. H. Rasuna Said, Jakarta Selatan
  • Investment: Rp. 5.200.000
  • Date: 11 – 14 January 2021
  • Time: 18.30 – 21.30
  • Investment: Rp. 2.600.000

REGISTER

Course details :

Learn to solve binary and multi-class classification problems using machine learning algorithms that are easily understood and readily interpretable. You will learn to write a classification algorithm from scratch, and appreciate the mathematical foundations underpinning the logistic regression and nearest neighbors algorithms.

We strongly recommend that you complete the Regression Models workshop prior to taking this course. Upon completion of this workshop, you will have the depth to develop, apply, and evaluate two highly versatile algorithms widely used today.

Schedule

  • Relating Probabilities to Odds

    Day 1

  • Logistic Regression

    Day 2

  • Practical Tips and Case Study

    Day 2

  • Performance Evaluation and Model Selection

    Day 3

  • Learn-by-Building

    Day 4

Course Producer

Samuel Chan

An RStudio-certified instructor and machine learning practitioner in the fields of marketing automation, fraud detection, finance and e-commerce. Samuel is Indonesia’s top-ranked Stack Overflow user in R (top 5% worldwide) for three years running, and boasts certifications from RStudio, Microsoft, MongoDB, Neo4J Database, Stanford University, Johns Hopkins University, among others.

Prior to Algoritma, he had 8 years of working experience, including a stint as in-house consultant to several publicly traded companies during his time in China, Japan and Singapore. He is today an active trainer and consultant for various companies in the financial industry. He has guest lectured at various campuses: Binus, NUS (National University of Singapore)’s The Logistics Institute, University of Indonesia, Universitas Gadjah Mada (UGM), Institute of Technology Bandung (ITB), Telkom University, and others. Courses he authored are also offered in Singapore through Ngee Ann Polytechnic.

Samuel is also among the first recipients of the Microsoft Professional Program Certificate in Data Science in Southeast Asia, having demonstrated proficiency in R, Python, Microsoft Azure, SQL / T-SQL, PowerBI and a list of other technologies, and among the first to be certified in RStudio’s program. He was a technical committee member and competition judge at Finhacks 2018, the largest Machine Learning competition of the year, organized by PT. Bank Central Asia (BCA) and DailySocial.

4-Day Workshop Modules

Syllabus: Classification in Machine Learning 1

Module 1: Logistic Regression


Relating Probabilities to Odds

  • Understanding Odds
  • Understanding Log of Odds
  • Plotting Odds and Log of Odds
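
As a rough illustration of the topics listed above, the relationship between probability, odds, and log of odds can be sketched in a few lines. The workshop itself is taught in R; this Python sketch is only for orientation, and the function names `odds`, `log_odds`, and `sigmoid` are hypothetical helpers, not taken from the course materials.

```python
import math


def odds(p):
    """Odds in favour of an event with probability p (requires 0 <= p < 1)."""
    return p / (1.0 - p)


def log_odds(p):
    """Natural log of the odds, also called the logit (requires 0 < p < 1)."""
    return math.log(odds(p))


def sigmoid(z):
    """Inverse of the logit: maps a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Probabilities below 0.5 give negative log-odds, above 0.5 positive.
for p in (0.1, 0.5, 0.8):
    print(f"p={p:.1f}  odds={odds(p):.3f}  log-odds={log_odds(p):+.3f}")
```

Odds are bounded below by zero, while log of odds ranges over the whole real line, which is what lets a linear model be fitted on that scale.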

Logistic Regression from First Principles

  • Sigmoidal Logistic Function
  • Key Assumptions of Sigmoid Function
  • Extra Proof: Intuition Behind the Sigmoid Function

Logistic Regression in Action

  • Binary Logistic Regression
  • Interpreting Coefficients
  • Interpretation Against Continuous & Discrete Variables

Practical Tips and Case Study

  • Flight Delay Prediction Examples
  • Customer Churn and Attrition Examples
  • Risk Modeling on Loans from Quarter 4, 2017

Performance Evaluation and Model Selection

  • AIC (Akaike Information Criterion)
  • Null Deviance and Residual Deviance
  • Hauck-Donner Effect

Module 2: Nearest Neighbours Algorithm


Closer Look at Classification

  • Probabilities vs Class responses
  • Cross-Validation and Out-of-Sample Error
  • Bias-Variance Trade-off
  • Confusion matrix (accuracy, sensitivity, specificity, & precision)
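
The four confusion-matrix metrics named above can be computed directly from the counts of true/false positives and negatives. The following is a minimal Python sketch on made-up labels (the workshop itself uses R); `confusion_counts` and `metrics` are hypothetical helper names, and the data are invented purely for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Standard definitions; assumes every denominator is non-zero."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # share of actual positives found
        "specificity": tn / (tn + fp),   # share of actual negatives found
        "precision": tp / (tp + fp),     # share of predicted positives that are right
    }


# Invented labels: 4 actual positives followed by 6 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(metrics(*confusion_counts(actual, predicted)))
```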

k-NN in Action

  • Characteristics of k-NN
  • Positives and Negatives
  • Diagnosing Breast Cancer with k-NN

Building Blocks of k-NN

  • Distance Function (Euclidean, Minkowski)
  • The k Parameter
  • Standardization vs Min-Max Normalization

k-NN from First Principles

  • Classifying Customer Segments with k-NN
  • Writing Your Own k-NN Classifier
  • Predicting Using Your Own k-NN Classifier
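
To give a sense of what writing a k-NN classifier from first principles involves, here is a minimal Python sketch (the workshop builds its version in R, so this is not the course's implementation). It assumes Euclidean distance and a simple majority vote; `min_max_scale`, `euclidean`, and `knn_predict` are hypothetical names, and the toy data are invented.

```python
import math
from collections import Counter


def min_max_scale(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    neighbours = sorted(
        zip(train_x, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Invented toy data: two well-separated groups labelled "A" and "B".
train_x = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
           [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_x, train_y, [1.1, 1.0], k=3))
```

In practice the features would be rescaled before computing distances, and any new observation would need the same scaling parameters as the training data. An odd k avoids tied votes when there are two classes.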

Academy Modules


Graded Quiz

Learning-by-Building Module (3 Points)

Logistic Regression on Credit Risk

  • Applying what you’ve learned, present a simple R Markdown document in which you demonstrate the use of logistic regression on the lbb_loans.csv dataset. Explain your findings wherever necessary and show the necessary data preparation steps. To help you through the exercise, consider the following questions throughout the document:
    • How do we correctly interpret the negative coefficients obtained from your logistic regression?
    • How do we know which of the variables are more statistically significant as predictors?
    • What are some strategies to improve your model?

Customer Segment Prediction

  • Applying what you’ve learned, present a simple R Markdown document in which you demonstrate the use of k-NN on the wholesale.csv dataset. Compare the k-NN to the logistic regression model and answer the following questions throughout the document:
    • What is your accuracy? Was the logistic regression better than k-NN in terms of accuracy? (recall the lesson on obtaining an unbiased estimate of the model’s accuracy)
    • Was the logistic regression better than our k-NN model at explaining which of the variables are good predictors of a customer’s industry?
    • List one strength and one disadvantage of each approach (k-NN and logistic regression)

Workshop Receivables:

  • Workshop Lecturer’s Notes

    Including 2x Course Books (PDF), HTML files, course transcripts (if any).

  • Highly-accelerated Learning

    Learn under the mentorship of our lead instructor and a team of qualified teaching assistants throughout the 4-day course.

  • Certification of Completion

    Show current and prospective employers that you’ve completed the course with a signed certificate of completion.

  • Quality Learning Environment

    We pay meticulous attention to the logistical details of our workshops: quality audio and visual setups, comfortable seating arrangements, small group sizes. Dinners are included for evening workshops.

  • Supplementary Materials

    Receive supplementary datasets to practice on, reference notes, working files (R Notebook or Jupyter Notebook), and other materials that will help you master the topics.

This workshop is recommended for:

The Machine Learning: Classification 1 workshop is an intermediate-level programming workshop best suited to R programmers who are taking their first steps into data science and machine learning.

Students are assumed to have a working knowledge of R and have completed the necessary pre-requisites. Consider taking the pre-requisite course or a beginner-level course instead if you have no prior programming experience or statistics knowledge.

Past Workshops in this Series:

Students work through tons of real-life examples using sample datasets donated by our team of mentors and corporate partners. We believe in a learn-by-building approach, and we employ instructors who are uncompromisingly passionate about your growth and education.

Part of the Machine Learning Specialization

This workshop is part of the Machine Learning Specialization offered by Algoritma Data Science Academy. Participants are rewarded with a certificate of completion upon meeting the passing criteria, and are encouraged to advance further in the respective data science specialization.