
Data Science Intermediate

3 Days

Course details:

This intermediate-level workshop is designed to help you master regression models and a closely related concept, generalized linear models. Regression models have been called the “workhorse” of data science and are often the first tool of choice for any machine learning task. Students will learn to fit, interpret and examine simple and multiple linear regression models as we build models patterned after real-life business cases.

The course will take students beyond fitting a regression line and interpreting regression coefficients. We’ll discuss concepts such as confounding variables and use various diagnostic tools to evaluate our models. Students will also replicate notable case studies and observe how linear methods can be as successful as, if not more successful than, many of the newer AI technologies at performing complex machine learning tasks.

Data is the new oil? No: Data is the new soil. ~ David McCandless

Please bring along:

  • 1x Laptop
  • Purchased ticket (from organizer’s website)

In a nutshell

  • Linear Regression

    Day 1

  • Correlation and Residuals

    Day 1

  • Ordinary Least Squares (OLS)

    Day 2

  • Multiple Regression

    Day 2

  • Logistic Regression

    Day 3

  • Artificial Neural Network

    Day 3

Date to be advised

Trainer

Samuel Chan

samuel@algorit.ma

Detailed Syllabus

Syllabus: Data Science Intermediate (I)

Regression Models

  • Why linear models are so often the first tool of choice for statisticians and data scientists
  • Linear models, visually
  • Estimating the regression coefficients
  • Regression through the origin

Residual Variation and Model Interpretation

  • Properties of the residuals
  • Estimating our error term
  • Total variability vs Regression variability
  • t-Statistics and P-value
  • Confidence level and Confidence Intervals

Least Squares Estimates in Multivariable Regressions

  • Regression with two regressors
  • Simulation: Multivariable regressions
  • Estimating the slope: 3 methods
  • Putting it together

Key Concepts in Regression

  • Adjustment
  • Model diagnostics
  • Model selection
  • ANOVA
  • Tips on picking and fitting a regression model

Logistic and Poisson Regression

  • Generalized linear model
  • The logit() and inverse logit() functions
  • A closer look at the sigmoid curve
  • Logistic regression for a binary response variable
  • Poisson regression examples
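
As a small taste of this module, the relationship between the logit and inverse logit functions can be sketched in a few lines. This is only an illustrative sketch, not workshop material: it is written in Python rather than the R used in class, and the names logit and inv_logit are hypothetical helpers chosen for this example.

```python
import math

def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse logit (the sigmoid curve): maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Round trip: applying inv_logit to logit(p) recovers the original probability.
for p in (0.1, 0.5, 0.9):
    x = logit(p)
    print(f"p={p:.1f}  log-odds={x:+.3f}  recovered p={inv_logit(x):.1f}")
```

A logistic regression fits a linear model on the log-odds scale, and the inverse logit converts its predictions back into probabilities. This simple form of inv_logit can overflow for very large negative inputs, so treat it as a teaching sketch.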

Extra tips and techniques

  • Applying regression analysis in the context of machine learning
  • Polynomial regression
  • Tools to evaluate model fit
  • Techniques to test for outliers, their leverage and influence

Artificial Neural Network

  • Cost Function
  • Backpropagation and feed-forward algorithms
  • Implementing ANN for optical character recognition
  • Fine-tuning our neural network’s parameters

This workshop will cost 3 workshop credits for subscribers. Non-subscribers are welcome to participate at a cost of IDR 3,000,000.

Workshop Receivables:

  • Workshop Lecturer’s Notes

    Includes 2 course books (PDF), HTML files, and course transcripts (if any).

  • Highly-accelerated Learning

    Learn under the mentorship of our lead instructor and a band of qualified teaching assistants throughout the 3-day course.

  • Certificate of Completion

    Show current and prospective employers that you’ve completed the course with a signed certificate of completion.

  • Quality Learning Environment

    We pay meticulous attention to the logistical details of our workshops: quality audio and visual setups, comfortable seating arrangements, and small group sizes. Dinners are included for evening workshops.

  • Supplementary Materials

    Receive supplementary datasets to practice on, reference notes, working files (R Notebook or Jupyter Notebook), and other materials that will help you master the topics.

Data Science Intermediate Series

Workshops in our Data Science Intermediate series are tailored to R programmers who are taking their first steps into data science and machine learning.

Students are assumed to have a working knowledge of R and ideally some proficiency in statistics, mathematics, or algebra. Consider taking our Data Science Fundamentals workshops instead if you have no prior programming experience or statistics knowledge.

Past Workshops in this Series:

Students work through tons of real-life examples using sample datasets donated by our team of mentors and corporate partners. We believe in a learn-by-building approach, and we employ instructors who are uncompromisingly passionate about your growth and education.