
Practical Statistics

An in-depth statistics course from a data science perspective

Ad-Hoc Course Registration:

  • Time: 18.30 – 21.30
  • Venue: Menara Kadin Lantai 4, Jl. H. Rasuna Said, Jakarta Selatan

  • Time: 18.30 – 21.30
  • Venue: Google Classroom

Course details:

Lay the statistical foundation for the more advanced machine learning theory covered later in the specialization by picking up the key ideas in statistical thinking. Learn to interpret correlations, construct confidence intervals, and apply other statistical principles that form the basis of many common machine learning models.

This 2-day course is optional for participants in the Data Visualization and Machine Learning Specialization and is intended for learners without prior experience in statistics.

Course schedule:

  • 5-Number Summary

    Day 1

  • Central Tendency & Variability

    Day 1

  • Standard Score and z-Score

    Day 1

  • Probabilities

    Day 2

  • Intervals

    Day 2

  • Inferential Statistics in Practice

    Day 2

Course Producer

Samuel Chan

An RStudio-certified instructor and machine learning practitioner in marketing automation, fraud detection, finance and e-commerce. Samuel has been Indonesia’s top-ranked Stack Overflow user in R (top 5% worldwide) for three years running, and holds certifications from RStudio, Microsoft, MongoDB, Neo4J Database, Stanford University and Johns Hopkins University, among others.

Prior to Algoritma, he had 8 years of working experience, including a stint as an in-house consultant to several publicly traded companies during his time in China, Japan and Singapore. Today he is an active trainer and consultant for various companies in the financial industry. He has guest lectured at various campuses, including Binus, NUS (National University of Singapore)’s The Logistics Institute, University of Indonesia, Universitas Gadjah Mada (UGM), Institute of Technology Bandung (ITB) and Telkom University. Courses he authored are also offered in Singapore through Ngee Ann Polytechnic.

Samuel is also among the first recipients in Southeast Asia of the Microsoft Professional Program Certificate in Data Science, having demonstrated proficiency in R, Python, Microsoft Azure, SQL / T-SQL, PowerBI and a list of other technologies, and among the first to be certified in RStudio’s program. He served as a technical committee member and competition judge at Finhacks 2018, the largest machine learning competition of the year, organized by PT. Bank Central Asia (BCA) and DailySocial.

2-Day Workshop Modules

Syllabus: Practical Statistics

Module 1: Descriptive Statistics


5-Number Summary

  • Mean, Median, and Mode
  • Measures of Central Tendency
  • Quantiles in R

Central Tendency & Variability

  • Visualizing Central Tendency
  • Variance and Covariance

Standard Score and z-Score

  • Standard Normal Curve
  • Central Limit Theorem
  • z-Score Calculation & Student’s T-test

Module 2: Inferential Statistics


Probabilities

  • Probability Mass Function
  • Probability Density Function
  • Expected Values
  • p-Values

Intervals

  • Confidence Intervals
  • Prediction Intervals

Inferential Statistics in Practice

  • Hypothesis Testing
  • Deriving Scientific Truths from Data
  • Case Study

Academy Modules


Tips & Techniques: R for Statisticians

  • Density Plots
  • Interpreting Box Plots (Box-and-Whisker)
  • Better Summary Statistics with skimr

Learning-by-Building Module (Not Graded)

Statistical Treatment of Retail Dataset

  • Using what you’ve learned, formulate a question and derive a statistical hypothesis test to answer the question. You have to demonstrate that you’re able to make decisions using data in a scientific manner.
    Examples of questions can be:

    • Is there a difference in profitability between standard shipment and same-day shipment?
    • Supposing there is no difference in profitability between the different product segments, what is the probability that we would obtain the current observation by pure chance alone?
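As an illustration of the kind of reasoning these questions call for, here is a minimal sketch of a two-sided permutation test for a difference in mean profit between two groups. It is written in Python using only the standard library, whereas the course itself works in R. The profit figures, the function name `permutation_p_value`, and its parameters are all hypothetical and are not taken from the course's retail dataset.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means.

    Under the null hypothesis the group labels are interchangeable, so we
    reshuffle the pooled values and count how often the shuffled difference
    is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical profit per order (illustrative values only).
standard_profit = [12.0, 15.5, 9.8, 14.2, 11.1, 13.7, 10.4, 12.9]
same_day_profit = [18.3, 16.9, 20.1, 17.4, 19.6, 15.8, 21.0, 18.8]

p_value = permutation_p_value(standard_profit, same_day_profit)
print(f"p-value: {p_value:.4f}")
```

A small p-value here would suggest that a gap this large is unlikely to arise from chance alone if the two shipment types were equally profitable. A t-test, as listed in the syllabus, is a common alternative when its assumptions are reasonable.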

Workshop Receivables:

  • Workshop Lecturer’s Notes

    Including 2x Course Books (PDF), HTML files, course transcripts (if any).

  • Highly-accelerated Learning

    Learn under the mentorship of our lead instructor and a team of qualified teaching assistants throughout the 2-day course.

  • Certificate of Completion

    Show current and prospective employers that you’ve completed the course with a signed certificate of completion.

  • Quality Learning Environment

    We pay meticulous attention to the logistical details of our workshops: quality audio and visual setups, comfortable seating arrangements, and small group sizes. Dinner is included for evening workshops.

  • Supplementary Materials

    Receive supplementary datasets to practice on, reference notes, working files (R Notebook or Jupyter Notebook), and other materials that will help you master the topics.

This workshop is recommended for:

The Practical Statistics workshop is designed for casual learners, working professionals and non-programmers who are taking their first steps into data science and machine learning.

Students are not assumed to have a working knowledge of R or prior proficiency in statistics / mathematics / algebra. As such, the workshop follows a gentle learning curve and emphasizes hands-on, one-to-one tutoring from our team of instructors and teaching assistants.

For more advanced material in statistical programming and machine learning, consider taking our intermediate-level workshops instead.

Past Workshops in this Series:

Students work through tons of real-life examples using sample datasets donated by our team of mentors and corporate partners. We believe in a learn-by-building approach, and we employ instructors who are uncompromisingly passionate about your growth and education.

Part of the Data Visualization and Machine Learning Specialization Track

This workshop is part of the two specialization tracks offered by Algoritma Data Science Academy. Participants who meet the passing criteria receive a certificate of completion and are encouraged to advance further in their respective data science specialization.