DATA 606 - Statistics & Probability - Spring 2021


Instructor: Jason Bryer, Ph.D.
Class Meetup: Wednesday 8:30pm to 9:30pm
Office Hours: Friday 12pm to 1pm & by appointment

Course Description

This course covers basic techniques in probability and statistics that are important in the field of data analytics. Discrete probability models, sampling from infinite and finite populations, statistical distributions, basic Bayesian statistics, and non-parametric statistical techniques for categorical data are covered in this course. Each of these statistical concepts will be applied in a variety of real-world scenarios through the use of case studies and customized data sets.

Course Learning Outcomes:

By the end of the course, students should be able to:

  • Understand the foundations of probability theory and perform basic probability calculations.
  • Build basic stochastic models for commonly encountered business problems.
  • Model situations involving uncertainty using appropriate probability distributions and conditional techniques.
  • Explore and summarize data using descriptive statistics.
  • Test hypotheses using classical and modern computational techniques.
  • Construct estimators and calculate intervals using classical and modern computational techniques.
  • Perform basic Bayesian statistical techniques for estimation and testing hypotheses.

Program Learning Outcomes addressed by the course:

  • Business Understanding. Learn when probabilistic techniques apply to certain categories of business problems, discuss the sorts of solutions that are possible, and understand the limitations of these techniques.
  • Foundational Math Skills. Explore and analyze data, build probabilistic and statistical models, construct estimators, and test hypotheses.
  • Predictive Modeling. Learn foundational techniques that underlie predictive modeling algorithms, such as Naïve Bayes.
  • Presentation. Complete and submit collaborative assignments using techniques from the course.

How is this course relevant for data analytics professionals?

Probabilistic techniques are the foundation of many data science applications from data exploration and visualization to outlier analysis, stochastic modeling, and data mining algorithms. This course will ensure that students have a strong understanding of these foundations.


Grade Distribution

Quality of Performance                       Letter Grade   Range (%)    GPA
Excellent - work is of exceptional quality   A              93 - 100     4.0
Excellent                                    A-             90 - 92.9    3.7
Good - work is above average                 B+             87 - 89.9    3.3
Satisfactory                                 B              83 - 86.9    3.0
Below Average                                B-             80 - 82.9    2.7
Poor                                         C+             77 - 79.9    2.3
Poor                                         C              70 - 76.9    2.0
Failure                                      F              < 70         0.0

How This Course Works

This course is conducted entirely online. Each week, resources will be made available, including readings from the textbooks and, occasionally, additional readings provided by the instructor. Most weeks will have homework assignments and labs to submit (some chapters take more than one week; see the schedule for details). You are also required to give a presentation and to post an introduction in the forum. You are expected to complete all assignments by their due dates.

You are expected to attend or watch every Meetup. I highly recommend attending the Meetups live, but I understand that may not be possible for everyone. Recordings will be made available by the next morning on the Meetups page. In addition to highlighting key concepts from each learning module, the Meetups will cover some topics that are not in the textbook. I also regularly make announcements in the Meetups that are important for succeeding in this course. At the end of each Meetup there will be a short reflective exercise. These exercises will contribute to your participation grade.

For the Meetup presentation, each student solves one of the suggested study problems from the weekly materials (not one of the graded homework problems) and presents the solution to the class. Each student must present one problem during the semester. To choose a problem, enter your name and the problem in the Google Spreadsheet. There is a maximum of three presentations per Meetup, and each presentation should be no more than five minutes. Prepare your slides or document (I suggest using R Markdown) so that they can be shared on the course website. Problems are assigned first come, first served, so any problem not already chosen by another student is available.

The course culminates in a presentation of your analysis of a dataset of your choosing. A number of time slots will be available for presenting. You are required to attend one presentation session, present your analysis, and provide peer feedback to the other students in that time slot. See the project for more information.

Accessibility and Accommodations

The CUNY School of Professional Studies is firmly committed to making higher education accessible to students with disabilities by removing architectural barriers and providing programs and support services necessary for them to benefit from the instruction and resources of the University. Early planning is essential for many of the resources and accommodations provided. Please see:

Online Etiquette and Anti-Harassment Policy

The University strictly prohibits the use of University online resources or facilities, including Blackboard, for the purpose of harassment of any individual or for the posting of any material that is scandalous, libelous, offensive or otherwise against the University’s policies. Please see:

Academic Integrity

Academic dishonesty is unacceptable and will not be tolerated. Cheating, forgery, plagiarism and collusion in dishonest acts undermine the educational mission of the City University of New York and the students' personal and intellectual growth. Please see:

Student Support Services

If you need any additional help, please visit Student Support Services:

Last updated on Tue Oct 17, 2017