Statistics and Data Exploration

Course Type: Foundation Core

Course Code: SLS2EC104

No. of Credits: 4

Semester and Year Offered: 1st Semester, 1st Year

Course Coordinator and Team: Saranika Sarkar

Email of course coordinator:

Pre-requisites: Students are expected to have completed a first-level course in Statistics covering Descriptive Statistics, Probability, Basics of Estimation, Tests of Significance based on the Normal Distribution, and Simple Regression.

Course Objectives/Description:

This course aims to train students in the application of statistical methods for data analysis. It focuses primarily on the empirical investigation of relationships by means of regression analysis and related methods. The course deals with data analysis in both the exploratory and the confirmatory framework, though the relative emphasis is on the former. Classical courses on Statistics sought to train students and practitioners in the art of ‘testing ideas with data’, based upon the theory of probability and statistical inference. This is the confirmatory framework. The exploratory framework, on the other hand, constitutes a different paradigm of learning from data in a theory-guided process, i.e. ‘getting ideas from data’ given knowledge of the subject matter. Confirmatory analysis is about summarising data for the testing of hypotheses, while exploratory analysis is about visualising data for the discovery of hypotheses. The pedagogic approach is that of ‘learning by doing’, enabling students to ‘think with data’ in order to argue with evidence.
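The contrast between the two frameworks can be illustrated with a minimal sketch. It is not course material: the income figures, the hypothesised mean of 30 and the helper names are invented for illustration, and Python is used here only for convenience. The exploratory step looks at centre, spread and shape before and after a log transformation; the confirmatory step summarises the same data into a single test statistic.

```python
import math
import statistics

# Hypothetical, made-up incomes (not course data); one large value gives a long right tail.
incomes = [12, 15, 18, 20, 22, 25, 30, 40, 60, 250]


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


def tail_ratio(summary):
    """Upper tail length over lower tail length; about 1 for a symmetric batch."""
    low, _, median, _, high = summary
    return (high - median) / (median - low)


def t_statistic(values, mu0):
    """One-sample t statistic for the null hypothesis that the mean equals mu0."""
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return (statistics.mean(values) - mu0) / standard_error


# Exploratory step: inspect centre, spread and shape, then try a transformation.
raw_summary = five_number_summary(incomes)
log_summary = five_number_summary([math.log(x) for x in incomes])
mean_raw = statistics.mean(incomes)
median_raw = statistics.median(incomes)

# Confirmatory step: test an assumed hypothesis (mean income = 30) on the same data.
t_value = t_statistic(incomes, mu0=30)

print("raw five-number summary:", raw_summary)
print("mean vs median:", mean_raw, median_raw)
print("tail ratio, raw vs log:", tail_ratio(raw_summary), tail_ratio(log_summary))
print("t statistic for H0: mean = 30:", round(t_value, 3))
```

In this invented batch the mean lies well above the median and the upper tail is far longer than the lower one, which suggests trying a log scale. The t statistic on its own would not reveal that the result is driven by a single extreme value.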

Course Outcomes:

On successful completion of this course students will be able to:

  1. Identify the key techniques relevant for exploring and analysing economic data.
  2. Develop the capacity to use clues and ideas from data for developing empirical models required for economic analysis.
  3. Handle large datasets using statistical tools such as Stata for empirical analysis.
  4. Use the statistical methods and techniques introduced in this course for their own empirical research work.

Brief description of modules/ Main modules:

  1. Exploring distributions: centre, spread, shape and tails
  2. Comparing distributions: transformation and shape
  3. Investigating relationships: regression idea and the classical model
  4. Woes of regression: influential point and other diagnostics
  5. Changing the scatter: non-linearity, heteroscedasticity and transformation
  6. Simple to multiple regression: interpreting coefficients and diagnostic analysis
  7. Assessing uncertainty: confidence intervals and tests of significance, classical and bootstrap
  8. Exploring change over time: trend, breaks and growth rates
  9. Broadening the scope: quantile regression
  10. Categorical response: logit regression model

Assessment Details with weights:

Three written assignments with weights of 30%, 30% and 40% respectively. Students may have to write the assignments in the computer lab (under invigilation) and submit them online.

Test 1 will be a class test based on material covered during the first half of the semester.

Test 2 will be a class test based on material covered during the second half of the semester.

Test 3 will be an end-of-semester class test based on all material covered in the course.

Reading List:

Selected chapters/sections from textbooks, lecture notes and handouts. Chapters/sections will be mostly drawn from the following three books:

  • Regression with Graphics (1992) by Lawrence C. Hamilton, Brooks/Cole (Acc. No. 10031 at KG and 10030 at Dwarka Campus, 519.536 HAM-R);
  • Econometrics and Data Analysis for Developing Countries (1998) by C. Mukherjee et al. (AUD Library Acc. No. 9661 and 9660, 330.015195 MUK-E);
  • Introduction to Econometrics (2001) by G.S. Maddala, Wiley.