Chapter 2: Probability Distributions and Sampling Distribution
Syllabus hours: 14 | Exam weight: 15 marks | Marks breakdown: distributions 10, sampling 5
Difficulty type: High priority | Version / Last Updated: 2026-04-18 | Not in syllabus: advanced measure-theoretic probability
Outcome: identify the correct distribution, compute probabilities, and use sampling distributions for inference foundations.
1. Fundamental Concepts
- Discrete random variables use PMFs; continuous random variables use PDFs.
- Expectation gives the long-run center; variance measures spread around the mean.
- Binomial, Poisson, and Negative Binomial are the main discrete models in this chapter.
- Normal, Gamma, and Chi-square are the key continuous/foundation models in PYQs.
- Sampling distributions and CLT support Chapter 3 confidence intervals and tests.
2. Core Methods and Formulas
When to use: use PMF/PDF normalization when a constant is missing; use model-specific formulas after identifying the distribution; use CLT for sample means and proportions.
When not to use: do not mix up exact and approximate methods; do not use Poisson when the trial count is fixed and the success probability is explicit; do not use Normal without checking the conditions.
3. Standard Models / Topics
Topic 1: Random Variables and PMF/PDF
Basic notes: a discrete random variable counts outcomes, while a continuous random variable measures on an interval. A PMF gives probability at a point; a PDF gives density, so probabilities come from integration.
Conditions / use: use PMF for countable outcomes; use PDF for measured quantities such as time, height, or voltage.
Formula recap: $\sum_x p(x) = 1$, $\int_{-\infty}^{\infty} f(x)\,dx = 1$, $P(a \le X \le b) = \int_a^b f(x)\,dx$
Seen-Before Check: missing constant k, “find the PDF/PMF constant,” or “probability between two values” are the key signals.
PYQ pattern: find k, then compute mean or probability.
[Core] Problem 1: For , find k and .
[Advanced] Problem 2: For the PMF with probabilities , find k and .
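Worked sketch (find k, then a probability): the PMF below, p(x) = k·x on x = 1, 2, 3, 4, is a hypothetical stand-in and not the distribution from the problems above. The helper name `normalizing_constant` is also illustrative. The idea is only that k is fixed by forcing the probabilities to sum to 1.

```python
from fractions import Fraction

def normalizing_constant(weights):
    """Return k such that k * sum(weights) == 1 (PMF normalization)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return Fraction(1) / total

# Hypothetical PMF (assumption): p(x) = k * x for x = 1, 2, 3, 4
support = [1, 2, 3, 4]
weights = [Fraction(x) for x in support]

k = normalizing_constant(weights)
pmf = {x: k * w for x, w in zip(support, weights)}

# Once k is known, any event probability is a sum of PMF values.
prob_at_least_3 = pmf[3] + pmf[4]

print("k =", k)
print("P(X >= 3) =", prob_at_least_3)
```

For a PDF the same step replaces the sum with an integral of the unnormalized density over its support.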
Topic 2: Expectation and Variance
Basic notes: expectation is the probability-weighted average of all possible values, and variance is the average squared deviation from the mean. Use the shortcut $\mathrm{Var}(X) = E[X^2] - (E[X])^2$ when the table or PDF makes the direct variance computation awkward.
Conditions / use: use the distribution’s exact probability formula first, then compute the moments.
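Formula recap: $E[X] = \sum_x x\,p(x)$ (or $\int x f(x)\,dx$), $E[X^2] = \sum_x x^2 p(x)$ (or $\int x^2 f(x)\,dx$), $\mathrm{Var}(X) = E[X^2] - (E[X])^2$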
Seen-Before Check: any question asking “mean and variance” from a table is this topic.
PYQ pattern: compute moments from PMF/PDF, often with a missing k.
[Core] Problem 1: For PMF , find mean and variance.
[Advanced] Problem 2: If , find mean and variance.
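Worked sketch (moments from a table): the table below is hypothetical, chosen only to show the order of steps: check that the probabilities sum to 1, compute E[X] and E[X²], then apply Var(X) = E[X²] − (E[X])². The function name `mean_and_variance` is illustrative.

```python
from fractions import Fraction

def mean_and_variance(pmf):
    """Return (E[X], Var(X)) for a PMF given as {value: probability}."""
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x * x * p for x, p in pmf.items())
    return mean, second_moment - mean ** 2

# Hypothetical table (assumption): x = 0, 1, 2, 3 with p = 0.1, 0.3, 0.4, 0.2
table = {
    0: Fraction(1, 10),
    1: Fraction(3, 10),
    2: Fraction(4, 10),
    3: Fraction(2, 10),
}

mean, variance = mean_and_variance(table)
print("E[X] =", mean)          # 17/10
print("Var(X) =", variance)    # 81/100
```

Using exact fractions avoids rounding noise when checking that the table sums to 1.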
Topic 3: Binomial, Poisson, Negative Binomial
Basic notes: Binomial counts successes in fixed trials, Poisson counts events in a fixed interval, and Negative Binomial counts trials until the r-th success. This is a high-frequency PYQ cluster and must be handled separately for each subtopic.
Conditions / use: Binomial needs fixed n and constant p; Poisson needs a rate parameter and rare independent events; Negative Binomial needs a target success count r.
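Formula recap: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ for Binomial, $P(X = k) = \dfrac{e^{-\lambda}\lambda^k}{k!}$ for Poisson, and $P(X = x) = \binom{x-1}{r-1} p^r (1-p)^{x-r}$, $x = r, r+1, \dots$, for Negative Binomial (with $X$ the trial on which the r-th success occurs).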
Seen-Before Check: fixed trials? count in time/space? trial until r-th success? That tells you the correct model immediately.
Extra shortcut: when $n$ is large, $p$ is small, and $np$ is moderate, approximate the Binomial by a Poisson with $\lambda = np$.
PYQ pattern: exact probability, at least / at most probability, mean/variance, and model comparison.
[Core] Problem 1: If 70% of chips receive enough coating, find probability exactly 10 out of 15 are acceptable.
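Worked sketch for this problem: with n = 15 chips, p = 0.7, and k = 10 acceptable chips, the Binomial PMF can be checked numerically. The function name `binomial_pmf` is illustrative; treating the chips as independent is the modelling assumption behind the Binomial.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Values taken from the problem statement: n = 15, p = 0.7, k = 10
prob_exactly_10 = binomial_pmf(10, 15, 0.7)
print(round(prob_exactly_10, 4))  # 0.2061
```

By hand the same value is C(15, 10) · 0.7^10 · 0.3^5 with C(15, 10) = 3003.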
[Advanced] Problem 2: For , find mean and variance.
[Core] Problem 3: Accidents occur at mean rate 2.5 per month. Find .
[PYQ-Trap] Problem 4: For the same Poisson process, find .
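Worked sketch for the Poisson pattern: the rate λ = 2.5 per month comes from Problem 3, but the two events evaluated below (exactly 2 accidents, at least 1 accident) are hypothetical examples, since the requested events are not reproduced here. The second line shows the complement trick for "at least" questions.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 2.5  # mean accidents per month, from the problem statement

# Hypothetical events (assumptions, not the events asked in the problems):
prob_exactly_2 = poisson_pmf(2, lam)
prob_at_least_1 = 1 - poisson_pmf(0, lam)  # complement: 1 - P(X = 0)

print(round(prob_exactly_2, 4))
print(round(prob_at_least_1, 4))
```

If a question changes the interval (for example from one month to two), rescale λ in proportion before applying the formula.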
[Advanced] Problem 4b: Approximate for using Poisson.
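Sketch of the Poisson approximation to the Binomial: the numbers n = 1000, p = 0.002, k = 3 are hypothetical, picked so that n is large, p is small, and np is moderate. The comparison shows how close the approximation with λ = np is to the exact Binomial value under those conditions.

```python
import math

def binomial_pmf(k, n, p):
    """Exact P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical values (assumptions): large n, small p, moderate n * p
n, p, k = 1000, 0.002, 3
lam = n * p  # Poisson parameter used for the approximation

exact = binomial_pmf(k, n, p)
approx = poisson_pmf(k, lam)

print("exact Binomial:", round(exact, 4))
print("Poisson approx:", round(approx, 4))
print("absolute error:", abs(exact - approx))
```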
[Core] Problem 5: Infection probability per child is 0.4. Find the probability that the 10th child is the 3rd infected.
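Worked sketch for this problem: the 10th child being the 3rd infected means exactly 2 infections among the first 9 children, followed by an infection on the 10th. With p = 0.4 and r = 3 this is the Negative Binomial PMF at x = 10. The function name `negative_binomial_pmf` is illustrative, and independence between children is the modelling assumption.

```python
import math

def negative_binomial_pmf(x, r, p):
    """P(the r-th success occurs on trial x), for x >= r."""
    if x < r:
        return 0.0
    return math.comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

# Values from the problem statement: p = 0.4, r = 3, x = 10
prob = negative_binomial_pmf(10, 3, 0.4)
print(round(prob, 4))  # 0.0645
```

By hand: C(9, 2) · 0.4^3 · 0.6^7 with C(9, 2) = 36.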
[Advanced] Problem 6: For , find mean and variance of trials until 3rd success.
Topic 4: Normal, Gamma, Chi-square
Basic notes: Normal is symmetric and centered at μ; Gamma models positive skew; Chi-square appears in variance and goodness-of-fit logic. This topic is essential because it supplies the distribution language used in later inference problems.
Conditions / use: use Normal when the distribution is explicitly normal or when CLT applies; use Gamma for positive skewed waiting/consumption data; use Chi-square when a squared normal relation or variance-based test is involved.
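Formula recap: $Z = \dfrac{X - \mu}{\sigma}$, $f(x) = \dfrac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}$ for $x > 0$ (Gamma with shape $\alpha$ and scale $\beta$), and $Z^2 \sim \chi^2_1$ when $Z \sim N(0, 1)$.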
Seen-Before Check: “normally distributed,” “burning life,” “inadequate supply,” or “squared standard normal” are the quick triggers.
PYQ pattern: tail probability, interval probability, and transformation-based chi-square relation.
[Core] Problem 1: Breakdown voltage . Find .
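Sketch of the standardization step for Normal problems: the parameters μ = 40, σ = 1.5 and the interval (39, 42) are hypothetical, used only to show how an interval probability reduces to two Φ values. The standard normal CDF is written with `math.erf`, so no table or external library is assumed.

```python
import math

def standard_normal_cdf(z):
    """Phi(z) = P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_probability(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2); sigma is the standard deviation."""
    z_low = (a - mu) / sigma
    z_high = (b - mu) / sigma
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)

# Hypothetical values (assumptions): mu = 40, sigma = 1.5, interval (39, 42)
prob = normal_interval_probability(39, 42, mu=40, sigma=1.5)
print(round(prob, 4))
```

If a problem gives σ² instead of σ, take the square root before standardizing.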
[Advanced] Problem 2: Bulb life . Find .
[Advanced] Problem 3: Daily demand . Find .
Why this shortcut works: for , the Gamma survival function is the Erlang tail; integrating the general Gamma pdf twice gives the compact form used in PYQs.
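A sketch of the Erlang tail, assuming the shape is a positive integer (written $k$ here) and $\beta$ is the scale parameter; both symbols are assumptions about the notation in use. Repeated integration by parts on the Gamma pdf gives a finite sum, and the shape-2 case is shown as a check.

```latex
% Assumption: X ~ Gamma(shape k, scale beta), with k a positive integer.
\[
P(X > x) \;=\; \int_x^{\infty} \frac{t^{k-1} e^{-t/\beta}}{(k-1)!\,\beta^{k}}\,dt
\;=\; e^{-x/\beta} \sum_{j=0}^{k-1} \frac{(x/\beta)^{j}}{j!}, \qquad x \ge 0 .
\]
% Check for k = 2 (one integration by parts, substituting u = t / beta):
\[
P(X > x) \;=\; \int_{x/\beta}^{\infty} u\,e^{-u}\,du
\;=\; \Bigl[-(u+1)e^{-u}\Bigr]_{x/\beta}^{\infty}
\;=\; e^{-x/\beta}\Bigl(1 + \frac{x}{\beta}\Bigr).
\]
```

The sum is the probability that a Poisson variable with mean $x/\beta$ is at most $k - 1$, which is why these questions reduce to a Poisson-style calculation.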
[PYQ-Trap] Problem 4: If and , find .
Topic 5: Sampling Distribution and CLT
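Basic notes: a sampling distribution is the distribution of a statistic over repeated samples of the same size. CLT says that for large n the sample mean is approximately normal, whatever the shape of the population, provided the population variance is finite.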
Conditions / use: use CLT for large samples and for approximate probabilities on sample means or proportions.
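Formula recap: $E[\bar{X}] = \mu$, $\mathrm{SE}(\bar{X}) = \dfrac{\sigma}{\sqrt{n}}$, $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.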
Seen-Before Check: “sample mean,” “standard error,” “large sample,” or “sampling distribution” indicates this topic.
PYQ pattern: probability on a sample mean and unbiasedness of sample proportion.
[Core] Problem 1: Service time has and . Find .
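Sketch of a CLT calculation for a sample mean: the values μ = 4, σ = 1.5, n = 36 and the threshold 4.5 are hypothetical, since the problem's own numbers are not reproduced here. The key step is dividing σ by √n to get the standard error before standardizing.

```python
import math

def standard_normal_cdf(z):
    """Phi(z) = P(Z <= z) for Z ~ N(0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sample_mean_upper_tail(threshold, mu, sigma, n):
    """Approximate P(sample mean > threshold) using the CLT."""
    standard_error = sigma / math.sqrt(n)
    z = (threshold - mu) / standard_error
    return 1.0 - standard_normal_cdf(z)

# Hypothetical values (assumptions): mu = 4, sigma = 1.5, n = 36, threshold = 4.5
prob = sample_mean_upper_tail(4.5, mu=4, sigma=1.5, n=36)
print(round(prob, 4))  # standard error 0.25, z = 2
```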
[Advanced] Problem 2: Population values are 1, 4, 8, 11. Show that sample proportion of odd numbers is unbiased for the population proportion.
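Sketch of the unbiasedness check by enumeration: the problem does not state the sample size, so samples of size 2 drawn without replacement are assumed here. The code lists every possible sample, computes the proportion of odd values in each, and compares the average of those proportions with the population proportion.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 4, 8, 11]   # from the problem statement
sample_size = 2              # assumption: size not given in the problem

def proportion_odd(values):
    """Fraction of odd numbers in a collection of integers."""
    return Fraction(sum(1 for v in values if v % 2 == 1), len(values))

population_proportion = proportion_odd(population)

# All equally likely samples without replacement (assumed sampling scheme).
samples = list(combinations(population, sample_size))
sample_proportions = [proportion_odd(s) for s in samples]
expected_sample_proportion = sum(sample_proportions) / len(sample_proportions)

for s, p_hat in zip(samples, sample_proportions):
    print(s, p_hat)
print("E[p_hat] =", expected_sample_proportion)
print("population p =", population_proportion)
```

The six sample proportions are 1/2, 1/2, 1, 0, 1/2, 1/2, which average to 1/2, the same as the population proportion of odd values.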
4. Applied Problem Solving
- [Core] Find missing k in a PMF/PDF and evaluate a probability.
- [Core] Compute mean and variance from a discrete distribution table.
- [PYQ-Trap] Compare distributions and choose the correct model from wording cues.
5. System-Level Understanding
- Chapter 2 creates the random-variable and sampling language used in Chapter 3 inference.
- Binomial/Poisson/Negative Binomial are the repeated discrete-model pattern set in PYQs.
- Normal and CLT are the bridge from raw data to confidence intervals and tests.
6. Quick Reference
Binomial: fixed n, constant p, independent trials.
Poisson: count in interval with mean rate λ.
Negative Binomial: trials until r-th success.
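Normal: standardize using $Z = \dfrac{X - \mu}{\sigma}$.
Sampling mean: $\bar{X} \approx N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$ for large $n$.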
Seen-Before Check: fixed trials, rare events, waiting until success, normal wording, or sample mean/proportion.
7. Exam Tips
- Identify the model first, then write the exact formula with symbols.
- For Poisson and Binomial questions, write what k means before calculation.
- For “at least” questions, use complements when faster.
- Seen-Before Check: ask yourself whether the question is about fixed trials, rare events, normal approximation, or a sample statistic before deriving anything.
8. Common Pitfalls
- Using Poisson when the setup is clearly a fixed-trial Binomial problem.
- Forgetting that a PDF must be integrated to get probability.
- Using σ when the problem provides σ², or vice versa.
- Skipping standard error before a sample mean probability calculation.
9. Tools and Guides
- Calculator tips: keep cumulative normal table values handy for Z-based questions.
- Test-statistic cheat line: CLT turns sample means into Normal-style problems.
- Distribution selection rule: fixed trials → Binomial, rate over interval → Poisson, waiting until r-th success → Negative Binomial.