ENSH 304 Chapter Notes

Each chapter has a separate page with the same framework.

Chapter 2: Probability Distributions and Sampling Distribution

Syllabus hours: 14 | Exam weight: 15 marks | Marks breakdown: distributions 10, sampling 5

Difficulty type: High priority | Version / Last Updated: 2026-04-18 | Not in syllabus: advanced measure-theoretic probability

Outcome: identify the correct distribution, compute probabilities, and use sampling distributions for inference foundations.

1. Fundamental Concepts

  • Discrete random variables use PMFs; continuous random variables use PDFs.
  • Expectation gives the long-run center; variance measures spread around the mean.
  • Binomial, Poisson, and Negative Binomial are the main discrete models in this chapter.
  • Normal, Gamma, and Chi-square are the key continuous/foundation models in PYQs.
  • Sampling distributions and CLT support Chapter 3 confidence intervals and tests.

2. Core Methods and Formulas

When to use: use PMF/PDF normalization when a constant is missing; use model-specific formulas after identifying the distribution; use CLT for sample means and proportions.

When not to use: do not mix up exact and approximate methods; do not use Poisson when the trial count is fixed and the success probability is explicit; do not use Normal without checking the conditions.

\sum_x p(x)=1,\quad f(x)\ge 0,\quad \int f(x)\,dx=1
E[X]=\sum_x x\,p(x),\quad E[X]=\int x f(x)\,dx
Var(X)=E[X^2]-(E[X])^2
Z=\frac{X-\mu}{\sigma},\quad Z_{\bar X}=\frac{\bar X-\mu}{\sigma/\sqrt n}
SE(\bar X)=\frac{\sigma}{\sqrt n},\quad SE(\hat p)=\sqrt{\frac{p(1-p)}{n}}
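These recap formulas translate directly into code. A minimal stdlib-only Python sketch (the helper names phi, se_mean, and se_prop are my own, not from the syllabus):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def se_prop(p, n):
    """Standard error of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Standardize a sample mean: z = (xbar - mu) / SE
z = (178 - 176) / se_mean(16, 100)
print(round(z, 2))  # 1.25
```

The same three helpers cover every Z-based calculation in Topics 4 and 5.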

3. Standard Models / Topics

Topic 1: Random Variables and PMF/PDF

Basic notes: a discrete random variable counts outcomes, while a continuous random variable measures on an interval. A PMF gives probability at a point; a PDF gives density, so probabilities come from integration.

Conditions / use: use PMF for countable outcomes; use PDF for measurable quantities like time, height, or voltage.

Formula recap: \sum_x p(x)=1,\quad \int f(x)\,dx=1,\quad P(a<X<b)=\int_a^b f(x)\,dx

Seen-Before Check: missing constant k, “find the PDF/PMF constant,” or “probability between two values” are the key signals.

PYQ pattern: find k, then compute mean or probability.

[Core] Problem 1: For f(x)=k\sqrt{x},\ 0<x<1, find k and P(X<1/4).

\int_0^1 k\sqrt{x}\,dx=1\Rightarrow k\cdot\frac{2}{3}=1\Rightarrow k=\frac{3}{2}
P(X<1/4)=\int_0^{1/4}\frac{3}{2}\sqrt{x}\,dx=(1/4)^{3/2}=1/8=0.125
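Both integrals can be double-checked numerically. A quick midpoint-rule sketch in stdlib Python (the grid size 100000 is an arbitrary choice, not from the problem):

```python
import math

k = 1.5          # value found from the normalization above
n = 100_000      # midpoint-rule grid size (arbitrary)

# integral of k*sqrt(x) over (0, 1) should equal 1
total = sum(k * math.sqrt((i + 0.5) / n) / n for i in range(n))
# integral of k*sqrt(x) over (0, 1/4) should equal P(X < 1/4) = 0.125
tail = sum(k * math.sqrt((i + 0.5) * 0.25 / n) * 0.25 / n for i in range(n))
print(round(total, 4), round(tail, 4))  # 1.0 0.125
```

A numeric check like this is a fast way to catch a wrong k before computing further probabilities.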

[Advanced] Problem 2: For the PMF on x=-1,0,1,2,3 with probabilities 0.1,\ k,\ 0.2,\ 2k,\ 0.3, find k and E[X].

0.1+k+0.2+2k+0.3=1\Rightarrow 3k=0.4\Rightarrow k\approx 0.1333
E[X]=(-1)(0.1)+0(0.1333)+1(0.2)+2(0.2667)+3(0.3)\approx 1.5333
Interpretation checklist: confirm whether the answer is a point probability or an interval probability; for PDFs, report the interval area, not the density value.
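For the table-based version, the missing-k step is one linear equation. A short sketch of Problem 2 (variable names are my own):

```python
xs = [-1, 0, 1, 2, 3]
known = 0.1 + 0.2 + 0.3      # probabilities that do not involve k
k = (1 - known) / 3          # k + 2k = 3k absorbs the remainder
ps = [0.1, k, 0.2, 2 * k, 0.3]
mean = sum(x * p for x, p in zip(xs, ps))
print(round(k, 4), round(mean, 4))  # 0.1333 1.5333
```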

Topic 2: Expectation and Variance

Basic notes: expectation is the weighted average of all possible values, and variance is the average squared spread. Use Var(X)=E[X^2]-(E[X])^2 when the table or PDF makes direct variance computation awkward.

Conditions / use: use the distribution’s exact probability formula first, then compute the moments.

Formula recap: E[X]=\sum x\,p(x),\quad E[X^2]=\sum x^2 p(x),\quad Var(X)=E[X^2]-(E[X])^2

Seen-Before Check: any question asking “mean and variance” from a table is this topic.

PYQ pattern: compute moments from PMF/PDF, often with a missing k.

[Core] Problem 1: For the PMF x: 0,1,2,3 with p(x): 0.2,0.4,0.3,0.1, find the mean and variance.

E[X]=1.3,\quad E[X^2]=2.5,\quad Var(X)=2.5-(1.3)^2=0.81

[Advanced] Problem 2: If F(x)=1-e^{-2x},\ x\ge0, find the mean and variance.

f(x)=2e^{-2x},\quad E[X]=1/2=0.5,\quad Var(X)=1/2^2=0.25
Interpretation checklist: report what the mean tells about location and what the variance tells about stability.
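The moment formulas above reduce to two weighted sums over the table. A minimal sketch of Problem 1:

```python
xs = [0, 1, 2, 3]
ps = [0.2, 0.4, 0.3, 0.1]
mean = sum(x * p for x, p in zip(xs, ps))        # E[X]
ex2 = sum(x * x * p for x, p in zip(xs, ps))     # E[X^2]
var = ex2 - mean ** 2                            # Var(X) = E[X^2] - E[X]^2
print(round(mean, 2), round(ex2, 2), round(var, 2))  # 1.3 2.5 0.81
```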

Topic 3: Binomial, Poisson, Negative Binomial

Basic notes: Binomial counts successes in fixed trials, Poisson counts events in a fixed interval, and Negative Binomial counts trials until the r-th success. This is a high-frequency PYQ cluster and must be handled separately for each subtopic.

Conditions / use: Binomial needs fixed n and constant p; Poisson needs a rate parameter and rare independent events; Negative Binomial needs a target success count r.

Formula recap: P(X=k)=\binom{n}{k}p^k(1-p)^{n-k} for Binomial, P(X=k)=\frac{e^{-\lambda}\lambda^k}{k!} for Poisson, and P(X=x)=\binom{x-1}{r-1}p^r(1-p)^{x-r} for Negative Binomial.

Seen-Before Check: fixed trials? count in time/space? trial until r-th success? That tells you the correct model immediately.

Extra shortcut: when n is large, p is small, and np=\lambda is moderate, use Binomial(n,p)\approx Poisson(\lambda).

PYQ pattern: exact probability, at least / at most probability, mean/variance, and model comparison.

[Core] Problem 1: If 70% of chips receive enough coating, find probability exactly 10 out of 15 are acceptable.

X\sim Binomial(15,0.7),\quad P(X=10)=\binom{15}{10}(0.7)^{10}(0.3)^5\approx0.206

[Advanced] Problem 2: For X\sim Binomial(15,0.7), find the mean and variance.

E[X]=np=10.5,\quad Var(X)=np(1-p)=3.15
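Problems 1 and 2 fit in a few lines using `math.comb` for the binomial coefficient:

```python
from math import comb

n, p = 15, 0.7
p10 = comb(n, 10) * p**10 * (1 - p)**5   # P(X = 10)
mean = n * p                             # E[X]
var = n * p * (1 - p)                    # Var(X)
print(round(p10, 3), round(mean, 1), round(var, 2))  # 0.206 10.5 3.15
```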

[Core] Problem 3: Accidents occur at mean rate 2.5 per month. Find P(X\le3).

X\sim Poisson(2.5),\quad P(X\le3)=\sum_{k=0}^{3}\frac{e^{-2.5}(2.5)^k}{k!}\approx0.7576

[PYQ-Trap] Problem 4: For the same Poisson process, find P(X\ge1).

P(X\ge1)=1-P(X=0)=1-e^{-2.5}\approx0.9179
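Both Poisson answers follow from summing the PMF (and using the complement for "at least one"):

```python
from math import exp, factorial

lam = 2.5
# P(X <= 3): sum the Poisson PMF for k = 0, 1, 2, 3
p_le3 = sum(exp(-lam) * lam**k / factorial(k) for k in range(4))
# P(X >= 1): complement of the k = 0 term
p_ge1 = 1 - exp(-lam)
print(round(p_le3, 4), round(p_ge1, 4))  # 0.7576 0.9179
```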

[Advanced] Problem 4b: Approximate P(X=0) for X\sim Binomial(100,0.02) using Poisson.

\lambda=np=2,\quad P(X=0)\approx e^{-2}\approx0.1353
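It is worth seeing how close the shortcut lands. Comparing the exact binomial value against the Poisson approximation (the exact value 0.1326 is my own computation, not from the notes):

```python
from math import exp

n, p = 100, 0.02
exact = (1 - p) ** n      # exact Binomial P(X = 0) = 0.98^100
approx = exp(-n * p)      # Poisson(lambda = np = 2) approximation
print(round(exact, 4), round(approx, 4))  # 0.1326 0.1353
```

The gap (~0.003) is the approximation error you accept for the easier formula; it shrinks as n grows with np held fixed.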

[Core] Problem 5: Infection probability per child is 0.4. Find the probability that the 10th child is the 3rd infected.

X\sim NegBin(r=3,\ p=0.4),\quad P(X=10)=\binom{9}{2}(0.4)^3(0.6)^7\approx0.0645

[Advanced] Problem 6: For r=3,\ p=0.4, find the mean and variance of trials until the 3rd success.

E[X]=\frac{r}{p}=7.5,\quad Var(X)=\frac{r(1-p)}{p^2}=11.25
Interpretation checklist: identify whether the answer is a count, a rate, or a waiting time.
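Problems 5 and 6 can be checked with a small Negative Binomial sketch (the function name negbin_pmf is my own):

```python
from math import comb

r, p = 3, 0.4

def negbin_pmf(x):
    """P(the r-th success occurs on trial x)."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

mean = r / p                  # expected trials until the 3rd success
var = r * (1 - p) / p**2
print(round(negbin_pmf(10), 4), round(mean, 1), round(var, 2))  # 0.0645 7.5 11.25
```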

Topic 4: Normal, Gamma, Chi-square

Basic notes: Normal is symmetric and centered at μ; Gamma models positive skew; Chi-square appears in variance and goodness-of-fit logic. This topic is essential because it supplies the distribution language used in later inference problems.

Conditions / use: use Normal when the distribution is explicitly normal or when CLT applies; use Gamma for positive skewed waiting/consumption data; use Chi-square when a squared normal relation or variance-based test is involved.

Formula recap: Z=\frac{X-\mu}{\sigma}; for X\sim Gamma(\alpha=2,\beta), P(X>x)=e^{-x/\beta}(1+x/\beta); and Y=Z^2\sim\chi^2(1).

Seen-Before Check: “normally distributed,” “burning life,” “inadequate supply,” or “squared standard normal” are the quick triggers.

PYQ pattern: tail probability, interval probability, and transformation-based chi-square relation.

[Core] Problem 1: Breakdown voltage X\sim N(40,1.5^2). Find P(39<X<42).

z_1=-0.67,\quad z_2=1.33,\quad P(39<X<42)=\Phi(1.33)-\Phi(-0.67)\approx0.6568

[Advanced] Problem 2: Bulb life X\sim N(250,50^2). Find P(X>300).

z=1,\quad P(X>300)=1-\Phi(1)\approx0.1587
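Both normal problems can be reproduced with `math.erf` instead of a table. Note that unrounded z-values give P(39<X<42) ≈ 0.6563, slightly different from the table-rounded 0.6568:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

p1 = phi((42 - 40) / 1.5) - phi((39 - 40) / 1.5)  # P(39 < X < 42), X ~ N(40, 1.5^2)
p2 = 1 - phi((300 - 250) / 50)                    # P(X > 300),     X ~ N(250, 50^2)
print(round(p1, 4), round(p2, 4))  # 0.6563 0.1587
```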

[Advanced] Problem 3: Daily demand X\sim Gamma(\alpha=2,\beta=3). Find P(X>12).

P(X>12)=e^{-4}(1+4)=5e^{-4}\approx0.0916

Why this shortcut works: for \alpha=2, the Gamma survival function is the Erlang tail; a single integration by parts of the Gamma pdf yields the compact form e^{-x/\beta}(1+x/\beta) used in PYQs.
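The shortcut itself is one line of code (the function name gamma2_tail is my own):

```python
from math import exp

def gamma2_tail(x, beta):
    """Survival function P(X > x) for X ~ Gamma(alpha=2, beta)."""
    return exp(-x / beta) * (1 + x / beta)

print(round(gamma2_tail(12, 3), 4))  # 0.0916
```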

[PYQ-Trap] Problem 4: If Y=Z^2 and Z\sim N(0,1), find P(Y<3.84).

P(Y<3.84)=P(-1.96<Z<1.96)\approx0.95
Interpretation checklist: interpret whether the result is a tail risk, a service-level probability, or a variance-related threshold.
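The chi-square trap reduces to a symmetric normal interval, which the erf-based CDF handles directly:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# P(Z^2 < 3.84) = P(-sqrt(3.84) < Z < sqrt(3.84))
c = sqrt(3.84)
p = phi(c) - phi(-c)
print(round(p, 2))  # 0.95
```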

Topic 5: Sampling Distribution and CLT

Basic notes: a sampling distribution is the distribution of a statistic over repeated samples. CLT says the sample mean is approximately normal for large n.

Conditions / use: use CLT for large samples and for approximate probabilities on sample means or proportions.

Formula recap: SE(\bar X)=\frac{\sigma}{\sqrt n},\quad SE(\hat p)=\sqrt{\frac{p(1-p)}{n}},\quad Z_{\bar X}=\frac{\bar X-\mu}{\sigma/\sqrt n}.

Seen-Before Check: “sample mean,” “standard error,” “large sample,” or “sampling distribution” indicates this topic.

PYQ pattern: probability on a sample mean and unbiasedness of sample proportion.

[Core] Problem 1: Service time has \mu=176,\ \sigma^2=256 and n=100. Find P(175<\bar X<178).

SE(\bar X)=1.6,\quad z_1=-0.625,\quad z_2=1.25,\quad P\approx0.6284
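The CLT calculation above, end to end, with the standard error computed first:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 176, sqrt(256), 100
se = sigma / sqrt(n)                              # standard error = 1.6
p = phi((178 - mu) / se) - phi((175 - mu) / se)   # P(175 < Xbar < 178)
print(round(se, 1), round(p, 4))  # 1.6 0.6284
```

The key step is converting σ² = 256 into σ = 16 before dividing by √n, the pitfall noted in Section 8.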

[Advanced] Problem 2: Population values are 1, 4, 8, 11. Show that sample proportion of odd numbers is unbiased for the population proportion.

P=2/4=0.5; for a sample of size 2 drawn with replacement, the count of odd values is Y\sim Binomial(2,0.5), so E(\hat p)=E(Y/2)=0.5
Interpretation checklist: state the meaning of the standard error and whether the sample-based estimate is centered at the population value.
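The unbiasedness claim can also be verified by brute force, assuming (as the Binomial(2, 0.5) step implies) samples of size 2 drawn with replacement:

```python
from itertools import product

pop = [1, 4, 8, 11]
P = sum(x % 2 for x in pop) / len(pop)   # true proportion of odd values = 0.5

# enumerate every size-2 sample with replacement and average the sample proportion
phats = [((a % 2) + (b % 2)) / 2 for a, b in product(pop, repeat=2)]
print(P, sum(phats) / len(phats))  # 0.5 0.5
```

The average of p-hat over all equally likely samples equals the population proportion, which is exactly what unbiased means.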

4. Applied Problem Solving

  • [Core] Find missing k in a PMF/PDF and evaluate a probability.
  • [Core] Compute mean and variance from a discrete distribution table.
  • [PYQ-Trap] Compare distributions and choose the correct model from wording cues.

5. System-Level Understanding

  • Chapter 2 creates the random-variable and sampling language used in Chapter 3 inference.
  • Binomial/Poisson/Negative Binomial are the repeated discrete-model pattern set in PYQs.
  • Normal and CLT are the bridge from raw data to confidence intervals and tests.

6. Quick Reference

Binomial: fixed n, constant p, independent trials.

Poisson: count in interval with mean rate λ.

Negative Binomial: trials until r-th success.

Normal: standardize using Z=(X-\mu)/\sigma.

Sampling mean: SE(\bar X)=\sigma/\sqrt n.

Seen-Before Check: fixed trials, rare events, waiting until success, normal wording, or sample mean/proportion.

7. Exam Tips

  • Identify the model first, then write the exact formula with symbols.
  • For Poisson and Binomial questions, write what k means before calculation.
  • For “at least” questions, use complements when faster.
  • Seen-Before Check: ask yourself whether the question is about fixed trials, rare events, normal approximation, or a sample statistic before deriving anything.

8. Common Pitfalls

  • Using Poisson when the setup is clearly a fixed-trial Binomial problem.
  • Forgetting that a PDF must be integrated to get probability.
  • Using σ when the problem provides σ², or vice versa.
  • Skipping standard error before a sample mean probability calculation.

9. Tools and Guides

  • Calculator tips: keep cumulative normal table values handy for Z-based questions.
  • Test-statistic cheat line: CLT turns sample means into Normal-style problems.
  • Distribution selection rule: fixed trials → Binomial, rate over interval → Poisson, waiting until r-th success → Negative Binomial.