ENSH 304 Chapter Notes

Each chapter has a separate page with the same framework.

Chapter 2: Probability Distributions and Sampling Distribution

Syllabus hours: 14 | Exam weight: 15 marks | Marks breakdown: distributions 10, sampling 5

Difficulty type: High priority | Version / Last Updated: 2026-04-18 | Not in syllabus: advanced measure-theoretic probability

Outcome: identify the correct distribution, compute probabilities, and use sampling distributions for inference foundations.

1. Fundamental Concepts

  • Discrete random variables use PMFs; continuous random variables use PDFs.
  • Expectation gives the long-run center; variance measures spread around the mean.
  • Binomial, Poisson, and Negative Binomial are the main discrete models in this chapter.
  • Normal, Gamma, and Chi-square are the key continuous/foundation models in PYQs.
  • Sampling distributions and CLT support Chapter 3 confidence intervals and tests.

2. Core Methods and Formulas

When to use: use PMF/PDF normalization when a constant is missing; use model-specific formulas after identifying the distribution; use CLT for sample means and proportions.

When not to use: do not mix up exact and approximate methods; do not use Poisson when the trial count is fixed and the success probability is explicit; do not use Normal without checking the conditions.

\sum_x p(x)=1,\quad f(x)\ge 0,\quad \int f(x)\,dx=1
E[X]=\sum_x x\,p(x),\quad E[X]=\int x f(x)\,dx
Var(X)=E[X^2]-(E[X])^2
Z=\frac{X-\mu}{\sigma},\quad Z_{\bar X}=\frac{\bar X-\mu}{\sigma/\sqrt n}
SE(\bar X)=\frac{\sigma}{\sqrt n},\quad SE(\hat p)=\sqrt{\frac{p(1-p)}{n}}
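These recap formulas translate directly into code. A minimal stdlib-only Python sketch (the helper names phi, se_mean, and se_prop are my own, not from the syllabus):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def se_prop(p, n):
    """Standard error of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Standardize a sample mean: z = (xbar - mu) / SE
z = (178 - 176) / se_mean(16, 100)
print(round(z, 2))  # 1.25
```

The same three helpers cover every Z-based calculation in Topics 4 and 5.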

3. Standard Models / Topics

Topic 1: Random Variables and PMF/PDF

Basic notes: a discrete random variable counts outcomes, while a continuous random variable measures on an interval. A PMF gives probability at a point; a PDF gives density, so probabilities come from integration.

Conditions / use: use PMF for countable outcomes; use PDF for measurable quantities like time, height, or voltage.

Formula recap: \sum_x p(x)=1,\quad \int f(x)\,dx=1,\quad P(a<X<b)=\int_a^b f(x)\,dx

Seen-Before Check: missing constant k, “find the PDF/PMF constant,” or “probability between two values” are the key signals.

PYQ pattern: find k, then compute mean or probability.

[Core] Problem 1: For f(x)=k\sqrt{x},\ 0<x<1, find k and P(X<1/4).

\int_0^1 k\sqrt{x}\,dx=1\Rightarrow k\cdot\frac{2}{3}=1\Rightarrow k=\frac{3}{2}
P(X<1/4)=\int_0^{1/4}\frac{3}{2}\sqrt{x}\,dx=(1/4)^{3/2}=1/8=0.125
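Both integrals can be double-checked numerically. A quick midpoint-rule sketch in stdlib Python (the grid size 100000 is an arbitrary choice, not from the problem):

```python
import math

k = 1.5          # value found from the normalization above
n = 100_000      # midpoint-rule grid size (arbitrary)

# integral of k*sqrt(x) over (0, 1) should equal 1
total = sum(k * math.sqrt((i + 0.5) / n) / n for i in range(n))
# integral of k*sqrt(x) over (0, 1/4) should equal P(X < 1/4) = 0.125
tail = sum(k * math.sqrt((i + 0.5) * 0.25 / n) * 0.25 / n for i in range(n))
print(round(total, 4), round(tail, 4))  # 1.0 0.125
```

A numeric check like this is a fast way to catch a wrong k before computing further probabilities.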

[Advanced] Problem 2: For the PMF on x=-1,0,1,2,3 with probabilities 0.1,\ k,\ 0.2,\ 2k,\ 0.3, find k and E[X].

0.1+k+0.2+2k+0.3=1\Rightarrow 3k=0.4\Rightarrow k\approx 0.1333
E[X]=(-1)(0.1)+0(0.1333)+1(0.2)+2(0.2667)+3(0.3)\approx 1.5333
Interpretation checklist: confirm whether the answer is a point probability or an interval probability; for PDFs, report the interval area, not the density value.
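For the table-based version, the missing-k step is one linear equation. A short sketch of Problem 2 (variable names are my own):

```python
xs = [-1, 0, 1, 2, 3]
known = 0.1 + 0.2 + 0.3      # probabilities that do not involve k
k = (1 - known) / 3          # k + 2k = 3k absorbs the remainder
ps = [0.1, k, 0.2, 2 * k, 0.3]
mean = sum(x * p for x, p in zip(xs, ps))
print(round(k, 4), round(mean, 4))  # 0.1333 1.5333
```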

Topic 2: Expectation and Variance

Basic notes: expectation is the weighted average of all possible values, and variance is the average squared spread. Use Var(X)=E[X^2]-(E[X])^2 when the table or PDF makes direct variance computation awkward.

Conditions / use: use the distribution’s exact probability formula first, then compute the moments.

Formula recap: E[X]=\sum x\,p(x),\quad E[X^2]=\sum x^2 p(x),\quad Var(X)=E[X^2]-(E[X])^2

Seen-Before Check: any question asking “mean and variance” from a table is this topic.

PYQ pattern: compute moments from PMF/PDF, often with a missing k.

[Core] Problem 1: For the PMF x: 0,1,2,3 with p(x): 0.2,0.4,0.3,0.1, find the mean and variance.

E[X]=1.3,\quad E[X^2]=2.5,\quad Var(X)=2.5-(1.3)^2=0.81

[Advanced] Problem 2: If F(x)=1-e^{-2x},\ x\ge0, find the mean and variance.

f(x)=2e^{-2x},\quad E[X]=1/2=0.5,\quad Var(X)=1/2^2=0.25
Interpretation checklist: report what the mean tells about location and what the variance tells about stability.
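The moment formulas above reduce to two weighted sums over the table. A minimal sketch of Problem 1:

```python
xs = [0, 1, 2, 3]
ps = [0.2, 0.4, 0.3, 0.1]
mean = sum(x * p for x, p in zip(xs, ps))        # E[X]
ex2 = sum(x * x * p for x, p in zip(xs, ps))     # E[X^2]
var = ex2 - mean ** 2                            # Var(X) = E[X^2] - E[X]^2
print(round(mean, 2), round(ex2, 2), round(var, 2))  # 1.3 2.5 0.81
```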

Topic 3: Binomial, Poisson, Negative Binomial

Basic notes: Binomial counts successes in fixed trials, Poisson counts events in a fixed interval, and Negative Binomial counts trials until the r-th success. This is a high-frequency PYQ cluster and must be handled separately for each subtopic.

Conditions / use: Binomial needs fixed n and constant p; Poisson needs a rate parameter and rare independent events; Negative Binomial needs a target success count r.

Formula recap: P(X=k)=\binom{n}{k}p^k(1-p)^{n-k} for Binomial, P(X=k)=\frac{e^{-\lambda}\lambda^k}{k!} for Poisson, and P(X=x)=\binom{x-1}{r-1}p^r(1-p)^{x-r} for Negative Binomial.

Seen-Before Check: fixed trials? count in time/space? trial until r-th success? That tells you the correct model immediately.

Extra shortcut: when n is large, p is small, and np=\lambda is moderate, use Binomial(n,p)\approx Poisson(\lambda).

PYQ pattern: exact probability, at least / at most probability, mean/variance, and model comparison.

[Core] Problem 1: If 70% of chips receive enough coating, find probability exactly 10 out of 15 are acceptable.

X\sim Binomial(15,0.7),\quad P(X=10)=\binom{15}{10}(0.7)^{10}(0.3)^5\approx0.206

[Advanced] Problem 2: For X\sim Binomial(15,0.7), find the mean and variance.

E[X]=np=10.5,\quad Var(X)=np(1-p)=3.15
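Problems 1 and 2 fit in a few lines using `math.comb` for the binomial coefficient:

```python
from math import comb

n, p = 15, 0.7
p10 = comb(n, 10) * p**10 * (1 - p)**5   # P(X = 10)
mean = n * p                             # E[X]
var = n * p * (1 - p)                    # Var(X)
print(round(p10, 3), round(mean, 1), round(var, 2))  # 0.206 10.5 3.15
```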

[Core] Problem 3: Accidents occur at mean rate 2.5 per month. Find P(X\le3).

X\sim Poisson(2.5),\quad P(X\le3)=\sum_{k=0}^{3}\frac{e^{-2.5}(2.5)^k}{k!}\approx0.7576

[PYQ-Trap] Problem 4: For the same Poisson process, find P(X\ge1).

P(X\ge1)=1-P(X=0)=1-e^{-2.5}\approx0.9179
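Both Poisson answers follow from summing the PMF (and using the complement for "at least one"):

```python
from math import exp, factorial

lam = 2.5
# P(X <= 3): sum the Poisson PMF for k = 0, 1, 2, 3
p_le3 = sum(exp(-lam) * lam**k / factorial(k) for k in range(4))
# P(X >= 1): complement of the k = 0 term
p_ge1 = 1 - exp(-lam)
print(round(p_le3, 4), round(p_ge1, 4))  # 0.7576 0.9179
```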

[Advanced] Problem 4b: Approximate P(X=0) for X\sim Binomial(100,0.02) using Poisson.

\lambda=np=2,\quad P(X=0)\approx e^{-2}\approx0.1353
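It is worth seeing how close the shortcut lands. Comparing the exact binomial value against the Poisson approximation (the exact value 0.1326 is my own computation, not from the notes):

```python
from math import exp

n, p = 100, 0.02
exact = (1 - p) ** n      # exact Binomial P(X = 0) = 0.98^100
approx = exp(-n * p)      # Poisson(lambda = np = 2) approximation
print(round(exact, 4), round(approx, 4))  # 0.1326 0.1353
```

The gap (~0.003) is the approximation error you accept for the easier formula; it shrinks as n grows with np held fixed.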

[Core] Problem 5: Infection probability per child is 0.4. Find the probability that the 10th child is the 3rd infected.

X\sim NegBin(r=3,\ p=0.4),\quad P(X=10)=\binom{9}{2}(0.4)^3(0.6)^7\approx0.0645

[Advanced] Problem 6: For r=3,\ p=0.4, find the mean and variance of trials until the 3rd success.

E[X]=\frac{r}{p}=7.5,\quad Var(X)=\frac{r(1-p)}{p^2}=11.25
Interpretation checklist: identify whether the answer is a count, a rate, or a waiting time.
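Problems 5 and 6 can be checked with a small Negative Binomial sketch (the function name negbin_pmf is my own):

```python
from math import comb

r, p = 3, 0.4

def negbin_pmf(x):
    """P(the r-th success occurs on trial x)."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

mean = r / p                  # expected trials until the 3rd success
var = r * (1 - p) / p**2
print(round(negbin_pmf(10), 4), round(mean, 1), round(var, 2))  # 0.0645 7.5 11.25
```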

Topic 4: Normal, Gamma, Chi-square

Basic notes: Normal is symmetric and centered at μ; Gamma models positive skew; Chi-square appears in variance and goodness-of-fit logic. This topic is essential because it supplies the distribution language used in later inference problems.

Conditions / use: use Normal when the distribution is explicitly normal or when CLT applies; use Gamma for positive skewed waiting/consumption data; use Chi-square when a squared normal relation or variance-based test is involved.

Formula recap: Z=\frac{X-\mu}{\sigma}; for X\sim Gamma(\alpha=2,\beta), P(X>x)=e^{-x/\beta}(1+x/\beta); and Y=Z^2\sim\chi^2(1).

Seen-Before Check: “normally distributed,” “burning life,” “inadequate supply,” or “squared standard normal” are the quick triggers.

PYQ pattern: tail probability, interval probability, and transformation-based chi-square relation.

[Core] Problem 1: Breakdown voltage X\sim N(40,1.5^2). Find P(39<X<42).

z_1=-0.67,\quad z_2=1.33,\quad P(39<X<42)=\Phi(1.33)-\Phi(-0.67)\approx0.6568

[Advanced] Problem 2: Bulb life X\sim N(250,50^2). Find P(X>300).

z=1,\quad P(X>300)=1-\Phi(1)\approx0.1587
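Both normal problems can be reproduced with `math.erf` instead of a table. Note that unrounded z-values give P(39<X<42) ≈ 0.6563, slightly different from the table-rounded 0.6568:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

p1 = phi((42 - 40) / 1.5) - phi((39 - 40) / 1.5)  # P(39 < X < 42), X ~ N(40, 1.5^2)
p2 = 1 - phi((300 - 250) / 50)                    # P(X > 300),     X ~ N(250, 50^2)
print(round(p1, 4), round(p2, 4))  # 0.6563 0.1587
```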

[Advanced] Problem 3: Daily demand X\sim Gamma(\alpha=2,\beta=3). Find P(X>12).

P(X>12)=e^{-4}(1+4)=5e^{-4}\approx0.0916

Why this shortcut works: for \alpha=2, the Gamma survival function is the Erlang tail; a single integration by parts of the Gamma pdf yields the compact form e^{-x/\beta}(1+x/\beta) used in PYQs.
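The shortcut itself is one line of code (the function name gamma2_tail is my own):

```python
from math import exp

def gamma2_tail(x, beta):
    """Survival function P(X > x) for X ~ Gamma(alpha=2, beta)."""
    return exp(-x / beta) * (1 + x / beta)

print(round(gamma2_tail(12, 3), 4))  # 0.0916
```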

[PYQ-Trap] Problem 4: If Y=Z^2 and Z\sim N(0,1), find P(Y<3.84).

P(Y<3.84)=P(-1.96<Z<1.96)\approx0.95
Interpretation checklist: interpret whether the result is a tail risk, a service-level probability, or a variance-related threshold.
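The chi-square trap reduces to a symmetric normal interval, which the erf-based CDF handles directly:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# P(Z^2 < 3.84) = P(-sqrt(3.84) < Z < sqrt(3.84))
c = sqrt(3.84)
p = phi(c) - phi(-c)
print(round(p, 2))  # 0.95
```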

Topic 5: Sampling Distribution and CLT

Basic notes: a sampling distribution is the distribution of a statistic over repeated samples. CLT says the sample mean is approximately normal for large n.

Conditions / use: use CLT for large samples and for approximate probabilities on sample means or proportions.

Formula recap: SE(\bar X)=\frac{\sigma}{\sqrt n},\quad SE(\hat p)=\sqrt{\frac{p(1-p)}{n}},\quad Z_{\bar X}=\frac{\bar X-\mu}{\sigma/\sqrt n}.

Seen-Before Check: “sample mean,” “standard error,” “large sample,” or “sampling distribution” indicates this topic.

PYQ pattern: probability on a sample mean and unbiasedness of sample proportion.

[Core] Problem 1: Service time has \mu=176,\ \sigma^2=256 and n=100. Find P(175<\bar X<178).

SE(\bar X)=1.6,\quad z_1=-0.625,\quad z_2=1.25,\quad P\approx0.6284
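The CLT calculation above, end to end, with the standard error computed first:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 176, sqrt(256), 100
se = sigma / sqrt(n)                              # standard error = 1.6
p = phi((178 - mu) / se) - phi((175 - mu) / se)   # P(175 < Xbar < 178)
print(round(se, 1), round(p, 4))  # 1.6 0.6284
```

The key step is converting σ² = 256 into σ = 16 before dividing by √n, the pitfall noted in Section 8.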

[Advanced] Problem 2: Population values are 1, 4, 8, 11. Show that sample proportion of odd numbers is unbiased for the population proportion.

P=2/4=0.5; for a sample of size 2 drawn with replacement, the count of odd values is Y\sim Binomial(2,0.5), so E(\hat p)=E(Y/2)=0.5
Interpretation checklist: state the meaning of the standard error and whether the sample-based estimate is centered at the population value.
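The unbiasedness claim can also be verified by brute force, assuming (as the Binomial(2, 0.5) step implies) samples of size 2 drawn with replacement:

```python
from itertools import product

pop = [1, 4, 8, 11]
P = sum(x % 2 for x in pop) / len(pop)   # true proportion of odd values = 0.5

# enumerate every size-2 sample with replacement and average the sample proportion
phats = [((a % 2) + (b % 2)) / 2 for a, b in product(pop, repeat=2)]
print(P, sum(phats) / len(phats))  # 0.5 0.5
```

The average of p-hat over all equally likely samples equals the population proportion, which is exactly what unbiased means.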

4. Applied Problem Solving

  • [Core] Find missing k in a PMF/PDF and evaluate a probability.
  • [Core] Compute mean and variance from a discrete distribution table.
  • [PYQ-Trap] Compare distributions and choose the correct model from wording cues.

5. System-Level Understanding

  • Chapter 2 creates the random-variable and sampling language used in Chapter 3 inference.
  • Binomial/Poisson/Negative Binomial are the repeated discrete-model pattern set in PYQs.
  • Normal and CLT are the bridge from raw data to confidence intervals and tests.

6. Quick Reference

Binomial: fixed n, constant p, independent trials.

Poisson: count in interval with mean rate λ.

Negative Binomial: trials until r-th success.

Normal: standardize using Z=(X-\mu)/\sigma.

Sampling mean: SE(\bar X)=\sigma/\sqrt n.

Seen-Before Check: fixed trials, rare events, waiting until success, normal wording, or sample mean/proportion.

7. Exam Tips

  • Identify the model first, then write the exact formula with symbols.
  • For Poisson and Binomial questions, write what k means before calculation.
  • For “at least” questions, use complements when faster.
  • Seen-Before Check: ask yourself whether the question is about fixed trials, rare events, normal approximation, or a sample statistic before deriving anything.

8. Common Pitfalls

  • Using Poisson when the setup is clearly a fixed-trial Binomial problem.
  • Forgetting that a PDF must be integrated to get probability.
  • Using σ when the problem provides σ², or vice versa.
  • Skipping standard error before a sample mean probability calculation.

9. Tools and Guides

  • Calculator tips: keep cumulative normal table values handy for Z-based questions.
  • Test-statistic cheat line: CLT turns sample means into Normal-style problems.
  • Distribution selection rule: fixed trials → Binomial, rate over interval → Poisson, waiting until r-th success → Negative Binomial.