ENSH 304 Chapter Notes

Each chapter has a separate page with the same framework.

Chapter 1: Descriptive Statistics and Basic Probability

Syllabus hours: 6 | Exam weight: 10 marks | Marks breakdown: theory 4, numericals 6

Difficulty type: Mixed | Version / Last Updated: 2026-04-18 | Not in syllabus: advanced measure theory

Outcome: compute descriptive summaries, interpret basic probability laws, and solve Bayes-style PYQs.

1. Fundamental Concepts

  • Statistics summarizes and interprets data for engineering decisions.
  • Population is the full set; sample is the observed subset.
  • Central tendency describes the center; dispersion describes spread.
  • Probability measures uncertainty on the scale from 0 to 1.
  • Conditional probability updates probability using new evidence.

2. Core Methods and Formulas

When to use: use mean/SD/CV for numerical data summary; use probability laws for event algebra; use Bayes when a cause must be inferred from an observed effect.

When not to use: do not use CV to compare location (it measures relative spread, not where the center lies), and do not apply Bayes unless the problem gives prior groups and an observation conditional on them.

$$\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$$
$$s^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2,\quad s=\sqrt{s^2}$$
$$\bar{x}=\frac{\sum f_i x_i}{\sum f_i},\quad s^2=\frac{\sum f_i x_i^2-\frac{(\sum f_i x_i)^2}{n}}{n-1}$$
$$CV=\frac{s}{\bar{x}}\times 100\%$$
$$P(A\cup B)=P(A)+P(B)-P(A\cap B)$$
$$P(A\cap B)=P(A)P(B\mid A)=P(B)P(A\mid B)$$
$$P(A\mid B)=\frac{P(A\cap B)}{P(B)}$$
$$P(A_i\mid B)=\frac{P(A_i)P(B\mid A_i)}{\sum_j P(A_j)P(B\mid A_j)}$$
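The summary formulas above can be checked numerically. The following is a minimal Python sketch, not a prescribed method: the function names `summary` and `grouped_summary` are illustrative, and the grouped data (x = 10, 20, 30 with frequencies 2, 3, 5) is reused from Topic 1 only as a test case. It compares the grouped shortcut formula against the direct sample variance of the expanded data.

```python
import statistics

def summary(values):
    """Return (mean, sample SD, CV in percent) for raw numeric data."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, divisor n - 1
    return mean, sd, sd / mean * 100

def grouped_summary(xs, fs):
    """Return (mean, sample variance) using the grouped shortcut formula."""
    n = sum(fs)                                       # n = sum of frequencies
    sum_fx = sum(f * x for x, f in zip(xs, fs))       # sum of f_i * x_i
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))  # sum of f_i * x_i^2
    mean = sum_fx / n
    var = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
    return mean, var

xs, fs = [10, 20, 30], [2, 3, 5]
g_mean, g_var = grouped_summary(xs, fs)

# Expand the grouped data into raw values to cross-check the shortcut.
expanded = [x for x, f in zip(xs, fs) for _ in range(f)]
raw_mean, raw_sd, raw_cv = summary(expanded)
print(g_mean, round(g_var, 2), round(raw_cv, 2))
```

The shortcut and the direct calculation should agree up to rounding, which is a quick way to catch an arithmetic slip in a grouped-data question.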

3. Standard Models / Topics

Topic 1: Central Tendency and Dispersion

Basic notes: mean, median, and mode describe center; range, variance, SD, and CV describe spread. SD is in the original unit; CV is unit-free and helps compare consistency.

Conditions / use: use CV when two datasets have different means (or different units) and you want to compare their consistency.

Formula recap: $\bar{x}=\frac{1}{n}\sum x_i$, $s^2=\frac{1}{n-1}\sum(x_i-\bar{x})^2$, $CV=\frac{s}{\bar{x}}\times100\%$

Seen-Before Check: wording like “more consistent,” “merits/demerits,” or “correct the mean/SD after a wrong item” signals this topic immediately.

PYQ pattern: correction of mean/SD and consistency via CV.

[Core] Problem 1: For 20 items, mean = 10 and SD = 2. One value was wrongly recorded as 8 instead of 12. Find the corrected mean and SD.

These workings use the divisor-$n$ form $s^2=\frac{1}{n}\sum x^2-\bar{x}^2$, so $\sum x^2=n(s^2+\bar{x}^2)$.

$$\sum x=20\times10=200,\quad \sum x^2=20(4+100)=2080$$
$$\sum x'=200-8+12=204,\quad \bar{x}'=\frac{204}{20}=10.2$$
$$\sum {x'}^2=2080-64+144=2160,\quad s'^2=\frac{2160}{20}-(10.2)^2=3.96,\quad s'=\sqrt{3.96}\approx1.99$$

Answer: corrected mean = 10.2 and corrected SD = 1.99.

Interpretation checklist: corrected values reflect the data after fixing a recording error; compare the SD before and after to see whether spread changed materially.
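The correction steps in Problem 1 can be sketched in Python. This is an illustrative helper (the name `corrected_mean_sd` and its parameters are assumptions, not part of the notes), and it assumes the divisor-n variance form used in the workings of Problem 1.

```python
import math

def corrected_mean_sd(n, mean, sd, wrong, right):
    """Correct mean and SD after one wrongly recorded value.

    Assumes the divisor-n form: sd^2 = (sum of x^2)/n - mean^2.
    """
    sum_x = n * mean                    # recover the original sum of x
    sum_x2 = n * (sd ** 2 + mean ** 2)  # recover the original sum of x^2
    sum_x += right - wrong              # swap the wrong value for the right one
    sum_x2 += right ** 2 - wrong ** 2
    new_mean = sum_x / n
    new_var = sum_x2 / n - new_mean ** 2
    return new_mean, math.sqrt(new_var)

new_mean, new_sd = corrected_mean_sd(20, 10, 2, wrong=8, right=12)
print(round(new_mean, 2), round(new_sd, 2))  # 10.2 1.99
```

Here the SD barely moves (2 to about 1.99), because the replacement value 12 sits as far from the mean as the wrong value 8 did.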

[Advanced] Problem 2: Company A has mean 460 and SD 50; Company B has mean 490 and SD 40. Which is more consistent?

$$CV_A=\frac{50}{460}\times100=10.87\%,\quad CV_B=\frac{40}{490}\times100=8.16\%$$

Answer: Company B is more consistent because its CV is smaller.
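A short Python sketch of the same comparison, using the means and SDs from Problem 2. The dictionary layout and the name `cv_percent` are illustrative choices only.

```python
def cv_percent(mean, sd):
    """Coefficient of variation as a percentage."""
    return sd / mean * 100

# (mean, SD) pairs taken from Problem 2.
companies = {"A": (460, 50), "B": (490, 40)}
cvs = {name: cv_percent(m, s) for name, (m, s) in companies.items()}

# The smaller CV marks the more consistent dataset.
most_consistent = min(cvs, key=cvs.get)
print({name: round(cv, 2) for name, cv in cvs.items()}, most_consistent)
```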

[PYQ-Trap] Problem 3: For the grouped data x = 10, 20, 30 with frequencies 2, 3, 5, find the mean.

$$n=2+3+5=10,\quad \sum f_i x_i=2(10)+3(20)+5(30)=230$$
$$\bar{x}=230/10=23$$

Answer: grouped mean = 23.

Interpretation checklist: mention whether the data were grouped and whether the result is a weighted average or a corrected summary.

Topic 2: Graphical Representation and Five-Number Summary

Basic notes: histograms show frequency structure, box plots show median, quartiles, spread, and skewness, and scatter plots show association between variables.

Conditions / use: use box plots for quick comparison of spread and outliers; use histograms when shape matters; use scatter plots for bivariate relation.

Formula recap: five-number summary = min, Q1, median, Q3, max; $IQR=Q_3-Q_1$.

Seen-Before Check: words like “five-number summary,” “box and whisker,” “interpret the display,” or “comment on shape” point here.

PYQ pattern: compute quartiles and comment on skewness.

[Core] Problem 1: Find the five-number summary of 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 15, 17, 30.

Answer: (1, 4.5, 9, 14, 30).
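The answer above corresponds to taking Q1 and Q3 as the medians of the lower and upper halves, excluding the overall median when n is odd. Quartile conventions differ between textbooks and software, so the following Python sketch assumes that convention. The function name `five_number_summary` is illustrative.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two observations. For odd n, the overall median
    is left out of both halves.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

data = [1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 15, 17, 30]
result = five_number_summary(data)
print(result)
```

A calculator or library that uses a different quartile rule may report slightly different Q1 and Q3 for the same data, so state the rule used when writing the answer.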

[PYQ-Trap] Problem 2: Using the above data, interpret skewness and spread.

$$\text{IQR}=14-4.5=9.5,\quad \text{upper whisker}=30-14=16,\quad \text{lower whisker}=4.5-1=3.5$$

Answer: right-skewed because the upper whisker is much longer than the lower whisker.

Interpretation checklist: note center, spread, skewness, and whether there is a long upper tail.
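Since box plots are also used to spot outliers, a common convention (an assumption here, not stated in these notes) is to flag points beyond 1.5 × IQR from the quartiles. The Python sketch below applies that rule to the Topic 2 data. The function name `box_plot_checks` is illustrative, and the quartiles are passed in rather than recomputed.

```python
def box_plot_checks(data, q1, q3):
    """Return IQR, 1.5*IQR fences, and flagged outliers.

    The 1.5*IQR rule is a widely used convention, assumed here.
    """
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {"iqr": iqr, "lower_fence": lower_fence,
            "upper_fence": upper_fence, "outliers": outliers}

data = [1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 15, 17, 30]
checks = box_plot_checks(data, q1=4.5, q3=14)
print(checks)
```

Under this rule the value 30 lies above the upper fence of 28.25, which supports the long-upper-tail reading. The whisker lengths in Problem 2 are measured to the minimum and maximum instead.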

Topic 3: Addition and Multiplication Laws

Basic notes: addition law handles union of events; multiplication law handles joint occurrence and conditional chaining.

Conditions / use: use addition law for “A or B”; use multiplication for “A and B,” sequential events, or independent events.

Formula recap: $P(A\cup B)=P(A)+P(B)-P(A\cap B)$, $P(A\cap B)=P(A)P(B\mid A)$.

Seen-Before Check: phrases like “either/or,” “both,” or “at least one” are the signal words.

PYQ pattern: union probability and “none solve” complement calculations.

[Core] Problem 1: If $P(A)=0.5$, $P(B)=0.4$, and $P(A\cap B)=0.2$, find $P(A\cup B)$.

$$P(A\cup B)=0.5+0.4-0.2=0.7$$

[Advanced] Problem 2: If A, B, and C attempt a problem independently with success probabilities 1/3, 1/4, and 1/5, find the probability that none of them solves it.

$$P(\text{none})=\left(1-\frac13\right)\left(1-\frac14\right)\left(1-\frac15\right)=\frac23\times\frac34\times\frac45=\frac25$$

Answer: 0.4.
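Both Topic 3 problems can be checked with exact fractions, which avoids rounding noise. A minimal Python sketch follows. The helper names `union` and `none_occur` are illustrative, and `none_occur` assumes the events are independent, as Problem 2 states.

```python
from fractions import Fraction
from math import prod

def union(p_a, p_b, p_both):
    """Addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

def none_occur(probs):
    """P(no event occurs), assuming the events are independent."""
    return prod(1 - p for p in probs)

# Problem 1: P(A) = 0.5, P(B) = 0.4, P(A and B) = 0.2.
p_union = union(Fraction(1, 2), Fraction(2, 5), Fraction(1, 5))

# Problem 2: independent success probabilities 1/3, 1/4, 1/5.
p_none = none_occur([Fraction(1, 3), Fraction(1, 4), Fraction(1, 5)])
p_at_least_one = 1 - p_none  # complement of "none solve"
print(p_union, p_none, p_at_least_one)
```

The last line shows the usual "at least one" shortcut: compute "none" first, then subtract from 1.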

Interpretation checklist: say whether the result is a union, intersection, or complement probability.

Topic 4: Conditional Probability and Bayes Theorem

Basic notes: conditional probability updates the chance of A after B is known; Bayes theorem reverses the conditioning from effect to cause.

Conditions / use: use Bayes when data is split into source groups and one observed result must be traced back.

Formula recap: $P(A\mid B)=\frac{P(A\cap B)}{P(B)}$, $P(A_i\mid B)=\frac{P(A_i)P(B\mid A_i)}{\sum_j P(A_j)P(B\mid A_j)}$.

Seen-Before Check: words like “defective,” “if selected item is found defective,” or “what is the probability it came from” are Bayes triggers.

PYQ pattern: quality-control source attribution and spam/diagnostic reversals.

[Core] Problem 1: Chips from A (20%) and B (80%) have defect rates 10% and 5%. Find P(defective) and P(A | defective).

$$P(D)=0.2(0.1)+0.8(0.05)=0.06,\quad P(A\mid D)=\frac{0.2(0.1)}{0.06}=\frac13$$
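The two-step pattern in Problem 1 (total probability first, then the posterior) can be sketched in Python for any number of source groups. The function name `bayes` and its list-based interface are illustrative assumptions.

```python
def bayes(priors, likelihoods):
    """Return (P(B), posteriors) for source groups A_1..A_k.

    priors[i] is P(A_i); likelihoods[i] is P(B | A_i).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A_i and B)
    total = sum(joint)                                    # total probability P(B)
    return total, [j / total for j in joint]

# Problem 1: sources A (20%) and B (80%), defect rates 10% and 5%.
p_defective, posteriors = bayes([0.2, 0.8], [0.10, 0.05])
print(round(p_defective, 4), [round(p, 4) for p in posteriors])
```

Although A supplies only 20% of the chips, it accounts for one third of the defectives because its defect rate is double that of B.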

[PYQ-Trap] Problem 2: Emails go to accounts 1, 2, 3 with probabilities 0.7, 0.2, 0.1 and spam rates 0.01, 0.02, 0.05. Find P(spam) and P(account 2 | spam).

$$P(S)=0.7(0.01)+0.2(0.02)+0.1(0.05)=0.016,\quad P(A_2\mid S)=\frac{0.2(0.02)}{0.016}=0.25$$

Interpretation checklist: state which prior group is most likely after observing the evidence.
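To answer the checklist item for Problem 2, it may help to compute all three posteriors rather than only the one asked for. A worked sketch using the same priors and spam rates:

```latex
\begin{aligned}
P(A_1\mid S)&=\frac{0.7(0.01)}{0.016}=0.4375\\
P(A_2\mid S)&=\frac{0.2(0.02)}{0.016}=0.25\\
P(A_3\mid S)&=\frac{0.1(0.05)}{0.016}=0.3125
\end{aligned}
```

The posteriors sum to 1, and account 1 remains the most likely source even though account 3 has the highest spam rate.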

4. Applied Problem Solving

  • [Core] Mean/SD correction after wrong value replacement.
  • [Core] Box plot and five-number summary interpretation.
  • [PYQ-Trap] Bayes theorem with source attribution.

5. System-Level Understanding

  • Chapter 1 provides the descriptive-statistics language needed before distribution and inference chapters.
  • Probability laws and Bayes logic reappear in quality control, inference, and decision problems.
  • Dispersion and graphical interpretation help identify when a model is plausible or suspicious.

6. Quick Reference

Mean: $\bar{x}=\frac{1}{n}\sum x_i$; SD: $s=\sqrt{\frac{1}{n-1}\sum(x_i-\bar{x})^2}$; CV: $\frac{s}{\bar{x}}\times100\%$.

Addition law: $P(A\cup B)=P(A)+P(B)-P(A\cap B)$.

Bayes: $P(A_i\mid B)=\frac{P(A_i)P(B\mid A_i)}{\sum_j P(A_j)P(B\mid A_j)}$.

Seen-Before Check: consistency, box plot, “defective from which source,” and “at least one” are top Chapter 1 cues.

7. Exam Tips

  • Write the formula first, then substitute values line by line.
  • For Bayes questions, write the groups and priors clearly before computing.
  • For box-plot questions, sort the data and show the five-number summary explicitly.
  • Seen-Before Check: if the question mentions defective items, consistency, or a box plot, classify the topic before calculating.

8. Common Pitfalls

  • Forgetting the overlap subtraction term in the addition law.
  • Confusing $P(A\mid B)$ with $P(B\mid A)$.
  • Using percentages directly instead of decimal probabilities.
  • Using population variance formula instead of sample variance when the question says sample data.
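The last pitfall is easy to see numerically. The Python sketch below uses a small made-up dataset (illustrative only, not from a PYQ) and the standard-library functions `statistics.pvariance` (divisor n) and `statistics.variance` (divisor n - 1).

```python
import statistics

# Illustrative data only; mean is 5 and the squared deviations sum to 32.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(sample)  # divisor n      -> 32 / 8
samp_var = statistics.variance(sample)  # divisor n - 1  -> 32 / 7
print(pop_var, round(samp_var, 3))
```

The sample variance is always the larger of the two, and the gap matters most when n is small, so check which divisor the question expects before substituting.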

9. Tools and Guides

  • Calculator tips: use memory keys for repeated sums in correction questions.
  • Distribution test lines: Chapter 1 mostly uses summary measures and Bayes, not formal distribution tests.
  • Decision cue: if the question asks who/what source generated the observed result, think Bayes first.