Chapter 1: Descriptive Statistics and Basic Probability
Syllabus hours: 6 | Exam weight: 10 marks | Marks breakdown: theory 4, numericals 6
Difficulty type: Mixed | Version / Last Updated: 2026-04-18 | Not in syllabus: advanced measure theory
Outcome: compute descriptive summaries, interpret basic probability laws, and solve Bayes-style PYQs.
1. Fundamental Concepts
- Statistics summarizes and interprets data for engineering decisions.
- Population is the full set; sample is the observed subset.
- Central tendency describes the center; dispersion describes spread.
- Probability measures uncertainty on the scale from 0 to 1.
- Conditional probability updates probability using new evidence.
2. Core Methods and Formulas
When to use: use mean/SD/CV for numerical data summary; use probability laws for event algebra; use Bayes when a cause must be inferred from an observed effect.
When not to use: do not use CV to compare location (the means themselves), since it only measures relative spread; do not apply Bayes unless the data is split into prior groups and a conditional observation is given.
3. Standard Models / Topics
Topic 1: Central Tendency and Dispersion
Basic notes: mean, median, and mode describe center; range, variance, SD, and CV describe spread. SD is in the original unit; CV is unit-free and helps compare consistency.
Conditions / use: use CV when two datasets have different means and you want consistency comparison.
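Formula recap: $\bar{x} = \frac{1}{n}\sum x_i$, $\sigma = \sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2}$, $\mathrm{CV} = \frac{\sigma}{\bar{x}} \times 100\%$.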
Seen-Before Check: wording like “more consistent,” “merits/demerits,” or “correct the mean/SD after a wrong item” signals this topic immediately.
PYQ pattern: correction of mean/SD and consistency via CV.
[Core] Problem 1: For 20 items, mean = 10 and SD = 2. One value 8 is replaced by 12. Find corrected mean and SD.
Answer: corrected mean = 10.2 and corrected SD = 1.99.
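The correction in Problem 1 can be checked with a short sketch. It assumes the population formula $\sigma^2 = \frac{1}{n}\sum x_i^2 - \bar{x}^2$ (divide by n), which is the form that reproduces the stated answer; the function name `corrected_mean_sd` is illustrative only.

```python
import math

def corrected_mean_sd(n, mean, sd, wrong, right):
    """Correct a population mean and SD after one wrong value is replaced."""
    total = n * mean                     # sum of x
    total_sq = n * (sd**2 + mean**2)     # sum of x^2, from sd^2 = sum(x^2)/n - mean^2
    total += right - wrong               # swap the wrong value for the right one
    total_sq += right**2 - wrong**2
    new_mean = total / n
    new_var = total_sq / n - new_mean**2
    return new_mean, math.sqrt(new_var)

new_mean, new_sd = corrected_mean_sd(n=20, mean=10, sd=2, wrong=8, right=12)
print(round(new_mean, 2), round(new_sd, 2))  # 10.2 1.99
```

The intermediate totals are 200 → 204 for the sum and 2080 → 2160 for the sum of squares, so the corrected variance is 2160/20 − 10.2² = 3.96.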
[Advanced] Problem 2: Company A has mean 460 and SD 50; Company B has mean 490 and SD 40. Which is more consistent?
Answer: Company B is more consistent because its CV is smaller.
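As a quick check of Problem 2, the sketch below computes both CVs from the given means and SDs. The helper name `coefficient_of_variation` is an assumption for illustration.

```python
def coefficient_of_variation(mean, sd):
    """CV as a percentage: relative spread, independent of the unit."""
    return sd / mean * 100

cv_a = coefficient_of_variation(460, 50)
cv_b = coefficient_of_variation(490, 40)
more_consistent = "A" if cv_a < cv_b else "B"
print(f"CV A = {cv_a:.2f}%, CV B = {cv_b:.2f}%, more consistent: {more_consistent}")
# CV A = 10.87%, CV B = 8.16%, more consistent: B
```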
[PYQ-Trap] Problem 3: For the grouped data x = 10, 20, 30 with frequencies 2, 3, 5, find the mean.
Answer: grouped mean = 23.
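A minimal sketch of the grouped mean, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$. The trap is presumably averaging the x values without their frequencies, so the unweighted figure is computed alongside for contrast.

```python
values = [10, 20, 30]
freqs = [2, 3, 5]

# Weighted (grouped) mean: each value counts as many times as its frequency.
grouped_mean = sum(x * f for x, f in zip(values, freqs)) / sum(freqs)

# Unweighted mean of the distinct values, which ignores the frequencies.
naive_mean = sum(values) / len(values)

print(grouped_mean, naive_mean)  # 23.0 20.0
```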
Topic 2: Graphical Representation and Five-Number Summary
Basic notes: histograms show frequency structure, box plots show median, quartiles, spread, and skewness, and scatter plots show association between variables.
Conditions / use: use box plots for quick comparison of spread and outliers; use histograms when shape matters; use scatter plots for bivariate relation.
Formula recap: five-number summary = min, Q1, median, Q3, max; $\mathrm{IQR} = Q_3 - Q_1$.
Seen-Before Check: words like “five-number summary,” “box and whisker,” “interpret the display,” or “comment on shape” point here.
PYQ pattern: compute quartiles and comment on skewness.
[Core] Problem 1: Find the five-number summary of 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 15, 17, 30.
Answer: (1, 4.5, 9, 14, 30).
[PYQ-Trap] Problem 2: Using the above data, interpret skewness and spread.
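Answer: right-skewed, because the upper whisker (Q3 to max: 30 − 14 = 16) is much longer than the lower whisker (min to Q1: 4.5 − 1 = 3.5). Spread: IQR = 14 − 4.5 = 9.5 and range = 30 − 1 = 29, with the range inflated by the extreme value 30.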
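The quartiles for this data can be reproduced with the sketch below. It assumes the textbook convention of taking Q1 and Q3 as the medians of the lower and upper halves, with the overall median excluded when n is odd; other quartile conventions can give slightly different values. The 1.5 × IQR fence for flagging outliers is a common rule of thumb, added here as an assumption rather than something the problem statement requires.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # For odd n, skip the middle element so it belongs to neither half.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

data = [1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 15, 17, 30]
lo, q1, med, q3, hi = five_number_summary(data)
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x > upper_fence or x < q1 - 1.5 * iqr]

print((lo, q1, med, q3, hi))       # (1, 4.5, 9, 14.0, 30)
print(iqr, upper_fence, outliers)  # 9.5 28.25 [30]
```

Under this rule the value 30 lies beyond the upper fence, which is consistent with reading the display as right-skewed.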
Topic 3: Addition and Multiplication Laws
Basic notes: addition law handles union of events; multiplication law handles joint occurrence and conditional chaining.
Conditions / use: use addition law for “A or B”; use multiplication for “A and B,” sequential events, or independent events.
Formula recap: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, $P(A \cap B) = P(A)\,P(B \mid A)$.
Seen-Before Check: phrases like “either/or,” “both,” or “at least one” are the signal words.
PYQ pattern: union probability and “none solve” complement calculations.
[Core] Problem 1: If , , and , find .
[Advanced] Problem 2: If A, B, C solve a problem independently with probabilities 1/3, 1/4, 1/5, find probability none solve it.
Answer: 0.4.
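The complement calculation can be sketched with exact fractions. Independence is taken from the problem statement; the "at least one" value is included because it follows directly as 1 minus the "none" probability.

```python
from fractions import Fraction
from math import prod

solve_probs = [Fraction(1, 3), Fraction(1, 4), Fraction(1, 5)]

# Independent solvers: P(none solve) is the product of the individual failure probabilities.
p_none = prod(1 - p for p in solve_probs)
p_at_least_one = 1 - p_none

print(p_none, float(p_none))  # 2/5 0.4
print(p_at_least_one)         # 3/5
```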
Topic 4: Conditional Probability and Bayes Theorem
Basic notes: conditional probability updates the chance of A after B is known; Bayes theorem reverses the conditioning from effect to cause.
Conditions / use: use Bayes when data is split into source groups and one observed result must be traced back.
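Formula recap: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, $P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_j P(A_j)\,P(B \mid A_j)}$.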
Seen-Before Check: words like “defective,” “if selected item is found defective,” or “what is the probability it came from” are Bayes triggers.
PYQ pattern: quality-control source attribution and spam/diagnostic reversals.
[Core] Problem 1: Chips from A (20%) and B (80%) have defect rates 10% and 5%. Find P(defective) and P(A | defective).
[PYQ-Trap] Problem 2: Emails go to accounts 1,2,3 with probabilities 0.7, 0.2, 0.1 and spam rates 0.01, 0.02, 0.05. Find P(spam) and P(account2 | spam).
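Both problems follow the same two steps: total probability for the observed effect, then Bayes for the source. The sketch below works them through with a small helper; `bayes_posteriors` is an illustrative name, and the inputs are the priors and conditional rates given in the problem statements.

```python
def bayes_posteriors(priors, likelihoods):
    """Return (P(effect), {source: P(source | effect)}) for mutually exclusive sources."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}   # P(source and effect)
    total = sum(joint.values())                               # law of total probability
    return total, {k: v / total for k, v in joint.items()}

# Problem 1: chips from sources A and B.
p_def, post_chip = bayes_posteriors({"A": 0.2, "B": 0.8}, {"A": 0.10, "B": 0.05})
print(round(p_def, 4), round(post_chip["A"], 4))   # 0.06 0.3333

# Problem 2: emails to accounts 1, 2, 3.
p_spam, post_mail = bayes_posteriors({1: 0.7, 2: 0.2, 3: 0.1}, {1: 0.01, 2: 0.02, 3: 0.05})
print(round(p_spam, 4), round(post_mail[2], 4))    # 0.016 0.25
```

So P(defective) = 0.06 with P(A | defective) = 1/3, and P(spam) = 0.016 with P(account 2 | spam) = 0.25. Note that the posterior for A (about 0.33) is higher than its prior (0.20) because A has the higher defect rate.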
4. Applied Problem Solving
- [Core] Mean/SD correction after wrong value replacement.
- [Core] Box plot and five-number summary interpretation.
- [PYQ-Trap] Bayes theorem with source attribution.
5. System-Level Understanding
- Chapter 1 provides the descriptive-statistics language needed before distribution and inference chapters.
- Probability laws and Bayes logic reappear in quality control, inference, and decision problems.
- Dispersion and graphical interpretation help identify when a model is plausible or suspicious.
6. Quick Reference
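Mean: $\bar{x} = \frac{1}{n}\sum x_i$; SD: $\sigma = \sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2}$; CV: $\frac{\sigma}{\bar{x}} \times 100\%$.
Addition law: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Bayes: $P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_j P(A_j)\,P(B \mid A_j)}$.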
Seen-Before Check: consistency, box plot, “defective from which source,” and “at least one” are top Chapter 1 cues.
7. Exam Tips
- Write the formula first, then substitute values line by line.
- For Bayes questions, write the groups and priors clearly before computing.
- For box-plot questions, sort the data and show the five-number summary explicitly.
- Seen-Before Check: if the question mentions defective items, consistency, or a box plot, classify the topic before calculating.
8. Common Pitfalls
- Forgetting the overlap subtraction term in the addition law.
- Confusing $P(A \mid B)$ with $P(B \mid A)$.
- Using percentages directly instead of decimal probabilities.
- Using population variance formula instead of sample variance when the question says sample data.
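The population-versus-sample pitfall can be illustrated with Python's standard `statistics` module: `pstdev` divides by n and `stdev` divides by n − 1. The data below is hypothetical, chosen only so the arithmetic is easy to follow.

```python
from statistics import pstdev, stdev

# Hypothetical data: mean = 5, sum of squared deviations = 32.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

pop_sd = pstdev(sample)    # sqrt(32 / 8), population formula
samp_sd = stdev(sample)    # sqrt(32 / 7), sample formula

print(round(pop_sd, 3), round(samp_sd, 3))  # 2.0 2.138
```

The sample SD is always the larger of the two for the same data, and the gap matters most when n is small.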
9. Tools and Guides
- Calculator tips: use memory keys for repeated sums in correction questions.
- Distribution test lines: Chapter 1 mostly uses summary measures and Bayes, not formal distribution tests.
- Decision cue: if the question asks who/what source generated the observed result, think Bayes first.