Algorithm Analysis & Design (CST306) - Module 5 Study Guide
Part A: High Frequency Definitions (Module 5 Focus)
Topic: Define P, NP, NP-Hard, NP-Complete
[SURE SHOT | ↻ REPEATED PYQ]
Section A: What is this? (The Concept)
Imagine you are trying to solve a giant Sudoku puzzle. If I ask you to solve it from scratch, it might take you hours. But if I hand you a completely filled-out Sudoku board and ask, “Is this correct?”, you can verify it in two minutes just by scanning the rows and columns.
The Analogy: P vs NP is the difference between “Solving” and “Verifying”. Class P contains problems that are easy to solve. Class NP contains problems that might be incredibly hard to solve, but if someone hands you a magic answer, it is easy to verify. NP-Complete problems are the absolute hardest problems in that “easy to verify” category.
Section B: Exam-Ready Theory (The Rigor)
- Class P (Polynomial Time): The set of decision problems that can be solved by a deterministic algorithm in polynomial time . (e.g., Shortest Path, Merge Sort).
- Class NP (Non-Deterministic Polynomial Time): The set of decision problems that can be verified by a deterministic algorithm in polynomial time. If you are given a “certificate” (a proposed solution), you can check if it is correct in time.
- Class NP-Hard: A problem is NP-Hard if every problem in NP can be reduced to in polynomial time (). It means is at least as hard as the hardest problems in NP. (Note: itself does not have to be in NP).
- Class NP-Complete: A problem is NP-Complete if it satisfies two conditions:
- It is in NP.
- It is NP-Hard. (e.g., TSP, Clique, Vertex Cover, 3-CNF-SAT).
Section C: Step-by-Step Worked Example (The Application)
(Theoretical Trace: Hamiltonian Path)
- Problem: Does there exist a path in graph that visits every vertex exactly once?
- Why is it in NP? If I claim the path is , you can verify it in polynomial time by simply checking two things: 1) Are all vertices present exactly once? 2) Is there an actual edge between each adjacent vertex in the sequence? Both checks take time. Hence, it is in NP.
Section D: How to Write This in the Exam (The Strategy)
- Start With: Define P as “solvable in polynomial time” and NP as “verifiable in polynomial time.”
- Body: Draw the standard Venn Diagram showing P inside NP, and NP-Complete at the intersection of NP and NP-Hard.
- Traps: A very common mistake is saying NP stands for “Not Polynomial”. It stands for Non-Deterministic Polynomial. State this explicitly.
- Close With: Provide one standard example for each class.
Topic: Compare Las Vegas and Monte Carlo Algorithms
[HIGH PROB | ↻ REPEATED PYQ]
Section A: What is this? (The Concept)
Imagine you have a deck of cards and need to find the Ace of Spades.
- Strategy A: You draw a card randomly. If it’s not the Ace, you put it back, shuffle, and draw again until you find it. You will definitely find it eventually, but it might take 1 minute, or it might take 5 hours.
- Strategy B: You draw 10 random cards. If the Ace is there, great! If not, you just give up and say “I didn’t find it.” You only spent exactly 10 seconds looking, but you might be wrong.
The Analogy: Strategy A is a Las Vegas algorithm: The answer is always a guaranteed jackpot (correct), but you gamble with your time. Strategy B is a Monte Carlo algorithm: You strictly control your time, but you gamble with the accuracy of the answer.
Section B: Exam-Ready Theory (The Rigor)
| Feature | Las Vegas Algorithms | Monte Carlo Algorithms |
|---|---|---|
| Output Correctness | Always produces the correct and optimal output (Probability of correctness = 1). | May produce incorrect output with some probability of failure. |
| Running Time | The running time is a random variable and is NOT bounded. | Runs for a fixed, deterministic number of steps (Time is bounded). |
| Trade-off | Gambles with computing resources (Time/Space). | Gambles with the accuracy of the result. |
| Examples | Randomized Quick Sort. | Miller-Rabin Primality Test, checking for a majority element. |
Section D: How to Write This in the Exam (The Strategy)
- Start With: Create a clear comparison table.
- Body: Write down the two definitions clearly. Mention the probabilities (Las Vegas success = 1, Monte Carlo success < 1).
- Close With: Give one example for each to secure full marks.
Part B: Core Topics
Randomized Algorithms: Quick Sort
[SURE SHOT | ↻ REPEATED PYQ]
Section A: What is this? (The Concept)
Imagine sorting a line of students by height. In standard Quick Sort, you always pick the first student as the “pivot” to divide the line. But what if the teacher arranged them from tallest to shortest as a prank? Your standard method would perform terribly, comparing everyone one-by-one.
The Analogy: Randomized Quick Sort beats the prankster. Instead of always picking the first student, you close your eyes, spin around, and pick a random student as the pivot. Because you choose randomly, the worst-case scenario (a perfectly bad pivot every single time) becomes astronomically improbable.
Section B: Exam-Ready Theory (The Rigor)
- Deterministic Quick Sort Issue: If the input array is already sorted or reverse-sorted, picking the first element as the pivot results in unbalanced partitions ( and elements). This degrades the time complexity to .
- Randomized Quick Sort Algorithm:
Algorithm randQuickSort(A[], low, high) { if low >= high, then EXIT While pivot 'x' is not a Central Pivot: 1. Choose uniformly at random an element 'x' from A[low..high]. 2. Count elements smaller than 'x' (sc) and greater than 'x' (gc). 3. If sc >= n/4 and gc >= n/4, then 'x' is a central pivot. Break loop. Partition A[low..high] using 'x'. Let index of 'x' be pos. randQuickSort(A, low, pos-1) randQuickSort(A, pos+1, high) }
Section C: Step-by-Step Worked Example (The Application)
Example: Expected Running Time Analysis (University Question) Prove that the expected running time of Randomized Quicksort is .
- Define Indicator Variables: Let be the number of comparisons. , where is 1 if elements and are compared, and 0 otherwise.
- Probability of Comparison: Elements and are compared only if one of them is chosen as a pivot before any other element in the range .
- Number of elements in range .
- .
- Expectation: .
- .
- Simplify the Sum: Let .
- .
- Harmonic Series: The inner sum is the Harmonic Series .
- .
- Final Result: .
Section D: How to Write This in the Exam (The Strategy)
- Start With: Explain the difference between Deterministic and Randomized complexity.
- Body: Write the algorithm explicitly.
- Traps: The analysis proof is where marks are won. You must use the indicator variable and the probability .
- Close With: State clearly that this is a Las Vegas randomized algorithm.
Approximation Algorithms: Bin Packing
[SURE SHOT | ↻ REPEATED PYQ]
Section A: What is this? (The Concept)
Imagine you are moving to a new house and have a pile of assorted heavy items. You have moving boxes that can hold exactly 10kg each. You want to use the absolute minimum number of boxes. Finding the mathematically perfect packing combination takes thousands of years for a large house (it is NP-Hard).
The Analogy: Instead of freezing in calculation paralysis, you use a “good enough” strategy. Next Fit: Put items in the current box until it’s full, then tape it shut and start a new one. First Fit: Leave boxes open; if an item fits in any previous box, put it there. First Fit Decreasing: Sort the items from heaviest to lightest first, then do First Fit. This guarantees a near-optimal packing in seconds.
Section B: Exam-Ready Theory (The Rigor)
- Definition: Approximation algorithms return near-optimal solutions in polynomial time for NP-Complete optimization problems.
- Bin Packing Problem: Given items of sizes and bins of capacity , assign items to bins to minimize the number of used bins.
- Lower Bound: Minimum bins .
- Approximation Ratio:
- First Fit (FF): .
- First Fit Decreasing (FFD): .
Section C: Step-by-Step Worked Example (The Application)
Example: First Fit Decreasing Trace (University Question) Bin Capacity = 10. Items = .
- Sort items: .
- Packing:
- Item 7: Bin 1 (Rem: 3).
- Item 6: Bin 2 (Rem: 4).
- Item 5: Bin 3 (Rem: 5).
- Item 5: Bin 3 (Fits exactly! Rem: 0).
- Item 5: Bin 4 (Rem: 5).
- Item 4: Bin 2 (Fits exactly! Rem: 0).
- Item 2: Bin 1 (Fits! Rem: 1).
- Item 2: Bin 4 (Fits! Rem: 3).
- Item 1: Bin 1 (Fits exactly! Rem: 0).
- Result: 4 bins used. Optimal bins = .
Section D: How to Write This in the Exam (The Strategy)
- Start With: Calculate and state the Lower Bound immediately using the ceiling formula.
- Body: For numericals, draw actual boxes (rectangles) on your paper. Write the numbers inside them as you process the list. Do not try to hold the remaining capacities in your head.
- Traps: Pay close attention to whether the question asks for First Fit or First Fit Decreasing. If it says Decreasing, you lose all marks if you forget to sort the array first.
- Close With: State the final number of bins used and compare it to the theoretical lower bound.
NP Completeness Proofs (Clique)
[HIGH PROB | ↻ REPEATED PYQ]
Section A: What is this? (The Concept)
Imagine a high school cafeteria. A “Clique” is a group of students where every single person in the group is friends with every other person in the group. Finding if a clique of size 50 exists in a school of 1000 is incredibly hard.
The Analogy: To prove that the Clique problem is mathematically one of the hardest problems in the universe (NP-Complete), we use a “Reduction”. We take another notoriously impossible problem (like satisfying a massive logical boolean formula, 3-CNF-SAT), and we prove that we can disguise it as a Clique problem. If we can map the boolean formula to a graph of friendships, then solving the Clique problem automatically solves the boolean formula.
Section B: Exam-Ready Theory (The Rigor)
Steps to prove a problem is NP-Complete:
- Prove (Write a polynomial time verifier).
- Prove (Reduce a known NP-Complete problem to in polynomial time: ).
Proof: CLIQUE is NP-Complete
Step 1: Prove CLIQUE is in NP
- Certificate: A set of vertices.
- Verifier:
- Check if .
- For every pair of vertices , check if the edge exists in the graph .
- Since checking edges takes time (polynomial), CLIQUE is in NP.
Step 2: Prove CLIQUE is NP-Hard (Reduction from 3-CNF-SAT)
- Let be a boolean formula in 3-CNF with clauses. Each clause has 3 literals: .
- Construct Graph :
- For each clause, create a “triple” of vertices in corresponding to its 3 literals.
- Connect vertex to vertex with an edge IF AND ONLY IF:
- They belong to different clauses (different triples).
- They are not logical negations of each other (e.g., do not connect and ).
- Equivalence: The boolean formula is satisfiable IF AND ONLY IF the constructed graph has a CLIQUE of size exactly . (Picking one true literal per clause forms a fully connected -clique).
- Since constructing this graph takes polynomial time, 3-CNF-SAT CLIQUE.
Section C: Step-by-Step Worked Example (The Application)
Example: Constructing graph for
- Clause 1: Vertices for .
- Clause 2: Vertices for .
- Edges:
- Connect to . (Different clauses, no negation).
- Do NOT connect to . (Negation!).
- Connect to . (No negation).
- Connect to . (No negation).
Section D: How to Write This in the Exam (The Strategy)
- Start With: Explicitly write the two required steps: “1. Prove it is in NP. 2. Prove it is NP-Hard via reduction.”
- Body: For Step 1, write the verifier algorithm. For Step 2, clearly state you are reducing from 3-CNF-SAT. List the two critical rules for drawing edges in the graph construction.
- Visuals: Draw a small example. Write . Draw the 6 nodes grouped into two columns, and draw the valid connecting edges between the columns.
- Close With: “Since CLIQUE is in NP and NP-Hard, it is NP-Complete.”
NP Completeness Proofs (Vertex Cover)
[HIGH PROB | ↻ REPEATED PYQ]
Section A: What is this? (The Concept)
Imagine you are a security guard at a museum with several hallways. Hallways connect different rooms. You want to place guards in the rooms so that every single hallway is watched by at least one guard.
The Analogy: Vertex Cover is the “Minimum Security Guard” problem. A “Vertex Cover” is a set of rooms (vertices) such that every hallway (edge) has at least one of its endpoints in that set. Finding the absolute smallest number of guards needed is an NP-Hard problem.
Section B: Exam-Ready Theory (The Rigor)
Proof: VERTEX COVER is NP-Complete
Step 1: Prove VERTEX COVER is in NP
- Certificate: A subset of vertices .
- Verifier:
- Check if .
- For every edge in the graph , check if OR .
- Since checking all edges takes time, VERTEX COVER is in NP.
Step 2: Prove VERTEX COVER is NP-Hard (Reduction from CLIQUE)
- Concept: A clique in a graph is a vertex cover in the complement graph (with a size adjustment).
- Reduction Algorithm:
- Given an instance of CLIQUE .
- Construct the complement graph (where an edge exists in iff it does NOT exist in ).
- A subset is a CLIQUE of size in IF AND ONLY IF the subset is a VERTEX COVER of size in .
- Since constructing the complement graph takes polynomial time, CLIQUE VERTEX COVER.
Section C: Step-by-Step Worked Example (The Application)
Example: Verify Vertex Cover Graph with edges: (1, 2), (2, 3), (3, 4). Proposed Vertex Cover .
- Edge (1, 2): Node 2 is in . (Watched)
- Edge (2, 3): Node 2 and 3 are in . (Watched)
- Edge (3, 4): Node 3 is in . (Watched)
- Conclusion: Every edge is covered by . is a valid vertex cover of size 2.
Section D: How to Write This in the Exam (The Strategy)
- Start With: Define Vertex Cover: “A set of vertices such that every edge is incident to at least one vertex in the set.”
- Body: Perform the 2-step NP-Complete proof. Use the Reduction from CLIQUE to the complement graph.
- Visuals: Draw a simple 3-node triangle graph. Show how picking 2 nodes covers all 3 edges.
- Close With: “Because VERTEX COVER is in NP and is NP-Hard, it is NP-Complete.”