Table of Contents
References: Introduction to the Theory of Computation Michael Sipser
1. P (Polynomial time)
Polynomial time decidable languages P is the class of languages that are decidable in polynomial time on a deterministic single-tape Turing machine. \( P = U_k Time(n^k) \) where k is a constant
2. NP (Nondeterministic Polynomial time)
In P, we can avoid brute-force search in many problems and obtain polynomial time solutions. However, attempts to avoid brute force in certain other problems, including many interesting and useful ones, haven't been successful, and polynomial time algorithms that solve them aren't known to exist.
2.1. Languages View
P = The class of languages for which membership can be decided (solved) quickly. (polynomial time) NP = The class of languages for which membership can be verified quickly. (polynomial time)
2.2. TM view
NP is the set of decision problems solvable in polynomial time by a nondeterministic Turing machine. NP is the set of decision problems verifiable in polynomial time by a deterministic Turing machine.
2.2.1. NTM Decider (Page 283)
A Nondeterministic Turing Machine (NTM) decider is guaranteed to halt on all inputs. An NTM decider is designed so that every possible computation branch halts, either by accepting or rejecting the input. This means that for any input the machine is given, it will always come to a conclusion (halt) within a finite number of steps. There are no branches where the machine runs forever without deciding the outcome.
2.2.2. Solving NP
The nondeterministic nature of NP gives us an abstraction to imagine a machine (NTM) that could guess a solution "in parallel" and verify it quickly. If we had such a machine, it would allow us to "solve" NP problems quickly by magically finding the right solution path. However, for real-world deterministic machines, we still don’t have efficient algorithms to solve many NP problems.
2.2.3. Example: Hamiltonian Path Problem
If a directed or undirected graph, G, contains a Hamiltonian path, a path that visits every vertex in the graph exactly once. The HAMPATH problem has a feature called polynomial verifiiaility that is important for understanding its complexity. Verifying the existence of a Hamiltonian path may be much easier than determining its existence.
3. NP-Completeness
3.1. P versus NP
Whether all problems can be solved in polynomial time, typically, without the searching.
3.1.1. P == NP
You can always eliminate searching. If these classes were equal, any polynomially verifiable problem would be polynomially decidable.
3.1.2. P != NP
There were cases where you need to search. "Most researchers believe that the two classes are not equal because people have invested enormous effort to fiind polynomial time algorithms for certain problems in NP, without success. Researchers also have tried proving that the classes are unequal, but that would entail showing that no fast algorithm exists to replace brute-force search."
3.2. Defn: B is NP-complete if
3.2.1. 1. B is a member of NP
3.2.2. 2. For all A in NP, A <=p B
Every language in NP has the polynominal time reduce to that NP complete language. => If B is NP-complete and B is in P then P = NP One important advance on the P versus NP question came in the early 1970s with the work of Stephen Cook and Leonid Levin. Cook-Levin Theorem: SAT is NP-complete.
3.3. Importance of NP-completeness
3.3.1. 1. Showing B is NP-complete is the strong evidence of computational intractability (hard).
3.3.2. 2. Gives a good candidate for proving P != NP.
Michael Sipser in 2020: "Back 20 years ago, I was working very hard to show the composites number problem is not in P. And then, turns out, composites was in P (proved by Agrawal, M., Kayal, N., & Saxena, N PRIMES is in P in 2002). So it was the wrong to pick the composite number problem, but what NP-complete is guarentees is that: If you work on a problem, which is NP-complete, you can't pick the wrong problem, because if any problem is in NP and not in P, an NP-complete problem is going to be an example of that. Because if the NP-complete problems in P, everything in NP is in P."
3.4. The 3SAT Problem
3.4.1. Conjunctive Normal Form (CNF)
A boolean formula \( \phi \) is in Conjunctive Normal Form (CNF) if it has the form: \[ \phi = (x \lor \neg y \lor z) \land (\neg x \lor \neg s \lor z \lor u) \land ... \land (\neg z \lor \neg u) \]
3.4.2. SAT
"Boolean satisfiability problem (SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE."
3.4.3. 3SAT is the satisfatory problem restrited to 3CNF formulars
\[ 3SAT = \{ \phi \mid \phi \text{ is a satisfiable 3CNF formular} \} \]
3.4.4. Theorem: 3SAT <=p K-CLIQUE
We will show that we can reduce 3SAT to K-CLIQUE in polynominal time by building a model on 3SAT. And hence to show that K-CLIQUE problem is also NP-complete.
- The K-Clique Problem
A k-clique in a graph is a subset of k nodes all directly connected by edges. In the k-clique problem, the input is an undirected graph and a number k. The output is a clique (closed) with k vertices, if one exists. \[ \text{K-CLIQUE} = \{ \langle G, k \rangle \mid \text{ graph G contains a k-clique} \} \]
- The K-Clique Problem is in NP
You can easily verify that a graph has a k-clique by exhibiting the clique.
- Proof: 3SAT <=p K-CLIQUE
Given polynomial-time reduction \( f \) that map \(\phi\) to \(\langle G,k \rangle \) where \( \phi \) is satisfiable if and only if G has a k-clique.
Given the structure of a CNF, a satisfying assignment to a CNF formula has \( \ge 1 \) true literal in each clause.
We will show that we can reduce 3SAT to K-CLIQUE in polynominal time by constructing a model based on 3SAT (adding rules). \[ \phi = (a \lor b \lor \neg c) \land (\neg a \lor b \lor d) \land (a \lor c \lor \neg e) \land ... \land (\neg x \lor y \lor \neg z) \]
- G: Assume each literal in the formula is a node in G, where:
- The forbidden edges:
- No edges within a clause.
- No edges that go between inconsistent labels (\( a \) and \( \neg a \) )
- G has all non-forbidden edges.
- k is the number of clauses
- Other than those forbidden edges, all other edges are connected.
- The forbidden edges:
Claim: \( \phi \) is satisfiable if and only if (iff) \( G \) has k-clique.
- \(\to \) Proof: If \( \phi \) is satisfiable then \( G \) has a k-clique.
Taking any satisfying assignment to \( \phi \). Pick 1 true literal in each clause.
Assuming that we can find that satisfying assignemnt (That 3SAT is solvable).
- Then the corresponding nodes in G are a k-clique:
No forbidden edges among them
Because based on our rules of the model, those nodes on different clauses have edges connected.
No chosen nodes between inconsistent labels (e.g., \( a \) and \( \neg a \) )
Because they all came from the same assignment. (a is true then \( \neg a \) is false, we cannot pick them both in different clauses)
- \(\gets \) Proof: If G has a k-clique then it will makes \( \phi \) satisfiable.
Taking any k-clique in G. It must have 1 node in each clause.
- It cannot have 2 nodes or 0 nodes in the same clause.
Because when we construct 3SAT from given G, nodes cannot appear in a clique together.
And since there are k clauses, each clause must have exactly one node to form a k-clique graph.
By setting each corresponding literal TRUE, that will gives a satisfying assignment to \( \phi \).
- G: Assume each literal in the formula is a node in G, where:
- Claim: The reduction f is computable in polynominal time.
- Corollary: K-CLIQUE is in P -> 3SAT is in P
If k-clique can be solved in polynominal time, then 3SAT can be solved in polynominal time.
Conversely, a polynomial-time solution to 3SAT implies that all NP problems, incluing K-clique, are in P.
3.4.5. Theorem: 3SAT <=p HAMPATH
Similarly, we will show that we can reduce 3SAT to HAMPATH in polynominal time by building a model on 3SAT. And hence to show that HAMPATH problem is also NP-complete.
- Proof: Check Zig-Zag proof by Michael Sipser
Basic idea: Just like in Proof: 3SAT <=p K-CLIQUE, in these kinds of reductions: We are trying to simulate a boolean formula from the satisfiability perspective to some structures inside the target languages. We are going to build gadgets to simulate the structures in the formula, namely, the variables, literals and the clauses. "Simulate variable and clauses with 'gadgets'." he "We give a way to convert 3CNF-formulas to graphs in which Hamiltonian paths correspond to satisfying assignments of the formula. The graphs contain gadgets that mimic variables and clauses. The variable gadget is a diamond structure that can be traversed in either of two ways, corresponding to the two truth settings. The clause gadget is a node. Ensuring that the path goes through each clause gadget corresponds to ensuring that each clause is satisfiied in the satisfying assignment."
More NP-complete problems and their proofs can be found in "Page 331, Section 7.5 Additional NP-Complete Problems of the book <Introduction to the Theroy of Computation> by Michael Sipser"
4. The Cook-Levin Theorem
Once we have one NP-complete problem, we may obtain others by polynomial time reduction from it, as we've seen in K-CLIQUE and HAMPATH. However, establishing the first NP-complete problem is more difficult. Now we do so by proving that SAT is NP-complete. In 1971, Stephen Cook published his paper "The complexity of theorem proving procedures" in conference proceedings of the newly founded ACM Symposium on Theory of Computing. "The theorem states that the Boolean satisfiability problem is NP-complete. That is, it is in NP, and any problem in NP can be reduced in polynomial time by a deterministic Turing machine to the Boolean satisfiability problem."
4.1. Theorem: SAT is NP-complete
4.2. Proof:
4.2.1. 1) SAT is in NP
A nondeterministic polynomial time machine can guess an assignment to a given formula \( \phi \) and accept if the assignment satisfies \( \phi \).
4.2.2. 2) Show that for each A in NP we have A <=p SAT:
(Any language in NP is polynomial time reducible to SAT).
- Proof Idea:
Let N be a nondeterministic Turing machine that decides A in \( n^k \) time for some constant k. We are trying to proof that for any w belongs to any NP problems, there is a polynominal time reduction procedure that can transform that w to \( \phi(\text{SAT}) \)
- Key to the Proof:
For any w belongs to any NP problems, it can be determinned in polynomial time by a nondeterministic Turing machine N, say the running time is \(n^k \). Then we can construct a Tableau for N is an \( n^k \times n^k \) table whose rows are the configurations of a branch of the computation of N on input w. Which represents the computation steps/history of that branch of NTM (N). Based on this Tableau, by carefully define each part of Phi: \[ \phi = \phi_{cell} \land \phi_{start} \land \phi_{move} \land \phi_{accept} \] We can show that the construction time is is \( O(n^{2k}) \), the size of \( \phi \) is polynomial in n. Therefore we may easily constract a reduction that produces Phi in polynomial time from the input w of any NP problem.
5. Probabilistic Algorithms
Probabilistic Algorithms (Randomized Algorithms) is an algorithm that employs a degree of randomness as part of its logic or procedure. Which is an algorithm that makes random choices during its execution. Instead of always producing the same output for a given input, a probabilistic algorithm can produce different outputs on different runs, even with the same input, due to the randomness involved.
Probabilistic algorithms are useful because they can often solve problems more efficiently than deterministic algorithms, especially for complex problems where deterministic solutions are too slow or unknown.
5.1. Example: Primality Testing
whether a given number n is prime or not
5.1.1. Probabilistic Primality Tests
The fastest algorithms for primality testing were probabilistic, such as the Miller-Rabin and Solovay-Strassen tests.
- Use random numbers to test whether n behaves like a prime number under certain conditions.
- Can quickly identify composite numbers.
- For prime numbers, they may sometimes incorrectly classify a composite number as "probably prime," but the probability of error can be reduced by running the test multiple times with different random values.
They offer a trade-off between speed and certainty. By accepting a small probability of error, we gain significant speed improvements.
5.1.2. The AKS Primality Test (2002)
In 2002, the Agrawal-Kayal-Saxena (AKS) Primality Test was introduced, providing the first deterministic polynomial-time algorithm for primality testing. However, despite being polynomial-time, the AKS algorithm is not as efficient in practice as probabilistic tests for large numbers. Probabilistic algorithms remain popular in practical applications due to their speed and acceptable error rates, which can be made arbitrarily small.
5.2. Bounded-Error Probabilistic Poynomial Time (BPP)
BPP is the class of decision problems solvable by a probabilistic Turing machine in polynomial time, with an error probability of less than 1/3 for all instances. The computation can use random bits to make decisions, which means it may produce incorrect answers, but the probability of error is bounded away from 1/2 and can be reduced exponentially by repeating the algorithm, which runs in polynomial time.
5.3. Trust the "Randomless"
Trust is placed not in specific random values but in the statistical behavior over many trials.