Computer Science Building, Princeton University


Introduction to CS


1.  A Simple Machine

2.  Java Programming

3.  OOP

4.  Data Structures

5.  A Computing Machine

6.  Building a Computer

7.  Theory of Computation

    • Formal Languages

    • Regular Expressions

    • Finite State Automata

    • Kleene's Theorem

    • Turing Machines

    • Universality

    • Computability

    • Intractability

    • Cryptography

8.  Systems

9.  Scientific Computation

10.  Perspective


 Lecture Notes

Assignments

FAQ









7.7. COMPUTABILITY


This section under major construction.

In David Hilbert's famous 1900 address to the International Congress of Mathematics, he asserted:

Take any definite unsolved problem, such as the question as to the irrationality of the Euler-Mascheroni constant γ, or the existence of an infinite number of prime numbers of the form 2n+1. However, unapproachable these problems may seem to us and however helpless we stand before them, we have, nevertheless, the firm conviction that their solution must follow by a finite number of purely logical processes.

Now that we have a clear notion of an algorithm (the Turing machine), we will see that some computational problems cannot be solved, regardless of the amount of resources available. If an algorithm exists to solve a particular problem, that problem is said to be solvable; otherwise the problem is unsolvable. We give several natural examples of unsolvable problems. Unsolvable problems arise in many areas including: cellular automata, chaos theory, combinatorics, operations research, statistics, physics, compiler theory, knot theory, logic, set theory, and topology. Note that unsolvability is a very strong statement about a problem - it says not only that scientists have not discovered an algorithm for the problem, but that such a discovery is impossible.

Halting problem. The halting problem is the most famous of all unsolvable problems, and it was the first one classified as such. The input to the halting problem is a Turing machine and its input. The goal is to determine whether or not that Turing machine will ever reach the halt state. This is more difficult than it appears because very simple Turing machines, often referred to as busy beavers, may perform remarkably complex actions. An N-state busy beaver is a Turing machine defined on N states over the binary alphabet (0 and 1) that when started with an all zero tape, leaves as many 1's on the tape as possible before halting. Finding busy beavers for even modest size values of N is a surprisingly daunting task. The following 8-state Turing machine leaves 4,098 ones on the tape and halts after 47,176,870 steps.

However, it is not a busy beaver. In fact, Marxen and Buntrock have an 8-state Turing machine (when converted to our Minsky style notation) that leaves over 1047 ones on the tape after more than 1095, and an astonishing 9-state Turing machine that leaves over 10865 ones on the tape, and halts after over 101730 steps.

Halting problem in Java. We can recast the Halting problem in terms of the Java programming language. Here, the goal is to write a program, say Halt.java, that determines whether or not some static method, say mystery, goes into an infinite loop on some specific input x. This would be a powerful debugging tool. We've all written programs that go into infinite loops. The possibility of an infinite loop in your code usually signifies a bug. An infinite loop in a commercial piece of software could lead to angry customers, or worse. It would sure be nice if hte Java compiler could warn us if our function might go into an infinite loop. To see why this is such a daunting task, consider program Perfect.java. It searches for an odd perfect number: a number that equals the sum of its proper divisors (e.g., 28 is perfect sine 28 = 1 + 2 + 4 + 7 + 14). Does the program halt (assuming we never run into overflow problems)? If so, how long do we have to wait until we can conclude that it is in an infinite loop? We can type in the code and see what happens. If the program terminates, we can safely answer yes. The main obstacle is determining when to say no. Suppose we do stop the program at some point (Ctrl-c) and answer no. Maybe if we had left the program running just a bit longer it would have halted on its own. Despite intensive research, no one knows whether Perfect.java terminates. Mathematicians have proved that it will not terminate until n is at least 10300. This is an extreme example, but it highlights the fact that there is no easy way to tell whether a given program will terminate just by looking at it. In contrast, the program Cube.java searches for a positive integer-valued solution to 313(a3 + b3) = c3. It turns out that Cube.java does halt (assuming we never run into overflow problems), but not until c is greater than 101000. Indeed, we could pose mathematical problems like Fermat's Last Theorem in the same way (see Exercise XYZ). If the halting problem were solvable, then mathematics would be easy.

The halting problem is unsolvable.   We sketch below a mind-blowing proof that no algorithm to solve the halting problem exists or could ever exist. The idea behind the proof is inspired by the following paradox. Is the following statement true or false? This sentence is false. The essence of this paradox is caused by self-reference. The crucial idea in our Halting proof is to feed a program itself as input.

Barber's Paradox. Suppose Barry is a Barber who claims that he shaves all people in the town who do not shave themselves, and that Barber lives in the town and is clean-shaven. We can prove by contradiction that such a barber cannot exist.... We now apply the same logic reasoning to Java programs...

Informal proof. Since the Java programming language is equivalent to Turing machines, it suffices to show that we can't write a Java program to solve the halting problem. We consider programs that take some arbitrary input (say from stdin). We denote the result of a program P run with input x by P(x). Suppose, for the sake of contradiction, that there exists such a halting program Halt(P, x). It takes two inputs: a program P and its input x. Program Halt(P, x) outputs yes if P(x) halts, and no otherwise. Note that by our hypothesis, Halt(P, x) itself always halts for any pair of inputs.

Now, the fun begins. Create a new program Strange(P) that takes as input a single program P. This new program calls the halting program with P as both inputs, i.e., Halt(P, P). It may seem odd to take a program as input, but this is rather common. Compilers do exactly this; they read in a Java program as input, and output a machine language program. As Marvin Minsky observers, you don't need to worry much about why we'd want to perform such an introverted computation. But, you can gain some intuition by considering a computational biologist who wants to create a complete description of their own genome! There is nothing at all absurd about this.

Now, we design the program Strange(P) so that it promptly halts if Halt(P, P) outputs no. Otherwise, we make Strange(P) go into an infinite loop. In Java, the code for Strange(P) might look like:

if (Halt(P, P) == true)
   while (true)    // infinite loop
      ;

In summary

  1. If P(P) does not halt, then Strange(P) halts.
  2. If P(P) halts, then Strange(P) does not halt.
Now, we perform the crucial step: give program Strange(P) itself as input, i.e., set P = Strange. Let's see what crazy thing happens. Statements (a) and (b) now reduce to:
  1. If Strange(Strange) does not halt, then Strange(Strange) halts.
  2. If Strange(Strange) halts, then Strange(Strange) does not halt.
Both cases lead to a contradiction. The only thing we can conclude is that our hypothesis that program Halt(P, x) exists is impossible. That is, the halting problem is unsolvable!

Other unsolvable problems. Here are some more example of unsolvable problems:

Implications.   The existence of unsolvable problems has profound consequences in both computation and philosophy. First, it says there are some languages that cannot be recognized by any computer. That is, all computers have intrinsic limitations. These problems can be of enormous practical significance, including Hilbert's 10th problem. We must work within the limitations of computers by recognizing and avoiding unsolvable problems. Second, the whole assertion that logic can solve any problem is intrinsically false. If the human brain operates equivalently to a machine, then according to the Church-Turing thesis, the human brain would be no more powerful than a Turing machine. Hence, humans would be incapable of solving problems like the Halting problem. Humans may have fundamental limitations, just like computers. Others would view this differently. Rosenbloom concludes man can never eliminate the necessity of using his own cleverness, no matter how cleverly he tries.

Computable numbers. One of Turing's main interests was in defining computable number - those numbers whose digits can be described by a mechanical process. For example, it is possible to write a Turing machine that leaves the digits of π (3, 1, 4, 1, 5, etc) on some designated (write-once) part of the tape. Even though π is irrational, after a finite number of steps the first N digits are printed on the special part of the tape. Other examples of computable numbers include all integers, all rational numbers, sqrt(2), all algebraic numbers, sin(10), and erf(.4), the zeros of the Bessel function, etc. This includes pretty much all numbers that arise in scientific computing.

Countable and uncountable numbers. There are countably many Turing machines (like the integers), but uncountably many decision problems (like the real numbers). Hence, virtually all decision problems are undecidable. Similarly, there are only a countably infinite number of computable numbers, but uncountably many irrational numbers. Thus, most irrational numbers are not computable. Fortunately, the problems and numbers we encounter most often in science and engineering are decidable. Of course, this may be because those are exactly the ones we know how to cope with.

Godel's incompleteness theorem. Turing undecidability result was anticipated by one of the most shocking blows that shook the very foundations of mathematics. In 1931, Kurt Godel proved that the most widely accepted mathematical formalization of the natural numbers (Principia Mathematica) was either incomplete (not every true statement could be proved true) or was inconsistent (some false statements could be proved true). Godel's incompleteness theorem applies to any formal mathematical system rich enough to express....

Negative results. One of the distinguishing features of computer science that sets it apart from other sciences is the concept of "lower bounds" and "negative results." We see this with undecidability. We will also see it with NP-completeness and its derivatives, and lower bounds for sorting. Some of the great achievements in the social and physical sciences have dealt with impossibility results. There is Heisenberg's uncertainty principle in quantum mechanics, Arrow's impossibility theorem in economics, and Carnot's theorem in thermodynamics. In mathematics, there is Abel's impossibility theorem for finding the roots of 5th degree polynomials, the impossibility of trisecting an angle with a straight-edge and compass, proving Euclid's Parallel Postulate and the existence of non-Euclidean geometry, the irrationality of the square root of 2, Lindenman's proof that π is transcendental. Outside of mathematic logic (e.g, Goedel's theorem), there are few disciplines with formal methodologies for proving such negative results.

Exercises

Creative Exercises

  1. (Hopcroft, Motwani, and Ullman.) For each of the two instances to the PCP, either find a solution or argue why no such solution is possible.

    0   1    2   3        0    1    2     3
    a   aba  b   bb       abb  ba   bab   bbabaa
    ba  ab   ba  b        bb   bab  abb   babaab
    

    Answer: A solution to the first instance is bababaababb (2-0-2-1-1-3). The second instance has no solution. Observe that all cards have at least as many b's on the bottom as on the top. We must start with card 1, which has one more b on the bottom than on the top.

  2. Solvable instance of PCP. The PCP problem is undecidable even if the only allowable symbols are 0 and 1. Devise an algorithm that determines whether a PCP problem is a yes instance assuming that the only allowable symbol is 1. Hint: requires some number theory.
  3. 3x + 1 function. It is not always easy to tell whether or not a specific function terminates on all inputs, even if the function is only a few lines long. For example, consider whether or not the following Collatz function terminates for input x.

    void mystery(int x) {
       while (x > 1) {
          if (x % 2 == 1) x = 3*x + 1;
          else            x = x / 2;
       }
    }
    

    On many inputs, including 8 and 7, the program terminates:

    mystery(8):  8 4 2 1
    mystery(7):  7 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1
    

    At first glance, this appears like a simple problem. To determine whether or not a program terminates on any specific input, it is tempting to run the program with the given input and see what happens. If the program terminates, we can safely answer yes. The main obstacle is deciding when to stop and say no. Suppose we do stop the program at some point and answer no. Maybe, if we had left the programming running just a bit longer, it would have terminated. There is no way to know for sure.

  4. 196 problem. Given a positive integer, reverse the digits and add it to the original number. Repeat until you get a palindrome. For example, if we start with 5280, the reverse is 825, so the next number is 5280 + 825 = 6105. The next number is 6105 + 5016 = 11121. The last number is 11121 + 12111 = 23232 since it is a palindrome. This reverse and add algorithm terminates quickly for most integers. Nobody knows whether it terminates for 196, although it is known not to terminate in the first 9 million iterations.
  5. Fermat style problem. Is there a positive integer-valued solution to 313(a3 + b3) = c3? Write a program Cube.java that enumerates all integers a, b, and c and checks for a solution to the equation. Does the program halt (assuming no overflow)? Answer: yes, but don't wait around since the smallest counterexample has more than 1,000 digits!
  6. Fermat's last theorem. In 1XYZ Fermat conjectured that that there are no positive integers a, b, c, and n such that an + bn = cn for n > 2. Write a program Fermat.java to search for a counterexample and terminate if it finds one. Enumerate over all tuples a, b, c, and n in increasing order of cn. In 19XY Andrew Wiles proved Fermat's last theorem, which implies that Fermat.java will never terminate, assuming no overflow.
  7. Quine. A quine is a program that when executes prints itself as output. Write a Java program Quine.java that is a quine. Reference. Here is a list of quines in many languages.
  8. Pangram. Write a computer program to create true sentence of the form "This computer-generated pangram contains _ a's, _ b's, ... " where the blanks are replaced by English words for the numbers.

    This computer-generated pangram contains six a's, one b, three c's, three d's, thirty-seven e's, six f's, three g's, nine h's, twelve i's, one j, one k, two l's, three m's, twenty-two n's, thirteen o's, three p's, one q, fourteen r's, twenty-nine s's, twenty-four t's, five u's, six v's, seven w's, four x's, five y's, and one z.

    Reference: here

  9. Self-halting problem. Self-halting problem: given a program that takes one input, does it terminate when given itself as input? Prove that this problem is undecidable by following a similar argument as for the halting problem.
  10. Totality problem. The totality problem is to decide whether an arbitrary Turing machine halts on all inputs. Such a program would enable us automatically detect for the possibility of infinite loops in our Java programs. Prove that the totality problem is undecidable by showing that you could solve the halting problem if you had a program TOTALILTY(P) that returns true or false depending on whether the Turign machine P halts on all inputs.

    Solution: suppose we want to solve the halting problem, e.g., to know whether the Turing machine Q halts on input x. Create a new machine P that takes an arbitrary input, ignores it, and runs Q on x. Now, P halts on all inputs if and only if Q halts on input x. Thus, we could use P to solve the halting problem.

  11. Program equivalence problem. Prove that the program equivalence problems in undecidable. Program equivalence problem: given two programs P and Q, do they produce the same result for all possible input values. Such a program means that an optimizing compiler cannot guarantee to find the optimally efficient program since there may be better versions, but the compiler can't be sure that they are equivalent.

    Solution: We will restrict attention to Turing machines with no output - either they halt or don't halt. We can easily construct a Turing machine Q that always halts and outputs nothing. Then PEQ(P, Q) outputs true if and only if P halts for all possible input values. Thus, if we had a subroutine PEQ(P, Q) we could solve the totality problem, which is undecidable. Hence PEQ(P, Q) is as well.

  12. Busy beavers and computability. The busy beaver function B(n) is defined to be the maximal number of ones that an n-state Turing machine over the binary alphabet can leave on an initially blank tape, while still halting. The functions B(n) is not computable: it is not possible to write a computer program that will take in an integer n and return B(n). Jeffrey Shallit's handout provides a rigorous proof of this statement.

  13. A consequence. Our proof of the undecidability of the Halting problem uses a Turing machines whose input is a representation of itself. In fact, the Halting problem is undecidable even if the input to the Turing machine is blank (all 0's). Such machines are called Blank Tape Turing Machines. Furthermore, the problem is undecidable even if we only allow a two symbol alphabet (0 and 1). We leverage the non-computability of B(n) to establish this fact.

    For the sake of contradiction, let's assume that we can decide whether or not any Turing machine M halts when started with a blank tape. Given such a decider, we'll show how to compute B(n).

    INPUT:  integer n
    
    FOREACH n-state Turing machine M
       -  Check whether or not M halts when started with a blank tape
       -  If it halts, run M until it halts and count how many 1's
          it leaves on the tape
    
    OUTPUT: the maximum number of 1's left on the tape by any machine
    

    This procedure will terminate since there are only finitely many n-state Turing machines (and we can enumerate them in an orderly manner), and we only simulate the actions of those Turing machines that halt. This procedure computes B(n) since all n-state Turing machines that halt are considered. This construction implies that we have a method for computing B(n). Since computing B(n) is undecidable, our original assumption that we could decide whether a Turing machine halts must be invalid. Thus, it is impossible to write a computer program to decide whether a Turing machine will halt, even if the Turing machine begins with an empty tape and it works over the binary alphabet.

  14. Amoeba growth. Suppose that you start with one ameoba in a jar. Every minute, each ameoba turns into 0 (dies), 1 (does nothing), 2 (splits into two), or 3 (splits into three) amoeba with probability 25%. What is the probability that the ameoba population eventually goes extinct? Write a Java program to simulate this experiment. How do you know when to stop and conclude that the population will not die out? Answer: using probability theory, can compute that extinction probability = sqrt(2) - 1. Using Java, no easy answer.