Computer Science Building, Princeton University


Introduction to CS


1.  A Simple Machine

2.  Java Programming

3.  OOP

4.  Data Structures

5.  A Computing Machine

    • Data Representations

    • TOY Machine

    • TOY Instruction Set

    • TOY Programming

    • TOY Simulator

6.  Building a Computer

7.  Theory of Computation

8.  Systems

9.  Scientific Computation

10.  Perspective


 Lecture Notes

Assignments

FAQ









5.1. DATA REPRESENTATIONS


This section under construction.


All data on digital computers is represented as a sequence of 0s and 1s. This includes numeric data, text, executable files, images, audio, and video. The ASCII standard associates a seven bit binary number with each of 128 distinct characters. The MP3 file format rigidly specifies how to encode each raw audio file as a sequence of 0s and 1s. All data are numbers, and all numbers are data. In this chapter, we will treat all data as binary numbers; in Chapter XYZ, we will treat all data as strings of symbols. Here is a recent study of how much new information is created each year.

We review how to represent integers in binary, decimal, and hexadecimal. We also review how to convert between bases and do arithmetic. Finally, we describe how to represent negative integers using ``two's complement notation.''

Number systems. There are many ways to represent integers: the number of days in the month of October can be represented as 31 in decimal, 11111 in binary, 1F in hexadecimal, or XXXI in Roman Numerals. It is important to remember than an integer is an integer, no matter whether it is represented in decimal or with Roman Numerals. We are most familiar with performing arithmetic with the decimal (base 10) number system. This number system has been widely adopted, in large part because we have 10 fingers. Other number systems still persist in modern society. The Babylonians divided a circle into 360 degrees since they believed the Sun rotated around the Earth in about 360 days. 360 is more convenient than 365 since 360 has lots of divisors, so it can be divided into 12 months of 30 days each. Ptolemy tabulated trigonometric tables using base 360, and, even today, we still often use degrees instead of radians when doing geometry. Following in this spirit, most clocks are based on the sexagecimal (base 60) number system. Computers are based on the binary (base 2) number system because each wire can be in one of two states (on or off). Writing numbers in binary is tedious since this representation uses between 3 and 4 times as many digits as the decimal representation. The hexadecimal (base 16) number system is often used in machine languages, including TOY, as a shorthand for binary. Base 16 is useful because 16 is a power of 2, and numbers have roughly has many digits as in the corresponding decimal representation. A sequence of digits x = xn, xn-1, ..., x1, x0 in base b integer

x = xn bn + xn-1 bn-1 + ... + x1 b1 + x0 b0.

This representation is called positional notation. The xi terms are the positional digits, and each digit is required to be an integer between 0 and b - 1. In binary, the two digits (also referred to as bits) are 0 and 1; in decimal, the ten digits are 0 through 9; in hexadecimal, the sixteen digits are 0 through 9 and the letters A through F. Every positive integer can be expressed using positional notation, and the representation is unique modulo an arbitrary number of the leading 0's. As an example the number of days in a leap year is 36610 = 1011011102 = 16E16 since:

366 = 3×102 + 6×101 +  6× 100
366 = 1× 28 + 0× 27 +  1× 26 + 1×25 + 0×24 + 1×23 + 1×22 + 1×21 + 0×20
366 = 1×162 + 6×161 + 14×160

The following table gives the binary, decimal, and hexadecimal representations of the first 48 integers.


BIN DEC HEX   BIN DEC HEX   BIN DEC HEX
0 0 0 10000 16 10 100000 32 20
1 1 1 10001 17 11 100001 33 21
10 2 2 10010 18 12 100010 34 22
11 3 3 10011 19 13 100011 35 23
100 4 4 10100 20 14 100100 36 24
101 5 5 10101 21 15 100101 37 25
110 6 6 10110 22 16 100110 38 26
111 7 7 10111 23 17 100111 39 27
1000 8 8 11000 24 18 101000 40 28
1001 9 9 11001 25 19 101001 41 29
1010 10 A 11010 26 1A 101010 42 2A
1011 11 B 11011 27 1B 101011 43 2B
1100 12 C 11100 28 1C 101100 44 2C
1101 13 D 11101 29 1D 101101 45 2D
1110 14 E 11110 30 1E 101110 46 2E
1111 15 F 11111 31 1F 101111 47 2F


Number conversion.

Converting from Decimal. It is slightly more difficult to convert an integer represented in decimal to one in base b because we are accustomed to performing arithmetic in base 10. The easiest way to convert from decimal to base b by hand is to repeatedly divide by the base b, and read the remainder upwards. For example, the calculations below show that 36610 = 1011011102 = 16E16.

Converting from decimal to binary

Converting between binary and hexadecimal. We describe a fast way to convert from the binary to hexadecimal representation of an integer. In principle, we could convert from binary to decimal, and then from decimal to hexadecimal. The following approach is simpler. To convert from binary to hexadecimal: first, group the digits 4 at a time starting from the right; then convert each group to a single hexadecimal digit, padding 0's to the very last group if necessary.

0011 1010 1110 0111 0001
 3     A    E    7   1

To convert from hexadecimal to binary: convert each hexadecimal digit individually into its corresponding 4 digit binary number, removing any leading 0's.

 9     F    0    3
1001 1111 0000 0011

These simple procedures work because one base is a power of the other. Likewise, it would be easy to convert between the base 125 and base 5 representations.

Number conversion in Java. Converting a string to another type is called parsing, and can be a significant computational burden, since there are normally numerous cases to consider. We have already used one such built-in static method extensively: Integer.parseInt converts from a string of decimal digits to an integer. The program BinaryConverter.java defines a static method fromBinaryString(s) that converts a string s of binary digits to an integer, and another static method toBinaryString(n) that converts an int n to a binary string. We note that Java's library calls Integer.parseInt(s, 2) and Integer.toBinaryString(n) accomplish the same goal.

Program HexInOut.java reads in a hexadecimal integer from standard input and prints it out in hexadecimal to standard output.

Arithmetic in other number systems.

Addition. In grade school you learned how to add two decimal integers: add the two least significant digits (rightmost digits); if the sum is more than 10, then carry a 1 and right down the sum modulo 10. Repeat with the next digit, but this time include the carry bit in the addition. The same procedure generalizes to base b by replacing the 10 with the base b. For example, if you are working in base 16 and the two summand digits are 7 and E, then you should carry a 1 and write down a 6 because 7 + E = 1616. Below, we compute 456710 + 36610 = 493310 in binary (left), decimal (middle) and hexadecimal (right).

  1 0 0 0 1 1 1 0 1 0 1 1 1
+ 0 0 0 0 1 0 1 1 0 1 1 1 0
  -------------------------
  1 0 0 1 1 0 1 0 0 0 1 0 1
  4 5 6 7
+   3 6 6
---------
  4 9 3 3
  1 1 D 7
+   1 6 E
---------
  1 3 4 5

Multiplication. One compelling reason to use positional number systems is to facilitate multiplication. Multiplying two Roman Numerals is awkward and slow. In contrast, the grade school algorithm for multiplying two decimal integers is straightforward and reasonably efficient. As with addition, it easily generalizes to handle base b integers. All intermediate single-digit multiplications and additions are done in base b. Below, we multiply the decimal integers 4,567 and 366 (left) and then again in hex (right).

      4 5 6 7
    *   3 6 6
    ---------
    2 7 4 0 2
  2 7 4 0 2
1 3 7 0 1
-------------
1 6 7 1 5 2 2
       1 1 D 7
     *   1 6 E
--------------
       F 9 C 2
     6 B 0 A
   1 1 D 7
--------------
   1 9 8 1 6 2

Signed and unsigned integers. On a machine with 16-bit words, there are 216 = 65,536 possible integers that can be stored in one word of memory. By interpreting the 16 bits as a binary number, we obtain an unsigned integer in the range 0 through 65,535. Instead, we can interpret the leading bit as the sign of the number, using two's complement notation. This allows us to interpret the 16 bits as a signed integer in the range -32,768 through +32,767, as described in the table below. As with binary integers, it is often convenient to express 16-bit two's complement integers using hexadecimal notation.

BINARY HEX DECIMAL
0000 0000 0000 0000 0000 0
0000 0000 0000 0001 0001 +1
0000 0000 0000 0010 0002 +2
0000 0000 0000 0011 0003 +3
...
0111 1111 1111 1110 7FFE +32,766
0111 1111 1111 1111 7FFF +32,767
1000 0000 0000 0000 8000 -32,768
1000 0000 0000 0001 8001 -32,767
1000 0000 0000 0010 8002 -32,766
...
1111 1111 1111 1101 FFFD -3
1111 1111 1111 1110 FFFE -2
1111 1111 1111 1111 FFFF -1


Negating a two's complement integer. To negate (change from positive to negative or negative to positive) a two's complement integer, first complement all of the bits, then add 1. By complement, we mean replace all of the 0's with 1's, and the 1's with 0's. The table below illustrates a few examples.

DECIMAL BINARY COMPLEMENT INCREMENT DECIMAL
+3 0000 0000 0000 0011 1111 1111 1111 1100 1111 1111 1111 1101 -3
-3 1111 1111 1111 1101 0000 0000 0000 0010 0000 0000 0000 0011 +3
+40 0000 0000 0010 1000 1111 1111 1101 0111 1111 1111 1101 1000 -40
-40 1111 1111 1101 1000 0000 0000 0010 0111 0000 0000 0010 1000 +40
+366 0000 0001 0110 1110 1111 1110 1001 0001 1111 1110 1001 0010 -366
0 0000 0000 0000 0000 1111 1111 1111 1111 0000 0000 0000 0000 0
-32,768 1000 0000 0000 0000 0111 1111 1111 1111 1000 0000 0000 0000 -32,768


Number Conversions with two's complement integers. We describe how to convert between the decimal and two's complement representations of an integer. To convert the 16-bit two's complement integer FE92 into decimal, we start by writing down its binary representation: 1111 1110 1001 0010. We recognize it as a negative integer since the most significant bit is 1. Then, we negate it (flip bits and add 1) to obtain: 0000 0001 0110 1110. We convert 101101110 (binary) to 366 (decimal). After putting back in the negative sign, we obtain the final answer of -366 (decimal).

To convert from the decimal integer -366 to its 16-bit two's complement representation, we can apply the above steps in reverse. First we convert 366 (decimal) into 101101110 (binary). Next, we negate it (flip bits and add one) to obtain 1111 1110 1001 0010. It is important that we fill in all 16 bits. If desired, we can convert this to hexadecimal: FE92.

Adding two's complement integers. Adding two's complement integers is straightforward: add the numbers as if they were unsigned integers, ignoring any overflow. Below we compute 456710 + -36610 = 420110 in decimal (left) and again in binary using 16-bit two's complement integers (right). Note that the second binary integer represents a negative integer using two's complement notation.

  4 5 6 7
+  -3 6 6
---------
  4 2 0 1
  0 0 0 1 0 0 0 1 1 1 0 1 0 1 1 1
+ 1 1 1 1 1 1 1 0 1 0 0 1 0 0 1 0
---------------------------------
  0 0 0 1 0 0 0 0 0 1 1 0 1 0 0 1

In the example above, we carry a one into the most significant bit (leftmost bit), and we carry a one out, which we subsequently discard. Despite the apparent overflow, we are left with the correct result. However, there is one situation where this will produce an incorrect answer: if we do not carry a one into the most significant bit, but do carry a one out. Below we compute -32,76610 + (-36610) = -33,13210 using 16-bit two's complement integers. At first, we might be surprised to see that the result of adding two negative integers is a positive integer (to see this quickly, look at the most significant bit of the result). This occurs because there is carry out of the most significant digit, but no carry in to it. In hindsight, we should not be surprised because the true answer -33,132 cannot be represented as a 16-bit two's complement integer.

   -3 2 7 6 6
 +     -3 6 6
-------------
   -3 3 1 3 2
  1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
+ 1 1 1 1 1 1 1 0 1 0 0 1 0 0 1 0
---------------------------------
  0 1 1 1 1 1 1 0 1 0 0 1 0 1 0 0

Bit-whacking operators in Java. In Java, an int is a 32-bit two's complement integer. Like TOY, Java supports a number of operators to manipulate the bits of integer types, as summarized below. These bit-whacking operators are especially useful when performing low-level data processing such as cryptography, data compression, error correction, and transmitting email. We will also use them in Section 5.5 to simulate the behavior of the TOY machine.

NAME OPERATOR PURPOSE
bitwise NOT ~x Flip all the bits of x
bitwise AND x & y Take the AND of each pair of bits
bitwise OR x | y Take the OR of each pair of bits
bitwise XOR x ^ y Take the XOR of each pair of bits
left shift x << y Move the bits of x to the left y positions
right shift x >> y Move the bits of x to the right y positions
unsigned right shift x >>> y Move the bits of x to the right y positions

Left shifting an integer by i bits means that each of the bits in the integer are moved i positions to the left. Zeros are padded to the right end. This is equivalent to multiplying the integer by 2i. For example 43 <3 is 344 and -43 <3 is -344.

 43       = 0000 0000 0000 0000 0000 0000 0010 1011
 43 <3 =0000 0000 0000 0000 0000 0001 0101 1000=344 -43=1111 1111 1111 1111 1111 1111 1101 0101 -43 << 3=1111 1111 1111 1111 1111 1110 1010 1000=-344 

Right shifting an integer means that you shift all of the bits i positions to the right, padding 0s to the beginning if the integer is positive and 1s if the integer is negative. It is equivalent to dividing the integer by 2i and rounding the result down (towards -infinity). For example 43 >> 3 is 5 and -43 >> 3 is -6. For positive integers, this is equivalent to integer division, but for negative integers, the rounding works in the opposite direction.

 43       = 0000 0000 0000 0000 0000 0000 0010 1011
 43 >>  3 = 0000 0000 0000 0000 0000 0000 0000 0101  =  5
 43 /   8 = 0000 0000 0000 0000 0000 0000 0000 0101  =  5
-43       = 1111 1111 1111 1111 1111 1111 1101 0101
-43 >>  3 = 1111 1111 1111 1111 1111 1111 1111 1010  = -6
-43 /   8 = 1111 1111 1111 1111 1111 1111 1111 1011  = -5

An unsigned right shift is identical to a signed right shift when applied to positive integers, but differs when applied to negative integers. An unsigned right shift always pads 0s to the left, regardless of whether the integer is positive or negative. For example, 43 >>> 3 is 5 but -43 >>> 3 is 536870906. To see why, observe that

 43       = 0000 0000 0000 0000 0000 0000 0010 1011
 43 >>> 3 = 0000 0000 0000 0000 0000 0000 0000 0101  =  5
-43       = 1111 1111 1111 1111 1111 1111 1101 0101
-43 >>> 3 = 0001 1111 1111 1111 1111 1111 1111 1010  = 536870906

Program Shift.java reads in two integers a and b from the command line and computes the above quantities and prints the results.

Big Endian, little endian. Computers differ in the way in which they store multi-byte chunks of information, e.g., the 16-bit short integer 0111000011110010 = 70F2. The two are two primary formats, and they differ only in the order or "endianness" in which they store the bytes. Big endian systems store the most significant bytes first, e.g., they store the integer above in its natural order 70F2. Java uses this format, as does Apple Mac, IBM PowerPC G5, Cray, and Sun Sparc. Little endian systems store the least significant bytes first, e.g., they store the integer above in reverse order 2F07. This format is more natural when manually performing arithmetic, e.g., since for addition you work from least significant byte to most significant byte. Intel 8086, Intel Pentium, Intel Xeon use this format. Computer scientists occasionally engage in religious wars about which is better. Fortunately, Java hides the endianness from the end user, so if you create binary data files in Java, you won't need to worry about endianness when sharing them over the Internet with Java users on Mac, PC, and Solaris platforms. Unless you need to read in a legacy binary file (e.g., written in C on a PC), you shouldn't have to directly confront these details.

Dangers of not recognizing overflow. In September 2004, a computer glitch grounded air traffic in southern California for several hours and left planes without radio communication. The software was running on Windows 2000 Advanced Server. The software relied on a Windows function named GetTickCount(), which returns the number of milliseconds since the system was started. Unfortunately, this value is represented using a 32-bit integer. After approximately 49.7 days, the value overflows. This glitch was known when the system was launched. To avoid the ensuing problems, the documentation recommended manually rebooting the system once a month.


Q + A

Q. Do all programming languages use 32 bit two's complement integers?

A. No. Java is unusual in that it completely specifies the representation of an int. In C, there is no requirement that an int be 32 bits or that it uses two's complement notation to represent negative integers. Different C compilers can represent integers in different ways, and this can lead to incompatibilities when trying to use the same program with different compilers.

Q. My program only needs integers between -32,768 and 32,767. Does using a 16-bit short make my program faster? Does it save space?

A. Typically a single short variable is internally stored using 32 bits (especially if the underlying hardware architecture is 32 bit), in which case it does not save any space to declare a single short. Moreover, it can take longer because the Java system must make the 32 bits behave as if it were representing a 16-bit two's complement integer. On the other hand, if you declare a huge array of type short, then the elements will be packed two to an int, and the short array will use approximately half as much memory as an int array of the same length.

Q. Why do I need to cast to add two variables of type short?

A. Java convert the results of most integer operations to be of type int. If a, b, and c are of type short, then a + b is promoted to type int, and assigning this sum to a short requires an explicit cast. One exception to this rule is if you use +=, in which case the cast is not necessary.

Q. Can I apply bitwise & and bitwise | to boolean values? If so, is there any difference between the corresponding logical operators && and ||?

A. Yes. The difference is that the logical operators are subject to short circuiting: the expression (f(x) && g(x)) will not evaluate g(x) if f(x) is false. This also explains why there is no ^^ operator for logical XOR: it is never possible to short-circuit an XOR, so it would always be identical to bitwise ^.

Exercises

  1. Convert the decimal number 92 to binary.

    Answer: 1100100.

  2. Convert the hexadecimal number BB23A to octal.

    Answer: first convert to binary 1011 1011 0010 0011 1010, then consider the bits three at a time 10 111 011 001 000 111 010, and convert to octal 2731072.

  3. Add the two hexadecimal numbers 23AC and 4B80 and give the result in hexadecimal. Hint: add directly in hex instead of converting to decimal, adding, and converting back.
  4. Assume m and n are positive integers. How many 1 bits are there in the binary representation of 2^(m+n)?
  5. How many bits are in the binary representation of 2^2^2^2^17?
  6. IPv4 is the protocol developed in the 1970s that dictates how computers on the Internet communicate. Each computer on the Internet needs it own Internet address. IPv4 uses 32 bit addresses. How many computers can the Internet handle? Is this enough for every human being to have their own? Every cell phone and toaster?
  7. IPv6 is an Internet protocol in which each computer has a 128 bit address. How many computers would the Internet be able to handle if this standard is adopted? Is this enough? Answer: 2^128. That at least enough for the short term - 5000 addresses per square micrometer of the Earth's surface!
  8. When you but a hard drive, 1 GB means 1,000 MB (megabytes) and 1 MB means 1,000 KB (kilobytes) and 1 KB means 1,000 bytes. But when you buy memory, 1 GB means 1,024 MB, 1 MB means 1,024 KB, and 1 KB means 1,024 bytes. What percentage difference is there in the amount of storage in a 100 MB hard drive vs. 100 MB memory? 1GB hard drive vs. 1 GB memory? (1024/1000)2 = 4.9% and (1024/1000)3 = 7.4%.
  9. Why does the following code fragment fail?
    short a = 4;
    short b = 5;
    short c = a + b;
    
    Answer: Java automatically promotes the sum to be of type int. To assign the result to a short, you need to explicitly cast it back c = (short) (a + b). Yes, this is rather quirky.
  10. What does the following code fragment from program Overflow.java print out?
    int a = 2147483647;   // 2^31 - 1
    int b = a + 1;
    System.out.println("a = " + a);
    System.out.println("b = " + b);
    
  11. What does the following code fragment print out?
    int a = -5 >>  3;
    int b = -5 >>> 3;
    System.out.println(a);
    System.out.println(b);
    
  12. List all values a of type int for which (a == (a >> 1)). Hint: there is more than one.
  13. What does the following code fragment print out?
    int a = 11 & 17;
    int b = 11 ^ 17;
    int c = 11 | 17;
    int d = ~11;
    System.out.println(a);
    System.out.println(b);
    System.out.println(c);
    System.out.println(d);
    
  14. Given two positive integers a and b, what result does the following Java code fragment leave in c?
    c = 0;
    while (b > 0) {
       if (b & 1 == 1) c = c + a;
       b = b >> 1;
       a = a <1; } 

    Answer: a * b.

  15. What does the following code do to the integers stored in two different variables a and b?
    a = a ^ b;
    b = a ^ b;
    a = a ^ b;
    
  16. Repeat the previous question, but assume a and b are the same variable.
  17. What does the following code do to the integers stored in two different variables a and b? Any problems with overflow?
    a = a + b;
    b = a - b;
    a = a - b;
    
  18. What do each of the following to statements do?
    x = - ~x;
    x = ~ -x;
    

    Increment x, decrement x

  19. Modify Binary.java so that it converts from base 7 to decimal and vice versa.
  20. What is the value of cnt after the following loop?
    int cnt = 0;
    for (int i = 1; i != 0; i = 2 * i) {
        cnt++;
    }
    
    Hint: it's not an infinite loop.
  21. Explain why the following Java code fragment correctly determines whether the integer n is a power of 2.
    boolean isPowerOfTwo = (n & -n) == n;
    

Creative Exercises

  1. Linear feedback shift register. Rewrite LFSR.java to simulate the linear feedback shift register from Chapter 1 using bit-whacking operations.
  2. Linear feedback shift register cycle length. Modify the program from the previous exercise to compute the cycle length of the LFSR using Floyd's method. What if you change the taps to xyz?
  3. IP addresses and IP numbers An IP address (IPV4) is comprised of integers w, x, y, and z and is typically written as the string w.x.y.z. The corresponding IP number is given by
    IP number = 16777216*w + 65536*x + 256*y + z. Given an IP number, the corresponding IP address is w = (ipnum / 16777216 ) % 256, x = (ipnum / 65536) % 256, y = (ipnum / 256) % 256, z = (ipnum) % 256. [Or use shifting and masking.]

    Write a function that takes an IP number and returns a String corresponding to the IP address. Write a function that takes an IP address and returns a int corresponding to the IP number. For example, if the IP number is 3401190660, then the function should return "202.186.13.4".

  4. IP address. Write a program that takes a 32 bit string as a command line argument, and prints out the corresponding IP address in dotted decimal form. That is, take the bits 8 at a time, convert each group to decimal, and separate each group with a dot. For example, the binary IP address 01010000000100000000000000000001 should be converted to 80.16.0.1.
  5. Base64 encoding. Base64 encoding is a popular method for sending binary data over the Internet. It converts arbitrary data to ASCII text, which can be emailed back between systems without problems. Write a program to read in a arbitrary binary file and encode it using Base64.
  6. Counting in base -2. Use the definition of the positional notation to define the base -2 number system. There are two digits 0 and 1. Count from -7 to 7 in this system.

    Answer:

    0 = 0                         -1 = 11   (-2 + 1)
    1 = 1                         -2 = 10
    2 = 110   (4 + -2)            -3 = 1101 (-8 + 4 + 1)
    3 = 111                       -4 = 100
    4 = 100                       -5 = 1111
    5 = 101                       -6 = 1110
    6 = 11010 (16 + -8 + -2)      -7 = 1001
    7 = 11011
    System.out.println(ans);
    
  7. RGBA color format. Some of Java's classes (BufferedImage, PixelGrabber) use a special encoding called RGBA to store the color of each pixel. The format consists of four integers, representing the red, green, and blue intensities from 0 (not present) to 255 (fully used), and also the alpha transparency value from 0 (transparent) to 255 (opaque). The four 8-bit integers are compacted into a single 32-bit integer. Write a code fragment to extract the four components from the RGBA integer, and to go the other way.
    // extract
    int alpha = (rgba >> 24) & 0xff;
    int red   = (rgba >> 16) & 0xff;
    int green = (rgba >>  8) & 0xff;
    int blue  = (rgba >>  0) & 0xff;
    
    // write back
    rgba = (alpha <24) | (red << 16) | (green << 8) | (blue << 0); system.out.println(ans); 
  8. Find the missing value. Suppose you have an array consisting of 232 - 1 integers of type int such that no two integer appears more than once. Since there are 232 possible values, exactly one integer is missing. Write a code fragment to find the missing integer using as little extra storage as possible.

    Hint: this is a popular interview question. It's possible to do it using only one extra int. Use either properties of integer overflow on two's complement integers or use the XOR function.

  9. Cyclic redundancy check. Write programs CRC16.java and CRC32.java that read in data from standard input and computes its 16 or 32-bit CRC. Use the library CharStdIn.
  10. Number conversion. Write a program Converter.java that converts between base b and decimal for any 2 &le b ≤ 36. You should have a static method toString(int n, int b) that converts n to a base b string and a static method fromString(String s, int b) that converts from a base b string to an integer. Consider defining and using

    String digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    

  11. Excel column numbering. Write a function that takes a nonnegative integer and converts it into the corresponding Excel column name (0 = A, 1 = B, ..., 25 = Z, 26 = AA, ..., 702 = AAA).
  12. Elias Gamma coding. Write a function elias that takes as input an integer N and returns the Elias Gamma code as a string. The Elias Gamma code is a scheme to encode the positive integers. To generate the code for an integer N, write the integer N in binary, subtract 1 from the number of bits in the binary encoding, and prepend that many zeros. For example, the code for the first 10 positive integers is given below.
    1 1       6 00110
    2 010     7 00111
    3 011     8 0001000
    4 00100   9 0001001
    5 00101  10 0001010
    
  13. Bit reversal. Write a function that takes an integer input, reverse its bits, and returns that integer. For example if n = 8, and the input is 13 (00001101), then its reversal is 176 (10110000).
    public static int bitReverse(int input) {
       int ans = 0;
       for (int i = 0; i > 1;
       }
       return ans;
    }
    
  14. Bit-reversal sorting. Use the previous algorithm to "sort" an array of N = 2n elements into their bit-reversed order. Swap elements i and j if i and j are bit reversal of each other. Such permutations arise in the Fast Fourier Transform.
    0    1    2    3    4    5    6        12   13   14   15
    0000 0001 0010 0011 0100 0101 0110 ... 1100 1101 1110 1111
    
    0000 1000 0100 1100 0010 1010 0110 ... 0011 1011 0111 1111
    0    8    4    9    2    10    6        3   11    7   15
    
  15. Swap without temporary storage. What do the following two code fragments do given integers a and b?
    a = a + b;
    b = a - b;
    a = a - b;
    
    a = a ^ b;
    b = a ^ b;
    a = a ^ b;
    
    Answer: each 3-line fragment swaps a and b. It works for two's complement notation.
  16. Gray codes. An order n Gray code is an n-bit encoding of the integers 0 to 2n - 1 such that adjacent numbers differ in only one bit. The following is a 3-bit Gray code.
    000 001 011 010 110 111 101 100
     0   1   3   2   6   7   5   4
    

    Gray codes have many applications. For example, they can be used to increase the reliability of counting of analog-to-digital converters. Counting by Gray code reduces the chance of such errors since only one bit of the counter changes in each step. Gray codes have also been used in pulse code communication, the minimization of logic circuits, hypercube architectures, and were even proposed to organize books on library shelves.

    The binary reflected Gray code of order n is generated as follows. The code of order one is 0, 1. To get the code of order two, write it forwards then backwards: 0, 1, 1, 0. Then, pre-pend 0's to the first half and 1's to the second half: 00, 01, 11, 10. In general, to get the code of order n, write the code of order n-1 forwards and backwards. Then pre-pend 0's to the first half and 1's to the second half. Write a program GrayCode.java that reads in a parameter N and prints out the order N binary reflected Gray code. Observe that the index of the bit that changes in line i is the disc that is moved in the ith move of our solution to the Towers of Hanoi problem!

  17. Puzzle. There are n people waiting outside an empty room. Each minute you can send one person into the room or take one person out of the room. Can you arrange it so that every possible combination of people is attained exactly once?
  18. Bit-whacking version of Gray codes Repeat the previous exercise, but use bit-whacking operations and iteration instead of recursion. Name your program BitWhackingGrayCode.java.
  19. Gray codes II. Redo the previous exercise but do not use Java's built in string class. Instead, do it with a boolean array and name your program GrayCodeArray.java.
  20. Combinational Gray Code. Print out all combination of k of n items in such a way that consecutive combinations differ in exactly one element, e.g., if k = 3 and n = 5, 123, 134, 234, 124, 145, 245, 345, 135, 235, 125. Hint: use the Gray code, but only print out those integers with exactly k 1's in their binary representation.
  21. Free the prisoners I. A warden meets with 17 new prisoners when they arrive. The warden tells them that they may meet today and plan a strategy, but after the meeting, each prisoner will be in solitary confinement and will not be able to communicate with one another. The prison has a switch room with 17 switches that can be on or off, although the initial configuration is not revealed. There is one special setting of the 17 switches that if it is ever achieved will enable the prisoners to go free. Each hour the warden escorts one prisoner to the switch room, where the prisoner can flip at most one switch (from on to off or off to on). The warden can choose prisoners in arbitrary order, so one prisoner may be chosen four times in a row, or not at all. Design a strategy for the 17 prisoners so that they are guaranteed to be set free after some finite amount of time.
  22. Free the prisoners II. Same premise as above, except that the switch room has 2 switches (initially both off), and a prisoner must flip exactly one of the two switches upon entering the switch room. At any time, a prisoner may declare "all 17 of us have visited the control room." If it is true, all prisoners are freed; otherwise they are all executed. The warden can choose prisoners in arbitrary order, so one prisoner may be chosen four times in a row, but each prisoner will be chosen infinitely often (assuming they are never freed). Design a strategy for the 17 prisoners so that they are guaranteed to be set free after some finite amount of time. Extra credit: don't assume the initial configuration is known.
  23. Count the number of 1 bits. Write function that takes an integer input and returns the number of 1's in its binary representation.

    Answer:

    public static int bitCount(int input) {
       int count = 0;
       for (int i = 0; i <32; i++) count=count + (input>>> i & 1);
       return count;
    }
    
  24. Sparse bit-counting. Explain why the following function (that often appears in job interviews for programmers) correctly counts the number of 1 bits in the binary representation of its input. If the input has k 1's, how many times does the while loop iterate?
    public static int bitCount(int input) {
       int count = 0;
       while (input != 0) {
           count++;
           input = input & (input - 1);
       }
       return count;
    }
    
  25. Table lookup bit-counting. Repeat the previous exercise, but pre-compute a table to speed up the computation.

    Answer: this one assumes you have a precomputed table of size 256, with bits[i] storing the number of 1 bits in the binary representation of i. You can use the bit counting function from the previous exercise to initialize it. previous

    public static int bitCount(int input) {
        return bits[(input >>  0) & 0xff]
            +  bits[(input >>  8) & 0xff]
            +  bits[(input >> 16) & 0xff]
            +  bits[(input >> 24) & 0xff];
    }
    

    Increasing the table size to 216 = 65,536 will make things faster assuming you have sufficient memory. A table of size 232 is likely prohibitive.

  26. Subset sum. Given N integers and a target value V, your goal is to find a subset of the integers whose sum is as close to V as possible. Write a program SubsetSum.java that takes N as a command line input, reads in N integers from standard input, and print out the best subset and sum. Enumerate all 2^N subsets by brute force, using an integer variable to count from 0 to 2^N - 1, and using bit-whacking to calculate the subset sums.
  27. Faster subset sum. Repeat the previous exercise, but enumerate the 2^N subsets using a Gray code.
  28. Even faster subset sum. Divide the elements in half. Enumerate the subset sums in each half, sort each half, and compare.
  29. Exact subset sum. Divide the elements in half. Enumerate the subset sums in the first half and insert into a hash table, enumerate the subset sums in the second half and check whether the complementary value is in the hash table.
  30. Dictionary attack. One method that sleazy spammers use to auto-generate email addresses is by enumerating all possible email addresses at a give domain, e.g., hotmail.com. This annoying tactic is called a dictionary or Rumpelstiltskin attack and explains why you sometimes receive spam on a new email address to which you haven't given to anybody. Use Converter.java to design such a program. Your program Rumpelstiltskin.java should take a command line parameter N and print out all 36N possible passwords of N or fewer characters involving numbers and uppercase letters.
  31. Breaking a gold chain. You have a gold chain with 14 links that you are going to use to pay an worker for 15 days at a fee of 1 gold link per day. It's possible to split the chain into 15 pieces by cutting 14 times. Your goal is to pay the worker while only breaking the chain 3 times. The worker must receive exactly the right fraction of total payment after each day of work. Hint: break the chain so there are pieces of 1 section, 2 sections, 4 sections, and 8 sections.
  32. Hamming encoder. Write a Java program HammingEncoder.java that reads in a sequence of 0s and 1s, 4 bits at a time, and encodes them using Hamming codes.
  33. Hamming decoder. Write a Java program HammingDecoder.java that reads in a sequence of 0s and 1s encoded using Hamming codes, 7 bits at a time, and decodes and correct them.
  34. Hamming codes. Modify your solutions to the previous two exercises so that the input bits are packed 8 to the byte.
  35. Absolute value. The constant Integer.MIN_VALUE is the most negative 32-bit two's complement integer. What is Math.abs(Integer.MIN_VALUE)?
  36. Prove that a k-digit decimal number can be represented in binary with no more than 4k bits.
  37. CD Database. CDDB and freedb are databases that allow you to look up CD information on the Web and display the artist, title, and song name. Each CD has a (nearly) unique disc ID number which is used to query the database.
    1. Write a static method sumDigits() that takes an integer parameter and returns the sum of the decimal digits in the integer. For example, sumDigits(6324) returns 15 since 6 + 3 + 2 + 4 = 15.
    2. Write a program CDDB.java that computes the disc ID from a list of lengths of the track lengths. The 32-bit (8 hex digit) ID number is computed from the length of the tracks on the CD and the number of tracks as follows:
      XXYYYYZZ
        XX   = checksum of track offsets in seconds, taken mod 255
        YYYY = length of the CD in seconds
        ZZ   = number of tracks on the CD