Dr M. Levison --- Assignment #1
Solutions

(§ denotes the original question.)

§ 1. Draw the state transition diagram of an fsm to accept the set of binary integers divisible by 3.

Placing 0 on the end of a binary integer p gives 2p; placing 1 gives 2p + 1. The same is also true of the remainder on dividing p by 3, which will be denoted by p mod 3.


Then, if p mod 3 = 0     2p mod 3 = 0    (2p + 1) mod 3 = 1
                   1                2                     0
                   2                1                     2

So, if we have three states Q0, Q1, Q2 which indicate whether p mod 3 is 0, 1 or 2, we get the following fsm, in which Q0 (remainder 0) is both the initial and the accepting state:


          _____                                         _____
         |     |0                                     1|     |
         |     |  1                   0                |     |
          --> Q0 ---------------> Q1 ---------------> Q2 <---
               ^                 1/ ^                0/
               |_________________/  |________________/

§ Use one of the Kleene-Myhill theorems to find a regular expression for binary integers divisible
by 3.

Omitting Q from the names of the states, and using the notation of the Kleene-Myhill fsm-to-regexp notes, we want the regular expression R[00].

From (b), R[00] = R[00,0]*

From (d), R[00,0] = P[00]
         v P[01] R[11,0] P[10]
         v P[01] R[12,0] P[20]
         v P[02] R[21,0] P[10]
         v P[02] R[22,0] P[20]

Now P[02] and P[20] are empty, P[00] = 0, P[01] = 1 and P[10] = 1.

So R[00,0] = 0 v 1 R[11,0] 1

Now we can apply the same process to find R[11,0] in the fsm containing Q1 and Q2 only. But we can surely see visually that a single round trip from Q1 to Q1 is 01*0, and that a journey with multiple round trips is (01*0)*.

So R[00] = (0 v 1(01*0)*1)*

§ 2. A language L is accepted by some finite state machine F. Describe how to adapt F to recognize the language consisting of all the strings of L reversed.

Simply reverse every transition arrow of F, make the initial state of F the accepting state of the new fsm, and the accepting state of F the initial state of the new fsm. Then any path leading to an acceptance in F can be traversed backwards in the new fsm, which will therefore accept the reverse strings of F. The new fsm may be non-deterministic, even if F is deterministic.

A small problem: what if F has several accepting states? The new fsm will have several initial states, which seems to break the rules. The solution is to add an extra state which becomes the only initial state of the new fsm, and copy onto it all the arrows leaving any of the previous initial states. This is exactly what we have done elsewhere when constructing an fsm for E v F, and for exactly the same reason. In that case, the machine for E v F is simply the combination of the two machines E and F regarded as a single fsm; but that has two initial states, so we require the extra state.

§ Use this construction on the fsm of (1) above. Do you notice anything curious?

In the case of my fsm, this construction produces exactly the fsm which we started with.

§ What does this imply?

That if we reverse a binary number which is divisible by 3, the reverse is also divisible by 3.

The same, by the way, applies to decimal numbers divisible by 11 (consider 132 and 231, 1474 and 4741, etc.); to octal numbers divisible by 9; and in general to base n numbers divisible by (n + 1).

§ 3. A language L is accepted by a deterministic fsm F. Show how to adapt F to recognize the complement language L consisting of all strings over the alphabet not in L.

Just make every non-accepting state of F an accepting state of the new fsm, and vice versa. Every string leads to exactly the same path in the new machine as in F, but the result is the opposite.

§ Why does this adaptation work only for deterministic fsm's?

Because if F is non-deterministic, some string x in L might give rise to several alternative paths in F, one reaching an accepting state A, another reaching a non-accepting state B. If you make the proposed change to F, one of the paths of x will still lead to B, which is now an accepting state. So x will be found to be both in L and in its complement.

Basically, an nfsm accepts if one of its paths accepts. The negative is that ALL of the paths reject; so, in the "inverted" F, we need ALL of the paths to accept. But this is not what an nfsm does.

So we have to replace the nfsm by a deterministic fsm before doing the "inversion".

§ 4. Write down representations in each of the decimal, unary, binary and octal bases for the first few powers of 2.

decimal: 1, 2, 4, 8, 16, 32, ...

unary: 1, 11, 1111, 11111111, 1111111111111111, ...

binary: 1, 10, 100, 1000, 10000, 100000, ...

octal: 1, 2, 4, 10, 20, 40, 100, 200, 400, 1000, ...

§ Prove that the language consisting of the unary representations:

{1ⁿ | n >= 1, n is a power of 2}

is not accepted by any fsm.

Suppose there is such a machine and it has q states. Choose any power of 2 which is greater than q, say 2 power r.

Then we must be able to split 2 power r into three integers x, y and z, with y non-zero, such that the fsm accepts (x + ky + z) 1's for all integers k >= 0.

But this means that (x + ky + z) must be a power of 2 for all integers k >= 0. But these can't all be powers of 2 because the gap, y, between successive values is constant, while the gap between successive powers of 2 increases exponentially, and must eventually exceed y. This contradiction invalidates the supposition, and proves the required result.

§ On the other hand, write down regular expressions for languages of the binary, and of the octal, representations.

In the binary base, the powers of 2 are described by the regular expr 10* (or if you wish, 0*10*, since zeros at the start don't change the number.)

In octal, (1 v 2 v 4) 0* or 0*(1 v 2 v 4) 0*.

§ The first of these results seems to suggest that no fsm can discover whether an integer is a power of 2. The later results seem to contradict this. What is going on here?

The superficial answer to this is that the representation matters. We must not claim that an fsm cannot recognize the powers of 2, only that an fsm cannot recognize the powers of 2 expressed in the unary base or in decimal.

But the question deserves a deeper answer. What is really going on is that in the binary and octal cases, the information about whether the number is a power of 2 or not is already explicit in the input. The task which an fsm cannot carry out has already been done by whoever/whatever converted the number into binary or octal. Given, say, a unary or decimal representation of an integer, no fsm can translate this to an octal or binary representation. (We have noted elsewhere that the outputs of fsms are regular languages.)

A second example may make this clearer. We prove below that no fsm can recognize the prime numbers expressed in unary form (or for that matter, decimal). But let me propose a weird new representation for the set of integers >= 1. (This may not be original. I suspect that Donald Knuth may have mentioned it in Vol 1 of The Art of Computer Programming.) This representation is based on the fact that all integers can be expressed uniquely as the product of prime numbers.

In my representation, the digits starting from the left refer to the successive primes 2, 3, 5, 7, 11, ... respectively. Each digit specifies the power of that prime in the prime factorization of the integer.

So 2 is represented by 1, 3 by 01, 4 by 2, 8 by 3, 12 by 21, 36 by 22, etc. (the last implying that the number is 2 x 2 x 3 x 3). Of course, we have a problem when some integer has a prime factor to a power greater than 9, but then we enclose the power in []. So we represent 1024 by [10], and 3072 by [10]1.

In this representation, the prime numbers are 0*10*, and are clearly accepted by an fsm. But it is surely obvious in this case that whoever writes down the representation must compute the prime factorization, and therefore do all the work necessary for determining if a number is prime.

The analogous situation applies to powers of 2. In this case, the binary representation of a number is its expression as a sum of distinct powers of 2, and computing the representation of n gives its status as a power of 2.

[By the way, this integer representation simplifies the multiplication algorithm, but is very awkward for addition!]

§ 5. Prove that there is no fsm to recognize the language

{1^p | p is a prime number}

Suppose there is such a machine and it has q states. Choose any prime number p greater than q.

Then we must be able to find x, y and z, with y non-zero, such that p = x + y + z and x + ky + z is prime for all integers k >= 0.

It now remains to prove that we can always choose k in such a way that x + ky + z can be factorized into factors >= 2. This contradiction will invalidate the supposition, and prove the required result.

Let x + z be denoted by w.

On the face of it, all we have to do is choose k to be w. For then, x + ky + z = w + wy = w(1 + y), and this exhibits a factorization of the supposed prime. Unfortunately, although 1 + y >= 2, since y > 0, this breaks down if w = 0 or w = 1, requiring us to cover these cases separately.

Better still, note that w + ky = (w + 2y) + (k - 2)y, which becomes (w + 2y) + (w + 2y)y if we choose k to be w + 2y + 2.

Then x + ky + z = (w + 2y) + (w + 2y)y = (w + 2y)(1 + y), and each of these factors >= 2, even if w is 0 or 1. So x + ky + z is not a prime number for this choice of k, giving the contradiction and completing the proof.