# Collatz conjecture experiments

this section contains some experiments on the collatz conjecture which has deep connections to TCS.[4][19] this is the nagging, taunting, ¾-century-unsolved problem for which Erdös semifamously said:

Mathematics is not yet ripe for such problems. —Erdös

also Terrence Tao immediately cites it when referring to “mathematical diseases”[9] and as “One of the most notorious problems in elementary mathematics that remains unsolved”.[7] he compares succeeding on such problems to “winning the lottery”. maybe apropriately, the following is a work-in-progress & this tiny writeup is sorely sketchy and incomplete at this point.

the approach Ive followed for many years is to construct a FSM transducer for iterations as described on the wikipedia page, “as an abstract machine that computes in base 2″ [3] although the origination of the technique is not so clear.[5] the FSM transducer computes iterates. built the transducer in the mid 1990s along with various auxiliary tools mainly for FSM logic. in the early ~2000s used the AT&T FSM library in many ways with this unique formulation.[6]

this approach has some strong connections to Shallit & Wilson [1] and is also similar to [2].

via this angle have found various interesting properties that probably have not been explored in the literature [which is large & forbidding, see eg the excellent survey/bibliography by Lagarias cited in [5]]. however, a proof is still elusive.

the general idea was/is to do a kind of experimental/empirical “datamining” on the algorithmic innards of the problem. this has led to many different views/angles/perspectives/analyses of the problem but (surprise!) not to a breakthrough so far.

fsm3nb.txt
this is the Collatz FSM iterate transducer. the $s_{from}$ and $s_{to}$ (state-from, state-to) are given in the 1st and 2nd column. the 3rd column is the input digit. the 4th column is the output digit. a 0 in the input or output column indicates $\epsilon$ ie the empty string. the transducer computes iterates of the Collatz problem in binary, specified in lsb to msb order (least significant to most significant bit). binary digits are specified as pairs. the pair 11 indicates binary bit $0$ and 12 indicates bit $1$. the input number is terminated with the digit 2 in the first position. the final/end state is 30 with no new state, input, or output.
tree5.rb
this generates what might be called the “collatz graph”. the graph is all integers processed by the transducer working in series, like a “sandwich”. this graph can be explored in different orders. the general idea is that a particular enumeration order might lead to more insight or some kind of regular (as opposed to quasirandom) structure that would be amenable to induction.[8] there are currently two sort orders specified, rank1 and rank2 which are switched in the rank function. the code outputs the current iteration related statistics like the size of the graph. the prior and new graph nodes codes generated are output.
tree3b.rb
this code enumerates integers in a greedy order based on the size of the ensuing transducer sequence, but without the terminating ’1′ symbol that triggers the cascading final computation, which leads to a “cocked” transducer sequence/array (and also a natural equivalence class structure for assigning/sorting integers into). it also alternatively calculates the transducer array sequence via a recursive or fractal-like formula, and then verifies they are equal.

have many ideas on this problem & am more motivated to go into more detail with an engaged audience. therefore if you have any interest or reaction, please comment & it will help inspire me to continue to improve the detail of the writeup.

at times it feels a breakthrough is in the midst, within nearby reach, other times it all feels like a hopeless mirage not achievable in a single lifetime (there is some circumstantial evidence the problem may be undecidable, see eg [4]).

however do want to say one hopeful thing—have long strongly suspected that this problem may be resolvable with not-too-complex TCS that can not be easily/readily formulated in mathematical theory eg number theory, and the refs here would tend to support that and point in that direction.

note: complete ref [1] is

Jeffrey O. Shallit and David W. Wilson (1991), The “3x + 1” Problem and Finite Automata, Bulletin of the EATCS (European Association for Theoretical Computer Science), No. 46, 1991, pp. 182–185.