❤ 💡 didnt expect to focus on collatz so much when starting this blog, but hard problems require (sometimes intense) focus! as you can see/ tell it has sucked me in. in the sense of “femme fatale” maybe? such a much more interesting/ evocative metaphor imho than others that are proposed.
am gearing up for a new saga/ installment/ pivot (because last mos saga ofc wasnt quite enough, lol!). it requires a bit of “psyching up” as the vernacular goes. it involves dredging up some (now) very old ideas that are potentially quite powerful but at this point have cobwebs on them. blast from the past™/ back from the dead™ one might say dramatically. yeah its a possibly big success story of long having something valuable/ powerful in the back pocket and rescuscitating it. its also a case of years-long near miss/ strong/ shiny/ sparkling lead that came close at the time but didnt realize exactly/ completely pursue it in the moment. what goes around comes around™.
background (“static”), have been assigned a new fat/ significant prj at work and am a bit bummed/ chagrined/ annoyed about that because it cuts into this “fun time with collatz” (so to speak!). usually end-of-year period allows some respite/ hiatus as the code goes into a temporary freeze period. whatever
😎 more painting scenic/ colorful background: on the porch in backyard this second with cap + sunglasses + headband + tshirt + long sleeve shirt + sweatshirt + thick black leather jacket + long underwear (“layers”) vs the sun rise, next to nice winebarrel firepit, rose incense in the shamanic wood tower holder imported from eucador, admiring my nice sculpture collection (from fish tank decor, repurposed!) including the 2 new 2ft faux rock dragon protectors doing a great job so far. if someone is really nice/ or maybe just human to me in comments, may post picture highlights! believe me its well worth it! it started out at 40°F on this fall day but already up to 50° in about only 30m. the headband + bike gloves with open fingertips defn help! am lately thinking/ musing that maybe the problem difficulty should be measured not in midnight oil mentioned earlier but in incense sticks burned, and it is quite numerous/ sizeable so far!
other news: am starting Frenkels book Love and Math 2013, was tipped on this half a year by colorful chat denizen
0celo7 RIP (its a long story™… oops post correction it was cohort
Balarka) who pointed out the extraordinary porn star “erotica angle” lol! wow, iirc skipped buying this at the time, thought it was a little too fluffy maybe, but now think its an instant classic. he talks about beauty in math a lot the way Hardy did in Mathematicians apology. poetic! what a great writer!
ok heres the rundown. its back to the FSM/ transducer formulation to try to analyze the long 0/1 runs behavior after mappings wrt unlimited size iterates. looking/ searching thru blog history, generally havent poked at the FSMs in ~2½ yrs and in 2nd case it was for another idea besides collatz (prime analysis/ auto thm proving bertrands postulate via SAT etc). seem to have been inspired at time partly by the Konev/ Lisitsa SAT-EDP work in 2014 and have said so in this blog on occasion. that work is outstanding and highly aligned with certain directions/ themes/ approaches pointed out/ emphasized/ advocated in this blog repeatedly over the years but it seems like few have gone in the same or similar directions since then (maybe even the authors themselves!). ❓
the most recent blogs on this are from 1/2016 and 4/2016 which utilize the OpenFST library. another motivation at the time was that FSMs have a natural way to “compress” information using the minimization algorithm and this is an early investigation into entropy/ (dis)order concepts surrounding collatz before running into them more directly more recently. the other major “elephant in the room” around here wrt this is this early writeup that references transducer code.
(later) recalled writing up some of these ideas long ago but couldnt recall details. yep, almost exactly 3 yrs ago oct 2015 was musing on transducers and the relations between input and output lengths, citing my SE questions, this one is the closest to current directions. suspect one of the answers got deleted, the current one mentions the deletion, not sure what the deleted answer was about/ who wrote it, alas. (disliked those SE history vaporizing aspects.) the current question under consideration has some vague resemblance.
(11/4) this took a few hours of some careful coding but is basically/ simply a “warmup exercise”. it demonstrates the ideas of transduction and composition of the collatz transducer. the composition concept is alluded to a lot over the years in this blog, have written sophisticated code on that in the past, but no code is posted on that until now, and those following closely (is anybody out there?!?) will understand it will play a key role in this current angle. the transducer constructed was given here a few years ago on the general outline pg just cited. it combines the 3n + 1 and repeated div2 ops in a single pass (1st followed by 2nd) and runs correctly on odd #s (only) greater than 1. there is some binary encoding over 2 symbols per bit, lsb to msb, followed by a symbol
2 terminator, as described on the general pg. some transducer code is also already given on the general pg. this is a rewrite.
fsm3 batch file demonstrates various sophisticated concepts using the FST(M) library cited, ie/ eg/ mainly the composition of 2 (identical) collatz transducers into a single transduction. the new library is compatible with the old AT&T FSM library format. there is a little code to handle ϵ-epsilon in/ out transitions in the transducer but its not fully general, it “only” is enough logic to work on the collatz transducer format given, where epsilon transitions do not occur with other transitions, and there are none at the starting state. the 2-transducer construction also has a
fstepsnormalize which effectively takes them out of the final results. theres also an “arcsort” which is required by the library for some algorithms/ input files, basically another simple normalizing operation, it would be more convenient for purposes here if the parsers didnt require this. the test code verifies the 1- and 2-transducer constructions work for the case “27”. there is encoding and decoding of binary-to-symbols (“alphabet”) and symbols-to-binary. then it selects 3 random numbers under 1000 and tests both the 1- and 2-transducers with no failures.
(11/8) 😀 heres another exercise that is still mostly “warmup for the main event” but gets/ advances everything closer within shooting distance so to speak and is again not very much code but nontrivial to put together and debug/ validate carefully. the basic idea is that next steps will require a FSM encoding “all integers with 0/1 binary bit runs less than n“. it is a “bit” )( quite extraordinary that FSMs can encode such concepts/ sets, they are not available elsewhere in number theory, but this is considered rather elementary CS. more crosscutting/ interdisciplinary action/ building bridges!
this took several pieces of code. the 1st batch file constructs 2 FSMs for 10-runs and 9-runs calling the
fstlen generator which outputs a symbolic description of the FSM to
stdout. the language is fairly easy to understand, there are basically 2 alternating state runs/ branches, 1 for each bit 0/ 1, and which count the # of sequential bits currently generated, from lsb to msb with at least one msb set (ie not ending in a 0 bit). then there is some FST code to compile and determinize these for further processing (as earlier misc cmds from this list). here the arcsort seems to be unnecessary after the determinize step, but its undocumented that/ whether determinize sorts output arcs (actually the cmd documentation really leaves something to be desired).
then FST difference is called on the 10-runs minus the 9-runs sets. it is not completely obvious how to visualize this but the remaining set will have “all” (odd) integers (in binary) with at least 1 10-run in either/ and/ or both 0/1 bits. this can be verified in 3 separate trials which use the helpful/ convenient random path generator from the FSM library to create a random accepted string from the result (difference) set, parse it into the 0/1 binary format from the 2-symbol format using the previous/ recent
sym2bin code/ function adapted, and then break up the bit runs for analysis, correctly finding the 10-run instances sometimes as either 0 or 1 bit runs in all 3 cases. apparently the randomizer tends to find cases where the 1st runs are 10 length but other samples show this is not always the case.
am thinking more about the
mix22 code from last may, it looks at subsequent maximum counts in 0/1 bit runs vs seed sizes and finds it exceeds/ increases gradually. but its a little undefined based on the ruby
max operator: does it find the 1st or the last maximum in a list if there are multiple equivalent maximums in the list, is it consistent/ defined? (easy to check, but away from coding right now, iirc that is the case.) that experiment somewhat directly looks into whether 0/1 bit run lengths can increase in trajectories and answers affirmatively, but on now 2nd look doesnt give a totally conclusive answer, and now looks like a near miss on finding the current guiding conjecture. the current conjecture is that for larger runs, there are consistent decreases (as found recently, ie/ eg
gap3 3rd graph from last month), but for smaller runs, its random, and that prior experiment mixes them together in the analysis, and also doesnt seem to discriminate between multiple repeated maximums… taking into account the more recent results, it may be counting new “maximums” that are equal! this all shows (again/ as usual!) how crucial it is to get the analysis right/ carefully/ precisely formalized… ❓ ❗
(11/9) 💡 building towards some more insight. this shows some of the remarkable strength of these techniques. eg they can find global minima across all integers although sometimes at large memory expenses. the basic question being answered here is “find a (smallest) number with at least 1 n-run that also has at least 1 n-run after 1 collatz mapping”. this takes quite a bit of careful coding (despite the fairly simple description theres a lot of different sets involved/ “juggled”) and uses the semi sophisticated
fstshortestpath code to find it in the end, and some hack-like windows cmds to do basic stuff like create empty files or output lines without newlines, yikes, so ugly! really need to move to unix scripts huh!
honestly the bigger conjecture as alluded all along was more like “after something like t collatz mappings, all n-runs would turn into n-1 or smaller runs (for sufficiently large t, n).” but that looks like, while possible to test with an algorithm in theory, it will be practically difficult to test due to combinatorial explosion of states, ie it looked like on 1st cut the 2nd collatz mapping for the “at-least-1-n-run” language blew up to ~15K or more states immediately and required a few hundred megabytes of memory to process, may look into tuning that further but it might not help.
elsewhere, its never far from a graph around here for long. this graph is the n-run length on the x axis and the minimum symbol count on the y axis, which is basically double the bit size. the regularity of this graph seems quite remarkable and seems to point to deeper properties and ofc important, maybe crucial inductive structure. a little counterintuitively, the determinize and minimize operations do not automatically remove epsilon states, although that doesnt figure until near the end of the analysis because other earlier constructions dont lead to them. its natural to look at the actual found shortest numbers for some pattern, but whew all this took quite awhile.
(later) ofc that glaring oversight is easily corrected although it didnt take a totally brief amount of time. this code adapts
sym2bin to be more generic and alters the batch file slightly to output the found strings/ seeds instead of measuring lengths. theres a slightly strange quirk to the library, the output shortest path FSM starts at a highest numbered state and works/ transitions backwards thru states, otherwise FSMs handled elsewhere typically start at state 0. theres a very nice utility called
fsttopsort that perfectly handles this by renumbering states in increasing order starting at 0. the display utility looks at the final and corresponding initial seeds in binary grid format, lsbs left aligned, also with some embedded simple collatz mapping inverse logic. the patterns are quite remarkable. it seems almost like a collatz yin-yang and/ or checkerboard. the alternation from input to output of the transducer is a n-0-run on left to a n-1-run on right. feel like am close to something deep, still cant quite put my finger/ brain on it. its like logotherapy or life… what does it mean™? ❓
(11/12) lots of progress/ ideas yesterday, writing it up today. here is another way of looking at it. as noted there are some similarities between the theory of (prime) implicants and the 1-runs ideas. need to come up with some better notation/ identification to try to avoid confusion/ ambiguity. call a “(x,y)” run set a set with 0/1 runs of length between ‘x’ and ‘y’, lower and upper bounds, the sets created earliest above (
fsm4/ fstlen code). the set may or may not have runs at exactly the lower and upper bound lengths. a “(x,y,z)” set is a (x,y) set with at least 1 (0/1) run of at least length ‘z’. it might also be helpful to define a set where there is at least 1 run length of exactly length ‘z’. these are all slightly different but interrelated concepts. (eg if y=z then there must be an exact length run of that length.)
wrote some code to create the (x,y,z) set (as a batch file “subroutine”, basically (re)factoring it out of the
fsm5c code above. it does this by building the (x,y) set (actually here x=1) and taking the difference with the (x,y-1) set giving a (x,y,y) set. then one can do an intersection of the (x,y,y) set with a (x,y-1,y-1) set (or actually any (x,y-n,y-n) set and find it empty, ie it leads to zero states in the intersection. these are simple “proof of concept” exercises. these concepts are not exactly simple to keep straight with words. in other words for this intersection idea, the sets with at least 1 exact y-run are disjoint with the sets with at least 1 exact (y-n)-run as long as all run lengths in the 2nd are smaller than y. whew! had to think about that carefully/ at length. “math is built forward but understood backward…”
(later) this is some not very pretty code (am going to mostly blame windows/ bill gates for that instead of my own prowess) but “gets the job done,” or better, does some neat/ somewhat deep analysis. at this point some sophisticated theorem-like experiments/ observations can be constructed. via nearly an accident in some code, noticed that some intersections of sets were coming up empty and then built up this more systematic experiment. to understand it, realize now the 1-run concepts are somewhat close to an idea in boolean function theory called “slices”. a (1,y,y) set acts a lot like a “slice” so call it a “y-run slice” here (but used with the caveat the ideas are not exactly the same). then one can look at intersections of a single collatz mapping with “nearby/ adjacent slices”. this maps an n-slice and then looks at intersections with the same n-slice and n-1, n-2, n-3 slices. it finds the n-slice mapping and the n-3 slice intersection is always empty. the slices are disjoint but the union of all (intersection) slices of a set equals the full set.
this is a nearly direct confirmation of some of the ideas about the mapping altering the run/ slice structure to speak. however quite solidly this is more approaching an inductive proof structure than the earlier empirical observations. it alters it but not past a certain limit, here found to be 3. this leads to a sort of concept of “higher and lower slices”. the higher slices tend to get mapped into lower slices in a sort of “cascade” effect. the minimize code is altered slightly in v2 to only output the final state count. also after some confusion the dos
setlocal command turned out to be crucial to utilize to avoid variable collisions.
the graph is the minimized state counts in the slices with x axis the slice “length” count. as stated the n-3 slice is empty with 0 states. the minimization leads to some blowup in state counts as the nondeterministic FSM is converted to a deterministic one. have long searched for a great nondeterministic minimization algorithm and it looks like maybe nobody has ever constructed one in the literature ie is nearly an open TCS problem. the lines are so linear, wonder if they perfectly match some linear formulas. was surprised at that. somehow this suggests the pov is “perfectly” analyzing/ characterizing the problem somehow. ie maybe there is in some sense “analytic formulas” for constructing these FSMs. another way of looking at state counts in line with prior analysis themes is that its a type of entropy measurement of the sets.
💡 ❗ 😀 😮 breakthru! this is further vindication of some of the key ideas/ conjectures outlined. this looks at a 3-slice and subsequent collatz mappings. it finds all 3-slices “crunch” to 2-slices after 6 consecutive mappings. again a near-direct confirmation and again these results are a very big deal because they hold over arbitrary “size” integers. the minimization results are interesting too because sometimes the minimized sets are “smaller” (in state counts/ entropy) than the starting sets and other times larger.
alas, just analyzing the next 4-to-3 slice crunch looks like it is impossible due to state complexity explosion into tens or hundreds of thousands of states.* some of this is from the determinizing step but maybe cant be avoided even without it. this immediately leads to the idea of trying to figure out ways to manipulate FSMs without using the standard minimization methods. are there some kinds of patterns/ regularity in these FSMs that can be compressed but not in the standard way? maybe some ingenious reformulations are possible. another approach that has worked well in the past for me, not documented here, is to look at/ analyze size ‘n’ bit sets for some constant. eg one can easily construct the intersection of a t-slice with a n-bit set for arbitrary t, n. another powerful unpublished experiment along these lines looked at how many collatz mappings it takes for (all!) n-bit integers to decrease to a n-1 bit integer and could feasibly run for ‘n’ around say a few dozen (iirc)…
(later) 😳 😮 ❓ ❗ had some trouble/ strangeness with this code. made some changes in it and it didnt seem to affect results. also it was outputting incorrect filenames. calling wrong code/ batch file by accident? some kind of windows glitch? stuck/ hung process in memory? doesnt fully make sense. came back few hours later and had to make changes/ fixes/ replacement and it seems to be coherent now. cant fully explain what happened…
--1 intersect.fst # of states 212 # of states 160 --2 intersect.fst # of states 945 # of states 1115 --3 intersect.fst # of states 3473 # of states 6262 --4 intersect.fst # of states 8518 # of states 6424 --5 intersect.fst # of states 2984 # of states 827 --6 intersect.fst # of states 0 # of states 0
(later) * 😮 🙄 💡 ❗ 😀 😎 ⭐ breakthru II! an old expression: the impossible just takes longer™… spoke too soon! the 4-to-3 slice code works! it was as simple as doing epsilon removal before the minimization step. apparently there is a large blowup due to epsilon transitions. (“computer/ algorithm”) proven: the 4-to-3 run slice “reduction/ crunch” occurs in 7 mappings instead of the prior 6 verified for the 3-to-2 case. the 3rd step uses ~10.5K states.
--1 intersect.fst # of states 274 # of states 375 --2 intersect.fst # of states 2090 # of states 9359 --3 intersect.fst # of states 10459 # of states 7556 --4 intersect.fst # of states 6189 # of states 2744 --5 intersect.fst # of states 3098 # of states 1107 --6 intersect.fst # of states 345 # of states 128 --7 intersect.fst # of states 0 # of states 0
(11/14) ❗ 💡 😀 😎 ⭐ breakthru III! yet another cool/ fantastic/ aligned/ consistent finding! building on prior tools this was easy to write/ debug in only about 1hr. the
len code generates a FSM language for all possible binary strings of length ‘n’ (in symbol pairs for bits). this can be passed into the mapping code as
fpre set here width 50. the
fin intersection set is 10-length slices.
the code measures how many 50-bit 10-slice mappings lead to only 9-slices or lower, in this case 6— further confirmation/ vindication! the 4th mapping leads to a remarkable ~1.2M states. this shows some of the remarkable power of this “computerized thm proving system”. this findings/ property can now be expanded to look at other trends over different integer + slice sizes, for example it took fewer mappings for 20 bit width integers (for same 10-9 crunch)— these combined may tend to suggest there is no constant t for which t mappings always crunch n slices to n-1 slices. also graphing state counts thru the “bulge” of the reduction/ crunch would be interesting. the bulge involves a complexity/ entropy transformation/ transition point from low-to-high-to-low again.
notably, these results are far outside of anything that can be done by hand. while it couldnt be completely ruled out, there is no expectation at all that any by-brain+hand theory could prove such a fact. can relate, just read amusing article on online dating as howling into the void… so is anyone impressed yet? 😛
as mentioned also a basic exercise now is to measure how many mappings of n-bit integers lead to entirely n-1 bit integers. it would be interesting to compare those counts with the time to go from n slices to n-1 slices. the 2nd case is apparently substantially shorter/ therefore possibly (“far”?) more tractable to study. ❓
fsm8 m=50 n=10 --1 intersect.fst # of states 15950 # of states 7366 --2 intersect.fst # of states 73924 # of states 82085 --3 intersect.fst # of states 498925 # of states 748780 --4 intersect.fst # of states 1150225 # of states 345272 --5 intersect.fst # of states 47662 # of states 17086 --6 intersect.fst # of states 0 # of states 0
(11/15) 😳 some slight issues with that uniform-length language code on closer look. (1) it allows (accepts) a zero length string. (2) it allows msbs that dont end in a 1 bit, ie end in 0 bits. after fixing it, it appears it has no effect on the prior computation, leading me to think that the collatz transducer doesnt accept these inputs anyway. (verifying left as exercise for reader.)
⭐ ❗ 😀 😎 anyway need a slight variant of that language, ie “all n-bit integers up to n“. that is an easy variant in this new code and as a bonus the code also produces the “exactly n-bit integers” with a simple command line flag. combining these leads to the following code that has been alluded to for a long time but never posted before. came up with this construction in the early to late 2000s (working it at mainly two different times/ attacks in the decade, early and late… deja vu!).
the code is similar to prior code except “shaves off” the lower integers from the “running composition” using the difference/ subtraction method instead of intersection as previously. ie it starts with all exactly n-bit integers and then shaves off all n-1 bit integers and lower. there is a little trick to test for an empty set by constructing one and then testing for equivalence. there is an additional minimization step pre-difference to decrease the substantial complexity of the difference step. it finds all 20 bit integers seeds decrease to 19 bits in 125 steps using up to 110K states max at mappings 10-11. note this is not exactly the same mapping counts as real glides, it may take “somewhat more” steps for an n-bit iterate to reach n-1 bits than the glide length.
my feeling has long been that this code might potentially outperform exhaustive searches being used to rule out collatz ranges over the decades based on superior speed/ memory efficiency. there is a distributed one running these days, need to cite it sometime. this could possibly be adapted to a distributed version. its large-scale dynamics are fully unexplored and surely have many exotic properties. any takers? 😀 😎 💡 ⭐ ❗
(later) called with n=26 leads to about 3.7M states on mapping #13 and completes in 239 mappings total. memory usage on some of the operations seems to approach about ~1½ gigabyte and overflow after/ over a little more than that. there is clearly an exponential increase in states so some of that enthusiasm for time/ space complexity is clearly overblown. however its basically/ definitely a very novel way to parallelize the “search” that is unpublished in the literature. the remarkable combination/ fusion of CS and number theory is also quite novel.
(11/16) 💡 ❗ 😀 😎 ⭐ another breakthru! this code alters the mapping step “slightly” but gets very/ dramatically improved time/ space characteristics. theres also a little code to terminate in case of memory overflow in the generally most-memory-intensive composition step, although it cant be totally relied on because undetected memory overflows messing up the computation can happen in other places (noticed for n=40). the new logic is not easy to visualize but basically it avoids previously-visited numbers by doing an intersection with the prior set of “unresolved” numbers. this then excludes trajectories that increase and then decrease onto previously-seen numbers, ie sort of “delayed” trajectories. (in this way it functions a little like the “seen” cache used in the
seen variable often seen in earlier collatz code.) maybe came up with this construction many years ago (about a decade) & recalled its strong/ remarkable performance characteristics but lost it in the sands of time. the same verification for n=20 finishes in 13 steps only using max ~8k states. its almost like magic…
the n=30 case takes 19 steps and max ~293k states. this same “start” overflowed memory with the
n=36 22 steps/ max ~2.2M states.
n=38 24 steps/ max ~4.1M states.
n=40 (undetected) memory overflow. fst util output memory error(s) (maybe in minimization subroutine?) but code continued (full disclosure/ caveat with incorrect/ “corrupted” file(s)!), manually terminated.
looking at results the max states seem to happen early on about iterations #11/ #12. which suggests (trying) some new strategy that tries to even/ smooth out the state bulge/ explosion somehow. what is causing it? again its basically an entropy mesurement. eg note it is possible to separately process partitions of the set in a parallel way. what is a great (memory-preserving) strategy for building these partitions? eg one would want to find “natural partitions” that somehow tend not to overlap as the computation proceeds.
on the other hand/ “bigger picture” pov there are so many great/ cool “spin off” ideas/ directions but really need to try to focus on what kind of structure(s) would lead to a proof…