# collatz revisited

hi all… last months installment was all about pushing the pre vs postdetermined concept as far as it could go and found that it was stretched thin. in sharp contrast to the rosetta diagram idea, relatively ingenious code reveals the postdetermined region can have major divergences. this broke a major conjecture and has left me somewhat thoughtful at best and reeling at worst. candidly, am feeling sometimes like maybe all the powerful ideas that have been exercised have not made any dent whatsoever in the problem. so yeah, a darker mood, and am wondering sometimes if there is anything that can be proven about the problem at all.

didnt have a lot of immediate ideas. but as time goes on without much experimenting (a few weeks), am feeling maybe some ideas coalescing from the emptiness.

my new idea is that the pre vs postdetermined idea is again an anthropomorphic bias where humans see dichotomies everywhere aka “dualism or duality”. maybe its more of a spectrum instead. ie the initial part of the walk is completely programmable and the final part of the walk is nearly “unprogrammable” and closer to the (nearly linear downward) drain type phenomenon, and am now thinking statistically there is some shift from one extreme to the other, where more randomness/ “divergence” is possible at the beginning than end. some old experiments may have revealed something like this. its hard to remember this far back, it goes back years, but think it was shown that maybe the divergence in the initial part of the trajectory steadily decreases toward the end of the trajectory on average, judging wrt a linear trend. (gotta look all that stuff up again, its been ages.) also from last month it looks like divergence from the linear drain is bounded somehow by the current iterate bit size. eg an upward spike of roughly that size could be contrived but maybe thats the extreme case.

the basic idea in this code is similar to a prior experiment that preceded the slope correlation ideas, `bitwise18` from 1/2018. it measures/ maximizes max divergence from the linear trend divided by iterate bit size ‘dv’ which seems to be bounded in the 1st graph. my expectation was that this would lead to higher divergences in the early part of the trajectory (ie in line with the (dis)order spectrum idea) but just the opposite happened where the divergence is pushed toward the end by the algorithm. the 2nd graph shows, for the max divergence found, the associated trajectory red and corresponding divergence sequence green. the algorithm also tries to maximize the divergence over the linear trend (should have plotted that too), whereas prior code looked at absolute values. have to think more about how to normalize all this in a sensible way. part of the tricky part here is that drawing the linear trend starting at the 1st iterate means theres always zero divergence at the beginning. is there some other obvious idea? maybe a best linear (ie least squares regression line) fit? that would then come very close to the earlier slope correlation calculations.

in other news, have a ton of ideas/ links for new blogs but my Big Corp restricting the bookmark sharing is really a drag and cuts deeply into my modus operandi. have an idea about how to overcome it by manually exporting bookmarks but my ipad at home is not going to be sufficient to handle the merging operation, and am thinking of buying a new computer/ laptop partly just for this. after too many years to mention since last time (hint, it was to buy a ps4 at its earlier heyday…!), have a lot of credit card points to cash in for gift cards and ordered \$500 worth of best buy gift cards, and then waited 3 weeks for them not to show up in the mail. reordered them on monday and hopefully this time the mail will work. this world we live in seems like a mixed bag, from the heights of technology to the depths of (trying to think of a word for this…) random haphazardness.

`bitwise31.rb`
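a minimal sketch of the ‘dv’ statistic described above, not the actual `bitwise31.rb` code. it assumes the semicompressed map (odd n goes to (3n+1)/2, even n goes to n/2), a linear trend drawn through the 1st and last log2 iterates, and normalization by the starting bit width. the helper names `step`, `trajectory` and `divergence` are made up for illustration.

```ruby
# semicompressed collatz step (assumed mapping): odd n -> (3n+1)/2, even n -> n/2
def step(n)
  n.odd? ? (3 * n + 1) / 2 : n / 2
end

# full trajectory from n down to 1, inclusive
def trajectory(n)
  seq = [n]
  while n > 1
    n = step(n)
    seq << n
  end
  seq
end

# signed divergence of log2(iterate) from the straight line joining the
# 1st and last points of the trajectory (zero at both ends by construction)
def divergence(seq)
  logs = seq.map { |x| Math.log2(x) }
  last = logs.size - 1
  return [0.0] if last.zero?
  logs.each_with_index.map do |y, i|
    y - (logs[0] + (logs[-1] - logs[0]) * i / last)
  end
end

# 'dv' as read from the text: max upward divergence over the trend,
# divided by the bit width of the starting iterate (an assumption)
def dv(n)
  divergence(trajectory(n)).max / n.bit_length
end

puts dv(27)
```

since the trend line is anchored at the 1st iterate the divergence there is always zero, which is exactly the normalization wrinkle mentioned above.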

(5/16) this code looks at divergence around the linear trend for the drain type trajectories. the general idea here is reminiscent of a key/ pivotal experiment from the beginning of this year 1/2019 which was one of the earliest schemes touching on and even revealing the pre vs postdetermined distinction, `vary5b`. as discovered awhile back but not exactly fully understood at the earliest, a randomly chosen ½-density bit string is a drain seed with high probability. so this chooses 100 bit ½-density seeds and calculates the trajectories and the (absolute) divergence sequences for each one (absolute value of the trajectory logarithm minus the linear trend fit to the start and endpoints). then these are normalized to fixed 100 width samples/ vectors and sequence-wise averaged over 500 samples. not surprisingly the result is a near-half-circle but theres a strange/ anomalous bump at the end and dont have an explanation for it right now. this tends to show the drain is typically/ nearly a random walk around a mean-returning slope. but also the endpoint-fitting linear trend idea is probably not typically going to show a larger-to-smaller divergence, instead its typically small-large-small. my thinking is this emergent semi circular shape is characteristic of any unbiased random walk, and it would be very interesting to figure out how to prove that, such a project seems far less hopeless than the collatz conjecture and maybe related to known statistics.

`spread5.rb`
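a rough sketch of the averaging scheme just described, not the actual `spread5.rb`. assumptions: “½-density” is approximated by fair coin flips under a leading 1 bit, the map is the semicompressed one, and the resampling to fixed width is nearest-lower-index picking. `random_seed`, `abs_divergence`, `resample` and `average_profile` are hypothetical names.

```ruby
# semicompressed collatz step (assumed mapping)
def step(n)
  n.odd? ? (3 * n + 1) / 2 : n / 2
end

def trajectory(n)
  seq = [n]
  while n > 1
    n = step(n)
    seq << n
  end
  seq
end

# random seed of the given bit width: leading 1, remaining bits fair coin flips
# (an assumed stand-in for the 1/2-density seed generator)
def random_seed(width, rng)
  bits = Array.new(width - 1) { rng.rand(2) }
  ([1] + bits).join.to_i(2)
end

# absolute divergence of log2(iterate) from the endpoint-fit linear trend
def abs_divergence(seq)
  logs = seq.map { |x| Math.log2(x) }
  last = logs.size - 1
  return [0.0] if last.zero?
  logs.each_with_index.map do |y, i|
    (y - (logs[0] + (logs[-1] - logs[0]) * i / last)).abs
  end
end

# squeeze/ stretch a sequence onto a fixed number of sample points (width > 1)
def resample(seq, width)
  Array.new(width) { |i| seq[i * (seq.size - 1) / (width - 1)] }
end

# pointwise average of the resampled divergence over many random seeds
def average_profile(samples, bits, width, rng)
  sum = Array.new(width, 0.0)
  samples.times do
    d = abs_divergence(trajectory(random_seed(bits, rng)))
    resample(d, width).each_with_index { |v, i| sum[i] += v }
  end
  sum.map { |v| v / samples }
end

profile = average_profile(500, 100, 100, Random.new(1))
puts profile[50]
```

on the semicircle question, one known statistic that may apply: for a brownian bridge on [0,1] the variance at time t is t(1−t), so the mean absolute value is proportional to sqrt(t(1−t)), which is a semicircle up to vertical scale. a walk measured against an endpoint-fit trend is pinned to zero at both ends in the same way, although how closely the drain matches that idealization is not tested here.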

this code is similar, a fairly quick riff, but reveals something unexpected. again was looking for a high-to-low divergence trend and something else turned up. this does a linear regression fit on a drain sequence instead of the endpoint fitting method and then again calculates the (absolute) deviations. it finds a sort of triple-bump pattern. increasing the size of the seeds (color coding) apparently only changes the scale of the pattern. in graph 2 the deviation averages are normalized/ scaled by max size and they then tend to overlap. it looks like a major scale invariant pattern in the data. this time 200 averages. an immediate question, could this same pattern be reflected in some other (familiar?) statistics? ❓

`spread6.rb`
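the regression variant can be sketched like this, again not the actual `spread6.rb`. it fits an ordinary least squares line to the log2 trajectory with the iterate index as x, then takes absolute deviations. the semicompressed map is an assumption and `regression_residuals`/ `log_trajectory` are made-up names.

```ruby
# absolute deviations of ys from its least squares regression line (x = index)
def regression_residuals(ys)
  n = ys.size
  xm = (n - 1) / 2.0                 # mean of the indices 0..n-1
  ym = ys.sum.to_f / n               # mean of the values
  sxx = (0...n).sum { |i| (i - xm)**2 }
  sxy = ys.each_with_index.sum { |y, i| (i - xm) * (y - ym) }
  slope = sxx.zero? ? 0.0 : sxy / sxx
  ys.each_with_index.map { |y, i| (y - (ym + slope * (i - xm))).abs }
end

# log2 of each iterate under the semicompressed map (assumed mapping)
def log_trajectory(n)
  logs = [Math.log2(n)]
  while n > 1
    n = n.odd? ? (3 * n + 1) / 2 : n / 2
    logs << Math.log2(n)
  end
  logs
end

dev = regression_residuals(log_trajectory(27))
puts dev.max
```

unlike the endpoint fit, the regression line is not forced through the 1st and last points, so the deviations at the two ends are generally nonzero.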

❗ this is surprising to me! looking for correlated/ corresponding trends in entropy and density yielded no signal, then thinking it over, was starting to wonder… it made sense to run that same analysis on a real random walk, as a sort of control study. this simulates a collatz-like random walk by starting with the “same” random seeds (scheme) and then multiplying by 3/2 or 1/2 with equal probability, and then taking the logarithmic values again. as is known in study of the collatz problem, this random walk in some sense “closely approximates” collatz random walks for the semicompressed sequence, and visually analyzing them (in comparison to drain seed trajectories that is) seems to lead to no discernible difference. the analysis code ie absolute divergences from the regression fit finds exactly the same pattern! is there some hint of a gaussian-like trend here? it seems to be something like that. maybe this curve is derived somewhere in the literature?

actually the high-to-low divergence pattern has already been found/ uncovered in the earlier rosetta diagram. these recent experiments are basically drain-focused with different statistics, showing the drain is indeed in a key sense undifferentiable from a (unbiased) random walk.

`spread8.rb`
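the control walk is simple enough to sketch directly in log space, where multiplying by 3/2 or 1/2 becomes adding log2(3/2) or subtracting 1. this is an illustration and not the actual `spread8.rb`; stopping once the value reaches 1 (log2 at or below 0) is an assumption, and `UP`, `DOWN` and `random_walk_logs` are hypothetical names.

```ruby
UP = Math.log2(1.5)   # log2 step for multiplying by 3/2
DOWN = -1.0           # log2 step for multiplying by 1/2

# collatz-like random walk in log2 space starting at start_log;
# each step is UP or DOWN with equal probability, stopping at or below 0
def random_walk_logs(start_log, rng)
  y = start_log.to_f
  logs = [y]
  while y > 0
    y += rng.rand(2).zero? ? UP : DOWN
    logs << y
  end
  logs
end

walk = random_walk_logs(100, Random.new(42))
puts walk.size
```

the mean step is (log2(3/2) − 1)/2, roughly −0.21 bits, so the walk drifts downward and terminates. the resulting sequence can be fed to the same regression-deviation analysis as a real log2 trajectory.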

(5/18) this code was an idea that occurred to me but on looking at the results have already moved past it so to speak.  the idea is an old one that can be seen in a new light, ie breaking down a trajectory into subtrajectories that have identical parity sequences as “smaller” seeds. earlier experiments found there is a lot of redundancy in this way and the Terras construction and semicompressed mapping dynamics shed new light on why this is the case. basically for every large sequence of length m, for any subsequence n < m, there is a seed of n bit size that has that sequence. this code tries to break or “decompose” the trajectory into 3 parts based on this idea aka “trisecting,” starting with a middle part centered on the trajectory max and the same bit width as the starting seed. the code optimization is on increasing the max length of any of the 3 parts ‘l3’. but immediately the code reveals the optimization focus is all on the 3rd part and that the 3rd part tends to stretch out and cant be “covered” with a single smaller seed. this “3rd part stretch” was predictable and has been hinted in other experiments, eg am thinking of `bitwise26` from 3/2019 which optimized ratio of postdetermined vs predetermined “areas” and found a “stretch” in the postdetermined vs predetermined.

`bitwise32.rb`
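the redundancy fact behind the trisecting idea can be checked with a small sketch (not the `bitwise32.rb` code). under the semicompressed map, the 1st k parities of a trajectory depend only on the seed mod 2^k, and adding 2^k to a seed keeps the 1st k parities while flipping the next one, so a seed below 2^n can be built bit by bit for any parity sequence of length n. `parity_vector` and `seed_for_parity` are made-up names.

```ruby
# semicompressed collatz step (assumed mapping)
def step(n)
  n.odd? ? (3 * n + 1) / 2 : n / 2
end

# parities (0 = even, 1 = odd) of the 1st len iterates starting at n
def parity_vector(n, len)
  Array.new(len) do
    bit = n & 1
    n = step(n)
    bit
  end
end

# smallest nonnegative seed (below 2**target.size) whose trajectory starts
# with the given parity sequence, built one bit at a time
def seed_for_parity(target)
  n = 0
  target.each_index do |k|
    # adding 2**k keeps parities 0..k-1 and flips parity k
    n += 1 << k unless parity_vector(n, k + 1)[k] == target[k]
  end
  n
end

big = 2**100 + 12_345
middle = parity_vector(big, 60)[20, 16]   # a 16-step slice from inside a long run
small = seed_for_parity(middle)
puts small.bit_length                      # at most 16
```

very small seeds such as 0 or 1 can come out of the construction. their parity sequence is still well defined under the map even though they sit on the trivial cycle.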

(5/20) 😐 💡 not having a lot of new ideas. maybe not thinking about the problem as intensely with other misc situations/ distractions/ going on. heres one angle that did occur to me (again playing the unenviable role of “trying to salvage something useful from the wreckage, aka make lemonade out of lemons”). it seems not so easy to lay out in sequential ideas, its still a little bit of a jumble. but anyway, thinking it over, the ufos found via backtracking all seemed to be in drain regions, and glides were not found in front of those drains, although maybe the code was biased against creating glides and/ or the “easiest” pre-sequences were not glides.

an idea guiding earlier conjectures (now discarded) was that maybe the entire drain didnt have ufos. but this is an idea of a proof based on full trajectories, and yet the idea of only focusing on glides should always be considered a fundamental proof direction/ angle/ approach.

instead of ufos wrt pre vs postdetermined regions, am now thinking about the distinction of ufos in the glide vs postglide region. maybe every large ufo found was in a postglide region (drain) and/ or no significant glide can be constructed in front of them, ie containing them in the glide descent? and it appeared to be hard to construct any large glides at all via backtracking (in front of the ufo, or even generally). maybe this hardness was a “feature” of the problem and not a “bug” of the code…

hence maybe, putting all this together, its also (apparently) hard to construct large ufos in the post-glide region (or maybe even post-climb/ ie drain) with a significant glide/ climb preceding it. written out this all is not so straightfwd but maybe its a rough summary of the ufo findings from a different/ new pov that also points to some possible crack/ opening… the overall idea guiding it is that maybe large ufos exist in the drain, but not in the drain “associated with” (contained in) the glide (aka descent region of the glide), and therefore maybe some approach with viability shows the drain part of the glide always exists without “ufo complications” inside it; in other words, proving all glides are finite based on combined unavoidability and special structure of the drain part of the glide, even with known ufo complications existing postglide. another direction to look at along the lines not ufo-focused would be to simply try to construct large glides via backtracking, can it be done? ❓

all this may sound/ look/ be somewhat contrived and unlikely, nevertheless it doesnt seem to be directly ruled out by existing results and even more is maybe aligned with them. on the face of it, trying to refute it will be a nontrivial/ worthwhile exercise. some of these ideas maybe dont fit too well into the existing optimization frameworks, both bitwise and hybrid, eg because they are all based on building “post-seed” trajectories instead of “pre-seed” trajectories… on other hand trying to think through all possible angles/ possibilities, maybe trajectories exist that refute the ideas but are rare and/ or hard to find via backtracking approaches…

(5/21) 1st cut on the general idea. this code tries to build a glide in front of a ufo by maximizing glide lengths. the 1-ufo is 50%-75% overlay on a 100 width ½ density iterate. it succeeds to some degree in creating a few nontiny ~30 iterate count ‘c’ magenta glides but none containing the ufo. the output is time found vs new maximums. the ratio ‘mrw’ red is the iteration count of the ufo ‘m’ lightblue divided by the starting bit width and stays under ~½ ie all the ufos occur in the 1st ½ of the predetermined region but all post-glide. this is seen in the ‘rcw’ green statistic which is iteration count of the end of the glide ‘c’ divided by starting bit width always below ‘mrw’. the algorithm explores/ sticks to nearly 100 bit wide ‘w’ blue search space and moves past it upward about halfway through the run. this all seems to be confirmatory on the overall hypothesis however there is a commented line generating random ½ density integer start seeds and tends to get similar results, dont visually see a difference. the code also intermittently starts in some kind of linear dynamic “trap”, not sure of the cause of this right now.

`backtrack5.rb`

❗ 😮 backtracking ideas have been a little slippery/ hard to nail down. this is another fairly obvious way to study the same question wrt existing code using bitwise method and its a striking confirmation. it now looks like this is a phenomenon that deserves to be named. it just looks for large 0/1 runs in the drain part of the glide, or just the post-glide-max range excluding the trailing nonglide, post-glide, or “subglide,” or “postglide drain.” from deep search of 10k iterations it only finds max runs ‘mx01’ of about ~15 magenta line, it looks “very bounded.” the optimization is on bit width ‘nw’ blue, glide max index ‘cm’ green, glide length ‘cg’ red, max 0/1 run in post-glide-max region ‘mx01’ magenta. ‘nw’ typically close to ‘cm’ means that the post-max-glide point also roughly corresponds to the start of postdetermined region.

a caveat, as always for null-like results, maybe this somewhat unusual optimization objective eludes the bitwise method, have seen other cases of that maybe esp wrt ufos. nevertheless, overall, taking results at face value, another way of looking at this is that it seems the “glide drain” region may have different dynamics than the post-glide (drain) such that the former avoids ufos but latter may contain them… this seems a surprising property because both are part of the drain so to speak, but not inconceivable… possibly wrt this the drain idea may have some anthropomorphic pov going on… a lot of prior experiments could be reanalyzed/ reevaluated based on this new distinction which did not seem to occur/ be encountered previously… ❓

`bitwise34.rb`
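the measurement itself (not the bitwise optimizer around it, and not the actual `bitwise34.rb`) might look like the following sketch. assumptions: the glide is the run of iterates up to the 1st one that falls below the seed, a ufo is read as a long run of identical bits in an iterates binary expansion, and the scanned range runs from the glide max to the end of the glide. `glide`, `max_run` and `glide_stats` are hypothetical helpers; the keys reuse the ‘cg’, ‘cm’, ‘mx01’ labels from the text.

```ruby
# semicompressed collatz step (assumed mapping)
def step(n)
  n.odd? ? (3 * n + 1) / 2 : n / 2
end

# iterates from the seed up to and including the 1st one below the seed
def glide(n)
  seq = [n]
  seq << step(seq.last) while seq.last >= n && seq.last > 1
  seq
end

# longest run of identical bits (0s or 1s) in the binary expansion of n
def max_run(n)
  n.to_s(2).scan(/0+|1+/).map(&:size).max
end

# cg: glide length, cm: index of the glide max,
# mx01: longest 0/1 run over the iterates from the glide max onward
def glide_stats(n)
  g = glide(n)
  cm = g.index(g.max)
  mx01 = g[cm..-1].map { |x| max_run(x) }.max
  { cg: g.size - 1, cm: cm, mx01: mx01 }
end

p glide_stats(27)
```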

a lot of prior work touches on these directions/ questions, but the hybrid experiments at the end of 1/2019 stick out in my mind (as already cited once before). they were the 1st to discover ufos in the postdetermined region even after a prior bitwise optimization didnt find any. looking again at those results, it looks a little uncanny, there seem to be several cases where large ufos show up but only near the end of glides, in drains, and/ or at the start of postdetermined region for at least some iterates. its as if there might be some invisible yet-to-be-discovered force fields. however, the sample size is very low. so maybe merely a case of a human seeing signal in the noise. maybe a cute term for this might be jesus sighting (click to see for yourself lol!)

💡 at this point am thinking the next step is a sophisticated hybrid optimizer to search this question. havent hooked up a multi parameter hybrid optimizer yet, except if the bit size is considered one of the parameters, but have thought about it in a few cases. as already outlined/ pushed on, want to optimize on “size of glide (and/ or drain) + size of any ufo contained in it.” and then the location of the ufo in the glide drain is something else to consider: beginning, middle, end, no difference? ❓

(5/22) it occurred to me that maybe the backtracking idea can lead to some new basic statistics and that furthermore maybe they could be related to trajectory dynamics somehow. this idea was mostly shot down with the following analysis but there are still some things to report. this does 20 level/ depth backtracking on 200 random 200 bit width ½ density iterates. there are a lot of metrics but some familiar.

the output is sorted by total # of preiterates found (divided by 10), ‘c’ red. then there is ‘c1’ green the 20th level preiterate count, and ‘c2’ the max preiterate count by level, and c1 = c2 is found empirically. c3 = c1 / c and its found to be a 2-case constant. this seems to indicate that the search trees expand in a nearly uniformly geometric way. the step effect arises because if an initial iterate is exactly divisible by 3 then all its preiterates are just it multiplied by powers of two. ‘m’ is the initial iterate “triplarity” (in contrast to “parity”) ie its mod 3 value. ‘r’ light blue is the ratio of larger preiterates out of all preiterates found. (‘r,c’ right side scale.)

one finding is that ‘r’ declines slightly as more preiterates are found from left to right. another finding is that there is some kind of bunching in the preiterate count by triplarity as in ‘m’ black line. null findings are that basic trajectory statistics like ‘ls’ trajectory length (divided by 10) gray, ‘cg’ glide length orange, ‘cm’/ ‘cm1’ local and global trajectory index max are apparently uncorrelated with the preiterate count. there was some attempt to find any binary pattern in the initial iterates sorted by ‘c’ that came up emptyhanded. the analysis also reinforces/ shows more numerically/ analytically that finding smaller preiterates is uncommon and only seems to happen for the larger search trees. it seems like size of preiterate search tree has some other deeper implications on the overall dynamics/ structure of the problem but cant see much further than these basic aspects right now. larger trees somehow hint at the iterates being “more common” almost like “hubs” in/ over all trajectories.

`backtrack6b.rb`
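a small sketch of the backtracking tree and a few of the counts, assuming the semicompressed map (the actual `backtrack6b.rb` may use a different map or bookkeeping). under that assumption every m has the even preiterate 2m, and an odd preiterate (2m−1)/3 exists exactly when m mod 3 = 2. a multiple of 3 therefore only ever has the chain m·2^k behind it, which matches the step effect above. `preiterates`, `backtrack` and `backtrack_stats` are made-up names; the keys follow the labels in the text.

```ruby
# semicompressed collatz step (assumed mapping)
def step(n)
  n.odd? ? (3 * n + 1) / 2 : n / 2
end

# all n with step(n) == m: always 2m, plus (2m-1)/3 when m mod 3 == 2
def preiterates(m)
  pre = [2 * m]
  pre << (2 * m - 1) / 3 if m % 3 == 2
  pre
end

# levels[k] holds every preiterate found exactly k steps behind the seed
def backtrack(seed, depth)
  levels = [[seed]]
  depth.times { levels << levels.last.flat_map { |m| preiterates(m) } }
  levels
end

# c: total preiterates, c1: count at the deepest level, c2: max count by level,
# r: fraction of preiterates larger than the seed, m: seed mod 3
def backtrack_stats(seed, depth)
  levels = backtrack(seed, depth)
  pre = levels.drop(1).flatten
  { c: pre.size, c1: levels.last.size, c2: levels.drop(1).map(&:size).max,
    r: pre.count { |x| x > seed }.to_f / pre.size, m: seed % 3 }
end

p backtrack_stats(2**40 + 1, 20)
```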

(5/23) ❗ 😮 💡 this is surprising and something of a breakthru because it shows some connection between preiterate and postiterate structure/ dynamics and it wasnt easy to find. this was dashed off quickly but seems substantial. ‘cm’ (local) glide max index red, ‘cm1’ global glide max index green, ‘cg’ glide length blue, ‘ls’ trajectory length magenta are found to be “wedge correlated” with ‘rrmn’ in the code, the ratio of the smallest preiterate to starting iterate after backtrack searching 20 levels, 1000 100-bit width ½ density samples analyzed. there are only a few bins larger than 1 for ‘rrmn,’ apparently consistently 3 only, not sure how to explain that (maybe their fractional values are slightly different?), it relates to growth of the preiterate tree vs search levels and preiterates bunching near each other. anyway overall these are signals in the noise, but their deeper meaning is hazy to me right now. have to think more about what this means/ how to exploit it.

`backtrack9b.rb`

(5/27) heres some further investigation along these lines, variations on a theme. am not totally satisfied with these versions but am writing it up as a checkpoint.

• the 1st looks at the average in the log backtracking trajectories divided by log of starting iterate. the 2 shapes are the 2 different mod 3 values for starting iterates, the higher one is n mod 3 = 0 (ie divisible by 3) and the 2nd lower one is n mod 3 = 1. color coding is by ‘cm’ max trajectory index. am mostly focused on the lower band. this 1st graph showed fairly strong signal for low ‘cm’ dark colors and higher ‘cm’ tends to get mixed up.
• the 2nd version divides log min of the backtracking bins by log starting iterate (bins are all the backtracking points arranged by backtrack level/ depth). the color distribution is similar, strong signal for low ‘cm’ and more mixing for higher ‘cm’.
• the last version changes the distribution because its found that most samples have low ‘cm’ and this code has a simple scheme to even out the distribution, by oversampling and throwing out repeated values of ‘cm’ for samples. it seems a lot of the color signal by ‘cm’ is lost. was actually aiming for stronger signal and the outcome was exactly the opposite. my only explanation is that maybe there are far fewer low ‘cm’ which have the strongest color signal, but am not entirely certain about that interpretation.

on further thought maybe some way to interpret this. it seems like maybe low ‘cm’ value iterates have more backtracking signal, whereas higher ‘cm’ ones dont, and maybe (“naively/ biased”) sampling more of the signal-rich iterates increases the apparent signal.

update: suspect the significant `backtrack11b` color signal might be related to overplotting.

`backtrack11.rb`

`backtrack11b.rb`

`backtrack12.rb`

(5/30) this data/ effect is rather hard to work with and it leads to more processing layers but many prior tools/ techniques are applicable. this idea is to sample the backtracking bins with only 10 points and color code the points by ‘cm’ max trajectory index. the code is refactored/ streamlined in various ways. it handles both the distributions, the naive and the replacement distributions. the replacement distribution is adjusted to contain exactly the same number of pre-trajectories in this case 40. the mod 3 = 0 points are excluded from both distributions and are thrown out earlier instead of later, this also had the nice unexpected effect of improving the number of points from the replacement distribution and it didnt have to sample as many points/ run as long, decreased that by 5x to 1K iterations. the plots are very similar but one can see more cooler points in the 1st distribution. also its been occurring to me that bin spread/ deviation is maybe somehow playing a major role and want to get into that.

`backtrack15.rb`

update: on examination it looks like for cooler trajectories (lower ‘cm,’ actually cm=0) the trajectories are starting with even seeds and hotter trajectories start with an odd seed. so there is some distinguishable feature here seen visually but its not as expected.