collatz alive and still kicking

the theme of the moment is “still alive, still kicking”. sometimes it feels like being a lump of coal under extreme heat/ pressure… aka “soul crushing“… even “little guys” like me have problems not entirely unlike those in the remote, lofty stratosphere… on a vastly different scale… can now relate to ariana’s sentiments on this, “no tears left to cry”… (supposedly) “that which does not destroy me makes me stronger“. nietzsche’s saying is either supposed to be inspiring or carries an ulterior bleak/ barren allusion (ofc one does not voluntarily wish to experience near-destroying events). life is a double edged sword; his own life would be a case study in that… and ofc a cyberspace blog is the perfect place to share + elicit appropriate, soothing audience sympathy + compassion for these worldly all-too-human predicaments/ tribulations! 😳 o_O 🙄 😥

one could spend a lot of time, even endless time, just trying out different optimizations. some patterns tend to emerge. there is a concept noted earlier of “pushing on different dimensions of the problem”. some dimensions expand without correlation or connection to the other dimensions, some are correlated, and some are anticorrelated, where pushing on one tends to suppress another. it is not easy to discover or chart/ map all these. just picking different parameters to optimize can lead to dramatically different outcomes.

immediate example of that. this optimization has the same parameters calculated as last time but only optimizes on ‘nw’, ‘cgnw’. noticed that ‘cgnw’ seems to be suppressed somehow in some optimizations, not sure exactly why yet, but looks like maybe didnt save those runs/ code. think recall doing an experiment that showed it trended upward in one optimization set, but here it is strongly suppressed to the point of being bounded by about ~100, black line. the run ran out of memory a little after 3K iterations. here ‘mx01s’ behaves in the more expected way, narrowing as iterates get much larger. not easily discernible in the graph, but for ‘cgnw’ positive, ‘pmx01’ blue closely tracks it. ‘mx01’ green mostly tracks the same and seems to be bounded around ~15. ‘cmnw’ gray is mostly negative ie peak glide in the predetermined range. so maybe this has something to do with some kind of major difference between peak glide in the predetermined vs postdetermined range.

at this point it seems like just proving ‘cgnw’ can expand would be an important point/ step. there is seemingly some unexplained mystery or phenomenon at this moment. wondering maybe it has to do with different regions of transition phenomena.

need to look back on ‘cgnw’. thought that there were experiments proving it expanding but dont see them. it was introduced last month almost on a whim with the new stepwise optimization stepwise2. its not clear in that graph if it expanded, due to overplotting. think recall looking at it closer/ isolating it at the time and finding a subtle upward trend (maybe in lower bit size ranges), but this would be in strong contrast to that.



(2/3) it turns out not to be hard to find increasing ‘cgnw’ using a recent trick and a quick riff off construct95 from 12/2019: the “triangle seeds” ie starting seeds of the form 2^x – 1. ‘cgnw’ lightblue; ‘cmnw’ blue plotted on the right side scale.
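a quick python sketch of the triangle-seed idea (illustrative, not the actual construct95 code): a seed 2^x – 1 is x consecutive 1-bits, so its first x Terras steps are all odd ie “climbing”, which makes glides easy to stretch.

```python
# "triangle seeds" of the form 2**x - 1 under the Terras map
# T(n) = n/2 (n even) or (3n+1)/2 (n odd). a seed 2**x - 1 is x
# consecutive 1-bits, so T^k(2**x - 1) = 3**k * 2**(x-k) - 1 for
# k <= x, ie the first x steps all climb.

def T(n):
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def glide(n, limit=10**6):
    """number of Terras steps until the iterate drops below the seed."""
    m, steps = n, 0
    while m >= n and steps < limit:
        m = T(m)
        steps += 1
    return steps

for x in range(2, 12):
    seed = 2**x - 1
    print(x, seed, glide(seed))
```

the glide length grows with x since the forced climbing prefix grows with x; the tail after the prefix behaves like an ordinary trajectory.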



so there is (necessarily) a lot of play-by-play nuance to comparing results via different optimization schemes around here (saying this 1day after the superbowl, did you catch it? hey, even nonsportsfans must admit $5M/30s must have some relevance to or indication of cultural significance/ zeitgeist! picture this something like the big neon-lined desk with all the square-jawed, buzz-cut, steely-eyed alpha male commentators incl living legends like terry bradshaw! oh yeah, speaking of that, whatdya think of jlo + shakira, living legend goddesses of my dreams, lol!).

the finding here is that the stepwise optimization code has trouble pushing up ‘cgnw’ even though the strongly increasing trend apparently exists “behind the scenes”, as apparently proven in the prior experiment. the idea here was to optimize on cgnw, cmnw, cgcm, not including ‘nw’, thinking that pushing on ‘nw’ was maybe decreasing the (better) optimization of the other parameters. finally, the graph displays the optimization parameters applied, an informative idea that was used a long time ago but was missed over many experiments with multiparameter optimizations.

the 1st graph is the run order, the 2nd graph is post-run resorted by ‘nw’ green. sometimes the postprocessing leads to some better understanding of the trend, although here the bit increase order is very orderly/ spike free over the optimization and the reordering doesnt change the interpretation much. the trend is very gradual and not easy to interpret with all the spikes. another way of thinking about the whole result is that the algorithm explores “mostly lower-sized” ‘nw’ and maybe therefore a definite trend cannot be seen. note the samples are just selected at evenly spaced positions; this code/ analysis would also benefit from the “extrema sampling” idea used in several prior experiments.

💡 an idea occurred to me thinking about the multioptimization dynamics. the current multioptimization is (probably? apparently?) working somewhat with gaussian-like clouds/ distributions of the variables. it might be interesting in contrast to look at the parameter optimization space using PCA, principal component analysis; maybe some signal exists wrt it, ie the distribution is not gaussian-like in the (combinations of) different dimensions. it seems likely that over all kinds of different optimization problems, some tendencies/ biases might show up. how could they be identified/ exploited?
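a minimal sketch of the PCA idea, on a synthetic parameter cloud (the data here is a placeholder, not the actual optimizer output): a strongly uneven eigenvalue spectrum signals structure beyond an isotropic gaussian cloud, ie correlated directions that could be exploited.

```python
# PCA over a cloud of optimization-parameter samples, to check whether
# the joint distribution has preferred directions. synthetic data:
# a 4-d gaussian cloud with one deliberately injected correlation.
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=(500, 4))
base[:, 1] += 2.0 * base[:, 0]          # inject a correlated pair

X = base - base.mean(axis=0)            # center the cloud
cov = X.T @ X / (len(X) - 1)            # sample covariance matrix
evals, evecs = np.linalg.eigh(cov)      # principal axes
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

explained = evals / evals.sum()
print("variance explained per component:", np.round(explained, 3))
# an uneven spectrum = the cloud is not gaussian-isotropic; the top
# eigenvector gives the direction the optimizer could push along
```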




(2/5) it occurred to me to do a study of 0/1 runs inside postdetermined glide for “large iterates”. ofc the optimizers create “relatively large” iterates but other strategies are needed for “even bigger ones”. the 1-triangle idea works well and also the new “0.64 (terras) glide” strategy can handle/ generate large iterates also. this code tries both approaches graphed in that order. the 0/1 run sizes tend to flatten esp with the 1-triangles but there is an unmistakable very gradual trend upward more visible in the terras glides, am guessing its something like the cube root of the initial iterate size (a formula like that appeared awhile back, and similar 0/1 run trends elsewhere, have to dig both up again), need to do some curve fitting on that for a better idea. run metrics are graphed on left side, other metrics on right side.
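the basic 0/1-run measurement referenced above can be sketched in a few lines of python (an illustration of the metric, not the experiment code): extract the maximal runs of 0s and 1s in the binary expansion of each iterate along a trajectory.

```python
# max 0-run and 1-run lengths in the binary expansion of an iterate,
# tracked along a Terras trajectory starting from a 1-triangle seed.
from itertools import groupby

def T(n):
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def run_lengths(n):
    """lengths of the maximal 0-runs and 1-runs in bin(n)."""
    runs = {'0': [], '1': []}
    for b, g in groupby(bin(n)[2:]):
        runs[b].append(len(list(g)))
    return runs

n = 2**50 - 1                      # a 1-triangle seed as an example
for _ in range(5):
    r = run_lengths(n)
    print(max(r['1'], default=0), max(r['0'], default=0))
    n = T(n)
```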

💡 ❓ ❗ looking at/ thinking about this, had a rather obvious idea that hasnt been tried before now and would have made sense to be investigated long ago (think alluded to this a long time ago). WDITTIE! (why didnt I think of that idea earlier?) one can look at 1-runs in ½ density iterates as a control and see if they have the same statistics! either way the finding would be significant. my expectation is that they will be (very?) close.
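the proposed control can be sketched directly (a hedged illustration; the statistic and sample sizes are placeholder choices): generate random n-bit iterates with exactly ½ density of 1-bits and measure their max 0/1 run lengths, for comparison against runs seen inside the postdetermined glide.

```python
# control experiment sketch: random n-bit iterates at exactly 1/2
# density of 1-bits, measuring max 0-run and 1-run lengths. for
# random bits the expected max run scales like log2(nbits).
import random
from itertools import groupby

def half_density_iterate(nbits, rng):
    bits = [1] * (nbits // 2) + [0] * (nbits - nbits // 2)
    rng.shuffle(bits)
    bits[0] = 1             # keep the leading bit set (density off by at most 1)
    return int(''.join(map(str, bits)), 2)

def max_runs(n):
    best = {'0': 0, '1': 0}
    for b, g in groupby(bin(n)[2:]):
        best[b] = max(best[b], len(list(g)))
    return best['0'], best['1']

rng = random.Random(1)
samples = [max_runs(half_density_iterate(200, rng)) for _ in range(1000)]
avg0 = sum(a for a, b in samples) / len(samples)
avg1 = sum(b for a, b in samples) / len(samples)
print(avg0, avg1)           # both averages land near the log2(nbits) scale
```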




(2/6) was interested in looking at “length of postdetermined glide vs same of sibling” ie the difference, here called ‘cg21’. all the indications so far are that each is a nearly random walk and that they are close, and was wondering how close, eg even to the point of “getting lucky”. for those following carefully (insert cricket chirping foley here), theres a very natural corresponding inductive structure, ie “a glide terminates if its sibling glide terminates.” however, it would be a kind of unexpected breakthrough if there was a constant bound. intuitively the random walks get longer, and even if they have the same deviation per step, as the walks get longer the deviation between them will tend to increase. am suspecting nearly linear increase in deviation as the lengths increase. deviation per length was studied a little a long time ago, not much recently. this gradual trend is found here via 3 different analysis methods: the stepwise, 1-triangles, and 0.64 terras glides.






(2/7) 💡 ❗ this is a remarkable analysis. decided to start looking at histograms of 0/1-runs for random iterates/ sequences vs generated ones. have done stuff like this for years but this particular angle turns out to be dramatic; a very strong signal was isolated. then hit on this particular striking pov. this graphs the histograms as slices, working through different Terras parity density starting seeds. the histograms are log scaled and calculated over the entire predetermined range but no farther. its also color coded by parity density. its a 3d splot. the histograms are “distorted” for the falling vs climbing glides, and nearly linear for the middle range, which is itself a remarkable finding that dont recall seeing previously. its a key manifestation of a log-scaling phenomenon, characteristic of fractals. aka yet another fractal angle to the problem identified/ isolated. over the whole range it graphs into a 3d shoe-like figure. do recall another experiment likely isolating this “high end histogram” distortion; it would take awhile to locate it. there are other similar experiments touching on it, am thinking of construct9c in particular (which has a histogram scaling idea reused several times later).
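the “near-linear in log scale” observation can be illustrated with a small python sketch (synthetic ~½-density seeds here, not the actual Terras-generated trajectories of the experiment): in a mixed iterate, run-length counts decay roughly geometrically, so the log-scaled histogram is near-linear, and distortion away from that line is the “unmixed” signature.

```python
# log-scaled histogram of 1-run lengths in a random n-bit seed.
# for random (mixed) bits, the count of runs of length k falls off
# geometrically (~2**-k), so log(count) vs k is roughly a line with
# slope -log(2).
import math, random
from itertools import groupby
from collections import Counter

def run_hist(n, bit='1'):
    """histogram: run length -> count, for runs of the given bit."""
    return Counter(len(list(g)) for b, g in groupby(bin(n)[2:]) if b == bit)

rng = random.Random(2)
nbits = 500
seed = rng.getrandbits(nbits) | (1 << (nbits - 1))   # ~1/2 density
hist = run_hist(seed)
for length in sorted(hist):
    print(length, round(math.log(hist[length]), 2))
# successive log-counts drop by roughly log(2) per unit of run length
```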

it seems this could be nearly, or maybe even entirely, the whole story about the problem. the histogram distortion on falling vs climbing walks is almost surely due to more (longer) 0 and 1 runs, respectively. the graph also reveals the concept of mixing. after the predetermined region with its possible high incidence/ frequency of “unmixed” iterates, the postdetermined iterates are mixed the same as the “midrange” seen here. this mixing seems to be a strong transition point between the predetermined and postdetermined range. it would seem a proof now “reduces” to proving that in the postdetermined glide, a mixed iterate cant “spontaneously jump” into an unmixed one. it sounds great in theory, but in a crucial finding/ twist, exactly such “spontaneous jumps” have been contrived in the postglide drain.

overall its a very difficult idea, far from straightfwd, and seemingly/ apparently unlike any proof structure previously known: it almost seems something like “proving a definite shape or characteristic structure of noise”. nevertheless have been having some ideas on that and already have findings along those lines, am thinking in particular of the finding of repeated adjacent bitstrings in iterates preceding or “foreshadowing” “spontaneous” 0/1 runs; ie is that provably missing/ impossible in the postdetermined glide? it seems at the heart of the problem: as observed earlier in the paradigm shift, the undifferentiated region is actually a “feature.” proving this characteristic “mixing” phenomenon almost sounds like proving the direction of thermodynamic entropy increase, aka the 2nd law of thermodynamics. ❓ 😮



(2/8) 💡 the construct5 experiment showed mostly low differentiability in the initial iterates of Terras density trajectories esp over the broad midrange. the prior graph seems to show much higher differentiability over the broad midrange. this seems to reveal something deeper. maybe/ apparently there is more information/ signal past the 1st iterates in the Terras density trajectories. something to try to understand better/ deeper. this then reminds me of an old idea: it would be possible to analyze the problem in n-count blocks of sequential iterates instead of n=1, and this leads to all kinds of new directions not explored much. rethinking this, now wondering if maybe there has been way too much focus on individual iterates up until now. there were some experiments way back along those lines.

now thinking about this along with the insightful Terras ideas, its clear that any of the 2^n parity sequences is possible for arbitrary n. however, as the last experiment shows, maybe the iterates contain much more information than is “summarized” or “compressed” in the parity sequences. via Terras generation it must be possible to show that any n-count of parity sequence information can lead to arbitrary later behavior. but what about analyzing the remaining iterate bit structure? an ideal feature to look for, as long sought for years, would be a monotonically decreasing trajectory length estimator from n successive iterates. can it be ruled out by Terras constructions? dont immediately see how to rule it out… ❓
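the Terras fact cited above, that every length-n parity sequence is realized by exactly one residue class mod 2^n, can be checked directly for small n with a brute-force python sketch (illustration only):

```python
# Terras: the map (x mod 2**n) -> first n parities of the trajectory
# is a bijection, so every length-n parity sequence is realized by
# exactly one residue class mod 2**n. brute-force check for small n.

def T(n):
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def parity_vector(n, length):
    """first `length` parities of the Terras trajectory of n."""
    out = []
    for _ in range(length):
        out.append(n % 2)
        n = T(n)
    return out

def seed_for(parities):
    """the unique residue in [0, 2**n) whose first n parities match."""
    n = len(parities)
    matches = [x for x in range(2**n) if parity_vector(x, n) == parities]
    assert len(matches) == 1          # Terras: exactly one per class
    return matches[0]

print(seed_for([1, 1, 1, 0]))         # a seed forced to climb 3 steps
```

so any parity prefix can be prescribed; the open question in the text is whether the *remaining* bit structure still carries exploitable signal.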

(2/14) this is another way to look at the scaled histogram structure and confirms the immediate idea about alternation in 0 and 1 runs causing the effect, and reveals a lot about the structure of Terras iterates. earlier experiments have probably poked at this but its a neat new way to visualize it. there are 2 separate regions, the “submean” and “supermean” in terms of future glide tendency, and they show up as “wings” or “tail fins” sort of like the batmobile! here the mean-returning aspect of Terras glides is strongly encoded in the 0/1 histograms and this seems to be a really informative/ maybe fundamental high level pov on the overall dynamics, almost to the point of plausibly hinting at “general inductive structure” for Terras density seeds, with the strong hint of even deeper implication. the submean region is “0-denser” with longer 0 runs and the supermean is “1-denser” with longer 1 runs. in the opposite regions, the opposite scaled histograms are flat. this code separately analyzes the 0 histogram and the 1 histogram sequentially “stacked” in the 3d splot. taking the separate histograms removes the asymmetry of the prior analysis.



this next analysis is a “histogram difference” of the scaled histograms for the 0 and 1 runs. it leads to a complex 3d object that is not easy to characterize and spent a long time spinning the 2d projection trying to understand its structure. but basically most of the histogram difference is positive for the supermean region and negative for the submean region.



this next code adds up the scaled histogram bin differences over the Terras density iterates and finds a long-sought-after signal, apparently very fundamental. it may be a long-sought general structure to the 0/1 run distributions. its a measure of distance-from-mean and is asymmetric about the origin, ie it is not just “absolute difference from mean” (as found in some other experiments) but the actual signed difference. this is an apparently extraordinary finding!
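the signed-difference statistic can be sketched as follows (an illustrative definition with a placeholder log1p scaling; the experiment’s exact scaling may differ): sum the scaled 0-run histogram minus the scaled 1-run histogram over bins, so a 0-denser (submean) iterate scores positive and a 1-denser (supermean) one negative.

```python
# signed sum of scaled histogram bin differences: 0-run histogram
# minus 1-run histogram, summed over run-length bins. positive for
# 0-denser (submean) iterates, negative for 1-denser (supermean),
# ie a signed distance-from-mean rather than an absolute one.
import math
from itertools import groupby
from collections import Counter

def signed_hist_diff(n):
    h = {'0': Counter(), '1': Counter()}
    for b, g in groupby(bin(n)[2:]):
        h[b][len(list(g))] += 1
    bins = set(h['0']) | set(h['1'])
    # log1p is a placeholder scaling choice for the "scaled histogram"
    return sum(math.log1p(h['0'][k]) - math.log1p(h['1'][k]) for k in bins)

print(signed_hist_diff(0b1000010000100))   # 0-denser: positive
print(signed_hist_diff(0b1110111101111))   # 1-denser: negative
```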



