collatz finetuning

the last idea from 10/2018 was bouncing around in my thinking some; it left open a question about “degree of monotonicity”. it shows that the max 0/1 runs seem to be almost monotonically decreasing even in a climb/ glide. havent really looked at the monotonicity of the max 0/1 runs all that directly with the nonmono run length measurement. ofc all the recent analysis is related to monotonicity, but not exactly tied to nonmono run lengths, although there is some relation. at this point am interested in finding long nonmono run lengths in the max 0/1 runs sequence. last months genetic algorithm code looked into this but for a fixed (“1d”) bit width. what would that look like for variable bit width (“2d”)?
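to pin down the two measurements, here is a minimal sketch (not the actual code from the post). assumptions, all mine: the plain 3n+1 step is used, the glide is the stretch of the trajectory up to the first iterate below the seed, and the nonmono run length is the longest stretch of steps that sets no new strict minimum. the names collatz_step, glide, max_run and nonmono_run_length are made up for illustration.

```ruby
# one step of the collatz map (plain 3n+1 form; an assumption, the original may use another variant)
def collatz_step(n)
  n.even? ? n / 2 : 3 * n + 1
end

# glide: iterates from the seed up to and including the first one below the seed
def glide(n)
  seq = [n]
  x = n
  while x >= n && x != 1
    x = collatz_step(x)
    seq << x
  end
  seq
end

# longest run of identical bits (0s or 1s) in the binary form of n
def max_run(n)
  n.to_s(2).scan(/0+|1+/).map(&:length).max
end

# longest stretch of steps that does not set a new strict minimum
def nonmono_run_length(seq)
  best = 0
  count = 0
  low = seq.first
  seq.drop(1).each do |v|
    if v < low
      low = v
      count = 0
    else
      count += 1
      best = count if count > best
    end
  end
  best
end

runs = glide(27).map { |n| max_run(n) }
puts "glide length #{runs.size}, nonmono run length #{nonmono_run_length(runs)}"
```

a sequence that strictly decreases at every step scores 0, so larger values mean the max 0/1 runs sequence is further from monotone.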

the last “2d” genetic algorithm search code was mix32 from 9/2018. the mix series is aptly named because it mixes quite a few ideas across many algorithms. it has single dimensional genetic algorithms, bitwise optimizations, and 2d genetic algorithms (ie both within given bit widths and over multiple bit widths). am going to rename the 2d genetic algorithm here hybrid.

got all kinds of cool ideas for refactoring/ streamlining working on it. it has some abstraction for all the initialization and combination/ crossover operators. it has a very sophisticated 2d algorithm that is similar but different from prior algorithms. it has no restrictions on expanding “bit bins” except that the expansion operator does it only by a max of 3 bits. it dynamically analyzes bit bins for fitness similarly to analyzing fitness within the bins based on top performing candidate in each bin, and combination operators work on the top performing bins. it currently throws out the most underperforming bit bin and currently keeps the total bit bins to 50 and the size of each to 50 for max 2500 candidates. the code enforces a minimum of the max 0/1 runs in each iterate, here 6. starting bit size 10. the recent initw bit string initializer code turned out to be crucial after the minimum was added because the other initializers dont seem to create many iterates with sufficiently large max runs.
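the bin bookkeeping described above might look roughly like the following. this is a hedged sketch only, not the hybrid.rb code: the candidate layout ({ bits:, fit: }), the names add_candidate and expand, and the choice to expand by appending random bits at the end are all assumptions. only the limits (50 bins, 50 per bin, expansion by at most 3 bits, bins ranked by their top candidate) come from the description.

```ruby
# bins: hash of bit width => candidates sorted best first
# cand: { bits: String of 0/1, fit: Numeric } (hypothetical layout)
def add_candidate(bins, cand, bin_max: 50, bin_size: 50)
  bin = (bins[cand[:bits].length] ||= [])
  bin << cand
  bin.sort_by! { |c| -c[:fit] }
  bin.pop while bin.size > bin_size # trim the weakest candidates in the bin
  if bins.size > bin_max
    # rank bins by their top candidate and throw out the worst bin
    worst_width, = bins.min_by { |_width, b| b.first[:fit] }
    bins.delete(worst_width)
  end
  bins
end

# expansion operator: grow a candidate by 1..max_expand random bits
# (appending at the end is an assumption)
def expand(bits, rng, max_expand: 3)
  bits + Array.new(rng.rand(1..max_expand)) { rng.rand(2) }.join
end
```

with these defaults the population tops out at 50 × 50 = 2500 candidates, matching the figure above.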

after ~1hr this code seems to paint into a corner at a max nonmono run length ~650 and max bit sizes ~300 (total 122K candidates explored on last output) but needs further analysis/ runs. letting it run longer, about ~9hr, it reaches max 415 bit width, 1091 max nonmono run length, 1274 glide length, 609K candidates explored. this is very little code and packs quite a lot of (optimizational) punch. (honestly theres a lot of moving parts and all very sophisticated and only a very advanced student of lots of prior inquiry here would be able to comprehend it all.) what this is suggesting is that with a large enough minimum on the max 0/1 runs (here ~6), seeds of arbitrary size with arbitrarily large nonmono run lengths in the max 0/1 runs either do not exist or are very rare/ hard to find. lol, everyone got that?

this almost seems to suggest an extraordinary infinite/ finite transition point where maybe glides of arbitrary length exist (for larger iterate sizes) with small 0/1 runs but not for larger 0/1 runs, but such a conjecture while tempting would be risky/ questionable; one near achilles-heel of this type of research is (as pointed out/ commented long ago and intermittently) that its very hard to differentiate very rare/ slowly/ hard-found solutions from total lack of them/ nonexistence. have to look back over the prior FSM results to try to figure out how it meshes with those. ofc this code is fundamentally different in that its operating over a glide instead of a trajectory and is in line with the idea that both have to be meshed somehow to work toward the “final solution”.

another somewhat tricky element of this logic is in the nonmono run length analysis, which has something like a “fencepost” aspect/ factor. in the current code a later max 0/1 run of exactly the same count does not reset the length counter. the alternative is to reset it each time on equality. given the high prevalence of the equality case in the results, suspect the alternative logic could lead to substantially different results… this “order” selection happened somewhat by chance in the current code and was not intentionally assigned. from inspection the min_by/ max_by operators are apparently selecting the 1st equal element in the list (in this case preferring/ prioritizing the prior minimum). this subtlety has come up occasionally in prior code over the years…
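a small sketch of the two tie rules, to make the fencepost issue concrete. the function and its reset_on_equal flag are hypothetical, not from the actual code; the min_by/ max_by lines just show the first-of-equals behavior as observed in standard (MRI) ruby, which as far as am aware is not something the docs promise.

```ruby
# longest stretch of steps without a new minimum
# reset_on_equal: false -> a tie with the current minimum does NOT reset the counter (current behavior)
# reset_on_equal: true  -> a tie counts as a new minimum and resets it (the alternative)
def nonmono_run_length(seq, reset_on_equal: false)
  best = 0
  count = 0
  low = seq.first
  seq.drop(1).each do |v|
    if v < low || (reset_on_equal && v == low)
      low = v
      count = 0
    else
      count += 1
      best = count if count > best
    end
  end
  best
end

runs = [6, 6, 6, 7, 6, 5] # made-up max 0/1 run sequence with several ties at the minimum
strict = nonmono_run_length(runs)
loose  = nonmono_run_length(runs, reset_on_equal: true)
puts "ties ignored: #{strict}, ties reset: #{loose}"

# first-of-equals selection: each result is a [value, index] pair
first_min = [7, 6, 8, 6].each_with_index.min_by { |v, _i| v }
first_max = [7, 9, 9].each_with_index.max_by { |v, _i| v }
puts "min_by picks index #{first_min[1]}, max_by picks index #{first_max[1]}"
```

on this toy sequence the two rules give 4 versus 1, so with many ties in the real data the choice could plausibly move the reported maximum a lot.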

here is a graph of the max 0/1 runs (red, left side scale) and iterate bit width (blue, right side scale) for the final iterate/ glide found after the code was terminated at ~9hr. from the graph theres very strong overlap/ “coupling” between the 6-min 0/1 run range (entirely the glide trail extending into some pre-descent climb range) and the glide decline. this is generally aligned with/ supports the “crunch” idea but again specifics need to be worked out.

n=16298397491632173330154922373989217019805837799941759439490191263164482415394740301070197391506878932875994066099915194367

hybrid.rb

hybrid

(12/7) was thinking quite a bit about the remarkable histogram for review30 from 5/2018; my thoughts return to it & it really calls out for further attn, meant to revisit it. my recent thinking/ conjecture was that something like it might arise naturally in the descent region of all iterates. was a bit surprised to find this is not the case although in 20/20 hindsight/ retrospect the conjecture seems naive. as a relatively simple exercise created some random iterates based on ½ density and then looked at/ analyzed the 0/1 run histograms over 10 iterate windows.
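the exercise can be sketched as follows. this is a minimal version under assumptions, not stats.rb itself: the plain 3n+1 step, a 100 bit starting width, non-overlapping windows, and all the function names are my own choices for illustration.

```ruby
# one step of the collatz map (plain 3n+1 form; an assumption, not necessarily the variant in stats.rb)
def collatz_step(n)
  n.even? ? n / 2 : 3 * n + 1
end

# random iterate of an exact bit width: leading bit 1, remaining bits fair coin flips (~1/2 density)
def random_iterate(width, rng)
  ('1' + Array.new(width - 1) { rng.rand(2) }.join).to_i(2)
end

# histogram of run lengths, kept separately for 0 runs and 1 runs, pooled over a list of iterates
def run_histogram(iterates)
  hist = { '0' => Hash.new(0), '1' => Hash.new(0) }
  iterates.each do |n|
    n.to_s(2).scan(/0+|1+/) { |run| hist[run[0]][run.length] += 1 }
  end
  hist
end

# trajectory down to 1, cut into consecutive windows, one histogram per window
def window_histograms(seed, window: 10)
  seq = [seed]
  seq << collatz_step(seq.last) while seq.last > 1
  seq.each_slice(window).map { |w| run_histogram(w) }
end

hists = window_histograms(random_iterate(100, Random.new(1)))
peak = hists.first['1'].max_by { |_len, count| count }.first
puts "windows: #{hists.size}, most common 1-run length in first window: #{peak}"
```

each entry of hists holds the 0 run and 1 run length counts for one window, which is the raw material for the color coded curves.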

results are as follows. the earlier histograms are color coded hotter/ starting 1st because it seems like it looked better that way eg wrt overplotting. the 0 runs are on left and 1 runs on right and theres a strong symmetry in this case. the prior peak at width 4 is not apparent, the peaks are instead always at 1 although the histogram shapes do change somewhat. so there is some scale invariance found here but not like the earlier case. am looking at/ thinking about other ways to normalize these curves which so far dont seem to add insight, need to try by iterate bit width next. am also wondering about the earlier data for post-glide region, would it look the same as previously or as below? in both cases theres some kind of (different-yet-similar) stability in the histograms over iterates…

another way to look at this, it also again shows the distinction of entropy vs density. the review30 distribution and this one both have ½ density but different entropies. have to think about this more and what it means. it seems almost like 2 different “fixed points” of the collatz mapping (for both density and entropy) and maybe it makes sense to study fixed points more generally. some of this is confirming the current hypothesis about these random-looking patterns actually themselves being a sort of pattern as seen in the histogram statistics. are there some more subtle patterns embedded in them? ❓

stats.rb

stats

another fairly simple calculation. this looks at max 0/1 runs (red) over a 5-iterate window. the bit width average is in blue (right side scale) and declines in a rough linear way but the red max is a little warped/ nonlinear in comparison, a possible edge to exploit? also graphed are ratio and inverse ratio (green, magenta). this suggests theres “something going on” in the “higher length” max 0/1 runs as seen in some prior analysis.
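a rough sketch of the windowed calculation, not stats3.rb itself. assumptions: windows are non-overlapping slices of the trajectory, the plain 3n+1 step is used, and “ratio” is taken to mean window max run divided by average bit width (the post does not spell out the definition, so that part is a guess). function names are hypothetical.

```ruby
# one step of the collatz map (plain 3n+1 form; an assumption)
def collatz_step(n)
  n.even? ? n / 2 : 3 * n + 1
end

# longest run of identical bits in the binary form of n
def max_run(n)
  n.to_s(2).scan(/0+|1+/).map(&:length).max
end

# per window: largest max 0/1 run, average bit width, and their ratio (ratio definition is a guess)
def windowed_stats(seq, window: 5)
  seq.each_slice(window).map do |w|
    max = w.map { |n| max_run(n) }.max
    avg_width = w.sum(&:bit_length).to_f / w.size
    { max_run: max, avg_width: avg_width, ratio: max / avg_width }
  end
end

seq = [2**60 + 12_345]
seq << collatz_step(seq.last) while seq.last > 1
windowed_stats(seq).first(3).each { |row| p row }
```

the inverse ratio is just avg_width / max_run under the same guess.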

stats3.rb

stats3

(12/14) this idea occurred to me thinking about density + entropy. how can one vary the two independently? there is some interrelation. my idea was to start with a fixed density and then vary the entropy over that density. this involves cutting or sectioning the 0/1 runs similar/ related to an earlier idea from 10/2018 in the gap3 code. from this code it seems like maybe entropy cannot vary over all values for given densities. this is probably not hard to prove once found but not very trivial either.

in this graph the x axis is density, the y axis is entropy, and the color is the change in new iterate size versus prior iterate size using the compressed mapping. the pyramid shape shows the entropy-density restriction.
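one way to see where a pyramid-shaped restriction could come from, sketched under an assumption: entropy is measured here as the number of 0/1 runs divided by the bit width (the post does not restate its definition, so this may differ from blend.rb). with k ones and n − k zeros the runs alternate, so there can be at most 2·min(k, n − k) + 1 of them, which caps entropy at roughly 2·min(d, 1 − d) for density d, a triangle peaking at d = ½. the code fixes the density and then cuts the runs into more pieces to raise the entropy; names are hypothetical and bit strings are allowed to start with 0.

```ruby
def density(bits)
  bits.count('1').to_f / bits.length
end

# assumed entropy measure: number of 0/1 runs over bit width
def entropy(bits)
  bits.scan(/0+|1+/).length.to_f / bits.length
end

# most runs possible for a given count of ones and zeros (runs must alternate)
def max_runs(ones, zeros)
  return 1 if ones.zero? || zeros.zero?
  2 * [ones, zeros].min + (ones == zeros ? 0 : 1)
end

# cut total bits into `parts` runs: one long run plus single-bit runs
def cut(total, parts)
  [total - (parts - 1)] + [1] * (parts - 1)
end

# bit string with the given counts of ones/zeros and exactly `runs` runs
def build(ones, zeros, runs)
  raise ArgumentError, 'run count out of range' unless (2..max_runs(ones, zeros)).cover?(runs)
  # the more plentiful bit goes first and takes the extra run when runs is odd
  first, second = ones >= zeros ? %w[1 0] : %w[0 1]
  a = cut([ones, zeros].max, (runs + 1) / 2)
  b = cut([ones, zeros].min, runs / 2)
  a.zip(b).flat_map { |x, y| [first * x, y ? second * y : ''] }.join
end

width = 16
points = (1...width).flat_map do |ones|
  zeros = width - ones
  (2..max_runs(ones, zeros)).map do |runs|
    bits = build(ones, zeros, runs)
    [density(bits), entropy(bits)]
  end
end
puts "#{points.size} (density, entropy) points, max entropy #{points.map(&:last).max}"
```

under this measure no point can sit above the line 2·min(d, 1 − d) + 1/width, which would give the pyramid outline; whether it matches the shape in the graph depends on the entropy definition actually used.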

blend.rb

blend

 

 
