# collatz match

that last graph from last month shows clear signal on Terras-generated glides wrt a matching feature/ metric. so then a simple exercise is to look at last months matching metric/ feature associated with the trajectory database. this code works on glides longer than 10 and calculates the corresponding match count up to that point. the color coding is by 1 of the 8 generation types. the ‘i’ matching metric shows some signal relating to the glide lengths for some of the trajectory generation types. there seems to be an inverse correlation. but its also a bit noisy and maybe “fooled” by some of the generation types. there is some further adjustment to the increasingly general plotting routine. there are so many varieties of plots and its hard to build a general system/ API that services all of them, gnuplot is something of a wonder in its flexibility/ power at times.

(later) an obvious idea on hindsight is to sort generation types by glide lengths. thats a 1 line modification in the 2nd graph. the color scheme by generation type is different for the graphs. it makes it more apparent that maybe 1 generation scheme, the blue one in 2nd graph, is generally low in the metric while others seem to be higher or more random. which is the blue scheme? obviously it would have been nice to label the generation schemes in the graphs, didnt figure it out yet.

review134.rb

review134b.rb

(9/3)construct20 and construct41 are worthy of a lot of focus and have been thinking about them more. am increasingly convinced some of the construct20 effect may be due to truncated glides, have to look at it more closely. as for construct41 heres a new simple explanation on further thought. the x · 3^n matching function is likely to hover around ½ density and the early correlation vs anticorrelation in the Terras-generated glides is simply glide parity density versus the ~½ density. so this immediately suggests, while there seems to be substantial usable signal, there needs to be some more sophisticated ways of analyzing the various density parity functions/ constructions.

😳 further thinking about it, maybe this explains some trend but this does not explain the very high (anti)correlation in the early part of the graph. or does it explain any trend at all? on 2nd thought am thinking the 3^n sequence does not have random ~½ density (parity). have to stop thinking and start coding! the graphs reveal something and omit something else. as usual theres some deeper story here. am going to have to sharpen my statistical intuition even further! always room for improvement it seems! 🙄

❓ another idea: construct40 from last month showed the 3^n sequence has ~½ density parity, actually slightly higher, for iterates in the drain region. but presumably those would act differently than those in the predetermined range of a Terras-generated glide. this is now reminding me of Bohms idea of the implicate order in the form of the analogy with different surfaces of a fishbowl. each graph is like a pane. something emergent is going on here. have to try to integrate/ synthesize a picture from at times inherently-reductionistic tools. also have been enthusiastically cranking out new experiments quickly without a lot of additional analysis.

(9/4) immediately rectifying that some. this is a revision of construct41 the last experiment from last month revealing an intricate dynamics hinted at but not really anticipated. in short the density parity dynamics of the Terras-generated density variations have very strong signal in the 3^n sequence (call it the 3-power sequence) and at least initially its far from ½ density as found in the drain, which is very exciting because it suggests a nondrain vs drain differentiating signal!

the graphs are not labelled in a very descriptive way; heres the rundown. the 1st graph is almost indistinguishable from the construct41 code, its the cumulative match ratio for the 3-power sequence where its density parity is odd. in the 2nd graph some of the expected anticorrelation is found; this is cumulative match ratio for 3-power sequence density parity even, in other words often when 3-power sequence density parity is ‘0’ the bits mismatch leading to a ‘1’ in the sequence parity. then in the 3rd+4th graphs the running density average is graphed. it was found to have overlapping structure so its split into the two sets of 50 iterates (the graph label is not really accurate). the higher parity density Terras iterates tend to have narrower running average density and low parity density Terras iterates have a very strong running average density signal. overall theres some complex signal here and it will take awhile to untangle it/ exploit it for proof purposes, but it seems to be very significant. basically the 3-power sequence seems to function as a forecasting indicator/ verging on system… ❓

construct42.rb

(9/7) this is a quick study of the running average density parity of the 3-power sequence generated starting from Terras density gradient glides (seeds). its mostly predictable/ expected from/ in line with last results but worth looking at. its inverted from the lower density to higher density seeds seen on right side (actually seen left to right), 1st graph hotter Terras density glides to the bottom and inverted in 2nd graph. note this 3-power sequence is “compressed” in a sense that each iteration is multiplied by 3 instead of in the matching sequence where the 3 multiplication is aligned with the collatz sequence odd iterates/ parity.

❓ at this point a basic observation is that the 3 power sequence finds the extremes of the Terras density glides from low to high but cant distinguish between the two. some special additional twist is needed. dont have an immediate idea right now.

construct43.rb

(9/8) this is a quick riff on construct43 that realizes a rather straightfwd idea based on its dynamic of the 3-power sequence. this is not exactly what is desired in terms of a “distance to drain” in a sense, but its significant. (maybe there has been some overthinking going on.) the tricky aspect is low Terras parity density glides and that they are in a sense sub-drain in their slope, they are in steeper decline than the drain slope. so are they inside or outside the drain? in a sense they are an accelerated drain that decreases in acceleration aka decelerates. but anyway the prior metric can be modified slightly to find the extremes (slopes) away from the drain in this sense. the color coding is hot are far from drain slope and cooler colors are closer. this metric is similar to the long-studied density distance and does seem like it could be key in a proof as far as proving a kind of drain slope convergence which the diagram seems to indicate. actually this is maybe/ apparently a direct revelation of the (slope) mean-returning aspect/ property of the 3-power sequence.

in other words there are 2 distinct ways of looking at the problem and maybe 1 of them (1st following) is less helpful toward a proof. (1) there is a distance of a glide and that is like a distance to the drain. a very short glide is like a short distance to the drain. (2) the trajectory is in a above-drain slope (high Terras parity density) or in a below-drain slope (low Terras parity density) and it converges toward the drain slope. ie mean returning.

construct45.rb

(9/11) 💡 shifting gears. this idea ties in/ meshes with a lot of prior ideas, came up with it after some survey. the recent ideas remind me of a proof based on “density narrowing”. the “wrench in that works” was the surprise of ufos which can arise in density ranges very close to ½. but then it was found that no ufos seem to arise in the postdetermined glide (drain). some of that work last was seen about 6/2019. reviewing it, especially notable in retrospect was the hybrid18 experiment that seems to show ‘mx01’ max bit run wedge/ narrowing/ tightening as a transition-point-like effect around the pre-to-postdetermined region! which reminds me of other experiment(s). however its also known that there seems to be some kind of scaling effect of 0/1 max runs in various regions. how to combine all this?

my new related idea is that maybe scaled max 0/1 runs in the postdetermined glide are limited and that this also tends to be associated with drain behavior. this is a quick study via bitwise analysis. it tries to maximize postdetermined scaled max 0/1 runs ‘mx01s’ while maximizing all the other basic trajectory size metrics. note for this metric if the glide is already over at the 1st postdetermined point then ‘mx01s’ evaluates to zero.

the graph is generated on the 2nd pass of the data in the review code. it finds that indeed ‘mx01s’ black narrows (right side scale) as iterate bit widths ‘nw’ (x axis) increase. this also ties in with experiments that show limiting ‘mx01’ and ‘mx01s’ max 0/1 runs/ scaled tends to limit glide length. there are a lot of related ideas (here reinvoking the Bohmian metaphor of the fishbowl glass panes) but havent thought about the problem in this emergent/ consolidated way so far. so then the proof sketch goes like this, trying to fully take everything known into account:

1. there is a postdetermined point. either it will be below the starting iterate or it will still be in a glide.
2. if its still in a glide, its ‘mx01s’ will tend to be around (at most) ~0.05 as in this graph. recall finding this in some other experiments but it will be hard to track down.
3. the small ‘mx01s’ tends to cause drain behavior. apparently via self-similarity of the iterate binary structure. however ‘mx01s’ tends to slowly increase through the trajectory and climb steeply at the end, cf review124 below.
4. there may be large ‘ufos’ after the glide, but these do not disrupt this overall inductive structure of finding glides necessarily terminate with drains (ie more technically, a climb followed by a “intra-glide” drain).
5. in other words (apparently) the postglide drain dynamic is different than the intraglide drain for this all to work, which is something of a twist and seems a little surprising/ unexpected, but the idea is now supported by multiple experiments, some of them very thorough with the hybrid search code.
6. but, maybe this is not so unexpected, because ‘cmnl’ experiments and others seem to indicate the drain may in some sense be (nearly) an entirely postclimb phenomenon and the longest climbs basically span the predetermined range and not much more (maybe less than a constant “offset” as far as current experiments seem to indicate). in other words if theres drainlike behavior in the glide (tail), its predetermined, and therefore different than postdetermined behavior ie almost in that case something like a “pseudodrain”.
7. yet another pov, the postdetermined range can be split into a (“seed-“) upper and lower part (or rather, higher vs lower), and the upper part seems to have different dynamics wrt ufos, ie not contain them whereas the lower part may contain them. the upper part may be empty if the glide is shorter than the predetermined range.

bitwise44.rb

(later)

• looking back thru this year, the very similar bitwise22b on 1/2019 looked at ‘mx01s’ over the postdetermined region and found a nearly similar constant/ dynamic. this experiment was run right before discovering ufos. there was some writeup there of a guiding conjecture/ framework but it is seen flawed (or less harshly, incomplete) in retrospect. the idea was about crunch regions and a constant ‘mxc’ that might govern crunching. even at the time there were strong hints/ signs of scaling in ‘mx01’ runs. but later analysis has brought it out more carefully/ extensively. in short instead of a bound in absolute lengths its maybe in terms of ratios instead.
• also notable in finding ‘mx01’ scaling behavior were FSM experiments fsm14 and fsm14b from 11/2018.
• actually on 10/2018 there was a lot of notable ideas + experiments + themes along the lines: bitwise9, review117, review124. actually ‘mx01s’ self-similar scaling is an old idea around here, probably almost/ nearly as old as the metric… maybe even the raison d’etre for it?
• and looking closely at/ thinking about that somewhat striking last one review124 there was an idea of ‘mx01s’ as an inverted glide/ trajectory mirror, was there/ surely there was earlier notice of its longterm dynamic/ trend? yet dont recall right now… maybe noticed but not posted? think now this was very promising and dont think it was followed up much at all… maybe because have had some aversion to floating point sequences in some contexts for some long time (there were some old ideas using genetic algorithms)…? ofc there are many that are nontrending like density/ entropy… relooking at it this moment immediately gives me a cool/ easy idea… stay tuned… it fits in further with the concept mentioned occasionally of “random walk within a random walk/ recursive self-similarity”… its almost now utterly staring me in the face… 💡 ❗ 😮
• also construct9c from 2/2019 based on Terras density gradient generation seems very notable for relating to 0/1 bit run scaling properties.
• also from the same month there were 2 experiments relating to density narrowing: bitwise24 on the limbo range (postdetermined climb range) and bitwise25 on the 1st postdetermined iterate, similar to this last experiment;
• same month also construct5e is applicable and looks at density narrowing of 1st postdetermined iterate over Terras density gradient glides, finds it very largely/ almost entirely flat, but does find somewhat diverging densities for the low and high Terras density gradients (sequence parity) for 400 bit width iterates.
• looking back very thoroughly/ even farther, 2/2018 mix4 experiment had some hint of the idea of ‘mx01s’ involved in the trajectory “trap” dynamic.

(9/13) following up, my recent idea goes like this. the ‘mx01’ and ‘mx01s’ sequences are like “inner random walks”. this theme was explored somewhat in 12/2018 with some study of the nonmonotonicity of the ‘mx01’ sequence and it was generally/ apparently found to be nonmonotonic like the collatz sequence within experimental limits. however theres more to think about with this “inner random walk” idea. do these random walks look like the collatz walk in some way? if so all the existing frameworks/ mechanics can be reused to study them. for example are the trajectory metrics over the inner random walk also expanding/ unlimited? these questions are easy to study, that is, leveraging prior code (fineprint: built up at great labor). heres a quick cut. this attempts to expand (optimize up) the trajectory statistics for the ‘mx01’ sequence and finds they seem to be unlimited.

bitwise45.rb

this is a quick study, changing a few lines, of studying standard trajectory statistics of ‘mx01i’, the inverse/ reciprocal of ‘mx01s’, ie bit width of iterate divided by ‘mx01’, as noted highly inspired by review124 graph/ results. there seem to be no obvious limits. the slopes are different but nothing else really stands out except for some difference in ‘da’ density distance trend. these two experiments suggest that the inner random walks have somewhat similar aspects as the main (collatz) random walk. roughly null results. as they say it was worth a shot™. on other hand, shifting pov, maybe it a strong confirmation of fractal selfsimilarity… ❗

❓ however, there can be some remaining open question. do the inner random walks correlate with, or even stronger, apparently control the outer ones? that was some of the original rationale for investigating them. and the prior proof sketch/ outline proposes that ‘mx01s’ sequence is a key to understanding/ proving the problem. but, have to think hard about how to formulate/ formalize this, seemingly havent done anything like it so far…

😳 these included the review code. was in rush, forgot to rip it out, did not try/ run it, only ran optimizations. it accidentally calls on nonexistent variable(s).

bitwise46.rb

(9/14) 💡 ❗ this is an extraordinary finding! the basic idea here could have been studied long ago but it was missed, again very sophisticated analysis is brought out/ extracted with careful statistical technique over large aggregates/ samples and not even any optimization (on which massive attention/ effort/ focus has been expended at this point) is required, again bringing out the einsteinian concept in some sense that the theory determines what can be observed. only recent ideas led to this inspiration. on almost a whim, not knowing what to expect or even whether to expect anything at all, its a study of the ½ density range and possible hidden structure and uncovers it rather dramatically. its not especially complicated but ingenious.

the starting idea is to look at any possible effect of ‘mx01’ on the seed (ie wrt basic trajectory properties/ dynamics). but then a clever new distribution is introduced, somewhat similar to constructions seen for hybrid init seed code earlier, yet probably novel. it creates varying ‘mx01’ iterates by building a ‘1’ run of random length spanning up to half the seed bit size. then it matches that by setting any of the remaining unset bits to ‘0’ but they may not be adjacent. then any remaining bits, which are again not necessarily adjacent, are set to 0/1 in a balanced way. (dont recall this before but do recall similar ideas, there was an old experiment looking at varying entropy over ½ density, suspect it was similar, have to dig it up to compare…) all this ensures a ½ density seed overall but with (at least) the desired 1 bit run length up to half the iterate size. would have to do a big review of prior generation ideas to understand how it relates to them.

next 100 samples over the smoothly varying ‘mx01’ distribution are generated, and then sorted by ‘mx01’. standard trajectory statistics are evaluated including ‘nw2’ the bit width of 1st postdetermined iterate, and lsb1, lsb0 the 1/0 run lengths of the initial iterate. these 2 last metrics are found to be basically zero. note cg, cm are 10x scale and nw2 is 1/10 scale, ‘c’ green on right side scale. theres a lot going on here, a lot of “moving parts”…

• the amazing finding is that ‘cm1’ the global iterate max blue is highly correlated with ‘mx01’—didnt really predict this at all but it makes perfect sense/ is exactly in line with prior ideas in retrospect. this shows up better after averaging over 100 datasets. it is a breakthru to find this connection with ‘cm1’ because its so central to any possible (inductive) proof structure.
• theres been some idea that cm, cm1 might vary in some significant way but in many prior experiments that is not observable, here is a dramatic counterexample/ exception.
• the “local” cm, cg variables magenta, red for glide max and glide length descend gradually with highly correlated spikes. not sure what to make of them yet but my strong initial suspicion is that theyre due to lsb 1-runs in the middle of the glide, but maybe for only individual anomalous glides in the 100 count dataset.
• ‘mx01’ not very visible in yellow is very smoothly linearly varying after the 100 sample averaging (its almost a shame not to highlight this in this graph, its a central feature of the analysis).
• in this graph ‘c’ trajectory length is very noisy and looks like it might have some upward trend but maybe its misleading (ie just random).
• its worth noting while ‘nw2’ appears very flat in this graph, on closer look and graphing them separately ‘nw2’ and ‘c’ are highly correlated with bumps/ spikes and this isnt surprising, its uncovered in some prior experiments and has to do with (or is even fully explained/ “caused by”) the roughly constant slope associated with the drain.
• also on closer look apparently both ‘nw2’ / ‘c’ tend to increase in deviation/ spread from left to right around a roughly flat trend. not sure how to think about that/ explain it right now.
• something else notable, ‘cm1’ is roughly flat until the 2nd ~½ of the graph right which corresponds to ~¼ of the iterate bit size, ie ‘mxs’≈¼ ie an apparent transition point possibly/ seemingly found in other experiments.

overall this is a very big deal because in some sense its a victorious/ vindicating triumph over the long/ deep/ seemingly near-deadly curse of undifferentiability in exactly a key way that meshes with piles of prior analysis and hints at, even points to a proof structure! (“deadly” may sound like dramatic overstatement, but if one spends an entire lifetime laboriously going in circles on a problem—with a directionlessness and/ or hopelessness even approaching “near zombie-like”—its not entirely inapt!) ‘cm1’ is long recognized as a critical aspect of the problem but has not been studied too specifically with the idea that ‘cm’ and ‘cm1’ havent been seen to diverge much so far.

(9/16) 😳 oops lol! dashed that off too quick! the lsb code was coded very quickly and not unit tested and while looking plausible turns out/ shows merely thats not enough. it invariably returns 0 due to 2 different problems, can you spot them? exercise for reader…

construct48.rb

(9/17) ❗ this quickly fixes the lsb logic and adds some additional analysis of ‘lsb1mx’ and ‘lsb0mx’ the max lsb1/0 runs found in the trajectories. something very significant is going on here but really need to look at sequence grids at this point. ‘lsb1’ red is plotted right side scale here. the very strong ‘lsb1mx’ blue correlation with ‘cm1’ lightblue is extraordinary and unexpected and seems to point to/ signify some very big/ interacting story… ‘lsb0’ green + ‘lsb0mx’ magenta have nearly similar/ corresponding trends also without the bumps… a key unaddressed question is where the maximums are occuring in the trajectory…

construct49.rb

(later) 💡 😎 😀 on closer inspection quickly looking at trajectory grids simply in binary text and a few case studies the story seems somewhat simple/ coherent and can immediately 1st be put into words instead of pictures. and then can include some graphical examples. actually think prior ufo grid diagrams probably already reveal most of this dynamic! note that this seed generation code does not ensure uniform size seeds (something of an oversight/ afterthought/ “TODO” in the quick/ exciting rush) but suspect they are almost entirely the same size and at moment dont think adjusting generation for uniform size will change results much.

• looking at the right region from above graphs, ie longer 1-runs, the generation code creates long 1-runs (but also long 0-runs) which tend to show up as ufo-triangle-like structures.
• a better word is probably “wedge” as applied earlier because the general triangle shape is somewhat distorted on the lsb-side edge. the 1-wedges are not typically starting at lsbs although the 0-wedges do as seen in the above rising ‘lsb0’ statistic. there are both 1-wedges and “smaller” 0-wedges.
• as the sequence evolves, the “1-ufo” shifts toward the lsb, but its lsb-nearest edge tends to get “mixed”. this mixing may happen faster or slower than the shifting of the 1-ufo/ wedge. the sizable but shorter 0-runs/ ufos/ wedges tend to get mixed faster/ earlier.
• if the 1-wedge mixing is “faster” than the wedge shift, the wedge disappears without causing the trajectory to increase, and no additional glide ensues.
• in the other case if the 1-wedge lsb-shifts faster than the mixing, the wedge “collides” with the lsb edge and then turns into a (familiar/ known) lsb-triangle, and the iterate bit widths increase roughly by the size of the triangle, and if the triangle is big/ early enough in the trajectory, this will cause the trajectory max bit width to exceed the initial seed and therefore “cause” a substantial (further) glide.
• in other words for the longer 1-run range seeds there tends to be 2/ bimodal/ “bipolar” cases of no glide vs sizeable glide (spiking like phenomenon).
• its also important to visualize the difference between local index max ‘cm’ and global max ‘cm1’ and that this is all for the latter. eg for ‘cm1’ a trajectory could be decreasing starting from the seed and then lsb 1-run spike into “late starting” glide.

(later) ⭐ ❗ the challenge has always been relating a local limit to a global limit, that was recognized/ explained early on. this idea just occurred to me. ‘mx01s’ the local measure seems to somehow correlate with/ limit glide lengths. throwing together many diverse ideas, so then how about looking at a ratio of a glide length metric by ‘mx01s’ or some f(mx01s)? then is there a bound on that function? prior work is pointing in this direction and it seems to be only a matter of choosing a function with the necessary/ sufficient trend. so experimentally it appears that $g({h}, {mx_{01s}}) = \frac{h}{{mx_{01s}} ^ \frac{1}{3}} < {h_{mx}} \approx 4$ where h = cm / nw ie with ‘hmx’ gray bounded by ~4 (“for sufficiently large trajectories aka nw“) from this bitwise analysis. the idea here is that ‘mx01s’ relates to a trajectory length transition point. there would be other ways of estimating f(x) through curve fitting and the exponent found would be expected to be ~1/3. (10 · mx01s, h, hmx right side scale.)

this is a relation/ bound between inner and outer Collatz random walk sequences just dreamed about! it also seems to relate to the inherent fractal scale invariance of the outer/ inner walks!

this bitwise code is modified with a neat idea of saving/ plotting extreme points (min, max) out of the equal size batches instead of randomly sampling the points (at fixed intervals). the idea is to get the key data to plot without plotting 10K · n points where n is the number of variables, otherwise the graph is mostly just overplotted/ slow. the algorithm has the very nice property that it ensures all extremal points (spikes) are in the plot and uses a small memory buffer and minimal (linear) time to calculate min/ max over the buffer, the long-used sampling method is very vulnerable to missing spikes/ extremal values. such a simple idea! really should have thought of this a long time ago! it was nearly staring me in the face!

bitwise49.rb

(9/20) ❗ 💡 ⭐ this is an extraordinary idea/ finding!* was looking back at last graph and then feeling some similar frustration. its “yet another (highly plausible/ deep) bound/ limit” that cant be proven exactly. then was thinking again about older ideas of trying to find an “ordered traversal” of an optimization as a way of finding a proof. then was led in this direction. dont recall trying it. feel that am close to something, cant quite put finger on it yet…

another way of looking at trajectories is that they have a bunch of quantifiable features. then one can look at how those features cluster. my idea was to normalize the features based on ‘z’ values (“gaussian spread normalization”) and then traverse based on that, from another pov its just another type of dispersion/ entropy measurement. so looked at traversing based on min or max of the ‘z’ values and this had the familiar problem of avoiding certain points. so then my idea was to combine traversing based on “removing” (aka advancing) the extreme ‘z’ value nodes but interlaced with code that simply removes the least iterate in the frontier. the frontier advance can be based on various schemes but here chose the familiar/ oft used “inverse collatz order.” then came up with this striking sequence. the extraordinary finding is that it converges to an apparent order.

there are 6 graphs in triples and theres a huge amt comment on. left side graphs are the main graphs and there are 2 corresponding/ following detail views of upper left/ lower right. the 1st 3 are the iterates. the pattern stabilizes by 10k iterations to a lower and upper pattern. the red points are the smallest iterate in the frontier removed every 50 iterations, the algorithm removes high entropy iterates in 50 iteration batches. the green points removed are the high-entropy iterates, typically much larger.

however, looking at the lower right detail 3rd right, strikingly the algorithm is also within each batch removing a “low” iterate with high entropy (4th green point among 3 red points pattern). after thinking about it carefully that seems to me a very critical dynamic/ challenge of the algorithm, something of an elevated design pattern! so theres some kind of tradeoff: removing points/ iterates in an orderly way (the lowest ones) which increases disorder/ entropy, then removing points with the maximum disorder to decrease overall entropy.

the 2nd triple of graphs is the entropy dynamic. it shows similar order. however, the trend looks like it might run out of higher-entropy points to remove as the lower green edge (50th rank entropy) trends upward over the red samples. it also shows the entropy ranges clustering into several bands. so overall, maybe only a temporary stability.

the overall theme however seems to be attacking fractals with fractals. if a fractal order can be found to the traversal order, and it has enough order such that its in a sense “fractal but not random” (eg in the way famous other objects such as the sierpinski gasket is), that seems to be the key to the solution… in other words finding a fractal inductive ordering.

❓ note: seems to be strange glitch where gnuplot doesnt recognize pseudo variable ‘x’ in these plot cmds with error "undefined variable: x" until at least 1 plot cmd has already been entered not with the variable.

traverse9.rb

⭐ ⭐ ⭐

(later) this is a quick revision that looks at internal variable and computes the dispersion over the frontier instead of all points and shows slightly different convergence but a resilience to the algorithm change. the 3rd graph shows the “trajectories” of individual features are chaotic but overall the system is ordered out of disordered components so to speak, esp looking at the 1st iterate extraction graph, and thinking about it more, this is actually a striking emergent property. this converges to a 3 step process instead of 4. the lower iterate stepwise pattern needs to be analyzed closely, its apparently an inductive fractal pattern so to speak. the 4th graph is the evolution of the normalized feature ‘z’ means and standard deviations.

traverse9c.rb

* (later) 😳 oops! dashed off that code too quick without looking at it. the overall structure is almost entirely correct but a single snag showed up where the 1, 2, 4 sequence is incorrectly handled as the algorithm (re)cycles thru the loop and doesnt correctly mark it as visited. or rather, it marks them as visited and then accidentally unmarks them (or remarks them as unvisited). the very regular “inductive fractal pattern” is nothing but a trivial defect. another beautiful finding bites the dust. this quick fix causes a similar pattern but the lowest iterate extraction is instead jagged and noisy as one would expect. correct code makes all the difference, too good to be true, back to the drawing board, luckily almost no one reads this blog, etc, lol!

theres something more to note in the prior graphs. in each of the 1st series, the “removed” points stay at the very bottom and in the same line, pretty much exposing the issue. in the 1st algorithm though the entropy of those points remains the same, because apparently the shifting frontier doesnt affect their entropy much. in the 2nd algorithm their entropy shifts upward as entropy of the frontier shifts.

(9/21) heres a quick fix of the issue and immediately the extraction order of smallest iterates, while not extremely disorderly, somewhat melts away as one would expect. here the banding of the extracted high entropy iterates into distinct regions is even more apparent (1st graph). this is presumably due to binary patterns in the iterates, and/ or maybe traversal patterns, have to look at this more closely. interestingly, rather “anomalous” lower trajectories like 27 are extracted early on, think this may be due to them having larger/ extreme measurements such as glide/ trajectory length etc.

❓ 💡 on looking at the graphs, an immediate question comes to mind, is there some way to advance the frontier that decreases overall entropy? another idea might be to look at multiple smaller points to extract instead of a single one and extract those that are most in line with prior order. another idea is looking at reorderings of the extracted points to decrease disorder after the extraction ie a sort of (sliding) buffer operation.

so taking a broader pov, it seems getting an orderly induction sequence will “take every trick in the book” so to speak. but maybe here are some new ideas, or a subtle yet key refinement on prior analyses: prior optimizations are usually looking for bounds in the problem. the overarching/ conquering optimization algorithm has a somewhat different goal, and somehow has to track/ manage the overall entropy of the traversal.

theres a basic tradeoff, all possible successful induction orderings must have order/ low entropy, whereas it seems almost all traversal orders are disordered/ have high entropy. so its again like finding a needle in a haystack: but the needle is the ordered proof (if it exists) in the disordered haystack of all possible orderings. another metaphor is an intelligent/ finetuned strategy for “chewing and digesting” entropy. and immediately one imagines that ML+AI might be well suited to attacking this. in a sense the entropy of the problem is the fundamental adversary…

traverse10.rb

(9/22) 😳 there is a new subtle glitch to notice. did you catch it? lol! (only a very astute reader could/ would spot the “hole” here, and deja vu, suspect this following aspect was noticed/ remarked on/ coded for years ago also.) the algorithm is finding lowest points in the frontier, but there is no guarantee the frontier is thoroughly covering all integers as it advances. a quick code check reveals that (not surprisingly!) it isnt… hence it can move “past” lower numbers (maybe) without ever returning to them. so thats something crucial to address/ fix. dont have an immediate idea what the approach should be. ❓

(9/23) this code fixes the issue, its a bit quick and dirty, some near-hacks, it uses global variables maybe more than is elegant. it adds “smallest untracked” iterates along with misc other adjs/ tweaks. the traversal and entropy patterns differ in significant ways, most notably the inner/ middle bands are missing from the traversal and it separates into two extreme bands. but am thinking while interesting maybe this is not exactly the way to go. already have a bunch of new ideas from this stuff. 💡

traverse11.rb

(9/24) cant always build to something bigger so to speak. heres some quick ideas/ detour. was wondering about statistics of “all possible glides of length n“. or rather, “all prefixes of length n that are glides”. this was studied years ago, but dont recall doing this particular experiment. (also, discovered entropy after the earlier experiments.) this is not very efficient code because it constantly recomputes glide prefixes but its acceptable. (nontrivial exercise, find the efficient/ incremental version; hint, it must necessarily be very similar to the Terras algorithm/ theorem.) as found earlier the sets increase in size exponentially. after 23 bits it finds 85061 glides. it computes the basic statistics, displayed here. its natural to wonder if some of the trends are asymptotically approaching constants and if that could play some kind of role in a proof (part of the motivating idea).

in the 2nd graph, my idea was to just advance a fixed maximum sample of points here 10K. for advances smaller than that the algorithm is exact and then after that point, iteration #20, the algorithm is an approximation. the results quickly diverge and are visually “obviously” not accurate, theres a “decay” + discontinuities reflected in the trends, for most except maybe a few of the metrics. an interesting idea, is there some better/ efficient way to sample the points that increases the accuracy of the extended trends? this could be very tricky due to the fractal nature of the problem, ie later iterate dynamics do not seem much related to/ computable from earlier dynamics. ❓

construct51.rb

construct52.rb

⭐ ⭐ ⭐

(9/26) ❓ 💡 ⭐ the recent new entropy+traversal ideas are stuck in my head. feel something is a little off, cant put my finger on it. they are more sophisticated than prior experiments. using many/ all the existing features as a combined entropy metric seems like an advancement and like it could point to something deeper. it seems it might help with emergent properties of finding order in the disorder. the ideas are a little more developed/ refined than last time but something is still missing. thinking over the schemes in a more general/ theoretical way, have some new ideas. it seems like theres 3 key interacting components:

• entropy metric: this measures different apparently relevant features of the computations, probably in some normalized way. the computations are not known to be halting or not. the computations are a verification of a property of each object from an infinite set, here integers and collatz trajectories. am thinking that (as proposed in an important paper cited on this site, but not noticed much elsewhere) entropy tends to correspond to computational complexity. am wondering if there is something or some advantage to trying to figure out a “stationary” way of defining entropy where the word comes from statistics theory + random walks, eg the ‘h’ and ‘v’ metrics defined earlier that are floating point ratios of length or widths over starting iterate and maybe arent increasing over all trajectories. as entropy is currently defined above it will increase as trajectories increase because the trajectory lengths are unbounded, although thats not as clear with normalized lengths, but seems highly probable. it has to do with whether the algorithm can ever completely “move past” trajectories with low entropies before it visits high ones, but from experiments, it seems like the extremes in entropy go on forever at very nearly adjacent positions, or even among very many iterates spanning long ranges…
• traversal algorithm: this is any algorithm that provably visits all objects aka integers in some “ideal” order. this is a decidable algorithm in the sense that it decidably computes a (1-1) mapping of visiting sequence to all objects (integers). defined more technically/ strictly in this way, actually have not explored different traversal algorithms too much. counting, f(n) = n, is basically the simplest traversal algorithm. have explored “possible” traversals that are implied by measuring other properties and then looked for order. these are actually quasi- or pseudo-traversals in contrast to what might be called a complete traversal but theyre not provably decidable because theyre based on collatz trajectory properties. in a way the traversal algorithm seems to be provably decidable without “knowledge” of the (possibly very diverse/ noisy/ chaotic-looking) features/ trends/ patterns of objects being visited. note that some traversal algorithms may be highly irregular, disordered, semi-random, or fractal-looking. as pointed out long ago in another context, its possible to use pseudorandom number generators to build complete traversals.
• emergent entropy/ traversal dynamic: the entropy trend traced out over a quasi- traversal algorithm will have different properties. it could be noisy or smooth. it seems the goal is a smoothly, nearly monotonically increasing entropy metric. a declining one seems not realistic or achievable. a key aspect is “how complete” the quasi-traversal is, ie looking at the entropy of objects not being visited vs those that are being visited by the traversal algorithm. it could be the algorithm is (consistently) excluding objects with certain entropies eg high or low ones in which case it clearly needs adjustment. as noticed above it could be excluding lower integers (which would generally/ on large average tend to be lower entropy) etc. eg the dynamics of the last traverse11 are quite distinctive in various ways, eg the iterate traversal ordering separates into two distinct/ maximally separate bands, one increasing exponentially with powers of 2, and the entropy dynamics has low iterate points scattered in red with mostly low entropy but spiking, and the high entropy point removal in green is mostly sparse in the upper range and then clusters thickly in a lower band, etc.

phased in this way, then the core problem seems to relate to (but not entirely consist of) (1) finding a traversal algorithm that leads to a (provably) smooth entropy dynamic. or, “approaching from the opposite side,” (2) semi randomly traversing in the order of what appears to be a relatively smooth entropy curve and then trying to find “nearby” decidable/ complete traversals. the traversal algorithm must visit objects with widely varying entropy but in a way that smoothes out the average entropy. this could involve a running average or visiting the objects in bunches/ batches.

hence so anyway there are two competing sides ie (1) smooth entropy traversal, ie quasi traversing in order of entropy dynamics, which increases disorder in the traversal, vs (2) orderly traversal, which increases the noisiness of the emergent entropy dynamic. and the solution seems to be adjusting each side so they “meet in the middle”.

all this is substantial and yet leaves out one final critical element, it leads to the final conclusion.

• “final solution/ proof”: this smooth entropy dynamic somehow tends to imply the aggregate computation is orderly (ie uncovers hidden order!) and therefore apparently decidable. this is already observable to some degree in the sense of traversal algorithms clearly/ effectively smoothing out entropy dynamics but dont have the exact specifics yet on converting the “smooth traversal” into a decidable algorithm. it seems to involve further mostly unexplored techniques. somehow it relates to combining/ merging/ reorganizing/ packing/ rearranging/ “refactoring” separate computations with high disorder into new computation groupings with less disorder. ie finding hidden correlations in what in raw form appears to be uncorrelated. and here the relationship to ML (which extracts signal from noise so to speak) seems to be apparent. theres also an associated emergent feature dynamic eg traverse9c 3rd/ 4th graphs etc, (also generated/ inspected for the following 2 but similar/ not displayed)

so, this all makes the fuzzy a little more distinct. it reminds me of trying out/ groping for concepts/ words for (eventually settling on) “pre vs postdetermined.” looking this over there is some ambiguity in the use of the word traversal in two different distinct senses. but maybe thats not a bug but a feature. it seems to be a word that cuts to the heart of the problem in multiple ways. there is problem/ entropy-driven quasi-traversal (the world of disordered but finite fractals) and there is order-driven traversal (the world of decidable algorithms). somehow maybe traversal can bridge the gap…? note that traversal seems to be the algorithmic equivalent of an inductive process and/ or the old idea of infinite descent invented/ pioneered to great effect by one of the celebrated pioneers of number theory—and one of my personal heroes even as early as a teenager—Fermat.

(9/27) this is some relatively quick revision of prior code and a few new ideas. it has 4 more frontier advancing operations which are conveniently very easy to add in this framework including +1, -1, top 2 bit (msb) extension used in the bitwise code (which can now be seen as a basic/ useful traversal mechanism). my idea was these might make the minimal unvisited point less likely to get stuck, which happened to some degree, but not enough to make the unvisited logic unnecessary. there was again some glitch found with the code revisiting previously visited points, hadnt fully fixed that yet, addressed with this code again, think its fixed now! this code 3-way alternately samples from the bottom, center, and top of the entropy distribution to remove points leading to a remarkable pattern! and it also makes me immediately wonder if theres a way to “shave” off the distribution such that it stays stable, and if thats somehow closer to the overall goal! still thinking + tinkering… also there seems to be something notable that all the low-extracted points quickly converge to inside the current distribution… ie in a sense no outliers… another remarkable attribute of the stability is the unchanging center of the distribution… the center seems to be very dense with many points sharing apparently very close )( entropy… ❓ ❗ ❓

traverse12.rb

## 2 thoughts on “collatz match”

1. Pingback: intractable collatz | Turing Machine