intractable collatz

(background color: truly is the nightmare before xmas around here lately. seems as if santa died this year.) when in doubt, write some code. this idea occurred to me and is similar to the hybrid3 experiment from 12/2018. the idea there was to run the hybrid optimizer to find trajectories with large 'cm' max glide index for fixed bit widths. here that is changed to 'cg' glide length. what will be the pattern of results? this generates 500 trajectories after 10K optimization iterations, sorts the results from longest to shortest by 'cg', and looks at the initial bit patterns. part of the hypothesis is confirmed. it turns out they are all utilizing a 1-lsb triangle to generate very long glides; this explains the lsbs. the msbs are harder to characterize. this grid plots the inverse of the bits and the msbs line up on the right in the same 'cg' order, longest to shortest bottom to top.
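to make this concrete, here is a minimal ruby sketch (my reconstruction, not the actual hybrid24 code) of the 'cg' metric being optimized and the longest-to-shortest sort; it assumes the standard compressed collatz map f(n) = (3n+1)/2 for odd n, n/2 for even n.

```ruby
# minimal sketch, not the hybrid24 optimizer: 'cg' glide length = number of
# compressed collatz steps until the iterate 1st drops below the start
# (valid for seeds >= 2).
def cg(n0)
  n, c = n0, 0
  loop do
    n = n.odd? ? (3 * n + 1) / 2 : n / 2
    c += 1
    break if n < n0   # glide ends at the 1st iterate below the start
  end
  c
end

# eg sort random fixed-width odd seeds longest to shortest by 'cg' and dump
# the initial bit patterns (the hybrid code searches instead of sampling)
seeds = Array.new(500) { rand(2**100) | 2**99 | 1 }
seeds.sort_by { |n| -cg(n) }.first(5).each { |n| puts n.to_s(2) }
```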

but generally this all fits into a brand new hypothesis: the longest glides by bit width utilize the 1-lsb run structure. this would seem to explain a lot of other experiments that optimized but did not look further at the bit structure. the similar hybrid3 experiment did find/ notice the 1-runs but didnt notice at the time the probable lsb link. there was a series of experiments to find longest glides or trajectories limited by 1-run lengths. my suspicion now is that for those longest trajectories found, the 1-runs tend to be in the lsb. it would have been simple to check at the time, but that wasnt done. recall this experiment shows that starting from random seeds, optimization for the longest glides converges to this 1-lsb bit structure. therefore it seems likely (all) other optimization schemes (eg bitwise etc) would tend to follow the same pattern.

this is a very simple idea building on lots of prior analysis/ features but hasnt been surfaced from this angle until now. it also reminds me of the construct49 experiment from 9/2019. the basic idea there was that, iterating, the way 1-runs eventually intersect with the lsb seems to have the key effect on glide lengths.

some days it seems like a proof is still far off or unreachable, and other days it seems like just a slight twist or change of focus could bring out the path to solution. when simple unifying ideas are found that were previously missed, as here, its a cause for optimism.

💡 building on this, general proof idea: the longest glides have long lsb 1-runs, up to a point. all shorter glides have shorter or missing lsb 1-runs. this points to some kind of induction on the lsb 1-run length. this rough idea has come up previously. from prior observations there are instances of adjacent lsb 1-runs in trajectories, but maybe they are limited in size somehow? thinking it over, have not yet studied the scaled lsb 1-run… ❓

hybrid24.rb

hybrid24

(12/16) that code deserves some further look. this adds some straightfwd but somewhat timeconsuming analysis code. it samples 200 of the trajectories found during the hybrid optimization, sorted by 'cg' glide length. trajectory length 'c' and glide length 'cg' are plotted on the right side scale. am guessing the downward spikes in the left 'c' range are probably due to 0-runs. the code looks at the size of the lsb triangle ie initial lsb 1-run as the 'c1' measurement. (as hypothesized/ suspected/ guessed previously, somewhat described previously but maybe not in detail/ "explicitly") it closely tracks the trajectory lengths. then the max lsb 1-run in the remaining part of the trajectory after the initial triangle 'mx1' is measured, along with the max 1-run 'mx2' in the same remainder section. the statistics show a transition-point-like phenomenon.
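a rough sketch of the three run measurements as described (my reconstruction, not the actual analysis code), reusing the compressed map from the earlier sketch:

```ruby
# 'c1' = initial lsb 1-run of the seed (the lsb triangle size); 'mx1' = max
# lsb 1-run in the glide remainder after the triangle; 'mx2' = max 1-run
# anywhere in that same remainder.
def lsb_run1(n)          # length of the lsb 1-run (0 if n is even)
  n.to_s(2).reverse[/\A1*/].size
end

def max_run1(n)          # longest 1-run anywhere in the binary form
  n.to_s(2).scan(/1+/).map(&:size).max
end

def triangle_stats(n0)
  c1 = lsb_run1(n0)
  n = n0
  c1.times { n = (3 * n + 1) / 2 }   # the triangle is c1 consecutive up steps
  mx1 = mx2 = 0
  while n >= n0 && n > 1             # scan the rest of the glide
    mx1 = [mx1, lsb_run1(n)].max
    mx2 = [mx2, max_run1(n)].max
    n = n.odd? ? (3 * n + 1) / 2 : n / 2
  end
  [c1, mx1, mx2]
end
```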

💡 the basic idea here is that this is a new key insight into "typical" trajectories sorted by glide length. apparently the lsb 1-runs are crucial to controlling this dynamic/ feature. again there are hints/ aspects/ elements of an overall inductive structure here. the story is that theres a kind of 3-region organization: the shorter trajectories have mostly small 1-runs; the middle trajectories have 1-runs in the "middle" of the binary iterate structure but not typically in the lsb (or rather, sorted by glide length, in the middle transition point range, roughly the 1st half have none, after which it spikes upward in the 2nd half); and the larger trajectories start with a 1-lsb run triangle followed by small 1-run iterates.

the important concept here is that this is all scale invariant and must apparently hold for arbitrary starting iterate bit widths, ie hinting or pointing at a natural or intrinsic inductive structure! also the midrange (or midrange right) seems to again be very similar to the construct49 experiment (distribution/ dynamic) just cited. but also note that the prior experiment was associated with global glide measurements instead of local ones as here, and the interrelationship between the two is still not well understood.

another pov to highlight about the midrange vs higher range: it appears to be showing that 1-lsb runs larger than a certain size cause significant mixing in the trajectory remainder, leading to no long 1-runs in it. this could be a basic element of a proof because, as long noticed, the 1-run absence "feature" seems to be nearly the same as causing the drain. a possible wrench in all these works: its not so clear in which regions "ufos" could unexpectedly appear, and one cant assume the distributions are typical; as always they may be (subtly?) biased toward easy-to-find solutions. or more specifically, the 1st half of the midrange clearly has ufo tendencies but do ufos show up in the other regions? (also note it is actually known from prior analysis/ artificial construction that ufos show up in the left region where glides are short.)

as noted there was a lot of prior analysis of lsb 1-runs and it seemed to run into a dead end, or at least went on hiatus. it seems that it was roughly the right direction but didnt/ couldnt see this "bigger picture" and then got stuck. my memory is that the general challenge was that there were arbitrarily large crunch regions without any lsb 1-runs and so the lsb 1-runs seemingly couldnt be used as a feature for many trajectories. but this new angle seems not to hang up on that; its just a basic property of the left region. actually, looking further at the left region, it does have lsb 0-runs associated with/ causing the downward spikes in trajectory lengths.

hybrid26b.rb

hybrid26b

(12/18) 💡 ❗ not at all in a prolific cycle wrt code ideas/ experimental riffs compared to some other epic months, but definitely the “gears are turning” in my head over recent findings. the long held dream is to let the algorithms do the work of finding/ exposing/ isolating the natural inductive structure of the problem—much easier said than done!—and yet it seems this all may be something of a windfall or new breakthru in the area!

  • re-looking at hybrid24 heres a cool idea! it naturally suggests a "reverse bitwise" approach. set all the bits of fixed widths uniformly to 1 and then adjust the msbs rightward (instead of the lsbs leftward as previously) and look for longest glides! this is much less processing than the hybrid code and it seems likely it could achieve nearly as performant results (in the sense of longest glides). (see the sketch after this list.)
  • another natural idea for comparison. some early code generated long glides using the bitwise approach. but it was long prior to looking at bit diagrams. were those ever examined (ie generated glides wrt binary diagrams)? do not recall right now that they were. a key question: do they have any resemblance to the long 1-lsb pattern found via the hybrid approach? if they do, its a huge oversight not to have noticed it by now. actually a lot of old results probably could use a 2nd look with the binary diagrams.
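a quick hedged sketch of the "reverse bitwise" idea from the 1st bullet, reusing the cg helper from the earlier sketch; the greedy msb sweep here is my guess at the simplest version, not tuned code.

```ruby
# "reverse bitwise" sketch: start from the uniform all-1s iterate of fixed
# width, sweep the msbs rightward, keep a bit flip only if it lengthens the
# glide. the sweep depth is an illustrative choice.
def reverse_bitwise(width, sweep = width / 2)
  n = (1 << width) - 1                       # all bits set to 1
  best = cg(n)
  (width - 2).downto(width - sweep) do |i|   # msb -> rightward, keep top bit
    m = n ^ (1 << i)                         # try flipping bit i
    c = cg(m)
    n, best = m, c if c > best
  end
  [n, best]
end

n, c = reverse_bitwise(200)
puts "cg=#{c} seed=#{n.to_s(2)}"
```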

(12/19) new ideas/ povs. the last inductive structure idea is really again about the undifferentiated crunch region and attempting to show it "converges" ie all iterates inside it converge. the other cases involving 1-lsb triangles basically again "reduce" to this case.

but then again it seems to come down to figuring out some kind of "remaining glide" estimator metric based on some kind of feature in the crunch region. have had some success with the idea of the monotonic lookahead estimator documented about a half year ago, and it seems natural in a sense (inspired by Terras pre/post determined transition ideas). however, while all experiments support the idea, its not at all clear how to prove its a monotonically declining estimator.

it seems after lots of crunching (so to speak) that the crunch region is featureless. however, there seem to be some slight edges that have been extracted in recent experiments. the crunch region is expectedly a bit paradoxical. any iterate chosen at random in the crunch region will also drain with very high probability. but a few "needles in the haystack" dont, and glide for apparently relatively "long" distances, and there seems to be no way of explaining their behavior, in the sense that the pre-max iterates look identical to/ indiscernible from post-max ones, and both are indiscernible from the drain. (the post-max region is often pictured as in the drain, but lots of prior experiments esp related to Terras constructions show how this is a simplification, because in a sense a "large part" of the post-max is programmable, the "earlier" the max occurs. and actually the Terras ideas show the whole concept of a typical pre-vs-post max behavior may be an anthropocentric bias or illusion.)

the increase-to-decline behavior seems a lot like a transition point phenomenon but there seem to be almost no metrics that capture this transition point from iterate features. again (have probably said this now in so many ways) there seem to be no features at all!

my idea was to just look at a long glide in the crunch region and maybe look at more global statistics out of desperation. there are many experiments that generate these, but its been a long time since focusing on them at this point. so just fired up some basic code in the bitwise algorithm that simply minimizes 'da' the density distance while increasing all the other glide metrics. dont recall doing (exactly) this before (although theres definitely a lot of similar code). this is sufficient to create long glides without any sizeable 1-runs. some other algorithms that look at 0/1-runs are definitely much more computationally intensive because they analyze the glides by "parsing" all the iterates (eg using regular expressions etc).

but after some analysis of the top glide, did actually find some discernible trends/ features, some "narrow/ thin" but yet consistent; these are all related to relatively recent ideas. not sure how exploitable they are, thats the immediate next question. was on a run adding a bunch of features, coded this up quickly, and the graph is rather busy. it will take a while to untangle/ analyze/ investigate all these (but definitely something is better than nothing!). there are many interrelationships without immediately obvious explanations.

the variables are calculated over left/ right halves 'l', 'r' and over even and odd iterates '0', '1'. the density/ entropy are computed over the 50 lsbs of iterates. in general some of the signals are biased above or below ½ and others are balanced across it. the split over even/ odd iterates is similar to prior experiments that found significantly different match ratio statistics over each. a sketch of these computations follows the list.

  • mr0/1 – match ratio between density parity and parity
  • dp0/1 – density parity
  • d – density
  • e – entropy
  • drl – density difference right minus left
  • erl – entropy difference right minus left
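a hedged sketch of these measurements; the precise definitions (esp "density parity") are my guesses at reconstructing them, over the 50 lsbs as stated:

```ruby
# guessed reconstructions of the listed metrics over the 50 lsbs of an
# iterate: density = fraction of 1 bits, entropy = binary entropy of the
# density, density parity = indicator that density exceeds 1/2, match ratio
# = how often density parity agrees with the iterate parity (odd = 1).
def lsb_str(n, w = 50)
  s = n.to_s(2)
  s.size > w ? s[-w, w] : s
end

def density(n)
  b = lsb_str(n)
  b.count('1').to_f / b.size
end

def entropy(n)                       # binary entropy of the density
  d = density(n)
  return 0.0 if d <= 0.0 || d >= 1.0
  -(d * Math.log2(d) + (1.0 - d) * Math.log2(1.0 - d))
end

def density_parity(iterates)         # fraction with density above 1/2
  iterates.count { |n| density(n) > 0.5 }.to_f / iterates.size
end

def match_ratio(iterates)            # density parity vs parity agreement
  iterates.count { |n| (density(n) > 0.5) == n.odd? }.to_f / iterates.size
end
```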

it would be a great exercise for the reader to now delineate all the basic trends from the graph. alas, while there are a number of hits to the site, the commenters have taken a very long vacation + hiatus these days. so, again doing everything around here… these are extracted by using the gnuplot legend click on/off lineplot feature (discovered this feature late! very useful/ valuable for cluttered graphs).

  • dl is above ½
  • dr is below ½
  • drl is therefore negative
  • el is below ½
  • er is above ½
  • erl is therefore positive
  • dp1l, dp1r above ½
  • dp0l, dp0r seem close to ½
  • mr1l, mr1r above ½
  • mr0l, mr0r seem close to ½
  • mr0l, mr1l, mr0r, mr1r anticorrelated in pairs (despite half the pair being close to ½!)
  • mr1l, dp1l, mr1r, dp1r equal in pairs! ❓ ❗
  • mr0l, dp0l anticorrelated
  • mr0l, dp1l anticorrelated
  • mr1l, dp0l correlated
  • dp0l, dp1l, dp0r, dp1r correlated in pairs (*)
  • similarly for mr0/1r, dp0/1r

bitwise56.rb

review139b.rb

review139b

(12/20) this code looks into the density parity dp1, dp3 and match ratios mr, m1, m3 related to the sequence and its associated 3-power sequence over the 5 longest trajectories. the plots are in pairs for pre vs post max (left vs right). looking at individual trajectories it seems like there is a signal in some metrics but they are not very consistent from one trajectory to the next, except the match ratio, which is consistently nonrandom above ½, and in this series it does seem to decline faster on the decline vs the climb. it takes a few hundred iterations to wash the initial noise out of the running averages. my memory is that there were other trajectories/ experiments where density parity was more of a predictor of series parity but here there seems to be no signal.

review140.rb

review140

(later) 😳 argh! reanalyzing review139b finds a very subtle glitch, can you spot it?! a case where a small syntax error didnt break processing at all but subtly altered the results, not in a dramatic way, hence evading immediate detection! seem to vaguely recall making something like this error before/ some near deja vu. heres the corrected code. amazingly, on review almost all the relationships hold except for the single (*) one.

review139c.rb

the above code for review139b has a lot to look at but basically suggests the signal lies in odd instead of even iterates. so, briefly stated: stripping those out of the trajectories, cutting out equal sizes of the left and right, and then looking at lsb cross sections (1-bit "sections") over climbs '1' and descents '2', and differences between right/ left '21', for density and entropy, leads to this signal. it starts with the 2nd lsb and graphs 10 trajectories. the spikes indicate a strong signal but mostly limited to only the lowest lsbs.

review141.rb

review141

(12/21) 😳 caveat: that last bitwise code outputs associated trajectory statistics that are not quite the same as the trajectory seed dynamics, because of the edge logic that finds the ranges of the graph line batches… was confused for a bit when doing some analyzing and the numbers/ graphs didnt line up! sometimes this is all a bit )( like accounting or balancing a checkbook lol!

💡 ❓ ❗ new idea on waking this morning, and after looking at some more noisy plots: this should have been conjectured a long time ago and there are many sentiments like it scattered about. maybe all the small edges detected, while actually real, are really red herrings! wrt the crunch region, maybe no base-n representation can reveal much about the long term dynamics of a trajectory. and moreover, trajectories of apparently arbitrary complexity are lurking in the crunch region. another related idea occurring to me intermittently is that maybe, despite all the appearances of a phase transition, the trajectory peak is not signified by any local features… ❓

this is all based on the "banging head against wall" principle, which is the opposite of the "absence of evidence is not evidence of absence" concept. if enough time has been spent looking for something, maybe it isnt there! it seems possible all the signal/ features found in base-n representations up until now may be something of a mirage, or an extraordinary (anthropological?) distraction wrt really solving the problem…? maybe a "closing idea" apropos for the closing of the year.

(later) 💡 just reviewed the cmnl, cmnw ideas spanning many months, more than ~1½ years. am thinking need to revisit them, and an overview/ survey at this point would be helpful too. there is something possibly very deep that wasnt really realized at the time; need to spell it out. the 'cmnl' limit was discovered quite awhile before looking into Terras ideas and the two havent really been meshed fully yet ('cmnl' looks like it was 1st noticed 4/2018, although 1st using the high compression scheme; Terras ideas were uncovered 10/2018). theres clearly an interrelationship. phrased a different way, there appears to be only a finite # of possible climbs after the predetermined range in the initial postdetermined range! this is a remarkable, maybe even gamechanging property if it holds. it also seems to relate to the monotone estimator results found/ analyzed… maybe more than coincidentally the max nonmonotone span was found to be around ~200, not far from max 'cmnl'.

so then it occurred to me to try to poke at this idea in some new ways. didnt have any ideas immediately after the hybrid experiments, but now just came up with some new povs.

eg, here is a very simple idea. it is easy to prove that the Terras construction for "all up" sequences of length n is 2^n – 1. one would expect these sequences to tend to create the longest climbs, and therefore tend to maximize 'cmnl'. this is some very simple code that should have been tried long ago. what are the dynamics of trajectories of the form 2^n – 1? this is just a "single giant 1-lsb triangle".

very quickly implemented, it looks like about half the trajectories dont generate further climbs, but the other half do. how far are the climbs? this measures 'cmnl' for the trajectories for n=1..2000 ("large"!) and finds (remarkably?) that it seems to be bounded under ~110. but it also seems from prior experiments that maybe larger 'cmnl' can be found within this range than for these iterates, ie theyre not exactly representative. as repeated, the largest 'cmnl' found was from a hybrid experiment that turned up ~300, but the starting bit width for that wasnt recorded. and all this leads me to the idea of going back to the recent hybrid algorithm with some related/ nearby variants. eg never did look at bit grids of 'cmnl' or 'cm' optimized iterates, another basic exercise… do they have some kind of structure like what glaringly showed up for 'cg' optimized? could it be similar?
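a minimal sketch of the measurement (construct95 below is the actual code); here 'cmnl' is assumed to be the climb length past the 1st postdetermined index, ie the trajectory max index minus the seed bit width, floored at 0.

```ruby
# sketch: 'cmnl' for single-triangle seeds 2^n - 1, assuming cmnl =
# (index of trajectory max) - (seed bit width), floored at 0. slow for
# large n since the full trajectory is run down to 1.
def cmnl(n0)
  n, i, cm, mx = n0, 0, 0, n0
  while n > 1
    n = n.odd? ? (3 * n + 1) / 2 : n / 2
    i += 1
    cm, mx = i, n if n > mx
  end
  [cm - n0.bit_length, 0].max
end

(1..2000).each { |w| puts "#{w}\t#{cmnl((1 << w) - 1)}" }
```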

construct95.rb

construct95

(12/22) looked for grid patterns in the trajectory database for the highest values of each of the generation strategies; didnt see patterns, ie null results. recall its built basically by the bitwise technique and two of the generation strategies are 'cm' and 'cg'. maybe tried looking at this a long time ago and/or didnt mention the null result; seems vaguely familiar, cant remember.

quickly adapted hybrid24 code to look for highest ‘cm’ instead and on analysis it seems to find “nearly the same trajectories” as the ‘cg’ optimization, ie hypothesis confirmed; in the sense that they have the same lsb 1-run structure and seemingly other statistics.

am running some hybrid experiments looking for large 'cmnl' at bit widths 750 and 1000 and they are not easy or consistent to find; didnt even find cmnl > 0! an idea of starting/ initializing with uniform 1 bits in the seed, either alone or combined with the other standard seed initializations, does not seem to help. looking through old blogs with the help of google, the cmnl=298 max was found in the hybrid9 experiment almost exactly a year ago in 12/2018. the experiment found it early on, which now seems like a stroke of luck. or was there something unusual about that code? many new runs of similar recent hybrid code do not seem to find anything near it. its notable from the graph the bit width 'nl' was surprisingly low, max ~75. it now seems highly rare and should have been recorded! its also notable that the algorithm does not search much higher than ~75 and its a strong ceiling as the algorithm moves sideways over many iterations. in retrospect/ 2020 hindsight all the high 'cmnl' values need to be checked for the lsb 1-run structure, although right now cant imagine any not having it. overall it seems like maybe more sophisticated and/ or specialized methods—if they even exist—are needed to find these rare outliers. a worthy challenge of the moment… ❓

(12/23) ❗ 💡 😮 😎 some really incredible/ surprising/ unexpected results! have been working on some variants of hybrid code optimizing mainly by 'cmnl', quick exploratory experiments. maybe not especially surprisingly, trajectories with longer/ higher 'cmnl' also generally tend to have the same lsb 1-run structure as the 'cm' and 'cg' (hybrid) optimized searches. on a hunch/ whim, decided to start analyzing the sub-trajectories that span from the start of the postdetermined region to the peak, ie the sequences of length 'cmnl' starting at the same index as the bit width and ending at the trajectory max. these seem to have somewhat extraordinary properties so it makes sense to name them: for now, call them the bridge part of the trajectory. the bridge is represented by just the sequence of up/ down movements, ie a parity sequence.
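a sketch of the bridge extraction under that definition; the index conventions here are my guess, the real logic is in the hybrid variants.

```ruby
# extract the "bridge" of a trajectory: the parity sequence from the 1st
# postdetermined index (= seed bit width) to the trajectory max.
def bridge(n0)
  seq, n = [n0], n0
  while n > 1
    n = n.odd? ? (3 * n + 1) / 2 : n / 2
    seq << n
  end
  cm = seq.index(seq.max)       # trajectory max index
  nl = n0.bit_length            # 1st postdetermined index
  return '' if cm <= nl         # no bridge if the max precedes it
  seq[nl...cm].map { |x| x.odd? ? '1' : '0' }.join
end

puts bridge((1 << 200) - 1)
```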

my 1st idea was that maybe they are not so unique, despite many having lengths in the dozens. a single hybrid run to find trajectories without the restriction on changing bit widths did indeed turn up many duplicates, and this was not hinted at all in earlier experiments (again evoking einsteins dictum "the theory determines what can be measured"). it seems around 10% of all bridges generated account for all of them, ie 90% are duplicates of the 10%! of course for short bridges there is a higher probability of matches, but many matching bridges are over a dozen iterates long. the starting bit width was 200 and the bridges are generated over a range centered at 200. then had another hunch: took the unique bridges from the 200 starting width run and merged them with unique ones from the 100 starting width run, and found many more matches across the two different starting bit widths! (fyi the hybrid algorithm tended to spread out over about 50 bins centered around the starting bit width sizes. also, bin sizes were set to 200.)

in other words, the bridges seem to have some extraordinary scale invariance properties such that starting bit widths of substantial difference generate the same bridges! this is a striking, possibly gamechanging finding! (and makes me think how this problem seems sometimes to be endlessly full of surprises!)

(caveat/ footnote/ fineprint: just realized on writing this that some of the duplicates, at least in one run, may be due to the algorithm generating duplicate starting seeds, because the bridge tracking logic was put in the count subroutine, which is "prior" to detecting duplicates.)

then looked at variants of construct95 which extract the bridge. the 1-lsb triangle construction in it is a quick way of generating bridges without a search. this revealed the same redundancy (aka low entropy) can be found here, in the sense of string repetitions. also on closer look, the trailing strings of the bridges sometimes match up to many iterates, even dozens! this reminds me of old experiments in grammar compression on trajectory sequences. post Terras discoveries, those make somewhat less sense (it is now known that absolutely arbitrary parity sequences of any length can be constructed), but the very useful compression idea reappears! "blast from the past"! note the grammar compression ideas are also a relatively natural/ powerful entropy measurement, although not phrased in those terms at the time.

it was also natural to extend the construct95 diagram much farther. looked at iterates up to 5k and the graph seems about the same, again attesting to the apparent strong scale invariance property(s) of bridges.

was on a run with new findings; need to write up some related code & post it, and am looking at a way of consolidating a lot of these ideas into one algorithm.

(12/24) was looking into some grammar compression ideas for the bridges and then got a little carried away, but it is due to the inherent complexity, with simple ideas running into complications. this code illustrates the idea using a greedy scheme based on longest matching substrings, working over the 2000 bridges generated from the lsb 1-run scheme. it looks for large common substrings and then removes them from the bridges, replacing them with an 'x' character and marking the replacement span with parenthesis characters. it tries to build the larger (longer) bridges out of substrings from the smaller ones first. when its all finished one can compare the initial vs final size. theres a minimum replacement length '20' specified here, otherwise the algorithm gets "carried away" with many smaller incremental replacements, and there is some other logic to avoid replacements that tend to fragment the string into much tinier pieces.
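a simplified sketch of the greedy scheme; the real logic in construct99 below has more guards (eg the fragmentation avoidance), this just shows the longest-common-substring replacement with the minimum length 20.

```ruby
# simplified greedy substring compression sketch; construct99 has additional
# fragmentation-avoidance logic. replaces common substrings of length >= 20
# found in smaller bridges with a marked '(x)' reference.
MINLEN = 20

def longest_common_substring(a, b)   # naive scan, fine for a sketch
  best = ''
  (0...a.size).each do |i|
    l = best.size + 1
    while i + l <= a.size && b.include?(a[i, l])
      best = a[i, l]
      l += 1
    end
  end
  best
end

def compress(bridges)
  sorted = bridges.sort_by(&:size)   # build larger bridges out of smaller ones
  sorted.each_with_index.map do |s, i|
    out = s.dup
    sorted[0...i].each do |t|
      sub = longest_common_substring(out, t)
      out = out.gsub(sub, '(x)') if sub.size >= MINLEN
    end
    out
  end
end

bridges = %w[...]                    # placeholder: the 2000 bridge parity strings
total = ->(a) { a.map(&:size).sum }
puts "#{total.(bridges)} -> #{total.(compress(bridges))}"
```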

the code reports 8461 → 6414 compression (total binary string character counts). some longer bridges are not decomposed. the overall idea here is to show significant redundancy in bridges, but dont really see a clearly exploitable structure here; it seems to compress into something like a fractal tree (actually a Directed Acyclic Graph).

construct99.rb

❓ a natural question. higher density parity sequences have a rough, not-random entropy. is the low entropy of the bridges due to the higher density? my feeling is that they are much more ordered than that, but its a simple control sample. quick thought, this is actually fairly easy: just generate random strings of the same length and density for each bit string. the compression is 8461 → 8317. so the lower entropy, beyond what is due merely to random non-½ (ie higher than ½) density patterns, is quite measurable.
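the control can be as simple as shuffling each bridge, which preserves the exact length and 1-density while destroying the order (a sketch, assuming the bridges are parity strings):

```ruby
# control sample sketch: a shuffled copy of each bridge has the same length
# and exactly the same 1-density but random order, isolating how much of the
# compressibility is explained by density alone.
def density_control(bridge)
  bridge.chars.shuffle.join
end

# eg: controls = bridges.map { |b| density_control(b) }, then recompress
```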

construct99b.rb

💡 idea: maybe there is some kind of (exploitable?) redundancy also in the near post-peak region (ie immediate post bridge).

(12/26) ❗ 😳 on closer look/ some further replication/ rework, the mad dash for results runs into some pitfalls. in 2020 hindsight clearly the bitwise code should output the final optimized array for analysis instead of the analysis routines using the graph file. am thinking there is little guarantee the iterates in the graph file have values close to the edge values, especially for a noisy metric like 'cg', but here it seems to accidentally mostly correlate. also review139 and review140 rely on the cm, cg edge values from the graph data and are therefore not accurate in exact locations for each glide, and think found a case where its subtly biasing results (away from null)! also in review140 the 3-power sequence is not recalculated starting at the glide descent, and definitely wonder if results would change if it were; its an important comparison.

as outlined this fixes the bitwise code and reevaluates using a scatterplot. it finds several subtle effects. 'dp0' vs 'dp1' are correlated over trajectories for the left vs right regions. not sure what else this relates to; seems like it should relate to something else. the spread also changes from left to right. left side is red, right side is green.

bitwise56b.rb

review139d.rb

review139d

further look: the 0/1 density parity values are correlated with each other within a side (dp0 vs dp1, on both left and right), but the left and right density parities are not correlated with each other. there also seems to be rather high variance in the density parity given the number of iterates on each side.

this experiment is a revision of review139c using the latest data, using lsb spans (over odd iterates only) instead of 1-bit cross sections. it finds significant density and entropy signal in left vs right. 'drl', 'erl' differences are on the right side scale. the signals generally decay for longer lsb spans, left to right, except for 'drl' which spikes in amplitude around 4 bits.

review145c.rb

review145c

(12/27) 💡 had an idea to construct artificial glides and then reverse engineer them, so to speak, with the Terras mapping. didnt find any striking statistical aspects but the constructions are interesting/ nontrivial so am writing this up. suppose one wants to construct an artificial random walk with fixed endpoints, eg to build trajectories with certain nearly fixed properties such as 'cm', 'cg'. in this case the "up" increments are slightly different than the "down" increments; call them u, d respectively. u, d are the logarithmic values of the collatz sequence multipliers, in this case 3/2, 1/2 (with d taken as a positive magnitude). suppose one wants a random walk of length 'w' reaching an offset of 'z' from the starting point on the "random" axis. how many up and down increments, randomly mixed, should be combined? (note that any mix of length 'w' in the same ratio leads to the same offset.) label the up, down increment counts x, y. this leads to 2 simultaneous equations: xu – yd = z, x + y = w. solving, x = (z + wd) / (u + d) and y = w – x.

this led to these two code variants. the 1st one modifies 'cm' while keeping 'cg' constant for w=500 width trajectories and offset z=50; the 2nd varies both 'cm' and 'cg' by tacking on a random downward walk of ½ density for the remaining trajectory. because of small random discrepancies around the endpoints the 'cm', 'cg' are not exact, but close. irrelevant aside: the 1st looks a lot like a ps4 controller outline… lol!
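a minimal sketch of the walk construction from the equations above (rwalk.rb / rwalk2.rb below are the real variants); u and d are taken as positive magnitudes matching the derivation.

```ruby
# fixed-endpoint random walk sketch: w steps, target offset z, up/down
# increments u = log(3/2), d = log(2). rounding x makes the endpoint close
# but not exact, as noted in the text.
def rwalk(w, z)
  u, d = Math.log(1.5), Math.log(2)
  x = ((z + w * d) / (u + d)).round   # number of up steps
  y = w - x                           # number of down steps
  steps = (Array.new(x, u) + Array.new(y, -d)).shuffle
  pos = 0.0
  steps.map { |s| pos += s }          # the walk path
end

p rwalk(500, 50).last                 # ~50
```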

rwalk.rb

rwalk2.rb

rwalk

rwalk2

(later) some more statistical analysis revealing subtle biases makes me think of another possible trap. there are so many subtle “edges” that can be isolated, but they seem to be deceptive as far as leading anywhere closer to a proof. in a sense, real mirages so to speak! it reminds me of the expression “death by paper cuts”… alas, sometimes even the darkest and most hopeless thoughts of trolls arise, there is also a vague suspicion bordering on paranoia of “going around in circles”… when you stare into the abyss, sometimes the abyss stares back! 😮 o_O

as already proposed, the collatz function seems to act a lot like a highly secure encryption system, in the sense that the future dynamics of the trajectory can be almost completely statistically concealed in a current iterate encoding…

❓ another idea: am wondering if the dispersion of the random walk is somehow not typical, maybe in the sense of hurst measurements… is it something like a random walk in reverse starting from 1, and does this reverse direction shed light on its dispersion properties vs the forward direction? am starting to think the dispersion of the random walk is proportional to the iterate bit width… guess this case is studied somewhere? there seems to be evidence for it in the ufo (lsb 1-runs) ideas (which seem to show a max expansion of size 'w' is possible where 'w' is current bit width) and elsewhere (eg am recalling a sibling analysis diagram, also slope correlation measurements, maybe also the monotone estimator etc)… maybe did some measurements on something like this a long time ago… how can this be expressed in some kind of property? seem to need something more sophisticated than looking at nonmonotone spans, which admittedly is probably one of the simplest ways of quantifying a random walk…

was looking at some old ufo grid diagrams and then came up with this idea. there is a decline right after the lsb 1-run; how does it compare to the later decline (post glide)? a simple way to study this is with the "single" lsb 1-run triangles as in construct95 and construct99, simply looking at parity density over the two ranges. my idea was that the intra-glide parity density might be lower, ie a faster decline, in a sense "recovering" (in the mean-returning sense) from the 1-run "spike", ie maybe some local effect. didnt find that, but theres a remarkable correlation between parity density and parity entropy among adjacent single-triangle trajectories as seen here. theres also a sawtooth pattern in the intra-glide parity density (red). this all points to some kind of similarity between both glides and descents for adjacent single-triangle trajectories.

construct100.rb

construct100

(12/28) ❗ 😮 further looking, a basic discovery: the single triangles are merging into final trajectories. the trajectories 2^n – 1 for n in [100..200] merge into 9 final trajectories. surprising, did not expect this! this explains a lot of recent prior findings on the single 1-triangles, eg the repetition in the parity sequences, density/ entropy similarities etc., and maybe if the merges happen "early" in the glides it could explain other aspects. getting the feeling, while they have long been noticed/ known, the merges should have been studied closer long ago… its also quite striking that the merges happen at identical sequence indexes starting the sequence at the 1st postdetermined index (bit width index)! it implies that higher trajectories have a slightly steeper initial slope (pre merge) than the lower ones. somehow this all seems to point to something bigger but still trying to wrap my head/ brain around it… it seems to point to some kind of "clustering" by higher trajectories, but could it be unique to these 1-triangles or a more general phenomenon?
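a hedged sketch of the merge check (construct100b below is the actual code); the alignment at the 1st postdetermined index follows the observation above, but the fixed offset used for comparison here is an illustrative choice of mine.

```ruby
# merge-check sketch: align trajectories of 2^n - 1 at the 1st postdetermined
# index (= bit width) and compare iterates at a fixed offset k past it; merged
# trajectories coincide from the merge point on.
def traj(n0)
  seq, n = [n0], n0
  seq << (n = n.odd? ? (3 * n + 1) / 2 : n / 2) while n > 1
  seq
end

k = 150   # offset into the postdetermined region (illustrative)
vals = (100..200).map do |w|
  traj((1 << w) - 1)[w + k]   # iterate at identical postdetermined-relative index
end
puts vals.compact.uniq.size   # compare against the 9 reported above
```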

construct100b.rb

construct100b

(12/30) 💡 am trying to pivot (yet again) somehow, musing, trying to think out of the box, where the box is “pretty” (ok, forget that, extremely) complicated. getting some ideas. sometimes it helps to come up with new analogies.

  • there is a notion of similarity of glides going back to many experiments, maybe not brought out exactly or highlighted but now again thinking about it. induction seems to be based fundamentally on similarity. the prior metrics devised can all be used as similarity measures, ie trajectories with similar features (measures) can be regarded as similar. but there is some trickiness here because many metrics arent scale invariant, ie many of the metrics are unbounded. it is possible to convert most unbounded metrics into bounded ones eg like glide indexes divided by initial bit widths. could/ does this play a pivotal role in an induction or proof structure?
  • came up with an idea for a basic new similarity measure not studied much so far but, on immediate consideration, seemingly very related to a lot of prior ideas. trajectory parity sequences clearly have a lot of similarity but how to quantify this? was thinking at 1st of vectors, then realized a natural measure is edit distance! think looking at edit distance differences of parity sequences is a really intriguing new direction. its not cheap to compute but its quadratic complexity, basically the same complexity as the common sequence code used so far. and theres another connection: the common sequence code is used in the grammar compression ideas, and edit distance has a lot of natural connections to grammar compression. (a sketch follows this list.)
  • immediately thought of the entropy traversal ideas. how could they relate to or utilize edit distance? an idea would be to traverse trajectory space based on “nearest” edit distances of parity sequences. it would also be interesting to explore modifying the traversal start seeds also based on edit distances (eg 1 unit) to get new frontier points. some of the traversal algorithms are similar to minimum spanning tree logic, and think its natural to apply it.
  • this led me to think more of the entropy traversal ideas and where they left off/ apparent limitations/ problems. it seems both the frontier and resolved points sets tend to increase in size/ complexity. and the whole idea of induction is to find patterns. how does this apply to entropy traversal? there is some contrast with similarity here, because entropy traversal doesnt clearly/ necessarily align with it. it seems one wants to find an ordered aka nonchaotic, algorithmic, “decidable” aka “enumerable” pattern in both frontier and resolved sets, and their combination.
  • this led me to the idea of set representations. one could represent a set (eg the resolved & frontier sets) eg in terms of sets of ranges (low, high pairs, and single values). then that representation has an inherent complexity in terms of # of ranges in the set. and then it seems like one wants an algorithm that regularly decreases the ranges down to unit complexity, ie a single range or constant # of ranges, the opposite of expanding. this is somewhat related to earlier traversal ideas that look at the lowest point in the set & try to make sure it increases. but then this idea seems too limiting, because maybe there are induction patterns that dont require the set complexity to reset periodically? (trying to think of an example of that…)
  • this leads to the idea of “(ir)regularity” of a set. how “algorithmic” is a set of data? the natural measure is kolmogorov complexity, which ofc is impossible to measure. what are other rough measures of kolmogorov complexity? ofc entropy comes to mind. maybe measuring the minimum total edit distance between a set of points is a nice natural measure of the points entropy… but again trying to minimize total entropy of a set of points while related does not seem to be the same as finding decidable/enumerable algorithms…
  • the idea of building an order out of an infinite expanse of objects reminds me of tiling, dont recall mentioning this analogy before, but maybe did a long time ago. in this case it seems to be a multidimensional tiling where feature space is the dimensions. here induction seems to be like trying to create a regular tiling out of irregular tiles (trajectories, or trajectory features). also is there a concept of a regular tiling with tiles of unbounded size? maybe this idea requires working with the scaled features.
  • an idea that is poking at me for many years, and just dont know exactly how to express or pursue it, but it goes something like this. the collatz problem is almost like an "infinite structure". then there are incremental algorithmic modifications one can make to this "infinite structure". some of them make the overall structure less chaotic, decrease its entropy, increase its order, and nudge/ make it "closer" to decidable/enumerable algorithmic structures. as mentioned multiple times, proving that the collatz infinite structure is decidable/enumerable is equivalent to a proof. somehow an idea of steering the traversal toward algorithmic order comes to mind… it seems all sometimes like tip of iceberg, like some kind of new/ vast algorithmic algebra + calculus theory + framework, but even after years only have the barest pieces/ sketches of how the bigger picture could fit together… maybe mentioned this before, the steps or transformations of the algebra seem to be decidable algorithmic transformations… in fact the theory of algebra itself can apparently be seen as a kind of subset of this new theory!
  • “finding an algorithm to fit fractal/ chaotic data” or “utilizing algorithms to find algorithms” seems rather abstract/ exotic/ difficult except there is a whole vibrant/ advancing/ emerging/ cutting edge field dedicated to it, ie machine learning!
  • was musing about what a “successfully orderable” traversal pattern might look like. searching for some kind of example/ case study/ model, suppose there was such a thing as a “semi-disorderly, semichaotic” traversal that covered all integers, more noisy locally but globally ordered. what would it look like? did write code like this many years ago on the collatz page, just got a new idea, think right now its worth cooking up…
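re the edit distance bullet above, a minimal sketch (standard levenshtein dynamic programming over parity strings; quadratic as noted):

```ruby
# levenshtein edit distance between two parity sequences, standard O(len^2)
# dynamic programming with a rolling row.
def edit_distance(a, b)
  prev = (0..b.size).to_a
  a.each_char.with_index do |ca, i|
    cur = [i + 1]
    b.each_char.with_index do |cb, j|
      cur << [prev[j + 1] + 1,                    # delete ca
              cur[j] + 1,                         # insert cb
              prev[j] + (ca == cb ? 0 : 1)].min   # substitute
    end
    prev = cur
  end
  prev.last
end

def parity_seq(n0)   # full trajectory parity string, compressed map
  s, n = '', n0
  while n > 1
    s << (n.odd? ? '1' : '0')
    n = n.odd? ? (3 * n + 1) / 2 : n / 2
  end
  s
end

p edit_distance(parity_seq(27), parity_seq(31))
```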