hi all, have gotten a sizeable spike in hits over the last week on collatz-related blog posts! it seems to be traceable to an old reddit page on collatz from jun 2015 by `level1807`, talking about using a then-new feature in Mathematica to graph the collatz conjecture. profiled that finding/ graph myself on this blog around that time. not sure how people are finding that page again, but googling around, it looks like the graph is now immortalized in a mathematical coloring book, announced in a recent March 28th numberphile video that is just a few thousand short of ~200K hits at the time of writing (maybe gaining a few ten thousand more within days!), and profiled the same day by popular mechanics blogger weiner under the title of the Sea Monster. so, essentially viral, with the bar set a little lower for mathematics! and as for the “elephant in the room,” much to my amusement/ chagrin the video never once uses the word **fractal** (bipolar moods again attesting to a longterm love-hate relationship, not to mention the other (mal?)lingering facet of mania-depression!).

and this coincides very nicely with the following announcement. have been making some big hints lately and think there is finally a Big Picture/ Big Idea emerging from the most recent experiments. (yeah, no hesitation in the open Big Reveal on a *mere blog* after years of a similar routine…)

what is looking very plausible at this point is a formula in the form of a **matrix difference equation/ matrix recurrence relation**. the devil is of course in the details, but here's a rough sketch. prior experiments have some “indicator metrics” based mainly on binary density of iterates, plus other “surface-like” aspects such as 0/1 run lengths etc… and it's now shown that these are strong enough to predict future iterate sizes (10 iterations ahead for now) with some significant degree of accuracy.
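to make those “surface” metrics concrete, here's a minimal ruby sketch (the method names are my own for illustration, not from the actual experiment code) of binary density and 0/1 run lengths along a collatz trajectory:

```ruby
# hypothetical sketch of the "surface" indicator metrics: binary density
# (fraction of 1-bits) and 0/1 run lengths of each iterate's binary string.

def collatz_step(n)
  n.even? ? n / 2 : 3 * n + 1
end

def density(n)
  s = n.to_s(2)
  s.count("1").to_f / s.length
end

def run_lengths(n)
  # consecutive runs of identical bits, eg 0b11010 -> [2, 1, 1, 1]
  n.to_s(2).chars.chunk_while { |a, b| a == b }.map(&:length)
end

n = 27
10.times do
  printf("%8d  density=%.3f  runs=%s\n", n, density(n), run_lengths(n).inspect)
  n = collatz_step(n)
end
```

any vector of such per-iterate statistics could serve as the “indicator variables” discussed below.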

but what are the variables? the code already has a ~7 variable vector as “indicator variables” and the latest experiments simply add the iterate width ratio to the mix. then the next step is to optimize/ decrease the errors of the approximation and analyze the generalization/ convergence of the matrix difference equation. am on the lookout for a great comprehensive ref on the topic now! admittedly prior experiments do not *directly* show that the indicator variables (not including the bit size ratio) form their own recurrence relation but that seems like almost a mere formality at this point!

note that just because one has a matrix difference equation, it's not certain one can then exhaustively analyze its convergence dynamics. but the setup looks quite analytical to begin with, and it's certainly likely a *very strong hammer,* so does it work/ apply enough force against the nail? at this point it seems it would have to be a very strange nail not to be “hammerable,” ie for the convergence dynamics not to be obtainable from the recurrence relation.
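in the purely linear toy case, at least, convergence of a matrix difference equation v_{k+1} = A v_k is fully governed by the spectral radius of A. a quick ruby sketch (the matrix entries are made up for illustration):

```ruby
require 'matrix'

# toy case: the linear matrix difference equation v_{k+1} = A v_k converges
# to zero from any start vector iff the spectral radius of A (largest
# eigenvalue magnitude) is below 1. matrix entries are made up.

A = Matrix[[0.5, 0.2], [0.1, 0.6]]
rho = A.eigensystem.eigenvalues.map(&:abs).max
puts "spectral radius = #{rho.round(4)}"  # below 1, so the iteration contracts

v = Vector[10.0, -5.0]
50.times { v = A * v }
puts "norm after 50 steps = #{v.norm}"  # shrinks toward zero
```

the fitted MDE for collatz would of course be messier (and only approximate), but this is the kind of convergence analysis the hammer metaphor is pointing at.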

it will take quite a few months, surely, to fill in all the blanks/ technical details, work out all wrinkles, etc, but barring unforeseen circumstances, all indication/ forecast at this point is quite optimistic for some kind of… *proof*.

so, nothing further extremely tangible to report at this moment (eg code/ results/ data), but alas, did want to write something immediately and don't know how much time can be dedicated to it over the next few weeks/ months. and further alas, no other volunteers have materialized after several years of pseudo-advertising/ promotion on this blog. hey, the silver lining (though not one sought) is that there won't be any ambiguity over the credit/ attribution, right? too bad too, because in my estimate, with the basic map now laid out, this is a nearly optimal moment for parallel attack/ effort/ collaboration! (if the chance of a lifetime gets posted on a blog and nobody reads it, does it make any noise?) anyway, do plan to keep hammering & update status!

the numberphile video featuring Alex Bellos has him declaring nobody has any idea how to prove the problem. *riiight, wink, wink* 😀

bellos/ harriss' books are Snowflake Seashell Star: Colouring Adventures in Numberland, Patterns of the Universe: A Coloring Adventure in Math and Beauty, and Visions of the Universe: A Coloring Journey Through Math’s Great Mysteries. not sure yet which one(s) contain the collatz problem. and it's notable that collatz “gets in on some of the action” of the adult coloring book craze, although the craze seems to be winding down at this point, even affecting barnes & noble stock![1][2][3] (hope my last favorite “carefully arranged/ curated atom packets” store doesn't get crushed in the continuing onslaught of the digital age!)

**(4/7)** this is the code to look at the “MDE/MRR” (matrix difference eqn/ recurrence relation as described above) of the 9 variables in addition to bit size ratio ‘x’. the code attempts to predict each variable 10 iterations forward based on current variable measurements. this is a graph of correlation coefficients over 20 runs (new distributions generated each time), with 3/20 leading to linearly dependent fits. ‘x’ had the best performance, around ~0.7-0.8. the other variable correlations were less strong, in the range ~0.2-0.6, but all seem generally “discernibly not noise”. there is also some correlation visible in their heights/ scale, which seems to indicate some kind of unevenness from one distribution to the next; would like to isolate that, although it'd probably/ presumably be a timeconsuming exercise. did guess the correlations would be better, and these are weaker than expected, far from the slam dunk fantasized in manic moments, but these results seem not null/ maybe usable. one must always resist asking for too much esp wrt this problem! 😮 😳

this is a slight modification that turns the # of trajectory iterations (10) into a parameter and iterates over that parameter starting/ incrementing from 2 instead, to show declining correlation for higher iteration counts. however, the common decline to ~0.2 makes one question the baseline for noise/ random correlation. (coincidentally, again 3/20 linearly dependent fits.)
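one cheap way to sanity-check that noise baseline question is to compute the same correlation statistic on shuffled data, which destroys any real association. a sketch with synthetic data (not the experiment's actual variables):

```ruby
# estimate the noise baseline: pearson correlation of genuinely related data
# vs the same statistic after shuffling one series (killing any association).
# the data here is synthetic, not the experiment's actual variables.

def pearson(xs, ys)
  n   = xs.length.to_f
  mx  = xs.sum / n
  my  = ys.sum / n
  cov = xs.zip(ys).sum { |x, y| (x - mx) * (y - my) }
  sx  = Math.sqrt(xs.sum { |x| (x - mx)**2 })
  sy  = Math.sqrt(ys.sum { |y| (y - my)**2 })
  cov / (sx * sy)
end

srand(1)
xs = Array.new(500) { rand }
ys = xs.map { |x| x + 0.3 * (rand - 0.5) }  # correlated plus noise

puts "real     r = #{pearson(xs, ys).round(3)}"
puts "shuffled r = #{pearson(xs, ys.shuffle).round(3)}"  # hovers near 0
```

if the ~0.2 figures sit well above the shuffled baseline for the same sample size, they are probably signal rather than noise.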

**(4/11)** 💡 ⭐ 😮 😀 this code looks at regressions over a single “iteration” but samples data in a much different way, using a simple distribution varying smoothly over starting seed densities. the iteration may be “compressed” or “uncompressed” (“f2/ f1”). the results are a bit phenomenal! for the uncompressed version (2nd set), there are very high correlation coefficients, but they are high for the compressed version also. in contrast, computing the seed bit size ratios “x1/ x2” works better with the compressed variables. the code tries to predict size ratio with both pre-/ post-iteration variables, and the former consistently performs better. output is the correlation coefficients.

```
{"x1"=>0.9210449533987287}
{"x1"=>0.28153255117100767}
{"x2"=>0.9018232245828868}
{"x2"=>0.3182126151521053}
{"d_2"=>0.9283209014432241}
{"dh_2"=>0.8880647578464561}
{"dl_2"=>0.9201096138753048}
{"a0_2"=>0.99558548448025}
{"sd0_2"=>0.9912187501515848}
{"mx0_2"=>0.9974273525998754}
{"a1_2"=>0.7239828500208254}
{"sd1_2"=>0.8983785421966467}
{"mx1_2"=>0.9401646849166891}

{"x1"=>0.6691446282196233}
{"x1"=>0.5584292801704467}
{"x2"=>0.6691446282196291}
{"x2"=>0.558429280170434}
{"d_2"=>0.9761843656693724}
{"dh_2"=>0.9564714521765179}
{"dl_2"=>0.9632145769176049}
{"a0_2"=>0.9971425206877131}
{"sd0_2"=>0.998238239549548}
{"mx0_2"=>0.9963291869147001}
{"a1_2"=>0.9927486054676593}
{"sd1_2"=>0.9983842986687205}
{"mx1_2"=>0.9964725313949367}
```
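for reference, here's my reading of the “f2/ f1” compressed/ uncompressed maps (an assumption; the actual code's convention isn't shown here):

```ruby
# my guess at the two maps: f1 is the plain ("uncompressed") collatz step,
# f2 ("compressed") folds the guaranteed halving into the odd branch.

def f1(n)
  n.even? ? n / 2 : 3 * n + 1
end

def f2(n)
  n.even? ? n / 2 : (3 * n + 1) / 2
end

# bit-size ratio of successive iterates, ie the 'x' metric discussed above
def size_ratio(n, f)
  send(f, n).to_s(2).length.to_f / n.to_s(2).length
end

puts size_ratio(27, :f1)  # 1.4  (27 -> 82, 5 bits -> 7 bits)
puts size_ratio(27, :f2)  # 1.2  (27 -> 41, 5 bits -> 6 bits)
```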

⭐ ⭐ ⭐

**(4/13)** 💡 ❗ ⭐ 😮 🙄 😀 had some wild ideas late last night, got jazzed/ wired up right before bedtime as occasionally happens, all after cogitating/ stewing/ mulling/ rolling over the big question, *what would it take to get this to work?* (one has a lot of time to come up with synonyms for *thinking* while working on this problem…) am busy with other stuff at the moment & it will take a while to get to implementing them, but here's a possible path toward something “rigorous,” and wanted to write out these seemingly very promising musings/ direction. the big issue with the matrix difference eqn (MDE) is that it can only be an approximation, and it is subject to “sensitive dependence on initial conditions,” aka the “butterfly effect”: it will match the actual metrics of a trajectory for several iterations but then diverge, due to the lack of a perfect 100% correlation/ fit at each iteration, and longer trajectories will diverge farther.

how to rescue that? here's an amazing idea. suppose one has an MDE (derived from curve fits with imperfect accuracy) that converges under certain properties. a key question is how much its convergence behavior is unchanged by perturbing it somewhat, and it seems like some MDEs may still converge even after being perturbed. then, the big question: how much perturbation is possible while still ensuring convergence? in particular, what if the function is *perturbed at each iteration* and can still be shown to converge? then for “some limited degree of perturbation” (the gap being exactly what is to be nailed down precisely), one can match the perturbations with *the actual trajectory* and prove that the perturbed function converges also!

in other words, the MDE may have a convergence property that is retained even in the face of “noisy” perturbations at each step. maybe the simplest/ most obvious term for that is *stability*… which is also a big deal in the general information technology field, with a remotely similar meaning in a much different context (some hint of personal background showing there). it's a big conjecture, but don't see a big reason *a priori* why it should be invalid.
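the stability conjecture can at least be illustrated in the linear case: a contracting MDE hit with bounded noise at every step doesn't blow up, it settles into a bounded ball. a toy ruby sketch (constants are illustrative):

```ruby
require 'matrix'

# linear illustration of the stability idea: a contracting MDE
# (spectral radius < 1) perturbed by bounded noise at every step does not
# diverge -- iterates settle into a bounded ball. constants are illustrative.

A   = Matrix[[0.5, 0.2], [0.1, 0.6]]  # spectral radius 0.7
EPS = 0.05                            # per-step perturbation bound

srand(7)
v = Vector[10.0, -5.0]
200.times do
  noise = Vector[EPS * (2 * rand - 1), EPS * (2 * rand - 1)]
  v = A * v + noise
end
puts "norm after 200 perturbed steps = #{v.norm.round(4)}"
# roughly bounded by EPS / (1 - 0.7) once the initial condition decays
```

matching the per-step “noise” with the gap between the fitted model and the actual trajectory is exactly the move sketched above.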

**(4/18)** 💡 ⭐ ❗ another thought experiment. consider the big/ original paradigm of strange attractors, the lorenz equation. now consider drawing a 2d “hoop” around part of its eventual trajectory, “capturing” it. now consider perturbing its current 3d location repeatedly at different increments as it advances. if “small” these perturbations will not alter it from repeatedly circling/ cycling through the “hoop”. in other words, the trajectory is “stable” through the “hoop”.
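the thought experiment is easy to try numerically: a crude euler integration of the lorenz system with a small random kick at every step, checking that the trajectory stays in the attractor's bounded region (the step size and kick size here are my own choices):

```ruby
# crude euler integration of the lorenz system (sigma=10, rho=28, beta=8/3)
# with a small random kick at each step; despite the perturbations the
# trajectory stays in the attractor's bounded region. DT/EPS are my choices.

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0
DT, EPS = 0.005, 0.01

def lorenz_step(x, y, z)
  dx = SIGMA * (y - x)
  dy = x * (RHO - z) - y
  dz = x * y - BETA * z
  [x + DT * dx, y + DT * dy, z + DT * dz]
end

srand(3)
x, y, z = 1.0, 1.0, 1.0
max_r = 0.0
10_000.times do
  x, y, z = lorenz_step(x, y, z)
  x += EPS * (2 * rand - 1)  # the per-step perturbation
  y += EPS * (2 * rand - 1)
  z += EPS * (2 * rand - 1)
  r = Math.sqrt(x * x + y * y + (z - RHO)**2)
  max_r = r if r > max_r
end
puts "max distance from attractor center = #{max_r.round(2)}"
```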

**(4/21)** this is some fairly complex code to get right, though not a lot of lines: it computes the linear model over all the iterations of a random trajectory, and calculates the difference between the predicted metrics and the actual metrics of the trajectory. results seem encouraging. the errors do increase consistently at the ends of trajectories. it's back to the highly tuned trajectory distribution generation designed to range over all iterate bit width ratios. also noticed a distinct phenomenon: ‘x’ the iterate width ratio is apparently *more* predictable over wider spans, eg (multiple) “compressed” iterations where there are loops over odd/ even iterates (it would be a worthwhile idea to find the optimum), whereas the other metrics are more predictable for the minimal iteration count, 1. also, this is tuned so that only the iterates with the next iterate odd are included in the regression fit, which gives the best performance; other selections/ boundaries can decrease correlation fit substantially. the increased error maybe means that more stability is required near the end of the trajectory for the previously sketched MDE convergence/ tracking idea to work.
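a drastically simplified 1-variable analogue of this experiment (a sketch, not the actual code): fit a linear model predicting the next binary density from the current density along one trajectory, roll the model forward, and compare against the actual series:

```ruby
require 'matrix'

# drastically simplified 1-variable analogue: least-squares fit of
# d_{t+1} = a*d_t + b where d is binary density, over one compressed
# trajectory, then roll the model forward and track the error vs actuals.
# the seed and setup are arbitrary, not the experiment's tuned distribution.

def f2(n)
  n.even? ? n / 2 : (3 * n + 1) / 2
end

def density(n)
  s = n.to_s(2)
  s.count("1").to_f / s.length
end

# density series of one trajectory
n = 2**50 + 12345
series = []
while n > 1
  series << density(n)
  n = f2(n)
end

# least-squares fit via the normal equations
xs = series[0..-2]
ys = series[1..-1]
x_mat = Matrix.columns([xs, Array.new(xs.size, 1.0)])
coef = (x_mat.transpose * x_mat).inverse * x_mat.transpose * Vector[*ys]
a, b = coef[0], coef[1]

# roll the model forward from the first value and measure tracking error
pred = series.first
errs = series.map do |actual|
  e = (pred - actual).abs
  pred = a * pred + b
  e
end
puts "steps=#{series.length} a=#{a.round(3)} b=#{b.round(3)}"
puts "mean error, first 10 steps: #{(errs.first(10).sum / 10).round(4)}"
puts "mean error, last 10 steps:  #{(errs.last(10).sum / 10).round(4)}"
```

the real experiment fits the full multi-variable vector, but the error-accumulation comparison is the same shape.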

**Anonymous** The video doesn’t use the word fractal because (in contrast to you) the author knows what he is talking about. The Collatz fractal is a well-known mathematical object, see the wikipedia article to learn what it is. The Collatz iteration graph is not even a tree if the conjecture is false. But if it’s true, it’s not necessary for the tree to have statistical similarity of all its subtrees.