💡 ❗ ⭐ this code has a lot of moving parts and took quite a while but is also a culmination of many previous ideas. it has the same distribution generation mechanism as last run (slightly modified/ adjusted), but its a 1st cut on evaluating the significance/ accuracy of the MDE/RR model as a predictor of trajectory length. the better the accuracy of its modelling, the more “plausible” (and “analytic”!) the model is, and the more viable for exploiting in further derivations.
hi all, have gotten a sizeable spike in hits over the last week on collatz related blog posts! it seems to be traceable to an old reddit page on collatz from jun 2015 by level1807 talking about using a new feature in Mathematica to graph the collatz conjecture. profiled that finding/ graph myself here in this blog around that time. not sure how people are finding that page again, but googling around, it looks like this graph is now immortalized in a mathematical coloring book which was announced in a recent March 28th numberphile video getting just a few thousand short of ~200K hits at the time of writing this (maybe a few ten thousand in a few days!), and profiled the same day by a popular mechanics blogger weiner under the title of the Sea Monster. so, essentially viral, but putting the bar a little lower for mathematics! and as for the “elephant in the room” much to my amusement/ chagrin the video never once uses the word fractal (bipolar moods again attesting to a longterm love-hate relationship, not to mention the other (mal?)lingering facet of mania-depression!).
and this coincides very nicely with my announcement of the following. have been making some big hints lately and think finally have a Big Picture/ Big Idea from the most recent experiments. (yeah, no hesitation in the open Big Reveal on a mere blog after years of a similar routine…)
what is looking very plausible at this point is a formula in the form of a matrix difference equation/ matrix recurrence relation. the devil is of course in the details, but heres a rough sketch. prior experiments have some “indicator metrics” that are based mainly on binary density of iterates, and other “surface-like” aspects such as 0/1 run lengths etc… and its now shown that these are strong enough to predict future iterate sizes (10 for now) with some significant degree of accuracy.
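as a rough illustration of the kind of “indicator metrics” mentioned above, here is a minimal python sketch. the function names, the exact metric set (density plus longest 0/1 runs), and the use of the plain 3n+1 / n/2 map are all assumptions made for this example, not the actual code behind the experiments.

```python
def binary_density(n: int) -> float:
    """Fraction of 1 bits in the binary expansion of n."""
    bits = bin(n)[2:]
    return bits.count("1") / len(bits)


def max_run(n: int, bit: str) -> int:
    """Length of the longest run of `bit` ('0' or '1') in the binary expansion of n."""
    other = "1" if bit == "0" else "0"
    # Splitting on the other symbol leaves only runs of `bit` (possibly empty).
    return max(len(run) for run in bin(n)[2:].split(other))


def collatz_step(n: int) -> int:
    """One step of the plain Collatz map (assumed here; a compressed map would also work)."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def indicator_rows(seed: int, steps: int = 10):
    """Metrics for the seed and its next `steps` iterates.

    Each row is (iterate, bit width, density, longest 0 run, longest 1 run).
    """
    rows = []
    n = seed
    for _ in range(steps + 1):
        rows.append((n, n.bit_length(), binary_density(n),
                     max_run(n, "0"), max_run(n, "1")))
        if n == 1:
            break
        n = collatz_step(n)
    return rows
```

calling `indicator_rows(27)` gives one row per iterate, so the metrics of the seed can be lined up against the bit widths of the iterates that follow it, which is the quantity a predictor of this sort would be judged on.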
this tightens the screws some )( on the prior findings and shows they generalize nicely. prior analysis seems very solid but there is always the shadow of question of bias in the generation algorithms which start from small seed sizes and build larger ones out of smaller ones. an entirely different strategy for generating sample seeds is used here based on a genetic algorithm. the idea is to start with a fixed bit width and alter the bits in it just like “genes”. fitness is based on glide length (or equivalently ‘h2’ for a constant seed width). it starts with 20 random parents of given length. there is 1 mutation operator and 2 crossover operators. 1 crossover operator selects adjacent/ contiguous bits from parents at a random cutoff/ crossover point (left to right) and the other just selects bits randomly from parents not wrt adjacent/ contiguous position.
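the genetic algorithm described above can be sketched roughly as follows. this is a minimal python sketch under stated assumptions, not the actual code: the glide is taken as the number of steps until the trajectory first drops below the seed, the top bit is pinned to 1 so the width stays fixed, and the replacement rule (a child replaces the worst member of the pool if it is fitter) is a guess. all function names are hypothetical.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def glide(n: int) -> int:
    """Steps until the trajectory first drops below the seed n (requires n > 1)."""
    start, steps = n, 0
    while n >= start:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def to_int(bits):
    """Bits are stored most significant first."""
    return int("".join(map(str, bits)), 2)


def random_seed(width, rng):
    # Top bit pinned to 1 so every seed has exactly `width` bits (assumption).
    return [1] + [rng.randint(0, 1) for _ in range(width - 1)]


def mutate(bits, rng):
    """Mutation operator: flip one random bit below the top bit."""
    child = bits[:]
    child[rng.randrange(1, len(child))] ^= 1
    return child


def crossover_cut(a, b, rng):
    """Crossover 1: contiguous bits from a up to a random cut point, then from b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def crossover_uniform(a, b, rng):
    """Crossover 2: each bit position picked at random from either parent."""
    return [rng.choice(pair) for pair in zip(a, b)]


def fitness(bits):
    return glide(to_int(bits))


def evolve(width=50, parents=20, iterations=2000, rng_seed=0):
    rng = random.Random(rng_seed)
    pool = [random_seed(width, rng) for _ in range(parents)]
    for _ in range(iterations):
        op = rng.randrange(3)
        if op == 0:
            child = mutate(rng.choice(pool), rng)
        elif op == 1:
            child = crossover_cut(rng.choice(pool), rng.choice(pool), rng)
        else:
            child = crossover_uniform(rng.choice(pool), rng.choice(pool), rng)
        worst = min(range(len(pool)), key=lambda i: fitness(pool[i]))
        # Assumed replacement rule: a new, fitter child displaces the worst member.
        if child not in pool and fitness(child) > fitness(pool[worst]):
            pool[worst] = child
    return sorted(pool, key=fitness, reverse=True)
```

since the width is constant within a run, ranking by glide length and ranking by glide length divided by width give the same order, which is presumably the sense in which glide and ‘h2’ are interchangeable as fitness here.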
fit24 is again used for the linear regression fit. these runs are for bit widths of 50, 80, and 100, with ~200 points generated for each. 50K iterations.
the declining # of points for higher widths is circumstantial evidence that as widths increase long glides (relative to width, ie ‘h2’) are (1) increasingly hard to find and/ or (2) rare. these two aspects interrelate but are not nec exactly equivalent. hardness of finding seeds with certain qualities ie computational expense does not nec mean theyre rare. an example might be the RSA algorithm. finding large primes is somewhat computationally expensive (although primality testing is still in P) but theyre not so rare. technically/ theoretically rareness and computational expense are related through probabilistic search aka “randomized algorithms”.
hi all. some extended wallowing in self appraisal/ reflection to begin with in this installment. last months installment of collatz had a big highlight, at least wrt this blogs history. as has been stated in various places, & trying not to state the obvious here, part of the idea of this blog was to try to build up an audience… aka communication which (to nearly state the almost-canard) is well known to be a “two way street!” there are all kinds of audiences on the spectrum of passive to active, and in cyberspace those in the former camp are also long known semi-(un?)-affectionately as lurkers.
must admit do have some “blog envy” of some other bloggers and how active their audiences are wrt commenting. one that comes to mind is scott aaronson. wow! thought something like a fraction of that level would be achievable for this blog but now in its 5th year, and candidly/ honestly, it just aint really happening. have lots of very good rationalizations/ justifications/ excuses for that too. ofc it would help to have some breakthrough to post on the blog and drive traffic here through a viral media frenzy… as the beautiful women sometimes say, dream on… ah, so that just aint really happening either. 😐
however, there was a highlight from last month, for this blog something like a breakthrough, but also, as you might realize from the subtext on reading further, with some major leeway on where the bar is set (cyber lambada anyone?). got an anonymous, openminded, even almost/ verging )( on encouraging comment from someone who wrote perceptively and clearly had a pretty good rough idea of what was going on in that significantly complicated collatz analysis blog post, as if having read a substantial part of it and comprehended it, and getting to some of the crux/ gist of ideas/ approach here. nice! 😎
(alas, full “open kimono”/ self-esteem challenging disclosure… admittedly that is a very rare event on this blog, and despite immediate encouragement and my marginal/ long gradually increasing desperation now verging on
anonymous has so far not returned. this overall predicament is something of a nagging failure gap/ regret/ ongoing challenge wrt the original idealism/ enthusiasm/ conception of this blog. which reminds me, also, long ago there was an incisive/ discouraging/ naysaying/ cutting/ near-hostile/ unforgettable comment, and may get to “highlighting” that one too eventually as part of the overall yin/ yang balance etc after changing circumstances and/ or building up enough courage wrt my cyber-ego, long keeping in mind that other quirky aphorism, success is the best revenge…) 😈
anyway here is the comment again, suitably highlighted/ framed/ treasured forever at the top of this blog:
What is a “glide” and how is it related to the trajectory length? Have you defined it somewhere earlier? What are your input variables for the model? What’s the reason to believe that even if you have a good predictor for your “glide” it helps to prove the conjecture?
this is a hazy idea thats been pulling at me for several years, finally came up with a way to solidify it and then decided to try it out. results are unfortunately lackluster so far but it was a relief to finally see how it works (otherwise would just keep wondering if its the “one that got away”!). and anyway think the overall conceptual idea is at least somewhat sound and still has promise maybe with some other revision/ angle.
the prior runs showed that theres a roughly predictable linear relationship between decidable metrics for each iterate and the glide length (“horizontal ratio”). so, one of my rules of thumb discovered the hard way over the years, once its proven at least a simple machine learning algorithm works, then one can look at more sophisticated ones that should at least outperform linear regression. (and conversely, if there is not even a weak linear relationship present, it seems unlikely that machine learning will “pull a rabbit out of a hat”. this is surely not an absolute phenomenon, but it seems to hold in practice, and think it would be very interesting to isolate exceptions, suspect they are rare.)
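the “check for a weak linear relationship first” rule of thumb can be sketched as a baseline test like the one below. this is only an illustration under assumptions: a single feature (binary density) is regressed against glide length divided by width on random odd seeds, and the names `fit_line` and `baseline` are made up for this example. the actual fits in these posts use more metrics than this.

```python
import random


def glide(n: int) -> int:
    """Steps until the trajectory first drops below the seed n (requires n > 1)."""
    start, steps = n, 0
    while n >= start:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def density(n: int) -> float:
    """Fraction of 1 bits in the binary expansion of n."""
    return bin(n).count("1") / n.bit_length()


def fit_line(xs, ys):
    """Ordinary least squares for y ~ slope*x + intercept; also returns r squared."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0:
        raise ValueError("feature is constant, no line can be fit")
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy) if syy else 0.0
    return slope, intercept, r2


def baseline(width=30, samples=500, rng_seed=0):
    """Fit glide/width against density for random odd seeds of a fixed width."""
    rng = random.Random(rng_seed)
    top = 1 << (width - 1)
    seeds = [rng.getrandbits(width - 1) | top | 1 for _ in range(samples)]
    xs = [density(s) for s in seeds]
    ys = [glide(s) / width for s in seeds]
    return fit_line(xs, ys)
```

if the r squared returned by `baseline()` is near zero for every available feature, that is the situation where, per the rule of thumb, a fancier learner seems unlikely to rescue things. purely random seeds mostly have short glides, so a selected sample (eg from a search for long glides) would be a more telling test than this one.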
the idea explored here on specific collatz data is a general one for machine learning. suppose that one does not have very many coordinates in ones data. each coordinate is something like a “feature”. and one would like to increase the number of features using only the supplied ones. this is similar to what a neural network does but typically each neuron has many connections. one wonders, can it be done with few connections? the smallest case is 2 connections. is it possible that only 2-way analysis of input data, built on recursively, could have predictive capability? the answer here for this data is mostly no but maybe theres more to the story with further tweaking. also it might work for some other type of data.
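one way to make the “only 2 connections, built on recursively” idea concrete is sketched below. this is an assumption about the general shape of the scheme, not the actual code: each new feature is the least-squares fit of the target on one pair of existing features, the best few fits become the inputs of the next layer, and so on. the names `solve3`, `fit_pair` and `grow_layers` are hypothetical.

```python
from itertools import combinations


def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None  # singular, eg two identical features
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_pair(u, v, y):
    """Least-squares fit y ~ a*u + b*v + c; returns (predictions, squared error) or None."""
    cols = [u, v, [1.0] * len(y)]
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    coef = solve3(gram, rhs)
    if coef is None:
        return None
    pred = [coef[0] * p + coef[1] * q + coef[2] for p, q in zip(u, v)]
    sse = sum((p - t) ** 2 for p, t in zip(pred, y))
    return pred, sse


def grow_layers(features, y, layers=2, keep=4):
    """features: list of columns. Each layer fits every pair of current columns
    and keeps the `keep` best outputs as the columns of the next layer."""
    current, history = features, []
    for _ in range(layers):
        fits = []
        for i, j in combinations(range(len(current)), 2):
            res = fit_pair(current[i], current[j], y)
            if res is not None:
                fits.append(res)
        if not fits:
            break
        fits.sort(key=lambda f: f[1])
        best = fits[:keep]
        history.append(best[0][1])  # best squared error in this layer
        current = [pred for pred, _ in best]
    return current, history
```

note the error here is measured on the same data used for fitting, so a falling `history` does not by itself show predictive capability. a held-out set would be needed for that, and a mostly negative result like the one reported above would show up as little or no improvement on held-out data beyond what a plain linear regression over all features already gets.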