💡 ❗ this exercise is partly inspired by looking at a mathematica stackexchange post on collatz and the 1st answer by mgamer which looks at seeds in the form p^n where p is prime and n ranges over a few hundred.
that graph did not normalize by the seed starting (binary) “width”/ size. started looking at something similar, was interested in replicating it and normalizing by width, then decided to look at m^n where m is either composite or prime. since there are fewer primes than composites per random range of integers, this code looks at odd primes and adjacent (also odd) composites. the graph is over the 1st 50 primes/ composites, seed width on x axis, glide length on y axis (div2 compressed). the graphs for prime vs composite m (1st, 2nd below) are distinctly different! amplitudes are larger on the 1st and it has a shorter lower “base” of the higher widths. an immediate hypothesis (from prior experiments) is that maybe this difference is explained by starting seed binary density? but thats also hard to imagine at this point.
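the basic setup can be sketched as follows (a minimal reconstruction, not the original script; `glide` and `prime?` are assumed helper names; the glide here is the standard div2-compressed step count until the iterate falls below the seed):

```ruby
# compressed-collatz glide: steps until the iterate falls below the
# starting seed (assumes n >= 2; seed 1 never falls below itself)
def glide(n)
  x, c = n, 0
  while x >= n
    x = x.odd? ? (3 * x + 1) / 2 : x / 2
    c += 1
  end
  c
end

# trial-division primality, enough for small bases
def prime?(m)
  m >= 2 && (2..Integer.sqrt(m)).none? { |d| m % d == 0 }
end

odd_primes     = (3..).lazy.select { |m| m.odd? && prime?(m) }.first(5)   # 3,5,7,11,13
odd_composites = (3..).lazy.select { |m| m.odd? && !prime?(m) }.first(5)  # 9,15,21,25,27

odd_primes.zip(odd_composites).each do |p, c|
  (1..4).each do |k|
    [p, c].each do |b|
      s = b**k
      # x axis in the graphs: seed width (bit length); y axis: glide length
      puts "#{b}^#{k} width=#{s.bit_length} glide=#{glide(s)}"
    end
  end
end
```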
have long wondered about prime/ composite effects on the collatz dynamics, have poked at this before, but never found anything notable (previously!). need to find better way(s) to visualize this. this form is somewhat rough, “as discovered”, & surely can be sharpened up somehow. for the low/ leftward starting values (the “hotter/ higher” colors, superimposed/ “painted” last), there is major width variation that makes the graph harder to interpret. a better approach is probably to have all seeds the same width. these graph(s) are also hiding a striking short-interval periodicity in the spikes that could be better highlighted somehow also.
😳 oops, too much wishful thinking? this next experiment seems to maybe melt the effect away. there does seem to be an effect, but its more about differences in adjacent integers (sequential prime-composite pairs) than primes vs composites. apparently the “exponentiated” seeds lead to glides that tend to come in alternating bunches of high or low amplitudes, but maybe not tied to primes or composites. if the latter were the case, this graph below would not alternate (red is a prime, followed by the nearest green composite). it alternates in the runs, but inconsistently. maybe more evidence for the effect being tied to initial seed density? this code swaps the inner and outer loops so that sequences of the exponentiated seeds go sequentially, and filters out two patterns in the prior runs, one related to the starting seed and one related to the parity of the power. another way to visualize this is the difference in the runs, which also shows/ highlights the sequential/ alternating bunching (2nd graph). (ARGV=50)
(8/14) back to the drawing board! this is a simpler exercise that starts at odd numbers at 2^10 + 3 and doesnt have any prime/ composite distinction. wow! never noticed this before! the notches at the bottom are also more apparent, clearly associated with what might be called the “2nd” parity of the numbers, ie the parity of (n − 1) / 2, and the characteristic bunching is evident again. (postscript) the notches correspond exactly to the # of consecutive 1 bits aka “runs” in the lsbs of the seeds, which exponentiation preserves.
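the lsb 1-run count can be sketched like this (a minimal version; `lsb_one_run` is a hypothetical helper name). the connection to the notches: a seed whose low bits are r consecutive 1s takes exactly r straight (3n+1)/2 steps before its first halving:

```ruby
# count the run of consecutive 1 bits at the lsb end of n
def lsb_one_run(n)
  c = 0
  while n & 1 == 1
    c += 1
    n >>= 1
  end
  c
end

# a seed with lsb 1-run length r takes exactly r consecutive odd
# (3n+1)/2 steps before its first halving step
n = 0b10111                        # lsb 1-run length 3
r = lsb_one_run(n)
r.times do
  raise "expected odd" unless n.odd?
  n = (3 * n + 1) / 2
end
raise "expected even" unless n.even?
```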
(8/17) here is some more analysis that shows some of the simple analytical power of sorting. this shuffles the datapoints first to randomize the order and avoid “artifacts” due to a stable sorting algorithm, then sorts them by different criteria, and then one can graph the result. the first sort is by column 4, the starting seed lsb 1-run length. similarly to some prior results, there is a weak correlation between it (green line) and the glide distance (red spikes), and also, as just found, a strong correlation with the minimum glide distance within a run “bin”.
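the shuffle-then-sort trick is tiny but worth spelling out (hypothetical rows; column 1 here stands in for whatever column is being sorted on):

```ruby
# shuffle first so ties land in random order, then sort by a chosen column;
# rows and column choice are hypothetical stand-ins for the post's data
rows = [[10, 3, 5], [12, 1, 2], [10, 3, 9], [11, 2, 7]]
sorted = rows.shuffle.sort_by { |r| r[1] }
# the sort keys come out ordered; order among equal keys is randomized
p sorted.map { |r| r[1] }   # => [1, 2, 3, 3]
```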
the next graph is a sort by column 0, the iterate binary “width” (ie log2(x), red line), and looks at iterate binary density (green), finding a rough correlation or pattern. the repeated exponentiation can be seen as a sort of “mixing” that (as guessed earlier) affects density “somehow”; specifically, it is now seen to decrease the variance of the density toward the “center/ core” ~½ (much like the collatz fn itself!).
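the binary density mentioned here is just the fraction of 1 bits over the bit width, eg (`density` is a hypothetical helper name):

```ruby
# binary density: fraction of 1 bits over the bit width
def density(n)
  n.to_s(2).count("1").fdiv(n.bit_length)
end

puts density(0b1111)   # 1.0
puts density(0b1000)   # 0.25
```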
(8/18) “looks can be deceiving” and the effect is noisy/ spiky & isnt super-dominating, so an obvious idea is to analyze that 2nd last graph for average glide lengths within each “bin”. results are somewhat close but as expected: steadily increasing. this output is one line per bin with average, standard deviation, and points in the sample (notice the last column is mostly geometrically decreasing, in exact multiples of 50). but note the average seems to be of a “long tail” distribution with large spikes. am immediately thinking of how to do this similarly in some way with equal bin counts instead.
there is also something apparently quite striking currently somewhat hidden in these results: even though the seed “size” ranges nearly uniformly randomly over 0-1000 bits (looking at a plot of column 1 in the prior output), the 1-msb-prefix-width (“1-prefix”) seems to have a relatively huge effect on determining variation in glide length. in other words the estimated glide length seems to be roughly some function of seed width and 1-prefix width (with the latter weighted “substantially”).
[3.803, 4.151, 1250]
[5.7, 6.14, 650]
[8.647, 8.123, 300]
[10.36, 6.92, 150]
[13.03, 9.166, 100]
[15.42, 9.55, 50]
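the per-bin statistics above (average, standard deviation, sample count per bin) can be reproduced with a sketch like this (`bin_stats` and the `pairs` layout of [bin, glide] are hypothetical):

```ruby
# per-bin mean / standard deviation / count, one output row per bin
def bin_stats(pairs)                     # pairs: [[bin, value], ...]
  pairs.group_by(&:first).sort.map do |bin, rows|
    vals = rows.map(&:last)
    mean = vals.sum.to_f / vals.size
    sd = Math.sqrt(vals.sum { |v| (v - mean)**2 } / vals.size)
    [mean.round(3), sd.round(3), vals.size]
  end
end

p bin_stats([[1, 2.0], [1, 4.0], [2, 6.0]])   # => [[3.0, 1.0, 2], [6.0, 0.0, 1]]
```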
this next idea normalizes the seed (binary) width to some large constant, here 1000+1. then there are two basic strategies for generating random seeds with linearly increasing lsb 1-runs over a loop. the 1st fills the 1-run length to the same as the loop counter (which ranges over the entire 1000 width) and then randomly chooses each remaining bit as 0 or 1 with equal ½ probability. this means its density will gradually increase from ½ and end at 1. the 2nd approach pins the density to ½ by filling the 1-run to ½ the loop counter, and randomly chooses positions for the remaining 1 bits to maintain overall ½ density.
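the two strategies can be sketched as (function names hypothetical; bit r above the run is forced to 0 so the 1-run is exactly the requested length, and the msb is forced to 1 so the width is exact; the 2nd assumes r ≤ w/2 − 1):

```ruby
# strategy 1: fixed width w, lsb 1-run of exactly r, remaining bits fair coins
# (density drifts upward from ~1/2 toward 1 as r grows)
def seed_free_density(w, r)
  n = (1 << r) - 1                       # the lsb 1-run
  n |= 1 << (w - 1)                      # force msb so width is exactly w
  ((r + 1)...(w - 1)).each { |i| n |= 1 << i if rand < 0.5 }   # bit r stays 0
  n
end

# strategy 2: same, but pin overall density to 1/2 by choosing exactly enough
# random positions for the remaining 1 bits (assumes r <= w/2 - 1)
def seed_half_density(w, r)
  n = ((1 << r) - 1) | (1 << (w - 1))
  extra = w / 2 - (r + 1)                # 1 bits still needed for density 1/2
  ((r + 1)...(w - 1)).to_a.sample(extra).each { |i| n |= 1 << i }
  n
end
```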
the following is a plot of the glide lengths for both strategies (red, green respectively, left y axis scale) and superimposed, the seed density for the 2nd approach (blue, right y axis scale). this is also notable in that earlier experiments found a weak correlation between trajectory length and outside-of-density-core seeds whereas here its somewhat the opposite: the ~1 density seeds have about ~½ the glide distance.
this next idea revisits hard seed generation based on max glide spread and investigates how much the lsb 1-runs “fill” these trajectories. it accepts two parameters, the max glide spread and the number of samples. this run is 50 spread and 1000 samples. the output is the index of the trajectory maximum versus the integrated sum of lsb 1-run lengths within that initial window. it answers the question of whether there are long glides without significant lsb 1-run totals in the negative, at least for these hard seeds. on average there are at least ~500/90 = ~5 1-run lengths at every iteration. some part of this can be attributed to the highly unique/ specialized distribution of seeds, and some may be due to a general property; further investigation will be required to differentiate between the two possibilities.
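the “integrated sum of lsb 1-run lengths within the initial window” can be sketched as (minimal version, hypothetical names, using the div2-compressed map):

```ruby
# count the run of consecutive 1 bits at the lsb end of n
def lsb_one_run(n)
  c = 0
  while n & 1 == 1
    c += 1
    n >>= 1
  end
  c
end

# sum of lsb 1-run lengths over the first k compressed-collatz iterates
def integrated_runs(seed, k)
  x, total = seed, 0
  k.times do
    total += lsb_one_run(x)
    x = x.odd? ? (3 * x + 1) / 2 : x / 2
  end
  total
end

p integrated_runs(7, 3)   # => 6  (runs 3, 2, 1 over iterates 7, 11, 17)
```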
maybe dashed that one off too fast! immediately thought about tracking a few more mostly obvious metrics (two already calculated but not output). namely, index of max 1-run, iterate size at max 1-run, max 1-run (length) (all limited to prior to trajectory max), iterate size at trajectory max, 1-run length at trajectory max, glide length, trajectory length.
this next graph shows a rough linear correlation between the index of the max 1-run and its iterate size. this could be explained by all the glides increasing prior to the trajectory max in roughly the same way, even though the max 1-run seems to be nearly uniformly randomly distributed in it (ie, the rising glide part only). (called with the same 50 spread 1K samples args.)
❗ this looks promising & confirms some earlier trend seen in the single reported sample. the later (in index) the max 1-run occurs, the lower the iterate value. this seems to suggest the 1-runs are roughly analogous to a concept earlier laid out: potential energy!
❗ next, a correlation between width of iterate at max 1-run and width at trajectory peak!
next an anticorrelation between max 1-run length and trajectory peak width.
(8/20) 💡 😳 must admit am a bit embarrassed to think of this very basic idea at this late date but as they say “better late than never” and in my defense, have never seen anyone do this before. it makes a lot of sense to treat the collatz fn as a sequence of alternating steps: repeatedly iterating f(n) = (3n + 1) / 2 over odd results, then repeatedly iterating f(n) = n / 2 over even results. so heres some basic code to do that and count “odd/ even” runs each time. the results are graphed below in the dual plot. the green line is “odd-runs” and the red line is “even-runs”. at the start and during nearly the entire rising glide for the hard seed (using the same glide spread 50 approach), strikingly, the even-runs are all uniformly 1 and the odd-runs spike; after the trajectory peak, the two tend to balance out more. the blue line is trajectory log2 iterate value.
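the odd/ even run counting can be sketched as (a minimal version, not the posted code; `odd_even_runs` is a hypothetical name):

```ruby
# split a compressed-collatz trajectory into alternating runs of steps taken
# from odd iterates ((3n+1)/2) and from even iterates (n/2)
def odd_even_runs(n)
  runs = []
  until n == 1
    parity = n.odd?
    c = 0
    while n != 1 && n.odd? == parity
      n = n.odd? ? (3 * n + 1) / 2 : n / 2
      c += 1
    end
    runs << [parity ? :odd : :even, c]
  end
  runs
end

# trajectory of 7: 7,11,17,26,13,20,10,5,8,4,2,1
p odd_even_runs(7)   # => [[:odd, 3], [:even, 1], [:odd, 1], [:even, 2], [:odd, 1], [:even, 3]]
```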
(8/22) this code looks at integrating/ averaging the lsb 1-runs and 0-runs over many samples to try to understand the characteristic shape/ profile and whether there is any predictive capability vs. trajectory peak location, glide length, or total trajectory length. this was done in the prior post, but only as integration, without the average normalizing by total samples per point. the max glide spread generation algorithm is replaced with an iterations-to-trajectory-max generation algorithm here and runs are averaged over 1000 samples.
in the 1st graph the iterations-to-max count is limited to [50..55] and samples accidentally outside that range are discarded. the red line is the 1-runs and the green line is the 0-runs (note this coloring is swapped from the last graph). in the 2nd graph the iterations-to-max count is [75..80] (line 116 in the code), also 1000 samples. the blue line is the average trajectory. the sample counts drop off sigmoidally near the end (not graphed), leading to the spike in noise of the averages. the graphs were made using the same horizontal axes for comparison. there is a very gradual upward trend in the 0-runs but the 1-runs are mostly flat before the trajectory peak. the 1-runs spike upward right before the peak and the 0-runs less so, downward.
(8/29) met a new cohort eviljs in the stackexchange computer science chat room, from poland. interesting guy! he has some natural curiosity about the collatz conjecture and we chatted briefly about it. he asked for a summary, and the reply was something like the following brief musings that have been meaning to post.
why is this problem so hard? a simple understandable answer is that every trajectory is self-similar yet unique, like a fractal or snowflake. there are strong parallels to the mandelbrot set calculation (although few seem to have remarked on this; somewhat discouragingly, as some know and as has been proven, “computation of the mandelbrot set is undecidable,” although am not sure of the ref!). any conceivable proof must be inductive in some way, but it appears that induction cannot gain any “traction” on the endless differences in the trajectory dynamics; nothing can effectively “capture” general trends that lead to exploitable inductive properties.
although there is some qualification to that observation. there are techniques that resolve infinite sets of trajectories at a time, eg FSM related ones etc, discovered that one literally decades ago. however these techniques also do not seem to be inductive. the sets that are resolved seem to have no exploitable/ inductive relation between each other. finding one is part of the core challenge.
another idea to try that has been flitting in my mind for awhile (and am not actually sure if its been posted here! have to review!): every trajectory needs to “fall below” its starting seed, ie it has a glide. but note that some iterates are part of multiple different glides. how could this be visualized? my idea: for each iterate that appears in multiple glides, look at the minimum of its multiple “fall below” (seed) values. it is a bit hard to explain simply. the code to find these minimal fall-below values is surely not very complicated. exercise to reader, can you come up with it?
another idea have been meaning to try: looking at the compression of a parity sequence vs the compression of the starting seed, and attempting to balance them some way in a traversal method.
➡ another misc “meta” note: there seems to be a new wordpress control that allows changing blog style. and it lets one pick fonts! 😮 ❤ 😎 ⭐
holy cow, have been wanting to change this blog font to at least a nice serif font since day 1! wasnt it impossible? thought the capability to change fonts only came with the paid subscription? did the management relent and allow different fonts for the free version? (but of course their restriction there is totally understandable considering how much different fonts really cost in cyberspace!) 😛 but now its an “embarrassment of riches” and having a lot of trouble settling on a cool new serif font! need to look at em all side by side! so hard to choose, its like the kid in the candy store problem! “libre baskerville” seemed way cool!
(8/31) 💡 oh yeah! 2nd thought, and answer to the exercise: a simple algorithm that traverses the trajectories and marks iterates (as a hashmap) with the seed for each iterate in the trajectory satisfies the question/ challenge, and it doesnt have to look at minimal integers of a set, because going in low to high order of the seeds guarantees that no iterate already visited will have a seed higher than the current seed being traversed. simple, right? and then one can plot the hashmap as a 2d plot where x is the iterate visited and y is its starting seed. after 2000 seeds, it looks like this:
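the marking algorithm can be sketched as (a minimal version, not the posted code; seed 1 is skipped since its glide never falls below itself):

```ruby
# mark every iterate in every glide with the lowest seed whose glide visits
# it; traversing seeds low -> high means "first write wins" is the minimum
owner = {}
(2..2000).each do |seed|
  x = seed
  while x >= seed                        # the glide: until falling below seed
    owner[x] ||= seed
    x = x.odd? ? (3 * x + 1) / 2 : x / 2
  end
end

# plot as 2d points: x = iterate visited, y = owner[x] (its lowest seed)
p [owner[2], owner[5], owner[11]]   # => [2, 3, 7]
```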
(9/2) more ideas on reverse tree traversal metrics. this first simple one just chooses the least frontier point. have tried stuff quite similar but strangely, never tried this basic riff before! (whipped this out a little too fast and the duplicate output at line 22 was unintended.)
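the least-frontier-point traversal can be sketched as (minimal version, not the posted code; this uses the uncompressed reverse tree where the children of x are 2x plus (x−1)/3 when that is an odd integer):

```ruby
require 'set'

# predecessors of x in the reverse collatz tree: always 2x, plus (x-1)/3
# when that is an odd integer (ie x ≡ 4 mod 6), excluding the loop at x=4
def preds(x)
  out = [2 * x]
  out << (x - 1) / 3 if x % 6 == 4 && x != 4
  out
end

seen = Set[1]
frontier = [2]
50.times do
  x = frontier.min                       # the "least frontier point" metric
  frontier.delete(x)
  seen << x
  preds(x).each { |p| frontier << p unless seen.include?(p) || frontier.include?(p) }
end

p seen.size   # => 51 (exactly one new node per step)
```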
this code does a reverse tree traversal with a new metric. the metric is to look at average distance between seen points, and attempt to match this as closely as possible with the new point chosen from the frontier. the graph is average distance of seen points versus new point. the self-similar nearly-inductive structure is quite apparent.
(9/3) wrote this code 7/30 but forgot to post it (was intending to tinker with it more to get better results). its a refactor of the compress approach to add trajectory “path codes” (ie parity sequences) and allow a sort of generic parameterized toolbox/ experimental compression metric. the current metric is just the compression of the new trajectory to add. other metrics (commented, lines 56-58) can consider it wrt the current “glob” (processed trajectories) but they seemed to lead to mostly noisy traversals. the graph is the reverse iterate and shows (some) desirable self-similarity. (again maybe whipped this out a little too fast; the sort at line 60 is a holdover/ typo from sorting the non-single-item list as in the comments and has no effect.)
(9/4) this code looks at grammar compression pattern(s) in the different groups of reverse tree iterate sets by depth and shows some self-similarity/ inductive possibility. in addition to the string labelling, the longest matching strings are numbered sequentially (line 46) and plotted, with some space separation between each set depth (line 55). one standout trend: at 1st it seems to find the long matching strings later in the ordered sets (the upward sloping points), trending into more random clouds. (the code omits a few remaining 0/1 bits not integrated into the grammar compression, at the conditional at line 53.)
as noted earlier, there is probably some simple proof that this grammar compression approach is not creating DAGs but only creating a dictionary of binary strings with no further “words” in the binary strings. another grammar compression in earlier research did create DAGs and this one could be modified to do so fairly simply—the decompression simply has to be run on the dictionary words in addition to the original string to compress. have been meaning to look at grammar compression DAGs with graphviz for a long time & just have never gotten around to it.
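for illustration, here is a generic Re-Pair-style grammar compression sketch (not the code from the posts): it repeatedly replaces the most frequent adjacent pair with a fresh nonterminal, and because new symbols can appear inside later pairs, the rule set naturally forms the nested/ DAG structure described above:

```ruby
# Re-Pair-style grammar compression over a 0/1 sequence; 0 and 1 are the
# terminals, nonterminals are numbered from 2 upward
def grammar_compress(seq)
  rules = {}
  sym = 2
  loop do
    counts = Hash.new(0)
    seq.each_cons(2) { |a, b| counts[[a, b]] += 1 }
    pair, n = counts.max_by { |_, c| c }
    break if pair.nil? || n < 2          # stop when no pair repeats
    rules[sym] = pair
    out, i = [], 0
    while i < seq.size                   # greedy left-to-right replacement
      if i + 1 < seq.size && seq[i] == pair[0] && seq[i + 1] == pair[1]
        out << sym
        i += 2
      else
        out << seq[i]
        i += 1
      end
    end
    seq = out
    sym += 1
  end
  [seq, rules]
end

# decompression: expand a symbol back to its terminal bits through the rules
def expand_sym(sym, rules)
  return [sym] unless rules.key?(sym)
  rules[sym].flat_map { |s| expand_sym(s, rules) }
end

seq, rules = grammar_compress([0, 1, 0, 1, 0, 1])
p seq   # => [3, 2]
```

running the decompression over the dictionary words (not just the compressed string) is exactly the modification mentioned above that exposes the DAG.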
(9/11) ❓ ❗ not really sure what to make of this right now but it seems unusual/ unexpected & worthy of reporting. was interested in looking at how much the reverse tree branches and traversing wrt it, and maybe constraining or maximizing the size of the frontier set versus the size of the seen set. the reverse tree can be traversed such that there is at most a single dual-juncture branch where all other nodes have a single branch, and also in order of the depth (line 39). this code is also set to break if the newly added point collides with any prior frontier points, but it never seems to happen (line 32). the green line is the max branch junctures possible. from looking at internal data, when it is 2 there is only a sole node with that value. also since this tries to maximize the size of the frontier set, it seems like maybe this code shows the (“adjacent”) frontier set can never get bigger than about 63.1% of the seen set (column 4 divided by column 3).