summer collatz highlights, reddit/ stackexchange

hi all. around here, the philosophy is something like, when in doubt, write a new collatz post. recently ran across a few notable collatz items on reddit and stackexchange and this post is to share those along with other misc new research idea(s).

  • cohort/ once collaborator randomra came up with his own codegolf collatz generalization challenge and racked up an impressive 13v.[a1]
  • a really significant/ beautiful visualization showed up on math reddit by level1807 using a new command Anglepath in mathematica. it basically creates paths offset by a sequence of angles. the visualization also reveals the recurring fractal aspects of collatz, making it look like a feather, plant, or blood vessels! the post got 214v and 39 comments! level1807 put his code on github. the graph is so striking, cannot avoid incl it here! (this version is by halirutan in next item.)[b1]
  • this led me to mention/ cite this reddit post in the stackexchange mathematica site chat room, a site/ room have not visited much. then noticed several other collatz related posts on the site and this crowd may be a notch above codegolfers in their analysis sophistication skills. met/ had a nice chat with high rep user halirutan (57K) over collatz and other misc stuff… he is a big fan of the problem and advocates that all CS/ math students should be familiar with its basics… so as the exprs go, “its a small world/ music to my ears/ great minds think alike”! 🙄 and coincidentally/ cybersynchronicity, someone asked about collatz recently within visiting there and halirutan was so gracious to link my blog page on collatz, and his topvoted answer got 47v![d1] 😎 😀
  • now is as good a time as any to point out that mathoverflow also has quite a few collatz related posts and even a collatz tag [c5] although/ “caveat” (as is typical on se) many of the posts do not have that associated tag.[c]

overall, spking from long experience, stackexchange is one of the leading “open science” and expert review/ feedback sites on the entire internet although (and maybe despite) not being specifically designed for that purpose. on the other hand, its also not nec a “ideal/ great fit” for ambitious ongoing (collaborative) research projects. the sites/ high rep users can be very picky about acceptable questions, their guidelines/ preferences are not always obvious, and if a question isnt a good fit it can quickly get negative-voted and “closed” (aka “on hold”) or even deleted. also, questions are typically/ preferably “bite sized” and have known answers which doesnt nec mesh with open problems/ active research areas.

⭐ ⭐ ⭐

next, recently noticed something notable in a collatz analysis. have long been looking for properties that correlate somehow with glides & then ran into this. was looking again at “hard seeds” & code that attempts to find seeds that maximize the spread of the max glide. then, just looked at the trajectory sequences in binary. noticed that within the glide, the suffixes often are (descending length) strings of 1s and then analyzed it further. yes, it seems to be some notable tendency/ property.

this code has the seed max glide spread target as the 1st parameter. these runs are with 50. the 1st graph is the log2 value of trajectory iterates. the max coincides with midpoint of the glide at about 2x the glide spread and total glide length comes out as almost exactly 4x the max spread. the 2nd graph is the ending 1-run lengths and shows spikes roughly lining up with the overall glide. hmmm!




next, an integration over 500 separate runs to smooth out the noise. (note the x axis has a different width.)




(7/16) this is some analysis very similar to ideas in the last collatz post and finds some notable trends. those prior runs did not look at metrics for the obvious “div2 compressed” function. results here are remarkable. the 1st plot is a run that generates seeds maximizing “glide spread” and then computes glide distance of the seed (red) vs its partner (green), and graph also shows the (absolute) difference (blue). the difference seems to be very level/ nearly bounded but a very gradual increasing trend appears to be present. the 2nd version (called with single param) generates seeds via maximizing glide distance and again finds a levelling effect on the glide distance differences. these both seem to hint strongly at a possible “inductive structure”.





(7/17) a natural next step is to just generate hard seeds using this new criteria of glide stopping distance difference. one has to be careful though, if it is bounded by a constant then a search for larger constants will not terminate/ hang. this is 10 separate runs of generating increasingly hard seeds using that criteria, and the code terminates if over 5K list items are reached in the search. the prior plots show that larger glide stopping distance differences exist (up to ~500 at least).

an interesting pattern/ trend in this project is to generate “hard” seeds using increasingly difficult hardness criteria. this last criteria may be so “hard” as to be “impossible”. and then a proof is actually equivalent to showing that the seeds with the given hard criteria not only “cant be found” (via searches) but “dont actually even exist”…. (so then how does one prove a search always fails? ofc, just as hard as the original problem?) basically the difficulty of finding harder seeds goes up nonlinearly in a huge spike at the end, it seems even more dramatic than exponential growth, almost more like a boundary between feasible and impossible. ie 3rd graph which is the generation array size (all final points in each line can be visualized as >=5K).

ofc/ on other hand the persistent/ unavoidable caveat with this type of approach is that generation process may be “biased” in such a way that its deceptive; ie large/r glide run differences may exist but the algorithm dynamics may be “painting itself into a corner”. zen question of moment, centered around a real solution: how to tell the difference? another way of thinking of this is that unlike other hard seed generation methods which seem to rarely run into scarce regions, large glide run differences might exist but not “build” on “prior” large glide run differences (where “building” is the iterative bit-adding in the generation process).

(partly for future ref, incl the 2 gnuplot commands for color specification in the plot file as directed, and also ofc because all the colors are just so pretty…)

plot 'out.txt' using 1:2:3 with line linecolor variable
plot 'out.txt' using 1:2:3 with line palette





(7/18) 15 +1s & 2 reshares for this post announced on the relatively new yet huge 172K subscriber Math group on google+, thx all! the group is great/ excellent for sharing cool math pictures/ diagrams. ❗ 😎 😀 ❤

which reminds me, while on the topic, “best hits”, never did cite this old google+ post on one of my best visualizations from over a yr ago which also very nicely brings out its fractal nature and its seemingly endless striking/ mysterious dichotomy between order and randomness. 17 +1s, 2 reshares. doncha just luv social media? with all these “pictures of amazing curves,” maybe need to finally bite the bullet & get a instagram acct eh?! this stuff is surely way more compelling and likely to get far more hits than all that other flotsam-jetsam on it right? “FOMO”?! “YOLO”?! huh?!? us mathematicians really know whats important in life & how to have fun and appreciate real beauty, right!?! ⭐ 🙂 o_O

the base 2 “wedge”

(7/22) ❗ ❓ this is a striking case study of a single trajectory represented in binary, inspired by some earlier ideas looking at the prefix versus the suffix of the iterates in base 2 and the observation of long 1-runs in the suffix. it was found randomly via the max glide spread method of hard seed generation. the binary string is cut into two with one half the base 2 length of the original seed, and the remaining suffix. the trajectory continues only until the max value. see any pattern? 😮 ARGV[0]=10


["1011111111111111111111111111111111", "1"]
["1000111111111111111111111111111111", "11"]
["1101011111111111111111111111111111", "11"]
["1010000111111111111111111111111111", "111"]
["1111001011111111111111111111111111", "111"]
["1011011000111111111111111111111111", "1111"]
["1000100010101111111111111111111111", "11111"]
["1100110100000111111111111111111111", "11111"]
["1001100111000101111111111111111111", "111111"]
["1110011010101000111111111111111111", "111111"]
["1010110011111110101111111111111111", "1111111"]
["1000000110111111000011111111111111", "11111111"]
["1100001010011110100101111111111111", "11111111"]
["1001000111110110111100011111111111", "111111111"]
["1101101011110010011010101111111111", "111111111"]
["1010010000110101110100000011111111", "1111111111"]
["1111011001010000101110000101111111", "1111111111"]
["1011100010111100100010100100011111", "11111111111"]
["1000101010001101011001111011010111", "111111111111"]
["1100111111010100000110111001000011", "111111111111"]
["1001101111011111000101001010110010", "1111111111111"]
["1110100111001110100111110000001100", "0111111111111"]
["1010111101011010111101110100001001", "01011111111111"]
["1000001110000100001110010111000111", "000001111111111"]
["1100010101000110010101100010101010", "100010111111111"]
["1001001111110100110000001001111111", "1110100011111111"]
["1101110111101111001000001110111111", "1101110101111111"]
["1010011001110011010110001011001111", "11100110000111111"]
["1111100110101101000001010000110111", "11011001001011111"]
["1011101101000001110000111100101001", "111000101110001111"]
["1000110001110001010100101101011111", "0110101000101010111"]
["1101001010101001111111000100001111", "0001111101000000011"]

(7/23)💡 that “wedge pattern” is one of the most striking patterns found to date! so the straightfwd thing to do at this point is start out analyzing the basics of this! feel some deja vu on this. might have noticed something like this a long time ago, but maybe put it aside, and forgot to come back to it. there were too many other shiny objects lying by the side of the road since then, so to speak. but with collatz, one has to take what one can get. my immediate idea is that this trend is with starting seeds in form 2n-1 and sure enough, thats the case.

afaik, these trajectories have not been studied before or resolved, so to resolve them in general could be a real breakthrough (and might also lead to techniques to work on other “similar” trajectories or better). heres some simple code to look at the very smooth/linear “length” of the wedge (red) versus the somewhat more wobbly trajectory peak (green), and the glide length (blue), and total trajectory length (magenta). the wedge length and trajectory peak very closely coincide. the glide peak length is very nearly ~½ of the glide length. the glide length in turn is about ~½ the trajectory length.



1st next graph, the trajectory peak minus wedge length is quite spiky, & seems to follow a power law distribution (recurring large spikes), but looks conceivably bounded by a constant. the next, 2nd graph shows glide length divided by total trajectory length hovering around ~½. the 3rd graph of glide length and trajectory length divided by wedge length shows a sawtooth-like pattern.




(7/24) 😳 oops! after a query on mathoverflow & response by Simon Henry realized that this pattern is quite basic, and he says “nothing more can be said about the trajectory after the wedge pattern”. (would cite the question but its likely to “link rot”/ aka be deleted, doncha luv stackexchange!) here is some code that looks at trajectory dynamic occurring after the wedge pattern for very high starting seeds up to 320K. (ruby bignums are really amazing! & big data rocks!) it tracks the glide distance after the wedge. it is “low” under ~70 max for “very large” iterates but the case for a constant bound seems more shaky after looking at this, considering the nearly distinctly “larger” spikes that seem to happen in the 2nd half.



a few brief musings on a new physics analogy idea that occurred to me. each iterate seems to store “potential energy” somewhat like a spring. at the peak of the “trajectory” (a word also used for projectile paths), potential energy is totally emptied. as the trajectory is increasing toward the peak, the potential energy is decreasing to zero as it converts into kinetic energy. it seems one wants a recursive way to “measure” this potential energy other than the collatz function itself. the idea of potential energy also points to a “spring” which stores potential energy and releases it in oscillating waves of kinetic energy. a trajectory also seems like the arc of a projectile under the influence of gravity, launched with some initial velocity.

next, had the urge to revisit an old idea of reverse graph traversal using a optimization heuristic. this code attempts to minimize the distance between the new point in the frontier and a current average distance between prior added points and the “seen”/ currently enumerated tree. talk about emergent properties! it has a striking regime change at about ~2500-4000. the current average distance is in the 2nd graph and the long downslope correlates with the regime change. the average distance is bumpy and then smooths out exactly at the regime change. there appears to be a remarkable tradeoff between noise in the average and noise in the graph traversal. the temporary regime also seems to bear some resemblance to sequential stopping distance graphs.




this is a slight change that uses hamming distance on the points instead and also a blend of the point distance and “trajectory position”, and gets notable results. it seems to traverse the collatz graph in a patterned/ fractal/ self-similar way, what is generally sought. from experience, zooming in on the patterns or finding the exact differences in the patterns would probably lead to uncomputable fractal variations. musing on this leads me to think about this overall project as attempting to “smooth out wrinkles” on an “infinitely/ endlessly wrinkled” entity. can this general concept of “unwrinkling” be formalized somehow?



(7/27) this code looks at a depth-first traversal of the reverse tree, and from prior experiments these points tend to cluster in bunches. never did do a histogram of that, and so heres some code to do that and look at how the histogram peaks change with depth. theres a definite/ distinctive “spectral pattern”. following are histograms at depth 30, 40 specified as ARGV[0].




then a slight alteration to the code centers the histograms on the peak bin & looks more coherent. depth/ ARGV[0]=40



(7/29) examining those histogram peaks of values in binary, found long runs of the same strings, which led to this next idea. this is some sophisticated-yet-not-large code that revisits traversal via “monotone increasing compression ordering”. this traverses the reverse tree based on minimizing the compression ordering. compression order is total size of compressed string plus the “compression run dictionary”. the compression routine was adjusted based on finding a glitch of not compressing an obvious run of values into a “double word”. it also compresses multiple instances of the found paired string eg into the 3rd, 4th and so on. it looks at the order of the current “glob” combined with a single new node at the “current point” of evaluation (it is even more expensive to calculate adding any new node to the always-current/latest glob).

unfortunately the results do not seem that ordered. the 1st graph is the reverse tree visit order. the 2nd and 3rd (same graph, red & green respectively) are the compression order value and the depth. a natural (next) analysis would be to look at the grammar compression trees at milestones like every time the depth reaches a new max etc.





this seems surprising. the compression routine was run on the nth level of reverse tree values concatenated in base2. the greedy routine finds no dictionary items containing other dictionary items. the output is the sequence in dictionary “words”, followed by the dictionary. the dictionary words contain no other dictionary “words”.


["aq", "ah", "aa", "ag", "0", "ag", "aa", "1", "z", "aj", "o", "ap", "1", "g", "af", "n", "as", "ae", "t", "as", "e", "a
o", "ai", "ar", "v", "ao", "y", "ar", "ad", "y", "m", "x", "s", "s", "ao", "r", "ae", "0", "0", "0", "q", "q", "ai", "f"
, "af", "v", "0", "ad", "l", "aq", "ac", "ar", "o", "0", "am", "ah", "g", "an", "1", "ac", "1", "n", "0", "p", "t", "an"
, "z", "d", "u", "aq", "al", "k", "k", "aj", "f", "j", "j", "am", "m", "as", "x", "i", "i", "w", "e", "ak", "l", "0", "h
", "h", "p", "0", "w", "1", "ap", "d", "1", "a", "ab", "u", "0", "ab", "c", "c", "al", "b", "b", "r", "ah", "a", "0", "a
["a", ["1", "0", "1", "0", "1", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"]]
["b", ["0", "0", "0", "0", "1", "0", "1", "0", "1", "0", "1", "0", "1", "0", "1", "0", "1", "0"]]
["c", ["0", "0", "0", "0", "0", "0", "0", "0", "1", "0", "1", "0", "1", "0", "1", "0", "1", "0"]]
["d", ["0", "0", "1", "0", "1", "1", "1", "1", "0", "0", "0", "1", "1", "1", "0", "0", "0"]]
["e", ["1", "0", "1", "0", "1", "0", "1", "0", "1", "1", "1", "1", "0", "0", "0", "1", "1"]]
["f", ["0", "0", "0", "1", "1", "0", "1", "0", "1", "0", "1", "0", "1", "0", "0", "0"]]
["g", ["0", "1", "0", "0", "1", "0", "1", "1", "0", "1", "0", "1", "0", "1", "1", "0"]]
["h", ["0", "0", "1", "1", "1", "0", "0", "0", "1", "1", "0", "1", "0", "1", "0"]]
["i", ["0", "0", "0", "1", "1", "1", "0", "0", "0", "1", "0", "1", "0", "1", "0"]]
["j", ["0", "0", "1", "1", "0", "1", "0", "1", "0", "1", "0", "1", "0", "1", "0"]]
["k", ["0", "0", "0", "0", "0", "0", "0", "0", "1", "1", "0", "1", "0", "1", "0"]]
["l", ["0", "0", "0", "1", "1", "1", "0", "0", "0", "1", "1", "0", "1", "0", "0"]]
["m", ["1", "0", "0", "0", "1", "0", "0", "0", "0", "0", "0", "0", "0", "1"]]
["n", ["0", "0", "1", "0", "1", "1", "1", "0", "0", "0", "1", "0", "1", "0"]]
["o", ["0", "0", "1", "0", "1", "1", "0", "1", "0", "0", "0", "0", "1", "0"]]
["p", ["1", "0", "1", "1", "1", "0", "0", "0", "1", "1", "1", "0", "0"]]
["q", ["0", "0", "0", "0", "1", "0", "0", "0", "1", "1", "0", "1", "0"]]
["r", ["1", "0", "0", "0", "1", "0", "1", "0", "1", "0", "1", "0", "1"]]
["s", ["0", "0", "0", "1", "0", "0", "0", "1", "0", "1", "0", "1", "0"]]
["t", ["1", "0", "1", "1", "1", "0", "1", "0", "0", "0", "1", "0"]]
["u", ["0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "1"]]
["v", ["1", "0", "0", "1", "0", "0", "0", "1", "1", "1"]]
["w", ["1", "0", "0", "1", "1", "1", "0", "0", "0"]]
["x", ["0", "0", "0", "1", "0", "1", "0", "0", "0"]]
["y", ["1", "0", "0", "1", "0", "0", "1", "1", "0"]]
["z", ["1", "1", "1", "0", "1", "0", "1", "0", "1"]]
["aa", ["0", "1", "1", "1", "1", "1", "0", "0", "0"]]
["ab", ["1", "0", "1", "0", "1", "0", "0", "0"]]
["ac", ["1", "1", "0", "0", "0", "0", "0"]]
["ad", ["0", "0", "1", "0", "1", "1"]]
["ae", ["1", "0", "0", "0", "1", "1"]]
["af", ["1", "1", "1", "0", "0", "0"]]
["ag", ["0", "1", "1", "1", "0", "1"]]
["ah", ["0", "1", "0", "1", "0"]]
["ai", ["1", "0", "0", "0", "1"]]
["aj", ["1", "0", "0", "0", "0"]]
["ak", ["0", "0", "0", "0"]]
["al", ["1", "0", "0", "0"]]
["am", ["1", "0", "1", "1"]]
["an", ["0", "1", "0"]]
["ao", ["1", "0", "0"]]
["ap", ["1", "1", "0"]]
["aq", ["1", "0"]]
["ar", ["0", "1"]]
["as", ["1", "1"]]


here is some small variation on the histogram code that sorts the reverse tree values by binary widths instead and draws lines (x axis width, y axis count). as usual it leads to a mix of regular and irregular aspects. the clustered area of smaller sizes/ widths is a more noisy “frontier” and the other area is more regular (ARGV[0]=60 depth).




a. collatz
b. reddit
c. mathoverflow
d. mathematica

2 thoughts on “summer collatz highlights, reddit/ stackexchange

    1. vznvzn Post author

      🙂 way cool/ thx for sharing! your name looks familiar, maybe you comment on blogs somewhere? do you have a stackexchange acct? maybe drop by this chat room if you have time sometime, number theory


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s