hi all this is the new month, 30th collatz post on this blog. (its been a long trip, and the destination is still not in sight…)
⭐ 💡 ❗ this was surprising to me and shows how little some of these ideas have been explored; its some very simple bunch code that just advances sequentially but integrates over logarithmic trajectory values instead of the actual ones and has a very strong smoothing property even over “not that many” trajectories! (had noticed earlier in other contexts that the logarithmic transform has strong smoothing properties and decided to apply it here on a whim.) its only 100 trajectories but appears to be perfectly monotonic. and its a totally local algorithm. why didnt this show up sooner? 2020 hindsight! 😳 😡 😦
exercise for reader: how will this strategy easily fail? hint, consider hard seed generation strategies.
this is the answer, the code is hooked up to the hard seed generator testing/ graphing nonmono run length and fixed the graphing to be in color. nice/ pretty! there is a mostly attenuated upslope except somewhat stronger for the 3rd ‘c’ scheme. the answer is that the longrunning trajectory will always contribute to/ increase the nonmono run length as it is the last trajectory being advanced. this suggests a new strategy: look at bunch algorithms that advance the longest trajectory earlier than any/ all others. so which advance heuristics have this property? (possibly some have been tried already that have something like this property.)
(12/8) recall trying something like this a long time ago but never writing it up, not sure if decided the conclusion was incorrect, or was going to revisit it or…? anyway, there is a basic principle behind “bunch” algorithms that can be analyzed, and am slightly embarrassed havent worked this out prior to this moment. 😳
the basic idea is that for each trajectory, there is an “area” above the starting seed, and a sort of anti-area “below” the starting seed, and a bunch algorithm cannot succeed if the above area exceeds the below area. this is some simple code that does that calculation starting at hard seeds and over the 100 adjacent trajectories. the graph is the sum of positive area minus negative area (there is a cosmetic improvement of putting visual gaps between the runs), and if it increases, bunch algorithms are ruled out in general! the results are striking, and the rising 1st, 5th, 7th (black1, magenta, yellow/ ‘w’, ‘cg’, ‘r’) generation schemes clearly rule it out. wow! back to the drawing board! 😮 😥
⭐ 💡 ❗ 😮 (12/10) when in doubt, try the logarithmic mapping! not exactly sure why this works so well, it could be further analyzed why this happens, but the difference of logs of the positive and negative areas work out excellent (more precisely, this is nearly equivalent to looking at products of seed vs iterate differences instead of sums). a reversal, snatching victory from the jaws of defeat! there is some smoothing due to the log2 function roundoff, not sure if it is significant yet. (the gnuplot log function takes absolute values, the
abs(x) expr is not necessary but just for clarity.)
this is another idea have been wanting to try for awhile, it looks at the 3n binary prefix comparisons considered previously and whether they could lead to a smooth monotonically decreasing series, based on the max nonmono run length metric of the series. it finds an attenuating effect but the trends are clearly increasing.
(12/11) there is a simpler way to generate seeds that does not try to determine hardness ahead of time! the simple approach is to just use the memory usage as a rough metric of seed hardness. so decided to simplify that code and it works great. it also has the nice feature of never failing to return seeds! again, that feeling of “should have figured that out to begin with…” the nonmono run length routine is slightly simplified also. also tried a very simple bunch algorithm based on advancing max trajectory iterate 1st. results were lackluster (increasing) so skipping that graph.
(12/14) am a bit surprised/ disappointed at these results. an earlier experiment for
scale.rb found that a natural scaling of all the trajectories versus the max length trajectory led to excellent smoothing properties. (here “smoothing” is a rough shorthand for generally good monotonically decreasing properties.) did not try this across all the hard seed algorithms, and that is now implemented/ tested here. it turns out to have weak scalability over the hard seeds. this is something of a setback because its a global algorithm, not even a local one. so it appears even finding a global algorithm with good smoothing is hard.
heres a tiny variation but with big computational impact by increasing the bunch count to 5000 from 100. there seems to be some rough improvement and overall max decrease but the overall trends are still noticeably upward.
💡 ❗ ❓ (12/15) this was somewhat surprising, again a simple exercise revealing something striking, along the lines of the expr “only scratching the surface”. its just a 100 trajectory output (log scaled) for ‘w’ method at the 0.5 hardness. other than the outlier trajectory that is generated, the trajectories merge/ group into “grooves”. are they hitting the same “seen” elements just with some offset? needs further investigation given that any pattern, no matter how small, might be exploitable. anyway this also explains earlier results where “adjacent” trajectories are far from independent.
a simple addition of a seen cache reveals that is indeed the case (in this following graph the later trajectories that hit seen/ cached iterates just terminate in the middle), but this is written such that its not clear if shorter paths are consistently eventually intersecting the later iterates of the larger path, or the other paths. overall this leads to all kinds of immediate interesting questions/ ideas. (generated path in green, others in red.)
this is another way of looking at that same data by working backwards from the trajectory “tree”. the trajectories merge with each other in a tree structure and this code builds the tree and then graphs it starting from the 1 destination point. again the generated trajectory is in green, others red.
this plot studies something close to a concept of “treewidth” of the prior trees over increasing hard seeds for the ‘w’ method. treewidth has a long and illustrious background in advanced complexity theory with many deep connections to computational hardness. the basic idea here is that a bunch of say, 100 trajectories starts out as a tree with 100 branches, and those join at a certain rate, and fewer and fewer branches need to be considered as nearing the trunk of the tree. however, this works from the ending trunk starting at 1. the darker colors are treewidth profiles for less hard seeds, the lighter colors are for the harder seeds. the pattern is that the treewidths rapidly decline to a small number of branches, into a narrow “staircase” region at the bottom.
(12/17) that last graph is not the proudest moment as far as visual coherence. in retrospect it makes a lot of natural sense to overlay the spikes. a very simple change is to output the tree in the reverse order from branches toward the trunk. that leads to this nice result. so much more coherent! this also shows a natural similarity in the underlying graphs (trees). they have a similar “profile”. the profile “stretches out” for harder seeds. the trends tend to flatten out quickly so a logarithmic plot follows for another somewhat “zoomed in” pov. (note the color scale for hardness is flipped versus the prior graph.) there is some hint of this graph in prior bunch algorithms that decrease the number of trajectories considered based on a seen cache.
the next obvious question is, is any of this exploitable? its interesting to note these are perfectly monotonically declining graphs, however the max nonmono run length of the curves must necessarily be increasing (hint, think of the longest trajectory). but if the curves could be proven monotonically declining some other way, thats essentially a proof! so (thinking out loud) what could something like that look like?
(12/18) an obvious idea is to integrate those trajectories! and also look at nonmono run length. these graphs are for 100 and 500 trajectories (line 130). the # of trajectories seems to have almost no effect. the data shows that the max nonmono run length is (not surprisingly) almost always coming from the final few trajectories if not the last one. have the suspicion that nonmono run length may be nearly flat for the earlier part of the curve. this is a familiar “trap” where the overall performance is not smoothed and instead dominated by the extreme or outlier point(s).
(12/30) this is replication of a measurement that was discovered/ written up last summer, calculated over the new hard seed generation. to review, each trajectory has a partner or “smaller sibling” that behaves similarly in various statistics such as total trajectory length and trajectory height etc, by adjusting only the msb. this computes the max trajectory values between the trajectory and its smaller sibling, does a log2 difference and finds it seems to be bounded by a constant. seems promising; so again, how can this be exploited?
(12/31) was looking at plots of the trajectory and its smaller sibling, and then came up with an idea as follows. extract the two trajectories starting from the max to the end glide point. then look at the max (log2) “vertical” differences between these two trajectories scaled horizontally. this is a global type algorithm. it leads to very flat results for some of the hard generation methods, and increasing trends for the others.
(1/3/2016) 💡 ❗ 😮 on a whim decided to scale the vertical dimension by the max height for the last code, and get surprising results! did not expect this! the difference metric seems to converge ie roughly decrease!
this is a bit more rough, this code was refactored to run on ruby1.8 and also the gnuplot version did not have variable linecolor mode. the ruby formatting got lost in all the back-and-forth. ruby1.8 does not have max_by, min_by functions or odd?/ even? either. the code uses sort instead of max to find a max element which is inefficient because it replaces a O(n) algorithm with an O(n log n) one.
(1/6) didnt have any big new ideas after that (some strong probably-foreign virus is not contributing to my mood/ creativity right now). however heres a simple/ minor variation/ extension. this code looks at the max difference in the ascent and descent sections (red, green) respectively and the max of both (blue). the ascent difference seems to be larger/ more noisy but still noticeably decreasing. not sure exactly what to make of this right now but this all seems to point to the idea that even though fractal, the ascent and descent begin to converge toward a particular “shape” as the seeds get larger.