โ ๐ก โ now the big fun begins, tuning the algorithm & looking for a real solution. did some further experimenting and came up with some new metrics. the prior metrics succeed in improving and but on further runs & thoughts (as is commonly the case with GAs), maybe these metrics that dont take into account slope of the sampled function are not sufficient to find the desired convergence behavior.

in fact to have below-constant max nonmonotonic run length , the “sequential decrease/ nonreversal density” needs to increase gradually as the size of the starting seed increases (because average trajectory lengths increase as graphed previously) so merely maximizing is not enough. another basic metric is “average distance between local minima” and that needs to be constant, but again minimizing it over a small local region does not necessarily ensure the solutions will have a constant trend as is the case in some experiments. yet another simple (new) metric/ constraint worth looking at is “total count of local minima” and that also needs to increase at least as fast as the trajectory lengths to get the desired constant-bounded distance between local minima.

so then a natural idea is to look at trends of lines and a corresponding fitness function. how can this be measured/ quantified? the standard/ straightforward method is least squares linear regression, and that method was next employed.

experiments also showed that even with average distance between local minima nearly constant over a small optimizing region, sometimes there are massive irregular distances/ spikes in that metric. so next it seems like the variance in average distance between local minima is a key metric to minimize. of course small average distance between local minima is preferred but tuning GAs to optimize more than one thing at a time can sometimes be a can of worms.

the final evaluation function evaluates the linear regression of the variance in average distance between local minima fitted to the weighted walk starting at random starting seeds in the range at offsets of 2 (10 points). in other words the algorithm is searching for a constant average distance but, while it is preferrably smaller, there is no fitness directly based on/ connected to its magnitude.

some other bug fixes: after some troubleshooting, discovered the prior code sometimes gets “NaN” “not-a-number” values for averages and standard deviations in rare cases, in one case where there are all ‘1’s in the binary expansion ie numbers in the form .

so, after some not-difficult adjustments, ran the algorithm and found some striking results possibly verging on yet another breakthrough โ outlined in the following graphs. the algorithm found 9 improved solutions over ~7 hours. it is easy to throw out some for which is visually constant-bounded and not increasing (or declining) (even though may often be quite noisy in those cases and difficult to interpret unambiguously individually). but whats left?

a very strange-yet-dramatic (or vice versa?) solution pattern emerged over at least two different GA runs as documented in the following graphs, seems to be a strong vindication of the techniques, and apparently fits the desired criteria over a large test region although in a somewhat unpredictable way. the weights are as below. note that this code has not been tested for strict resilience/ repeatability in floating point ops and controlled random number generator starting seeds, that will require some further analysis and code adjustments, although the current inherent seed randomness in the trend analysis seems to indicate some significant resilience/ repeatability of the solution.

the 1^{st} graph is the whole “bottom line” of the max nonmonotonic distance (). for many GA solutions these are very gradually increasing graphs as in the prior graphs. however, this solution is anomalous in many ways but also shows that the overall idea (linear weighting of nonlinear iterate analysis statistics) can apparently lead to quite structured behavior.

there are three very distinct regions. the leftmost region has the training region and manages to stay below a constant under about 100, very desirable. but then it “blows up” around and climbs rapidly in the 2^{nd} region. then in the 3^{rd} region starting around its quite noisy but still *apparently* “sideways”, less than a ceiling of about ~1800 although with several large spikes.

the 2^{nd} graph plots average (green) and standard deviation (red) in the nonmonotonic run lengths and has the same 3 corresponding regions. the 1^{st} region is low. the 2^{nd} region has rapidly climbing average/ deviation. the 3^{rd} region shows both *falling* rapidly and then gradually! this is completely unexpected behavior in the test regions 2/3 outside of the training region 1 yet in the “final” region 3, *desired*, even *“better than expected”* or almost even *“too good to be true”* because only constant bounding was sought after. ๐ leading to a wild wonder/ conjecture, is an asymptotically declining average really theoretically possible? โ โ

the 3^{rd} graph plots sequential decrease/ nonreversal density . in region 1 it is noisy and slowly climbing. in region 2 it falls rapidly. in region 3 it climbs gradually, the desired behavior (although does seem to plateau near the end right, which is maybe problematic).

the 4^{th} graph plots total trajectory length in red (divided by 5 to scale it in the graph but which does not affect slope) and counts of local minima in green. the former climbs quite predictably but the latter shows the same 3-region behavior differences. in region 1 it climbs, in region 2 it falls, both in a single curved, concave-downward manner. then in region 3 it increases linearly. note in the 3^{rd} region the local minima count climbs faster than the trajectory lengths, in line with the other statistics. *wow!*

the GA found “a” desired solution in the long run analysis although not in the form expected. it is nearly ideal except for some large nonmonotone distances and region 2; regions 1/3 are acceptable in their dynamics, with region 3 as the presumed asymptotic behavior. the big question is if there are any more region changes possible after the large analysis region. the limitations of empirical analysis are emphasized with these results. no hint of the unexpected test region 2/3 behavior is hinted in the training region 1 statistics. but the test-divergence-from-training is not at all typical of machine learning apps either. its not just a simplistic increase in error. its almost as if its the output of a program/ algorithm that has 3 different conditional cases…! ๐ฎ

actually the wondrous nature of this could be emphasized even further. a bunch of random walks are weighted and added together to get a distinctly less random walk. its almost unbelievable or magical. a dramatic, maybe vista-opening reaffirmation of the ancient/ deep principle, *“the whole is greater than the sum of the parts.”* or like a rabbit being pulled out of a hat. ๐ โ

4 w0 -1.2965207298328 w1 0.00881255269287658 w2 0.834874460020838 w3 0.488768570929333 w4 -0.841996911397055 w5 0.461962697182684 w6 0.624682934548628 w7 0.182038750890474 w8 1.11402628548688 w9 0.853460506803983 w10 0.0276695282228376 w11 -0.428453944156957 w12 0.179899938636543 w13 -0.0481205801620707

#!/usr/bin/ruby1.8 def stat(l) return 0, 0 if (l.empty?) t = t2 = 0 l.each \ { |x| t += x t2 += x ** 2 } c = l.size a = t.to_f / c z = t2.to_f / c - a ** 2 sd = Math.sqrt(z < 0 ? 0 : z) raise if (sd.nan?) return a, sd end def avg(l) t = 0 l.each { |x| t += x } return t.to_f / l.size end def d(s) c = s.split('').select { |x| x == '1' }.size return c.to_f / s.length, s.length, c, s.size - c, c % 2, c % 3 end def rank(n, l) s = n.to_s(2) m1 = Math.log(n) m2 = m1 ** 2 m3 = m1 ** 0.5 d1, w, c1, c0, c2, c3 = d(s) d2 = (0.5 - d1).abs a1, sd1 = stat(s.split('0').map { |x| x.length }) a0, sd0 = stat(s.split('1').map { |x| x.length }) l2 = [m1, m2, m3, d1, w, c1, c0, c2, c3, d2, a1, sd1, a0, sd0] t = 0.0 l.each_with_index { |w, i| t += w * l2[i] } raise(l2.join(' ')) if (t.nan?) return t end def f3(n) return sprintf("%.3f", n).to_f end def f3m(l) l.map { |x| f3(x.to_f) } end def coef(x, y) x1 = x2 = y1 = y2 = xy = 0.0 n = x.size n.times \ { |i| x1 += x[i] y1 += y[i] xy += x[i] * y[i] x2 += x[i] ** 2 y2 += y[i] ** 2 } d = n * x2 - x1 ** 2 a0 = (y1 * x2 - x1 * xy) / d a1 = (n * xy - x1 * y1) / d return a0, a1 end def f(n, w) l = [] t = 0 while (n != 1) l << rank(n, w).abs puts(l[-1]) if ($mode == 'x') n = (n % 2 == 0) ? n / 2 : (n * 3 + 1) / 2 t += 1 end c = 0 m = l[0] x = j = 0 l2 = [] (1...l.size).each \ { |i| c += 1 if (l[i] < l[i - 1]) if (l[i] < m) then l2 << i - j m, j = [l[i], i] end x = [x, i - j].max } r = c.to_f / (l.size - 1) return x, stat(l2), r, l2.size, t end def eval1(p, z) n = 2 ** p + rand(2 ** (p - 1)) return f(n, z) end def eval(z) l = [] x = [] p = 40 10.times { |i| x << p; l << eval1(x[-1], z).flatten; p += 2 } l1 = l.transpose l2 = stat(l1[0]) + l1[1..-1].map { |l0| avg(l0) } y = l1[2] return -(coef(x, y)[1].abs), l2 end def trend() l = File.open($fn).readlines n = ARGV[0].nil? ? l.size - 1 : ARGV[0].to_i d = l[n].split("\t") s = f3m(d.slice!(0, 8)) w = f3m(d.slice!(0, $v)) $stderr.puts(n, s.join("\t"), w.join("\t"), d.join("\t")) f = File.open('out2.txt', 'w') (2..750).each \ { |p| n = 2 ** p + rand(2 ** (p - 1)) l = f(n, w) f.puts(l.join("\t")) f.flush } end def trend2() exit end def rsign() return rand(2) ? -1 : 1 end $mode = ARGV[0] $v = 14 $fn = 'var.txt' if (!['a'].member?($mode)) trend2() if ($mode == 'x') trend() exit end l = [] 50.times \ { |i| w = (1..$v).map { rand() * 1.5 * ((i % 2 == 0) ? 1 : rsign()) } l <<= [eval(w), w, "[#{i}]"] } f = File.open('out.txt', 'w') f2 = File.open($fn, 'w') mhi = nil mlo = nil j = 0 loop \ { a = l[rand(l.size)] b = l[rand(l.size)] x = a[1] y = b[1] r = rand($v) c = rand(4) h = nil case c when 0 z = x.dup z[r] = 0.0 when 1 z = x.dup z[r] = rand() * 1.5 * rsign() when 2 z = x.dup z[r] *= rand() * 1.5 * rsign() when 3 z = [] x.size.times { |i| z[i] = (rand(2) % 2 == 0) ? x[i] : y[i] } h = "(#{a[2]}+#{b[2]})" end op = 'a' c.times { op.succ! } h = "#{a[2]}#{op}#{r}" if (h.nil?) t = eval(z) l.pop if (l.size > 200) l << [t, z, h] l = l.sort_by { |x| x[0][0] }.reverse hi = l[0] lo = l[-1] fhi = flo = false mhi, fhi = [hi, true] if (mhi.nil? || hi[0][0] >= mhi[0][0] && hi[2] != mhi[2]) mlo, flo = [lo, true] if (mlo.nil? || lo[0][0] > mlo[0][0]) ts = Time.now.to_s f.puts([f3(t[0]).abs, f3(mhi[0][0]).abs, f3(mlo[0][0]).abs, h.length, j, ts].join("\t")) if (flo) f.flush f2.puts((mhi + [j, ts]).join("\t")) if (fhi) f2.flush j += 1 }

**(1/10)** this is some new code with some enhancements. oh, and *oops* the prior analysis code was truncating weights to 3 decimal digits! this was unintentional but on the flip side, the good news is that there is very good stability in the results, ie not much observed butterfly effect and even truncating a lot of decimal digits in the weights seems to have no noticeable effect on analysis statistics.

the code has some changes in the logging, a bunch of new genetic randomizing gene alterations too numerous to mention for now but easily understood by inspection of the code. it has new analysis code and other similar fitness functions, 3 types as labelled by letter modes. in each case it attempts to maximize upward slope after fitting by linear regression around the same training region:

- the sequential decreasing/nonreversal density
- the local minima count
- the local minima density (prior count divided by trajectory length)

got some interesting/ similar yet somewhat different/ improved results in these 3 runs labelled a, b, c. the ‘b’ metric performed best and found a new solution with max nonmonotonic run length of less than ~650. here are the weights and max nonmonotonic run length plots for the best solutions found after over 24hr of GA iterations, ‘a’ red, ‘c’ green, ‘b’ blue. in some ways these seem to be a newly scaled version of the prior solution, ie variations on a theme. the 3-region phenomenon continues at roughly the same numeric boundaries.

however the improvement in the max nonmonotonic run length ~650 of best solution ‘b’ over prior ~1400 is about 53% which is quite encouraging and even dramatic. am hunting for a low value here but havent figured out how to do this (yet) combined with a constant trend in a fitness function, so am relying on presumed indirect effects here through the other 3 fitness functions. it is not known yet if the improvement was due to the change in fitness functions or if its due to starting in better random starting conditions or even just randomness in the search etc; its currently too computationally expensive to do multiple runs. (wheres that open science collaboration team and supercomputing cluster when they’re needed?)

12 0 -0.0621151129675874 1 0.027749950267922 2 0.0285131621680541 3 -0.84576774323482 4 -0.901049776447273 5 -1.61758279610741 6 -0.178417524589377 7 0.0 8 -0.0 9 -1.66351603027066 10 0.0 11 -0.832684135084786 12 0.0 13 0.19429829301678 21 0 -0.937405391284104 1 0.0670177950097334 2 0.0 3 -1.73005189948666 4 -1.38257016553931 5 -0.558885208274334 6 -0.175932470473479 7 0.00144369304594031 8 0.0 9 0.463002776570181 10 0.746205672068276 11 0.790480164773667 12 0.0415956136680239 13 -0.918165827279676 11 0 4.15867051538285 1 -0.13286683498869 2 -0.49242392058947 3 0.0986854597916064 4 2.8536377301632 5 0.239334156410213 6 -0.341876102500191 7 0.14631302770963 8 -0.361365204553417 9 0.472727150190803 10 0.411456277848107 11 0.0 12 0.306902447155646 13 1.43546799909965

#!/usr/bin/ruby1.8 def stat(l) return 0, 0 if (l.empty?) t = t2 = 0 l.each \ { |x| t += x t2 += x ** 2 } c = l.size a = t.to_f / c z = t2.to_f / c - a ** 2 sd = Math.sqrt(z < 0 ? 0 : z) raise if (sd.nan?) return a, sd end def avg(l) t = 0 l.each { |x| t += x } return t.to_f / l.size end def d(s) c = s.split('').select { |x| x == '1' }.size return c.to_f / s.length, s.length, c, s.size - c, c % 2, c % 3 end def rank(n, l) s = n.to_s(2) m1 = Math.log(n) m2 = m1 ** 2 m3 = m1 ** 0.5 d1, w, c1, c0, c2, c3 = d(s) d2 = (0.5 - d1).abs a1, sd1 = stat(s.split('0').map { |x| x.length }) a0, sd0 = stat(s.split('1').map { |x| x.length }) l2 = [m1, m2, m3, d1, w, c1, c0, c2, c3, d2, a1, sd1, a0, sd0] t = 0.0 l.each_with_index { |w, i| t += w * l2[i] } raise(l2.join(' ')) if (t.nan?) return t end def f3(n) return sprintf("%.3f", n).to_f end def f3m(l) l.map { |x| f3(x.to_f) } end def coef(x, y) x1 = x2 = y1 = y2 = xy = 0.0 n = x.size n.times \ { |i| x1 += x[i] y1 += y[i] xy += x[i] * y[i] x2 += x[i] ** 2 y2 += y[i] ** 2 } d = n * x2 - x1 ** 2 a0 = (y1 * x2 - x1 * xy) / d a1 = (n * xy - x1 * y1) / d return a0, a1 end def f(n, w) l = [] l1 = [] t = 0 while (n != 1) x = rank(n, w).abs l << x l1 << [Math.log(n), x] n = (n % 2 == 0) ? n / 2 : (n * 3 + 1) / 2 t += 1 end return l1 if ($mode == 'x') c = 0 m = l[0] x = j = 0 l2 = [] (1...l.size).each \ { |i| c += 1 if (l[i] < l[i - 1]) if (l[i] < m) then l2 << i - j m, j = [l[i], i] end x = [x, i - j].max } r = c.to_f / (l.size - 1) return x, stat(l2), r, l2.size, l2.size.to_f / l.size, t end def eval1(p, z) n = 2 ** p + rand(2 ** (p - 1)) return f(n, z) end def eval(z) l = [] x = [] p = 40 10.times { |i| x << p; l << eval1(x[-1], z).flatten; p += 2 } l1 = l.transpose l2 = stat(l1[0]) + l1[1..-1].map { |l0| avg(l0) } y = l1[3] if ($mode == 'a') y = l1[4] if ($mode == 'b') y = l1[5] if ($mode == 'c') return coef(x, y)[0], l2 end def out(n, w) puts(n) w.each_with_index { |w, i| puts("#{i}\t#{w}") } end def info() File.open($fn).readlines.each_with_index \ { |ln, i| d = ln.split("\t") puts([i, f3(d[0]), d[-2], d[-3].length, d[-1]].join("\t")) if ($mode == 'y') puts([i, d[-3]].join("\t")) if ($mode == 'z') if ($mode == 'w') then s = d.slice!(0, $s) w = d.slice!(0, $v).map { |x| x.to_f } out(i, w) end } exit end def trend() l = File.open($fn).readlines n = ARGV[0].nil? ? l.size - 1 : ARGV[0].to_i d = l[n].split("\t") s = d.slice!(0, $s) w = d.slice!(0, $v).map { |x| x.to_f } out(n, w) f = File.open("out_#{n}.txt", 'w') (2..750).each \ { |p| n = 2 ** p + rand(2 ** (p - 1)) l = f(n, w) f.puts(l.join("\t")) f.flush } end def trend2() l = File.open($fn).readlines d = l[-1].split("\t") s = d.slice!(0, $s) w = d.slice!(0, $v).map { |x| x.to_f } f = File.open('out2.txt', 'w') p = 10 19.times \ { l = eval1(p, w) l.each_with_index { |x, i| f.puts([l.size - i, x].join("\t")) } p += 10 f.puts } exit end def rsign() return (rand(2) % 2 == 0) ? -1 : 1 end def rw() return rand() * 1.5 * rsign() end def re() return rand(2) % 2 == 0 end $mode = ARGV[0] $s = 9 $v = 14 $fn = 'var.txt' if (!['a', 'b', 'c'].member?($mode)) trend2() if ($mode == 'x') info() if (['w', 'y', 'z'].member?($mode)) trend() exit end l = [] 50.times \ { |i| w = (1..$v).map { rand() * 1.5 * rsign() } l <<= [eval(w), w, "[#{i}]"] } f = File.open('out.txt', 'w') f2 = File.open($fn, 'w') mhi = nil mlo = nil j = 0 loop \ { a = l[rand(l.size)] b = l[rand(l.size)] x = a[1] y = b[1] r = rand($v) c = rand(12) op = 'a' c.times { op.succ! } h = "#{a[2]}#{op}#{r}" case op when 'a' z = [] x.size.times { |i| z[i] = re() ? x[i] : y[i] } h = "(#{a[2]}+#{b[2]})" when 'b' z = x.dup z[r] = 0.0 when 'c' z = x.dup z[r] += rw() when 'd' z = x.dup z[r] *= rw() when 'e' z = x.dup z[r] = rw() when 'f' z = x.map { |w| re() ? w : w + rw() } when 'g' z = x.map { |w| re() ? w : w * rw() } when 'h' z = x.map { |w| re() ? w : w * rsign() } when 'i' z = x.map { |w| re() ? w : rw() } when 'j' z = x.map { |w| w + rw() } when 'k' z = x.map { |w| w * rw() } when 'l' z = x.map { |w| w * rsign() } end t = eval(z) l.pop if (l.size > 200) l << [t, z, h] l = l.sort_by { |x| x[0][0] }.reverse hi = l[0] lo = l[-1] fhi = flo = false mhi, fhi = [hi, true] if (mhi.nil? || hi[0][0] >= mhi[0][0] && hi[2] != mhi[2]) mlo, flo = [lo, true] if (mlo.nil? || lo[0][0] > mlo[0][0]) ts = Time.now.to_s f.puts([$mode, f3(t[0]).abs, f3(mhi[0][0]).abs, f3(mlo[0][0]).abs, h.length, j, ts].join("\t")) if (flo) f.flush f2.puts((mhi + [j, ts]).join("\t")) if (fhi) f2.flush j += 1 }

there are so many interesting ways to improve the algorithm and analyze this data right now its a bit overwhelming. the blog writeup is also very time consuming at this pt. the GAs throw off a lot of auxiliary data related to convergence etc, but also there are so many different/ similar metrics and fitness functions now. it could take awhile to sort through all this.

also need to do weight analysis to figure out whether to cut any metrics, and also which metrics are most effective by weight and thereby suggest new metrics. right now the weights seem to be generally nonzero for optimal solutions. but some different weights are zero and not in consistent positions. this is somewhat surprising that totally different weight configurations seem to be leading to similar structured solutions. a big possibly key, even critical mystery right now. ๐ฎ โ

โญ ๐ oh some other way cool news. am formulating some rough ideas about computing density over FSMs as hinted in the last post. not sure exactly how to do it (yet) but it seems computable and maybe even in P time. was able to formulate it into a cstheory.se question that got a few votes but more importantly, hit the jackpot with a quick answer by Shallit. its always a very good day when someone with a wikipedia biography replies in cyberspace. maybe even the highlight of the week. thanks much JS! ๐ (maybe if this all works out mine will be forthcoming soon) ๐ … JS has answered one of my other questions before also… and is also notable in that it takes something different for him to respond; he doesnt have too much activity on the site with currently only 42 answers and **zero** questions. a real life math guru at the top of the mountain!

**(1/11)** โ just realized on editing my Collatz page that Shallit introduced the concept of use of regular languages in understanding the Collatz problem in 1991, ref [1] on that page. also added two newly googled interesting refs [27][28] apparently written by advanced undergraduates as senior theses, LaTourette and Chamberland. alas it seems a pity so few undergraduates seem to be exposed to the problem…

**(1/14)** โ ๐ฎ some deeper *eyepopping/ jawdropping* analysis of what turn out to be remarkable solutions. a basic analysis is to run the multiple evaluation statistics (in general measuring random walk monoticity) on the “unconditioned/ raw” collatz trajectories and compare those statistics directly with the “conditioned” walks. those reveal the strong/ definitive differences, but that is to be expected. it turns out, even more dramatic is to plot the original 1d walk vs conditioned 1d random walk as a curve (ie correlation between the two) to figure out how the algorithm is optimizing and “smoothing” the latter.

these following plots are for “weight set 1”. samples are taken at starting trajectory bit lengths at increasing lengths of 10 between 10 and 300. starting positions are plotted in red. then up to 20, 50, or 100 of the initial trajectory values are plotted on top in green. the 3 corresponding regions are now apparent. region 1, 2 are the rising and falling sections of the concave-downward curve, roughly parabolic. then there is a climbing region 3. the curve is clearly smoothed but is surprising and unexpected and seems to represent some attractor in the optimization strategy. note the trajectories are plotted right-to-left.

def stat(l) return 0, 0 if (l.empty?) t = t2 = 0 l.each \ { |x| t += x t2 += x ** 2 } c = l.size a = t.to_f / c z = t2.to_f / c - a ** 2 sd = Math.sqrt(z < 0 ? 0 : z) raise if (sd.nan?) return a, sd end def avg(l) t = 0 l.each { |x| t += x } return t.to_f / l.size end def d(s) c = s.split('').select { |x| x == '1' }.size return c.to_f / s.length, s.length, c, s.size - c, c % 2, c % 3 end def rank(n, l) s = n.to_s(2) m1 = Math.log(n) m2 = m1 ** 2 m3 = m1 ** 0.5 d1, w, c1, c0, c2, c3 = d(s) d2 = (0.5 - d1).abs a1, sd1 = stat(s.split('0').map { |x| x.length }) a0, sd0 = stat(s.split('1').map { |x| x.length }) l2 = [m1, m2, m3, d1, w, c1, c0, c2, c3, d2, a1, sd1, a0, sd0] t = 0.0 l.each_with_index { |w, i| t += w * l2[i] } raise(l2.join(' ')) if (t.nan?) return t end def maxrun(l) m = l[0][1] l2 = [] l1 = [] l.each \ { |x| if (x[1] < m) then l2 = [] m = x[1] else l2 << x end l1 = l2 if (l2.size > l1.size) } return l1 end def f(n, w) l = [] while (n != 1) x = rank(n, w).abs l << [Math.log(n) / Math.log(2), x] n = (n % 2 == 0) ? n / 2 : (n * 3 + 1) / 2 end return l end def g1(w) [1,20,50,100].each_with_index \ { |t, i| f = File.open("out#{t}.txt", 'w') p = 10 d = 10 while (p <= 300) n = 2 ** p + rand(2 ** (p - 1)) l = f(n, w) l[0...t].each { |xy| f.puts(xy.join("\t")) } f.puts if (t != 1) f.flush p += d end f.close } end def g2(w) c = 5 f = File.open("out.txt", 'w') f2 = File.open("outx.txt", 'w') p = 10 d = 10 h = 300 v = 500 (c * c).times \ { |i| x = (i % c) * h y = (i / c) * v n = 2 ** p + rand(2 ** (p - 1)) l = f(n, w) l.each { |xy| f.puts([-xy[0] + x, xy[1] - y].join("\t")) } f.puts f2.puts([x, -y].join("\t")) f2.puts([x, -y + v].join("\t")) f2.puts f2.puts([x, -y].join("\t")) f2.puts([x - h, -y].join("\t")) f2.puts f.flush f2.flush p += d } f.close f2.close end def g3(w) f = File.open("outy.txt", 'w') f2 = File.open("outz.txt", 'w') p = 150 n = 2 ** p + rand(2 ** (p - 1)) l = f(n, w) l.each_with_index { |x| f.puts(x.join("\t")) } maxrun(l).each { |x| f2.puts(x.join("\t")) } end def trend() w = File.open("w#{ARGV[0]}.txt").readlines.map { |x| x.split[1].to_f } g1(w) g2(w) g3(w) end trend()

the next 3 plots contain 5×5 subplots of full trajectories starting at increasing bit lengths by 10, so from 10 to the 250 range, and arrange from top left to bottom right. the x axis is mirrored so that trajectories move left-to-right instead. they are for weight sets 1, 2, 3. here there is a bigger story, a lot to take in and it would definitely make sense as an animation! the weights 2, 3 are strikingly much smoother, the regions are slightly different, getting compressed, and also tend to climb much quicker, and the regions seem to correspond to the changing “bump” size/ shape at the end.

โญ โญ โญ the *sheer smoothness* of these final 2 plots is *stunning!* more *strong, exciting* evidence of a *real, unmitigated breakthrough.* and it is not at *all* hinted in the random walk monotonicity statistics which generally still look quite noisy. showing how differences in quantitative metrics might not reveal fundamental shifts in qualitative worth and how critical different types of analysis/ pictures are for understanding the big(ger) picture.

๐ก *conjecture:* interpreting the diagrams it is natural to wonder where the longest nonmonotone run is occuring. maybe the longest nonmonotone run for different trajectories correlates/ corresponds to the left upward side of the bump in the final left-to-right plots (right before the steep falling section). this should be easy to determine but havent gotten to it yet.

**(1/15)** the conjecture was too easy! it is simple to demonstrate empirically (but to say nothing of *proof*). it explains the prior graphs in a very elegant way. the code has been modified to add the 3^{rd} analysis routine that tracks the longest nonmonotone run and allows plotting it superimposed on the full run, and it always coincides as the “ending bump”.

some more interpretation. imagine computing running averages of the original collatz random walks. that would definitely also smooth out the function but from computation, the longest nonmonotone run lengths seem to be unbounded. therefore no finite averaging would lead to finite max nonmonotone run lengths. the weighting/ conditioning function seems to be “folding up” the random walk in some currently highly mysterious/ inexplicable way to achieve bounded nonmonotone run lengths. concepts of topology theory seem to be relevant here. it seems to be some kind of topological folding.

the plots also seem to immediately suggest a new natural fitness metric that might behave much more smoothly than the other metrics and lead to new solutions: the euclidean distance from the origin of (x, y) where x is the collatz trajectory value and y is the weighted/ conditioned trajectory value! cant wait to try it โ

**(1/22)** โ ๐ฎ ๐ฆ ๐ณ ๐ฅ bad news! heres some analysis code that does a bit-wise greedy search for long nonmonotonic runs, successfully finding large ones. then it looks at the unconditioned vs conditioned random walks and the longest nonmonotonic run for each (red, green respectively). both tend to trend upward. if the conditioned results levelled off while unconditioned walks climbed as in earlier plots it would indicate (potential) success, but they are roughly correlated so the formula is essentially failing on “hard” seeds. *(wince)* another indication of failure is that the longest nonmonotone runs are as high as ~900, which were not seen in the earlier analyses, which didnt find runs longer than ~650.

this is for weight set #3 (1st arg). there does seem to be a minor improvement effect by the algorithm after x=~250 bits. overall, it looks like the GA is training well on random inputs but not generalizing to “hard” seeds which seem to have some different distribution. aka the needles in the haystack throw a curve ball at the conditioning formula & it strikes out.

oh and the code gets a numeric overflow in the `Math.log` function when the # of bits hits 1024. ๐ณ never pushed ruby that far before, *bummer.*

there is maybe some idea to salvage here. if the hard seeds can be differentiated from easier seeds somehow in the initial bit structure then those would be better to train the GA with.

def stat(l) return 0, 0 if (l.empty?) t = t2 = 0 l.each \ { |x| t += x t2 += x ** 2 } c = l.size a = t.to_f / c z = t2.to_f / c - a ** 2 sd = Math.sqrt(z < 0 ? 0 : z) raise if (sd.nan?) return a, sd end def avg(l) t = 0 l.each { |x| t += x } return t.to_f / l.size end def d(s) c = s.split('').select { |x| x == '1' }.size return c.to_f / s.length, s.length, c, s.size - c, c % 2, c % 3 end def rank(n, l) s = n.to_s(2) m1 = Math.log(n) m2 = m1 ** 2 m3 = m1 ** 0.5 d1, w, c1, c0, c2, c3 = d(s) d2 = (0.5 - d1).abs a1, sd1 = stat(s.split('0').map { |x| x.length }) a0, sd0 = stat(s.split('1').map { |x| x.length }) l2 = [m1, m2, m3, d1, w, c1, c0, c2, c3, d2, a1, sd1, a0, sd0] t = 0.0 l.each_with_index { |w, i| t += w * l2[i] } raise(l2.join(' ')) if (t.nan?) return t end def f(n, w) l = [] while (n != 1) l << [Math.log(n), rank(n, w).abs] n = (n % 2 == 0) ? n / 2 : (n * 3 + 1) / 2 end return l end def maxrun(l) return 0 if (l.empty?) m = l[0] l2 = [] l1 = [] l.each \ { |x| if (x < m) then l2 = [] m = x else l2 << x end l1 = l2 if (l2.size > l1.size) } return l1.size end def fmt(h, c) h = h.dup h['n#'] = h.delete('n').to_s(2).length h['p#'] = h.delete('p').to_s(2).length h['l#'] = c return h.to_s.gsub('=>', ' => ') end def sample(t, w) l = [{'x' => 0, 'n' => 1, 'p' => 1, 'x2' => 0}] loop \ { t = (0...l.size).max_by { |j| l[j]['x'] } z = l.delete_at(t) x, n, p, x2 = [z['x'], z['n'], z['p'], z['x2']] puts(fmt(z, l.size)) $stdout.flush p <<= 1 l.push({'x' => x, 'n' => n, 'p' => p, 'x2' => x2}) n2 = n + p l2 = f(n2, w) x1 = maxrun(l2.map { |z| z[0] }) x2 = maxrun(l2.map { |z| z[1] }) l.push({'x' => x1, 'n' => n2, 'p' => p, 'x2' => x2}) } end w = File.open("w#{ARGV[0]}.txt").readlines.map { |x| x.split[1].to_f } m, n, l = sample(nil, w)

**(later)** on immediate inspection of the binary seed values, the “pattern” is almost trivial! the greedy code is generating a long bitstring of mostly 0’s and some scattered 1’s as seeds. the 0 runs follow some irregular pattern. this simplified code (removing the formula weighting logic) outputs a list of the lengths of the 0 runs separated by 1s for bitwise greedy search of long nonmonotone runs. this suggests quite simply that *(deja vu)* a better random seed generator will range over all the different 0/1 densities! the current scheme is centered around 50% density with low deviation.

def f(n) l = [] while (n != 1) l << Math.log(n) n = (n % 2 == 0) ? n / 2 : (n * 3 + 1) / 2 end return l end def maxrun(l) return 0 if (l.empty?) m = l[0] l2 = [] l1 = [] l.each \ { |x| if (x < m) then l2 = [] m = x else l2 << x end l1 = l2 if (l2.size > l1.size) } return l1.size end def fmt(h, c) h = h.dup h['n#'] = h.delete('n').to_s(2).length h['p#'] = h.delete('p').to_s(2).length h['l#'] = c return h.to_s.gsub('=>', ' => ') end def sample() l = [{'x' => 0, 'n' => 1, 'p' => 1}] h = {} loop \ { t = (0...l.size).max_by { |j| l[j]['x'] } z = l.delete_at(t) x, n, p = [z['x'], z['n'], z['p']] if (!h.member?(n)) then puts("\t" + fmt(z, l.size)) h[n] = nil ns = n.to_s(2) p(ns.split('1').map { |s| s.length }) $stdout.flush break if (ns.length > 1000) end p <<= 1 l.push({'x' => x, 'n' => n, 'p' => p}) n2 = n + p x1 = maxrun(f(n2)) l.push({'x' => x1, 'n' => n2, 'p' => p}) } end sample()

{"x" => 0, "n#" => 1, "p#" => 1, "l#" => 0} [] {"x" => 4, "n#" => 2, "p#" => 2, "l#" => 1} [] {"x" => 7, "n#" => 3, "p#" => 3, "l#" => 2} [] {"x" => 7, "n#" => 4, "p#" => 4, "l#" => 4} [] {"x" => 56, "n#" => 5, "p#" => 5, "l#" => 5} [] {"x" => 80, "n#" => 11, "p#" => 11, "l#" => 11} [0, 5] {"x" => 80, "n#" => 25, "p#" => 25, "l#" => 26} [0, 13, 5] {"x" => 126, "n#" => 27, "p#" => 27, "l#" => 28} [0, 15, 5] {"x" => 130, "n#" => 80, "p#" => 80, "l#" => 81} [0, 52, 15, 5] {"x" => 181, "n#" => 82, "p#" => 82, "l#" => 83} [0, 1, 52, 15, 5] {"x" => 208, "n#" => 122, "p#" => 122, "l#" => 123} [0, 39, 1, 52, 15, 5] {"x" => 214, "n#" => 131, "p#" => 131, "l#" => 132} [0, 8, 39, 1, 52, 15, 5] {"x" => 286, "n#" => 162, "p#" => 162, "l#" => 163} [0, 30, 8, 39, 1, 52, 15, 5] {"x" => 298, "n#" => 213, "p#" => 213, "l#" => 214} [0, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 308, "n#" => 238, "p#" => 238, "l#" => 239} [0, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 329, "n#" => 243, "p#" => 243, "l#" => 244} [0, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 384, "n#" => 251, "p#" => 251, "l#" => 252} [0, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 401, "n#" => 260, "p#" => 260, "l#" => 261} [0, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 406, "n#" => 277, "p#" => 277, "l#" => 278} [0, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 433, "n#" => 329, "p#" => 329, "l#" => 330} [0, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 444, "n#" => 333, "p#" => 333, "l#" => 334} [0, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 481, "n#" => 374, "p#" => 374, "l#" => 375} [0, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 497, "n#" => 376, "p#" => 376, "l#" => 377} [0, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 512, "n#" => 393, "p#" => 393, "l#" => 394} [0, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 524, "n#" => 409, "p#" => 409, "l#" => 410} [0, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 533, "n#" => 449, "p#" => 449, "l#" => 450} [0, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 657, "n#" => 465, "p#" => 465, "l#" => 466} [0, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 706, "n#" => 495, "p#" => 495, "l#" => 496} [0, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 722, "n#" => 578, "p#" => 578, "l#" => 579} [0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 742, "n#" => 579, "p#" => 579, "l#" => 580} [0, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 749, "n#" => 597, "p#" => 597, "l#" => 598} [0, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 752, "n#" => 618, "p#" => 618, "l#" => 619} [0, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 764, "n#" => 671, "p#" => 671, "l#" => 672} [0, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 793, "n#" => 680, "p#" => 680, "l#" => 681} [0, 8, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 809, "n#" => 695, "p#" => 695, "l#" => 696} [0, 14, 8, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 972, "n#" => 706, "p#" => 706, "l#" => 707} [0, 10, 14, 8, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 996, "n#" => 785, "p#" => 785, "l#" => 786} [0, 78, 10, 14, 8, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 1024, "n#" => 797, "p#" => 797, "l#" => 798} [0, 11, 78, 10, 14, 8, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 1050, "n#" => 967, "p#" => 967, "l#" => 968} [0, 169, 11, 78, 10, 14, 8, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 1051, "n#" => 981, "p#" => 981, "l#" => 982} [0, 13, 169, 11, 78, 10, 14, 8, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5] {"x" => 1062, "n#" => 1014, "p#" => 1014, "l#" => 1015} [0, 32, 13, 169, 11, 78, 10, 14, 8, 52, 20, 17, 0, 82, 29, 15, 39, 15, 16, 1, 40, 3, 51, 16, 8, 7, 4, 24, 50, 30, 8, 39, 1, 52, 15, 5]