some new ideas. as mused on earlier and a continuing theme, it would seem that “derandomizing” has something to do with resolving collatz. derandomizing is part of some of the most important techniques and proofs in complexity theory eg AKS Primes in P, st connectivity in L, etc.

if an algorithm had a small amount of space (a buffer) to “derandomize” the output as best possible, how would that look? here are two interesting ideas along these lines.

where is this all going? there are certain ideas that continue to haunt me like phantoms. suspect there is some very amazing, vast theory waiting to be uncovered in these areas. imagine “solving” unknown-undecidable problems somewhat akin to the way that algebraic problems are solved, by various transformations. in this case with data transformations, in which all recursive operations are allowed. somehow they would preserve the problem structure but also alter it in such a way that its hidden structure is revealed such that a proof can be constructed out of regularities laid bare.

does that sound a little crazy? admittedly, have been chasing these ephemeral phantoms for close to 3 decades now. oh yeah [borrowing a scary topic from fortnows blog], godel, cantor, turing & post all eventually went more-or-less crazy. 😈

have been updating my quote pg with some new quotes, check it out! & there is an old quote by arthur c clarke that has been running through my head lately.

Any sufficiently advanced technology is indistinguishable from magic.

a solution of the big problems eg collatz problem or P=?NP would indeed seem like magic. but the proofs would look quite foreign and be initially met with great skepticism. and there seems to be a law that the bigger & more significant the problem, the stronger the crank magnetism! so in another sense,

Any sufficiently advanced claim/breakthrough is [at 1

^{st}/superficially] indistinguishable from crackpottery.

oh yeah, and a quote apparently actually originating in old narcotics anonymous literature

Repeating the same thing over and over and expecting different results is the definition of insanity.

so then what about trying *different* stuff? is that the essence of sanity? 💡

this next graph is quite graceful and occurred to me as in some sense seemingly an optimal way of outputting a random function with minimal reshuffling. in fact wonder if it can be proven it in some sense minimizes the output entropy wrt the buffer size available. (right now its a bit of an island, didnt have further ideas on it yet.)

def f(n) c = 0 n2 = n while (n2 >= n) n2 = n2 * 3 + 1 n2 >>= 1 while (n2.even?) c += 1 end return c end def take(l) i = nil 2.times \ { l2 = (0...l.size).select { |x| $up ? (l[x] >= $z2) : (l[x] <= $z2) } if (l2.empty?) then $up = !$up next end i = $up ? l2.min_by { |x| l[x] } : l2.max_by { |x| l[x] } break } $z2 = l.delete_at(i) p($z2) end l = [] j = 0 n = 3 t = 5000 $up = true $z2 = 0 t.times \ { |j| z = f(n) l << z take(l) if (l.size > 100) n += 4 }

*** * ***

this next sequence seems quite promising. its a very simple idea of removing the min value from the current (streaming) buffer and really seems to improve the output order moreso than just about anything tried so far.

def f(n) c = 0 n2 = n while (n2 >= n) n2 = n2 * 3 + 1 n2 >>= 1 while (n2.even?) c += 1 end return c end l = [] j = 0 n = 3 t = 3000 up = true t.times \ { |j| z = f(n) l << z if (l.size > 200) then i = (0...l.size).min_by { |x| l[x] } p(l.delete_at(i)) end n += 4 }

the above graph definitely gives me some great ideas. imagine that every stopping distance is possible over all integers but that short stopping distances are still abundant/frequent among higher integers. how could these be laid out? in this way it seems to have deep connections to the way that primes are irregular yet have short gaps between them no matter how large the numbers [as in the recent twin primes theory] but that larger gaps keep occurring also. of course if one could show a more direct connection, it would be a real breakthrough.

it seems each line has a frequency of occurrences. the bottom lines stay at about the same frequencies. the higher lines start out at a high frequency of occurrences and then taper off, but never fully decrease to zero frequency, only approach it asymptotically, in a sort of “decay process”.

the above graph also reminds me a lot of counting in binary, which few have remarked is a remarkable, somewhat canonical-looking fractal. the lsb digit has the highest frequency of change, the next-lsb has the next highest frequency, and so on.

*** * ***

hmmm, suppose that the horizontal row points were evenly staggered after their 1st occurrence, the plot would look like the following. it was generated by post processing the results and computing average row frequencies.

def f(n) c = 0 n2 = n while (n2 >= n) n2 = n2 * 3 + 1 n2 >>= 1 while (n2.even?) c += 1 end return c end l = [] mn = [] mx = [] c = [] n = 3 t = 3000 t.times \ { |j| z = f(n) n += 4 l << z next if (l.size <= 200) i = (0...l.size).min_by { |x| l[x] } y = l.delete_at(i) mn[y] = j if (mn[y].nil?) mx[y] = j c[y] = 0 if (c[y].nil?) c[y] += 1 } c.each_with_index \ { |y, j| next if (y.nil?) # p([j, mn[j], mx[j], c[j]]) d = (t - mn[j]).to_f / c[j] x = mn[j] while (x <= t) puts([x, j].join("\t")) x += d end }

that led to another rather wild idea that gets to the heart of these exercises. it would seem the goal is to create an identical function to the collatz stopping distances using known recursive algorithms and then prove the two are equivalent. of course this is an undecidable problem in general (for two arbitrary functions) but in a sense all major proofs in mathematics are about overcoming that limitation. [coincidentally/cyber-synchronistically, rjlipton was just blogging about an important new concept/lens for analyzing the prime number theorem, on pretentious functions which “pretend” to be other functions.]

the above graph with averaged gaps seems that it couldnt be all that difficult to reproduce nonrandomly due to its two main characteristics of (1) rows starting at roughly a linearly staggered start (or at least adhering to a nearly-sublinear formula) and (2) the frequency seemingly dropping off roughly linearly also.

so now given roughly similar graphs “pack2” and graph “pack2c” how would one compute a “difference” between them? there is a fairly natural 1-1 mapping between them based on “x” coordinate. every “x” coordinate is roughly unique in each graph (given that “pack2c” is plotted in floats without rounding to nearest integers). so now plot the difference in the index of corresponding points ordered by “x” coordinate (based on ordering points 1-1 by row, row point index in the two graphs)….?

its not totally ordered but its not totally random either…. the blurring is largely due to the shifting average row density that seems to move downward asymptotically toward constants after an initial spike…. (the 1^{st} graph is the general strong linear trend, and the 2^{nd} is the residual after subtracting the linear trend.)

def f(n) c = 0 n2 = n while (n2 >= n) n2 = n2 * 3 + 1 n2 >>= 1 while (n2.even?) c += 1 end return c end l = [] mn = [] mx = [] c = [] n = 3 t = 3000 l2 = [] a = {} t.times \ { |j| z = f(n) n += 4 l << z next if (l.size <= 200) i = (0...l.size).min_by { |x| l[x] } y = l.delete_at(i) l2 << y mn[y] = j if (mn[y].nil?) mx[y] = j c[y] = 0 if (c[y].nil?) a[[y, c[y]]] = j c[y] += 1 } l3 = [] c.each_with_index \ { |y, j| next if (y.nil?) # p([j, mn[j], mx[j], c[j]]) d = (t - mn[j]).to_f / c[j] x = mn[j] i = 0 while (i < y) l3 << [x, [j, i]] x += d i += 1 end } l3.sort! l2.each_with_index \ { |x, i| puts([a[l3[i][1]] - i, a[l3[i][1]]].join("\t")) }

visually its already fairly clear, but to quantify it further anyway, heres a plot of the average gap between points on the “pack2” graph by rows [inversely proportional to point frequency] showing the nearly-linear trend.

def f(n) c = 0 n2 = n while (n2 >= n) n2 = n2 * 3 + 1 n2 >>= 1 while (n2.even?) c += 1 end return c end l = [] mn = [] mx = [] c = [] n = 3 j = 0 loop \ { z = f(n) n += 4 l << z next if (l.size <= 200) i = (0...l.size).min_by { |x| l[x] } y = l.delete_at(i) break if (y == 100) mn[y] = j if (mn[y].nil?) mx[y] = j c[y] = 0 if (c[y].nil?) c[y] += 1 j += 1 $stderr.puts([j, c.size].inspect) if (j % 1000 == 0) } t = j c.each_with_index \ { |y, j| next if (y.nil?) # p([j, mn[j], mx[j], c[j]]) d = (t - mn[j]).to_f / c[j] puts([j, d.round(3), c[j]].join("\t")) }

Pingback: a new pair of collatz ideas & striking/promising new visualizations | Turing Machine

Pingback: collatz reverse tree traversal/ visualization revisited | Turing Machine