the last installment found a way to construct large “ufos” in the postdetermined region and in crisp clinical terms refuted some longrunning hypotheses, in more informal pov “throwing a wrench in the works” of big overall proof strategies painstakingly built up over years and showing both limitations and strengths of the overall empirical/ datamining approach. however, maybe there is a silver lining; there was a larger hypothesis that maybe escapes largely unscathed, ie in more dramatic terms can yet be rescued amidst some of the substantial wreckage/ smoking ruins (aka “easy come, easy go”™). a recent experiment found that measuring “slope correlation” in the postdetermined region gives high adherence to linearity (ie close to 1) even as iterates get larger. this was discovered with bitwise optimization and never did convert that finding to the stronger/ more rigorous hybrid optimization and it was in the back of my mind to do that.
looked over the hybrid code and had the urge to refactor it. the bit vector initialization was serviceable but really funky. also had an idea to extend vectors at a few bit positions at a time, not just at the msbs. also came up with the idea of a corresponding/ symmetric shortening operator.
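a rough sketch of what an extend/ shorten operator pair could look like, not the actual hybrid code. the function names, the string bit representation, and the choice to insert/ delete `k` adjacent bits at one random position are all assumptions for illustration. it also does not try to keep the msb set to 1.

```python
import random

def extend(bits, k, rng=random):
    """Insert k random bits at one random position (hypothetical operator).

    The position can be anywhere in the string, not only at the msb end.
    """
    pos = rng.randrange(len(bits) + 1)
    new = "".join(rng.choice("01") for _ in range(k))
    return bits[:pos] + new + bits[pos:]

def shorten(bits, k, rng=random):
    """Symmetric operator: delete k adjacent bits at one random position."""
    if len(bits) <= k:
        return bits  # too short to shorten, leave unchanged
    pos = rng.randrange(len(bits) - k + 1)
    return bits[:pos] + bits[pos + k:]
```

applying `shorten` after `extend` with the same `k` returns a string of the original length, though generally not the original bits.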
the basic experiment here was trying to minimize slope correlation ‘sc’ for larger iterates. however the naive code simply found a significantly low ‘sc’ for some small iterates and then didnt get past those smaller iterates, the search got stuck so to speak. while this finding is consistent with the hypothesis my real question was what the trend for ‘sc’ was for higher iterates. so then bit the bullet and did multiparameter optimization for this hybrid code, something that is novel wrt prior code/ experiments also. the multiparameter optimization is based on gaussian normalization recalculated every 100 samples. the optimization is to push up ‘nw’ the iterate bit width and push down ‘sc’ the slope correlation. the code reaches very high iterate sizes ~4.5K and yet cant push down ‘sc’ more than ~0.9684. the multioptimization means that there are smaller iterates with smaller slope correlations, but the algorithm moves past those as pushed to search higher iterates. so this general observation/ trend still seems “robust”.
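a minimal sketch of the multiparameter scoring idea, assuming “gaussian normalization” means converting each parameter to z-scores (subtract mean, divide by standard deviation) over the current sample set. the function names and the equal weighting of the two parameters are assumptions, not taken from the actual code.

```python
from statistics import mean, pstdev

def zscores(values):
    """Normalize values to zero mean and unit standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0] * len(values)  # all values equal, nothing to rank
    return [(v - m) / s for v in values]

def combined_fitness(samples):
    """samples: list of (nw, sc) pairs. Higher score is better.

    Rewards large bit width nw and penalizes large slope correlation sc,
    with both put on a comparable scale via z-scores.
    """
    z_nw = zscores([nw for nw, _ in samples])
    z_sc = zscores([sc for _, sc in samples])
    return [a - b for a, b in zip(z_nw, z_sc)]
```

in a search loop the mean/ deviation would be refreshed periodically (every 100 samples in the experiment) by calling `combined_fitness` again on the current population.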
the ufos found in the postdetermined region earlier have been on my mind and they seem to be aptly named because they are mysterious. its a strange emergent pattern that doesnt seem to fit into a lot of other analysis. most of the findings/ “momentum” of analysis is that the 0/1-runs tend to crunch in most situations. the ufos found were not numerous so far and the few found took a lot of processing to discover, and also as mentioned there may be some flickering circumstantial evidence that they could be limited in some way eg maybe to lower iterates, although a lot of other experience with the problem would tend to push against that, ie nearly any pattern seen at small scales seems to be seen at larger ones; actually even more than that, that idea/ general theme is a big part of the underlying motivating ideology of computational investigation techniques.
an idea with the last exercise was hopefully to be able to synthesize ufo type patterns. but some of the findings of that experiment were that after picking a local ufo pattern at random it seems to be hard to create a long prior sequence to it, although it also suggested thats not limited to ufos. that code could typically only find pre-iterations in small counts. it was searching for special pre-iterations that are smaller than the final/ target/ pattern iteration bit width, which is maybe not the same as ufos in the postdetermined region, but maybe the same difficulty still holds; it would be nice to better understand those interrelationships. “further TBD”
this is an idea that occurred to me, building on the last theme, to look into bit patterns connected to the subsequences of the parity sequence. for a glide there is a predetermined and postdetermined parity sequence. this looks at ‘w’ width bit windows over the iterate sequences, here w=10, for a single long trajectory. for the climb it is clear that the lsbs (in the iterate/ window) tend toward 1 (odd). what about adjacent bits? this somewhat ingeniously shows they are indeed also tending toward 1s in 1st graph, ie the familiar 1-lsb runs/ triangles. the visualization is a tree diagram where the branches are 0/1 with 1 the upper and 0 the lower branch, and are easier to draw than one might guess; the code uses a “cumulate coordinate halving” idea. the left side is the climb/ predetermined range and the right side is the decline/ postdetermined range. color coding is by iterate # in the ranges. the 2nd graph is the windows over the msbs instead of the lsbs. close study shows some difference in the “ending” comb density from one side to the other, with the left side a little denser than the right, implying some kind of different distribution. the color distribution seems fairly even/ uniform.
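one way to read the “cumulate coordinate halving” idea, as a sketch: walk the window bit by bit, move up for 1 and down for 0, and halve the vertical step at each level so the branches never collide. the function names and the exact step sizes are assumptions, and the plotting itself is left out.

```python
def lsb_window(n, w=10):
    """The w lowest bits of n as a list of ints, lsb first."""
    return [(n >> i) & 1 for i in range(w)]

def window_path(bits):
    """y-coordinates of the tree branch traced by one bit window.

    Starts at the root y=0; bit 1 takes the upper branch, bit 0 the lower.
    The step halves at each level, so distinct windows end at distinct y.
    """
    y, step, path = 0.0, 0.5, [0.0]
    for b in bits:
        y += step if b else -step
        path.append(y)
        step /= 2
    return path
```

plotting `window_path(lsb_window(x))` against depth 0..w for every iterate `x`, colored by iterate index, would give a tree of the kind described. a window of all 1s hugs the top edge of the diagram.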
ok, this is a sort of shocking discovery at the new year. it is almost obvious to look at properties of iterates wrt remainder by division. there was a years-ago look into factorizing. also there are years-old looks at the bit (base2) diagrams, and this was adapted to do base3 diagrams, but didnt notice anything unusual and left them unwritten-up. did not notice this until just now, but it might have been in those visual diagrams. was looking at some older glides associated with mix30d. was curious about the mod3 behavior intra glide pre/ post peak, actually wondering about existence of repeated 3 factors. was quite surprised to find this major discrepancy/ differentiator. randomly sampling/ spot checking, these glide iterates were never divisible by 3! also, for the other two mod3 values, the distributions are different over the iterates pre and post peak. for climb the mod3=2 case is about 3x the mod3=1 case. for descent the ratio is closer to about 1.5. this is demonstrated in this simple code.
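a small standalone sketch of the mod3 tally (not the original code), assuming the compressed map (3n+1)/2 for odd n and n/2 for even n, and taking a glide as the iterates up to the first one below the seed. the non-divisibility has a short explanation: 3n+1 always leaves remainder 1 mod 3, and halving can never introduce a factor of 3, so after the first odd step no iterate is a multiple of 3.

```python
from collections import Counter

def step(n):
    """One step of the compressed collatz map (assumed variant)."""
    return (3 * n + 1) // 2 if n % 2 else n // 2

def glide(n):
    """Iterates from seed n > 1 until the first one falls below n."""
    seq = [n]
    while len(seq) == 1 or seq[-1] >= n:
        seq.append(step(seq[-1]))
    return seq

def mod3_counts(n):
    """Counters of iterate mod 3 for the climb and the descent of a glide.

    The seed is skipped since it may itself be divisible by 3.
    """
    seq = glide(n)[1:]
    peak = seq.index(max(seq))
    climb = Counter(x % 3 for x in seq[:peak + 1])
    descent = Counter(x % 3 for x in seq[peak + 1:])
    return climb, descent
```

`mod3_counts(27)` returns two counters keyed by remainder. the ratios quoted above came from the glides in the original experiment and will vary by seed.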
the last idea from 10/2018 was bouncing around in my thinking some; it left open a question about “degree of monotonicity”. it shows that the max 0/1 runs seem to be almost monotonically decreasing even in a climb/ glide. havent really looked at the monotonicity of the max 0/1 runs all that directly with the nonmono run length measurement. ofc all the recent analysis is related to monotonicity, but not exactly tied to nonmono run lengths. there is some relation. at this point am interested in finding long nonmono run lengths in the max 0/1 runs sequence. last months genetic algorithm code looked into this but for a fixed (“1d”) bit width. what would that look like for variable bit width (2d)?
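for concreteness, a sketch of one plausible nonmono run length measurement, assuming it means the longest stretch of consecutive entries that fail to set a new running minimum. this definition and the function name are assumptions here. the input would be the max 0/1 run of each iterate along a trajectory.

```python
def nonmono_run_length(seq):
    """Longest stretch of consecutive entries without a new running minimum.

    Returns 0 for a strictly decreasing sequence.
    """
    best = cur = 0
    low = seq[0]
    for x in seq[1:]:
        if x < low:
            low, cur = x, 0      # new minimum, stretch resets
        else:
            cur += 1             # still at or above the running minimum
            best = max(best, cur)
    return best
```

a search for “long nonmono run lengths” would then try to maximize this value over starting iterates.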
the last “2d” genetic algorithm search code was mix32 from 9/2018. the mix series is aptly named because it mixes quite a few ideas across many algorithms. it has single dimensional genetic algorithms, bitwise optimizations, and 2d genetic algorithms (ie both within given bit widths and over multiple bit widths). am going to rename the 2d genetic algorithm here
got all kinds of cool ideas for refactoring/ streamlining working on it. it has some abstraction for all the initialization and combination/ crossover operators. it has a very sophisticated 2d algorithm that is similar to but different from prior algorithms. it has no restrictions on expanding “bit bins” except that the expansion operator does it only by a max of 3 bits. it dynamically analyzes bit bins for fitness, similarly to analyzing fitness within the bins, based on the top performing candidate in each bin, and combination operators work on the top performing bins. it currently throws out the most underperforming bit bin and currently keeps the total bit bins to 50 and the size of each to 50, for max 2500 candidates. the code enforces a minimum of the max 0/1 runs in each iterate, here 6. starting bit size 10. the recent initw bit string initializer code turned out to be crucial after the minimum was added because the other initializers dont seem to create many iterates with sufficiently large max runs.
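the bin bookkeeping described above could be sketched roughly as follows. this is not the actual code: the dict-of-lists layout, the names, and the `admit` function are assumptions, and the fitness value is supplied by the caller. the limits (50 bins, 50 per bin, minimum max run of 6) are the ones quoted in the text.

```python
from itertools import groupby

MAX_BINS, BIN_SIZE, MIN_RUN = 50, 50, 6

def max_run(bits):
    """Length of the longest run of identical bits in a bit string."""
    return max(len(list(g)) for _, g in groupby(bits))

def admit(bins, bits, fitness):
    """Try to add a candidate to its bit-width bin.

    bins maps bit width -> list of (fitness, bits), best first.
    Returns False if the candidate fails the minimum max-run constraint.
    """
    if max_run(bits) < MIN_RUN:
        return False
    members = bins.setdefault(len(bits), [])
    members.append((fitness, bits))
    members.sort(reverse=True)
    del members[BIN_SIZE:]          # keep only the top candidates per bin
    if len(bins) > MAX_BINS:
        # a bin is ranked by its top performer; drop the weakest bin
        worst = min(bins, key=lambda w: bins[w][0][0])
        del bins[worst]
    return True
```

an initializer that rarely produces runs of 6 or more would see most of its candidates rejected by the first check, which fits the remark about initw being crucial.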