went a little while with no idea what to try next, then came up with some simple idea. earlier it was found that nonmonotone run lengths on left/ right side of the glide are unbounded. but then wondered a bit, in a sort of long-shot guess, that maybe this is a bit of trick going on (this is also hopeful in the sense that the transducer framework is very powerful working against “constant-count-or-less” computations, ie TM ID sequences or other basic arithmetic operations such as addition/ multiplication by constant). a later finding was that the div2 operations count compared to initial seed bit width may be quite significant. so maybe it makes sense to look at “left” and “right” of this key point, namely the trajectory index where div2 operations equals initial bit width.
since it seems to be something significant call it the “div2 trajectory index” or just “div2 index”. my 1st question was to look at nonmono run length in the right side of the div2 index wondering if it was bounded. (my general suspicion aka “hypothesis” is that maybe the behavior of the semirandom walk is significantly smoothed out after the div2 index and a lot of the randomness is “left” of it and attributable/ correlated to the initial seed. but also another meshing pov is that this randomness can allow greater flexibility for arbitrary structures eg unbounded nonmono run lengths on either left or right of the trajectory max.)
❗ 😮 for at least this specific measurement (but also supporting the general idea), this indeed turned out to be the case (2nd graph, dual plot, ‘mx’ right side scale), but then noticed something possibly more significant, that just the simple length of the right div2 index side/ “trail” seemed not to be unbounded. so then was just looking at trying to increase right div2 index “trail” length alone. sure enough, it seems to be bounded! this code counts/ optimizes 4 parameters, 3 of them sequence counts. ‘ls’ is glide length, ‘ls2’ is right-of-max length, ‘j’ is max index, and ‘ls3’ is right-of-div2 index trail size. (omitting ‘j’ seemed to lead to a graph with ‘ls’ ≈ ‘ls2’.) 2nd code optimizes ‘ls’, ‘ls3’, ‘mx’. (note for each trajectory mx < ls3.)
(11/17) have had this idea for many months and finally decided to get around to it (worthy of a new filename at least), maybe procrastinating somewhat since the existing search algorithms seem to be very highly performant. the idea is to use a genetic algorithm on the search space in a simple way, where starting seeds are the bit vectors operated on, and theres a simple mutation and crossover operators. mutation is simple, a single bit flip. the crossover operator takes the random-width “left” bits from one parent and “right” bits from another parent. the child may have fewer or more bits depending on the random lengths selected.
this is 100K iterations that runs in about the same time (in contrast to the 10K operations in previous algorithms). it runs faster per iteration because it maximizes the GA candidate array at 500 throwing out lowest rated candidates. run time for this and prior algorithms was ~12m. the GA graph is only showing a subset of the iterations limited to 2000 points spread evenly over the full 100K iteration run (ie one point every 50 iterations). the trend lines are much more noisy but this approach may “in general” get out of local minima effect better (aka “painting into a corner”). the line width on ‘ls3’ is thickened for visibility. the algorithm only finds a max of ~12, not competitive with the prior algorithm that found ~30 (2nd-to-last) and ~50 (last).
my immediate next idea was to smooth out the successive bit width of new candidates (eg limiting it to no more than 1 larger than longest parent) and also have an idea of creating a buffer of new candidate solutions that also might smooth out the traversal/ trendlines more. both ideas hopefully would also improve the final optimization results which quite frankly seem kind of lackluster so far, but nevertheless do add evidence for the inherent hardness of pushing up ‘ls3’ much. 😐
😳 changed the rank logic slightly to add sum-of-abs values instead of sums only, thinking it was correct, but now think it was a mistake wrt current objective that is actually causing algorithm to seek out minimums in the fitness metric as much as maximums. could be useful for some other purpose but was not as intended in this case. its presumably also contributing to the noisiness of the plot, will have to delve further into how much.
(11/18) this is the same code with only the sum-of-abs logic removed again and it does seem to improve the plot trendline coherence/ decrease noisiness. ‘ls3’ reaches ~20 in the middle but then the algorithm fails to increase it as it flatlines/ declines to nothing. in this plot ‘ns’ is log scale on the right because it seems to climb nearly exponentially in the second half of the plot. apparently in the middle there is a sort of phase transition where nonzero ‘ls3’ values become very rare and the algorithm focuses on increasing the other parameters.
as planned this code limits the new candidates to bit length no longer than 1 more than the longest parent. the code was surprisingly tricky to get right. it avoids bias in the sense of picking either left or right sides of the smaller and larger parents. ‘ls3’ max ~30 found again in the middle and looks like increasing ‘ls3’ are scarce(r) after that.
this is the code with a 100 size lookahead buffer. instead of having two different arrays, the lookahead buffer is contained in the regular array and an indicator var ‘a’ indicates whether its the buffer or main array (true, false respectively). the code ran for ~200m and was outputting status info every few minutes, but then seemed to lock up! halted it after 30m of no status info. not exactly sure what happened, think the GA might have painted itself into a corner through loss of genetic diversity? very strange/ hard to explain! and not sure if it can be replicated!
the algorithm succeeded in finding the highest ‘ls3’ value so far of ~80 right before the end. there is a very gradual upslope in ‘ns’ the seed bit width (black line). unfortunately this run tends to suggest that its difficult to make any conclusions about the overall trend in ‘ls3’ because its so spiky even with so much computation behind the data, and some spikes/ outliers are far above other apparent trendlines. note plot statistic orders are rearranged here a little to display better wrt overlap. this graph was generated by filtering out all points with ls3<10.
(11/19) ❓ ❗ ❓ this one ran ~260m overnight and stopped finding new candidates also. ?!? whats going on here? unf its not so simple to figure out, some special/ extra code needs to be added to isolate. its so rare that algorithms lock up that rarely have to code for it. which loop is locking up? there is an inner loop that avoids previously seen candidates and 0 vectors. maybe outputting the candidate solutions at intermittent points will isolate it but one has to be careful where that logic is put (most inner loop is safest) otherwise it might be outside the loop. also its worth trying larger starting seeds. ~50 max ‘ls3’ found in middle. the long run also shows more of an upslope trend in ‘ns’. (am also wondering how much effect the lookahead buffer has, wonder if it is discernable when switched on vs off.)
(11/21) the surprising-at-the-time results starting out this blog revealed that the nonmono run length story is not so simple as some earlier results that may have been misleadingly interpreted. it looks like theres more to the story and nonmono run length may be “better behaved” than unbounded. how else to look at this, whats another angle?
this idea is to look at the constant-length trail end of longer trajectories and see if the nonmono run length can be maxed out within that region. it was just an idea, but it turns out to have an apparently interesting answer. the idea is also related to some earlier evidence that maybe the earlier parts of the trajectory are more “random” than the later parts. there are several elements to that evidence, including the older concept of the “core”. for this code ‘ls3’ is the count of the trail end of the trajectory limited to 50 max. the code does indeed have trouble finding ‘mx’ (yellow) that are maximized in that region, and the max ‘mx’ fall off even as other metrics increase. this code attempts to maximize ‘ls’, ‘j’, ‘ls2’, ‘mx’. this is back to using the single-bit advance optimization logic which is generally such a great workhorse and does not seem to be unreliable/ deceiving so far.
(11/22) this is a minor modification that does 20K iterations and limits the lookahead array to 1000 elements by deleting the least-ranked one after reaching 1000, and uses a 100 max trail size instead of 50. however, results are very similar, ie ‘mx’ increases to the 100 limit and then falls/ drops off, suggesting that maybe this is somehow due to the optimization algorithm painting into a corner. the delete logic is slightly tricky because the order of the least and most-ranked items in the array can be “either way” and the larger is deleted first so that offsetting of the 2nd-to-delete index is prevented in the opposite case.
on further analysis the prior results seem ok/ correct but maybe a more natural way to explore this constraint/ bound is to look at nonmono run length in the declining part of the glide only, ie after its max index, and not over the whole glide. by definition, essentially the nonmono run length of the glide is nearly its entire span. so prior results somewhat mix this up because its not so clear when the declining part of the glide is larger than the max trail size. that simple change gives more understandable/ intuitive/ clear results. here then it looks like nonmono run length of the final 100 iterates of trajectories are bounded by ~50, again with max trail size 100. ‘mx’ is plotted in black instead of yellow for better visibility.
(11/23) ❗ a 2nd run gave similar results (didnt save the graph, now wishing differently!) but a 3rd run gives much different results! the max index ‘j’ flatlined and ‘mx’ increased in an apparently unbounded way only limited by max trail size 100.
a natural idea is to look at a series for different max trail sizes labelled ‘s’ in the following code, in this case [20,40,60,80,100], dashing the hopes of a bound and showing that earlier findings of one were not an actual property, and that ‘mx’ basically approaches ‘s’ in all cases, although there is some nuance such as falling back (s=100,s=40) and an intermediate plateau (s=80). am wondering if this has something to do with ‘j’ flatlining at low value which on closer inspection happens in all the runs and not sure what those trajectories could even look like! ❓ 😦
(11/25) figured might as well try to hook up the same code to the genetic optimizer and see how it does/ goes. here are two very long runs. there is a new variable ‘m’ that determines how aggressively the bit vector crossover tries to add bits larger than one of the parents which will influence (increase) the slope of seed bit sizes for higher values (red), here m=5. it was added because earlier runs with m=1 did not create much of an upslope. but the upslope is also inherently limited by finding new candidates that exceed prior ones in the ranking evaluation function (which here consists of all parameters). the 1st run is ~9.5hr, s=50, max mx=~45. the 2nd was ~11hr s=100, max mx=~70. in both graphs, a few million points were generated and the graphed points were filtered with mx>~10 (iirc) and mx>30 respectively to get plotted points down to a few thousand.
the results are not so easy to interpret. it appears that ‘mx’ possibly increases without bound, slower than ‘ls2’ (green), but large values seem very spiky/ rare. at least some trends are apparent in ‘ns’, ‘ls2’, but values like ‘ls3’ which (maybe) increase without bound but are rare tend to push the empirical approaches to their limits (so to speak), in the sense that trajectories with “weird” or “exotic” properties exist but maybe are very difficult to find with optimization/ search/ gradient ascent algorithms (which are all essentially “incremental” in character)… 😐
(12/2) here is a riff on the earlier work and some improved postprocessing. it makes more sense to create a simple metric ‘ns2’ that is the count of div2 operations in the glide minus the seed bit size. this is the same as the earlier metric for positive values but also generalizes to negative values.
the other improvement is in filtering logic. had been using a primitive bash+awk script to filter out values. but then realized this needed some algorithmic treatment. was hand-choosing/adjusting the cutoff threshhold to limit the # of plot output lines to ~2000. but for this recent graph, that did not stagger the line samples, because most of the skipped lines (points) were to the left and more of the matching lines were to the right, leaving a kind of unaesthetic gap in the plot.
so this postprocessing/ filtering code is written in ruby and is a nice algorithmic approach as follows. it chooses every nth point to get about 1000 points from a staggered sample. then another ~1000 points are chosen such that a threshhold on the particular column of interest is automatically adjusted, adding to total ~2000 lines/ points. this leads to a nice coherent plot and a generalized visualization strategy for somewhat difficult data that has outliers (aka “is spiky”).
the results are even more compelling. ‘ns2’ the difference between div2 count ‘c’ (red) and seed bit size ‘ns’ (green) maxes out at ~100 in the middle of the plot and then clearly declines. its peak values are spiky but the decline is still quite evident. the optimizer is including all the parameters. the plot code is rather involved but oh well. (also it looks like wordpress finally fixed their editor to not mess up code snippets with html reformatting. hallelujah! 😀 )
plot 'out2.txt' using (column('ln')):(column('c')) title 'c','' using (c olumn('ln')):(column('ns')) title 'ns', '' using (column('ln')):(column('ls')) ti tle 'ls', '' using (column('ln')):(column('ns2')) with impulse title 'ns2',0
was wondering about the distribution of nonmono run lengths inside of glides. maybe need to develop this further for more insight (eg color coding, histograms etc) but here is some early attempt at visualization. this generates 2000 glides using the same logic and plots the nonmono run lengths as a percentage of the glide position and length. it seems to show essentially that long (nearly maximal) nonmono run lengths occur at any location in the glide but maybe “density” of them is smaller to the left. so far it is not easy to figure out how to deal visually/ mathematically with the different length trajectories wrt this analysis. the overplotting aspect is also possibly misleading etc (long bars masking short bars). there seems to be some early incline and some late dropoff at ~0.9 that could use a further look.