[draft] an inductive proof that SI isn't useful

Date: 2020-09-07

written 2020-09-04 I think.

I don’t know if I’ll ever finish this or think more about it. it’s an unfinished draft of an attempt to use mathematical induction to show you can’t choose useful alphabets with SI.

There’s some context about this in tutoring max 37.

the goal of this is to inductively show that there are some alphabets (sets of instructions for turing machines) which SI can’t use in a productive way.

maybe we’ll show all meaningful alphabets are these types of alphabets.

let’s take a trivial case for a Solomonoff Induction machine. it’s taken that we have a free choice of turing machine, so we’ll choose one where the bit string 1100 corresponds to a program that outputs 1 and halts. we need to generate input data, and to avoid complexity we use a deterministic evidence generation pattern. the meat of the evidence generator is: it outputs 1 if the data matched our theory, and 0 if it didn’t. it’s trivial to see that, for the case where we have a correct idea (so the evidence is 1), the program 1100 is optimal, and SI will rightly find it.
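here’s a minimal sketch of that base case in Python. it is not real SI: the lookup table `TOY_MACHINE`, the helper names, and the rule that every other bit string produces no output are all assumptions made up for illustration. the only thing taken from the text is that 1100 is a program that outputs 1 and halts, and that evidence is 1 when the data matched our theory.

```python
from itertools import product

# Hypothetical toy machine. The one fact taken from the post is that the bit
# string 1100 is a program that outputs 1 and halts. Treating every other
# program as producing no output is a simplification for this sketch.
TOY_MACHINE = {"1100": "1"}


def run(program: str) -> str:
    """Return the output of `program` on the toy machine ("" if none)."""
    return TOY_MACHINE.get(program, "")


def evidence(theory_matched: bool) -> str:
    """Deterministic evidence generator: 1 if the data matched, else 0."""
    return "1" if theory_matched else "0"


def shortest_matching_program(observed: str, max_len: int = 6):
    """Try bit strings shortest-first; return the first that reproduces
    `observed`, or None if nothing up to `max_len` bits does."""
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            program = "".join(bits)
            if run(program) == observed:
                return program
    return None


best = shortest_matching_program(evidence(True))
print(best)
```

the search "succeeds", but only because the meaning of 1100 was decided when the machine was chosen.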

if the program can read the data and generate it on the fly (perhaps generating probabilities of the next bit), that’s okay, it still works, we’ll just choose a program that e.g. throws away any extra input data.

this is the base case.

the issue with the base case is that the alphabet (and thus both the program and the evidence data) contains a sort of digest of some preimage. The alphabet isn’t useful because all the useful stuff was done already and baked into the alphabet. The ideas that are baked into an alphabet are how we know what the program means; there’s a way to convert the program into other forms like words.

these sorts of alphabets (with baked in knowledge) aren’t useful for SI to use; it can’t use them to create knowledge. we can only use them to learn stuff we already knew (or could have ~easily found out by doing like statistical analysis or something).

we should be able to get a useful program by changing the alphabet. Particularly, we can replace the symbol for “the experimental data matched our theory” with some combination of other, more foundational symbols. This can be done in slow steps, but the principle works over bigger-looking jumps too. Maybe we choose symbols so we can represent data generated by the idea “we measured photons coming from XYZ coordinates and got [data…]”. if we choose an alphabet that bakes in ideas about photon generation activities (like what might happen across the room or in other stars), then the work of figuring out what caused the photons is already done! besides, we know we can’t choose alphabets that bake in that sort of knowledge. Right now though, there’s lots of “baked in” knowledge, like what photons are, how they interact with stuff, how they get created, what coordinates are, what a room is, etc. It’s not that SI wouldn’t find a program, it’s that we wouldn’t ever learn anything from it we didn’t already know before starting.
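a rough sketch of that symbol replacement step, in Python. everything here is hypothetical: the symbol names (`PHOTON`, `COORDS`, `READING`), the rewrite rule, and the two meaning tables are invented for illustration and aren’t from any real SI setup. the point it tries to show is that after the swap the program is longer, but every symbol still only means something via a table we wrote by hand.

```python
# Hypothetical alphabets: each maps a symbol to the human-supplied idea
# baked into it. None of these symbol names come from a real system.
HIGH_LEVEL = {
    "1": "the experimental data matched our theory",
    "0": "the experimental data didn't match our theory",
}

LOW_LEVEL = {
    "PHOTON": "we measured photons",
    "COORDS": "coming from XYZ coordinates",
    "READING": "and got [data…]",
}

# Assumed rewrite rule: replace one high-level symbol with a combination of
# more foundational symbols.
EXPANSION = {"1": ["PHOTON", "COORDS", "READING"]}


def expand(program, rules):
    """Rewrite each symbol using `rules`; symbols without a rule pass through."""
    out = []
    for symbol in program:
        out.extend(rules.get(symbol, [symbol]))
    return out


def baked_in_ideas(program, alphabet):
    """Look up the meaning of every symbol. A KeyError here would mean we
    have no way to say what that part of the program means."""
    return [alphabet[symbol] for symbol in program]


before = ["1"]
after = expand(before, EXPANSION)

print(baked_in_ideas(before, HIGH_LEVEL))
print(baked_in_ideas(after, LOW_LEVEL))
```

in this sketch the expanded program relies on three hand-written meanings instead of one, so the creative work has moved into `LOW_LEVEL` rather than gone away.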

in general, any time we try to modify the alphabet we’re using by replacing some symbols with others (usually more of them), if the new symbols contain information in terms of constituent ideas at a theoretical level, then we won’t get a useful program out. we will have already done the creative stuff via the design and choice of the alphabet.

However, without such an alphabet we don’t have a straightforward way of figuring out what the program does. There’s no way for us to know anything from it. E.g. without a specific alphabet for predicting how light interacts with mirrors, i.e. without the ideas of ‘angle’, ‘incidence’, ‘reflect’ baked in, SI & variants wouldn’t ever produce the law of reflection in a straightforward program.

(note: SI+ can produce some not-straightforward programs via jumps in the usefulness of certain abstractions which I’ll mention in a bit)
