A toy model of tool learning

September 2011 · 4 minute read

Over the summer, I did a number of ad hoc jobs involving new technologies and tools that I hadn’t really touched before, including JavaScript, SQL, various Java tools, web application deployment, and specific code generators.

Since I had relatively little background knowledge of most of the above, accomplishing tasks with them necessarily involved a condensed cycle of learning and then applying that knowledge to the problem at hand. I repeated this process a number of times, refining it a little along the way. Sometime after those jobs ended, I noticed a pattern in the intertwined process of learning and using tools to accomplish goals, one that ties everything together with basic AI concepts.

Basically, my toy model for tool learning involves two phases:

  1. Learning about what kind of things can be done using a given tool and how, and
  2. Considering different combinations of learned actions to further one’s goals (this can be modelled as an informed traversal of a search space).
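The two phases can be sketched in a few lines of Python. Everything below is my own illustration, not anything from a real editor: the ‘documentation’ and the expected values are invented.

```python
# Toy sketch of the two-phase tool-learning model.
# All action names and expected values are made-up assumptions.

def learn_actions(documentation):
    """Phase 1: learn which actions the tool offers, and roughly
    how valuable each is expected to be."""
    return dict(documentation)

def plan(actions, steps=2):
    """Phase 2: an informed (greedy best-first) choice among the
    learned actions, highest expected value first."""
    return sorted(actions, key=actions.get, reverse=True)[:steps]

docs = {
    "type characters": 1.0,   # the Notepad baseline
    "copy/paste": 1.5,
    "regex replace": 3.0,     # only an option once we've learned it
}
print(plan(learn_actions(docs)))  # -> ['regex replace', 'copy/paste']
```

In reality phase 1 is reading documentation and experimenting, and phase 2 interleaves with it, but the greedy selection by expected value is the core of the model.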

To illustrate the model, I’ll choose a problem that involves learning a tool and using it to accomplish some objective. The first example that comes to mind involves text editing, so I’m going to use that.

Suppose, then, that we want to write some sort of structured text in some programming language. There is necessarily a degree of duplication in the expected final result, but it would take far more effort to write a code generator for the same task.

So, taking a text editor (say, Emacs) as our tool, we have a number of general operations we might want to use it for, such as typing text in directly, copying and pasting existing structures, search-and-replace with regular expressions, and keyboard macros.

Since this is a toy model of tool learning, I will assume we know little about text editing beyond the very basics (think Notepad), but that we do have a sense of how efficient different ways of doing things are (this could be estimated by the expected amount of duplication at various levels of the code/document).

Now we have a rough idea that something like Emacs would be a good fit for our structured text-editing task. We don’t know anything about Emacs yet, but we have some idea of the possible improvements over ‘baseline’ text editing (typing a character at a time and copying/pasting with the mouse).

We can view this problem-solving process as an ‘anytime’ search within a space defined by the editor’s text-editing operations and combinations thereof. We can choose between the various operations using a heuristic such as how much duplicated effort a combination of operations would likely involve. The search-space model applies at different levels of abstraction, from the document level down to the line and word level.
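As a concrete (and deliberately crude) stand-in for that heuristic, one could estimate duplicated effort as the fraction of repeated lines in the current draft. The function below is my own invention, purely for illustration:

```python
# A crude duplication heuristic: the fraction of non-blank lines in a
# draft that repeat an earlier line. A high score suggests that learning
# a bulk operation (regex replace, keyboard macros) would pay off.
def duplication(text):
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return 0.0
    return 1 - len(set(lines)) / len(lines)

draft = "row(1)\nrow(2)\nrow(2)\nrow(3)\n"
print(duplication(draft))  # -> 0.25: one of four lines is a repeat
```

The same measure could be computed per document, per region, or per line, matching the model’s different levels of abstraction.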

Below, I have created three ‘snapshots’ that hopefully model what goes on in the mind of an ‘average’ problem solver. The numbers in brackets roughly indicate the expected value of applying each operation, whether object-level or meta-level (learning about the domain/tool itself). I have omitted concrete examples of such editing tasks since I am unable to simplify them sufficiently; as a result, none of the graphs depict any situation in particular, but they should reflect a generic editing task as the document/code progresses from nothing to its final version through a series of refactorings.
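One such snapshot could be captured, very roughly, as a table of expected values. The actions and numbers below are invented, not read off the actual graphs:

```python
# One hypothetical snapshot of the solver's options. Object-level
# actions edit the document directly; the meta-level action improves
# future editing by learning more about the tool. Values are invented.
snapshot = {
    "write more text by hand": 1.0,       # object-level
    "copy/paste an existing block": 1.8,  # object-level
    "read up on regex replace": 2.5,      # meta-level
}

# The solver applies whichever action currently looks most valuable;
# early on, that is often the meta-level 'learn' action.
best = max(snapshot, key=snapshot.get)
print(best)  # -> read up on regex replace
```

After an action is applied, the expected values shift, which is what distinguishes one snapshot from the next.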

In the beginning, our abstract search space for some low- or mid-level editing task might look like this: we need to start by writing something, but we have a hunch that we can do better after that by learning more about the editor.

After we’ve created the first bits of code without any fancy features, we shift our focus to the editor’s documentation and learn about things like regular expressions, in order to better adapt existing structures to new purposes (through duplication).

Another stage that occurs in many code-editing tasks is when you’re more or less finished functionality-wise, but soon discover that much of the code needs to be refactored, edited and tweaked because your model of the system deviated from how things actually work. You have a hunch that more editing power would avoid much of the waste here, and soon discover keyboard macros, leading to the final snapshot of the problem-solving process:

And there it is: my toy model of tool learning. Hopefully the examples convey something of what I had in mind. Though I probably did not emphasise it enough above, what really interests me about this model is that we do appear to execute some sort of informed search when using tools like this, going only as deep as needed and regressing to ‘brute force’ (uninformed search) attempts to try everything only when all else fails.
