Recursive Grid Architecture

Working Picture

A grid contains n percepts arranged in a configuration C_i. The percepts are generated or governed by a process P_i, and the grid exposes an outside state f(C_i).

An observer looks at the grid and acts through a function F(S, C_i), where S is a strategy produced by a learner L. The key move is that the observer is not tied to a single level. A grid can be placed inside a larger grid whose percepts are themselves smaller grids.

At the higher level, the smaller grid is only visible through its outside state f(C_i). Then the same idea repeats.
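
A minimal Python sketch of this picture may help fix the notation. It rests on assumptions that are not in the text above: percepts are either plain numbers or nested grids, f is taken to be a sum (f is left open here), and F simply applies the strategy to whatever the observer can see. The names Grid, outside_state, visible, act, and pick_largest are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

Percept = Union[float, "Grid"]


@dataclass
class Grid:
    """A configuration C_i: an ordered collection of percepts."""
    percepts: List[Percept]


def outside_state(grid: Grid) -> float:
    """f(C_i). Assumed here to be a plain sum; the text leaves f open."""
    total = 0.0
    for p in grid.percepts:
        # A nested grid contributes only through its own outside state.
        total += outside_state(p) if isinstance(p, Grid) else p
    return total


def visible(grid: Grid) -> List[float]:
    """What an observer of `grid` sees: one number per percept."""
    return [outside_state(p) if isinstance(p, Grid) else p for p in grid.percepts]


def act(strategy: Callable[[List[float]], int], grid: Grid) -> int:
    """F(S, C_i): apply strategy S to the visible configuration."""
    return strategy(visible(grid))


def pick_largest(values: List[float]) -> int:
    """A toy strategy S: choose the index of the largest percept."""
    return max(range(len(values)), key=lambda i: values[i])


inner_a = Grid([1.0, 2.0, 3.0])
inner_b = Grid([4.0, 0.5])
outer = Grid([inner_a, inner_b, 5.0])   # percepts are grids or raw values

low_action = act(pick_largest, inner_a)   # S applied to raw percepts
high_action = act(pick_largest, outer)    # the same S applied one level up
```

The observer of outer never inspects the contents of inner_a or inner_b. It only receives their outside states, which is why the same strategy function can be passed unchanged at both levels.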

Recursive Placement

The poetic version is that f(C_i) maps a lower-level configuration to a coordinate in the larger grid. The larger space might be a lattice, a continuous space, or another structured substrate. The lower grid becomes one percept among many at the next scale.

That gives a recursive picture:

Local percepts form a configuration.
The configuration exposes an outside state.
That outside state becomes a percept for a larger configuration.
The same learner-facing problem can appear at many scales.
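
The four steps above can be sketched as a single lifting function that is reused at each scale. This is only an illustration under stated assumptions: the larger space is a one-dimensional lattice, f is the rounded mean of the configuration clamped to the lattice, and the next-level percept in each cell is an occupancy count. None of these choices come from the text, and coordinate, place, and lift are hypothetical names.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence


def coordinate(configuration: Sequence[float], size: int) -> int:
    """f(C_i) as a lattice coordinate: the rounded mean, clamped to [0, size - 1]."""
    return max(0, min(size - 1, round(mean(configuration))))


def place(configurations: List[Sequence[float]], size: int) -> Dict[int, List[int]]:
    """Position each lower grid in the larger lattice by its outside state.

    Returns a map from lattice cell to the indices of the grids that landed there.
    """
    cells: Dict[int, List[int]] = defaultdict(list)
    for index, configuration in enumerate(configurations):
        cells[coordinate(configuration, size)].append(index)
    return dict(cells)


def lift(configurations: List[Sequence[float]], size: int) -> List[float]:
    """Build the next-level configuration: one occupancy count per lattice cell."""
    cells = place(configurations, size)
    return [float(len(cells.get(c, []))) for c in range(size)]


level0 = [[0.0, 0.0], [3.0, 3.0], [2.0, 4.0], [1.0, 1.0]]
level1 = lift(level0, size=4)           # a configuration of the larger grid
level2_coord = coordinate(level1, 4)    # the larger grid's own outside state
```

Because level1 has the same shape as any entry of level0 (a flat sequence of numbers), coordinate can be applied to it without modification, which is the recursion the list describes.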

Why The Symmetry Matters

Certain observations should present an identical learning problem to L even when they occur at different levels. The substrate is the same: a grid, a configuration, an outside state, a strategy, and a learner. If a strategy S is strong at one level, it may be reusable at another.

That is the main promise: massive transfer of learning because the learner keeps encountering the same kind of problem in differently scaled clothing.

A Simpler Cousin

A simpler version is to make f(C_i) the thing to be learned directly. If the percepts are image patches, the operation begins to look like convolution: local percepts combine into a new representation, which is positioned into a larger grid, and so on.
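
As a rough sketch of that convolution-like reading, the code below applies one function f to each non-overlapping patch and places the result at the matching position of a smaller grid, then repeats with the same f. The assumptions are mine: patches do not overlap (a stride equal to the patch size), f is a weighted sum whose weights stand in for learned parameters, and the averaging weights are placeholders rather than trained values. pool and weighted_sum are hypothetical names.

```python
from typing import Callable, List

Image = List[List[float]]


def pool(image: Image, patch: int, f: Callable[[List[float]], float]) -> Image:
    """Apply f to each non-overlapping patch x patch block and place the
    result at the matching position of a smaller grid."""
    rows, cols = len(image) // patch, len(image[0]) // patch
    out: Image = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Flatten the block in row-major order so it lines up with the weights.
            block = [
                image[r * patch + dr][c * patch + dc]
                for dr in range(patch)
                for dc in range(patch)
            ]
            out_row.append(f(block))
        out.append(out_row)
    return out


def weighted_sum(weights: List[float]) -> Callable[[List[float]], float]:
    """A learnable f: one weight per patch position (the weights are the parameters)."""
    return lambda block: sum(w * x for w, x in zip(weights, block))


# Averaging weights, used only as a placeholder for learned values.
f = weighted_sum([0.25, 0.25, 0.25, 0.25])

image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
level1 = pool(image, 2, f)    # 2 x 2 grid of outside states
level2 = pool(level1, 2, f)   # 1 x 1 grid: the same f reused one scale up
```

Sharing f across positions is the usual convolutional weight sharing. Sharing it across levels as well is the extra symmetry this note is interested in.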

But the more interesting version is not just a convolutional network. It is a highly symmetric recursive computation where multiple percepts can observe the same process, and transfer can happen bottom-up, top-down, or diagonally between arbitrary grids.

Beyond Backprop

If the base operations are differentiable, one could train the system with backpropagation. But the symmetry suggests that ordinary backprop is probably not the only interesting algorithm.

Assume access to H(G_i), the history of configurations for a grid G_i. If two grid histories are similar, their future trajectories could be coupled. One grid might be copied into another, or one computation might stand in for another, saving work. This opens a route toward learning algorithms that exploit repeated structure rather than treating every subproblem as independent.
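
One way to make "one computation might stand in for another" concrete is a cache keyed by history similarity. The sketch below is a toy under assumptions that are not in the text: similarity is the mean absolute difference between equally shaped histories, a fixed tolerance decides when two histories count as the same, and expensive_step is a stand-in for whatever process actually advances a grid. SharedPredictor, distance, and expensive_step are hypothetical names.

```python
from typing import Callable, List, Tuple

Configuration = Tuple[float, ...]
History = List[Configuration]  # H(G_i): a sequence of configurations


def distance(h1: History, h2: History) -> float:
    """Mean absolute difference between two equally shaped histories."""
    diffs = [abs(a - b) for c1, c2 in zip(h1, h2) for a, b in zip(c1, c2)]
    return sum(diffs) / len(diffs)


class SharedPredictor:
    """Reuse one grid's computed next configuration for grids with similar histories."""

    def __init__(self, step: Callable[[History], Configuration], tolerance: float):
        self.step = step
        self.tolerance = tolerance
        self.cache: List[Tuple[History, Configuration]] = []
        self.computed = 0  # how many times the real step was run

    def predict(self, history: History) -> Configuration:
        for seen, result in self.cache:
            if len(seen) == len(history) and distance(seen, history) <= self.tolerance:
                return result  # a similar grid already did this work
        self.computed += 1
        result = self.step(history)
        self.cache.append((list(history), result))
        return result


def expensive_step(history: History) -> Configuration:
    """Stand-in for the real process: extrapolate the last change linearly."""
    prev, last = history[-2], history[-1]
    return tuple(2 * b - a for a, b in zip(prev, last))


predictor = SharedPredictor(expensive_step, tolerance=0.05)
h_a = [(0.0, 0.0), (1.0, 2.0)]
h_b = [(0.0, 0.1), (1.0, 2.0)]   # nearly the same history as h_a
h_c = [(5.0, 5.0), (5.0, 4.0)]   # a clearly different history

next_a = predictor.predict(h_a)
next_b = predictor.predict(h_b)  # served from h_a's result, no new computation
next_c = predictor.predict(h_c)
```

The reuse is approximate: h_b receives the trajectory computed for h_a, not its own. Whether that trade is acceptable depends on how the tolerance relates to the sensitivity of the underlying process, which this sketch does not address.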

Adjacent ideas include fractal networks, task decomposition, and work on adding idealized structure to learning systems, such as a paper on task-based decomposition. The direction I care about is transfer between any two sufficiently similar grids, not only between neighboring layers.