Pay Attention

The interplay of Reasoning’s Attention and the emphasis Recognition surfaces in Context is fertile ground for research.

[This post is a bit more speculative than most here. Comments and Discussion are welcome! We’ll return to our themes in the next post.]

It’s reasonably well known that System 2[1] – Reasoning in our architecture – tends to act like a single-threaded microprocessor, working on one thing at a time.  Said another way, Reasoning pays Attention to one thing at a time.  That Attention is a precious resource… limited and costly (in biological effort or processor cycles).  Therefore, managing Attention is an important task.

The task at hand for Reasoning may not be constrained by the reality of the moment. When free of immediate physical needs, Reasoning may work on planning or conceptual tasks, for instance (or Bitcoin mining, for GPU systems).  When Reasoning is constrained by reality, that reality is conveyed to it as Context from Recognition.

As discussed in A Bit of Context, biological systems tend to convey Context hierarchically.  This is a deeply evolved survival skill: it’s better to react immediately to something big moving toward you rapidly than to wait to recognize whether it’s a car, truck, elephant or tiger!  In fact, we see that Recognition not only conveys things hierarchically, but also has interrupts – in the microprocussar sense – to change the focus of Attention.


That got your attention, right?  In fact, it’s entirely likely that you did not even finish reading the last sentence of the last paragraph before your eyes were drawn to that text and your Attention shifted.  (Did you notice the misspelling in that last sentence? Go back and read it carefully.)

So, in biology, we see an almost adversarial relationship between what System 2 wants to think about (i.e., literally pay Attention to, which, because System 2 is lazy, could be “nothing”) and what System 1 perceives as important.  We see the same in modern operating systems, with a scheduler for high-level processes (including an idle process) and interrupts for device drivers that need servicing… that need attention.  What can we learn from this for AI systems?
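To make the operating-system analogy concrete, here is a minimal Python sketch of a single-threaded Reasoning loop that works on its own background task (or idles) until Recognition raises an interrupt. Everything in it is an illustrative assumption rather than part of any real system: the names `Reasoning`, `Interrupt`, and `raise_interrupt`, and the convention that a lower priority number means more urgent.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class Interrupt:
    """A signal from Recognition that demands Attention (hypothetical type)."""
    priority: int                       # assumption: lower number = more urgent
    label: str = field(compare=False)   # not used when ordering interrupts


class Reasoning:
    """Single-threaded worker: attends to exactly one thing per step."""

    def __init__(self, background_task: Optional[str] = None) -> None:
        # What Reasoning "wants" to think about; None models the lazy idle case.
        self.background_task = background_task
        self._pending: List[Interrupt] = []

    def raise_interrupt(self, interrupt: Interrupt) -> None:
        """Called by Recognition; queues the interrupt by urgency."""
        heapq.heappush(self._pending, interrupt)

    def step(self) -> str:
        """Return the one thing Attention is on during this step."""
        if self._pending:
            # The most urgent pending interrupt preempts the background task.
            return heapq.heappop(self._pending).label
        return self.background_task or "idle"


def run_demo() -> List[str]:
    reasoning = Reasoning(background_task="plan vacation")
    trace = [reasoning.step()]
    reasoning.raise_interrupt(Interrupt(5, "phone buzz"))
    reasoning.raise_interrupt(Interrupt(0, "large object approaching"))
    trace += [reasoning.step() for _ in range(3)]
    return trace


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the "adversarial" relationship shows up as a simple rule: whatever Recognition flags always wins over what Reasoning would rather do. A real design would presumably need something subtler, which is part of what the questions below are asking.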

That’s a good question, and this author is neither a researcher nor skilled enough in the technology to answer it. However, the question points to a number of interesting lines of inquiry:

  • Adversarial training schemes, such as Generative Adversarial Networks (GANs) or the self-play used to learn Go, often work on the problem itself. What if the “problem” is instead to optimize Attention? Remember, Attention is expensive, so optimizing it under the constraints of limited capacity and proper prioritization of Context (“tiger!”) is interesting… especially if the optimization needs to be continual rather than static. How can we learn what to pay Attention to?
  • Relating to the above, what might be learned about the hierarchy of Context? Specifically, how do different hierarchical representations lead to better optimizations of Attention?[2] Wouldn’t it be interesting if the answer turns out to closely match Maslow in some way?
  • What might be learned about how to prioritize information from Context? For a given signal, you might want an interrupt, or you might want Reasoning to poll the status, or (for at least some things) you might want to omit it completely unless it changes.  This is clearly important, as Evolution has built the basics deeply into animal systems.
  • Relating to the above, can anything be learned about states of Attention to Context?
    • Distracted (Reasoning is working on other things, conceptual tasks, or nothing)
    • Polling (“Keeping an eye on” some aspect of Context)
    • Active (“Prick up your ears” to focus Attention, perhaps with uncertain Context)
      • Active Attention can be requested by Reasoning in conditions where it can’t resolve inconsistencies in the Context… when “something does not feel right.” See Star Trek for example.
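
The three delivery options mentioned in the list above (interrupt, poll, or omit unless changed) can be sketched as a small router sitting between Recognition and Reasoning. This is only an illustration under stated assumptions: `Delivery`, `ContextRouter`, and the signal names are all hypothetical, and the per-signal policy is hand-written here, whereas the open research question is how such a policy might be learned.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Delivery(Enum):
    INTERRUPT = "interrupt"   # pushed to Reasoning on every observation
    POLL = "poll"             # stored until Reasoning asks for it
    ON_CHANGE = "on_change"   # omitted unless the value changes


class ContextRouter:
    """Routes signals from Recognition according to a per-signal policy."""

    def __init__(self, policy: Dict[str, Delivery]) -> None:
        self.policy = policy
        self.latest: Dict[str, object] = {}
        self.interrupts: List[Tuple[str, object]] = []

    def observe(self, signal: str, value: object) -> None:
        # Assumption: signals without a policy default to POLL.
        mode = self.policy.get(signal, Delivery.POLL)
        # The first observation only sets a baseline; it is not a "change".
        changed = signal in self.latest and self.latest[signal] != value
        self.latest[signal] = value
        if mode is Delivery.INTERRUPT or (mode is Delivery.ON_CHANGE and changed):
            self.interrupts.append((signal, value))

    def poll(self, signal: str) -> Optional[object]:
        """Reasoning 'keeps an eye on' a signal by asking for its latest value."""
        return self.latest.get(signal)

    def drain_interrupts(self) -> List[Tuple[str, object]]:
        """Hand all pending interrupts to Reasoning and clear the queue."""
        pending, self.interrupts = self.interrupts, []
        return pending


def run_router_demo() -> Tuple[List[Tuple[str, object]], Optional[object]]:
    router = ContextRouter({
        "looming_object": Delivery.INTERRUPT,
        "fuel_level": Delivery.POLL,
        "room_light": Delivery.ON_CHANGE,
    })
    router.observe("room_light", "on")        # baseline, stays silent
    router.observe("room_light", "on")        # unchanged, stays silent
    router.observe("fuel_level", 0.4)         # waits to be polled
    router.observe("looming_object", "fast")  # interrupts immediately
    router.observe("room_light", "off")       # changed, so it surfaces
    return router.drain_interrupts(), router.poll("fuel_level")


if __name__ == "__main__":
    print(run_router_demo())
```

In terms of the states listed above, a Distracted Reasoning would only ever call `drain_interrupts`, a Polling one would also call `poll` on the signals it is watching, and an Active one might temporarily promote a signal's policy to `Delivery.INTERRUPT`.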

We look forward to reading about research on these topics. The interrelationship of Context and Attention is clearly complex and dynamic… with a lot of questions to answer!




[1] D. Kahneman, Thinking, Fast and Slow, 2011.

[2] We know what happens when nonhierarchical representations lead to indecisiveness in Reasoning.