Many problems in the world today can be solved using the spatial problem solving approach.

Diagram showing numbered steps for the spatial problem solving approach

1. Ask and explore

  • How many are in an area?
  • Which sites meet my criteria?
  • What are the characteristics of an area?
  • How is it distributed?
  • What is near what?
  • What is on top of what?
  • How is it related?
  • Explore and visualize your data to refine and scope the question that you want to address. Exploring your data will shed light on aspects of the question that you may not have considered, prompting you to further refine your question.

2. Model and compute

  • Choose an analysis tool to transform your data into new results or build a model of multiple tools to feed the results of one tool into the next.
  • Process the data analytically to derive essential information that helps you answer your question.

3. Examine and interpret

  • Manipulate and display the results of your analysis as information products, such as maps, reports, charts, graphs, and information pop-ups.
  • Seek explanations for the patterns you see and speculate about what they might mean from a spatial or temporal perspective.
  • Assess whether the results of the analysis provide an adequate answer to the question you asked. If not, you may need to adjust your approach. Is your question too broad or too narrow? Do you require more or different data? Should you use more or different analysis tools?
  • Determine whether assumptions about the data, analysis methods, and mapping methods would alter the results. Also consider what artifacts of the data, analysis, and mapping processes deserve special attention.

4. Make decisions

  • Document your interpretation of the analysis results and decide how to respond.
  • In some cases, you can take action based on your interpretation of the results. Implement a solution, correct a situation, create an opportunity, or mitigate circumstances.
  • In other cases, no action is required because your goal was to build knowledge and gain a deeper understanding.
  • Often, new questions arise that need to be addressed, leading to further analysis.

5. Share your results

  • Identify the audience that will benefit from your findings and determine who you want to influence. Then use maps, pop-ups, graphs, and charts that communicate your results efficiently and effectively.
  • Share those results with others through web maps and apps that are geoenriched to provide deeper explanation and support further inquiry.

This description of the spatial problem solving approach is a simplification, in large part because problem solving isn't linear. The actual process will be much more involved. You will iterate, diagnose, and review as you gain new insights and understanding along the way, prompting you to rethink your approach.

A short 20-second video with various animated maps showing scenes and data from all over the world

Problem-solving with a geographic approach

As we confront the greatest issues of our time, one factor is crucial—geography.

What is the geographic approach?

Our most serious challenges—such as climate change, sustainability, social inequity, and global public health—are inherently spatial. To solve such complex problems, we must first understand their geography.

The geographic approach is a way of thinking and problem-solving that integrates and organizes all relevant information in the crucial context of location. Leaders use this approach to reveal patterns and trends; model scenarios and solutions; and ultimately, make sound, strategic decisions.

An animated digital map of Europe showing pollution hotspots in orange and yellow, overlaid on a photo of diplomats meeting to discuss policies

Monitoring the earth’s health

The European Environment Agency tracks air quality and pollution levels to better inform policy decisions across the continent.

A geographic approach provides clarity

Geography is a way of pulling all key information about an issue together, expanding the questions we can ask about a place or a problem and the creative solutions we can bring to bear. 

Science based and data driven

A geographic approach relies on science and data to understand problems and reveal solutions.

Holistic and inclusive

A geographic approach considers how all factors are interconnected, uniting data types by what they have in common—location.

Collaborative

Maps are a powerful foundation for communication and action—a way to create shared understanding, explore alternatives, and find solutions.

A digital animation of the Port of Rotterdam with traffic illuminated as it moves around the port, overlaid on a photo of a man next to a shipping container entering information onto a tablet

Making the most out of complex data

Mapping all kinds of data about a system such as the Port of Rotterdam offers a full perspective, revealing opportunities to operate more efficiently.

Mapping transforms data into understanding

With so much data having a location component, a geographic approach provides a logical foundation for organizing, analyzing, and applying it. When we visualize and analyze data on a map, hidden relationships and insights emerge.

Geography delivers a dynamic narrative

Maps tell stories about places—what's happening there now, what has happened, and what will happen next.

Maps are an accessible analytic platform

Maps help us grasp concepts and tap into a visual storytelling language we intuitively understand.

High-resolution imagery comes to life

When viewed on a map, imagery transforms from static snapshots to compelling stories that enhance understanding.

A short video shows a true-to-life 3D version of San Francisco, then zooms in and changes to white building renderings with areas highlighted in pink along transit lines, overlaid on an image of someone riding a bicycle

Visualizing how to improve mobility

This 3D map of San Francisco demonstrates how walkability and transportation access (shown in pink) improve with planned transit service expansions.

Cutting-edge technology magnifies the power of geography

Geography is being revitalized by a world of sensors and connectivity and made more powerful by modern geographic information system (GIS) software. With today's sophisticated digital maps, we can apply our best data science and analysis to convert raw data into location intelligence—insights that empower real-time understanding and transform decision-making.

An animated dashboard shows a simplified view of New York City with the live locations of buses and traffic incidents

Managing real-time operations

This live dashboard view of buses and traffic incidents in New York City combines historical and real-time data to help operators avoid delays and keep people safe.

Global challenges require a geographic approach

Sustainability, infrastructure, climate impacts.

Leaders use a geographic approach to guide the most successful sustainability projects and build resilience.

A map of Southern California with areas marked in red to show the results of a green infrastructure analysis

A geographic approach to planning, prioritization, and operations helps leaders understand how infrastructure projects relate to surrounding environments.

A detailed 3D vector model of Cincinnati, Ohio, shows buildings and individual contoured trees to help inform planning for 5G networks.

Leaders who need to understand climate change impacts rely on a geographic approach to build actionable climate change solutions.

A map of the Pacific Northwest shows air quality in colors ranging from green to dark red, poor air quality being a result of wildfires across the region

Location matters more than ever

Geographic knowledge creates essential context. We can't manage our world without it—whether it's global supply chain issues, equitable internet access in the US, or energy consumption for a multinational company. As we work together to address today’s challenges, a geographic approach, powered by GIS, will help map the common ground we need to inspire effective action.

Applying a geographic approach across all sectors

Businesses use a geographic approach to streamline operations, develop strategy, and achieve sustainable prosperity.

Governments use a geographic approach to build resilient, equitable infrastructure and improve disaster preparedness.

Nonprofit organizations use a geographic approach to maximize their effectiveness and make the most of limited resources.

Find out how a geographic approach can elevate your organization's work.

Spatial Problem Solving in Spatial Structures

  • August 2017
  • Conference: The 11th Multi-disciplinary International Workshop on Artificial Intelligence (MIWAI 2017)

Authors

  • Christian Freksa, Universität Bremen
  • Ana-Maria Olteteanu, Tomorrow University
  • Jasper van de Ven, Universität Bremen

Figures

Structure of a full cognitive system.


A conceptual model for solving spatial problems

Available with Spatial Analyst license.

Once you have identified what type of model you need to create to solve your problem, you should then identify the set of conceptual steps that can be used to help you build that model.

  • Step 1: State the problem
  • To solve your spatial problem, start by clearly stating the problem you are trying to solve and the goal you are trying to achieve.
  • Step 2: Break down the problem
  • Once the goal is understood, you must break down the problem into a series of objectives, identify the elements and their interactions that are needed to meet your objectives, and create the necessary input datasets to develop the representation models.
  • For example, if your goal is to find the best sites for spotting moose, your objectives might be to find out where moose were recently spotted, what vegetation types they feed on most, and so on.
  • In the moose spotting example, known sightings and vegetation types will be only a few of the elements necessary for identifying where moose are most likely to be. The location of humans and the existing road network will also influence the moose. The interactions between the elements are that moose prefer certain vegetation types, and they avoid humans, who can gain access to the landscape through roads. A series of process models might be needed to ultimately find the locations with the greatest chance of spotting a moose.
  • Input datasets might contain sightings of moose in the past week, vegetation type, and the location of human dwellings and roads.
  • The overall model (composed of a series of objectives, process models, and input datasets) provides you with a model of reality, which will help you in your decision-making process.
  • Step 3: Explore input datasets
  • It is useful to understand the spatial and attribute properties of the individual objects in the landscape and the relationships between them (the representation model). To understand these relationships, you need to explore your data. A variety of tools and mechanisms are available in ArcGIS Pro with which to explore your data, including symbolization and creating charts.
  • Step 4: Perform analysis
  • In the moose spotting example, you may need to identify the tools necessary to select and weight certain vegetation types, buffer houses and roads, and weight them appropriately.
  • Step 5: Verify the model result
  • Check the result from the model in the field. Should certain parameters be changed to give you a better result?
  • If you created several models, determine which model you should use. You need to identify the best model. Does one particular model clearly meet your initial goal better than the others?
  • Step 6: Implement the result
  • When you visit the locations with the greatest chance of spotting moose, do you in fact see any?
  • Many times, there are conflicting objectives and evaluation criteria that must be resolved before a result can be agreed on. See GIS and Multicriteria Decision Analysis by Jacek Malczewski for more information.
Apply the conceptual model

There are many possible applications for this approach to problem solving. The following topic provides an example where the conceptual model was used to solve a siting problem:

  • Use the conceptual model to create a suitability map
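To illustrate the weight-and-combine logic behind such a suitability map, the sketch below uses NumPy arrays with made-up scores in place of real rasters. In ArcGIS Pro this would typically be done with Spatial Analyst tools; the criterion values and weights here are illustrative assumptions only, loosely following the moose-spotting example.

```python
import numpy as np

# Hypothetical 4x4 criterion rasters for the moose example, each cell
# already rescaled to a common 1-10 suitability scale. All values are
# illustrative assumptions, not real data.
veg_score = np.array([[9, 7, 2, 1],
                      [8, 6, 3, 1],
                      [5, 5, 4, 2],
                      [2, 3, 6, 8]], dtype=float)

# Distance from roads and dwellings, rescaled so farther = higher score
# (moose avoid humans, who reach the landscape through roads).
dist_score = np.array([[1, 2, 4, 9],
                       [2, 3, 5, 8],
                       [4, 5, 6, 7],
                       [6, 7, 8, 9]], dtype=float)

# Weighted overlay: assumed weights expressing that vegetation preference
# matters somewhat more than human avoidance.
weights = {"vegetation": 0.6, "distance": 0.4}
suitability = weights["vegetation"] * veg_score + weights["distance"] * dist_score

# Cells with the highest combined score are the best candidate sites.
best_cell = np.unravel_index(np.argmax(suitability), suitability.shape)
```

Rescaling each criterion to a common scale before weighting is what makes the final sum comparable across cells; verifying the result in the field (Step 5) then tells you whether the assumed weights need adjusting.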

Malczewski, J., GIS and Multicriteria Decision Analysis, Wiley & Sons, 1999.

Related topics

  • What is the ArcGIS Spatial Analyst extension?

Spatial Problem Solving by Chris Mickle

Welcome to the Spatial Problem Solving by Chris Mickle website!

This website was created to provide visitors with an overview of several techniques for solving spatial problems using Esri ArcGIS.

The themes discussed in this website are based on those taught in the Spatial Problem Solving course, part of the Master of Geospatial Information Science and Technology (MGIST) program at the College of Natural Resources at North Carolina State University (NCSU - Go Pack!).

Course Overview: GIS520 Solving Spatial Problems focuses on challenging students with advanced spatial problems. These problems are addressed with a specific approach: understanding the problem, identifying the data and information resources available, and giving students the tools and techniques in Esri ArcGIS to solve the problem. The course provides a framework and technique for working through problems, organizing ideas, and selecting the most appropriate spatial resource tools available for solving problems from a geographic information perspective.

Students learn to solve spatial problems through advanced analysis using geospatial technologies, learn to integrate and analyze spatial data in various formats, and explore methods for displaying geographic data analysis results to guide decision making.

Course Reflection: The GIS520 Spatial Problem Solving course has provided a fresh perspective and approach for spatial problem solving. The geospatial courses that are part of the Center for Geospatial Analytics at NCSU provide instruction on the tools and techniques needed to use GIS. This course has also provided a framework for properly analyzing the problems, available data, and appropriate tools to use for solving a geospatial problem. The course has built on previous courses that taught the fundamentals of using GIS, documenting the GIS work process, developing GIS models, and the tools and techniques associated with environmental remote sensing.

In addition to reinforcing the process for analyzing and documenting spatial problem solving, several of the spatial analysis tools covered have direct applicability in my day-to-day work as an environmental consultant. This course has required us to focus on applying the tools and techniques taught to our own professions and to think of additional problems to which geospatial techniques can be applied.

Tools and techniques for solving geospatial problems.

  • Open access
  • Published: 27 January 2023

Foundations of human spatial problem solving

  • Noah Zarr 1 &
  • Joshua W. Brown 1  

Scientific Reports, volume 13, Article number: 1485 (2023)

  • Cognitive neuroscience
  • Computational neuroscience

Despite great strides in both machine learning and neuroscience, we do not know how the human brain solves problems in the general sense. We approach this question by drawing on the framework of engineering control theory. We demonstrate a computational neural model with only localist learning laws that is able to find solutions to arbitrary problems. The model and humans perform a multi-step task with arbitrary and changing starting and desired ending states. Using a combination of computational neural modeling, human fMRI, and representational similarity analysis, we show here that the roles of a number of brain regions can be reinterpreted as interacting mechanisms of a control theoretic system. The results suggest a new set of functional perspectives on the orbitofrontal cortex, hippocampus, basal ganglia, anterior temporal lobe, lateral prefrontal cortex, and visual cortex, as well as a new path toward artificial general intelligence.


Introduction

Great strides have been made recently toward solving hard problems with deep learning, including reinforcement learning 1 , 2 . While these are groundbreaking and show superior performance over humans in some domains, humans nevertheless exceed computers in the ability to find creative and efficient solutions to novel problems, especially with changing internal motivation values 3 . Artificial general intelligence (AGI), especially the ability to learn autonomously to solve arbitrary problems, remains elusive 4 .

Value-based decision-making and goal-directed behavior involve a number of interacting brain regions, but how these regions might work together computationally to generate goal directed actions remains unclear. This may be due in part to a lack of mechanistic theoretical frameworks 5 , 6 . The orbitofrontal cortex (OFC) may represent both a cognitive map 7 and a flexible goal value representation 8 , driving actions based on expected outcomes 9 , though how these guide action selection is still unclear. The hippocampus is important for model-based planning 10 and prospection 11 , and the striatum is important for action selection 12 . Working memory for visual cues and task sets seems to depend on the visual cortex and lateral prefrontal regions, respectively 13 , 14 .

Neuroscience continues to reveal aspects of how the brain might learn to solve problems. Studies of cognitive control highlight how the brain, especially the prefrontal cortex, can apply and update rules to guide behavior 15 , 16 , inhibit behavior 17 , and monitor performance 18 to detect and correct errors 19 . Still, there is a crucial difference between rules and goals. Rules define a mapping from a stimulus to a response 20 , but goals define a desired state of the individual and the world 21 . When cognitive control is re-conceptualized as driving the individual to achieve a desired state, or set point, then cognitive control becomes a problem amenable to control theory.

Control theory has been successfully applied to account for the neural control of movement 22 and has informed various aspects of neuroscience research, including work on C. elegans 23, on controlling states of the brain 24, and on electrical stimulation placement methods 25 (as distinct from behavioral control over states of the world in the present work), and more loosely in terms of neural representations underlying how animals control an effector via a brain-computer interface 26. In psychology, Perceptual Control Theory has long maintained that behavior is best understood as a means of controlling perceptual input in the sense of control theory 27, 28.

In the control theory framework, a preferred decision prospect will define a set point, to be achieved by control-theoretic negative feedback controllers 29 , 30 . Problem solving then requires 1) defining the goal state; 2) planning a sequence of state transitions to move the current state toward the goal; and 3) generating actions aimed at implementing the desired sequence of state transitions.
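The negative-feedback idea behind such a controller can be sketched in a few lines: an action proportional to the discrepancy between the actual state and the set point drives the state toward the goal. This is a minimal illustration of the control-theoretic loop, not the model described in the paper; the gain value and scalar state are assumptions for demonstration.

```python
def feedback_step(state, set_point, gain=0.5):
    """One control-theoretic step: generate an action proportional to the
    error (set point minus actual state) and apply it to the state."""
    error = set_point - state
    action = gain * error
    return state + action

# Drive the state from 0 toward a desired set point of 1.
state = 0.0
for _ in range(10):
    state = feedback_step(state, set_point=1.0)
# After repeated steps the discrepancy shrinks toward zero.
```

In the problem-solving setting described above, the "state" is high-dimensional and reaching the set point requires planning a sequence of discrete state transitions rather than applying a single proportional correction.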

Algorithms that implement such strategies already exist, including Dijkstra's algorithm and A* 31, 32, and are commonly used in the GPS navigation devices found in cars and cell phones. Many variants of reinforcement learning solve a specific case of this problem, in which the rewarded states are relatively fixed, such as winning a game of Go 33. While deep Q networks 1 and generative adversarial networks with Monte Carlo tree search 33 are very powerful, what happens when the goals change, or the environmental rules change? In that case, the models may require extensive retraining. The more general problem requires the ability to dynamically recalculate the values associated with each state as circumstances, goals, and set points change, even in novel situations.
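For reference, Dijkstra's algorithm computes the shortest distance from a source node to every other node; treating the goal as the source yields each state's distance to the goal, which is the kind of quantity a planner can hill-climb on. The four-state graph below is a hypothetical example, not data from the paper.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distance from source to every reachable node.
    graph: {node: [(neighbor, cost), ...]}"""
    dist = {source: 0}
    pq = [(0, source)]  # priority queue of (distance, node)
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for nbr, cost in graph[node]:
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(pq, (nd, nbr))
    return dist

# Hypothetical state graph: edges are possible state transitions with costs.
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("A", 1), ("C", 2), ("D", 5)],
    "C": [("A", 4), ("B", 2), ("D", 1)],
    "D": [("B", 5), ("C", 1)],
}
# Distances to goal state "D" from every state.
dist_from_goal = dijkstra(graph, "D")
```

An agent at any state can then move greedily to the neighbor with the smallest distance-to-goal, which is exactly the gradient-following behavior the next paragraphs describe in neural terms.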

Here we explore a computational model that solves this more general problem of how the brain solves problems with changing goals 34 , and we show how a number of brain regions may implement information processing in ways that correspond to specific model components. While this may seem an audacious goal, our previous work has shown how the GOLSA model can solve problems in the general sense of causing the world to assume a desired state via a sequence of actions, as described above 34 . The model begins with a core premise: the brain constitutes a control-theoretic system, generating actions to minimize the discrepancy between actual and desired states. We developed the Goal-Oriented Learning and Selection of Action (GOLSA) computational neural model from this core premise to simulate how the brain might autonomously learn to solve problems, while maintaining fidelity to known biological mechanisms and constraints such as localist learning laws and real-time neural dynamics. The constraints of biological plausibility both narrow the scope of viable models and afford a direct comparison with neural activity.

The model treats the brain as a high-dimensional control system. It drives behavior to maintain multiple and varying control theoretic set points of the agent’s state, including low level homeostatic (e.g. hunger, thirst) and high level cognitive set points (e.g. a Tower of Hanoi configuration). The model autonomously learns the structure of state transitions, then plans actions to arbitrary goals via a novel hill-climbing algorithm inspired by Dijkstra’s algorithm 32 . The model provides a domain-general solution to the problem of solving problems and performs well in arbitrary planning tasks (such as the Tower of Hanoi) and decision-making problems involving multiple constraints 34 (“ Methods ”).

The GOLSA model works by representing each possible state of the agent and environment in a network layer, with multiple layers each representing the same sets of states (Fig.  1 A,B). The Goal Gradient layer is activated by an arbitrarily specified desired (Goal) state and spreads activation backward along possible state transitions represented as edges in the network 35 , 36 . This value spreading activation generates current state values akin to learned state values (Q values) in reinforcement learning, except that the state values can be reassigned and recalculated dynamically as goals change. This additional flexibility allows goals to be specified dynamically and arbitrarily, with all state values being updated immediately to reflect new goals, thus overcoming a limitation of current RL approaches. Essentially, the Goal Gradient is the hill to climb to minimize the discrepancy between actual and desired states in the control theoretic sense. In parallel, regarding the present state of the model system, the Adjacent States layer receives input from a node representing the current state of the agent and environment, which in turn activates representations of all states that can be achieved with one state transition. The valid adjacent states then mask the Goal Gradient layer to yield the Desired Next State representation. In this layer, the most active unit represents a state which, if achieved, will move the agent one step closer to the goal state. This desired next state is then mapped onto an action (i.e. a controller signal) that is likely to effect the desired state transition. In sum, the model is given an arbitrarily specified goal state and the actual current state of the actor. It then finds an efficient sequence of states to transit in order to reach the goal state, and it generates actions aimed at causing the current state of the world to be updated so that it approaches and reaches the goal state.
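The goal-gradient and masking scheme described above can be sketched as follows. This is a simplified illustration of the described computation (spread value backward from the goal along possible transitions, mask by states adjacent to the current state, pick the most active survivor), not the GOLSA implementation itself; the decay factor, iteration count, and four-state line world are assumptions.

```python
import numpy as np

# Hypothetical 4-state world on a line: 0 - 1 - 2 - 3, with moves
# allowed only between neighbors.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)

def goal_gradient(adjacency, goal, decay=0.5, steps=10):
    """Spread activation backward from the goal along state transitions,
    so value decays with distance from the goal (the 'hill' to climb)."""
    grad = np.zeros(adjacency.shape[0])
    grad[goal] = 1.0
    for _ in range(steps):
        spread = decay * (adjacency @ grad)
        grad = np.maximum(grad, spread)  # keep the strongest activation
    return grad

def desired_next_state(adjacency, current, goal):
    """Mask the goal gradient by the states reachable in one transition
    and return the most active one."""
    grad = goal_gradient(adjacency, goal)
    adjacent = adjacency[current]   # states one transition away
    masked = adjacent * grad        # element-wise mask, as in Fig. 1A
    return int(np.argmax(masked))

next_state = desired_next_state(adjacency, current=0, goal=3)
```

Because the gradient is recomputed from whichever state is currently flagged as the goal, reassigning the goal immediately revalues every state, which is the flexibility the text contrasts with fixed learned Q values.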

figure 1

( A ) The GOLSA model determines the next desired state by hill climbing. Each layer represents the same set of states, one per neuron. The x- and y-axes of the grids represent abstracted coordinates in a space of states. Neurons are connected to each other for states that are reachable from another by one action, in this case neighbors in the x,y plane. The Goal state is activated and spreads activation through a Goal Gradient (Proximity) layer, thus dynamically specifying the value of each state given the goal, so that value is greater for states nearer the goal state. The Current State representation activates all Adjacent States, i.e. that can be achieved with one state transition. These adjacent states mask the Goal Gradient input to the Desired Next State, so that the most active unit in the Desired Next State represents a state attainable with one state transition and which will bring the state most directly toward the goal state. The black arrows indicate that the Desired Next State unit activities are the element-wise products of the corresponding Adjacent States and Goal Gradient unit activities. The font colors match the model layer to corresponding brain regions in Figs. 3 and 4 . ( B ) The desired state transition is determined by the conjunction of current state and desired next state. The GOLSA model learns a mapping from desired state transitions to the actions that cause those transitions. After training, the model can generate novel action sequences to achieve arbitrary goal states. Adapted from 34 .

Here we test whether and how the GOLSA model might provide an account of how various brain regions work together to drive goal-directed behavior. To do this, we ask human subjects to perform a multi-step task to achieve arbitrary goals. We then train the GOLSA model to perform the same task, and we use representational similarity analysis (RSA) to ask whether specific GOLSA model layers show similar representations to specific brain regions ( Supplementary Material ). The results will provide a tentative account of the function of specific brain regions in terms of the GOLSA model, and this account can then be tested and compared against alternative models in future work.

Study design

The details of the model implementation and the model code are available in the “ Methods ”. Behaviorally, we found that the GOLSA model is able to learn to solve arbitrary problems, such as reaching novel states in the Tower of Hanoi task (Fig.  2 A). It does this without hard-wired knowledge, simply by making initially random actions and learning from the outcomes, then synthesizing the learned information to achieve whatever state is specified as the goal state.

figure 2

( A ) The GOLSA model learns to solve problems, achieving arbitrary goal states. It does this by making arbitrary actions and observing which actions cause which state transitions. Figure adapted from earlier work 34 , 37 . ( B ) Treasure Hunt task. Both the GOLSA model and the human fMRI subjects performed a simple treasure hunt task, in which subjects were placed in one of four possible starting locations, then asked to generate actions to reach any of the other possible locations. To test multi-step transitions, subjects had to first move to the location of a key needed to unlock a treasure chest, then move to the treasure chest location. Participants first saw an information screen specifying the contents of each of the four states (‘you’, ‘key’, ‘chest’, or ‘nothing’). After a jittered delay, participants selected a desired movement direction and after another delay saw an image of the outcome location. The mapping of finger buttons to game movements was random on each trial and revealed after subjects were given the task and had to plan their movements, thus avoiding motor confounds during planning. Bottom: The two state-space maps used in the experiment. One map was used in the first half of trials while the other was used in the second half, in counterbalanced order.

Having found that the model can learn autonomously to solve arbitrary problems, we then aimed to identify which brain regions might show representations and activity that matched particular GOLSA model layers. To do this, we tested the GOLSA model with a Treasure Hunt task (Fig.  2 B and “ Methods ”), which was performed by both the GOLSA model and human subjects with fMRI. All human subjects research here was approved by the IRB of Indiana University, and subjects gave full informed consent. The human subjects research was performed in accordance with relevant guidelines/regulations and in accordance with the Declaration of Helsinki. Subjects were placed in one of four starting states and had to traverse one or two states to achieve a goal, by retrieving a key and subsequently using it to unlock a treasure chest for a reward (Fig.  2 B). The Treasure Hunt task presents a challenge to standard RL approaches, because the rewarded (i.e. goal) state changes regularly. In an RL framework, the Bellman equation would regularly relearn the value of each possible state in terms of how close it is to the currently rewarded state, forgetting previous state values in the process.

Representational similarity analysis

To analyze the fMRI and model data, we used model-based fMRI with representational similarity analysis (RSA) 38 (“ Methods ”). RSA considers a set of task conditions and asks whether a model, or brain region, can discriminate between the patterns of activity associated with two conditions, as measured by a correlation coefficient. By considering every possible pairing of conditions, the RSA method constructs a symmetric representational dissimilarity matrix (RDM), where each entry is 1 − r, and r is the correlation coefficient. This RDM provides a representational fingerprint of what information is present, so that fingerprints can be compared between a model layer and a given brain region. For our application of RSA, each RDM represented the pairwise correlations across 96 total patterns: 4 starting states by 8 trial types by 3 time points within a trial (problem description, response, and feedback). For each model layer, the pairwise correlations are calculated from the activity pattern across layer cells in one condition vs. the activity pattern in the same layer in the other condition. For each voxel in the brain, the pairwise correlations are calculated from the activity pattern in a local neighborhood of radius 10 mm (93 voxels total) around the voxel in question, for one condition vs. the other. The 10 mm radius was chosen as a tradeoff between a sufficiently high number of voxels for pattern analysis and a sufficiently small area to identify specific regions. The fMRI RSA maps are computed for each subject over all functional scans and then tested across subjects for statistical significance. The comparison between GOLSA model and fMRI RDMs consists of looking for positive correlations between elements of the upper triangle of a given GOLSA model layer RDM vs. the RDM around a given voxel in the fMRI data.
The resulting fMRI RSA maps, one per GOLSA model layer, show which brain regions share representational structure with particular model components. These maps are computed for each subject and then tested across subjects for statistical significance, with whole-brain tests for significance in all cases. Full results are in Table 2, and method details are in the “ Methods ” section. As a control, we also generated a null model layer that consisted of normally distributed noise (μ = 1, σ = 1). In the null model, no voxels exceeded the cluster-defining threshold, and so no significant clusters were found, which suggests that the results below are unlikely to reflect artifacts of the analysis methods.

Orbitofrontal cortex, goals, and maps

We found that the patterns of activity in a number of distinct brain regions match those expected of a control-theoretic system, as instantiated in the GOLSA model (Figs. 3A,B and 4A–C; Table 1). Orbitofrontal cortex (OFC) activity patterns match model components that represent both a cognitive map 7 and a flexible goal value representation 8 , specifically matching the Goal and Goal Gradient layer activities. These layers represent the current value of the goal state and the current values of states near the goal state, respectively. The Goal Gradient layer incorporates cognitive map information in terms of which states can be reached from which other states. This suggests mechanisms by which OFC regions may calculate the values of states dynamically as part of a value-based decision process, by spreading activation of value backward from a currently active goal state representation. The GOLSA model representations of the desired next state also match overlapping regions in the OFC and ventromedial prefrontal cortex (vmPFC), consistent with a role in finding the more valuable decision option (Fig. 3). Reversal learning and satiety effects as supported by the OFC reduce to selecting a new goal state or deactivating a goal state, respectively, which immediately updates the values of all states. Collectively this provides a mechanistic account of how value-based decision-making functions in OFC and vmPFC.

figure 3

Representational Similarity Analysis (RSA) of model layers vs. human subjects performing the same Treasure Hunt task. All results shown are significant clusters across the population with a cluster-defining threshold of p < 0.001, cluster corrected to p < 0.05 overall, and with additional smoothing of 8 mm FWHM applied prior to the population-level t-test for visualization purposes. (A) Population Z maps showing significant regions of similarity to model layers in orbitofrontal cortex. Cf. Figure 1 and Fig. 5B. The peak regions of similarity for goal-gradient and goal show considerable overlap in right OFC. The region of peak similarity for simulated-state is more posterior. To most clearly show peaks of model-image correspondence, the maps of goal-gradient and goal are here visualized at p < 0.00001 while all others are visualized at p < 0.001. (B) Z maps showing significant regions of similarity to model layers in right temporal cortex. The peak regions of similarity for goal-gradient and goal overlap and extend into the OFC. The peak regions of similarity for adjacent-states, next-desired-state, and simulated-state occur in similar but not completely overlapping regions, while the cluster for queue-store is more lateral. (C) Fig. 1A, copied here as a legend, where the font color of each layer name corresponds to the region colors in panels (A) and (B).

figure 4

Representational Similarity Analysis of model layers vs. human subjects performing the same Treasure Hunt task, with the same conditions and RSA analysis as in Fig. 3. (A) Population Z maps showing significant regions of similarity to model layers in visual cortex. The peak regions of similarity for goal-gradient and goal overlap substantially, primarily in bilateral cuneus, inferior occipital gyrus, and lingual gyrus. The simulated-state layer displayed significantly similar activity to that in a smaller medial and posterior region. Statistical thresholding and significance are the same as in Fig. 3. (B) Z map showing significant regions of similarity to the desired-transition layer. Similarity peaks were observed for desired-transition in bilateral hippocampal gyrus as well as bilateral caudate and putamen. The desired-transition map displayed here was visualized at p < 0.00001 for clarity. (C) Z maps showing significant regions of similarity to the model layers in frontal cortex. Similarity peaks were observed for queue-store in superior frontal gyrus (BA10). Action-output activity most closely resembled activity in inferior frontal gyrus (BA9), while simulated-state and goal-gradient patterns of activity were more anterior (primarily BA45). Similarity between activity in the latter two layers and activity in OFC, visual cortex, and temporal pole is also visible.

Lateral PFC and planning

The GOLSA model also incorporates a mechanism that allows multi-step planning, by representing a Simulated State as if the desired next state were already achieved, so that the model can plan multiple subsequent state transitions iteratively prior to committing to a particular course of action (Fig.  5 B). Those subsequent state transitions are represented in a Queue Store layer pending execution via competitive queueing, in which the most active action representation is the first to be executed, followed by the next most active representation, and so on 39 , 40 . This constitutes a mechanism of prospection 41 and planning 42 . The Simulated State layer in the GOLSA model shows strong representational similarity with regions of the OFC and anterior temporal lobe, and the Queue Store layer shows strong similarity with the anterior temporal lobe and lateral prefrontal cortex. This constitutes a mechanistic account of how the vmPFC and OFC in particular might contribute to multi-step goal-directed planning, and how plans may be stored in lateral prefrontal cortex.
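The competitive queueing principle described above can be sketched minimally: the most active plan in the queue is executed first, then suppressed so that the next most active representation wins. The activation values below are illustrative, not taken from the model:

```python
import numpy as np

def competitive_queue(activations):
    """Execute queued items in order of activation strength:
    repeatedly pick the most active item, then suppress it.
    Returns the resulting execution order (indices)."""
    act = np.array(activations, dtype=float)
    order = []
    while np.any(act > 0):
        winner = int(np.argmax(act))   # winner-take-all competition
        order.append(winner)
        act[winner] = 0.0              # suppress the executed item
    return order

# A planned 3-step sequence encoded as graded activation:
# the first action is most active, the last is least active.
print(competitive_queue([0.9, 0.3, 0.6]))  # -> [0, 2, 1]
```

The gradient of activation over queue items thus encodes serial order without any explicit position labels.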

figure 5

(A) Full diagram of the core model. Each rectangle represents a layer and each arrow a projection. The body is a node; two additional nodes, not shown, provide inhibition at each state change and oscillatory control. The colored squares indicate which layers receive inhibition from these nodes. Some recurrent connections are not shown. (B) Full diagram of the extended model, with an added top row representing the ability to plan multiple state transition steps ahead (Simulated State, Queue Input, Queue Store, and Queue Output layers). Adapted with permission from earlier work 34 .

Visual cortex and future visual states

The visual cortex also shows representational patterns consistent with representing the goal, goal gradient, and simulated future states (Figs.  3 B and 4 ). This suggests a role for the visual cortex in planning, in the sense of representing anticipated future states beyond simply representing current visual input. Future states in the present task are represented largely by images of locations, such as an image of a scarecrow or a house. In that sense, an anticipated future state could be decoded as matching the representation of the image of that future state. One possibility is that this reflects an attentional effect that facilitates processing of visual cues representing anticipated future states. Another possibility is that visual cortex activity signals a kind of working memory for anticipated future visual states, similar to how working memory for past visual states has been decoded from visual cortex activity 14 . This would be distinct from predictive coding, in that the activity predicts future states, not current states 43 . In either case, the results are consistent with the notion that the visual cortex may not be only a sensory region but may play some role in planning by representing the details of anticipated future states.

Anterior temporal lobe and planning

The anterior temporal lobe likewise shows representations of the goal, goal gradient, the adjacent states, the next desired state, and simulated future and queue store states (Figs.  3 B, 4 C). In one sense this is not surprising, as the states of the task are represented by images of objects, and visual objects (especially faces) are represented in the anterior temporal lobe 44 . Still, the fact that the anterior temporal lobe shows representations consistent with planning mechanisms suggests a more active role in planning beyond feedforward sensory processing as commonly understood 45 .

Hippocampal region and prospection

Once the desired next state is specified, it must be translated to an action. The hippocampus and striatum match the representations of the Desired Transition layer in the GOLSA model. This model layer represents a conjunction of the current state and the desired next state, which in the GOLSA model is a necessary step toward selecting an appropriate action to achieve the desired transition. This is consistent with the role of the hippocampus in prospection 41 , and it suggests computational and neural mechanisms by which the hippocampus may play a key role in turning goals into predictions about the future, for the purpose of planning actions 10 , 11 . Finally, as would be expected, the motor output representations in the GOLSA model match motor output patterns in the motor cortex (Fig. 4C).

The results above show how a computational neural model, the GOLSA model, provides a novel computational account of a number of brain regions. The guiding theory is that a substantial set of brain regions function together as a control-theoretic mechanism 47 , generating behaviors to minimize the discrepancy between the current state and the desired (goal) state. The OFC is understood as including neurons that represent the value of various states in the world, such as the value of acquiring certain objects. Greater activity of an OFC neuron corresponds to greater value of its represented state, given the current goals. Because of spreading activation, neurons will be more active if they represent states closer to the goal. This results in value representations similar to those provided by the Bellman equation of reinforcement learning 48 , with the difference being that spreading activation can instantly reconfigure the values of states as goals change, without requiring extensive iterations of the Bellman equation.
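The spreading-activation account can be illustrated with a minimal graph sketch: value propagates backward from the goal along allowable transitions with a per-hop decay, and changing the goal argument instantly reconfigures the whole gradient with no iterative relearning. The ring-shaped map, state names, and decay factor are illustrative assumptions:

```python
from collections import deque

def goal_gradient(adjacency, goal, decay=0.8):
    """Spread value backward from the goal state through a graph of
    allowable transitions (breadth-first). States closer to the goal
    receive higher value; the goal itself has value 1.0."""
    values = {goal: 1.0}
    frontier = deque([goal])
    while frontier:
        state = frontier.popleft()
        for prev, nexts in adjacency.items():
            # prev can reach `state`, so it inherits decayed value
            if state in nexts and prev not in values:
                values[prev] = decay * values[state]
                frontier.append(prev)
    return values

# Four states arranged in a ring, as in the Treasure Hunt map
# (no diagonal moves).
adjacency = {"field": ["lawn", "pasture"],
             "lawn": ["field", "stump"],
             "stump": ["lawn", "pasture"],
             "pasture": ["stump", "field"]}
print(goal_gradient(adjacency, goal="stump"))
```

Calling `goal_gradient(adjacency, goal="field")` instead immediately yields a different gradient, which is the sense in which values reconfigure as goals change.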

Given the current state and the goal state, the next desired state can be determined as a nearby state that can be reached and that also moves the current state of the world closer to the goal state. Table 1 shows these effects in the medial frontal gyrus, putamen, superior temporal gyrus, pons, and precuneus. The GOLSA model suggests this is computed as the activation of available state representations, multiplied by the OFC value for that state. Precedent for this kind of multiplicative effect has been shown in the attention literature 49 . The action to be generated is represented by neural activity in the motor cortex region. This in turn is determined on the basis of neurons that are active specifically for a conjunction of the particular current state and next desired state. Neurally, we find this conjunction represented across a large region including the striatum and hippocampus. This is consistent with the notion of the hippocampus as a generative recurrent neural network that starts at a current state and runs forward, specifically toward the desired state 50 . The striatum is understood as part of an action gate that permits certain actions in specific contexts, although the GOLSA model does not include an explicit action gate 51 . Where multiple action steps must be planned prior to executing any of them, the lateral PFC seems to represent a queue of action plans in sequence, as sustained activity representing working memory 39 , 52 . By contrast, working memory representations in the visual cortex apparently represent the instructed future states as per the instructions for each task trial, and these are properly understood as visual sensory rather than motor working memories 14 .

Our findings overall bear a resemblance to the Free Energy principle. According to this principle, organisms learn to generate predictions of the most likely (i.e. rewarding) future states under a policy, then via active inference emit actions to cause the most probable outcomes to become reality, thus minimizing surprise 53 , 54 . Like active inference, the GOLSA model emits actions to minimize the discrepancy between the actual and predicted state. Of note, the GOLSA model specifies the future state as a desired state rather than a most likely state. This crucial distinction allows a state that has a high current value to be pursued even if the probability of being in that state is very low (for example, buying lottery tickets and winning). Furthermore, the model includes the mechanisms of Fig. 1, which allow for flexible planning given arbitrary goals. The GOLSA model is a process model and simulates rate-coded neural activity as a dynamical system (“ Methods ”), which affords a more direct comparison with neural activity representations over time, as in Figs. 3 and 4.

The GOLSA model, and especially our analysis of it, builds on recent work that developed methods to test computational neural models against empirical data. Substantial previous work has demonstrated how computational neural modeling can provide insight into the functional properties underlying empirical neural data, such as recurrent neural networks elucidating the representational structure in anterior cingulate 19 , 55 , 56 and PFC 57 ; deep neural networks accounting for object recognition in IT with representational similarity analysis 58 , and encoding/decoding of visual cortex representations 59 ; dimensionality reduction for comparing neural recordings and computational neural models 60 , and representations of multiple learned tasks in computational neural models 61 .

The GOLSA model shares some similarity with model-based reinforcement learning (MBRL), in that both include learned models of next-state probabilities as a function of current state and action pairs. Still, a significant limitation of both model-based and model-free RL is that typically there is only a single ultimate goal, e.g. gaining a reward or winning a game. Q-values 62 are thus learned in order to maximize a single reward value. This implies several limitations: (1) that Q values are strongly paired with corresponding states; (2) that there is only one Q value per state at a given time, as in a Markov decision process (MDP); and (3) that Q values are generally updated via substantial relearning. In contrast, real organisms will find differing reward values associated with different goals at different times and circumstances. This implies that goals will change over time, and re-learning Q-values with each goal change would be inefficient. Instead, a more flexible mechanism will dynamically assign values to various goals and then plan accordingly. The GOLSA model exemplifies this approach, essentially replacing the learned Q values of MBRL and MDPs with an activation-based representation of state value, which can be dynamically reconfigured as goals change. This overcomes the three limitations above.

Our work has several limitations. First, regarding the GOLSA model itself, the main limitation is its present implementation of one-hot state representations. This makes a scale-up to larger and continuous state spaces challenging. Future work may overcome this limitation by replacing the one-hot representations with vector-valued state representations and the learned connections with deep network function approximators. This would require corresponding changes in the search mechanisms of Fig. 1A, from parallel spreading activation to a serial Monte Carlo tree search mechanism. This would be consistent with evidence of serial search during planning 63 , 64 and would afford a new approach to artificial general intelligence that is both powerful and similar to human brain function. Another limitation is that the Treasure Hunt task is essentially a spatial problem solving task. We anticipate that the GOLSA model could be applied to solve more general, non-spatial problems, but this remains to be demonstrated.

The fMRI analysis here has several limitations as well. First, a correspondence of representations does not imply a correspondence of computations, nor does it prove the model correct in an absolute sense 65 . There are other computational models that use diffusion gradients to solve goal-directed planning 66 , and more recent work with deep networks to navigate from arbitrary starting to arbitrary ending states 50 . The combined model and fMRI results here constitute a proposed functional account of the various brain regions, but our results do not prove that the regions compute exactly what the corresponding model regions do, nor can we definitively rule out competing models. Nevertheless, the ability of the model to account for fMRI data selectively in specific brain regions suggests that it merits further investigation and direct tests against competing models, as a direction for future research. Future work might compare other models besides GOLSA against the fMRI data using RSA, to ascertain whether other model components might provide a better fit to, and account of, specific brain regions. While variations of model-based and model-free reinforcement learning models would seem likely candidates, we know of only one model, by Banino et al. 50 , endowed with the ability to flexibly switch goals and thus perform the Treasure Hunt task as does the GOLSA model 34 . It would be instructive to compare the overall abilities of GOLSA and the model of Banino et al. to account for the RDMs of specific brain regions in the Treasure Hunt task, although it is unclear how to do a direct comparison given that the two models consist of very different mechanisms.

The GOLSA model may in principle be extended hierarchically. The frontal cortex has a hierarchical representational structure, in which higher levels of a task may be represented more anteriorly 67 . Such hierarchical structure has been construed to represent higher, more abstract task rules 13 , 15 , 68 . The GOLSA model suggests another perspective: that higher level representations consist of higher level goals instead of higher level rules. In the coffee-making task for example 69 , the higher level task of making coffee may require a lower level task of boiling water. If the GOLSA model framework were extended hierarchically, the high level goal of having coffee prepared would activate a lower level goal of having the water heated to a specified temperature. The goal specification framework here is intrinsically more robust than a rule- or schema-based framework: rules may fail to produce a desired outcome, but if an error occurs during GOLSA task performance, replanning simply recomputes the optimal sequence of actions from whatever the current state is, and the error is automatically addressed.

This incidentally points to a key difference between rules and goals, in that task rules define a mapping from stimuli to responses 15 , in a way that is not necessarily teleological. Goals, in contrast, are by definition teleological. This distinction roughly parallels that between model-free and model-based reinforcement learning 70 . The rule concept, as a stimulus–response mapping, implies that an error is a failure to generate the action specified by the stimulus, regardless of the final state of a system. In contrast, the goal concept implies that an error is precisely a failure to generate the desired final state of a system. Well-learned actions may acquire a degree of automaticity over time 71 , but arguably the degree of automaticity is independent of whether an action is rule-oriented vs. goal-directed. If a goal-directed action becomes automatized, this does not negate its teleological nature, namely that errors in the desired final state of the world can be detected and lead to corrective action to achieve the desired final state. Rule-based action, whether deliberate or automatized, does not necessarily entail corrective action to achieve a desired state. Where actions are generated, and possibly corrected, to achieve a desired state of the world, this may properly be referred to as goal-directed behavior.

We have investigated the GOLSA model here to examine whether and how it might account for the function of specific brain regions. With RSA analysis, we found that specific layers of the GOLSA model show strong representational similarities with corresponding brain regions. Goals and goal value gradients matched especially the orbitofrontal cortex, and also some aspects of the visual and anterior temporal cortices. The desired transition layer matched representations in the hippocampus and striatum, and simulated future states matched representations in the middle frontal gyrus and superior temporal pole. Not surprisingly, the model motor layer representations matched the motor cortex. Collectively, these results constitute a proposal that the GOLSA model can provide an organizing account of how multiple brain regions interact to form essentially a negative feedback controller, with time varying behavioral set points derived from motivational states. Future work may investigate this proposal in more depth and compare against alternative models.

Model components

The GOLSA model is constructed from a small set of basic components, and the model code is freely available as supplementary material . The main component class is a layer of units, where each unit represents a neuron (or, more abstractly, a small subpopulation of neurons) corresponding to either a state, a state transition, or an action. The activity of units in a layer represents the neural firing rate and is instantiated as a vector updated according to a first order differential equation (cf. Grossberg 72 ). The activation function varies between layers, but all units in a particular layer are governed by the same equation. The most typical activation function for a single unit is,

\(\tau \frac{da(t)}{dt}=-\lambda a(t)+\left(1-a\left(t\right)\right)E-I+\varepsilon N(t)\sqrt{dt}\)   (1)
where a represents activation, i.e. the firing rate, of a model neuron. The four terms of this equation represent, in order: passive decay \(-\lambda a(t)\) , shunting excitation \(\left(1-a\left(t\right)\right)E\) , linear inhibition \(-I\) , and random noise \(\varepsilon N(t)\sqrt{dt}\) . “Shunting” refers to the fact that excitation (E) scales inversely as current activity increases, with a natural upper bound of 1. The passive decay works in a similar fashion, providing a natural lower bound activity of 0. The inhibition term linearly suppresses unit activity, while the final term adds normally distributed noise N (μ = 0, σ = 1), with strength \(\varepsilon\) . Because the differential equations are approximated using the Euler method, the noise term is multiplied by \(\sqrt{dt}\) to standardize the magnitude across different choices of dt 73 , 74 . The speed of activity change is determined by a time constant τ. The parameters τ, λ, ε vary by layer in order to implement different processes. E and I are the total excitation and inhibition, respectively, impinging on a particular unit, summed over every presynaptic unit j in every projection p onto the target unit,

\(E=\sum_{p}\sum_{j}{a}_{{p}_{j}}{w}_{{p}_{j}}\)
where \({a}_{{p}_{j}}\) is the activation of a presynaptic model neuron that provides excitation, and \({w}_{{p}_{j}}\) is the synaptic weight that determines how much excitation per unit of presynaptic activity will be provided to the postsynaptic model neuron.

A second activation function used in several places throughout the model is,

\(\tau \frac{da(t)}{dt}=-\lambda a(t)+\left(1-a\left(t\right)\right)E-a(t)I+\varepsilon N(t)\sqrt{dt}\)
This function is identical to Eq. ( 1 ) except that the inhibition is also shunting, such that it exhibits a strong effect on highly active units and a smaller effect as unit activity approaches 0. While more typical in other models, shunting inhibition has a number of drawbacks in the current model. Two common uses for inhibition in the GOLSA model are winner-take-all dynamics and regulatory inhibition which resets layer activity. Shunting inhibition impedes both of these processes because inhibition fails to fully suppress the appropriate units, since it becomes less effective as unit activity decreases.
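The unit dynamics described in this section (passive decay, shunting excitation, linear or shunting inhibition, and noise scaled by the square root of dt under Euler integration) can be sketched as follows. All parameter values, and the exact placement of the noise term, are illustrative assumptions rather than the model's actual settings:

```python
import numpy as np

def euler_step(a, E, I, dt=0.01, tau=0.1, lam=1.0, eps=0.05,
               shunting_inhibition=False, rng=np.random.default_rng(0)):
    """One Euler step of the rate-coded unit dynamics: passive decay,
    shunting excitation, linear or shunting inhibition, and noise
    scaled by sqrt(dt)."""
    inhibition = a * I if shunting_inhibition else I
    noise = eps * rng.standard_normal(a.shape) * np.sqrt(dt)
    da = (-lam * a + (1.0 - a) * E - inhibition) / tau * dt + noise
    return np.clip(a + da, 0.0, 1.0)

# Drive three units with different excitation levels (noise off).
# Activity saturates below 1 because excitation is shunted by (1 - a):
# the equilibrium is a = E / (lam + E).
a = np.zeros(3)
E = np.array([0.5, 1.0, 2.0])
for _ in range(1000):
    a = euler_step(a, E, I=0.0, eps=0.0)
print(a)
```

The `shunting_inhibition` flag switches between the two activation functions: with it set, inhibition scales with current activity and therefore loses effectiveness as activity approaches 0, which is exactly the drawback noted above for winner-take-all and reset dynamics.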

Projections

Layers connect to each other via projections, representing the synapses connecting one neural population to another. The primary component of a projection is a weight matrix specifying the strength of connections between each pair of units. Learning is instantiated by updating the weights according to a learning function. These functions vary across the projections responsible for the model's learning and are fully described in the sections below dealing with each learning type. Some projections also maintain a matrix of traces updated by a projection-specific function of presynaptic or postsynaptic activity. The traces serve as a kind of short-term memory of which pre- or postsynaptic units were recently activated, playing a role very similar to eligibility traces as in Barto et al. 75 , though with a different mathematical form.
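The trace mechanism can be illustrated with a minimal exponentially decaying form. The decay constant and the specific update rule are illustrative: as noted above, the model's actual trace functions are projection-specific and differ mathematically from standard eligibility traces:

```python
import numpy as np

def update_trace(trace, pre_activity, decay=0.9):
    """Decay existing traces and refresh entries for currently active
    presynaptic units (an illustrative form; the model's actual trace
    functions are projection-specific)."""
    return decay * trace + (1.0 - decay) * pre_activity

trace = np.zeros(4)
pre = np.array([1.0, 0.0, 0.0, 0.0])    # unit 0 fires briefly
trace = update_trace(trace, pre)
for _ in range(5):                      # then activity ceases
    trace = update_trace(trace, np.zeros(4))
print(trace)  # unit 0 retains a decaying memory of having fired
```

A learning rule can then consult `trace` to credit recently active units even after their activity has ended.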

Nodes are model components that are not represented neurally via an activation function. They represent important control and timing signals to the model and are either set externally or update autonomously according to a function of time. For instance, sinusoidal oscillations are used to gate activity between various layers. While in principle rate-coded model neurons could implement a sinusoidal wave, the function is simply hard coded into the update function of the node for simplicity. In some cases, it is necessary for an entire layer to be strongly inhibited when particular conditions hold true, such as when an oscillatory node is in a particular phase. Layers therefore also have a list of inhibitor nodes that prevent unit activity within the layer when the node value meets certain conditions. In a similar fashion, some projections are gated by nodes such that they allow activity to pass through and/or allow the weights to be updated only when the relevant node activity satisfies a particular condition. Another important node provides strong inhibition to many layers when the agent changes states.

Environment

The agent operates in an environment consisting of discrete states, with a set of allowable state transitions. Allowable state transitions are not necessarily bidirectional, but for the present simulations, they are deterministic (unlike the typical MDP formulation used in RL). In some simulations, the environment also contains different types of reward located in various states, which can be used to drive goal selection. In other simulations, the goal is specified externally via a node value.

Complete network

Each component and subnetwork of the model is described in detail below or in the main text, but for reference and completeness a full diagram of the core network is shown in Fig.  5 A, and the network augmented for multi-step planning is shown in Fig.  5 B. Some of the basic layer properties are summarized in Table 2 . Layers and nodes are referred to using italics, such that the layer representing the current state is referred to simply as current-state.

Representational structure

In Fig.  5 B, the layers Goal, Goal Gradient, Next State, Adjacent States, Previous States, Simulated State, and Current State all have the same number of nodes and the same representational structure, i.e. one state per node.

The layers Desired Transition, Observed Transition, Transition Output, Queue Input, Queue Output, and Queue Store likewise have the same representational structure, which is the number of possible states squared. This allows a node in these layers to represent a transition from one specific state to another specific state.

The layers Action Input, Action Output, and Previous Action all have the same representational structure, which is one possible action per node.
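The transition-layer scheme above can be sketched with a simple indexing convention. Row-major order is an assumption for illustration, not necessarily the model's actual node ordering:

```python
def transition_index(from_state, to_state, n_states):
    """Map a transition between two states onto a single node in an
    n_states**2 transition layer (row-major indexing is an
    illustrative choice)."""
    return from_state * n_states + to_state

def decode_transition(index, n_states):
    """Invert the mapping: recover (from_state, to_state)."""
    return divmod(index, n_states)

n = 4  # four states, as in the Treasure Hunt environment
idx = transition_index(2, 3, n)
print(idx, decode_transition(idx, n))  # -> 11 (2, 3)
```

Each of the 16 possible transitions thus gets a unique node, which is why layers such as Desired Transition need the number of states squared.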

Task description

The Treasure Hunt task (Fig. 2) was created and presented in OpenSesame, a Python-based toolbox for psychological task design 76 . In the task, participants control an agent that can move within a small environment comprising four distinct states. The nominal setting is a farm, and the states are a field with a scarecrow, the lawn in front of the farm house, a stump with an axe, and a pasture with cows. Each is associated with a picture of the scene obtained from the internet. These states were chosen to exemplify categories previously shown to elicit a univariate response in different brain regions, namely faces, houses, tools, and animals 77 , 78 , 79 .

Over the course of the experiment, participants were told the locations of treasure chests and the keys needed to open them. By arriving at a chest with the key, participants earned points which were converted to a monetary bonus at the end of the experiment. The states were arranged in a square, where each state was accessible from the two adjacent states but not the state in the opposite corner (diagonal movement was not allowed).

Each trial began with the presentation of a text screen displaying the relevant information for the next trial, namely the locations of the participant, the key, and the chest (Fig.  2 ). Because the neural patterns elicited during the presentation were the primary target of the decoding analysis, it was important that visual information be as similar as possible across different goal configurations, to avoid potential confounds. To hold luminance as constant as possible across conditions, each line always had the same number of characters. Since, for instance, “Farm House: key” has fewer characters than “Farm House: Nothing”, filler characters were added to the shorter lines, namely Xs and Os. On some trials Xs were the filler characters on the top row and Os were the filler characters on the bottom rows. This manipulation allowed us to attempt to decode the relative position of the Xs and Os to test whether decoding could be achieved due only to character-level differences in the display. We found no evidence that our results reflect low level visual confounds such as the properties of the filler characters.

Participants were under no time constraint on the information screen and pressed a button when they were ready to continue. A delay screen then appeared consisting of four empty boxes. After a jittered interval (1–6 s, distributed exponentially), arrows appeared in the boxes. The arrows represented movement directions, and the boxes corresponded to four buttons under the participant's left middle finger, left index finger, right index finger, and right middle finger, from left to right. Participants pressed the button corresponding to the box with the arrow pointing in the desired direction to initiate a movement. A fixation cross then appeared for another jittered delay of 0–4 s, followed by a 2 s display of the newly reached location if their choice was correct or an error screen if it was incorrect.

If the participant did not yet have the key required to open the chest, the correct movement was always to the key. Sometimes the key and chest were in the same location in which case the participant would earn points immediately. If they were in different locations, then on the next trial the participant had to move to the chest. This structure facilitated a mix of goal distances (one and two states away) while controlling the route required to navigate to the goal.

If the chosen direction was incorrect, participants saw an error screen displaying text and a map of the environment. Participants advanced from this screen with a button press and then restarted the failed trial. If the failed trial was the second step in a two-step sequence (i.e., if they had already gotten the key and then moved to the wrong state to get to the chest), they had to repeat the previous two trials.

Repeating the failed trial ensured that there were balanced numbers of each class of event for decoding, since an incorrect response indicated that some information was not properly maintained or utilized. For example, if a participant failed the second step of a two-trial sequence, then they may not have properly encoded the final goal when first presented with the information screen on the previous trial, which specified the location of the key and the chest.

Halfway through the experiment, the map was reconfigured such that states were swapped across the diagonal axes of the map. This was necessary because otherwise, each state could be reached by exactly two movement directions and exactly two movement directions could be made from it. For instance, if the farm house was the state in the lower left, the farmhouse could only be reached by moving left or down from adjacent states, and participants starting at the farm house could only move up or to the right. If this were true across the entire experiment, above-chance classification of target state, for instance, could appear in regions that in fact only contain information about the intended movement direction.
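Treating the map as a 2 × 2 grid, the reconfiguration amounts to moving every state to the diagonally opposite corner. A small sketch (state names and starting positions are hypothetical, following the farm house example above):

```python
# The farm map treated as a 2x2 grid of (row, col) positions; the
# mid-experiment reconfiguration moves each state to the diagonally
# opposite corner.
def swap_diagonal(positions):
    return {state: (1 - r, 1 - c) for state, (r, c) in positions.items()}

first_half = {"farm house": (1, 0),   # lower left, as in the example above
              "cow field": (0, 0),
              "pig pen": (0, 1),      # these two names are placeholders
              "barn": (1, 1)}
second_half = swap_diagonal(first_half)
assert second_half["farm house"] == (0, 1)   # now the upper-right corner
```

After the swap, each state is reachable by the two movement directions that previously could not reach it, decoupling state identity from movement direction across the experiment as a whole.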

Each state was the starting state for one quarter of the trials and the target destination for a different quarter of the trials. All trials were one of three types. One category consisted of single-trial (single) sequences in which the chest and key were in the same location. The sequences in which the chest and key were in separate locations required two trials to complete, one to move from the initial starting location to the key and another to move from the key location to the chest location. These two steps formed the other two classes of trials, the first-of-two (first) and second-of-two (second) trials. Recall that on second trials, no information other than the participant’s current location is presented on the starting screen to ensure that the participant maintained the location of the chest in memory across the entire two-trial sequence (if it was presented on the second trial, there would be no need to maintain that information through the first trial). The trials were evenly divided into single, first, and second classes with 64 trials in each class. Therefore, every trial had a starting state and an immediate goal, while one third of trials also had a more distant final goal.

Immediately prior to participating in the fMRI version of the task, participants completed a short 16-trial practice outside the scanner to refresh their memory. Before beginning the first run inside the scanner, participants saw a map of the farm states and indicated when they had memorized it before moving on. Within each run, participants completed as many trials as they could within eight minutes. As described above, exactly halfway through the trials, the state space was rearranged with each state moving to the opposite corner. Therefore, when participants completed the first half of the experiment, the current run was terminated and participants were given time to learn the new state space before scanning resumed. At the end of the experiment, participants filled out a short survey about their strategy.

Participants

In total, 49 participants (28 female) completed the behavioral-only portion of the experiment, including during task piloting (early versions of the behavioral task differed slightly from the task described above). Participants provided written informed consent in accordance with the Institutional Review Board at Indiana University and were compensated $10/hour for their time plus a performance bonus of up to an additional $10 based on accuracy. The behavioral task first served as a pilot during task design and then as a pre-screen for the fMRI portion, in that only participants with at least 90% accuracy were invited to participate. Additional criteria for scanning were that subjects be right-handed, free of metal implants, free of claustrophobia, weigh less than 440 pounds, and not be currently taking psychoactive medication. In total, 25 participants began the fMRI task, but one subject withdrew shortly after starting, leaving 24 subjects who completed the imaging task (14 female). Across the 24 subjects, the average error rate during the fMRI task was 2.4%. Error trials were modeled separately in the fMRI analysis and were not analyzed further, as there were too few for a meaningful analysis.

fMRI acquisition and data preprocessing

Imaging data were collected on a Siemens Magnetom Trio 3.0-Tesla MRI scanner with a 32-channel head coil. Foam padding was inserted around the sides of the head to increase participant comfort and reduce head motion. Functional T2*-weighted images were acquired using a multiband EPI sequence 80 with 42 contiguous slices and 3.44 × 3.44 × 3.4 mm³ voxels (echo time = 28 ms; flip angle = 60°; field of view = 220 mm; multiband acceleration factor = 3). For the first subject, the TR was 813 ms, but during data collection for the second subject the TR changed to 816 ms for unknown reasons. The scanner was upgraded after collecting data from an additional five subjects, at which point the TR remained constant at 832 ms. All other parameters remained unchanged. High-resolution T1-weighted MPRAGE images were collected for spatial normalization (256 × 256 × 160 matrix of 1 × 1 × 1 mm³ voxels; TR = 1800 ms; echo time = 2.56 ms; flip angle = 9°).

Functional data were spike-corrected using AFNI's 3dDespike ( http://afni.nimh.nih.gov/afni ). Functional images were corrected for differences in slice timing using sinc interpolation and for head movement using a least-squares approach with a 6-parameter rigid-body spatial transformation. For subjects who moved more than 3 mm in total or 0.5 mm between TRs, 24 motion regressors were added to subsequent GLM analyses 81 .

Because MVPA and representational similarity analysis (RSA) rely on precise voxelwise patterns, these analyses were performed before spatial normalization. For the univariate analyses, structural data were coregistered to the functional data and segmented into gray and white matter probability maps 82 . These segmented images were used to calculate spatial normalization parameters to the MNI template, which were subsequently applied to the functional data. As part of spatial normalization, the data were resampled to 2 × 2 × 2 mm³; this upsampling preserved as much of the original pattern information as possible. All analyses included a temporal high-pass filter (128 s) and correction for temporal autocorrelation using an autoregressive AR(1) model.

Univariate GLM

For initial univariate analyses, we measured the neural response associated with each outcome state at the outcome screen (when an image of the state was displayed), as well as the signal at the start of the trial associated with each immediate goal location. Five timepoints were modeled in the GLM used in this analysis, namely the start of the trial, the button press to advance, the appearance of the arrows and subsequent response, the start of the feedback, and the end of the feedback. The regressors marking the start of the trial and the start of the feedback screen were further individuated by the immediate goal on the trial. A separate error regressor was used when the response was incorrect, meaning the participant did not properly pursue the immediate goal and received error feedback. All correct trials in which participants moved to, for instance, the cow field used the same trial start and feedback start regressors.

The GLM was fit to the normalized functional images. The resulting beta maps were combined at the second level with a voxel-wise threshold of p < 0.001 and cluster corrected ( p < 0.05) to control for multiple comparisons. We assessed the univariate response associated with each outcome location by contrasting each particular outcome location with all other outcome locations. The response to the error feedback screen was assessed in a separate contrast against all correct outcomes. To test for any univariate responses related to the immediate goal, we performed an analogous analysis using the trial start regressors, which were individuated based on the immediate goal. For example, the regressor ‘trialStartHouseNext’ was associated with the beginning of every trial where the farmhouse was the immediate goal location. To assess the univariate signal associated with the farmhouse immediate goal, we performed a contrast between this regressor and all other trial start regressors.

Representational similarity analysis (RSA)

As before, a GLM was fit to the realigned functional images. The following events were modeled with impulse regressors: trial onset (information screen), key press to advance to the decision screen, the prompt and immediately subsequent action (modeled as a single regressor), the onset of the outcome screen, and the termination of the outcome screen. The RSA analysis used beta maps derived from the regressors marking trial onset, prompt/response, and outcome screen onset.

Each of these regressors (except those used on error trials) was further individuated by the (start state, next state, final goal) triple constituting the goal path. There were 8 distinct trial types starting in each state. Each state could serve as the starting point of two single-step sequences (in which the key and treasure chest are in the same location) and four two-step sequences (in which the key and treasure chest are in different locations). Each state could also be the midpoint of a two-step sequence with the treasure chest located in one of two adjacent states. With three regressors used for each trial, there were 4 starting states × 8 trial types × 3 time points = 96 total patterns used to create the Representational Dissimilarity Matrix (RDM) in each searchlight region, where cell (i, j) in the RDM is defined as one minus the Pearson correlation between the ith and jth patterns. Values close to 2 therefore represent negative correlation (high representational distance) while values close to 0 indicate positive correlation (low representational distance).
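The RDM computation itself is compact; a minimal NumPy sketch (illustrative only, not the analysis code used here):

```python
import numpy as np

def build_rdm(patterns):
    """Given an (n_patterns, n_voxels) array, return the n x n RDM whose
    entry (i, j) is 1 - Pearson correlation between rows i and j.
    Values near 0 indicate similar patterns; values near 2 indicate
    strongly anticorrelated patterns."""
    return 1.0 - np.corrcoef(patterns)

pats = np.array([[1.0, 2.0, 3.0],
                 [1.0, 2.0, 3.1],    # nearly identical to row 0
                 [3.0, 2.0, 1.0]])   # reversed: anticorrelated with row 0
rdm = build_rdm(pats)
assert rdm[0, 0] < 1e-9              # self-dissimilarity is 0
assert rdm[0, 1] < 0.01              # similar patterns -> near 0
assert rdm[0, 2] > 1.9               # anticorrelated -> near 2
```

In the actual analysis the rows would be the 96 beta patterns per searchlight region rather than these toy vectors.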

To derive the model-based RDMs, the GOLSA model was run on an analogue of the goal pursuit task, using a four-state state space with four actions corresponding to movement in each cardinal direction. The model layer timecourses of activity are shown in Figs. 6 and 7 for one- and two-step trials, respectively. The base GOLSA model is not capable of maintaining a plan across an arbitrary delay, but instead acts immediately to make the necessary state transitions. The competitive queue 83 module allows state transition sequences to be maintained and executed after a delay and was therefore necessary to model the task as accurately as possible. However, the goal-learning module was not necessary, since goals were externally imposed. Because participants had to demonstrate high performance on the task before entering the scanner, little if any learning took place during the experiment. Accordingly, the model was trained extensively on the state space before performing any trials used in data collection. To further simulate likely patterns of activity in the absence of significant learning, the input from state to goal-gradient (used in the learning phase of an oscillatory cycle) was removed and the goal-gradient layer received steady input from the goal layer, interrupted only by the state-change inhibition signal. In other words, the goal-gradient layer continuously represented the actual goal gradient rather than shifting into learning mode half of the time.

figure 6

Model activity during a simulated one-step sequence of the Treasure Hunt task. The competitive queueing module first loads a plan and then executes it sequentially. State activity shows that the agent remains in state 1 for the first half of the simulation, while simulated-state (StateSim) shows the state transition the agent simulates as it forms its plan. Adjacent-states (Adjacent) receives input from StateSim and, together with goal-gradient (Gradient) activity, determines the desired next state and therefore the appropriate transition to make. The plan is kept in queue-store (Store), which receives a burst of input from queue-input (QueueIn) and finally executes the plan by sending output to queue-output (QueueOut), which drives the motor system. The vertical dashed lines indicate the different phases of the simulation used in the creation of the model RDMs. For each layer, activity within each period was averaged across time to form a single vector representing the average pattern for that time period in the trial type being simulated. The bounds of each phase were determined qualitatively. The planning period is longer than the acting and outcome periods because the model takes longer to form a plan than to execute it or observe the outcome.

figure 7

Model activity during a simulated two-step sequence of the Treasure Hunt task. The competitive queueing module first loads a plan and then executes it sequentially. State activity shows that the agent remains in state 1 for the first half of the simulation, while simulated-state shows the state transitions the agent simulates as it forms its plan. Adjacent-states receives input from simulated-state and, together with goal-gradient activity, determines the desired next state and therefore the appropriate transitions to make. The plan is kept in queue-store, which receives bursts of input from queue-input and finally executes the plan by sequentially sending output to queue-output, which drives the motor system. To force the agent to go to the appropriate intermediate state, goal activity first reflects the key location and then the chest location. The vertical dashed lines indicate time periods used when creating the RDMs for the two-step sequence simulations. The first three time periods correspond to the first trial in the sequence, while the latter three correspond to the second trial in the sequence. Again, the first planning period is much longer due to the nature of the model dynamics. During the second “planning” period (P2), the plan was already formed, as must have been the case in the actual experiment: on the second trial of a two-step sequence, no goal information was presented at the start of the trial, so it had to be remembered from the previous trial.

In the task, participants first saw an information screen from which they could determine the immediate goal state and the appropriate next action. This plan was maintained over a delay before being implemented. At the beginning of each trial simulation, the queuing module was set to “load” while the model interactions determined the best method of getting from the current state to the goal state. This period is analogous to the period in which subjects look at the starting information screen and plan their next move. Then, the queuing module was set to “execute,” modeling the period in which participants are prompted to make their selection. Finally, the chosen action implements a state transition and the environment provides new state information to the state layer, modeling the outcome phase of the experiment.
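The load-then-execute behavior of the queuing module can be illustrated with a highly simplified competitive-queuing readout: the stored plan is a gradient of activation strengths, and execution repeatedly selects the most active item and suppresses it. This is a sketch of the principle only, not the GOLSA implementation, which uses continuous neural dynamics:

```python
import numpy as np

def cq_execute(plan_strengths):
    """Competitive-queuing readout. `plan_strengths` holds nonnegative
    activations loaded into the queue store; items are executed in order
    of strength, each being suppressed to zero once emitted."""
    store = np.array(plan_strengths, dtype=float)
    order = []
    while np.any(store > 0):
        i = int(np.argmax(store))   # strongest remaining item wins
        order.append(i)
        store[i] = 0.0              # suppress the executed item
    return order

# A two-step plan: action 2 was loaded most strongly, then action 0.
assert cq_execute([0.6, 0.0, 0.9, 0.0]) == [2, 0]
```

The "load" phase corresponds to setting up the strength gradient while the participant views the information screen; the "execute" phase corresponds to the readout loop after the prompt appears.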

Some pairs of trials in the task comprised a two-step sequence in which the final goal was initially two states away from the starting state. On the second trial of such sequences, participants were not provided any information on the information screen at the start of the trial, ensuring that they had encoded and maintained all goal-related information from the information screen presented at the start at the first trial in the sequence. These pairs of trials were modeled within a single GOLSA simulation. The model seeks the quickest path to the goal, identifying immediately available subgoals as needed. However, in the task, the location of the key necessitated a specific path to reach the final goal of the treasure chest. To provide these instructions to the model at the start of a two-step simulation, the goal representation from the subgoal (the key) was provided to the model first until the appropriate action was loaded and then the goal representation shifted to the final goal (the chest). Once the full two-step state transition sequence was loaded in the queue, the actions were read out sequentially, as shown in Fig.  7 .

A separate RDM was generated for each model layer. Patterns were extracted from three time intervals per action (six total for the two-step sequence simulations). Due to the time required to load the queue, the first planning period was longer than all other intervals. For each simulation and time point, the patterns of activity across the units were averaged over time, yielding one vector. Each trial type was repeated 10 times, and the patterns generated in the previous step were averaged across simulation repetitions. The activity of each layer was thus summarized with at most 96 patterns of activity, which were converted into an RDM by taking one minus the Pearson correlation between each pair of patterns. Patterns in which all units were 0 were ignored, since the correlation is undefined for constant vectors.
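The exclusion of all-zero patterns can be folded into the model RDM construction, with excluded cells marked as undefined. A minimal sketch (illustrative, not the actual simulation code):

```python
import numpy as np

def model_rdm(patterns):
    """Return 1 - Pearson correlation between row patterns, skipping any
    pattern whose units are all zero (correlation is undefined for
    constant vectors). Skipped rows and columns are left as NaN."""
    patterns = np.asarray(patterns, dtype=float)
    n = patterns.shape[0]
    rdm = np.full((n, n), np.nan)
    valid = ~np.all(patterns == 0, axis=1)      # mask of usable patterns
    idx = np.where(valid)[0]
    rdm[np.ix_(idx, idx)] = 1.0 - np.corrcoef(patterns[idx])
    return rdm

pats = [[1, 2, 3], [0, 0, 0], [3, 2, 1]]        # middle pattern is all-zero
rdm = model_rdm(pats)
assert np.isnan(rdm[0, 1]) and np.isnan(rdm[1, 1])
assert abs(rdm[0, 2] - 2.0) < 1e-9              # anticorrelated pair
```

The NaN cells are then simply dropped when comparing the model RDM against neural RDMs.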

Because the learning phase of the oscillation was disabled in these simulations, we looked for neural regions corresponding to the layers that play a critical role during the acting phase of the model's typical learning oscillation. We created RDMs from the following layers: current-state, adjacent-states, goal, goal-gradient, next-desired-state, desired-transition, action-out, simulated-state, and queue-store. As a control, we also added a layer component that generated normally distributed noise (μ = 1, σ = 1).

RSA searchlight

The searchlight analysis was conducted using the Representational Similarity Analysis Toolbox, developed at the University of Cambridge ( http://www.mrc-cbu.cam.ac.uk/methods-and-resources/toolboxes/license/ ). For each of these layer RDMs, a searchlight with a radius of 10 mm was moved through the entire brain. At each voxel, an RDM was created from the patterns in the spherical region centered on that voxel.

An r value was obtained for each voxel by computing the Spearman correlation between the searchlight RDM and the model layer RDM, ignoring trial time periods in which all model units showed no activity. A full pass of the searchlight over the brain produced a whole-brain r map for each subject for each layer. Voxels in regions that perform a function similar to a model component will produce RDMs similar to that component's RDM and thus will be assigned relatively high values. The r maps were then Fisher-transformed into z maps ( \(z=\frac{1}{2}\mathrm{ln}\left(\frac{1+r}{1-r}\right)\) ). The z maps were normalized to the MNI template but were not smoothed, as the searchlight method already introduces substantial smoothing. Second-level effects were assessed with a t test on the normalized z maps, with a cluster-defining threshold of p < 0.001, cluster corrected to p < 0.05 overall. Cluster significance was determined by SPM5 and verified for clusters ≥ 24 voxels in size with a version of 3DClustSim (compile date Jan. 11, 2017) that corrects for the alpha inflation found in previous 3DClustSim versions 84 . The complete results are shown in Table 1.
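The per-voxel scoring, a Spearman correlation of RDM upper triangles followed by the Fisher transform, can be sketched as follows (illustrative only; the analysis itself used the RSA Toolbox, and this simple rank computation ignores ties):

```python
import numpy as np

def searchlight_score(neural_rdm, model_rdm):
    """Spearman-correlate the upper triangles of a searchlight RDM and a
    model-layer RDM, skipping NaN cells (e.g., from all-zero model
    patterns), then Fisher-transform: z = 0.5 * ln((1 + r) / (1 - r)).
    Ranks use a plain double argsort, which assumes distinct values."""
    neural_rdm = np.asarray(neural_rdm, dtype=float)
    model_rdm = np.asarray(model_rdm, dtype=float)
    iu = np.triu_indices_from(neural_rdm, k=1)   # off-diagonal cells only
    a, b = neural_rdm[iu], model_rdm[iu]
    keep = ~(np.isnan(a) | np.isnan(b))
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    r = np.corrcoef(rank(a[keep]), rank(b[keep]))[0, 1]
    return np.arctanh(r)                         # Fisher z-transform
```

Repeating this at every searchlight center yields the whole-brain z map that enters the second-level t test.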

Data availability

The GOLSA model code for the simulations is available at https://github.com/CogControl/GolsaOrigTreasureHunt . Imaging data are available from the corresponding author on reasonable request.

Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518 , 529–533 (2015).


Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550 , 354–359 (2017).

Palm, G. & Schwenker, F. Artificial development by reinforcement learning can benefit from multiple motivations. Front. Robot. AI 6 , 6 (2019).


Adams, S. et al. Mapping the landscape of human-level artificial general intelligence. AI Mag. 33 , 25 (2012).


Jonas, E. & Kording, K. Could a neuroscientist understand a microprocessor? bioRxiv https://doi.org/10.1101/055624 (2016).

Brown, J. W. The tale of the neuroscientists and the computer: Why mechanistic theory matters. Front. Neurosci. 8 (2014).

Wilson, R. C., Takahashi, Y. K., Schoenbaum, G. & Niv, Y. Orbitofrontal cortex as a cognitive map of task space. Neuron 81 , 267–279 (2014).


Schoenbaum, G., Takahashi, Y., Liu, T.-L. & McDannald, M. A. Does the orbitofrontal cortex signal value? Ann. N. Y. Acad. Sci. 1239, 87–99 (2011).


Whyte, A. J. et al. Reward-related expectations trigger dendritic spine plasticity in the mouse ventrolateral orbitofrontal cortex. J. Neurosci. https://doi.org/10.1523/JNEUROSCI.2031-18.2019 (2019).

Vikbladh, O. M. et al. Hippocampal contributions to model-based planning and spatial memory. Neuron 102 , 683-693.e4 (2019).

Buckner, R. L. The role of the hippocampus in prediction and imagination. Annu. Rev. Psychol. 61 , 27–48 (2010).

Cools, A. R. Role of the neostriatal dopaminergic activity in sequencing and selecting behavioural strategies: Facilitation of processes involved in selecting the best strategy in a stressful situation. Behav. Brain Res. 1 , 361–378 (1980).

Nee, D. E. & Brown, J. W. Rostral-caudal gradients of abstraction revealed by multi-variate pattern analysis of working memory. Neuroimage 63 , 1285–1294 (2012).

Riggall, A. C. & Postle, B. R. The relationship between working memory storage and elevated activity as measured with functional magnetic resonance imaging. J. Neurosci. 32 , 12990–12998 (2012).

Miller, E. K. & Cohen, J. D. An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202 (2001).

Donoso, M., Collins, A. G. E. & Koechlin, E. Human cognition. Foundations of human reasoning in the prefrontal cortex. Science 344 , 1481–1486 (2014).

Aron, A. R. The neural basis of inhibition in cognitive control. Neuroscientist 13, 214–228 (2007).

Alexander, W. H. & Brown, J. W. Medial prefrontal cortex as an action-outcome predictor. Nat. Neurosci. 14, 1338–1344 (2011).

Brown, J. W. & Alexander, W. H. Foraging value, risk avoidance, and multiple control signals: How the anterior cingulate cortex controls value-based decision-making. J. Cogn. Neurosci. 29 , 1656–1673 (2017).

Cooper, R. & Shallice, T. Contention scheduling and the control of routine activities. Cogn. Neuropsychol. 17 , 297–338 (2000).

Leung, J., Shen, Z., Zeng, Z. & Miao, C. Goal modelling for deep reinforcement learning agents. 271–286 (2021). https://doi.org/10.1007/978-3-030-86486-6_17.

Todorov, E. & Jordan, M. I. Optimal feedback control as a theory of motor coordination. Nat. Neurosci. 5 , 1226–1235 (2002).

Yan, G. et al. Network control principles predict neuron function in the Caenorhabditis elegans connectome. Nature 550 , 519–523 (2017).

Gu, S. et al. Optimal trajectories of brain state transitions. Neuroimage 148 , 305–317 (2017).

Stiso, J. et al. White matter network architecture guides direct electrical stimulation through optimal state transitions. Cell Rep. 28 , 2554-2566.e7 (2019).

Golub, M. D. et al. Learning by neural reassociation. Nat. Neurosci. 21 , 607–616 (2018).

Powers, W. T. Quantitative analysis of purposive systems: Some spadework at the foundations of scientific psychology. Psychol. Rev. 85 , 417–435 (1978).

Marken, R. S. & Mansell, W. Perceptual control as a unifying concept in psychology. Rev. Gen. Psychol. 17 , 190–195 (2013).

Juechems, K. & Summerfield, C. Where does value come from? PsyArXiv https://doi.org/10.31234/osf.io/rxf7e (2019).

Carroll, T. J., McNamee, D., Ingram, J. N. & Wolpert, D. M. Rapid visuomotor responses reflect value-based decisions. J. Neurosci. https://doi.org/10.1523/JNEUROSCI.1934-18.2019 (2019).

Hart, P., Nilsson, N. & Raphael, B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 4 , 100–107 (1968).

Dijkstra, E. W. A note on two problems in connexion with graphs. Numer. Math. 1 , 269–271 (1959).


Silver, D. et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362 , 1140–1144 (2018).


Fine, J. M., Zarr, N. & Brown, J. W. Computational neural mechanisms of goal-directed planning and problem solving. Comput. Brain Behav. 3 , 472–493 (2020).

Martinet, L.-E., Sheynikhovich, D., Benchenane, K. & Arleo, A. Spatial learning and action planning in a prefrontal cortical network model. PLoS Comput. Biol. 7 , e1002045 (2011).

Ivey, R., Bullock, D. & Grossberg, S. A neuromorphic model of spatial lookahead planning. Neural Netw. 24 , 257–266 (2011).

Knoblock, C. A. Abstracting the Tower of Hanoi. In Working Notes of the AAAI-90 Workshop on Automatic Generation of Approximations and Abstractions 1–11 (1990).

Kriegeskorte, N. Representational similarity analysis–connecting the branches of systems neuroscience. Front. Syst. Neurosci. https://doi.org/10.3389/neuro.06.004.2008 (2008).

Averbeck, B. B., Chafee, M. V., Crowe, D. A. & Georgopoulos, A. P. Parallel processing of serial movements in prefrontal cortex. Proc. Natl. Acad. Sci. USA 99 , 13172–13177 (2002).

Rhodes, B. J., Bullock, D., Verwey, W. B., Averbeck, B. B. & Page, M. P. Learning and production of movement sequences: Behavioral, neurophysiological, and modeling perspectives. Hum. Mov. Sci. 23 , 699–746 (2004).

Gilbert, D. T. & Wilson, T. D. Prospection: experiencing the future. Science 317 , 1351–1354 (2007).

Buckner, R. L. & Carroll, D. C. Self-projection and the brain. Trends Cogn. Sci. 11 , 49–57 (2007).

Rao, R. P. N. & Ballard, D. H. Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 2 , 79–87 (1999).

Kriegeskorte, N., Formisano, E., Sorger, B. & Goebel, R. Individual faces elicit distinct response patterns in human anterior temporal cortex. Proc. Natl. Acad Sci. USA 104 , 20600–20605 (2007).

Guest, O. & Love, B. C. What the success of brain imaging implies about the neural code. Elife 6 (2017).

Tzourio-Mazoyer, N. et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15 , 273–289 (2002).

Baldassarre, G. et al. Intrinsically motivated action-outcome learning and goal-based action recall: A system-level bio-constrained computational model. Neural Netw. 41 , 168–187 (2013).

Barto, A., Sutton, R. & Anderson, C. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. SMC-13, 834–846 (1983).

Reynolds, J. H. & Heeger, D. J. The normalization model of attention. Neuron 61 , 168–185 (2009).

Banino, A. et al. Vector-based navigation using grid-like representations in artificial agents. Nature 557 , 429–433 (2018).

Brown, J. W., Bullock, D. & Grossberg, S. How laminar frontal cortex and basal ganglia circuits interact to control planned and reactive saccades. Neural Netw. 17 , 471–510 (2004).

Niki, H. & Watanabe, M. Prefrontal and cingulate unit activity during timing behavior in the monkey. Brain Res. 171 , 213–224 (1979).

Friston, K. The free-energy principle: A unified brain theory? Nat. Rev. Neurosci. 11, 127–138 (2010).

Friston, K., Mattout, J. & Kilner, J. Action understanding and active inference. Biol. Cybern. 104 , 137–160 (2011).

Alexander, W. H. & Brown, J. W. Medial prefrontal cortex as an action-outcome predictor. Nat. Neurosci. 14 , 1338–1344 (2011).

Alexander, W. H. & Brown, J. W. A general role for medial prefrontal cortex in event prediction. Front. Comput. Neurosci. 8 (2014).

Mante, V., Sussillo, D., Shenoy, K. V. & Newsome, W. T. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature 503 , 78–84 (2013).

Khaligh-Razavi, S.-M. & Kriegeskorte, N. Deep supervised, but not unsupervised, models may explain it cortical representation. PLoS Comput. Biol. 10 , e1003915 (2014).

Wen, H. et al. Neural encoding and decoding with deep learning for dynamic natural vision. Cereb. Cortex 28 , 4136–4160 (2018).

Williamson, R. C., Doiron, B., Smith, M. A. & Yu, B. M. Bridging large-scale neuronal recordings and large-scale network models using dimensionality reduction. Curr. Opin. Neurobiol. 55 , 40–47 (2019).

Yang, G. R., Joglekar, M. R., Song, H. F., Newsome, W. T. & Wang, X.-J. Task representations in neural networks trained to perform many cognitive tasks. Nat. Neurosci. 22 , 297–306 (2019).

Watkins, C. J. C. H. & Dayan, P. Q-learning. Mach. Learn. 8 , 279–292 (1992).

Pfeiffer, B. E. & Foster, D. J. Hippocampal place-cell sequences depict future paths to remembered goals. Nature 497 , 74–79 (2013).

Van der Meer, M. A. & Redish, A. D. Expectancies in decision making, reinforcement learning, and ventral striatum. Front. Neurosci. 4 , 29–37 (2010).

Platt, J. R. Strong inference: Certain systematic methods of scientific thinking may produce much more rapid progress than others. Science 146 , 347–353 (1964).

Glasius, R., Komoda, A. & Gielen, S. C. A. M. Neural network dynamics for path planning and obstacle avoidance. Neural Netw. 8 , 125–133 (1995).

Alexander, W. H. & Brown, J. W. Frontal cortex function as derived from hierarchical predictive coding. Sci. Rep. 8 , 3843 (2018).

Badre, D. & D’Esposito, M. Is the rostro-caudal axis of the frontal lobe hierarchical? Nat. Rev. Neurosci. 10, 659–669 (2009).

Cooper, R. P. & Shallice, T. Hierarchical schemas and goals in the control of sequential behavior. Psychol. Rev. 113 , 887–916 (2006).

Dolan, R. J. & Dayan, P. Goals and Habits in the Brain. Neuron 80 , 312–325 (2013).

Moors, A. & De Houwer, J. Automaticity: A theoretical and conceptual analysis. Psychol. Bull. 132 , 297–326 (2006).

Grossberg, S. Contour enhancement, short term memory, and constancies in reverberating neural networks. Stud. Appl. Math. 52 , 213–257 (1973).

Busemeyer, J. R. & Townsend, J. T. Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment. Psychol. Rev. https://doi.org/10.1037/0033-295X.100.3.432 (1993).

Usher, M. & McClelland, J. L. The time course of perceptual choice: The leaky, competing accumulator model. Psychol. Rev. https://doi.org/10.1037/0033-295X.108.3.550 (2001).

Barto, A. G., Sutton, R. S. & Brouwer, P. S. Associative search network: A reinforcement learning associative memory. Biol. Cybern. 40 , 201–211 (1979).

Mathôt, S., Schreij, D. & Theeuwes, J. OpenSesame: An open-source, graphical experiment builder for the social sciences. Behav. Res. Methods 44 , 314–324 (2012).

Kanwisher, N., McDermott, J. & Chun, M. M. The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci. 17 , 4302–4311 (1997).

O’Craven, K. M., Downing, P. E. & Kanwisher, N. fMRI evidence for objects as the units of attentional selection. Nature 401 , 584–587 (1999).

Anzellotti, S., Mahon, B. Z., Schwarzbach, J. & Caramazza, A. Differential activity for animals and manipulable objects in the anterior temporal lobes. J. Cogn. Neurosci. 23 , 2059–2067 (2011).

Moeller, S. et al. Multiband multislice GE-EPI at 7 tesla, with 16-fold acceleration using partial parallel imaging with application to high spatial and temporal whole-brain FMRI. Magn. Reson. Med. 63 , 1144–1153 (2010).

Nee, D. E. & D’Esposito, M. The hierarchical organization of the lateral prefrontal cortex. Elife 5 (2016).

Ashburner, J. & Friston, K. Multimodal image coregistration and partitioning-a unified framework. Neuroimage 6 , 209–217 (1997).

Bullock, D. Adaptive neural models of queuing and timing in fluent action. Trends Cogn. Sci. 8 , 426–433 (2004).

Eklund, A., Nichols, T. E. & Knutsson, H. Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates. Proc. Natl. Acad. Sci. https://doi.org/10.1073/pnas.1602413113 (2016).

Acknowledgements

We thank A. Ramamoorthy for helpful discussions and J. Fine and W. Alexander for helpful comments on the manuscript. Supported by the Indiana University Imaging Research Facility. JWB was supported by NIH R21 DA040773.

Author information

Authors and Affiliations

Department of Psychological and Brain Sciences, Indiana University, Bloomington, USA

Noah Zarr & Joshua W. Brown

Contributions

J.W.B. and N.Z. designed the model and experiment. N.Z. implemented and simulated the model, implemented and ran the fMRI experiment, and analyzed the data. J.W.B. and N.Z. wrote the paper.

Corresponding author

Correspondence to Joshua W. Brown.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article

Zarr, N., Brown, J.W. Foundations of human spatial problem solving. Sci Rep 13, 1485 (2023). https://doi.org/10.1038/s41598-023-28834-3

Received: 09 September 2022

Accepted: 25 January 2023

Published: 27 January 2023

DOI: https://doi.org/10.1038/s41598-023-28834-3


Spatial Problem Solving in Spatial Structures

Ana-Maria Olteteanu

2017, Multi-disciplinary trends in artificial intelligence

The ability to solve spatial tasks is crucial for everyday life and therefore of great importance for cognitive agents. In artificial intelligence (AI) we model this ability by representing spatial configurations and spatial tasks in the form of knowledge about space and time. Augmented by appropriate algorithms, such representations enable the generation of knowledge-based solutions to spatial problems. In comparison, natural embodied and situated cognitive agents often solve spatial tasks without detailed knowledge about underlying geometric and mechanical laws and relationships. They directly relate actions and their effects through physical affordances inherent in their bodies and their environments. Examples are found in everyday reasoning and also in descriptive geometry. In an ongoing research effort we investigate how spatial and temporal structures in the body and the environment can support or even replace reasoning effort in computational processes. We call the direct use of spatial structure Strong Spatial Cognition. Our contribution describes cognitive principles of an extended paradigm of cognitive processing. The work aims (i) to understand the effectiveness and efficiency of natural problem solving approaches; (ii) to overcome the need for detailed representations required in the knowledge-based approach; and (iii) to build computational cognitive systems that make use of these principles.

Representations in mind and world

Ana-Maria Olteteanu

A spatial problem is (1) a question about a given spatial configuration (of arbitrary physical entities) that needs to be answered (e.g., is there wine in the glass?) or (2) the challenge to construct a spatial configuration with certain properties from a given spatial configuration (e.g., add two matchsticks to the given configuration to obtain four squares) (Bertel, 2010). By spatial configurations we mean arrangements of entities in 1-, 2-, or 3-dimensional physical space, where physical space is commonsensically observable Euclidean space and motion, rather than relativistic space-time. Physical space is contrasted here with abstract space of arbitrary dimensionality. Physical space affords certain actions, such as (i) rotation (circular motion of objects around a given location); (ii) motion from one location to another; (iii) deformation of objects; (iv) separation of objects into parts; (v) aggregation of objects; and (vi) combinations, i.e., rotation around a changing location. A special feature of commonsense physical space (CPS) is that operations such as motion are severely constrained and comply with rigid rules we cannot change, whereas in abstract spaces we are free to make up arbitrary rules about which operations are possible and which are not. For example, in abstract representations of space (AbsRS) we could allow a 'jump' operation that moves an entity directly from one location to a remote location (as in some board games). In CPS this is not possible: objects always first move to neighboring locations and then to a neighbor of that location, etc., before they can reach a remote location (this holds for arbitrary granularities of neighborhoods). This has implications for the trajectories (including the time course) of motion. As a second example, in abstract space we could come up with an operation that allows an entity to be in two places at the same time. In CPS this is not possible because of the nature of physical space and matter.
This has implications for unique identity, presence in a space, containment within it, and access to it. In abstract space, the types of operations possible are defined by the agent conceiving the abstract space, while in CPS they depend on the nature of physical space itself. The types of actions that can be performed in CPS define the characteristic structure of physical space (Freksa, 1997) that is exploited by Euclidean geometry and vice versa (Euclid, 300 BC/1956). In this chapter, we discuss (1) how cognitive agents such as humans, other animals, or robots can use concrete CPS and AbsRS for solving spatial problems and (2) the relative merits of both approaches. We describe how the approaches can be combined. We look at the roles of spatial configurations and of cognitive agents in the process of spatial problem solving from a cognitive architecture perspective. In particular, we discuss (a) the role of the structures of space and time; and (b) the role of conceptualizations.
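The neighborhood constraint described above (objects can reach a remote location only via successive neighboring locations, whereas an abstract space may permit a direct 'jump') can be made concrete in a small sketch. The grid world, function name, and jump rule below are illustrative assumptions, not taken from the chapter:

```python
from collections import deque

def shortest_path_length(start, goal, width, height, allow_jump=False):
    """Minimum number of moves from start to goal on a grid.

    With allow_jump=False, motion obeys the commonsense-physical-space
    constraint: an object may only move to a neighboring cell.
    With allow_jump=True, an abstract-space 'jump' operation moves it
    anywhere in a single step.
    """
    if allow_jump:
        return 0 if start == goal else 1
    # Breadth-first search over neighbor-to-neighbor moves.
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        (x, y), dist = frontier.popleft()
        if (x, y) == goal:
            return dist
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in seen:
                seen.add((nx, ny))
                frontier.append(((nx, ny), dist + 1))
    return None  # goal unreachable

# In physical space, trajectory length grows with distance;
# with an abstract jump rule it does not.
print(shortest_path_length((0, 0), (3, 4), 8, 8))                   # 7
print(shortest_path_length((0, 0), (3, 4), 8, 8, allow_jump=True))  # 1
```

The difference in the two results is the point of the contrast: the CPS rule forces the time course of motion to depend on spatial distance, while the abstract rule decouples them.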

Open access | Published: 23 March 2021

Why spatial is special in education, learning, and everyday activities

Toru Ishikawa & Nora S. Newcombe

Cognitive Research: Principles and Implications, volume 6, Article number: 20 (2021)

The structure of human intellect can be conceptualized as consisting of three broad but correlated domains: verbal ability, numerical ability, and spatial ability (Wai et al. 2009). Verbal and numerical abilities are traditionally emphasized in the classroom context, as the phrase "the three Rs" (reading, writing, and arithmetic) suggests. However, research has increasingly demonstrated that spatial ability also plays an important role in academic achievement, especially in learning STEM (science, technology, engineering, and mathematics) (National Research Council 2006; Newcombe 2010). For example, envisioning the shape or movement of an imagined object contributes to the understanding of intersections of solids in calculus, structures of molecules in chemistry, and the formation of landscapes in geology.

Spatial thinking is a broader topic than spatial ability, however (Hegarty 2010). We use symbolic spatial tools, such as graphs, maps, and diagrams, in both educational and everyday contexts. These tools significantly enhance human reasoning; for example, graphs are a powerful way to show the relationship among a set of variables in two (or more) dimensions. STEM disciplines use these tools frequently and, in addition, often have specific representations that students need to master, such as block diagrams in geology. Although teachers may assume that these representations are easy to read, maps, diagrams, and graphs often pose difficulty for students, especially those with low spatial ability (e.g., a graph that shows changes in an object's velocity as a function of time) (Kozhevnikov et al. 2007).

As well as understanding spatial representations that are provided by teachers or in textbooks, good spatial thinkers can choose or even create representations that are suitable for the task at hand. Novices tend to prefer representations that are realistic and detailed, often more realistic and detailed than necessary because they include irrelevant information (Hegarty 2010; Tversky and Morrison 2002). Being good at spatial thinking entails the ability to select and create appropriate spatial representations, based on sound knowledge of content in a specific domain.

Navigation is a special kind of spatial thinking, which requires us to understand our location (where we are) and orientation (which direction we are facing) in relation to the surroundings. Sometimes, we may construct reasonably accurate mental representations of the environment ("maps in the head" or "cognitive maps"). However, people often have difficulty with cognitive mapping (Ishikawa and Montello 2006; Weisberg and Newcombe 2016), especially in environmental space (beyond figural or vista space), where we cannot view a layout in its entirety from a single viewpoint (Ittelson 1973; Jacobs and Menzel 2014; Montello 1993). People thus need to move around and integrate the separate pieces of information available at each viewpoint into a common frame of reference, which poses extra cognitive processing demands (Han and Becker 2014; Holmes et al. 2018; Meilinger et al. 2014). Spatial orientation and navigation may be problematic for some people even with maps or satellite navigation (Ishikawa 2019; Liben et al. 2002).
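The integration demand described above, combining viewpoint-relative observations into one common frame of reference, is essentially path integration. A minimal computational sketch (the representation and function name are illustrative assumptions, not taken from the cited studies):

```python
import math

def integrate_path(steps):
    """Combine viewpoint-relative movements into one common frame.

    Each step is (turn_degrees, forward_distance) relative to the
    agent's current heading; the result is the position of every
    viewpoint expressed in a single allocentric coordinate system.
    """
    x, y, heading = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for turn, dist in steps:
        heading += math.radians(turn)      # update facing direction
        x += dist * math.cos(heading)      # move in the global frame
        y += dist * math.sin(heading)
        positions.append((round(x, 6), round(y, 6)))
    return positions

# Walk 3 units, turn left 90 degrees, walk 4 units:
# the final viewpoint sits at (3, 4) in the common frame.
print(integrate_path([(0, 3), (90, 4)]))
# [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
```

Each observation is egocentric (a turn and a distance); only by accumulating heading and position does a common allocentric layout emerge, which is the computational burden the text attributes to learning environmental space.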

Characteristics of spatial thinking

Spatial thinking has unique characteristics that offer interesting research challenges. First, spatial thinking concerns space at different scales. Thinking about the structures of molecules, envisioning the folding and unfolding of a piece of paper, making a mechanical drawing, packing a suitcase, finding your way to a destination in a new environment, and reasoning about the formative process of a geologic structure all concern thinking and reasoning about space, but they span a wide range of spatial and temporal scales. Expertise in spatial thinking in STEM domains typically focuses on a specific scale, with organic chemistry, surgery, mechanical engineering, architecture, structural geology, and planetary science spanning but not exhausting the range. Spatial skills may vary across scale. For example, Hegarty et al. (2006) showed that learning from direct navigation in the environment differed from learning from a video or a desktop virtual environment, yielding two separate factors in factor analysis, and that the former was correlated with self-reported sense of direction, whereas the latter was correlated with psychometrically assessed spatial ability. Learmonth et al. (2001) showed that young children's use of landmark information to reorient depends on the size of the space.

Second, spatial thinking occurs in various media, including 2D static images, 3D animations, schematic diagrams, indoor and outdoor environments, immersive virtual environments, and spatial language. Each medium has its own way of representing spatial information (Liben 1999; Tversky 2001), and knowledge acquired from different media differs in structure and flexibility in important ways (Rieser 1989; Taylor and Tversky 1992; Thorndyke and Hayes-Roth 1982). In discussing spatial thinking and learning media, one should distinguish between internal representations (knowledge in the mind) and external representations (spatial products or expressions presented to a person). External spatial representations are shown visually at a certain level of detail or resolution (Goodchild and Proctor 1997), and verbally in a specific frame of reference (Levinson 1996).

Third, spatial thinking skills vary both at the group level and at the individual level. There are cases where group differences are of concern to the instructor, for example, in consideration of male–female differences in entry and retention rates in STEM disciplines (Belser et al. 2018; Chen 2013; Sithole et al. 2017). Instructors are also concerned with individual differences in aptitudes: students vary in their spatial and verbal abilities, and some students are good at spatial tasks while others are good at verbal tasks. Is there a good way to adjust instructional methods to students' aptitudes? Furthermore, given the existence of group and individual differences in spatial thinking, another question of concern is how instruction can have an impact, for example, whether male–female differences in spatial thinking, when they occur, can be eliminated, or how people with difficulty in spatial thinking can best improve through training.

Papers in this special issue

The papers in this special issue center on three major topics: (a) spatial thinking and the skill of mental rotation; (b) spatial thinking in the classroom context or in STEM curricula; and (c) spatial thinking in wayfinding or large-scale spatial cognition. Here is a link to the papers (https://cognitiveresearchjournal.springeropen.com/spatial-collection) (Table 1).

Mental rotation

Mental rotation is one of the major spatial abilities assessed by psychometric spatial tests, and has been studied extensively. Importantly, it has been shown to correlate with success in a variety of other spatial thinking tasks. Intriguingly, it also shows large male–female differences in adults, although sex differences in other spatial skills tend to be smaller or even nonexistent. Whether there are sex differences in mental rotation in children is a more controversial topic; sex differences may emerge over the course of development (Lauer et al. 2019; Newcombe 2020), but for an alternative view, see Johnson & Moore's paper in this special issue. There are also papers in the special issue investigating the malleability of mental rotation with practice (Moen et al.), and its relations with spatial anxiety (Alvarez-Vargas, Abad, & Pruden) and everyday experience (Cheng, Hegarty, & Chrastil). In an unexpected twist, it turns out that mental rotation may even be involved in tracking tasks and executing intended actions at specified times (Kubik, Del Missier, & Mantyla).

Spatial thinking in STEM

Spatial thinking, as discussed above, includes advanced disciplinary thinking of a spatial nature, based on expert knowledge and reasoning in each domain. Examples of such academic disciplines include structural geology, surgery, chemistry (Atit, Uttal, & Stieff), and mathematics (Aldugom, Fenn, & Cook). Despite the contribution of spatial thinking to a physical prediction task, however, spatial skills did not account for all of the individual differences observed in intuitive physics (Mitko & Fischer). Variation in spatial learning is already evident in early adolescence, as shown in a study of learning about plate tectonics using a computer visualization (Epler-Ruths, McDonald, Pallant, & Lee). The development of effective spatial instruction should consider how to bring scientific research into the educational practice of spatial thinking (Gagnier & Fisher) and how to support elementary school teachers who are prone to spatial anxiety (Burte, Gardony, Hutton, & Taylor).

Spatial thinking and navigation

Space at environmental scale, or navigational spatial thinking, is vital in everyday life for wayfinding in the environment. Issues of concern to researchers include spatial reasoning in different spatial frames of reference (Weisberg & Chatterjee), learning performance at different spatial scales (Zhao et al.), relationship with sense of direction (Zhao et al.; Stites, Matzen, & Gastelum), the possibility of improving cognitive mapping skills (Ishikawa & Zhou), and navigation in complex environments or emergent situations (Stites, Matzen, & Gastelum). Uncertainty in a novel environment prompts people to seek information, and a review of the literature suggests the importance of examining task behavior, not just the state of knowledge at the end of a navigation experience (Keller, Taylor, & Brunye). In the context of a discussion of the possibility of instructing spatial thinking, participation in spatial activities during childhood or adolescence and its relationship with spatial thinking has attracted the attention of researchers and practitioners (Peterson et al.). Sex differences in navigation may arise from girls and boys having different childhood wayfinding experiences (Vieites, Pruden, & Reeb-Sutherland).

Questions for further thinking about spatial thinking

Looking over the articles in the special issue as well as other recent studies suggests questions for further research into spatial thinking.

Spatial ability and spatial thinking

How does mental rotation relate to spatial thinking in various academic disciplines? The existing literature points to the malleability of the skill of mental rotation: given that mental rotation is an important component of spatial thinking, how can training in mental rotation improve (or transfer to) spatial thinking? Does the effect differ in different disciplines or for different types of spatial thinking in a specific discipline? What about examining other spatial abilities, such as perspective taking, spatial orientation, or flexibility of closure, in regard to their relations with spatial thinking of various kinds? Arguably, we have focused too much on mental rotation, and ignored other kinds of crucial mental operations.

Spatial thinking as a domain-specific learning skill

Researchers have studied spatial thinking in various STEM disciplines including geoscience, surgery, chemistry, and mathematics, and also in the K-12 setting and at the college level. Continued research into the types of spatial thinking that are required in disciplinary learning and that characterize expert thinking in each domain would contribute to better theoretical understanding and educational practice. Specific questions include: How is STEM learning related to (explained or predicted by) facility with spatial thinking? Is spatial thinking different from spatial ability as assessed by spatial tests? In a specific STEM discipline, what is the relationship among spatial thinking, spatial ability, and domain-specific knowledge? What is the contribution of spatial thinking, spatial ability, and domain-specific knowledge, respectively, to mastery in each discipline? And, importantly, how can one develop curricula that effectively take scientific knowledge of spatial thinking into account to encourage students to pursue STEM careers?

Spatial thinking as it relates to our everyday activities

Space is a fundamental component of our cognition and behavior, as it surrounds us and affords us opportunities to function adaptively. Thinking in, about, and with space characterizes (or conditions) our everyday activities. Finding one's way in the environment (cognitive mapping), communicating information in graphs and diagrams (visualization), and using space to think about nonspatial phenomena (spatial metaphors or spatialization) are major examples of our everyday spatial thinking, to name but a few. How are these everyday spatial thinking skills acquired and, if possible, instructed? Can navigation and wayfinding skills be trained, or can people's "sense of direction" be improved by training? Does participation in spatial activities affect spatial thinking? Does self-assessment of one's spatial thinking skills affect (promote or hinder) participation in spatial activities?

Investigation of these questions, in collaboration between researchers and practitioners, will deepen our understanding of what spatial thinking is and how it relates to our cognition and behavior. We hope that the special issue fosters more research along these lines and enhances scientific and pedagogical interest in this vital domain of human cognition.

Belser, C., Shillingford, A., & Daire, A. P. (2018). Factors influencing undergraduate student retention in STEM majors: Career development, math ability, and demographics. The Professional Counselor, 8, 262–276.

Chen, X. (2013). STEM attrition: College students' paths into and out of STEM fields (NCES 2014–001). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education.

Goodchild, M. F., & Proctor, J. (1997). Scale in a digital geographic world. Geographical and Environmental Modelling, 1, 5–23.

Han, X., & Becker, S. (2014). One spatial map or many? Spatial coding of connected environments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 40, 511–531.

Hegarty, M. (2010). Components of spatial intelligence. Psychology of Learning and Motivation, 52, 265–297.

Hegarty, M., Montello, D. R., Richardson, A. E., Ishikawa, T., & Lovelace, K. (2006). Spatial abilities at different scales: Individual differences in aptitude-test performance and spatial-layout learning. Intelligence, 34, 151–176.

Holmes, C. A., Newcombe, N. S., & Shipley, T. F. (2018). Move to learn: Integrating spatial information from multiple viewpoints. Cognition, 178, 7–25.

Ishikawa, T. (2019). Satellite navigation and geospatial awareness: Long-term effects of using navigation tools on wayfinding and spatial orientation. The Professional Geographer, 71, 197–209.

Ishikawa, T., & Montello, D. R. (2006). Spatial knowledge acquisition from direct experience in the environment: Individual differences in the development of metric knowledge and the integration of separately learned places. Cognitive Psychology, 52, 93–129.

Ittelson, W. H. (1973). Environment perception and contemporary perceptual theory. In W. H. Ittelson (Ed.), Environment and cognition (pp. 1–19). New York, NY: Seminar Press.

Jacobs, L. F., & Menzel, R. (2014). Navigation outside of the box: What the lab can learn from the field and what the field can learn from the lab. Movement Ecology, 2, 3.

Kozhevnikov, M., Motes, M. A., & Hegarty, M. (2007). Spatial visualization in physics problem solving. Cognitive Science, 31, 549–579.

Lauer, J. E., Yhang, E., & Lourenco, S. F. (2019). The development of gender differences in spatial reasoning: A meta-analytic review. Psychological Bulletin, 145 (6), 537–565.

Learmonth, A. E., Newcombe, N. S., & Huttenlocher, J. (2001). Toddlers’ use of metric information and landmarks to reorient. Journal of Experimental Child Psychology, 80, 225–244.

Levinson, S. C. (1996). Frames of reference and Molyneux’s question: Cross-linguistic evidence. In P. Bloom, M. A. Peterson, L. Nadel, & M. F. Garrett (Eds.), Language and space (pp. 109–169). Cambridge, MA: MIT Press.

Liben, L. S. (1999). Developing an understanding of external spatial representations. In I. E. Sigel (Ed.), Development of mental representation: Theories and applications (pp. 297–321). Mahwah, NJ: Erlbaum.

Liben, L. S., Kastens, K. A., & Stevenson, L. M. (2002). Real-world knowledge through real-world maps: A developmental guide for navigating the educational terrain. Developmental Review, 22, 267–322.

Meilinger, T., Riecke, B. E., & Bülthoff, H. H. (2014). Local and global reference frames for environmental spaces. Quarterly Journal of Experimental Psychology, 67, 542–569.

Montello, D. R. (1993). Scale and multiple psychologies of space. In A. U. Frank & I. Campari (Eds.), Spatial information theory (pp. 312–321). Berlin: Springer.

National Research Council. (2006). Learning to think spatially . Washington, DC: National Academies Press.

Newcombe, N. S. (2010). Picture this: Increasing math and science learning by improving spatial thinking. American Educator, 34 (2), 29–43.

Newcombe, N. S. (2020). The puzzle of spatial sex differences: Current status and prerequisites to solutions. Child Development Perspectives, 14 (4), 251–257.

Rieser, J. J. (1989). Access to knowledge of spatial structure at novel points of observation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1157–1165.

Sithole, A., Chiyaka, E. T., McCarthy, P., Mupinga, D. M., Bucklein, B. K., & Kibirige, J. (2017). Student attraction, persistence and retention in STEM programs: Successes and continuing challenges. Higher Education Studies, 7 (1), 46–59.

Taylor, H. A., & Tversky, B. (1992). Spatial mental models derived from survey and route descriptions. Journal of Memory and Language, 31, 261–292.

Thorndyke, P. W., & Hayes-Roth, B. (1982). Differences in spatial knowledge acquired from maps and navigation. Cognitive Psychology, 14, 560–589.

Tversky, B. (2001). Spatial schemas in depictions. In M. Gattis (Ed.), Spatial schemas and abstract thought (pp. 79–112). Cambridge, MA: MIT Press.

Tversky, B., & Morrison, J. B. (2002). Animation: Can it facilitate? International Journal of Human-Computer Studies, 57, 247–262.

Wai, J., Lubinski, D., & Benbow, C. P. (2009). Spatial ability for STEM domains: Aligning over 50 years of cumulative psychological knowledge solidifies its importance. Journal of Educational Psychology, 101, 817–835.

Weisberg, S. M., & Newcombe, N. S. (2016). Why do (some) people make a cognitive map? Routes, places, and working memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42, 768–785.

Author information

Authors and Affiliations

INIAD Toyo University, Tokyo, Japan

Toru Ishikawa

Temple University, Philadelphia, USA

Nora S. Newcombe

Corresponding authors

Correspondence to Toru Ishikawa or Nora S. Newcombe.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ishikawa, T., Newcombe, N.S. Why spatial is special in education, learning, and everyday activities. Cogn. Research 6, 20 (2021). https://doi.org/10.1186/s41235-021-00274-5

Published: 23 March 2021

DOI: https://doi.org/10.1186/s41235-021-00274-5

More From Forbes

Spatial Thinking Is Everywhere, and It's Changing Everything

FedEx is one of those companies that epitomizes the modern idea of efficiency: Deliver a document or a package anywhere in the U.S. overnight — guaranteed. The company stands as a symbol of operational wizardry, with a fleet of gleaming white airplanes and battalions of purple-shirted staff. The package handed across a FedEx counter in Orlando at 4 pm Tuesday appears in an office in San Diego before noon Wednesday.

To keep that happening, FedEx’s more than 700 jets need routine maintenance. That maintenance requires exactly the right parts from inventories of thousands of spares, and it requires the right mechanic in the right airport in a global network of 650. But FedEx can’t have planes sitting idly on the ground, waiting for a part or a mechanic.

Hence the puzzle: How to consistently get all three elements — plane, parts, and person — in the right place at the right time.

“It’s a ballet,” says Sheila Davis, manager of FedEx’s IT enterprise platform for geography. Airline operations turned to Davis’s IT group, an innovative FedEx team that uses location technology and spatial analytics — often, quite literally, maps loaded with data — to solve problems.

For this problem, like many others, the key is geography. You need to know where everything is, all the time.

As Davis explains, “there’s a lot of coordination that has to take place — getting the part of a plane on another plane, [and] once that plane with the part reaches the destination, coordinating having a mechanic available.

“We gave Air Ops some insight, mainly using visualization, looking — spatially — at all the planes in our network. Which planes were in flight, or about to take off? When would they arrive?”

At a glance, FedEx Air Ops could see how to get planes needing maintenance, and the parts they required, to the same place at the same time. Then they could schedule mechanics smartly. Seeing the status of the whole FedEx flight network in real time unraveled a puzzle that screens of data made worse.


The result: Maintenance happens more efficiently. Just as important, planning the maintenance process itself takes many fewer work hours. Costs are lower; maintenance performance is higher.

That’s the power — really, almost the magic — of bringing spatial analytics or the geographic approach to bear on a whole array of challenges, from communication to long-term strategy to risk reduction. Geography and visualization turn out to be a missing tool that unlocks remarkable insights, efficiencies, even hidden opportunities for growth.

The biggest companies in the world have discovered the impact of spatial thinking and are using it to tackle a surprising range of problems.

Marriott International — with hotels in every major city in the world — is using spatial analytics, aka the geographic approach, to predict the physical risks that climate change poses to each of those 8000+ hotels, to better plan, prepare and manage that risk, region by region, hotel by hotel. “That is our quintessential thing right now,” says Rob Bahl, Marriott’s global VP of engineering and facilities. “We’re a bit early in our journey.” But Marriott is quickly finding all kinds of applications for spatial thinking. The company also uses it to track its fleet of US facility maintenance vehicles.

Spatial thinking is a decision-making tool. A planning tool. A tool of data analysis, data modeling, and data presentation. It’s a collaboration tool, a communication tool, and a transparency tool.

HOW IS IT POSSIBLE that a single approach — the geographic approach — could untangle aircraft maintenance for FedEx, and help Marriott to analyze and prepare for climate risks to their properties? And become a ubiquitous problem-solving strategy?

Many companies have been caught off guard by the significance and usefulness of infusing geography into almost every analysis, every day-to-day operation, every long-term plan.

Bahl from Marriott says one of the hidden powers of spatial thinking is communication — maps are appealing and approachable. Putting critical data into geographic form has a way of making complicated ideas understandable.

“We’ve been on a journey for a long time to reduce our carbon footprint, and we're getting more and more aggressive about that,” says Bahl. Using the geographic approach and geospatial technology, “We created a spinning globe imagery using ArcGIS that has the carbon footprint of every hotel around the world.”

Marriott’s own staff and property owners find the globe riveting.

“It’s fun to watch people pay attention to the globe,” Bahl says. “Who doesn’t like to play with maps? I was amazed and excited about the interest they had, much more than if I put a big spreadsheet up on a PowerPoint slide. There were a lot of questions asked.”

“A light bulb went off for me,” says Bahl. “This is a much better way to communicate things like ‘carbon footprint.’”

AT FEDEX, DAVIS’S GROUP HAS CREATED self-service tools, and spatial thinking is so broadly useful that groups inside FedEx are building their own maps, applications, and dashboards.

“It’s really interesting to see those teams able to answer their own questions,” Davis says. “We’re getting team members to think about solving problems in a different way — in a spatial way. That’s definitely rewarding.”

Spatial thinking, or the geographic approach, is a decision-making tool. A planning tool. A tool of data analysis, data modeling, and data presentation. It’s a collaboration tool, a communication tool, and a transparency tool. In the words of those in the best position to know, it’s an especially useful tool for an especially challenging time.

To learn more about how businesses can use spatial thinking to plan and manage complex supply chains on a global scale, visit esri.com/en-us/industries/technology/focus-areas/supply-chain-operations-and-analysis.

Cindy Elliott


European Proceedings

Spatial Ability: Understanding the Past, Looking into the Future


Spatial ability (SA) refers to the ability to generate, retain and manipulate abstract visual images. From 1880 to 1940, SA was gradually understood and defined as an independent, unique ability, and it was included as one of the types of intelligence in Gardner’s model. Later, Maier refined Gardner’s model by distinguishing between five types of spatial ability and intelligence: spatial perception, visualization, mental rotation, spatial relations and spatial orientation. During the 1970s and 1980s, the developmental attributes of the term were addressed. Those studies sought to understand how and when SA develops and were naturally directed mostly at children as subjects. They revealed that SA is essential to the development of mathematical skills. Later developmental studies addressed adult SA development and found that SA was a predictor of success in STEM fields of academic study. Furthermore, during those years, psychometric studies began developing standardized tests to measure SA. From the 1980s to the present, a great deal of research has been directed at technology and the way it influences SA development. These studies pay special attention to how new technologies, such as computer games and VR, affect the measurement and training of SA. The current research continues the rich tradition of this field of study and draws attention to first-year engineering and architecture students. It seeks to investigate how best to train their SA and how this training will influence future achievements in both fields.

Keywords: Mental rotation, spatial ability, spatial perception, spatial visualization, spatial relation

What is Spatial Ability?

A great body of research throughout the history of science has addressed questions about human abilities, how they develop and how they contribute to performance in different fields. Gardner’s theory of multiple intelligences defines seven different types: linguistic, logical-mathematical, spatial, musical, physical-kinaesthetic, interpersonal and intrapersonal. Spatial intelligence is defined by Gardner ( 1983 ) as the ability to form a mental model of the spatial world and to maneuver and work with this model. Spatial intelligence helps the individual perceive, decode and activate in the imagination the visual representations that create the image of space. Researchers in this field consider SA an ability consisting of a set of cognitive skills and list its components. According to Gardner ( 2011 ), spatial intelligence includes several capabilities:

The ability to accurately perceive two-dimensional shapes or objects in space.

The ability to imagine and graphically represent visual or spatial ideas.

The ability to evaluate what a shape or object will look like after mental manipulation.

Armstrong ( 2018 ) states that spatial intelligence consists of three components:

The ability to absorb and understand visual information: the ability to distinguish similarities and differences between objects or between an object’s parts; the ability to understand a shape globally; the ability to identify relationships between visual factors and to distinguish between figure and background. This ability makes it possible to translate visual information into the properties of the object to which the observed information relates.

The ability to create a mental image: the ability to imagine objects from their two-dimensional representations or from verbal information.

The ability to perform manipulations by imagination on a mental image: for example, the ability to imagine the movement of objects or to imagine changes and processes in them. In this activity, both the source and the product of mental processing are visual, and the emphasis is on the mental process that the individual goes through while simulating a change that has happened in the object.

According to Armstrong ( 2015 ), people with spatial intelligence think and learn with the help of pictures; remember faces and places; notice small details that most people do not notice; and connect the world of reality with the world of imagination that they envision. They think in three dimensions, imagining three-dimensional things in their heads and "looking" at them from different angles.

Most studies use the term spatial ability (SA), which is generally attributed to spatial perception or visualization ( Grande, 1990 ). In this study we will use the following definition: SA is the ability to generate, retain and manipulate abstract visual images ( Lohman et al., 1979 ).

The research on SA started at the end of the 19th century. Those studies focused on acknowledging and defining SA as a unique and separate ability, as part of a general movement away from understanding intelligence as a holistic construct and towards perceiving it as composed of a diverse set of skills such as associative memory, numerical ability, and verbal fluency and understanding ( Thorndike, 1921 ).

Psychometric Perspective

Psychometric studies began in the fourth and fifth decades of the 20th century. Those studies were directed at defining and measuring the different elements of SA. For instance, Thurstone ( 1950 ) found three elements of SA, which eventually became known as mental rotation, spatial imagery and spatial perception ( Linn & Petersen, 1985 ). Naturally, at that time there was no unity regarding the tests and methods of measurement, and sometimes those differences even led to contradictory definitions of the SA elements ( Tzabary & Tesler, 2022 ). The psychometric branch of study continues to this day and keeps refining the distinction between the different subtypes of SA and the best ways to measure them. After Gardner’s work, SA was more accurately laid out by Maier ( 1996 ), who improved Gardner's multiple intelligences model and theory by distinguishing between five types of spatial ability and intelligence:

Spatial perception: the fixation of the horizontal and the vertical direction regardless of disturbing information.

Spatial visualization: the ability to visualize configurations in which the components move in relation to each other.

Mental rotation: the ability to rotate three-dimensional solids mentally.

Spatial relations: the ability to identify the relations between the parts of a solid.

Spatial orientation: the ability to orient oneself mentally within a given spatial situation.

Exercises of the five elements of SA, Maier (1998)
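Of these elements, mental rotation is concrete enough to express in code: classic mental-rotation tests ask whether one arrangement of cubes is a rigid rotation of another. Below is a minimal illustrative sketch of that judgment, assuming arrangements encoded as sets of integer (x, y, z) cells; it is not drawn from Maier or any study cited here.

```python
from itertools import permutations, product

def normalize(cells):
    # Translate a set of (x, y, z) unit-cube cells so its minimum corner is the origin.
    mins = [min(c[i] for c in cells) for i in range(3)]
    return frozenset(tuple(c[i] - mins[i] for i in range(3)) for c in cells)

def proper_rotations():
    # The 24 orientation-preserving rotations of a cube, encoded as an axis
    # permutation plus signs, keeping only determinant +1 (mirror images
    # are therefore NOT counted as rotations).
    even = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}
    for perm in permutations(range(3)):
        parity = 1 if perm in even else -1
        for signs in product((1, -1), repeat=3):
            if parity * signs[0] * signs[1] * signs[2] == 1:
                yield perm, signs

def is_rotation_of(a, b):
    # True if cube arrangement b can be reached by rigidly rotating a.
    target = normalize(b)
    for perm, signs in proper_rotations():
        image = normalize(frozenset(
            tuple(signs[i] * c[perm[i]] for i in range(3)) for c in a))
        if image == target:
            return True
    return False

# An L-shaped arm of four cubes, and the same arm turned 90 degrees.
shape = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)}
turned = {(0, 2, 0), (1, 0, 0), (1, 1, 0), (1, 2, 0)}
print(is_rotation_of(shape, turned))  # True
```

Restricting to determinant +1 matters: a shape and its mirror image are not rotations of each other, and distinguishing a rotated object from its mirror twin is exactly the judgment mental-rotation tests probe.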

Recent studies suggested other perspectives on how to view SA such as the distinction between static SA and dynamic SA ( D'Oliveira, 2004 ) or the distinction between intrinsic SA and extrinsic SA ( Newcombe & Shipley, 2015 ).

Developmental Perspective

From 1960 to 1980, the development of SA was addressed in a more serious manner. These studies wished to find out when and how SA develops. Thus, Piaget and Inhelder showed SA to be an essential ability from birth and specified three stages of development:

1. The Topological stage: children, usually from the age of 3-5, acquire two-dimensional abilities that are tied to qualitative relations between shapes, such as closeness, separation and order, as well as the distinction between open and closed shapes. Children who successfully reach this stage are capable of completing a puzzle, but they draw a square, a triangle and a circle the same way (a round closed line). Children at these ages are still egocentric and are unable to understand that there may be more than one perspective.

2. The Projective stage: at this stage, children learn to distinguish between straight and curved lines and between different polygons. They learn to imagine different three-dimensional objects and to understand how they look from a different perspective or after rotation. This stage is usually acquired at the age of 6 and keeps developing throughout childhood and adolescence.

3. The Euclidean stage: at this stage, a transition is made from two-dimensional to three-dimensional reasoning, enabling the ability to form and use terms like direction, angle, relation and volume ( Beard et al., 1971 ).

Current studies indicate that Piaget’s stages are correct but that the age at which they are reached is younger ( Huttenlocher et al., 1999 ). Verdine et al. ( 2017 ) developed a test that measures SA in children as young as three years old. In their longitudinal study, they found that SA at the age of three predicts SA at the age of five.

Can SA develop in adulthood?

Hoffer ( 1977 ) describes research conducted on people who were blind from birth (because of cataracts). These people could identify objects by other senses, such as touch, taste, smell and hearing. Thanks to developments in the medical world, they underwent surgery and as a result could see for the first time. After surgery, the subjects were presented with objects that they could only see, and they could not identify them. For those people, shapes like pyramids or cubes had no visual meaning. But with time, they were able to develop this understanding and learned to interpret the visual information and to combine it with knowledge from other senses. Hoffer saw these results as proof that visual perception can develop in adulthood. Furthermore, special courses given to students whose mastery of spatial skills was low contributed to a marked improvement in their achievement. These findings also indicate the ability to develop SA.

SA and Other Abilities

After the developmental research was grounded, researchers began focusing on the relationship between SA and children’s mathematical achievements and capabilities. These studies have shown that SA plays a major role in mathematical thinking ( Mix et al., 2016 ; Wheatley, 1990 ). One of the common theories about the way this effect occurs holds that mathematical thinking is supported by spatial-mental representations ( Cheng & Mix, 2013 ). Thus, for example, some people create schematic representations of mathematical problems that include the spatial relationships described in the problems. Studies indicate that the solutions offered by these people are more correct on average ( Rittle-Johnson et al., 2019 ).

Later studies have shown similar correlations between SA and mathematical thinking in college students ( Wai et al., 2009 ). According to these studies SA is among the cognitive factors that were identified as predictors of success in STEM fields (Science, Technology, Engineering and Math) ( Richardson et al., 2018 ; Wainman et al., 2021 ). Large scale studies show that SA can predict long term achievements in STEM, better than verbal and quantitative abilities ( Wai et al., 2009 ).

Recent studies have turned the focus to another relevant field in which SA plays a vital role: architecture ( Berkowitz et al., 2021 ). Architects need good mathematical competences. For example, they need to be able to determine the strength of a certain structure or identify the optimal way of stabilizing a structure ( Sergeeva et al., 2019 ). Furthermore, when architects design a building, they perform a multi-step process of manipulating spatial configurations, switching between perspectives and so on ( Cross, 2011 ). Thus, the ability to visualize space is an integral skill in architecture ( Berkowitz et al., 2021 ). Having said that, it seems that the way SA influences success in architecture studies has not yet been studied.

SA Training

Experts agree that it is important to develop a child’s spatial abilities from a very early age ( Tzabary & Tesler, 2022 ). Thus, it was found that playing puzzle games at the age of 2-4 can predict spatial abilities, even when variables such as parents’ education, income and so on are controlled for ( Levine et al., 2012 ). It has also been found that goal-directed construction play with parental support contributes more to the development of SA than open play in which the child builds as he sees fit. In such a game, the child is required to solve a certain problem in order to reach the goal, and for this purpose he is expected to use more spatial intelligence ( Ferrara et al., 2011 ). Newcombe et al. ( 2013 ) also state that the use of symbolic representations, such as maps, models, graphs and spatial hand gestures, has an enormous effect on the development of SA in a child’s early years of life. The use of spatial representations allows spatial knowledge to be passed on and encourages spatial communication between parents and children that includes the use of words to describe spatial relations such as above, behind and under. This communication influences how well children aged 1-4 understand and use spatial language, which in turn predicts their spatial abilities ( Pruden et al., 2011 ). Also, studies have shown that development is enhanced by the presence of an adult who talks, guides and encourages a child’s learning while pointing his attention to the differences or resemblances between shapes, helping him sort them into categories and motivating him to explore the world around him ( Levine et al., 2016 ).

Further, it was found that gestures play a significant role in aiding SA skills. A gesture can represent an object or point in a specific direction; it can also describe spatial relations. Usually, gestures accompany speech and add extra information ( Newcombe et al., 2013 ). Thus, it was found that gestures improve college students’ achievements in mental rotation tasks ( Chu & Kita, 2011 ). Students who were encouraged to use gestures without speech did better than those who weren’t given any instructions or those who were banned from using gestures. In accord with these findings, teachers should use gestures in their teaching to better develop students’ SA ( Newcombe & Frick, 2010 ).

Real, tangible teaching vs computer-based learning

Maier, who introduced the five different types of SA presented above, wrote that, based on psychological research findings, all five elements of SA have to be specifically trained. He further introduced a modular construction system based on the traditional system in which polygons are joined with rubber bands. He used real models because, in his view, those were the most successful in improving students’ SA ( Maier, 1996 ). Current studies support Maier’s approach, stating that the use of real models is more effective in developing SA than computer-based learning ( Hill et al., 2010 ). Sorby ( 1999 ) reports on a study conducted among engineering students that examined what develops their spatial imagery ability. It was found that in courses where students were required to draw models by hand (rather than using a computer) and work with tangible models (rather than models on a computer monitor), there was a development in their spatial performance. Although effective, this intervention is costly, since it requires an expert teacher; it is also long and needs a lot of models if the students work individually ( Aszalos & Bako, 2004 ).

Later, Aszalos and Bako developed a computer-based training program. This intervention sought to improve students’ spatial geometry ability and, according to the results, did so with success. Nevertheless, the intervention was preliminary, limited both in the kind of ability it tested (geometry) and in the number of subjects. Yang and Chan ( 2010 ) found that digital pentomino games (cut-and-assemble puzzles of shapes composed of five squares) can improve spatial skills.
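As a concrete aside, the twelve distinct pentomino shapes underlying such games can be enumerated with a short script. This is purely an illustrative sketch (not code from any study cited here): it grows polyominoes one square at a time and deduplicates rotations and reflections.

```python
def normalize(cells):
    # Translate a set of (x, y) cells so its minimum corner is at (0, 0).
    mx = min(x for x, y in cells)
    my = min(y for x, y in cells)
    return frozenset((x - mx, y - my) for x, y in cells)

def canonical(cells):
    # Canonical form: the lexicographically smallest image of the shape
    # under its 8 rotations/reflections, so mirror twins count once.
    images = []
    cur = frozenset(cells)
    for _ in range(4):
        cur = frozenset((y, -x) for x, y in cur)                       # rotate 90 degrees
        images.append(normalize(cur))
        images.append(normalize(frozenset((-x, y) for x, y in cur)))   # mirror
    return min(images, key=sorted)

def free_polyominoes(n):
    # Grow shapes one square at a time, keeping one representative per
    # equivalence class ("free" polyominoes).
    shapes = {canonical({(0, 0)})}
    for _ in range(n - 1):
        grown = set()
        for shape in shapes:
            for x, y in shape:
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    cell = (x + dx, y + dy)
                    if cell not in shape:
                        grown.add(canonical(shape | {cell}))
        shapes = grown
    return shapes

print(len(free_polyominoes(5)))  # 12 distinct pentominoes
```

Counting 1, 1, 2, 5, 12 shapes for sizes 1 through 5 reproduces the familiar sequence of free polyominoes; the pentomino puzzles in such games are assembled from those 12 pieces.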

Baldwin ( 2003 ) and Orion et al. ( 1997 ) report a development in spatial performance following an intervention program as part of a course in earth sciences. The studies were conducted among different populations: middle school students, geology students and non-science students, and all study groups showed some improvement.

From 1980 to the present, many studies have focused on the influence of technology on SA and its measurement ( Mohler, 2001 ). The great technological revolution evident in the fields of computers, medical equipment, VR and so on has led to new possibilities in the research of SA. Accordingly, researchers started looking into training programs that can improve children’s and adults’ SA.

Thus, computer-based training programs are gaining more and more popularity these days ( Kurtulus & Uygan, 2010 ). Rafi et al. ( 2008 ) examined the effect of web-based activities and animation-aided computer applications on the spatial visualization abilities of two test groups of primary school 2nd grade students. The control group in this study was taught through traditional teaching methods. Results indicated that the two test groups had higher levels of spatial ability than the control group ( Rafi et al., 2008 ).

Another technology that has been implemented for the purpose of developing spatial skills is 3D dynamic sketching software, which enables a more interactive learning experience.

Kurtulus and Uygan ( 2010 ) developed a training program aimed at improving spatial visualization. To improve this ability, they used 3D dynamic sketching software of the kind usually used for designing 3D building models. This study found significant differences in students’ visualization abilities in a pre-post intervention design between the test group (SketchUp training) and the control group (traditional geometry activities) ( Kurtulus & Uygan, 2010 ). Recently, Jaelani ( 2021 ) compared high school students receiving SketchUp-aided generative learning with those receiving direct learning. The results showed that spatial ability increased after SketchUp-aided generative learning compared to direct traditional learning ( Jaelani, 2021 ).

Most recently, Di and Zheng conducted a meta-analysis of virtual-technology training programs for improving SA. The study encompassed 36 empirical studies published between 2010 and 2020. They found that virtual technologies were effective in improving students’ SA. The effect was especially high for preschool learners, in the fields of natural science and engineering technologies, for all types of spatial ability, and when training lasted 3 to 6 months. Furthermore, augmented reality was the most conducive to improving learners’ spatial ability compared with other virtual technologies ( Di & Zheng, 2022 ).

Summary and Discussion

As laid out in this article, spatial ability is a term long defined and studied. Indeed, it came a long way before reaching its current definition, psychometrics and the theories behind its development. Though research continues in all these areas, and some disputes are still evident and fuel future studies, it seems to me that the one unsettled area of spatial ability research remains the training and improvement of SA. In this field, despite the many studies conducted, there is no movement towards a more accepted, unified, standardized approach to training SA. On the one hand, there is Maier’s approach, which states that all five subtypes of SA should be trained using traditional real models, since in his view those were the most successful in improving students’ SA. Current research supports Maier’s approach by pointing to the advantages of training programs and college courses that encourage tangible learning through working with models, using gestures, drawing and so on. On the other hand, multiple studies have presented the training of one specific type of SA, such as mental rotation or visualization, using computerized methods only ( Aszalos & Bako, 2004 ). Some studies trained SA by requiring students to draw models by hand ( Sorby, 1999 ); others stated that VR and 3D training is better ( Rafi et al., 2008 ). Also, almost no research was found to address the question of combining tangible and computer-based learning. It seems to me that there is no consensus in this regard and that a more unified approach is lacking, one that combines the different methods and creates a general guideline for training SA. Such an approach is extremely important considering the great evidence for the importance of SA in cognitive development and later achievements.
Finally, the vast majority of studies regarding SA were conducted on one specific population, that of STEM fields (both in children and in adults), and little attention was given to other fields such as architecture ( Berkowitz et al., 2021 ).

In my study, I seek to address the problem laid out above and to develop a multidisciplinary training program directed at improving SA in both engineering and architecture students: a program that combines frontal learning, using real-life models and drawing by hand, with computer-based learning, using SketchUp and VR technologies. I wish to create a balance between different forms of learning, addressing the different needs and deficits of educational systems (a shortage of expert teachers; the need for cost-effective, short-term interventions; online learning and so on). In this way, I wish to optimize the process of learning, to better engage and motivate students, and to create a cost-effective training program that will serve as the basis of a unified method of training SA.

Armstrong, T. (2015). You’re Smarter Than You Think: a Kid’s Guide to Multiple Intelligences. Readhowyouwant.com Ltd.

Armstrong, T. (2018). Multiple intelligences in the classroom. ASCD.

Aszalos, L., & Bako, M. (2004). How can we improve the spatial intelligence? 6th International Conference on Applied Informatics, Eger, Hungary.

Baldwin, T. K. (2003). Spatial ability development in the geosciences [Master’s thesis]. University of Arizona, USA.

Beard, R., Piaget, J., & Inhelder, B. (1971). Mental Imagery in the Child: A Study of the Development of Imaginal Representation. British Journal of Educational Studies, 19(3), 343.

Berkowitz, M., Gerber, A., Thurn, C. M., Emo, B., Hoelscher, C., & Stern, E. (2021). Spatial Abilities for Architecture: Cross Sectional and Longitudinal Assessment with Novel and Existing Spatial Ability Tests. Frontiers in Psychology, 11.

Cheng, Y.-L., & Mix, K. S. (2013). Spatial Training Improves Children’s Mathematics Ability. Journal of Cognition and Development, 15(1), 2–11.

Chu, M., & Kita, S. (2011). The nature of gestures’ beneficial role in spatial problem solving. Journal of Experimental Psychology: General, 140(1), 102–116.

Cross, N. (2011). Design thinking: Understanding how designers think and work. Berg.

Di, X., & Zheng, X. (2022). A meta-analysis of the impact of virtual technologies on students’ spatial ability. Educational Technology Research and Development, 70(1), 73–98.

D'Oliveira, T. C. (2004). Dynamic Spatial Ability: An Exploratory Analysis and a Confirmatory Study. The International Journal of Aviation Psychology, 14(1), 19-38.

Ferrara, K., Hirsh-Pasek, K., Newcombe, N. S., Golinkoff, R. M., & Lam, W. S. (2011). Block Talk: Spatial Language During Block Play. Mind, Brain, and Education, 5(3), 143-151.

Gardner, H. (1983). Frames of mind: The Theory of Multiple Intelligences. Basic Books.

Gardner, H. (2011). Frames of mind: The Theory of Multiple Intelligences (3rd ed.). Basic Books.

Grande, J. D. (1990). Spatial Sense. The Arithmetic Teacher, 37(6), 14–20.

Hill, C., Corbett, C., & St Rose, A. (2010). Why so few? Women in science, technology, engineering, and mathematics. American Association of University Women. 1111 Sixteenth Street NW, Washington, DC 20036.

Hoffer, A. R. (1977). Mathematics Resource Project: Geometry and visualization. Creative Publications.

Huttenlocher, J., Newcombe, N., & Vasilyeva, M. (1999). Spatial Scaling in Young Children. Psychological Science, 10(5), 393–398.

Jaelani, A. (2021). SketchUp-aided generative learning in solid geometry: Does it affected students’ spatial abilities? Journal of Physics: Conference Series, 1778(1), 012039.

Kurtulus, A., & Uygan, C. (2010). The effects of Google Sketchup based geometry activities and projects on spatial visualization ability of student mathematics teachers. Procedia - Social and Behavioral Sciences, 9, 384–389.

Levine, S. C., Foley, A., Lourenco, S., Ehrlich, S., & Ratliff, K. (2016). Sex differences in spatial cognition: Advancing the conversation. Wiley Interdisciplinary Reviews: Cognitive Science, 7(2), 127–155.

Levine, S. C., Ratliff, K. R., Huttenlocher, J., & Cannon, J. (2012). Early puzzle play: A predictor of preschoolers' spatial transformation skill. Developmental Psychology, 48(2), 530-542.

Linn, M. C., & Petersen, A. C. (1985). Emergence and Characterization of Sex Differences in Spatial Ability: A Meta-Analysis. Child Development, 56(6), 1479.

Lohman, D. F. (1979). Spatial ability: A review and reanalysis of the correlational literature. School of Education, Stanford University.

Maier, P. H. (1996). Spatial geometry and spatial ability–How to make solid geometry solid. Selected Papers from the Annual Conference of Didactics of Mathematics, 63–75.

Mix, K. S., Levine, S. C., Cheng, Y.-L., Young, C., Hambrick, D. Z., Ping, R., & Konstantopoulos, S. (2016). Separate but correlated: The latent structure of space and mathematics across development. Journal of Experimental Psychology: General, 145(9), 1206–1227.

Mohler, J. L. (2001). Using interactive multimedia technologies to improve student understanding of spatially dependent engineering concepts. Proceedings of the GraphiCon, 292–300.

Newcombe, N. S., & Frick, A. (2010). Early education for spatial intelligence: Why, what, and how. Mind, Brain, and Education, 4(3), 102–111.

Newcombe, N. S., & Shipley, T. F. (2015). Thinking About Spatial Thinking: New Typology, New Assessments. In J. S. Gero (Ed.), Studying Visual and Spatial Reasoning for Design Creativity (pp. 179–192). Springer Netherlands.

Newcombe, N. S., Uttal, D. H., & Sauter, M. (2013). Spatial Development. The Oxford Handbook of Developmental Psychology (Vol. 1, pp. 563-590).

Orion, N., Ben-Chaim, D., & Kali, Y. (1997). Relationship between Earth-Science Education and Spatial Visualization. Journal of Geoscience Education, 45(2), 129-132.

Pruden, S. M., Levine, S. C., & Huttenlocher, J. (2011). Children's spatial thinking: does talk about the spatial world matter?: Children's spatial thinking. Developmental Science, 14(6), 1417-1430.

Rafi, A., Samsudin, K. A., & Said, C. S. (2008). Training in spatial visualization: The effects of training method and gender. Journal of Educational Technology & Society, 11(3), 127-140.

Richardson, R., Sammons, D., & Delparte, D. (2018). Augmented affordances support learning: Comparing the instructional effects of the augmented reality sandbox and conventional maps to teach topographic map skills. Journal of Interactive Learning Research, 29(2), 231–248.

Rittle-Johnson, B., Zippert, E. L., & Boice, K. L. (2019). The roles of patterning and spatial skills in early mathematics development. Early Childhood Research Quarterly, 46, 166–178.

Sergeeva, E. V., Moskvina, E. A., & Torshina, O. A. (2019). The interaction between mathematics and architecture. IOP Conference Series: Materials Science and Engineering, 675(1), 012018.

Sorby, S. A. (1999). Developing 3D spatial visualization skills. The Engineering Design Graphics Journal, 63(2).

Thorndike, E. L. (1921). On the Organization of Intellect. Psychological Review, 28(2), 141–151.

Thurstone, L. L. (1950). Some primary abilities in visual thinking. Proceedings of the American Philosophical Society, 94(6), 517–521.

Tzabary, A., & Tesler, B. (2022). The code of spatial intelligence (1st ed.). Mofet.

Verdine, B. N., Golinkoff, R. M., HirshPasek, K., & Newcombe, N. (2017). Links between spatial and mathematical skills across the preschool years. Wiley Hoboken.

Wai, J., Lubinski, D., & Benbow, C. P. (2009). Spatial ability for STEM domains: Aligning over 50 years of cumulative psychological knowledge solidifies its importance. Journal of Educational Psychology, 101(4), 817–835.

Wainman, B., Aggarwal, A., Birk, S. K., Gill, J. S., Hass, K. S., & Fenesi, B. (2021). Virtual Dissection: An Interactive Anatomy Learning Tool. Anatomical Sciences Education, 14(6), 788-798.

Wheatley, G. H. (1990). One Point Of View: Spatial Sense and Mathematics Learning. The Arithmetic Teacher, 37(6), 10–11.

About this article

Copyright: Creative Commons License
Publication date: 31 May 2023
DOI: https://doi.org/10.15405/epes.23056.9
ISBN: 978-1-80296-962-7
Publisher: European Publisher
Edition: 1st Edition
Proceedings: Education, Reflection, Development

Cite this article as:

Porat, R., & Ceobanu, C. (2023). Spatial Ability: Understanding the Past, Looking into the Future. In I. Albulescu, & C. Stan (Eds.), Education, Reflection, Development - ERD 2022, vol 6. European Proceedings of Educational Sciences (pp. 99-108). European Publisher. https://doi.org/10.15405/epes.23056.9



Scientific Reports

Foundations of human spatial problem solving

Joshua W. Brown

Department of Psychological and Brain Sciences, Indiana University, Bloomington, USA

Associated data

The GOLSA model code for the simulations is available at https://github.com/CogControl/GolsaOrigTreasureHunt. Imaging data are available from the corresponding author on reasonable request.

Despite great strides in both machine learning and neuroscience, we do not know how the human brain solves problems in the general sense. We approach this question by drawing on the framework of engineering control theory. We demonstrate a computational neural model with only localist learning laws that is able to find solutions to arbitrary problems. The model and humans perform a multi-step task with arbitrary and changing starting and desired ending states. Using a combination of computational neural modeling, human fMRI, and representational similarity analysis, we show here that the roles of a number of brain regions can be reinterpreted as interacting mechanisms of a control theoretic system. The results suggest a new set of functional perspectives on the orbitofrontal cortex, hippocampus, basal ganglia, anterior temporal lobe, lateral prefrontal cortex, and visual cortex, as well as a new path toward artificial general intelligence.

Introduction

Great strides have been made recently toward solving hard problems with deep learning, including reinforcement learning 1 , 2 . While these are groundbreaking and show superior performance over humans in some domains, humans nevertheless exceed computers in the ability to find creative and efficient solutions to novel problems, especially with changing internal motivation values 3 . Artificial general intelligence (AGI), especially the ability to learn autonomously to solve arbitrary problems, remains elusive 4 .

Value-based decision-making and goal-directed behavior involve a number of interacting brain regions, but how these regions might work together computationally to generate goal directed actions remains unclear. This may be due in part to a lack of mechanistic theoretical frameworks 5 , 6 . The orbitofrontal cortex (OFC) may represent both a cognitive map 7 and a flexible goal value representation 8 , driving actions based on expected outcomes 9 , though how these guide action selection is still unclear. The hippocampus is important for model-based planning 10 and prospection 11 , and the striatum is important for action selection 12 . Working memory for visual cues and task sets seems to depend on the visual cortex and lateral prefrontal regions, respectively 13 , 14 .

Neuroscience continues to reveal aspects of how the brain might learn to solve problems. Studies of cognitive control highlight how the brain, especially the prefrontal cortex, can apply and update rules to guide behavior 15 , 16 , inhibit behavior 17 , and monitor performance 18 to detect and correct errors 19 . Still, there is a crucial difference between rules and goals. Rules define a mapping from a stimulus to a response 20 , but goals define a desired state of the individual and the world 21 . When cognitive control is re-conceptualized as driving the individual to achieve a desired state, or set point, then cognitive control becomes a problem amenable to control theory.

Control theory has been applied successfully to account for the neural control of movement 22 and has informed various aspects of neuroscience research, including work in C. elegans 23 , work on controlling states of the brain 24 and electrical stimulation placement methods 25 (as distinct from behavioral control over states of the world in the present work), and, more loosely, work on the neural representations underlying how animals control an effector via a brain-computer interface 26 . In psychology, Perceptual Control Theory has long maintained that behavior is best understood as a means of controlling perceptual input in the sense of control theory 27 , 28 .

In the control theory framework, a preferred decision prospect will define a set point, to be achieved by control-theoretic negative feedback controllers 29 , 30 . Problem solving then requires 1) defining the goal state; 2) planning a sequence of state transitions to move the current state toward the goal; and 3) generating actions aimed at implementing the desired sequence of state transitions.

Algorithms already exist that can implement such strategies, including the Dijkstra and A* algorithms 31 , 32 , which are commonly used in the GPS navigation devices found in cars and cell phones. Many variants of reinforcement learning solve a specific case of this problem, in which the rewarded states are relatively fixed, such as winning a game of Go 33 . While deep Q networks 1 and generative adversarial networks with Monte Carlo tree search 33 are very powerful, what happens when the goals change, or the environmental rules change? In that case, the models may require extensive retraining. The more general problem requires the ability to dynamically recalculate the values associated with each state as circumstances, goals, and set points change, even in novel situations.

Here we explore a computational model that solves this more general problem of how the brain solves problems with changing goals 34 , and we show how a number of brain regions may implement information processing in ways that correspond to specific model components. While this may seem an audacious goal, our previous work has shown how the GOLSA model can solve problems in the general sense of causing the world to assume a desired state via a sequence of actions, as described above 34 . The model begins with a core premise: the brain constitutes a control-theoretic system, generating actions to minimize the discrepancy between actual and desired states. We developed the Goal-Oriented Learning and Selection of Action (GOLSA) computational neural model from this core premise to simulate how the brain might autonomously learn to solve problems, while maintaining fidelity to known biological mechanisms and constraints such as localist learning laws and real-time neural dynamics. The constraints of biological plausibility both narrow the scope of viable models and afford a direct comparison with neural activity.

The model treats the brain as a high-dimensional control system. It drives behavior to maintain multiple and varying control theoretic set points of the agent’s state, including low level homeostatic (e.g. hunger, thirst) and high level cognitive set points (e.g. a Tower of Hanoi configuration). The model autonomously learns the structure of state transitions, then plans actions to arbitrary goals via a novel hill-climbing algorithm inspired by Dijkstra’s algorithm 32 . The model provides a domain-general solution to the problem of solving problems and performs well in arbitrary planning tasks (such as the Tower of Hanoi) and decision-making problems involving multiple constraints 34 (“ Methods ”).

The GOLSA model works by representing each possible state of the agent and environment in a network layer, with multiple layers each representing the same sets of states (Fig.  1 A,B). The Goal Gradient layer is activated by an arbitrarily specified desired (Goal) state and spreads activation backward along possible state transitions represented as edges in the network 35 , 36 . This value spreading activation generates current state values akin to learned state values (Q values) in reinforcement learning, except that the state values can be reassigned and recalculated dynamically as goals change. This additional flexibility allows goals to be specified dynamically and arbitrarily, with all state values being updated immediately to reflect new goals, thus overcoming a limitation of current RL approaches. Essentially, the Goal Gradient is the hill to climb to minimize the discrepancy between actual and desired states in the control theoretic sense. In parallel, regarding the present state of the model system, the Adjacent States layer receives input from a node representing the current state of the agent and environment, which in turn activates representations of all states that can be achieved with one state transition. The valid adjacent states then mask the Goal Gradient layer to yield the Desired Next State representation. In this layer, the most active unit represents a state which, if achieved, will move the agent one step closer to the goal state. This desired next state is then mapped onto an action (i.e. a controller signal) that is likely to effect the desired state transition. In sum, the model is given an arbitrarily specified goal state and the actual current state of the actor. It then finds an efficient sequence of states to transit in order to reach the goal state, and it generates actions aimed at causing the current state of the world to be updated so that it approaches and reaches the goal state.
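The selection mechanism described above can be condensed into a short sketch (ours, not the published model code; GOLSA's continuous neural dynamics are replaced here with discrete max-product updates):

```python
import numpy as np

def next_desired_state(adj, goal, current, decay=0.8, iters=20):
    """One GOLSA-style selection step (simplified sketch).

    adj[i, j] = 1 if state j is reachable from state i by one action.
    Returns the neighbor of `current` lying closest to `goal`.
    """
    n = adj.shape[0]
    # Goal Gradient layer: activation spreads backward from the goal along
    # state transitions, decaying with each hop (max-product propagation).
    gradient = np.zeros(n)
    gradient[goal] = 1.0
    for _ in range(iters):
        gradient = np.maximum(gradient,
                              decay * (adj * gradient[None, :]).max(axis=1))
    # Adjacent States layer: states reachable in one transition from `current`.
    adjacent = adj[current]
    # Desired Next State layer: element-wise product of the adjacency mask
    # and the goal gradient; the most active unit wins.
    return int(np.argmax(adjacent * gradient))

# Four states in a line: 0 - 1 - 2 - 3.
adj = np.zeros((4, 4))
for a, b in [(0, 1), (1, 2), (2, 3)]:
    adj[a, b] = adj[b, a] = 1

print(next_desired_state(adj, goal=3, current=0))  # prints 1: one step toward 3
```

Because the gradient is recomputed from whichever state is currently flagged as the goal, reassigning the goal re-ranks every state with no retraining of stored values.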

[Figure 1]

( A ) The GOLSA model determines the next desired state by hill climbing. Each layer represents the same set of states, one per neuron. The x- and y-axes of the grids represent abstracted coordinates in a space of states. Neurons are connected to each other for states that are reachable from another by one action, in this case neighbors in the x,y plane. The Goal state is activated and spreads activation through a Goal Gradient (Proximity) layer, thus dynamically specifying the value of each state given the goal, so that value is greater for states nearer the goal state. The Current State representation activates all Adjacent States, i.e. those that can be achieved with one state transition. These adjacent states mask the Goal Gradient input to the Desired Next State, so that the most active unit in the Desired Next State represents a state attainable with one state transition and which will bring the state most directly toward the goal state. The black arrows indicate that the Desired Next State unit activities are the element-wise products of the corresponding Adjacent States and Goal Gradient unit activities. The font colors match the model layer to corresponding brain regions in Figs. 3 and 4. ( B ) The desired state transition is determined by the conjunction of current state and desired next state. The GOLSA model learns a mapping from desired state transitions to the actions that cause those transitions. After training, the model can generate novel action sequences to achieve arbitrary goal states. Adapted from 34 .

Here we test whether and how the GOLSA model might provide an account of how various brain regions work together to drive goal-directed behavior. To do this, we ask human subjects to perform a multi-step task to achieve arbitrary goals. We then train the GOLSA model to perform the same task, and we use representational similarity analysis (RSA) to ask whether specific GOLSA model layers show similar representations to specific brain regions ( Supplementary Material ). The results will provide a tentative account of the function of specific brain regions in terms of the GOLSA model, and this account can then be tested and compared against alternative models in future work.

Study design

The details of the model implementation and the model code are available in the “ Methods ”. Behaviorally, we found that the GOLSA model is able to learn to solve arbitrary problems, such as reaching novel states in the Tower of Hanoi task (Fig.  2 A). It does this without hard-wired knowledge, simply by making initially random actions and learning from the outcomes, then synthesizing the learned information to achieve whatever state is specified as the goal state.

[Figure 2]

( A ) The GOLSA model learns to solve problems, achieving arbitrary goal states. It does this by making arbitrary actions and observing which actions cause which state transitions. Figure adapted from earlier work 34 , 37 . ( B ) Treasure Hunt task. Both the GOLSA model and the human fMRI subjects performed a simple treasure hunt task, in which subjects were placed in one of four possible starting locations, then asked to generate actions to reach any of the other possible locations. To test multi-step transitions, subjects had to first move to the location of a key needed to unlock a treasure chest, then move to the treasure chest location. Participants first saw an information screen specifying the contents of each of the four states (‘you’, ‘key’, ‘chest’, or ‘nothing’). After a jittered delay, participants selected a desired movement direction and after another delay saw an image of the outcome location. The mapping of finger buttons to game movements was random on each trial and revealed after subjects were given the task and had to plan their movements, thus avoiding motor confounds during planning. Bottom: The two state-space maps used in the experiment. One map was used in the first half of trials while the other was used in the second half, in counterbalanced order.

Having found that the model can learn autonomously to solve arbitrary problems, we then aimed to identify which brain regions might show representations and activity that matched particular GOLSA model layers. To do this, we tested the GOLSA model with a Treasure Hunt task (Fig.  2 B and “ Methods ”), which was performed by both the GOLSA model and human subjects with fMRI. All human subjects research here was approved by the IRB of Indiana University, and subjects gave full informed consent. The human subjects research was performed in accordance with relevant guidelines/regulations and in accordance with the Declaration of Helsinki. Subjects were placed in one of four starting states and had to traverse one or two states to achieve a goal, by retrieving a key and subsequently using it to unlock a treasure chest for a reward (Fig.  2 B). The Treasure Hunt task presents a challenge to standard RL approaches, because the rewarded (i.e. goal) state changes regularly. In an RL framework, the Bellman equation would regularly relearn the value of each possible state in terms of how close it is to the currently rewarded state, forgetting previous state values in the process.

Representational similarity analysis

To analyze the fMRI and model data, we used model-based fMRI with representational similarity analysis (RSA) 38 (“ Methods ”). RSA considers a set of task conditions and asks whether a model, or a brain region, can discriminate between the patterns of activity associated with each pair of conditions, as measured by a correlation coefficient. By considering every possible pairing of conditions, the RSA method constructs a symmetric representational dissimilarity matrix (RDM), where each entry is 1 − r, and r is the correlation coefficient. This RDM provides a representational fingerprint of what information is present, so that the fingerprints can be compared between a model layer and a given brain region. For our application of RSA, each RDM represented the pairwise correlations across 96 total patterns: 4 starting states × 8 trial types × 3 time points within a trial (problem description, response, and feedback). For each model layer, the pairwise correlations are calculated between the activity pattern across layer cells in one condition and the activity pattern in the same layer in the other condition. For each voxel in the brain, the pairwise correlations are calculated from the activity pattern in a local neighborhood of radius 10 mm (93 voxels total) around the voxel in question, for one condition vs. the other condition. The 10 mm radius was chosen as a tradeoff between a sufficiently high number of voxels for pattern analysis and a sufficiently small area to identify specific regions. The comparison between GOLSA model and fMRI RDMs consists of looking for positive correlations between the elements of the upper triangle of a given GOLSA model layer RDM and those of the RDM around a given voxel.

The resulting fMRI RSA maps, one per GOLSA model layer, show which brain regions have representational similarities with particular model components. These maps are computed for each subject over all functional scans and then tested across subjects for statistical significance, with whole-brain tests for significance in all cases. Full results are in Table 2, and method details are in the “ Methods ” section. As a control, we also generated a null model layer that consisted of normally distributed noise (μ = 1, σ = 1). In the null model, no voxels exceeded the cluster-defining threshold, so no significant clusters were found, which suggests that the results below are unlikely to reflect artifacts of the analysis methods.
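The core RDM computation can be sketched as follows (illustrative only: the function names and synthetic data are ours; the condition count and 93-voxel searchlight size follow the description above):

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: entry (i, j) is 1 - r between
    the activity patterns of conditions i and j (rows = conditions)."""
    return 1.0 - np.corrcoef(patterns)

def rsa_similarity(model_patterns, brain_patterns):
    """Correlate the upper triangles of the model-layer and searchlight RDMs."""
    m, b = rdm(model_patterns), rdm(brain_patterns)
    iu = np.triu_indices_from(m, k=1)  # upper triangle, diagonal excluded
    return np.corrcoef(m[iu], b[iu])[0, 1]

rng = np.random.default_rng(0)
n_cond = 96                              # 4 states x 8 trial types x 3 time points
model = rng.normal(size=(n_cond, 50))    # activity of 50 units in a model layer
noise = rng.normal(size=(n_cond, 93))    # unrelated 93-voxel searchlight
matched = model[:, :40] + 0.5 * rng.normal(size=(n_cond, 40))  # related region

print(rsa_similarity(model, matched), rsa_similarity(model, noise))
```

A searchlight that shares the model layer's representational structure yields a markedly higher similarity than an unrelated one; in the study, these per-voxel similarities were then tested for significance across subjects.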

Full list of core model layers and associated parameters.

Layer name           Inhibition type  Time constant  Decay rate  Noise gain
Current-state        Shunting         0.5            1           0
Goal                 Linear           1              1           0
Goal-gradient        Linear           1              1           0
Adjacent-states      Linear           0.2            1           0.01
Next-desired-state   Linear           1              0.1         0
Previous-state-1     Shunting         4              0.5         0
Previous-state-2     Linear           1              0.001       0
Previous-action      Linear           1              0.001       0
Observed-transition  Linear           1.5            1           0
Desired-transition   Linear           1.5            1           0
Transition-output    Linear           1.5            1           0
Action-input         Linear           1              1           0
Action-output        Linear           0.5            0.2         0

Orbitofrontal cortex, goals, and maps

We found that the patterns of activity in a number of distinct brain regions match those expected of a control theoretic system, as instantiated in the GOLSA model (Figs. 3A,B and 4A–C; Table 1). Orbitofrontal cortex (OFC) activity patterns match model components that represent both a cognitive map 7 and a flexible goal value representation 8 , specifically matching the Goal and Goal Gradient layer activities. These layers represent the current values of the goal state and the current values of states near the goal state, respectively. The Goal Gradient layer incorporates cognitive map information in terms of which states can be reached from which other states. This suggests mechanisms by which OFC regions may calculate the values of states dynamically as part of a value-based decision process, by spreading activation of value backward from a currently active goal state representation. The GOLSA model representations of the desired next state also match overlapping regions in the OFC and ventromedial prefrontal cortex (vmPFC), consistent with a role in finding the more valuable decision option (Fig. 3). Reversal learning and satiety effects as supported by the OFC reduce to selecting a new goal state or deactivating a goal state, respectively, which immediately updates the values of all states. Collectively, this provides a mechanistic account of how value-based decision-making functions in the OFC and vmPFC.

[Figure 3]

Representational Similarity Analysis (RSA) of model layers vs. human subjects performing the same Treasure Hunt task. All results shown are significant clusters across the population with a cluster defining threshold of p < 0.001 cluster corrected to p < 0.05 overall, and with additional smoothing of 8 mm FWHM applied prior to the population level t-test for visualization purposes. ( A ) Population Z maps showing significant regions of similarity to model layers in orbitofrontal cortex. Cf. Fig. 1 and Fig. 5B. The peak regions of similarity for goal-gradient and goal show considerable overlap in right OFC. The region of peak similarity for simulated-state is more posterior. To most clearly show peaks of model-image correspondence, the maps of gradient and goal are here visualized at p < 0.00001 while all others are visualized at p < 0.001. ( B ) Z maps showing significant regions of similarity to model layers in right temporal cortex. The peak regions of similarity for goal-gradient and goal overlap and extend into the OFC. The peak regions of similarity for adjacent-states, next-desired-state, and simulated-state occur in similar but not completely overlapping regions, while the cluster for queue-store is more lateral. ( C ) Fig. 1A, copied here as a legend, where the font color of each layer name corresponds to the region colors in panels (A) and (B).

[Figure 4]

Representational Similarity Analysis of model layers vs. human subjects performing the same Treasure Hunt task, with the same conditions and RSA analysis as in Fig.  3 . ( A ) Population Z maps showing significant regions of similarity to model layers in visual cortex. The peak regions of similarity for goal-gradient and goal overlap substantially, primarily in bilateral cuneus, inferior occipital gyrus, and lingual gyrus. The simulated-state layer displayed significantly similar activity to that in a smaller medial and posterior region. Statistical thresholding and significance are the same as Fig.  3 . ( B ) Z map showing significant regions of similarity to the desired-transition layer. Similarity peaks were observed for desired-transition in bilateral hippocampal gyrus as well as bilateral caudate and putamen. The desired-transition map displayed here was visualized at p  < 0.00001 for clarity. ( C ) Z maps showing significant regions of similarity to the model layers in frontal cortex. Similarity peaks were observed for queue-store in superior frontal gyrus (BA10). Action-output activity most closely resembled activity in inferior frontal gyrus (BA9), while simulated-state and goal-gradient patterns of activity were more anterior (primarily BA45). Similarity between activity in the latter two layers and activity in OFC, visual cortex, and temporal pole is also visible.

Significant similarity clusters for RSA analysis. The p and Size columns refer to cluster-corrected values. Anatomical labels are derived from the Automated Anatomical Labeling Atlas in SPM5 46 .

Layer               Peak region (TD label)    MNI (X, Y, Z)   Z-score  Peak R  p        Size
goal                Cuneus/frontal            −14, −86, 14    6.75     0.0284  < 0.001  12,209
goal-gradient       Cuneus/frontal            17, −79, −7     7.70     0.0414  < 0.001  11,528
goal-gradient       Postcentral gyrus         −21, −38, 68    4.34     0.0127  0.001    136
goal-gradient       Superior frontal gyrus    −28, 45, 34     3.77     0.0115  0.012    86
goal-gradient       Middle temporal gyrus     72, −31, −3     4.31     0.0105  0.007    97
adjacent-state      Middle temporal pole      45, 24, −37     3.54     0.0205  0.030    79
next-desired-state  Medial frontal gyrus      14, 31, −14     5.09     0.0143  < 0.001  1072
next-desired-state  Putamen                   28, −10, 14     4.24     0.0125  0.035    67
next-desired-state  Superior temporal gyrus   −34, 10, −34    4.69     0.0077  < 0.001  226
next-desired-state  Superior temporal gyrus   31, 14, −41     4.49     0.0128  < 0.001  616
next-desired-state  Pons                      3, −28, −37     4.15     0.0110  < 0.001  227
next-desired-state  Precuneus                 −17, −48, 37    3.87     0.0045  0.019    79
desired-transition  Striatum                  −24, −7, 37     6.82     0.0846  < 0.001  5615
desired-transition  Posterior cingulate       −24, −72, 10    4.63     0.0575  0.012    132
desired-transition  Cerebellum                −7, −58, −34    4.36     0.0235  < 0.001  256
desired-transition  Occipital lobe            28, −58, −3     4.30     0.0298  < 0.001  267
desired-transition  Precuneus                 −17, −58, 41    4.25     0.0543  0.036    96
action-output       Precentral gyrus          −41, 0, 34      5.29     0.0240  0.001    354
action-output       Middle frontal gyrus      28, −7, 44      5.06     0.0216  < 0.001  908
action-output       Supramarginal gyrus       −38, −45, 37    4.48     0.0174  0.002    147
queue-store         Superior frontal gyrus    21, 58, 27      4.15     0.0210  0.002    173
queue-store         Middle temporal gyrus     52, 3, −34      4.43     0.0178  < 0.001  340
queue-store         Temporal lobe             34, −45, −10    4.36     0.0068  0.048    81
queue-store         Postcentral gyrus         −58, −17, 17    3.95     0.0274  0.009    130
simulated-state     Lingual gyrus             3, −83, −7      5.15     0.0885  < 0.001  1381
simulated-state     Superior temporal pole    45, 24, −31     4.60     0.0525  < 0.001  795
simulated-state     Middle frontal gyrus      −52, 45, −3     3.81     0.0341  0.001    388
simulated-state     Middle frontal gyrus      69, −41, −3     4.11     0.0410  0.014    92
state               Extra-nuclear             −31, −14, 24    7.78     0.0191  < 0.001  43,492

Lateral PFC and planning

The GOLSA model also incorporates a mechanism that allows multi-step planning, by representing a Simulated State as if the desired next state were already achieved, so that the model can plan multiple subsequent state transitions iteratively prior to committing to a particular course of action (Fig.  5 B). Those subsequent state transitions are represented in a Queue Store layer pending execution via competitive queueing, in which the most active action representation is the first to be executed, followed by the next most active representation, and so on 39 , 40 . This constitutes a mechanism of prospection 41 and planning 42 . The Simulated State layer in the GOLSA model shows strong representational similarity with regions of the OFC and anterior temporal lobe, and the Queue Store layer shows strong similarity with the anterior temporal lobe and lateral prefrontal cortex. This constitutes a mechanistic account of how the vmPFC and OFC in particular might contribute to multi-step goal-directed planning, and how plans may be stored in lateral prefrontal cortex.
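Competitive queueing itself reduces to a simple readout rule, sketched here (a minimal illustration, not the model's neural implementation): the most active stored plan element is executed first, then suppressed so the next most active element can win.

```python
import numpy as np

def competitive_queue(activations):
    """Execute plan elements in order of activation strength:
    repeatedly pick the most active unit, then suppress it."""
    acts = np.array(activations, dtype=float)
    order = []
    while np.any(acts > 0):
        winner = int(np.argmax(acts))  # most active unit wins the competition
        order.append(winner)
        acts[winner] = 0.0             # suppress the winner after execution
    return order

# A stored plan: higher activation means earlier execution.
print(competitive_queue([0.2, 0.9, 0.5]))  # prints [1, 2, 0]
```

The activation gradient thus encodes serial order without any explicit position labels, which is the appeal of competitive queueing as a planning buffer.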

[Figure 5]

( A ) Full diagram of the core model. Each rectangle represents a layer and each arrow a projection. The body is represented as a node; two additional nodes (not shown) provide inhibition at each state change and oscillatory control. The colored squares indicate which layers receive inhibition from these nodes. Some recurrent connections are not shown. ( B ) Full diagram of the extended model, with an added top row representing the ability to plan multiple state-transition steps ahead (Simulated State, Queue Input, Queue Store, and Queue Output layers). Adapted with permission from earlier work 34 .

Visual cortex and future visual states

The visual cortex also shows representational patterns consistent with representing the goal, goal gradient, and simulated future states (Figs. 3B and 4). This suggests a role for the visual cortex in planning, in the sense of representing anticipated future states beyond simply representing current visual input. Future states in the present task are represented largely by images of locations, such as an image of a scarecrow or a house. In that sense, an anticipated future state could be decoded as matching the representation of the image of that future state. One possibility is that this reflects an attentional effect that facilitates processing of visual cues representing anticipated future states. Another possibility is that visual cortex activity signals a kind of working memory for anticipated future visual states, similar to how working memory for past visual states has been decoded from visual cortex activity 14 . This would be distinct from predictive coding, in that the activity predicts future states, not current states 43 . In either case, the results are consistent with the notion that the visual cortex may not be only a sensory region but may play some role in planning by representing the details of anticipated future states.

Anterior temporal lobe and planning

The anterior temporal lobe likewise shows representations of the goal, goal gradient, the adjacent states, the next desired state, and simulated future and queue store states (Figs. 3B and 4C). In one sense this is not surprising, as the states of the task are represented by images of objects, and visual objects (especially faces) are represented in the anterior temporal lobe 44 . Still, the fact that the anterior temporal lobe shows representations consistent with planning mechanisms suggests a more active role in planning beyond feedforward sensory processing as commonly understood 45 .

Hippocampal region and prospection

Once the desired next state is specified, it must be translated to an action. The hippocampus and striatum match the representations of the Desired Transition layer in the GOLSA model. This model layer represents a conjunction of the current state and desired next state transitions, which in the GOLSA model is a necessary step toward selecting an appropriate action to achieve the desired transition. This is consistent with the role of the hippocampus in prospection 41 , and it suggests computational and neural mechanisms by which the hippocampus may play a key role in turning goals into predictions about the future, for the purpose of planning actions 10 , 11 . Finally, as would be expected, the motor output representations in the GOLSA model match motor output patterns in the motor cortex (Fig. 4C).

The results above show how a computational neural model, the GOLSA model, provides a novel computational account of a number of brain regions. The guiding theory is that a substantial set of brain regions function together as a control-theoretic mechanism 47 , generating behaviors to minimize the discrepancy between the current state and the desired (goal) state. The OFC is understood as including neurons that represent the value of various states in the world, such as the value of acquiring certain objects. Greater activity of an OFC neuron corresponds to greater value of its represented state given the current goals. Because of spreading activation, neurons will be more active if they represent states closer to the goal. This results in value representations similar to those provided by the Bellman equation of reinforcement learning 48 , with the difference being that spreading activation can instantly reconfigure the values of states as goals change, without requiring extensive iterations of the Bellman equation.
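The spreading-activation account can be sketched in a few lines. The toy example below (a hypothetical four-state ring like the farm map, with an illustrative decay factor and step count, not the model's actual dynamics) shows how clamping a goal unit and letting activity diffuse yields Bellman-like values that can be recomputed instantly when the goal changes:

```python
import numpy as np

# Hypothetical 4-state ring: each state reachable from its two
# neighbours but not the opposite corner (illustrative adjacency).
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)

def goal_gradient(adj, goal, decay=0.5, steps=10):
    """Clamp the goal unit at 1 and spread decaying activation outward;
    the result assigns higher value to states nearer the goal."""
    a = np.zeros(adj.shape[0])
    a[goal] = 1.0
    for _ in range(steps):
        spread = decay * (adj * a).max(axis=1)  # best neighbour's activity, decayed
        a = np.maximum(a, spread)
        a[goal] = 1.0                           # goal stays clamped
    return a

v = goal_gradient(adj, goal=0)    # values fall off with distance from state 0
# Switching goals requires no relearning: simply re-run with a new clamped unit.
v2 = goal_gradient(adj, goal=2)
```

Unlike iterating the Bellman equation to convergence after each goal change, the gradient here is regenerated in a single diffusion pass.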

Given the current state and the goal state, the next desired state can be determined as a nearby state that can be reached and that also moves the current state of the world closer to the goal state. Table 1 shows these effects in the medial frontal gyrus, putamen, superior temporal gyrus, pons, and precuneus. The GOLSA model suggests this is computed as the activation of available state representations, multiplied by the OFC value for that state. Precedent for this kind of multiplicative effect has been shown in the attention literature 49 . The action to be generated is represented by neural activity in the motor cortex region. This in turn is determined on the basis of neurons that are active specifically for a conjunction of the particular current state and next desired state. Neurally, we find this conjunction represented across a large region including the striatum and hippocampus. This is consistent with the notion of the hippocampus as a generative recurrent neural network that starts at a current state and runs forward, specifically toward the desired state 50 . The striatum is understood as part of an action gate that permits certain actions in specific contexts, although the GOLSA model does not include an explicit action gate 51 . Where multiple action steps must be planned prior to executing any of them, the lateral PFC seems to represent a queue of action plans in sequence, as sustained activity representing working memory 39 , 52 . By contrast, working memory representations in the visual cortex apparently represent the instructed future states as per the instructions for each task trial, and these are properly understood as visual sensory rather than motor working memories 14 .
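The multiplicative selection of the next desired state amounts to masking a value gradient by the states reachable from the current state. In this sketch the adjacency matrix and value vector are made-up stand-ins, not quantities from the paper:

```python
import numpy as np

# Illustrative stand-ins: adjacency of a 4-state ring and a
# goal-gradient-like value vector peaked at the goal (state 0).
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
values = np.array([1.0, 0.5, 0.25, 0.5])   # e.g. an OFC-like value gradient

def next_desired_state(current, adj, values):
    # Activation of available (adjacent) states, multiplied by their value
    masked = adj[current] * values
    return int(np.argmax(masked))

# From state 2, both neighbours (1 and 3) lie one step closer to the goal
choice = next_desired_state(2, adj, values)
```

Ties between equally valuable neighbours (as from state 2 here) would be broken by noise or competition in the full model; `argmax` simply picks the first.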

Our findings overall bear a resemblance to the Free Energy principle. According to this principle, organisms learn to generate predictions of the most likely (i.e. rewarding) future states under a policy, then via active inference emit actions to cause the most probable outcomes to become reality, thus minimizing surprise 53 , 54 . Like active inference, the GOLSA model emits actions to minimize the discrepancy between the actual and predicted state. Of note, the GOLSA model specifies the future state as a desired state rather than a most likely state. This crucial distinction allows a state that has a high current value to be pursued even if the probability of being in that state is very low (for example, buying lottery tickets and winning). Furthermore, the model includes the mechanisms of Fig. 1, which allow for flexible planning given arbitrary goals. The GOLSA model is a process model and simulates rate-coded neural activity as a dynamical system (“Methods”), which affords a more direct comparison with neural activity representations over time, as in Figs. 3 and 4.

The GOLSA model, and especially our analysis of it, builds on recent work that developed methods to test computational neural models against empirical data. Substantial previous work has demonstrated how computational neural modeling can provide insight into the functional properties underlying empirical neural data: recurrent neural networks elucidating the representational structure in anterior cingulate 19 , 55 , 56 and PFC 57 ; deep neural networks accounting for object recognition in IT with representational similarity analysis 58 and for encoding/decoding of visual cortex representations 59 ; dimensionality reduction for comparing neural recordings and computational neural models 60 ; and representations of multiple learned tasks in computational neural models 61 .

The GOLSA model shares some similarity with model-based reinforcement learning (MBRL), in that both include learned models of next-state probabilities as a function of current state and action pairs. Still, a significant limitation of both model-based and model-free RL is that typically there is only a single ultimate goal, e.g. gaining a reward or winning a game. Q values 62 are thus learned in order to maximize a single reward value. This implies several limitations: (1) that Q values are strongly paired with corresponding states; (2) that there is only one Q value per state at a given time, as in a Markov decision process (MDP); and (3) that Q values are generally updated via substantial relearning. In contrast, real organisms will find differing reward values associated with different goals at different times and circumstances. This implies that goals will change over time, and relearning Q values with each goal change would be inefficient. Instead, a more flexible mechanism will dynamically assign values to various goals and then plan accordingly. The GOLSA model exemplifies this approach, essentially replacing the learned Q values of MBRL and MDPs with an activation-based representation of state value, which can be dynamically reconfigured as goals change. This overcomes the three limitations above.

Our work has several limitations. First, regarding the GOLSA model itself, the main limitation is its present implementation of one-hot state representations. This makes a scale-up to larger and continuous state spaces challenging. Future work may overcome this limitation by replacing the one-hot representations with vector-valued state representations and the learned connections with deep network function approximators. This would require corresponding changes in the search mechanisms of Fig. 1A, from parallel spreading activation to a serial Monte Carlo tree search mechanism. This would be consistent with evidence of serial search during planning 63 , 64 and would afford a new approach to artificial general intelligence that is both powerful and similar to human brain function. Another limitation is that the Treasure Hunt task is essentially a spatial problem solving task. We anticipate that the GOLSA model could be applied to solve more general, non-spatial problems, but this remains to be demonstrated.

The fMRI analysis here has several limitations as well. First, a correspondence of representations does not imply a correspondence of computations, nor does it prove the model correct in an absolute sense 65 . There are other computational models that use diffusion gradients to solve goal-directed planning 66 , and more recent work with deep networks to navigate from arbitrary starting to arbitrary ending states 50 . The combined model and fMRI results here constitute a proposed functional account of the various brain regions, but our results do not prove that the regions compute exactly what the corresponding model regions do, nor can we definitively rule out competing models. Nevertheless, the ability of the model to account for fMRI data selectively in specific brain regions suggests that it merits further investigation and direct tests against competing models, as a direction for future research. Future work might compare other models besides GOLSA against the fMRI data using RSA, to ascertain whether other model components might provide a better fit to, and account of, specific brain regions. While variations of model-based and model-free reinforcement learning models would seem likely candidates, we know of only one model, by Banino et al. 50 , endowed with the ability to flexibly switch goals and thus perform the Treasure Hunt task as does the GOLSA model 34 . It would be instructive to compare the overall abilities of GOLSA and the model of Banino et al. to account for the RDMs of specific brain regions in the Treasure Hunt task, although it is unclear how to do a direct comparison given that the two models consist of very different mechanisms.

The GOLSA model may in principle be extended hierarchically. The frontal cortex has a hierarchical representational structure, in which higher levels of a task may be represented more anteriorly 67 . Such hierarchical structure has been construed to represent higher, more abstract task rules 13 , 15 , 68 . The GOLSA model suggests another perspective: that higher level representations consist of higher level goals instead of higher level rules. In the coffee-making task, for example 69 , the higher level task of making coffee may require a lower level task of boiling water. If the GOLSA model framework were extended hierarchically, the high level goal of having coffee prepared would activate a lower level goal of having the water heated to a specified temperature. The goal specification framework here is intrinsically more robust than a rule- or schema-based framework: rules may fail to produce a desired outcome, but if an error occurs in the GOLSA task performance, replanning simply calculates the optimal sequence of events from whatever the current state is, and the error will be automatically addressed.

This incidentally points to a key difference between rules and goals, in that task rules define a mapping from stimuli to responses 15 , in a way that is not necessarily teleological. Goals, in contrast, are by definition teleological. This distinction roughly parallels that between model-free and model-based reinforcement learning 70 . The rule concept, as a stimulus–response mapping, implies that an error is a failure to generate the action specified by the stimulus, regardless of the final state of a system. In contrast, the goal concept implies that an error is precisely a failure to generate the desired final state of a system. Well-learned actions may acquire a degree of automaticity over time 71 , but arguably the degree of automaticity is independent of whether an action is rule-oriented vs. goal-directed. If a goal-directed action becomes automatized, this does not negate its teleological nature, namely that errors in the desired final state of the world can be detected and lead to corrective action to achieve the desired final state. Rule-based action, whether deliberate or automatized, does not necessarily entail corrective action to achieve a desired state. Where actions are generated, and possibly corrected, to achieve a desired state of the world, this may properly be referred to as goal-directed behavior.

We have investigated the GOLSA model here to examine whether and how it might account for the function of specific brain regions. Using RSA, we found that specific layers of the GOLSA model show strong representational similarities with corresponding brain regions. Goals and goal value gradients matched especially the orbitofrontal cortex, and also some aspects of the visual and anterior temporal cortices. The desired transition layer matched representations in the hippocampus and striatum, and simulated future states matched representations in the middle frontal gyrus and superior temporal pole. Not surprisingly, the model motor layer representations matched the motor cortex. Collectively, these results constitute a proposal that the GOLSA model can provide an organizing account of how multiple brain regions interact to form essentially a negative feedback controller, with time-varying behavioral set points derived from motivational states. Future work may investigate this proposal in more depth and compare against alternative models.

Model components

The GOLSA model is constructed from a small set of basic components, and the model code is freely available as supplementary material. The main component class is a layer of units, where each unit represents a neuron (or, more abstractly, a small subpopulation of neurons) corresponding to either a state, a state transition, or an action. The activity of units in a layer represents the neural firing rate and is instantiated as a vector updated according to a first-order differential equation (cf. Grossberg 72 ). The activation function varies between layers, but all units in a particular layer are governed by the same equation. The most typical activation function for a single unit is,

$$\tau\,\frac{da(t)}{dt} = -\lambda a(t) + (1 - a(t))E - I + \varepsilon N(t)\,dt \tag{1}$$

where a represents activation, i.e. the firing rate, of a model neuron. The four terms of this equation represent, in order: passive decay $-\lambda a(t)$, shunting excitation $(1 - a(t))E$, linear inhibition $-I$, and random noise $\varepsilon N(t)\,dt$. “Shunting” refers to the fact that excitation (E) scales inversely as current activity increases, with a natural upper bound of 1. The passive decay works in a similar fashion, providing a natural lower bound activity of 0. The inhibition term linearly suppresses unit activity, while the final term adds normally distributed noise N (μ = 0, σ = 1), with strength ε. Because the differential equations are approximated using the Euler method, the noise term is multiplied by dt to standardize the magnitude across different choices of dt 73 , 74 . The speed of activity change is determined by a time constant τ. The parameters τ, λ, and ε vary by layer in order to implement different processes. E and I are the total excitation and inhibition, respectively, impinging on a particular unit, summed over every presynaptic unit j in every projection p onto the target unit,

$$E = \sum_{p}\sum_{j} w_{pj}\,a_{pj}$$

where $a_{pj}$ is the activation of a presynaptic model neuron that provides excitation, and $w_{pj}$ is the synaptic weight that determines how much excitation per unit of presynaptic activity will be provided to the postsynaptic model neuron. The total inhibition I is computed analogously over inhibitory projections.
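As a concrete sketch, the activation equation can be integrated with the Euler method as described; the parameter values below are illustrative, not the per-layer values used in the model:

```python
import numpy as np

def euler_step(a, E, I, tau=0.1, lam=1.0, eps=0.0, dt=0.001, rng=None):
    """One Euler step of the shunting activation equation: passive decay,
    shunting excitation, linear inhibition, and (optional) noise whose
    magnitude is standardized by dt as described in the text."""
    noise = rng.standard_normal(a.shape) if rng is not None else np.zeros_like(a)
    return a + (dt / tau) * (-lam * a + (1.0 - a) * E - I + eps * noise)

# With constant input and no inhibition, each unit settles at
# E / (lam + E), which stays below the natural upper bound of 1.
a = np.zeros(3)
E = np.array([2.0, 1.0, 0.0])   # total excitation per unit
I = np.zeros(3)
for _ in range(10000):
    a = euler_step(a, E, I)
```

The equilibrium illustrates the shunting property: excitation alone can never push activity past 1, and passive decay alone can never push it below 0.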

A second activation function used in several places throughout the model is,

$$\tau\,\frac{da(t)}{dt} = -\lambda a(t) + (1 - a(t))E - a(t)I + \varepsilon N(t)\,dt \tag{2}$$

This function is identical to Eq. (1) except that the inhibition is also shunting, such that it exhibits a strong effect on highly active units and a smaller effect as unit activity approaches 0. While more typical in other models, shunting inhibition has a number of drawbacks in the current model. Two common uses for inhibition in the GOLSA model are winner-take-all dynamics and regulatory inhibition which resets layer activity. Shunting inhibition impedes both of these processes because it fails to fully suppress the appropriate units, since it becomes less effective as unit activity decreases.

Projections

Layers connect to each other via projections, representing the synapses connecting one neural population to another. The primary component of projections is a weight matrix specifying the strength of connections between each pair of units. Learning is instantiated by updating the weights according to a learning function. These functions vary between the projections responsible for the model's learning and are fully described in the section below dealing with each learning type. Some projections also maintain a matrix of traces updated by a projection-specific function of presynaptic or postsynaptic activity. The traces serve as a kind of short-term memory for which pre- or postsynaptic units were recently activated, and they play a role very similar to the eligibility traces of Barto et al. 75 , though with a different mathematical form.
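A minimal sketch of a projection component follows; the class layout, trace rule, and learning rule here are illustrative stand-ins (the actual implementation is in the supplementary model code):

```python
import numpy as np

class Projection:
    """Connects a presynaptic to a postsynaptic layer via a weight
    matrix, with a decaying trace of recent presynaptic activity."""

    def __init__(self, n_pre, n_post, trace_decay=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.uniform(0.0, 0.1, size=(n_post, n_pre))  # connection strengths
        self.trace = np.zeros(n_pre)        # short-term memory of recent inputs
        self.trace_decay = trace_decay

    def propagate(self, a_pre):
        # The decaying trace marks which presynaptic units were recently
        # active, similar in role to an eligibility trace
        self.trace = self.trace_decay * self.trace + a_pre
        return self.w @ a_pre               # excitation delivered postsynaptically

    def learn(self, a_post, lr=0.01):
        # Hebbian-style update pairing postsynaptic activity with the
        # trace rather than with instantaneous presynaptic input
        self.w += lr * np.outer(a_post, self.trace)
```

Gating a projection by a node (see below) would amount to skipping `propagate` or `learn` unless the node's condition holds.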

Nodes are model components that are not represented neurally via an activation function. They represent important control and timing signals to the model and are either set externally or update autonomously according to a function of time. For instance, sinusoidal oscillations are used to gate activity between various layers. While in principle rate-coded model neurons could implement a sinusoidal wave, the function is simply hard coded into the update function of the node for simplicity. In some cases, it is necessary for an entire layer to be strongly inhibited when particular conditions hold true, such as when an oscillatory node is in a particular phase. Layers therefore also have a list of inhibitor nodes that prevent unit activity within the layer when the node value meets certain conditions. In a similar fashion, some projections are gated by nodes such that they allow activity to pass through and/or allow the weights to be updated only when the relevant node activity satisfies a particular condition. Another important node provides strong inhibition to many layers when the agent changes states.
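The node mechanism can be sketched as follows; the class names, oscillator frequency, and gating condition are hypothetical illustrations of the scheme described above:

```python
import numpy as np

class OscillatorNode:
    """A node whose value follows a hard-coded sinusoid of time rather
    than a neural activation function."""
    def __init__(self, freq=1.0):
        self.freq = freq
        self.value = 0.0

    def update(self, t):
        self.value = np.sin(2.0 * np.pi * self.freq * t)

def apply_inhibitor(layer_activity, node, condition=lambda v: v < 0.0):
    """Clamp the whole layer to zero whenever the inhibitor node's value
    meets the gating condition (e.g. a particular oscillator phase)."""
    if condition(node.value):
        return np.zeros_like(layer_activity)
    return layer_activity
```

A state-change node would work the same way: its value is set externally when the agent moves, strongly inhibiting the listed layers for that moment.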

Environment

The agent operates in an environment consisting of discrete states, with a set of allowable state transitions. Allowable state transitions are not necessarily bidirectional, but for the present simulations, they are deterministic (unlike the typical MDP formulation used in RL). In some simulations, the environment also contains different types of reward located in various states, which can be used to drive goal selection. In other simulations, the goal is specified externally via a node value.
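Such an environment can be sketched as a directed graph of deterministic transitions. The state names follow the farm task, but the particular layout and direction labels below are assumptions for illustration:

```python
# Four states arranged in a square; moves are deterministic, diagonal
# moves are not allowed, and transitions need not be bidirectional in
# general (in this layout they happen to be).
transitions = {
    "scarecrow": {"up": "farmhouse", "right": "pasture"},
    "farmhouse": {"down": "scarecrow", "right": "stump"},
    "stump":     {"left": "farmhouse", "down": "pasture"},
    "pasture":   {"left": "scarecrow", "up": "stump"},
}

def step(state, action):
    # A disallowed move leaves the agent where it is
    return transitions[state].get(action, state)
```

Because the transitions are deterministic rather than probabilistic, this differs from the typical MDP formulation used in RL, as noted above.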

Complete network

Each component and subnetwork of the model is described in detail below or in the main text, but for reference and completeness a full diagram of the core network is shown in Fig. 5A, and the network augmented for multi-step planning is shown in Fig. 5B. Some of the basic layer properties are summarized in Table 2. Layers and nodes are referred to using italics, such that the layer representing the current state is referred to simply as current-state.

Representational structure

In Fig. 5B, the layers Goal, Goal Gradient, Next State, Adjacent States, Previous States, Simulated State, and Current State all have the same number of nodes and the same representational structure, i.e. one state per node.

The layers Desired Transition, Observed Transition, Transition Output, Queue Input, Queue Output, and Queue Store likewise have the same representational structure, which is the number of possible states squared. This allows a node in these layers to represent a transition from one specific state to another specific state.

The layers Action Input, Action Output, and Previous Action all have the same representational structure, which is one possible action per node.
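These layer sizes imply a simple one-hot indexing convention, sketched below; the particular ordering of units is an assumption for illustration:

```python
n_states = 4    # state layers: one unit per state
n_actions = 4   # action layers: one unit per action

# Transition layers have n_states**2 units, one per ordered
# (from-state, to-state) pair, so a single unit can represent a
# transition from one specific state to another specific state.
def transition_index(i, j):
    return i * n_states + j

def transition_pair(k):
    return divmod(k, n_states)   # inverse mapping back to (from, to)
```

This pairing is what lets the Desired Transition layer represent a conjunction of the current state and the next desired state with one active unit.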

Task description

The Treasure Hunt task (Fig. 2) was created and presented in OpenSesame, a Python-based toolbox for psychological task design 76 . In the task, participants control an agent that can move within a small environment comprising four distinct states. The nominal setting is a farm, and the states are a field with a scarecrow, the lawn in front of the farm house, a stump with an axe, and a pasture with cows. Each is associated with a picture of the scene obtained from the internet. These states were chosen to exemplify categories previously shown to elicit a univariate response in different brain regions, namely faces, houses, tools, and animals 77 – 79 .

Over the course of the experiment, participants were told the locations of treasure chests and the keys needed to open them. By arriving at a chest with the key, participants earned points which were converted to a monetary bonus at the end of the experiment. The states were arranged in a square, where each state was accessible from the two adjacent states but not the state in the opposite corner (diagonal movement was not allowed).

Each trial began with the presentation of a text screen displaying the relevant information for the next trial, namely the locations of the participant, the key, and the chest (Fig. 2). Because the neural patterns elicited during the presentation were the primary target of the decoding analysis, it was important that visual information be as similar as possible across different goal configurations, to avoid potential confounds. To hold luminance as constant as possible across conditions, each line always had the same number of characters. Since, for instance, “Farm House: key” has fewer characters than “Farm House: Nothing”, filler characters were added to the shorter lines, namely Xs and Os. On some trials Xs were the filler characters on the top row and Os were the filler characters on the bottom rows. This manipulation allowed us to attempt to decode the relative position of the Xs and Os, to test whether decoding could be achieved due only to character-level differences in the display. We found no evidence that our results reflect low-level visual confounds such as the properties of the filler characters.

Participants were under no time constraint on the information screen and pressed a button when they were ready to continue. A delay screen then appeared consisting of four empty boxes. After a jittered interval (1–6 s, distributed exponentially), arrows appeared in the boxes. The arrows represented movement directions and the boxes corresponded to four buttons under the participant's left middle finger, left index finger, right index finger, and right middle finger, from left to right. Participants pressed the button corresponding to the box with the arrow pointing in the desired direction to initiate a movement. A fixation cross then appeared for another jittered delay of 0–4 s, followed by a 2 s display of the newly reached location if their choice was correct or an error screen if it was incorrect.
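One way to draw such a bounded, exponentially distributed jitter is a shifted exponential with rejection of draws beyond the upper bound; the rate parameter and truncation scheme here are assumptions, since they are not reported above:

```python
import numpy as np

def jittered_delay(rng, low=1.0, high=6.0, scale=1.5):
    """Sample a delay in [low, high] seconds from a shifted exponential,
    rejecting draws that exceed the upper bound (assumed scheme)."""
    while True:
        d = low + rng.exponential(scale)
        if d <= high:
            return d

rng = np.random.default_rng(0)
delays = [jittered_delay(rng) for _ in range(1000)]
```

Exponentially distributed jitters of this kind are commonly used in event-related fMRI to decorrelate the regressors for successive trial events.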

If the participant did not yet have the key required to open the chest, the correct movement was always to the key. Sometimes the key and chest were in the same location in which case the participant would earn points immediately. If they were in different locations, then on the next trial the participant had to move to the chest. This structure facilitated a mix of goal distances (one and two states away) while controlling the route required to navigate to the goal.

If the chosen direction was incorrect, participants saw an error screen displaying text and a map of the environment. Participants advanced from this screen with a button press and then restarted the failed trial. If the failed trial was the second step in a two-step sequence (i.e., if they had already gotten the key and then moved to the wrong state to get to the chest), they had to repeat the previous two trials.

Repeating the failed trial ensured that there were balanced numbers of each class of event for decoding, since an incorrect response indicated that some information was not properly maintained or utilized. For example, if a participant failed the second step of a two-trial sequence, then they may not have properly encoded the final goal when first presented with the information screen on the previous trial, which specified the location of the key and the chest.

Halfway through the experiment, the map was reconfigured such that states were swapped across the diagonal axes of the map. This was necessary because otherwise, each state could be reached by exactly two movement directions and exactly two movement directions could be made from it. For instance, if the farm house was the state in the lower left, the farmhouse could only be reached by moving left or down from adjacent states, and participants starting at the farm house could only move up or to the right. If this were true across the entire experiment, above-chance classification of target state, for instance, could appear in regions that in fact only contain information about the intended movement direction.

Each state was the starting state for one quarter of the trials and the target destination for a different quarter of the trials. All trials were one of three types. One category consisted of single-trial (single) sequences in which the chest and key were in the same location. The sequences in which the chest and key were in separate locations required two trials to complete, one to move from the initial starting location to the key and another to move from the key location to the chest location. These two steps formed the other two classes of trials, the first-of-two (first) and second-of-two (second) trials. Recall that on second trials, no information other than the participant’s current location is presented on the starting screen to ensure that the participant maintained the location of the chest in memory across the entire two-trial sequence (if it was presented on the second trial, there would be no need to maintain that information through the first trial). The trials were evenly divided into single, first, and second classes with 64 trials in each class. Therefore, every trial had a starting state and an immediate goal, while one third of trials also had a more distant final goal.

Immediately prior to participating in the fMRI version of the task, participants completed a short 16-trial practice outside the scanner to refresh their memory. Before beginning the first run inside the scanner, participants saw a map of the farm states and indicated when they had memorized it before moving on. Within each run, participants completed as many trials as they could within eight minutes. As described above, exactly halfway through the trials, the state space was rearranged with each state moving to the opposite corner. Therefore, when participants completed the first half of the experiment, the current run was terminated and participants were given time to learn the new state space before scanning resumed. At the end of the experiment, participants filled out a short survey about their strategy.

Participants

In total, 49 participants (28 female) completed the behavioral-only portion of the experiment, including during task piloting (early versions of the behavioral task were slightly different from the version described above). Participants provided written informed consent in accordance with the Institutional Review Board at Indiana University, and were compensated $10/hour for their time plus a performance bonus based on accuracy of up to an additional $10. The behavioral task first served as a pilot during task design and then as a pre-screen for the fMRI portion, in that only participants with at least 90% accuracy were invited to participate. Additional criteria for scanning were that the subjects be right-handed, free of metal implants, free of claustrophobia, weigh less than 440 pounds, and not be currently taking psychoactive medication. In total, 25 participants participated in the fMRI task, but one subject withdrew shortly after beginning, leaving 24 subjects who completed the imaging task (14 female). Across the 24 subjects, the average error rate of responses during the fMRI task was 2.4%; error trials were modeled separately in the fMRI analysis and were not analyzed further, as there were too few for a meaningful analysis.

fMRI acquisition and data preprocessing

Imaging data were collected on a Siemens Magnetom Trio 3.0-Tesla MRI scanner with a 32-channel head coil. Foam padding was inserted around the sides of the head to increase participant comfort and reduce head motion. Functional T2*-weighted images were acquired using a multiband EPI sequence 80 with 42 contiguous slices and 3.44 × 3.44 × 3.4 mm³ voxels (echo time = 28 ms; flip angle = 60°; field of view = 220 mm; multiband acceleration factor = 3). For the first subject, the TR was 813 ms, but during data collection for the second subject the TR changed to 816 ms for unknown reasons. The scanner was upgraded after collecting data from an additional five subjects, at which point the TR remained constant at 832 ms. All other parameters remained unchanged. High-resolution T1-weighted MPRAGE images were collected for spatial normalization (256 × 256 × 160 matrix of 1 × 1 × 1 mm³ voxels; TR = 1800 ms; echo time = 2.56 ms; flip angle = 9°).

Functional data were spike-corrected using AFNI’s 3dDespike ( http://afni.nimh.nih.gov/afni ). Functional images were corrected for difference in slice timing using sinc-interpolation and head movement using a least-squares approach with a 6-parameter rigid body spatial transformation. For subjects who moved more than 3 mm total or 0.5 mm between TRs, 24 motion regressors were added to subsequent GLM analyses 81 .

Because MVPA and representational similarity analysis (RSA) rely on precise voxelwise patterns, these analyses were performed before spatial normalization. For the univariate analyses, structural data were coregistered to the functional data and segmented into gray and white matter probability maps 82 . These segmented images were used to calculate spatial normalization parameters to the MNI template, which were subsequently applied to the functional data. As part of spatial normalization, the data were resampled to 2 × 2 × 2 mm³, and this upsampling allowed maximum preservation of information. All analyses included a temporal high-pass filter (128 s) and correction for temporal autocorrelation using an autoregressive AR(1) model.

Univariate GLM

For initial univariate analyses, we measured the neural response associated with each outcome state at the outcome screen (when an image of the state was displayed), as well as the signal at the start of the trial associated with each immediate goal location. Five timepoints were modeled in the GLM used in this analysis, namely the start of the trial, the button press to advance, the appearance of the arrows and subsequent response, the start of the feedback, and the end of the feedback. The regressors marking the start of the trial and the start of the feedback screen were further individuated by the immediate goal on the trial. A separate error regressor was used when the response was incorrect, meaning the participant did not properly pursue the immediate goal and received error feedback. All correct trials in which participants moved to, for instance, the cow field, used the same trial start and feedback start regressors.

The GLM was fit to the normalized functional images. The resulting beta maps were combined at the second level with a voxel-wise threshold of p < 0.001 and cluster corrected (p < 0.05) to control for multiple comparisons. We assessed the univariate response associated with each outcome location by contrasting each particular outcome location with all other outcome locations. The response to the error feedback screen was assessed in a separate contrast against all correct outcomes. To test for any univariate responses related to the immediate goal, we performed an analogous analysis using the trial start regressors, which were individuated based on the immediate goal. For example, the regressor ‘trialStartHouseNext’ was associated with the beginning of every trial where the farmhouse was the immediate goal location. To assess the univariate signal associated with the farmhouse immediate goal, we performed a contrast between this regressor and all other trial start regressors.

Representational similarity analysis (RSA)

As before, a GLM was fit to the realigned functional images. The following events were modeled with impulse regressors: trial onset (information screen), key press to advance to the decision screen, the prompt and immediately subsequent action (modeled as a single regressor), the onset of the outcome screen, and the termination of the outcome screen. The RSA analysis used beta maps derived from the regressors marking trial onset, prompt/response, and outcome screen onset.

Each of these regressors (except those used on error trials) was further individuated by the (start state, next state, final goal) triple constituting the goal path. There were 8 distinct trial types starting in each state. Each state could serve as the starting point of two single-step sequences (in which the key and treasure chest are in the same location) and four two-step sequences (in which the key and treasure chest are in different locations). Each state could also be the midpoint of a two-step sequence with the treasure chest located in one of two adjacent states. With three regressors per trial, there were 4 starting states × 8 trial types × 3 time points = 96 total patterns used to create the representational dissimilarity matrix (RDM) in each searchlight region, where cell (i, j) of the RDM is defined as one minus the Pearson correlation between the ith and jth patterns. Values close to 2 therefore represent negative correlation (high representational distance), while values close to 0 indicate positive correlation (low representational distance).
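
The RDM construction described above can be sketched in a few lines; the patterns below are random stand-ins for the 96 beta patterns, with an assumed voxel count:

```python
import numpy as np

# 96 patterns x 50 voxels of random data as stand-ins for the beta patterns
# extracted in a searchlight region (shapes illustrative only).
rng = np.random.default_rng(0)
patterns = rng.standard_normal((96, 50))

def make_rdm(patterns):
    """RDM cell (i, j) = 1 - Pearson correlation between patterns i and j."""
    corr = np.corrcoef(patterns)  # rows are patterns, columns are voxels
    return 1.0 - corr             # 0 = identical, 2 = perfectly anti-correlated

rdm = make_rdm(patterns)
```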

To derive the model-based RDMs, the GOLSA model was run on an analogue of the goal pursuit task, using a four-state state space with four actions corresponding to movement in each cardinal direction. The model layer timecourses of activity are shown in Figs. 6 and 7 for one- and two-step trials, respectively. The base GOLSA model is not capable of maintaining a plan across an arbitrary delay, but instead acts immediately to make the necessary state transitions. The competitive queue module [83] allows state transition sequences to be maintained and executed after a delay, and was therefore necessary to model the task as accurately as possible. However, the goal-learning module was not necessary, since goals were externally imposed. Because participants had to demonstrate high performance on the task before entering the scanner, little if any learning took place during the experiment. As a result, the model was trained extensively on the state space before performing any trials used in data collection. To further simulate likely patterns of activity in the absence of significant learning, the input from state to goal-gradient (used in the learning phase of an oscillatory cycle) was removed, and the goal-gradient layer received steady input from the goal layer, interrupted only by the state-change inhibition signal. In other words, the goal-gradient layer continuously represented the actual goal gradient rather than shifting into learning mode half of the time.

Figure 6.

Model activity during a simulated one-step sequence of the Treasure Hunt task. The competitive queueing module first loads a plan and then executes it sequentially. State activity shows that the agent remains in state 1 for the first half of the simulation, while simulated-state (StateSim) shows the state transition the agent simulates as it forms its plan. Adjacent-states (Adjacent) receives input from StateSim which, along with goal-gradient (Gradient) activity, determines the desired next state and therefore the appropriate transition to make. The plan is kept in queue-store (Store), which receives a burst of input from queue-input (QueueIn) and finally executes the plan by sending output to queue-output (QueueOut), which drives the motor system. The vertical dashed lines indicate the different phases of the simulation used in the creation of the model RDMs. For each layer, activity within each period was averaged across time to form a single vector representing the average pattern for that time period in the trial type being simulated. The bounds of each phase were determined qualitatively. The planning period is longer than the acting and outcome periods because the model takes longer to form a plan than to execute it or observe the outcome.

Figure 7.

Model activity during a simulated two-step sequence of the Treasure Hunt task. The competitive queueing module first loads a plan and then executes it sequentially. State activity shows that the agent remains in state 1 for the first half of the simulation, while simulated-state shows the state transitions the agent simulates as it forms its plan. Adjacent-states receives input from simulated-state which, along with goal-gradient activity, determines the desired next state and therefore the appropriate transitions to make. The plan is kept in queue-store, which receives bursts of input from queue-input and finally executes the plan by sequentially sending output to queue-output, which drives the motor system. To force the agent to go to the appropriate intermediate state, goal activity first reflects the key location and then the chest location. The vertical dashed lines indicate the time periods used when creating the RDMs for the two-step sequence simulations. The first three time periods correspond to the first trial in the sequence, while the latter three correspond to the second trial. Again, the first planning period is much longer due to the nature of the model dynamics. During the second “planning” period (P2), the plan was already formed, as must have been the case in the actual experiment: on the second trial of a two-step sequence, no information was presented at the start of the trial, so the plan had to be remembered from the previous trial.

In the task, participants first saw an information screen from which they could determine the immediate goal state and the appropriate next action. This plan was maintained over a delay before being implemented. At the beginning of each trial simulation, the queuing module was set to “load” while the model interactions determined the best method of getting from the current state to the goal state. This period is analogous to the period in which subjects look at the starting information screen and plan their next move. Then, the queuing module was set to “execute,” modeling the period in which participants are prompted to make their selection. Finally, the chosen action implements a state transition and the environment provides new state information to the state layer, modeling the outcome phase of the experiment.
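
The load-then-execute behavior described above can be illustrated with a generic competitive-queuing sketch (not the actual GOLSA implementation): during loading, planned actions receive graded activations, and execution repeatedly emits and suppresses the most active item.

```python
# Generic competitive-queuing sketch (illustrative; action names and the
# activation scheme are hypothetical, not taken from the GOLSA code).
def load_plan(actions):
    """'Load' phase: assign decreasing activation to each planned action,
    so earlier steps in the plan are more active."""
    n = len(actions)
    return {a: (n - i) / n for i, a in enumerate(actions)}

def execute_plan(queue):
    """'Execute' phase: repeatedly emit the most active action,
    then suppress it so the next-most-active action wins."""
    order = []
    while queue:
        a = max(queue, key=queue.get)  # competitive choice
        order.append(a)
        del queue[a]                   # suppression after output
    return order

queue = load_plan(["up", "right"])     # hypothetical two-step plan
order = execute_plan(queue)            # emits "up", then "right"
```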

Some pairs of trials in the task comprised a two-step sequence in which the final goal was initially two states away from the starting state. On the second trial of such sequences, participants were not provided any information on the information screen at the start of the trial, ensuring that they had encoded and maintained all goal-related information from the information screen presented at the start of the first trial in the sequence. These pairs of trials were modeled within a single GOLSA simulation. The model seeks the quickest path to the goal, identifying immediately available subgoals as needed. However, in the task, the location of the key necessitated a specific path to reach the final goal of the treasure chest. To provide these instructions to the model at the start of a two-step simulation, the goal representation of the subgoal (the key) was provided to the model first, until the appropriate action was loaded, and then the goal representation shifted to the final goal (the chest). Once the full two-step state transition sequence was loaded in the queue, the actions were read out sequentially, as shown in Fig. 7.

A separate RDM was generated for each model layer. Patterns were extracted from three time intervals per action (six in total for the two-step sequence simulations). Due to the time required to load the queue, the first planning period was longer than all other intervals. For each simulation and time point, the patterns of activity across the units were averaged over time, yielding one vector. Each trial type was repeated 10 times, and the patterns generated in the previous step were averaged across simulation repetitions. The activity of each layer was thus summarized with at most 96 patterns of activity, which were converted into an RDM by taking one minus the Pearson correlation between each pair of patterns. Patterns in which all units were 0 were ignored, since the correlation is undefined for constant vectors.
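
The averaging and zero-pattern filtering described above can be sketched as follows, with hypothetical layer sizes and pattern counts:

```python
import numpy as np

# Hypothetical shapes: 10 simulation repetitions, 6 time-period patterns,
# 20 units in a model layer (stand-in values, not actual GOLSA output).
rng = np.random.default_rng(1)
n_reps, n_patterns, n_units = 10, 6, 20
sims = rng.random((n_reps, n_patterns, n_units))
sims[:, 2, :] = 0.0                      # one pattern is silent in every repetition

mean_patterns = sims.mean(axis=0)        # average across the 10 repetitions
active = ~np.all(mean_patterns == 0, axis=1)
kept = mean_patterns[active]             # drop all-zero patterns: correlation
                                         # is undefined for constant vectors
rdm = 1.0 - np.corrcoef(kept)            # RDM over the remaining patterns
```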

We looked for neural regions corresponding to the layers that play a critical role in the model during the acting phase of the typical learning oscillation, since in these simulations the learning phase of the oscillation was disabled. We created RDMs from the following layers: current-state, adjacent-states, goal, goal-gradient, next-desired-state, desired-transition, action-out, simulated-state, and queue-store. As a control, we also added a layer component that generated normally distributed noise (μ = 1, σ = 1).

RSA searchlight

The searchlight analysis was conducted using the Representational Similarity Analysis Toolbox, developed at the University of Cambridge ( http://www.mrc-cbu.cam.ac.uk/methods-and-resources/toolboxes/license/ ). For each layer RDM, a searchlight of radius 10 mm was moved through the entire brain. At each voxel, an RDM was created from the patterns in the spherical region centered on that voxel.

An r value was obtained for each voxel by computing the Spearman correlation between the searchlight RDM and the model layer RDM, ignoring trial time periods in which all model units showed no activity. A full pass of the searchlight over the brain produced a whole-brain r map for each subject for each layer. Voxels in regions that perform a function similar to the model component will produce RDMs similar to the model component RDM and thus will be assigned relatively high values. The r maps were then Fisher-transformed into z maps (z = (1/2) ln((1 + r)/(1 − r))). The z maps were normalized to the MNI template but were not smoothed, as the searchlight method already introduces substantial smoothing. Second-level effects were assessed with a t test on the normalized z maps, with a cluster-defining threshold of p < 0.001, cluster corrected to p < 0.05 overall. Cluster significance was determined by SPM5 and verified for clusters ≥ 24 voxels with a version of 3DClustSim (compile date Jan. 11, 2017) that corrects for the alpha inflation found in previous 3DClustSim versions [84]. The complete results are shown in Table 1.
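
A single searchlight-to-model comparison reduces to a Spearman correlation between the off-diagonal entries of the two RDMs, followed by the Fisher transform z = (1/2) ln((1 + r)/(1 − r)). A sketch with random stand-in RDMs (sizes and noise level are illustrative only):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def rdm_similarity(neural_rdm, model_rdm):
    """Spearman-correlate the upper triangles of two RDMs, then Fisher-transform."""
    iu = np.triu_indices_from(neural_rdm, k=1)       # off-diagonal upper triangle
    r, _ = stats.spearmanr(neural_rdm[iu], model_rdm[iu])
    z = 0.5 * np.log((1 + r) / (1 - r))              # Fisher z-transform
    return r, z

# Stand-in "neural" RDM from random patterns, and a noisy copy as the "model" RDM.
patterns = rng.standard_normal((20, 30))
neural = 1 - np.corrcoef(patterns)
model = neural + 0.1 * rng.standard_normal(neural.shape)
model = (model + model.T) / 2                        # keep the RDM symmetric
r, z = rdm_similarity(neural, model)                 # high r: model mimics neural
```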

Supplementary Information

Acknowledgements.

We thank A. Ramamoorthy for helpful discussions and J. Fine and W. Alexander for helpful comments on the manuscript. Supported by the Indiana University Imaging Research Facility. JWB was supported by NIH R21 DA040773.

Author contributions

J.W.B. and N.Z. designed the model and experiment. N.Z. implemented and simulated the model, implemented and ran the fMRI experiment, and analyzed the data. J.W.B. and N.Z. wrote the paper.

Data availability

Competing interests.

The authors declare no competing interests.

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The online version contains supplementary material available at 10.1038/s41598-023-28834-3.

Spatial Reasoning: A Critical Problem-Solving Tool in Children’s Mathematics Strategy Tool-Kit

  • First Online: 08 December 2018


  • Beth M. Casey
  • Harriet Fell

Part of the book series: Research in Mathematics Education ((RME))


This chapter reviews the spatial literature from the perspective of potential mechanisms for widening the range of spatially based strategies available when solving math problems. We propose that teaching generalized spatial skills, disconnected from specific math content, may not be the best direction for future spatial interventions. Students who do not start out with strong spatial skills may need to learn to develop different types of “spatial sense” specific to each content area. Thus, acquiring and applying spatial strategies may depend in part on developing spatial sense within these specific math domains. In this chapter, we present an overview of evidence for different types of spatial sense that may serve as prerequisites for effectively applying spatial strategies within these math content areas. The chapter also provides examples of math activities designed to help children acquire spatial sense and apply spatial strategies when solving diverse types of math problems.

This material is based on work supported by the National Science Foundation under NSF #HRD-1231623.



Author information

Authors and affiliations.

Lynch School of Education, Boston College, Boston, MA, USA

Beth M. Casey

College of Computer and Information Science, Northeastern University, Boston, MA, USA

Harriet Fell


Corresponding author

Correspondence to Beth M. Casey .

Editor information

Editors and affiliations.

Department of Human Development and Quantitative Methodology, University of Maryland, College Park, MD, USA

Kelly S. Mix

Department of Teaching and Learning, The Ohio State University, Columbus, OH, USA

Michael T. Battista

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Casey, B. M., & Fell, H. (2018). Spatial reasoning: A critical problem-solving tool in children’s mathematics strategy tool-kit. In K. S. Mix & M. T. Battista (Eds.), Visualizing mathematics (Research in Mathematics Education). Springer, Cham. https://doi.org/10.1007/978-3-319-98767-5_3

Published: 08 December 2018 | Print ISBN: 978-3-319-98766-8 | Online ISBN: 978-3-319-98767-5



August 1, 2024

Purdue researchers adopt interdisciplinary approach to assessing emerging tar spot disease in corn


Mariela Fernández-Campos, a graduate student in Purdue University’s Department of Botany and Plant Pathology, examines leaves for tar spot in a field at Pinney Purdue Agricultural Center. (Purdue Agricultural Communications photo/Joshua Clark)

Mix of agricultural and engineering methods deployed for monitoring and assessment

WEST LAFAYETTE, Ind. — For more than a century, tar spot fungal disease in corn kept its distance from the U.S., biding its time. 

“Tar spot had not been detected in the U.S. prior to 2015, but it has been endemic in several Latin American countries, starting in Mexico in 1904,” said C.D. Cruz, associate professor of botany and plant pathology at Purdue University.

Tar spot afflicted two of Indiana’s 92 counties in 2015. By 2022, only a few had managed to remain unscathed. At that time, the fungus had extended to 16 states and Ontario, Canada, according to the Pest Information Platform for Extension and Education.

The fungus had also affected crops in 15 other countries in Central and South America, the Caribbean, and the U.S. territories of Puerto Rico and the Virgin Islands, Cruz and co-authors reported in an article published in the journal Plant Disease.

Where did the disease come from, via what pathway, and how different is the pathogen in other countries compared to the U.S.? “Maybe we don’t have those populations here yet, and they could potentially be much more damaging than the ones currently present,” Cruz noted.

Cruz and his associates apply a mix of statistics, data science, epidemiology, microbiology, artificial intelligence, computer vision and continual stakeholder feedback to better understand the dynamics of tar spot in corn. Their work is funded by two grants totaling nearly $1.1 million from the USDA National Institute of Food and Agriculture.

Deploying these methods of high-tech data collection and analysis could make a vital mass of corn pathology information more widely and rapidly available. “There are opportunities in integrating AI-based digital technologies and point-of-care diagnostics that can provide rapid and accurate information for stakeholders,” Cruz said.

Colleagues Joaquín Ramirez in Colombia; Carlos Góngora in Mexico; César Falconí in Ecuador; Da-Young Lee in South Korea; and Paul Esker, Steve Goodwin, and Matt Helm in the U.S. are participating collaborators. The need to address biosecurity challenges with a science-based approach lends urgency to their work.

In previous efforts, Cruz and Lee, who was a postdoctoral researcher in Cruz’s lab, developed a digital method for quantifying tar spot in corn. They tested their stromata contouring detection algorithm (SCDA) under field conditions against human raters; it successfully quantified disease trends, and their results appeared in a 2021 paper published in Frontiers in Plant Science.

Earlier this year, Cruz’s team conducted a similar experiment with SCDA 2.0, again with promising results, but some inaccuracies persist in fully automated scenarios. “We’re refining our training methods to significantly improve the model’s accuracy and performance,” he said.
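The SCDA pipeline itself is not reproduced here, but the severity metric it feeds, the fraction of leaf pixels occupied by dark stromata, can be sketched with plain thresholding. Everything below (function name, array shapes, threshold values) is illustrative, not taken from Cruz and Lee's implementation, which contours stromata rather than thresholding pixels:

```python
import numpy as np

def severity_percent(gray, leaf_thresh=250, stroma_thresh=60):
    """Estimate tar spot severity as the share of leaf pixels that are
    dark stromata. `gray` is a 2-D uint8 image (0 = black, 255 = white).
    Thresholds are made-up values for this sketch."""
    leaf = gray < leaf_thresh          # anything darker than the white background
    stromata = gray < stroma_thresh    # very dark spots on the leaf
    if leaf.sum() == 0:
        return 0.0
    return 100.0 * stromata.sum() / leaf.sum()

# Tiny synthetic "leaf": a 10x10 patch of leaf tissue (value 120)
# with 5 dark stromata pixels (value 20)
img = np.full((10, 10), 120, dtype=np.uint8)
img[2, 2] = img[3, 5] = img[5, 7] = img[7, 1] = img[8, 8] = 20
print(round(severity_percent(img), 1))  # 5 stromata pixels / 100 leaf pixels -> 5.0
```

A real pipeline would segment the leaf and contour individual stromata before counting, which is where the model accuracy the article discusses comes in.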

Another research effort involves the laboratory of Purdue’s Mohammad Jahanshahi, associate professor in the Lyles School of Civil and Construction Engineering and the Elmore Family School of Electrical and Computer Engineering. Jahanshahi and Cruz, together with Mariela Fernández-Campos, a PhD student in botany and plant pathology, and Yu-Ting Huang, a postdoctoral research associate in civil engineering, published the results of their first collaboration in 2021. That work, a blend of agriculture and engineering, focused on another high-profile fungal disease, wheat blast; the model they developed detected and classified images of disease symptoms within two seconds.

Together they are educating the next generation of problem solvers, Jahanshahi said. “Whether you’re working on civil infrastructure health monitoring or agricultural disease epidemiology, you can have a profound impact not only on humans but also on the environment.”

Previously, Jahanshahi had used robotic systems to monitor civil infrastructure such as roads, bridges, buildings, sewer pipelines and nuclear reactors for damage. But when he began working with Cruz, he realized that structural damage corresponds to disease in wheat and other crops.

Even so, Jahanshahi noted, “You cannot just copy this method and apply it to another domain.” He now has adapted his AI and computer-vision approaches to inspecting and analyzing civil infrastructure damage for agriculture.

Currently, Fernández-Campos and Abhishek Subedi, a PhD student in civil engineering, are finalizing a proof-of-concept data fusion project. They leveraged data collected from previous collaborations with Darcy Telenko, an associate professor of botany and plant pathology.

“In the current digital agricultural revolution, we aim to understand tar spot by using various data types. Unlike traditional models that rely solely on weather information, we are integrating drone imagery and weather data to capture crucial details about the tar spot pathosystem,” Fernández-Campos said.

Jahanshahi and Subedi used AI approaches to fuse two very different data types: vegetation measures derived from drone imagery and numerical weather records. Meanwhile, Fernández-Campos analyzed the same data with traditional statistical methods for plant pathology.

“We tested our method’s performance against the disease evaluations of human raters,” Subedi said, and they were able to correlate the appearance of the disease and its progress. “We were able to build a strong enough model to detect the tar spot using images taken from a 50-meter height and on-site weather information.”
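Feature-level fusion of the two modalities can be pictured as standardizing each source and concatenating the columns before any model is fit. The sketch below uses made-up plot-level numbers, and a tiny hand-rolled logistic regression stands in for the team's AI models:

```python
import numpy as np

# Hypothetical per-plot features from two modalities
ndvi = np.array([[0.82], [0.61], [0.45], [0.78]])   # drone imagery: vegetation index
weather = np.array([[18.2, 0.91],                   # [mean temp C, mean rel. humidity]
                    [21.5, 0.97],
                    [22.1, 0.99],
                    [17.9, 0.88]])
y = np.array([0, 1, 1, 0])                          # 1 = tar spot observed by raters

def zscore(m):
    """Standardize each column so neither modality dominates by scale."""
    return (m - m.mean(axis=0)) / m.std(axis=0)

# "Fusion" at the feature level: standardize, then concatenate columns
X = np.hstack([zscore(ndvi), zscore(weather)])      # shape (4, 3)

# Stand-in model: logistic regression trained by a few gradient steps
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * X.T @ (p - y) / len(y)
    b -= 0.5 * (p - y).mean()

pred = (1 / (1 + np.exp(-(X @ w + b))) > 0.5).astype(int)
print(pred.tolist())
```

On this toy, linearly separable data the fitted model recovers the rater labels; the point is only the shape of the fused design matrix, not the particular classifier.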

Another parallel task is to quantify and model tar spot epidemics in production-style fields. Graduate student Brenden Lane from Cruz’s lab and Sungchan Oh, a computational infrastructure specialist in the Institute for Plant Sciences, use disease observations and drone-based imagery to characterize tar spot epidemiological dynamics at various scales. Ultimately, the team aims to quantify and model tar spot severity by referencing and integrating imagery with spatial and temporal data.
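Epidemic dynamics of foliar diseases like tar spot are classically summarized with a logistic disease-progress curve, y(t) = 1 / (1 + ((1 - y0)/y0) e^(-rt)). The minimal simulation below is a textbook sketch, not the Purdue team's model, and the parameters y0 and r are made-up values:

```python
import math

def logistic_progress(t, y0=0.001, r=0.15):
    """Classic logistic disease-progress curve:
    y(t) = 1 / (1 + ((1 - y0) / y0) * exp(-r * t)),
    where y0 is initial disease incidence (proportion) and
    r is the apparent infection rate per day (both hypothetical here)."""
    return 1.0 / (1.0 + ((1.0 - y0) / y0) * math.exp(-r * t))

# Simulated severity over a 90-day season
for day in (0, 30, 60, 90):
    print(day, round(logistic_progress(day), 3))
```

Fitting r from observed severities at several dates is one standard way to compare epidemic speed across fields and seasons.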

The blurred disciplinary boundaries of such a project can place greater time demands on the researchers as they learn each other’s specialized languages. “But because each person brings their own expertise to the table, it can lead to better problem solving,” Jahanshahi said.

About Purdue University

Purdue University is a public research institution demonstrating excellence at scale. Ranked among the top 10 public universities and with two colleges in the top four in the United States, Purdue discovers and disseminates knowledge with a quality and at a scale second to none. More than 105,000 students study at Purdue across modalities and locations, including nearly 50,000 in person on the West Lafayette campus. Committed to affordability and accessibility, Purdue’s main campus has frozen tuition 13 years in a row. See how Purdue never stops in the persistent pursuit of the next giant leap — including its first comprehensive urban campus in Indianapolis, the Mitchell E. Daniels, Jr. School of Business, Purdue Computes and the One Health initiative — at https://www.purdue.edu/president/strategic-initiatives.

Writer: Steve Koppes

Media contact: Devyn Raver, [email protected]

Sources: C.D. Cruz, [email protected] ; Mohammad Jahanshahi, [email protected] ; Mariela Fernández-Campos, [email protected] ; and Abhishek Subedi, [email protected]

Agricultural Communications: 765-494-8415;

Maureen Manier, Department Head, [email protected]


Note to journalists: Publication-quality photos are available in Google Drive.



COMMENTS

  1. Spatial problem solving approach—Related Concepts

    Many problems in the world today can be solved using the spatial problem solving approach. 1. Ask and explore. Set the goals for your analysis. Begin with a well-framed question based on your understanding of the problem. Getting the question right is key to deriving meaningful results. Questions that can be answered using ...

  2. Spatial Problem Solving: A Conceptual Framework

    Spatial Problem Solving: A Conceptual Framework. Many types of problems and scenarios can be addressed by applying the spatial problem solving approach using ArcGIS. You can follow the five steps in this approach to create useful analytical models and use them in concert with spatial data exploration to address a whole array of problems and ...

  3. Geographic Approach

    The geographic approach is a way of thinking and problem-solving that integrates and organizes all relevant information in the crucial context of location. Leaders use this approach to reveal patterns and trends; model scenarios and solutions; and ultimately, make sound, strategic decisions.

  4. (PDF) Spatial Problem Solving in Spatial Structures

    By including the spatial problem domain as part of the system, such approaches can (1) maintain some of the spatial relations in their original form; (2) simulate spatial relations; and ...

  5. PDF Improving Critical Thinking Skills of Geography Students with Spatial

    SPBL, which combines problem-solving characteristics with a spatial approach, is well suited to developing 21st-century skills: (1) students work in teams (Sun et al., 2018); (2) scientific observations and activities allow students to identify and formulate spatial problems (Darmaji et al., 2019); (3) students think ...

  6. Spatial Problem Solving in Spatial Structures

    Spatial problem solving has been a fundamental research topic in AI from the very beginning. Initially, spatial relations were treated like other features: task-relevant aspects of the domain were formalized and represented in some kind of data structure; general computation and reasoning methods were applied; and the result of the computation was interpreted in terms of the target domain.

  7. A conceptual model for solving spatial problems

    There are many possible applications for this approach to problem solving. The following topic provides an example where the conceptual model was used to solve a siting problem: use the conceptual model to create a suitability map. Reference: Malczewski, J., GIS and Multicriteria Decision Analysis, 1999, Wiley & Sons.

  8. Spatial Problem Solving by Chris Mickle

    Course Reflection: The GIS520 Spatial Problem Solving course has provided a fresh perspective and approach for spatial problem solving. The geospatial courses offered by the Center for Geospatial Analytics at NCSU provide instruction on the tools and techniques needed to use GIS. This course has also provided a framework for properly ...

  9. Foundations of human spatial problem solving

    Problem solving then requires 1) defining the goal state; 2) planning a sequence of state transitions to move the current state toward the goal; and 3) generating actions aimed at implementing the ...

  10. Spatial Problem Solving in Spatial Structures

    We look at the roles of spatial configurations and of cognitive agents in the process of spatial problem solving from a cognitive architecture perspective. In particular, we discuss (a) the role of the structures of space and time and (b) the role of conceptualizations ...

  11. PDF Enhancing spatial skills through mechanical problem solving

    findings suggest that mechanical problem solving is a potentially viable approach to enhancing spatial thinking. For a while I gave myself up entirely to the intense enjoyment of picturing machines and devising new forms. . . . The pieces of apparatus I conceived were to me absolutely real and tangible in every detail, even to the

  12. Probing the Relationship Between Process of Spatial Problems Solving

    There were two purposes in the study. One was to explore the cognitive activities during spatial problem solving and the other to probe the relationship between spatial ability and science concept learning. Twenty university students participated in the study. The Purdue Visualization of Rotations Test (PVRT) was used to assess spatial ability; its items were divided into different types ...

  13. Bridging the gap: blending spatial skills instruction into a ...

    Strong spatial skills are not only important to visualization but they also support 'thinking' when problem-solving (Duffy et al., 2020). Furthermore, strong spatial skills help to increase the capacity of working memory and reduce cognitive load during graphical problem-solving tasks (Buckley et al., 2019; Delahunty et al., 2020).

  14. Situating space: using a discipline-focused lens to examine spatial

    The expert's knowledge of the domain influences his approach to solving the spatial problem (Chase & Simon, 1973; Hegarty et al., 2007; Shipley & Tikoff, 2016). In a parallel example to the role of STEM expertise on spatial thinking, the relation between context, prior knowledge, and problem-solving approach has been heavily investigated by ...

  15. Why spatial is special in education, learning, and everyday activities

    Spatial thinking is a broader topic than spatial ability, however (Hegarty 2010). We use symbolic spatial tools, such as graphs, maps, and diagrams, in both educational and everyday contexts. These tools significantly enhance human reasoning; for example, graphs are a powerful tool to show the relationship among a set of variables in two (or higher) dimensions.

  16. Applied geography: A problem-solving approach

    Applied Geography is a journal devoted to the publication of research which utilizes geographic approaches (human, physical, nature-society and GIScience) to resolve human problems that have a spatial dimension. These problems may be related to the assessment, management and allocation of the world's physical and/or human resources. The underlying rationale of the journal is that only through a ...

  17. Spatial Thinking Is Everywhere, And It's Changing Everything

    Spatial thinking or the geographic approach is a decision-making tool. A planning tool. A tool of data analysis, data modeling, and data presentation. It's a collaboration tool, a communication ...

  18. Enhancing spatial skills through mechanical problem solving

    Effects of mechanical problem solving training on spatial skills were evaluated. • Engaging in mechanical problem solving enhanced spatial visualization performance. • Gains in spatial skills remained stable across immediate and delayed post-tests. • Mechanical problem solving could be a viable approach to enhance spatial thinking.

  19. Spatial autocorrelation informed approaches to solving location

    To begin, the MAJORITY THEOREM (MT): for an n-destination, p = 1 source location-allocation (i.e., p-median) problem in continuous space, with n > 1 and Euclidean distance as the metric, if a single weight w_k > (1/2) ∑_{i=1}^{n} w_i, then the demand point (u_k, v_k) is the optimal location (i.e., spatial median) solution.

  20. Investigating the use of spatial reasoning strategies in geometric

    A core aim of contemporary science, technology, engineering, and mathematics (STEM) education is the development of robust problem-solving skills. This can be achieved by fostering both discipline knowledge expertise and general cognitive abilities associated with problem solving. One of the most important cognitive abilities in STEM education is spatial ability; however, understandings of how ...

  21. Spatial Ability: Understanding the Past, Looking into the Future

    Spatial intelligence is defined by Gardner (1983) as the ability to form a mental model of the spatial world and to maneuver and work with this model. Spatial intelligence helps the individual to perceive, decode and activate in the imagination visual representations that create the image of space. Researchers in this field consider ...

  22. Foundations of human spatial problem solving

    This would be consistent with evidence of serial search during planning 63, 64 and would afford a new approach to artificial general intelligence that is both powerful and similar to human brain function. Another limitation is that the Treasure Hunt task is essentially a spatial problem solving task.

  23. Spatial Reasoning: A Critical Problem-Solving Tool in Children's

    When examining pathways between spatial skills and word problems in sixth graders, Boonen and associates (Boonen et al., 2013) found that 21% of the association between spatial skills and word problem solving was explained through the indirect effects of strategies involving visual-schematic representations. Thus, spatial skills can be ...

  24. Solving Short-Term Relocalization Problems In Monocular Keyframe Visual

    The proposed approach introduces a novel multimodal keyframe descriptor consisting of semantics and spatial data from monocular images. This is integrated into a KPR method that chooses appropriate keyframe candidates by passing them through a multi-stage filtering algorithm. The novel KPR is coupled with an off-the-shelf 3D-2D pose estimation ...

  25. Purdue researchers adopt interdisciplinary approach to assessing

    The need to address biosecurity challenges with a science-based approach lend urgency to their work. In previous efforts, Cruz and Lee, who was a postdoctoral researcher in Cruz's lab, developed a digital method for quantifying tar spot in corn. ... the team aims to quantify and model tar spot severity by referencing and integrating imagery ...
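The majority theorem quoted in item 19 states that if a single weight w_k exceeds half the total weight, ∑ᵢ wᵢ / 2, then the demand point (u_k, v_k) is itself the optimal 1-median location. This can be checked numerically on a toy instance; the points and weights below are made up, and the grid search is only a spot-check, not a solver:

```python
import math

# Toy 1-median instance in the plane: (x, y, weight)
pts = [(0.0, 0.0, 6.0), (4.0, 0.0, 1.0), (0.0, 3.0, 2.0), (5.0, 5.0, 2.0)]
total_w = sum(w for _, _, w in pts)
dominant = max(pts, key=lambda p: p[2])
assert dominant[2] > total_w / 2          # majority condition holds: 6 > 11/2

def cost(x, y):
    """Weighted sum of Euclidean distances from (x, y) to all demand points."""
    return sum(w * math.hypot(x - px, y - py) for px, py, w in pts)

# The theorem predicts the dominant point itself is optimal; spot-check it
# against a half-unit grid of alternative locations covering all the points.
best = min(((cost(x / 2, y / 2), (x / 2, y / 2))
            for x in range(-4, 13) for y in range(-4, 13)),
           key=lambda c: c[0])
print(best[1], round(cost(dominant[0], dominant[1]), 2))
```

Intuitively, the dominant point pulls with weight 6 while all the others together pull with at most 5, so no move away from it can reduce the weighted distance sum.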