eScholarship
Open Access Publications from the University of California

UC Irvine Electronic Theses and Dissertations

Brain Inspired Neural Network Models of Visual Motion Perception and Tracking in Dynamic Scenes

Abstract

For self-driving vehicles, aerial drones, and autonomous robots to be successfully deployed in the real world, they must be able to navigate complex environments and track objects. While Artificial Intelligence and Machine Vision have made significant progress in dynamic scene understanding, they are not yet as robust or as computationally efficient as humans and other primates at these tasks. For example, current state-of-the-art visual tracking methods become inaccurate when applied to arbitrary test videos. We suggest that ideas from cortical visual processing can inspire robust and efficient real-world solutions for motion perception and tracking. In this context, this dissertation makes the following contributions. First, a method for estimating 6DoF ego-motion and pixel-wise object motion is introduced, based on a learned overcomplete basis set of motion fields. The method uses motion field constraints for training and a novel differentiable sparsity regularizer to achieve state-of-the-art ego-motion and object-motion performance on benchmark datasets. Second, a Convolutional Neural Network (CNN) is presented that learns hidden representations analogous to the response characteristics of neurons in the dorsal Medial Superior Temporal area (MSTd) for optic flow and object motion. The findings suggest that goal-driven training of CNNs may automatically give rise to MSTd-like response properties in model neurons. Third, a recurrent neural network model of predictive smooth pursuit eye movements is presented that reproduces the pursuit initiation and predictive pursuit behaviors observed in humans. The model provides computational mechanisms for the formation and rapid updating of an internal model of target velocity, which is commonly invoked to explain zero-lag tracking and smooth pursuit of occluded targets.
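The first contribution combines an overcomplete motion-field basis with a differentiable sparsity penalty on the basis coefficients. The abstract does not give the exact form of the regularizer or the learned basis, so the following is a minimal illustrative sketch: a toy overcomplete basis of flow fields, a linear reconstruction, and a smooth log-penalty standing in as a generic differentiable sparsity surrogate. All names and parameter values here are hypothetical.

```python
import numpy as np

def reconstruct(coeffs, basis):
    """Reconstruct a dense motion field as a linear combination of
    overcomplete basis flow fields (basis shape: K x H x W x 2)."""
    return np.tensordot(coeffs, basis, axes=1)

def smooth_sparsity(coeffs, eps=1e-3):
    """Smooth, differentiable surrogate for coefficient sparsity:
    sum(log(1 + c^2 / eps)). This is an illustrative stand-in, not
    the dissertation's actual regularizer."""
    return float(np.sum(np.log1p(coeffs ** 2 / eps)))

# Toy example: 4 basis flow fields over a 2x2 pixel grid
rng = np.random.default_rng(0)
basis = rng.standard_normal((4, 2, 2, 2))
coeffs = np.array([1.0, 0.0, 0.0, 0.5])  # sparse activation pattern
flow = reconstruct(coeffs, basis)        # dense flow, shape (2, 2, 2)
penalty = smooth_sparsity(coeffs)        # small for sparse coefficients
```

In a training loop such a penalty would be added to a reconstruction loss, encouraging each motion field to be explained by few active basis elements.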
Finally, a spike-based stereo depth algorithm is presented that reconstructs dynamic visual scenes at 400 frames per second with one watt of power consumption when implemented on the IBM TrueNorth processor. Taken together, the presented models and implementations capture computations for motion perception in the dorsal visual pathway of the brain and inform the design of efficient computational vision systems.
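The pursuit model's core idea, an internal estimate of target velocity that is rapidly updated while the target is visible and then drives the eye during occlusion, can be sketched with a toy leaky-integrator update. This is not the dissertation's recurrent network; gains and time steps are arbitrary illustrative values.

```python
def pursuit(target_vel, visible, gain=0.5, dt=1.0):
    """Toy predictive-pursuit sketch: a leaky internal estimate of
    target velocity drives eye velocity. During occlusion
    (visible=False) the eye coasts at the stored estimate,
    illustrating predictive pursuit through a blank interval.
    Hypothetical parameters, not from the dissertation."""
    estimate = 0.0
    eye = []
    for v, seen in zip(target_vel, visible):
        if seen:
            # Update the internal model toward the velocity error
            estimate += gain * (v - estimate) * dt
        eye.append(estimate)
    return eye

# Target moves at 10 deg/s; it is occluded mid-trial for 6 steps
target = [10.0] * 20
vis = [True] * 8 + [False] * 6 + [True] * 6
trace = pursuit(target, vis)
```

During the occlusion window the eye velocity holds at the last internal estimate rather than dropping to zero, which is the qualitative signature the model is meant to capture.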
