SCALE-DEPENDENT PROCESSING OF CLUSTERED SENSORY SIGNALS

 

 

Ongoing Research Supported by the National Science Foundation

 

 

Narendra Ahuja, Department of Electrical and Computer Engineering, and

Artificial Intelligence Group of the Beckman Institute

 

Albert Feng, Department of Molecular and Integrative Physiology, and

NeuroTech Group of the Beckman Institute

 

Mark Nelson, Department of Molecular and Integrative Physiology, and

NeuroTech Group of the Beckman Institute

 

 

 

 

SUMMARY

 

 

      A key selective pressure driving the evolution of complex nervous systems is an animal’s need to detect, identify and localize resources in its environment. Reproductive success is tightly coupled to an animal’s ability to efficiently carry out resource-acquisition tasks such as finding food or a mate. In natural environments, resources often occur in clusters, both in space and time, such that resource-related signals are embedded in a cluttered background arising from similar nearby sources. For example, during the mating season, the auditory cues that guide a female frog to a particular male are embedded in a dense chorus arising from hundreds of calling males of different species. Many animals form clusters (flocks, swarms, schools, etc.) for a variety of reasons, including social interactions, migration, cooperative hunting, cooperative mate attraction, and minimizing predation risks (Alcock 2001; Bradbury and Vehrencamp 1998; Partridge 1983). Sensory systems that have evolved to function under these conditions are expected to exhibit sophisticated capabilities for processing clustered signals.

 

      Clustered signals also pose a significant challenge for intelligent neural prostheses and machine perception systems that operate under real-world conditions. For example, an intelligent hearing aid designed to enhance speech perception should exhibit robust performance in cluttered environments with other voices in the background. An intelligent video surveillance system should be able to automatically track the movements of one selected individual moving in a crowded environment, such as an airport terminal. These tasks are beyond the capabilities of current machine perception systems, but approaches for tackling such problems are beginning to emerge. The theoretical component of this ongoing research draws on recent developments in multiscale processing of video sequences in computer vision to establish a generalized framework for efficient and reliable processing of clustered sensory signals.

 

Fig. 1. Frog mating behavior as an example of a multiscale task. Successful performance requires information processing on multiple spatial and temporal scales (gray region). Signal characteristics and analysis tasks can be scale-dependent.

      The research in this project combines experimental and theoretical approaches to explore the neural mechanisms and computational algorithms used to detect, identify, and localize individual signals embedded in an ensemble of similar signals. We hypothesize that the sensory cues used to perform these tasks change as a function of distance from the ensemble. In some cases, the changes may be relatively subtle, such as a gradual shift in the spectral content of an auditory signal with distance. In other cases, the changes might be more marked, such as a shift in the relative importance of different modalities for classification of a target. Furthermore, we hypothesize that behavioral strategies and sensory filtering properties in the nervous system are adaptively adjusted in response to scale-dependent changes in the sensory stimulus. The multiscale nature of a representative biological task is illustrated graphically in Figure 1, using frog mating behavior as an example.
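The "gradual shift in the spectral content of an auditory signal with distance" can be illustrated with a toy propagation model. The sketch below is illustrative only and is not taken from the project: it assumes spherical spreading plus an excess attenuation that grows linearly with frequency, and every number in it (source level, the 1 kHz and 3 kHz call components, the attenuation coefficient) is hypothetical. Real habitat attenuation is more complex and is the kind of quantity a field study would measure.

```python
import math


def received_level_db(source_db, freq_khz, dist_m, ref_m=1.0,
                      excess_db_per_m_per_khz=0.02):
    """Received level of one frequency component at a given distance.

    Assumptions (hypothetical, for illustration only):
    - spherical spreading loss of 20*log10(dist/ref) dB, the same at all frequencies
    - excess attenuation that is linear in both frequency and distance
    """
    spreading = 20.0 * math.log10(dist_m / ref_m)
    excess = excess_db_per_m_per_khz * freq_khz * (dist_m - ref_m)
    return source_db - spreading - excess


def spectral_tilt_db(dist_m):
    """Level difference (low minus high) between two hypothetical call components."""
    low = received_level_db(90.0, 1.0, dist_m)   # 1 kHz component
    high = received_level_db(90.0, 3.0, dist_m)  # 3 kHz component
    return low - high


for d in (1.0, 10.0, 100.0):
    print(f"{d:6.1f} m: low-minus-high tilt = {spectral_tilt_db(d):.2f} dB")
```

Under these assumptions the spreading term cancels in the difference, so the tilt grows steadily with distance. A listener could in principle use such a tilt as a coarse range cue, which is one way a sensory cue can be scale-dependent.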

 

Choice of Biological Model Systems
 
Fig. 2. Overview of segmentation and grouping algorithms for video sequence analysis in computer vision. Algorithms that give priority to spatial information are shown in red (dashed) and those giving priority to temporal information are shown in blue (solid). From Megret and DeMenthon (2002).
      We use two well-established neuroethological model systems as the biological foundation for this project: phonotaxis in frogs and prey capture in weakly electric fish. These systems are representative of two important classes of clustered signals confronted by animals: communication signals and prey-related signals. In frog mating behavior, each male frog in a chorus generates a stereotyped communication signal with species-specific and call-type-specific harmonic structure, amplitude modulation, and frequency modulation patterns. In addition to frog choruses (Farris et al. 2002; Sun et al. 2000), other examples of clustered communication signals include bird song (Bradbury and Vehrencamp 1998), insect choruses (Gerhardt and Huber 2002), and the classic “cocktail party” scenario (Bregman 1990; Cherry 1953). A second important class of clustered sensory signals is associated with groups of prey. Examples include visual predation on schooling fish (Parrish 1992; Partridge 1983), electrosensory foraging on zooplankton swarms (Freund et al. 2002; Russell et al. 1999), bat echolocation of insects in a swarm (Moss and Surlykke 2001; Schnitzler and Kalko 2001), and dolphin echolocation of fish in a school (Au and Benoit-Bird 2003).

 

Theoretical Approach

      Theoretical analysis of scale-dependent processing of clustered signals draws on algorithms from computer vision for grouping and segmentation, target detection and tracking, active vision, texture analysis, and motion and structure estimation. For example, Figure 2 provides an overview of spatiotemporal grouping algorithms in computer vision for extracting targets in dynamic scenes (Megret and DeMenthon 2002). Algorithms can be characterized as giving priority to segmentation in the spatial domain (e.g., feature detection and segmentation of static images) or in the temporal domain (e.g., trajectory/motion analysis). For clustered sensory signals, the ability to separate one particular signal from the background depends on the degree of overlap and occlusion in the space-time domain, as well as along other feature dimensions. Algorithms that perform segmentation and grouping in the joint space-time domain (i.e., along the diagonal of Fig. 2) are expected to offer better source separation capabilities under clustered conditions than algorithms that operate predominantly along the spatial or temporal dimensions.
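The expected advantage of joint space-time grouping can be seen in a toy example. The sketch below is not one of the algorithms surveyed by Megret and DeMenthon (2002). It is a minimal single-linkage grouping over hypothetical (x, t) samples from two sources moving in parallel, with an assumed neighborhood radius of 1. The function name `count_groups` and the data are invented for illustration.

```python
from itertools import combinations


def count_groups(points, dims, radius=1.0):
    """Count single-linkage groups, comparing points only along the given dims.

    Two points are linked when they differ by at most `radius` along every
    selected dimension; groups are the connected components of those links.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if all(abs(points[i][d] - points[j][d]) <= radius for d in dims):
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Hypothetical samples (x, t): two sources moving in parallel, 3 units apart.
source_a = [(float(t), float(t)) for t in range(10)]
source_b = [(float(t) + 3.0, float(t)) for t in range(10)]
samples = source_a + source_b

spatial_only = count_groups(samples, dims=(0,))   # uses x only
temporal_only = count_groups(samples, dims=(1,))  # uses t only
joint = count_groups(samples, dims=(0, 1))        # uses x and t together

print(spatial_only, temporal_only, joint)
```

In this constructed case the two sources overlap completely when projected onto either axis alone, so spatial-only and temporal-only grouping each merge them into one group, while the joint grouping recovers two. Real signals would also overlap along other feature dimensions, which this sketch ignores.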

      This research consists of four main projects: (1) empirical characterization of clustered signals at different distances, (2) behavioral analysis of approach trajectories to clustered sources, (3) development of a theoretical framework and computational algorithms, and (4) investigation of neural correlates of scale-dependent sensory processing in the nervous system.

 

Project 1: Clustered signal characteristics at multiple distance scales

Understanding the neural basis of target detection, localization and classification requires detailed information about the signal and background characteristics across all behaviorally relevant distance scales. The aim of Project 1 is to systematically obtain this information for clustered sources at multiple scales for the two model systems under investigation:

1a. Sound characteristics of frog choruses at different distance scales (field study).

1b. Characteristics of the perturbations that prey swarms produce in a weakly electric fish’s self-generated electric field during active electrolocation.

 

Project 2: Characterizing approach trajectories over multiple scales

Important clues regarding neural algorithms for detection and localization can be obtained by analyzing both the long-range movement strategies used to approach a cluster and the short-range strategies used to approach a particular individual within the cluster. Project 2 analyzes approach trajectories of frogs and fish to clustered sources that were characterized in Project 1:

2a. Female frog approach trajectories to natural choruses (field study).

2b. Female frog approach trajectories to playback of naturalistic sounds (lab study).

2c. Weakly electric fish approach and strike response to Daphnia swarms.

 

Project 3: Multiscale algorithms for clustered-source analysis

The goal of this project is to develop a theoretical framework for detection, characterization and tracking of sources under clustered conditions. The framework will be generic so that it can be applied to a variety of different modalities (vision, hearing, electroreception, chemoreception) under different scenarios. We intend to use the framework to make experimentally testable predictions about the types of neural computations and adaptive scale-dependent changes that are likely to be found in the nervous system. The framework is also intended to be applied to real-world engineering problems, such as designing an intelligent hearing aid that can enhance the voice of a selected speaker in a crowd of many voices, or designing a video surveillance system that can identify and track anomalous movements in a crowded airport terminal. Our effort is targeted at:

3a. Development and application of a theoretical framework for clustered source analysis.

3b. Development of a high-resolution digital video system with automatic tracking.

 

Project 4: Neural correlates of clustered signal analysis

Having characterized naturalistic signals (Project 1) and behavioral patterns (Project 2), and having identified candidate computational algorithms (Project 3), we will turn our attention to exploring the underlying neural substrates. Historically, this approach has proven very effective. For example, it has helped establish how “matched filters” for different features of isolated calls are constructed along the frog’s central auditory pathway. In the studies proposed here, we will characterize the encoding and processing of clustered signals at different levels of the nervous system and look for evidence of scale-dependent adaptive filtering in the CNS. These projects represent only the initial steps in understanding the neural basis of clustered signal analysis. Over the course of the project, we expect additional neurophysiological studies to emerge from ongoing interactions between theory and experiment. Our effort will focus on:

4a. Frog neurophysiology – encoding and processing of chorus sounds.

4b. Electric fish neurophysiology – encoding and processing of swarm signals.

 

 

 

REFERENCES

 

Alcock J (2001) Animal Behavior: An Evolutionary Approach. Sunderland (MA): Sinauer Associates.

Au WWL, Benoit-Bird KJ (2003) Automatic gain control in the echolocation system of dolphins. Nature 423: 861-863.

Bradbury JW, Vehrencamp SL (1998) Principles of animal communication. Sunderland (MA): Sinauer Associates, Inc.

Bregman AS (1990) Auditory scene analysis. Cambridge (MA): MIT Press.

Cherry EC (1953) Some experiments on the recognition of speech with one and two ears. J. Acoust. Soc. Am. 25: 957-959.

Farris HE, Rand AS, Ryan MJ (2002) The effects of spatially separated call components on phonotaxis in Tungara frogs: Evidence for auditory grouping. Brain Behav. Evol. 60: 181-188.

Freund J, Schimansky-Geier L, Beisner B, Neiman A, Russell D, Yakusheva T, Moss F (2002) Behavioral stochastic resonance: How the noise from a Daphnia swarm enhances individual prey capture by juvenile paddlefish. J. Theor. Biol. 214: 71-83.

Gerhardt HC, Huber F (2002) Acoustic Communication in Insects and Anurans. Chicago: University of Chicago Press.

Megret R, DeMenthon D (2002) A survey of spatio-temporal grouping techniques. Research report CS-TR-4403, LAMP, University of Maryland, August 2002.

Moss CF, Surlykke A (2001) Auditory scene analysis by echolocation in bats. J. Acoust. Soc. Am. 110: 2207-2226.

Parrish JK (1992) Do predators shape fish schools – Interactions between predators and their schooling prey. Neth. J. Zool. 42 (2-3): 358-370.

Partridge BL (1983) The structure and function of fish schools. Scientific American, June 1983: 90-99.

Russell D, Wilkens L, Moss F (1999) Use of behavioral stochastic resonance by paddlefish for feeding. Nature 402: 219-223.

Schnitzler HU, Kalko EKV (2001) Echolocation by insect-eating bats. Bioscience 51: 557-569.

Sun L, Wilczynski W, Rand AS, Ryan MJ (2000) Trade-off in short- and long-distance communication in tungara (Physalaemus pustulosus) and cricket (Acris crepitans) frogs. Behav. Ecol. 11: 102-109.

 
