Engineering | ASU Active Perception Group (APG) / Yezhou Yang

ASU Active Perception Group Seminar Series

Key Facts

Seminar Date: Biweekly, Fridays at 9am
Workshop Location: Brickyard 5th floor conference room
Organizers: Yezhou Yang (open for volunteers)
Audiences: Everyone is welcome to sit in on the presentation part

April 20th 2018, Shuai Li and Maverick Chung

Congratulations to Maverick on concluding his senior project with ASU APG.

April 13th: Xin Ye - Robot with vision that finds objects

 

April 06th: Venka and Diptanchu - Partially observable decision making

March 30th, Houpu Yao, Recognition by Imagination

 

March 23rd, Kong Shu from UCI

 

The topic is “scene parsing through per-pixel labeling: a better and faster way”, which partially combines the following two write-ups (with splash figures :-)):
Recurrent Scene Parsing with Perspective Understanding in the Loop, CVPR 2018.
Pixel-wise Attentional Gating for Parsimonious Pixel Labeling, submitted to ECCV2018.

Objects may appear at arbitrary scales in perspective images of a scene, posing a challenge for recognition systems that process images at a fixed resolution. We propose a depth-aware gating module that adaptively selects the pooling field size (by fusing multi-scale pooled features) in a convolutional network architecture according to the object scale (inversely proportional to the depth) so that small details are preserved for distant objects while larger receptive fields are used for those nearby. The depth gating signal is provided by stereo disparity or estimated directly from monocular input. We further integrate this depth-aware gating into a recurrent convolutional neural network to perform semantic segmentation. Our recurrent module iteratively refines the segmentation results, leveraging the depth and semantic predictions from the previous iterations.
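As a rough illustration of the depth-aware gating idea in the abstract above, the following is a minimal sketch, not the authors' implementation. It pools a single-channel feature map at several window sizes and fuses the results per pixel with soft weights that favor small windows for distant pixels and large windows for nearby ones. The function names, the `scale / depth` rule for the preferred window, and the `sharpness` parameter are all assumptions made for this example; in the paper the gating is part of a trained network.

```python
import math

def avg_pool(feat, size):
    """Average-pool a 2D list with a square window (odd size, stride 1), clipped at borders."""
    h, w = len(feat), len(feat[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [feat[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out

def depth_gate(depth, sizes, scale=4.0, sharpness=2.0):
    """Soft weights over pooling sizes (hypothetical rule: preferred size = scale / depth)."""
    preferred = scale / depth
    logits = [-sharpness * abs(s - preferred) for s in sizes]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]

def depth_aware_pool(feat, depth, sizes=(1, 3, 5)):
    """Fuse multi-scale pooled features per pixel using depth-derived gate weights."""
    pooled = [avg_pool(feat, s) for s in sizes]
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gate = depth_gate(depth[y][x], sizes)
            out[y][x] = sum(g * p[y][x] for g, p in zip(gate, pooled))
    return out
```

With these assumed defaults, a pixel at depth 4.0 puts most of its weight on the 1x1 window, while a pixel at depth 0.8 puts most of its weight on the 5x5 window.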

Moreover, rather than fusing multi-scale pooled features based on estimated depth, we show the “correct” size of pooling field for each pixel can be learned in an attentional fashion by our Pixel-wise Attentional Gating unit (PAG) that learns to selectively process a subset of spatial locations at each layer of a deep convolutional network. PAG enables us to achieve parsimonious inference in per-pixel labeling tasks with a limited computational budget. PAG is a generic, architecture-independent, problem-agnostic mechanism that can be readily “plugged in” to an existing model with fine-tuning. We utilize PAG in two ways: 1) learning spatially varying pooling fields that improve model performance without the extra computation cost associated with multi-scale pooling, and 2) learning a dynamic computation policy for each pixel to decrease total computation while maintaining accuracy. We extensively evaluate PAG on a variety of per-pixel labeling tasks, including semantic segmentation, boundary detection, monocular depth and surface normal estimation. We demonstrate that PAG allows competitive or state-of-the-art performance on these tasks. Our experiments show that PAG learns dynamic spatial allocation of computation over the input image, which provides better performance trade-offs compared to related approaches (e.g., truncating deep models or dynamically skipping whole layers). Generally, we observe PAG can reduce computation by 10% without noticeable loss in accuracy, and performance degrades gracefully when imposing stronger computational constraints.
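The second use of PAG described above, spending computation on only a subset of pixels at a layer, can be sketched roughly as follows. This is an illustration under stated assumptions rather than the method from the paper: the gate scores are given as input instead of being learned, and a simple top-fraction threshold stands in for whatever selection mechanism the authors train. The names `select_pixels`, `gated_residual_layer`, and `budget` are hypothetical.

```python
def select_pixels(scores, budget):
    """Binary mask keeping roughly the top `budget` fraction of locations by score.

    Ties at the threshold are all kept, so slightly more than the budget may be selected.
    """
    flat = sorted((s for row in scores for s in row), reverse=True)
    k = max(1, int(round(budget * len(flat))))
    threshold = flat[k - 1]
    return [[1 if s >= threshold else 0 for s in row] for row in scores]

def gated_residual_layer(feat, scores, transform, budget=0.5):
    """Apply `transform` as a residual update only at selected pixels.

    Unselected pixels pass through unchanged, so no computation is spent on them.
    Returns the output map and the number of pixels where `transform` was evaluated.
    """
    mask = select_pixels(scores, budget)
    evaluated = 0
    out = []
    for feat_row, mask_row in zip(feat, mask):
        out_row = []
        for value, keep in zip(feat_row, mask_row):
            if keep:
                out_row.append(value + transform(value))
                evaluated += 1
            else:
                out_row.append(value)
        out.append(out_row)
    return out, evaluated
```

Lowering `budget` reduces the number of `transform` evaluations, which is the kind of accuracy-versus-computation trade-off the abstract refers to.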

March 09 2018: Kevin Luck, Differentiable Neural Computer

 

Feb 17th, Varun: Effects of AR/VR in HRI

 

Feb 2nd, Kausic: Hallucination

 

Jan 19th, Mohammad: AHCNN

Jan 12, 2018: Duo Lu, on Visual Recognition and Security

 

Dec 1st, 2017, Jacob Zhiyuan Fang on Capsule Networks and ASU APG end-of-semester gathering.

 

Oct 19th and Nov 3rd Fall research expo
Oct 5th, Mo Izady, Key Evidences Localization in Medical Images
September 21st 2017, Shuai Li: "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"
September 8th, Zhiyuan (Jacob) Fang

April 24th, 2017, Aman Verma: Structure from Motion and its CNN modeling

 

Aug 18 2017 Summer Research Expo

 

External Speakers: Kowshik Thopalli and Perikumar Mukundbhai Javia

Topic: Visual Question Answering and its Adversarial Modeling

ASU APG Memory of 2016-2017

April 24th, 2017, Ramu Ponneganti: Rationalizing Neural Predictions

April 3rd, Xin Ye, Hand Movement Prediction from Vision for Human Robot Interaction

Feb 27th 2017, Khimya Khetarpal, Learning Visual Representations

Feb 20th 2017, Rudra Saha, InfoVAE

Feb 1st, Mo Izady, Deep learning in medical image processing

Jan 23rd 2017, Divyanshu Bandil, Visual Question Categorization

Jan 9th 2017, Mohammad Farhadi, Meta-modeling for deep learning

Nov 21st 2016 Stephen McAleer, Generative Adversarial Networks (GAN)

 

Slides from Stephen:

generative-adversarial-networks-v2

Nov 7th 2016, Ramu Ponneganti: Event Recounting for Video processing
Nov 7th 2016, Yantian Zha: Differentiable Neural Computer