Carnegie Mellon University

Ankit Parag Shah

Publications

  • DCASE 2017 Challenge Setup: Tasks, Datasets and Baseline System
  • Repeatability and Scalability of Code for Top Level Verification
  • An Approach for Self-Training Audio Event Detectors Using Web Data
  • Learning Sound Events From Webly Labeled Data
  • Accident Forecasting in CCTV Traffic Camera Videos
  • Tartan: A retrieval-based socialbot powered by a dynamic finite-state machine architecture
  • Activity Recognition on a Large Scale in Short Videos-Moments in Time Dataset
  • Natural Language Person Search Using Deep Reinforcement Learning
  • Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recordings
  • Content-based Representations of audio using Siamese neural networks
  • Large-Scale Weakly Labeled Semi-Supervised Sound Event Detection in Domestic Environments
  • Framework for evaluation of sound event detection in web videos
  • Pipelined implementation of high radix adaptive CORDIC as a coprocessor
  • Automated Audio Captioning and Language-Based Audio Retrieval
  • An Approach to Ontological Learning from Weak Labels
  • Approach to Learning Generalized Audio Representation Through Batch Embedding Covariance Regularization and Constant-Q Transforms
  • Conformers are All You Need for Visual Speech Recognition
  • Overview of the Tenth Dialog System Technology Challenge: DSTC10
  • Audio-visual scene-aware dialog and reasoning using audio-visual transformers with joint student-teacher learning
  • DSTC10-AVSD Submission System with Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning
  • An approach for self-training audio event detectors using web data
  • DCASE 2017 challenge setup: Tasks, datasets and baseline system
  • On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice
  • Triple Attention Network architecture for MovieQA
  • Multimodal behavioral markers exploring suicidal intent in social media videos
  • An overview of techniques for biomarker discovery in voice signal
  • Feature extraction and evaluation for BioMedical Question Answering
  • NELS-Never-Ending Learner of Sounds
  • Sound event detection in domestic environments with weakly labeled data and soundscape synthesis
  • CADP: A Novel Dataset for CCTV Traffic Camera based Accident Analysis
  • Repeatability and Scalability of Code at Top level Verification
  • Improving Perceptual Quality, Intelligibility, and Acoustics on VoIP Platforms
  • Overview of Audio Visual Scene-Aware Dialog with Reasoning Track for Natural Language Generation in DSTC10
  • Sound event detection in synthetic domestic environments
  • Reasoning for Audio Visual Scene-Aware Dialog Track in DSTC10
  • A Closer Look at Weak Label Learning for Audio Events
  • Imprecise label learning: A unified framework for learning with various imprecise label configurations
  • Training image classifiers using Semi-Weak Label Data
  • Experiments on the DCASE Challenge 2016: Acoustic scene classification and sound event detection in real life recording
  • Hardware Architecture for High Radix Adaptive CORDIC Algorithm