Accepted papers
Authors:
Joao B Monteiro (Institut National de la Recherche Scientifique)*; Jahangir Alam (Computer Research Institute of Montreal (CRIM), Montreal (Quebec) Canada); Tiago H Falk (INRS-EMT)
Abstract:Time delay neural networks (TDNN) have become ubiquitous for voice biometrics and language recognition tasks relying on utterance-level speaker- or language-dependent representations. In this paper, we discuss directions to improve upon the conventional TDNN architecture to render it more generally applicable. More specifically, we explore the utility of performing pooling operations across different levels of the convolutional stack and further propose an approach to efficiently combine such a set of representations. We show that the resulting models are more versatile, in the sense that a fixed architecture can be re-used across different tasks, and that the learned representations are more discriminative. Evaluations are performed across two settings: (1) two sub-tasks for spoofing attack detection, and (2) three sub-tasks for spoken language identification. Results show that the proposed design yields improvements over the original TDNN architecture, as well as over other previously proposed methods.
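As a hedged illustration of the multi-level pooling idea described above, the PyTorch sketch below pools mean and standard-deviation statistics after every block of a small 1-D convolutional stack and combines them with a linear projection; the layer sizes, pooling statistics, and combination scheme are illustrative assumptions, not the authors' architecture.

# Minimal sketch of pooling across multiple levels of a 1-D conv (TDNN-like) stack
# and combining the pooled statistics into an utterance-level embedding.
import torch
import torch.nn as nn

class MultiLevelStatsPooling(nn.Module):
    def __init__(self, in_dim=40, channels=(256, 256, 256), emb_dim=192):
        super().__init__()
        self.blocks = nn.ModuleList()
        prev = in_dim
        for c in channels:
            self.blocks.append(nn.Sequential(
                nn.Conv1d(prev, c, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.BatchNorm1d(c),
            ))
            prev = c
        # each level contributes mean+std (2*c); a linear layer combines all levels
        self.combine = nn.Linear(2 * sum(channels), emb_dim)

    def forward(self, x):            # x: (batch, feat_dim, time)
        pooled = []
        for block in self.blocks:
            x = block(x)
            pooled.append(torch.cat([x.mean(dim=2), x.std(dim=2)], dim=1))
        return self.combine(torch.cat(pooled, dim=1))   # utterance-level embedding

model = MultiLevelStatsPooling()
emb = model(torch.randn(8, 40, 300))   # 8 utterances, 40-dim features, 300 frames
print(emb.shape)                       # torch.Size([8, 192])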
Authors:
Di Jin (Amazon Alexa AI)*; Shuyang Gao (Amazon); Seokhwan Kim (Amazon Alexa AI); Yang Liu (Amazon, Alexa AI); Dilek Z Hakkani-Tur (Amazon Alexa AI)
Abstract:Most prior work on task-oriented dialogue systems is restricted to supporting domain APIs. However, users may have requests that are out of the scope of these APIs. This work focuses on identifying such user requests. Existing methods for this task mainly rely on fine-tuning pre-trained models on large annotated data. We propose a novel method, REDE, based on adaptive representation learning and density estimation. REDE can be applied to zero/few-shot cases, and it quickly learns a high-performing detector comparable to the full-supervision setting with only a few shots by updating fewer than 3K parameters. We demonstrate REDE's competitive performance on the DSTC9 Track 1 dataset and our newly collected test set.
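The density-estimation side of this kind of out-of-scope detection can be sketched as follows: fit a Gaussian to in-domain sentence embeddings and flag inputs whose Mahalanobis distance exceeds a threshold calibrated on in-domain data. This is a simplified stand-in for REDE, with random vectors in place of learned representations.

import numpy as np

rng = np.random.default_rng(0)
in_domain = rng.normal(0.0, 1.0, size=(500, 64))        # embeddings of in-domain turns
queries = np.vstack([rng.normal(0.0, 1.0, size=(5, 64)),
                     rng.normal(4.0, 1.0, size=(5, 64))])  # last 5 are "out of scope"

mu = in_domain.mean(axis=0)
cov = np.cov(in_domain, rowvar=False) + 1e-6 * np.eye(64)   # regularized covariance
cov_inv = np.linalg.inv(cov)

def mahalanobis(x):
    d = x - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", d, cov_inv, d))

threshold = np.percentile(mahalanobis(in_domain), 95)   # calibrate on in-domain data
print((mahalanobis(queries) > threshold).astype(int))   # 1 = flagged as out of scope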
Authors:
Di Jin (Amazon Alexa AI)*; Shuyang Gao (Amazon); Seokhwan Kim (Amazon Alexa AI); Yang Liu (Amazon, Alexa AI); Dilek Z Hakkani-Tur (Amazon Alexa AI)
Abstract:In many real-world settings, machine learning models need to identify user inputs that are out-of-domain (OOD) so as to avoid performing wrong actions. This work focuses on a challenging case of OOD detection, where no labels for in-domain data are accessible (e.g., no intent labels for the intent classification task). To this end, we propose a novel representation learning based method by combining unsupervised clustering and contrastive learning so that better data representations for OOD detection can be learned. Through extensive experiments, we demonstrate that this method can even be competitive with state-of-the-art supervised approaches that have access to label information.
Authors:
Abstract:Previous work in continual learning for Named Entity Recognition (NER) relies on the assumption that there exists an abundance of labeled data in the new datasets arriving over time. This assumption is usually unrealistic since the token-level annotations required by NER training are laborious and scarce, especially for new (unseen) classes. We present the first work to study continual few-shot learning for NER, which is more general but, as a result, more challenging than continual learning for NER. To alleviate the problem of catastrophic forgetting in continual few-shot learning, we reconstruct synthetic training data of the previously seen classes from the NER model and further develop a framework that distills from the existing model with both the synthetic data and real data from the current training set. Experimental results on several NER benchmarks show that our approach achieves significant improvements over existing baselines.
Authors:
Tal Schuster (MIT CSAIL)*; Adam Fisch (MIT); Tommi Jaakkola (MIT); Regina Barzilay (MIT CSAIL)
Abstract:We develop a novel approach for confidently accelerating inference in the large and expensive multilayer Transformers that are now ubiquitous in natural language processing (NLP). Amortized or approximate computational methods increase efficiency, but can come with unpredictable performance costs. In this work, we present CATs (Confident Adaptive Transformers), in which we simultaneously increase computational efficiency while guaranteeing a specifiable degree of consistency with the original model with high confidence. Our method trains additional prediction heads on top of intermediate layers, and dynamically decides when to stop allocating computational effort to each input using a meta consistency classifier. To calibrate our early prediction stopping rule, we formulate a unique extension of conformal prediction. We demonstrate the effectiveness of this approach on four classification and regression tasks.
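A minimal early-exit sketch in the spirit of this approach is shown below: per-layer prediction heads plus a confidence gate that stops computation once a threshold is met. Plain softmax confidence stands in here for the meta consistency classifier and the conformal calibration of the threshold; the architecture and sizes are arbitrary.

import torch
import torch.nn as nn

class EarlyExitEncoder(nn.Module):
    def __init__(self, dim=128, n_layers=6, n_classes=3, threshold=0.9):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(n_layers))
        self.heads = nn.ModuleList(nn.Linear(dim, n_classes) for _ in range(n_layers))
        self.threshold = threshold    # would be set by a calibration procedure

    @torch.no_grad()
    def forward(self, x):             # x: (1, seq_len, dim) -- one input at a time
        for layer, head in zip(self.layers, self.heads):
            x = layer(x)
            probs = head(x.mean(dim=1)).softmax(dim=-1)
            if probs.max() >= self.threshold:      # confident enough: exit early
                return probs
        return probs                               # fell through to the last layer

model = EarlyExitEncoder().eval()
print(model(torch.randn(1, 16, 128)))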
Authors:
Irene Li (Yale University)*; Vanessa Yan (Yale University); Dragomir Radev (Yale University)
Abstract:Prerequisite chain learning helps people acquire new knowledge efficiently. While people may quickly determine learning paths over concepts in a domain, finding such paths in other domains can be challenging. We introduce Domain-Adversarial Variational Graph Autoencoders (DAVGAE) to solve this cross-domain prerequisite chain learning task efficiently. Our novel model consists of a variational graph autoencoder (VGAE) and a domain discriminator. The VGAE is trained to predict concept relations through link prediction, while the domain discriminator takes both source and target domain data as input and is trained to predict domain labels. Most importantly, this method only needs simple homogeneous graphs as input, compared with the current state-of-the-art model. We evaluate our model on the LectureBankCD dataset, and results show that our model outperforms recent graph-based benchmarks while using only 1/10 of the graph scale and 1/3 of the computation time.
Authors:
Rongsheng Zhang (Fuxi AI Lab, Netease Inc.)*; Yinhe Zheng (Alibaba Group); Xiaoxi Mao (Fuxi AI Lab, Netease Inc.); Minlie Huang (Tsinghua University)
Abstract:Unsupervised domain adaptation (UDA) with pre-trained language models (LM) has achieved promising results since these pre-trained models embed generic knowledge learned from various domains. However, full fine-tuning of the LM for UDA may distort the learned knowledge, and the fully fine-tuned LM is also expensive to deploy. This paper explores an adapter-based fine-tuning approach for unsupervised domain adaptation. Specifically, several trainable adapter modules are inserted into a pre-trained LM, and the embedded generic knowledge is preserved by fixing the parameters of the original LM during fine-tuning. A domain-fusion scheme is introduced to train these adapters on a corpus from mixed domains to better capture transferable features. Extensive experiments on two benchmark datasets are carried out, and the results demonstrate that our approach is effective across different tasks, dataset sizes, and domain similarities.
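The adapter idea can be sketched as follows in PyTorch: small bottleneck adapters are trained while the pre-trained layers stay frozen. Shapes, placement, and the toy "pre-trained" stack are illustrative assumptions, not the paper's configuration.

import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))   # residual bottleneck

class LayerWithAdapter(nn.Module):
    def __init__(self, frozen_layer, dim=768):
        super().__init__()
        self.frozen_layer = frozen_layer
        self.adapter = Adapter(dim)

    def forward(self, h):
        return self.adapter(self.frozen_layer(h))

# toy "pre-trained" stack: freeze the original layers, train only the adapters
base_layers = [nn.Linear(768, 768) for _ in range(4)]
for layer in base_layers:
    for p in layer.parameters():
        p.requires_grad = False
model = nn.Sequential(*[LayerWithAdapter(l) for l in base_layers])

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")            # adapters only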
Authors:
Tasnima Sadekova (Huawei Noah's Ark Lab); Vadim Popov (Huawei Noah's Ark Lab); Vladimir Gogoryan (Huawei Noah's Ark Lab; Higher School of Economics); Ivan Vovk (Huawei Noah's Ark Lab; Higher School of Economics); Mikhail Kudinov (Huawei Noah's Ark Lab)*
Abstract:Recent advances in neural text-to-speech have made it possible to build multi-speaker systems capable of high-fidelity speech generation. However, it is often desirable to add a new voice to a text-to-speech system based on only a few recordings. In this work, we study several approaches to the design of on-device voice cloning. Starting from a multi-speaker TTS system, we improve its quality for a target speaker by fine-tuning the feature generation module on a small speech sample. We compare the performance of a feature generation module based on conventional Tacotron2 with step-wise monotonic attention with ones based on Non-attentive Tacotron and Glow-TTS. We show that Non-attentive Tacotron significantly outperforms the attention-based model and demonstrate that a compact on-device TTS system of good quality can be obtained using only 1 minute of adaptation data and no more than 200 iterations of SGD, corresponding to less than 1 hour of on-device training time on a consumer mobile phone.
Authors:
Tanya G Roosta (Amazon)*; Peyman Passban (Amazon); Ankit R Chadha (Amazon)
Abstract:Training neural machine translation (NMT) models in federated learning (FL) settings can be inefficient both computationally and communication-wise, due to the large size of translation engines as well as the multiple rounds of updates required to train clients and a central server. In this paper, we explore how to efficiently build NMT models in an FL setup by proposing a novel solution. In order to reduce the communication overhead, out of all neural layers we exchange only what we term "Controller" layers. Controllers are a small number of additional neural components connected to our pre-trained architectures. These new components are placed in between the original layers. They act as liaisons to communicate with the central server and learn minimal information that is sufficient to update clients. We evaluated the performance of our models on five datasets from different domains to translate from German into English. We noted that the models equipped with Controllers perform on par with those trained in a central, non-FL setting. In addition, we observed a substantial reduction in the communication traffic of the FL pipeline, which is a direct consequence of using Controllers. Based on our experiments, Controller-based models are 6 times less expensive than their peers. This reduction is particularly important given the number of parameters in large models, and it becomes even more critical when such parameters need to be exchanged over multiple rounds in FL settings.
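The communication pattern can be sketched as follows: only the small trainable layers inserted between frozen pre-trained blocks are extracted, averaged on the server, and pushed back to clients. Module names, sizes, and the plain averaging step are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

def make_client(dim=512, n_blocks=4):
    blocks = []
    for _ in range(n_blocks):
        frozen = nn.Linear(dim, dim)
        for p in frozen.parameters():
            p.requires_grad = False              # pre-trained weights stay local
        blocks += [frozen, nn.Linear(dim, dim)]  # trainable "controller" in between
    return nn.Sequential(*blocks)

def controller_state(model):
    # only parameters that require gradients (the controllers) are communicated
    trainable = {n for n, p in model.named_parameters() if p.requires_grad}
    return {k: v for k, v in model.state_dict().items() if k in trainable}

clients = [make_client() for _ in range(3)]
payloads = [controller_state(c) for c in clients]

# server-side averaging of the controller weights only
averaged = {k: torch.stack([p[k] for p in payloads]).mean(dim=0) for k in payloads[0]}
for c in clients:
    c.load_state_dict(averaged, strict=False)    # push averaged controllers back
print(f"exchanged tensors per round: {len(averaged)}")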
Authors:
Piotr Tarasiewicz (UCL); Sultan Kenjeyev (UCL); Ilana Sebag (UCL)*; Shehab Alshehabi (UCL)
Abstract:The recent emergence of deep learning methods has enabled the research community to achieve state-of-the-art results in several domains, including natural language processing. However, current robocall systems remain unstable and inaccurate: text generators and chatbots can be tedious and can misunderstand human-like dialogue. In this work, we study the performance of two models able to enhance an intelligent conversational agent through adversarial conversational shaping: a generative adversarial network with policy gradient (GANPG) and a generative adversarial network with reward for every generation step (REGS), based on the REGS model presented in Li et al. This model is able to assign rewards to both partially and fully generated text sequences. We discuss performance under different training settings: seq2seq and transformers in a reinforcement learning framework.
Authors:
Shentong Mo (Carnegie Mellon University)*; Jingfei Xia (Carnegie Mellon University); Ihor Markevych (Carnegie Mellon University)
Abstract:Visual and linguistic pre-training aims to learn vision and language representations together, which can be transferred to visual-linguistic downstream tasks. However, semantic confusion exists between language and vision during the pre-training stage. Moreover, current pre-trained models tend to require considerable computation resources for fine-tuning when transferred to downstream tasks. In this work, we present a simple but effective approach for Adaptive Fine-tuning of Vision and Language pre-trained models, namely AFVL. Specifically, we introduce a pair-wise contrastive loss to learn alignments between the whole sentence and each image in the same batch during the pre-training process. At the fine-tuning stage, we introduce two lightweight adaptation networks to reduce model parameters and increase training speed, saving computation resources. We evaluate our AFVL on four main downstream tasks, including Visual Question Answering (VQA), Visual Commonsense Reasoning (VCR), Natural Language for Visual Reasoning (NLVR), and Region-to-Phrase Grounding (RPG). Compared to previous methods, our AFVL achieves comparable or better results while saving training time and GPU memory by a large margin for fine-tuning. Extensive experiments and ablation studies demonstrate the efficiency of the contrastive pre-training and adaptive fine-tuning proposed in our AFVL.
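The pair-wise contrastive alignment objective can be sketched with a standard InfoNCE/CLIP-style loss over a batch of sentence and image embeddings, as below; this illustrates the alignment term only and is not the AFVL training code.

import torch
import torch.nn.functional as F

def contrastive_alignment_loss(text_emb, image_emb, temperature=0.07):
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.t() / temperature      # (batch, batch) similarities
    targets = torch.arange(text_emb.size(0))             # matched pairs on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = contrastive_alignment_loss(torch.randn(16, 256), torch.randn(16, 256))
print(loss.item())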
Authors:
Shira Guskin (Intel)*; Moshe Wasserblat (Intel); Ke Ding (Intel); Gyuwan Kim ()
Abstract:Limited computational budgets often prevent transformers from being used in production and from having their high accuracy utilized. TinyBERT addresses computational efficiency by self-distilling BERT into a smaller transformer representation with fewer layers and smaller internal embeddings. However, TinyBERT's performance drops when we reduce the number of layers by 50%, and drops even more abruptly when we reduce the number of layers by 75% for advanced NLP tasks such as span question answering. Additionally, a separate model must be trained for each inference scenario with its distinct computational budget. In this work we present Dynamic-TinyBERT, a TinyBERT model that utilizes sequence-length reduction and hyperparameter optimization for enhanced inference efficiency under any computational budget. Dynamic-TinyBERT is trained only once, performing on par with BERT and achieving an accuracy-speedup trade-off superior to other efficient approaches (up to 3.3x speedup with less than 1% accuracy loss). Upon publication, the code to reproduce our work will be open-sourced.
Authors:
Ravi Teja Gadde (Amazon)*; Ivan Bulyko (Amazon)
Abstract:Neural language models (LM) trained on diverse corpora are known to work well on previously seen entities; however, updating these models with dynamically changing entities such as place names, song titles, and shopping items requires re-training from scratch and collecting full sentences containing these entities. We aim to address this issue by introducing entity-aware language models (EALM), where we integrate entity models trained on catalogues of entities into the pre-trained LMs. Our combined language model adaptively adds information from the entity models into the pre-trained LM depending on the sentence context. Our entity models can be updated independently of the pre-trained LM, enabling us to influence the distribution of entities output by the final LM without any further training of the pre-trained LM. We show significant perplexity improvements on task-oriented dialogue datasets, especially on long-tailed utterances, with an ability to continually adapt to new entities (to an extent).
Authors:
Aashiq Muhamed (Amazon)*; Iman Keivanloo (Amazon); Sujan Perera (Amazon); James A Mracek (Amazon); Yi Xu (Amazon); Qingjun Cui (Amazon); Santosh Rajagopalan (Amazon); Belinda Zeng (Amazon); Trishul A Chilimbi (Amazon)
Abstract:While pre-trained large language models (LLM) like BERT have achieved state-of-the-art results in several NLP tasks, their performance on tasks with additional grounding, e.g., with numeric and categorical features, is less studied. In this paper, we study the application of pre-trained LLMs to click-through-rate (CTR) prediction for product advertisement in e-commerce. This is challenging because the model needs to a) learn from language as well as tabular data features, b) maintain low latency (<5 ms) at inference time, and c) adapt to a constantly changing advertisement distribution. We first show that scaling the pre-trained language model to 1.5 billion parameters significantly improves performance over conventional CTR baselines. We then present CTR-BERT, a novel lightweight, cache-friendly factorized model for CTR prediction that consists of twin-structured BERT-like encoders for text with a mechanism for late fusion of text and tabular features. We train the CTR-BERT model using cross-architecture knowledge distillation (KD) and empirically study the interaction between KD and distribution shift in this setting by a) experimenting with pre-training, distillation pre-finetuning, and fine-tuning strategies, and b) factorizing features based on their distribution-shift time scales, which allows the model to readily adapt and be re-trained. Finally, we show that CTR-BERT significantly outperforms a traditional CTR baseline with a 2.3% relative ROC-AUC lift in offline experiments and a 2% CTR lift in an online experiment.
Authors:
Xuanli He ( Monash University)*; Iman Keivanloo (Amazon); Yi Xu (Amazon); Xiang He (Amazon); Belinda Zeng (Amazon); Santosh Rajagopalan (Amazon); Trishul A Chilimbi (Amazon)
Abstract:Pre-training and then fine-tuning large language models is commonly used to achieve state-of-the-art performance in natural language processing (NLP) tasks. However, most pre-trained models suffer from low inference speed. Deploying such large models to applications with latency constraints is challenging. In this work, we focus on accelerating inference via conditional computation. To achieve this, we propose a novel idea, Magic Pyramid (MP), to reduce both width-wise and depth-wise computation via token pruning and early exiting for Transformer-based models, particularly BERT. The former saves computation by removing non-salient tokens, while the latter reduces computation by terminating inference before the final layer if the exiting condition is met. Our empirical studies demonstrate that, compared to the previous state of the art, MP is not only able to achieve speed-adjustable inference, but also to surpass token pruning and early exiting by reducing giga floating point operations (GFLOPs) by up to 70% with less than 0.5% accuracy drop. Token pruning and early exiting express distinctive preferences for sequences of different lengths. However, MP is capable of achieving an average 8.06x speedup on two popular text classification tasks, regardless of the size of the inputs.
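A rough sketch of combining the two reductions: at each layer, low-saliency tokens are dropped (width-wise) and inference stops once an intermediate head is confident (depth-wise). Token norms as a saliency proxy, the keep ratio, and the exit threshold are placeholders, not the Magic Pyramid schedule.

import torch
import torch.nn as nn

layers = nn.ModuleList(nn.TransformerEncoderLayer(128, nhead=4, batch_first=True)
                       for _ in range(6))
heads = nn.ModuleList(nn.Linear(128, 2) for _ in range(6))

@torch.no_grad()
def run(x, keep_ratio=0.7, exit_threshold=0.95):
    for layer, head in zip(layers, heads):
        x = layer(x)
        # token pruning: keep the highest-norm tokens (proxy for saliency)
        k = max(1, int(keep_ratio * x.size(1)))
        idx = x.norm(dim=-1).topk(k, dim=1).indices          # (batch, k)
        x = torch.gather(x, 1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
        # early exit: stop if the intermediate head is confident enough
        probs = head(x.mean(dim=1)).softmax(dim=-1)
        if probs.max() >= exit_threshold:
            return probs
    return probs

print(run(torch.randn(1, 32, 128)))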
Authors:
Robert L Logan (UC Irvine)*; Ivana Balazevic (University of Edinburgh); Eric Wallace (U.C. Berkeley); Fabio Petroni (Facebook AI Research); Sameer Singh (University of California, Irvine); Sebastian Riedel ()
Abstract:Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning. In this work, we show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering. In fact, one can use null prompts, prompts that contain neither task-specific templates nor training examples, and achieve accuracy competitive with manually tuned prompts across a wide range of tasks. While finetuning LMs does introduce new parameters for each downstream task, we show that this memory overhead can be substantially reduced: finetuning only the bias terms can achieve comparable or better accuracy than standard finetuning while updating only 0.1% of the parameters. All in all, we recommend finetuning LMs for few-shot learning as it is more accurate, robust to different prompts, and can be made nearly as efficient as using frozen LMs.
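Bias-only finetuning can be sketched in a few lines: freeze every weight matrix of a (toy) pre-trained model and leave only the bias terms and a task head trainable. The toy stack below stands in for a real LM.

import torch.nn as nn

model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(),
                      nn.Linear(768, 768), nn.ReLU(),
                      nn.Linear(768, 2))                  # last layer acts as task head

for name, p in model.named_parameters():
    # train only bias terms plus the task head (module index 4 in this toy stack)
    p.requires_grad = name.endswith("bias") or name.startswith("4.")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"updating {trainable / total:.2%} of parameters")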
Authors:
Tianda Li (Noah's Ark Lab (Montreal))*; Yassir El Mesbahi (Huawei); Ivan Kobyzev (Huawei Noah's Ark Lab); Ahmad Rashid (Huawei Noah's Ark Lab); Atif A Mahmud (Huawei Noah's Ark Lab); Nithin Anchuri (Huawei Noah's Ark Lab); Habib Hajimolahoseini (Huawei Toronto Research Centre); Yang Liu (Huawei Canada); Mehdi Rezagholizadeh (Huawei Noah's Ark Lab)
Abstract:Pre-trained Language Models (PLMs) have been successful for a wide range of natural language processing (NLP) tasks. State-of-the-art PLMs, however, are too large to be used on edge devices. As a result, the topic of model compression has attracted increasing attention in the NLP community. Most existing works focus on compressing encoder-based models (TinyBERT, DistilBERT, DistilRoBERTa, etc.); however, to the best of our knowledge, the compression of decoder-based models (such as GPT-2) has not been investigated much. Our paper aims to fill this gap. Specifically, we explore two directions: 1) we employ current state-of-the-art knowledge distillation techniques to improve fine-tuning of DistilGPT-2, and 2) we pre-train a compressed GPT-2 model using layer truncation and compare it against the distillation-based method (DistilGPT-2). The training time of our compressed model is significantly less than that of DistilGPT-2, but it can achieve better performance when fine-tuned on downstream tasks. We also demonstrate the impact of data cleaning on model performance.
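The layer-truncation baseline can be sketched with the Hugging Face transformers library by keeping only the bottom blocks of GPT-2 before further training; this is an illustration of the idea (with an arbitrary choice of 6 layers), not the authors' released code.

from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")     # 12 transformer blocks
keep = 6
model.transformer.h = model.transformer.h[:keep]    # drop the top blocks
model.config.n_layer = keep

params = sum(p.numel() for p in model.parameters())
print(f"layers: {model.config.n_layer}, parameters: {params / 1e6:.1f}M")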
Authors:
Thomas Bohnstingl (IBM Research)*; Ayush Garg (IBM Research Zürich); Stanisław Woźniak (IBM Research); George Saon (IBM); Evangelos Eleftheriou (IBM Research); Angeliki Pantazi (IBM Research)
Abstract:Automatic speech recognition (ASR) is a capability which enables a program to process human speech into a written form. Recent developments in artificial intelligence (AI) have led to high-accuracy ASR systems based on deep neural networks, such as the recurrent neural network transducer (RNN-T). However, the core components and the performed operations of these approaches depart from the powerful biological counterpart, i.e., the human brain. On the other hand, current developments in biologically-inspired ASR models, based on spiking neural networks (SNNs), lag behind in terms of accuracy and focus primarily on small-scale applications. In this work, we revisit the incorporation of biologically-plausible models into deep learning and substantially enhance their capabilities by taking inspiration from the diverse neural and synaptic dynamics found in the brain. In particular, we introduce neural connectivity concepts emulating the axo-somatic and the axo-axonic synapses. Based on this, we propose novel deep learning units with enriched neuro-synaptic dynamics and integrate them into the RNN-T architecture. We demonstrate, for the first time, that a biologically realistic implementation of a large-scale ASR model can yield competitive performance levels compared to existing deep learning models. Specifically, we show that such an implementation bears several advantages, such as reduced computational cost and lower latency, which are critical for speech recognition applications.
Authors:
Habib Hajimolahoseini (Huawei Toronto Research Centre)*; Mehdi Rezagholizadeh (Huawei Technologies); Vahid Partovi Nia (Huawei Noah's Ark Lab); Marzieh Tahaei (Huawei Noah's Ark Lab); Omar A.M.A. Mohamed Awad (Huawei Technologies); Yang Liu (Huawei Canada)
Abstract:In this paper, a progressive low-rank decomposition (LRD) method is used to compress large-scale pre-trained transformer-based language models. To this end, each fully-connected layer of the transformer modules is decomposed into two consecutive smaller ones using a progressive Singular Value Decomposition technique. In contrast to many state-of-the-art compression methods, in which intensive pre-training of the compressed model is necessary, progressive LRD can provide promising performance by compressing the model in the fine-tuning stage. Furthermore, current state-of-the-art model compression techniques usually face a limitation in their compression ratio, as the accuracy gap becomes significant for compression ratios higher than 2×. We show that in later steps of the iterative compression, where the decomposed models become much smaller than the original (compression factors larger than 8×), Knowledge Distillation can also be used to improve performance.
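The per-layer building block, decomposing one fully-connected layer into two consecutive smaller ones via a truncated SVD, can be sketched as below; the rank is arbitrary and the progressive, iterative scheduling is omitted.

import torch
import torch.nn as nn

def decompose_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    W = layer.weight.data                       # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=True)
    first.weight.data = torch.diag(S[:rank]) @ Vh[:rank]      # (rank, in_features)
    second.weight.data = U[:, :rank]                          # (out_features, rank)
    second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)

fc = nn.Linear(768, 3072)
approx = decompose_linear(fc, rank=128)
x = torch.randn(4, 768)
print((fc(x) - approx(x)).abs().max())          # approximation error of the new stack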
Authors:
Tong Yu (Adobe Research); Junda Wu (New York University); Ruiyi Zhang (Adobe Research); Handong Zhao (Adobe Research); Shuai Li (Shanghai Jiao Tong University)*
Abstract:Named Entity Recognition (NER) is an important task that enables a wide range of NLP applications. State-of-the-art NER models are based on deep learning and require sufficient labeled data. In practice, the labeled data for NER is usually limited, as providing accurate labels for sentences is very time-consuming. With an NER model trained on limited labeled data, it is desirable to develop an efficient mechanism to collect data labels and improve the model over time. To achieve this, existing works develop active learning approaches. However, these approaches are usually developed for annotators and assume the annotators will provide exactly correct labels for each sentence selected for labeling. In this paper, we propose a simple yet effective user-in-the-loop feedback mechanism that enables end users, instead of annotators, to easily provide labels to the system. We identify counterfactual bias in the data collected by this feedback mechanism. To alleviate the bias and achieve more sample-efficient learning, we further develop a counterfactual NER learning framework. We develop an imputation model to estimate the loss on the non-displayed entity classes. By considering losses on both displayed and non-displayed entity classes, we can efficiently alleviate such display bias in the NER model. With extensive experiments, we validate the effectiveness of our feedback mechanism and learning framework.
Authors:
Patrick Xia (Johns Hopkins University)*; Richard Shin (Microsoft Semantic Machines)
Abstract:The sizes of pretrained language models make them challenging and expensive to use when there are multiple desired downstream tasks. In this work, we adopt recent strategies for model pruning during finetuning to explore the question of whether it is possible to prune a single encoder so that it can be used for multiple tasks. We allocate a fixed parameter budget and compare pruning a single model with a multitask objective against the best ensemble of single-task models. We find that under two pruning strategies (element-wise and rank pruning), the approach with the multitask objective outperforms training models separately when averaged across all tasks, and it is competitive on each individual one. Additional analysis finds that using a multitask objective during pruning can also be an effective method for reducing model sizes for low-resource tasks.
Authors:
Sharath Nittur Sridhar (Intel AI Lab)*; Anthony Sarah (Intel Corporation)
Abstract:In recent times, BERT-based models have been extremely successful in solving a variety of natural language processing (NLP) tasks such as reading comprehension, natural language inference, sentiment analysis, etc. All BERT-based architectures have a self-attention block followed by a block of intermediate layers as the basic building component. However, a strong justification for the inclusion of these intermediate layers remains missing in the literature. In this work, we investigate the impact of the intermediate layers on the overall network performance on downstream tasks. We show that reducing the number of intermediate layers and modifying the architecture of BERT-Base results in minimal loss of fine-tuning accuracy for downstream tasks while decreasing the number of parameters and the training time of the model. Additionally, we use centered kernel alignment and probing linear classifiers to gain insight into our architectural modifications and justify that removal of intermediate layers has little impact on the fine-tuned accuracy.
Authors:
Ofir Zafrir (Intel Labs, Israel)*; Ariel Larey (Intel Labs, Israel); Guy Boudoukh (Intel Labs, Israel); Haihao Shen (Intel); Moshe Wasserblat (Intel Labs, Israel)
Abstract:Transformer-based language models are applied to a wide range of applications in natural language processing. However, they are inefficient and difficult to deploy. In recent years, many compression algorithms have been proposed to increase the implementation efficiency of large Transformer-based models on target hardware. In this work we present a new method for training sparse pre-trained Transformer language models by integrating weight pruning and model distillation. These sparse pre-trained models can be used for transfer learning on a wide range of tasks while maintaining their sparsity pattern. We demonstrate our method on three known architectures to create sparse pre-trained BERT-Base, BERT-Large and DistilBERT. We show how the compressed sparse pre-trained models we trained transfer their knowledge to five different downstream natural language tasks with minimal accuracy loss. Moreover, we show how to further compress the sparse models' weights to 8-bit precision using quantization-aware training. For example, with our sparse pre-trained BERT-Large fine-tuned on SQuADv1.1 and quantized to 8-bit, we achieve a compression ratio of 40X for the encoder with less than 1% accuracy loss. To the best of our knowledge, our results show the best compression-to-accuracy ratio for BERT-Base, BERT-Large, and DistilBERT.
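A sketch of the unstructured magnitude-pruning ingredient: zero out the smallest-magnitude weights of each linear layer to a target sparsity; keeping the resulting zero pattern fixed during downstream fine-tuning is what preserves sparsity during transfer. The distillation and quantization-aware training steps are omitted, and the layer sizes are illustrative.

import torch
import torch.nn as nn

def magnitude_prune_(module: nn.Module, sparsity: float = 0.9):
    for m in module.modules():
        if isinstance(m, nn.Linear):
            w = m.weight.data
            k = int(sparsity * w.numel())
            threshold = w.abs().flatten().kthvalue(k).values
            m.weight.data = torch.where(w.abs() > threshold, w, torch.zeros_like(w))

model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
magnitude_prune_(model, sparsity=0.9)
zeros = sum((m.weight == 0).float().mean().item()
            for m in model.modules() if isinstance(m, nn.Linear)) / 2
print(f"achieved sparsity: {zeros:.2%}")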
Authors:
Ali Edalati (Huawei Technologies); Marzieh Tahaei (Huawei Noah's Ark Lab); Ahmad Rashid (Huawei Noah's Ark Lab)*; Vahid Partovi Nia (Huawei Noah's Ark Lab); James J. Clark (McGill University); Mehdi Rezagholizadeh (Huawei Technologies)
Abstract:GPT is an auto-regressive Transformer-based pre-trained language model which has attracted a lot of attention in the natural language processing (NLP) domain due to its state-of-the-art performance in several downstream tasks. The success of GPT is mostly attributed to its pre-training on a huge amount of data and its large number of parameters (from 100M to billions of parameters). Despite the superior performance of GPT (especially in few-shot or zero-shot setups), the overparameterized nature of GPT can be prohibitive for deploying this model on devices with limited computational power or memory. This problem can be mitigated using model compression techniques; however, compressing GPT models has not been investigated much in the literature. In this work, we use Kronecker decomposition to compress the linear mappings of the GPT-2 model. Our Kronecker GPT-2 model (KnGPT2) is initialized from the Kronecker-decomposed version of the GPT-2 model and then undergoes very light pre-training on only a small portion of the training data with intermediate-layer knowledge distillation (ILKD). Finally, our KnGPT2 is fine-tuned on downstream tasks using ILKD as well. We evaluate our model on both language modeling and the General Language Understanding Evaluation benchmark tasks and show that, with more efficient pre-training and a similar number of parameters, our KnGPT2 significantly outperforms the existing DistilGPT2 model.
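A Kronecker-factored linear layer can be sketched as follows: the weight matrix is represented as kron(A, B), so only the two small factors are stored and trained. For clarity the full matrix is materialized in forward(); an efficient implementation would avoid this, and the factor shapes are illustrative, not the KnGPT2 configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class KroneckerLinear(nn.Module):
    def __init__(self, a_shape=(64, 48), b_shape=(12, 16)):
        super().__init__()
        self.A = nn.Parameter(torch.randn(*a_shape) * 0.02)
        self.B = nn.Parameter(torch.randn(*b_shape) * 0.02)
        out_f, in_f = a_shape[0] * b_shape[0], a_shape[1] * b_shape[1]
        self.bias = nn.Parameter(torch.zeros(out_f))
        self.in_features, self.out_features = in_f, out_f

    def forward(self, x):                       # x: (batch, in_features)
        return F.linear(x, torch.kron(self.A, self.B), self.bias)

layer = KroneckerLinear()                       # stands in for a 768x768 mapping
dense_params = layer.out_features * layer.in_features
factored_params = layer.A.numel() + layer.B.numel()
print(layer(torch.randn(2, layer.in_features)).shape, dense_params, factored_params)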
Authors:
Marko Stamenovic (Bose)*; Li-Chia Yang (Bose); Nils Westhausen (University of Oldenburg); Carl Jensen (Bose); Alex Pawlicki (Bose)
Abstract:We explore network sparsification strategies with the aim of compressing neural speech enhancement (SE) down to an optimal configuration for a new generation of low-power, microcontroller-based neural accelerators (microNPUs). We examine three unique sparsity structures: weight pruning, block pruning and unit pruning; and discuss their benefits and drawbacks when applied to SE. We focus on the interplay between computational throughput, memory footprint and model quality. Our method supports all three sparsity structures above and jointly learns integer quantized weights along with sparsity, alleviating the need for tedious manual fine-tuning. Additionally, we demonstrate offline magnitude-based pruning of integer quantized models as a performance baseline. Although efficient speech enhancement is an active area of research, our work is the first to apply block pruning to SE and the first to address SE model compression in the context of microNPUs. Using weight pruning, we show that we are able to compress an already compact model's memory footprint by a factor of 42X, from 3.7MB to 87kB, while losing only 0.1 dB SDR in performance. We also show a computational speedup of 6.7X with a corresponding SDR drop of only 0.59 dB using block pruning.
Authors:
Soham D Tiwari (Manipal Institute of Technology, Manipal)*; Kshitiz Lakhotia (Manipal Institute of Technology, Manipal); Manjunath Mulimani (Manipal Institute of Technology Manipal)
Abstract:Sound event detection (SED) in machine listening entails identifying the different sounds in an audio file and identifying the start and end time of each sound event in the audio. SED finds use in various applications such as audio surveillance, speech recognition, and context-based indexing and retrieval of data in a multimedia database. However, in real-life scenarios, audio from various sources is seldom devoid of interfering noise or disturbance. In this paper, we test the performance of the You Only Hear Once (YOHO) algorithm on noisy audio data. Inspired by the You Only Look Once (YOLO) algorithm in computer vision, the YOHO algorithm can match the performance of various state-of-the-art algorithms on datasets such as the Music Speech Detection Dataset, TUT Sound Events, and Urban-SED, but at lower inference times. In this paper, we explore the performance of the YOHO algorithm on the VOICe dataset, which contains audio files with noise at different signal-to-noise ratios (SNR). YOHO can outperform, or at least match, the best-performing SED algorithms reported in the VOICe dataset paper, while making inferences in less time.