Efficient Natural Language and Speech Processing

(Models, Training and Inference)

This workshop introduces fundamental problems in natural language and speech processing that are of interest to the broader machine learning and deep learning community, with a focus on improving the efficiency of models, their training, and their inference. The program offers an interactive platform that gathers experts and talents from academia and industry through invited keynote talks, panel discussions, paper submissions and reviews, poster and oral presentations, and a mentorship program. It provides an opportunity to discuss and learn from each other, exchange ideas, build connections, and brainstorm on potential solutions and future collaborations. The topics of this workshop are relevant to people working on general machine learning, deep learning, optimization, theory, and NLP & speech applications.

Overview

Deep neural networks owe much of their success in natural language processing (NLP) and speech processing to heavy over-parameterization and very large amounts of training data, yet training or deploying these networks on devices, or even on cloud services, with limited memory and compute can be very expensive and challenging. For instance, pre-trained language models (PLMs) such as GPT-3 have led to great breakthroughs in NLP, but running GPT-3, with more than 170 billion parameters trained on more than 500 GB of data, requires more than 10 Tesla V100 GPUs. Even so, improving NLP and speech models by increasing their number of parameters and incorporating more data remains common practice in both domains. It is therefore of vital importance to invest in the efficiency of these models, in terms of model architectures, training, and inference, from the different perspectives highlighted in this workshop. In this regard, we would like to share some unique and fundamental challenges with the NeurIPS community for consideration in their future investigations.



Call for Papers

We encourage the NeurIPS community to submit their solutions, ideas, and ongoing work concerning data, model, training, and inference efficiency for NLP and speech processing. The scope of this workshop includes, but is not limited to, the following topics.

Efficient Pre-Training and Fine-Tuning. Pre-training is a very expensive process, and even a small change to a model's configuration can force the user to redo pre-training from scratch (see the sketch after this list):
  • Fast pre-training techniques, avoiding pre-training from scratch
  • Multi-domain pre-training/fine-tuning and fast domain adaptation for pre-trained/fine-tuned models
  • Multimodal pre-trained (e.g., text-speech) models
  • Avoiding task-specific fine-tuning of pre-trained models
  • New efficient architectures for pre-trained models
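As a concrete illustration of one way to sidestep full task-specific fine-tuning, the following is a minimal PyTorch sketch (not tied to any particular submission or speaker) of parameter-efficient adaptation: a pre-trained encoder is frozen and only a small task head is trained. The encoder, dimensions, and task here are placeholder stand-ins.

```python
import torch
import torch.nn as nn

class FrozenEncoderClassifier(nn.Module):
    """Hypothetical wrapper: a frozen pre-trained encoder plus a small trainable head."""
    def __init__(self, encoder: nn.Module, hidden_dim: int, num_labels: int):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():            # freeze all pre-trained weights
            p.requires_grad = False
        self.head = nn.Linear(hidden_dim, num_labels)  # only this part is trained

    def forward(self, x):
        with torch.no_grad():                          # no gradients through the encoder
            features = self.encoder(x)
        return self.head(features)

# Toy stand-in for a pre-trained encoder (any module returning [batch, hidden_dim]).
encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 256))
model = FrozenEncoderClassifier(encoder, hidden_dim=256, num_labels=4)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(model.head.parameters(), lr=1e-3)
```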

Model Compression. Neural model compression techniques such as quantization, pruning, layer decomposition, and knowledge distillation (KD) aim to reduce the number of model parameters, their memory footprint, or their inference cost (a distillation sketch follows this list):
  • Impact of different compression techniques on the inductive biases learned by the original models
  • Combined compression techniques for more efficient NLP and speech models
  • Efficient KD for NLP and speech, efficient intermediate layer distillation, and teacher-free distillation
  • Improving KD for large classification problems (e.g., text generation and machine translation with a very large number of output classes)
  • Solving the capacity-gap problem and the search problem of finding the best teacher checkpoint
  • Theory of KD (e.g., how does KD work?)
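For readers less familiar with KD, the following is a minimal, generic PyTorch sketch of response-based distillation: the student matches temperature-softened teacher probabilities via a KL term, mixed with the usual cross-entropy on hard labels. The temperature, mixing weight, and toy tensors are arbitrary choices for illustration only.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Response-based KD: soft targets from the teacher + hard-label cross-entropy."""
    # Soften both distributions with the temperature, then match them with KL divergence.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
    # Usual supervised loss on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Toy usage with random logits for a 4-class problem.
student_logits = torch.randn(8, 4)
teacher_logits = torch.randn(8, 4)
labels = torch.randint(0, 4, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```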

Efficient Training. Improving the training speed of NLP and speech models (see the sketch after this list):
  • Improving the optimizer for faster training
  • Accelerated training of different tasks in NLP and speech
  • Distributed training, federated learning and continual learning for NLP and speech tasks
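As one common, library-level example of the training-speed techniques above, the sketch below combines mixed-precision training with gradient accumulation in PyTorch. The model, learning rate, accumulation factor, and toy data loader are placeholders, and a CUDA device is assumed.

```python
import torch
import torch.nn as nn

# Toy model and data; any NLP/speech model and dataloader could take their place.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 4)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loader = [(torch.randn(16, 128), torch.randint(0, 4, (16,))) for _ in range(8)]

scaler = torch.cuda.amp.GradScaler()          # scales the loss to keep fp16 gradients stable
accum_steps = 4                               # simulate a 4x larger batch without extra memory

for step, (x, y) in enumerate(loader):
    x, y = x.cuda(), y.cuda()
    with torch.cuda.amp.autocast():           # run the forward pass in mixed precision
        loss = nn.functional.cross_entropy(model(x), y) / accum_steps
    scaler.scale(loss).backward()             # accumulate scaled gradients
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)                # unscale gradients and apply the update
        scaler.update()
        optimizer.zero_grad()
```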

Data Efficiency. Pre-trained models rely on huge amounts of unlabeled data, which makes training very sample-inefficient (see the sketch after this list):
  • Sample efficient training, training with less data, few-shot and zero-shot learning
  • Sample efficient data-augmentation, identifying which training samples should be augmented
  • Low-resource NLP and speech, considering training tasks with limited available data
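As a small illustration of the kind of cheap data augmentation discussed above, the dependency-free sketch below applies EDA-style random token deletion and swapping; the probabilities and the example sentence are arbitrary.

```python
import random

def augment(tokens, p_drop=0.1, n_swaps=1, seed=None):
    """Cheap token-level augmentation: random word dropout plus local swaps."""
    rng = random.Random(seed)
    # Randomly drop tokens (always keep at least one).
    kept = [t for t in tokens if rng.random() > p_drop] or tokens[:1]
    # Swap a few random pairs of positions.
    for _ in range(n_swaps):
        if len(kept) > 1:
            i, j = rng.sample(range(len(kept)), 2)
            kept[i], kept[j] = kept[j], kept[i]
    return kept

print(augment("efficient training needs far less labeled data".split(), seed=0))
```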

Edge Intelligence. Running trained models on edge devices requires a conversion process that matches the network to the hardware's specifications (see the sketch after this list):
  • TinyML for NLP and speech on embedded systems
  • Efficient conversion versus hardware-aware training
  • Training on device
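As one example of such a conversion step, the sketch below applies PyTorch's post-training dynamic quantization to a toy model, storing Linear-layer weights in int8 for smaller, faster CPU inference; the model itself is a placeholder for a trained NLP or speech network.

```python
import torch
import torch.nn as nn

# A toy float32 model standing in for a trained NLP/speech network.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 4)).eval()

# Post-training dynamic quantization: Linear weights are stored in int8 and
# activations are quantized on the fly, with no retraining required.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Both models accept the same inputs; only the arithmetic differs.
x = torch.randn(1, 256)
print(model(x).shape, quantized(x).shape)
```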

Submission Instructions

You are invited to submit your papers through our CMT submission portal. All submissions must be anonymized for double-blind review, and we expect each paper to be reviewed by at least three reviewers. The content of the paper (excluding references and supplementary materials) should be no longer than 6 pages and must strictly follow the NeurIPS template style.

Authors can separately submit up to 100 MB of supplementary materials and are highly encouraged to include their code for reproducibility. Although original submissions are preferred, papers that are already published, available on arXiv, or currently under submission elsewhere are also welcome. Please make sure to indicate the complete list of conflicts of interest for all authors of your paper. To encourage higher-quality submissions, our sponsors are offering a Best Paper Award to outstanding original oral and poster presentations (upon nomination by the reviewers). Note that the workshop is non-archival, but accepted papers will be hosted on the workshop website.



Important Dates:

  • Submission Deadline: September 22, 2021 AOE
  • Uploading Supplementary Materials: September 26, 2021 AOE
  • Acceptance Notification: October 23, 2021 AOE
  • Camera-Ready Submission: November 1, 2021 AOE
  • Workshop Date: December 13, 2021


Confirmed Speakers

  • Prof. Mirella Lapata, University of Edinburgh
  • Prof. Luke Zettlemoyer, University of Washington (Facebook)
  • Prof. Kevin Duh, Johns Hopkins University
  • Dr. Boxing Chen, Alibaba
  • Prof. Sameer Singh, University of California, Irvine
  • Prof. Danqi Chen, Princeton University
  • Dr. Mohammad Norouzi, Google Brain
  • Prof. Yejin Choi, University of Washington (Allen Institute for AI)
  • Dr. Lu Hou, Huawei Noah's Ark Lab
  • Prof. Xu Sun, Peking University
  • Prof. Barbara Plank, IT University of Copenhagen
  • Prof. Samira Ebrahimi Kahou, ETS & MILA


Schedule (EST time zone - New York/Montreal/Toronto)

Time | Title | Presenter
08:00 AM - 08:10 AM | Opening Speech | Pascal Poupart
08:10 AM - 08:50 AM | Continual Learning in Large-Scale Pre-Training | Xu Sun
08:50 AM - 09:30 AM | Efficient Multi-lingual Neural Machine Translation | Boxing Chen
09:30 AM - 10:10 AM | Compression and Acceleration of Pre-trained Language Models | Lu Hou
10:10 AM - 10:20 AM | Break
10:20 AM - 11:00 AM | Summarization in Quantized Transformer Spaces | Mirella Lapata
11:00 AM - 11:40 AM | Data-Efficient Cross-Lingual Natural Language Processing | Barbara Plank
11:40 AM - 12:20 PM | From model compression to self-distillation: a review | Samira Ebrahimi Kahou
12:20 PM - 01:00 PM | Break
12:20 PM - 01:20 PM | Poster session
01:20 PM - 01:25 PM | A versatile and efficient approach to summarize speech into utterance-level representations | Joao B Monteiro
01:25 PM - 01:30 PM | Towards Zero and Few-shot Knowledge-seeking Turn Detection in Task-orientated Dialogue Systems | Di Jin
01:30 PM - 01:35 PM | Consistent Accelerated Inference via Confident Adaptive Transformers | Tal Schuster
01:35 PM - 01:40 PM | Communication-Efficient Federated Learning for Neural Machine Translation | Tanya Roosta
01:40 PM - 01:45 PM | Dynamic-TinyBERT: Further Enhance the Inference Efficiency of TinyBERT by Dynamic Sequence Length | Shira Guskin
01:45 PM - 01:50 PM | CTR-BERT: Cost-effective knowledge distillation for billion-parameter teacher models | TBD
01:50 PM - 01:55 PM | Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models | Robert Logan
01:55 PM - 02:00 PM | Weight, Block or Unit? Exploring Sparsity Tradeoffs for Speech Enhancement on Tiny Neural Accelerators | Marko Stamenovic
02:00 PM - 02:40 PM | How to Win LMs and Influence Predictions: Using Short Phrases to Control NLP Models | Sameer Singh
02:40 PM - 03:20 PM | Benchmarks for Multi-objective Hyperparameter Optimization | Kevin Duh
03:20 PM - 04:00 PM | NLP with Synthetic Text | Mohammad Norouzi
04:00 PM - 04:10 PM | Break
04:10 PM - 04:50 PM | Toward Efficient Training of Large Language Models with Balanced Conditional Compute | Luke Zettlemoyer
04:50 PM - 05:30 PM | Why We Want Contrastive Learning in Language Models | Danqi Chen
05:30 PM - 06:10 PM | Battling with Larger Models through Grounding and Searching | Yejin Choi
06:10 PM - 06:15 PM | Break
06:15 PM - 07:00 PM | Panel Discussion | Pascal Poupart, Ali Ghodsi, Luke Zettlemoyer, Sameer Singh, Kevin Duh, Yejin Choi, Lu Hou
07:00 PM - 07:10 PM | Best Papers and Closing Remarks | Pascal Poupart & Ali Ghodsi
07:10 PM - 08:00 PM | Poster session


Organizers

  • Mehdi Rezagholizadeh, Huawei Noah's Ark Lab
  • Lili Mou, University of Alberta
  • Yue Dong, McGill University & MILA
  • Pascal Poupart, University of Waterloo
  • Ali Ghodsi, University of Waterloo
  • Qun Liu, Huawei Noah's Ark Lab

Volunteers

  • Khalil Bibi, Huawei Noah's Ark Lab
  • Anderson Avila, Huawei Noah's Ark Lab


Technical Committee

  • Pascal Poupart (UoWaterloo)
  • Kevin Duh (Johns Hopkins University)
  • Wulong Liu (Huawei Noah's Ark Lab)
  • Bang Liu (UoMontreal)
  • Di Jin (Amazon Alexa AI)
  • Hamidreza Mahyar (McMaster University)
  • Lili Mou (UoAlberta)
  • Peyman Passban (Amazon)
  • Prasanna Parthasarathi (McGill & MILA)
  • Vahid Partovi Nia (Huawei Noah's Ark Lab)
  • Yue Dong (McGill & MILA)
  • Ivan Kobyzev (Huawei Noah's Ark Lab)
  • Jad Kabbara (McGill & MILA)
  • Aref Jafari (UoWaterloo)
  • Ahmad Rashid (Huawei Noah's Ark Lab)
  • Shailza Jolly (TU Kaiserslautern)
  • Md. Akmal Haidar (Nuance Communications)
  • Jingjing Xu (ByteDance)
  • Vasileios Lioutas (UoBritish Columbia (UBC))
  • Anderson R. Avila (Huawei Noah's Ark Lab)
  • Malik H. Altakrori (McGill & MILA)
  • Ali Vahdat (Thomson Reuters)
  • Fattane Zarrinkalam (Thomson Reuters)
  • Makesh S Narsimhan (McGill & MILA)
  • Borna Jafarpour (Thomson Reuters)
  • Shohreh Shaghaghian (Thomson Reuters)
  • Ehsan Kamalloo (UoAlberta)
  • Ali Saheb Pasand (UoWaterloo)
  • Abbas Ghaddar (Huawei Noah's Ark Lab)
  • Mehrdad Ganjeh (Ernst & Young (EY))
  • Mingxuan Wang (ByteDance)
  • Tanya Roosta (Amazon)
  • Soheila Samiee (BASF)
  • Yimeng Wu (Huawei Noah's Ark Lab)
  • Marzieh Tahaei (Huawei Noah's Ark Lab)
  • Habib Hajimolahoseini (Huawei Technologies)
  • Mohammad Salameh (Huawei Technologies)
  • Kira Aveline Selby (UoWaterloo)
  • Mohammed Senoussaoui (Fluent.ai)
  • M. Sarria-Paja (Universidad Santiago de Cali)
  • Puneeth Saladi (Huawei Noah's Ark Lab)
  • Flávio Ávila (Verisk Analytics)
  • Tal Schuster (MIT)
  • Irene Li (Yale)
  • Shentong Mo (Carnegie Mellon University)
  • Alpana Agarwal (Thapar University)
  • Vinay Kumar (Thapar University)
  • Shivani Malhotra (TIET Patiala)
  • Iman Keivanloo (Amazon)
  • Aashiq Muhamed (Amazon)
  • Robert L. Logan IV (UC Irvine)
  • Patrick Xia (Johns Hopkins University)
  • Moshe Wasserblat (Intel)
  • Guy Boudoukh (Intel)
  • Ankit Chadha (Amazon)
  • Khalil Bibi (Huawei Noah's Ark Lab)
  • David Alfonso Hermelo (Huawei Noah's Ark Lab)


Sponsor