A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.



First PC build


My everyday work computer is a MacBook Pro that I’ve had since 2013. It’s a great machine and continues to serve me well, but I was moved in a moment of pandemic malaise to treat myself to a little upgrade.

Siri from scratch! (Not really.)


I make fairly heavy use of the voice assistant on my phone for things like setting timers while cooking. As a result, when I spent some time this summer at my in-laws’ place, where there was no cell signal and not-very-good Wi-Fi, I often tried using Siri only to get a sad little “sorry, no Internet :(” response. (#FirstWorldProblems.)

Sequence-to-sequence learning with Transducers


The Transducer (sometimes called the “RNN Transducer” or “RNN-T”, though it need not use RNNs) is a sequence-to-sequence model proposed by Alex Graves in “Sequence Transduction with Recurrent Neural Networks”. The paper was published at the ICML 2012 Workshop on Representation Learning. Graves showed that the Transducer was a sensible model to use for speech recognition, achieving good results on a small dataset (TIMIT).

My research goals


I wanted to clarify to myself and others what some of my research goals are, and why I’m working on certain problems. The hope is that putting this online for the world to see will help challenge me to keep focused and working towards those goals—sort of like telling your friends that you’re going to quit smoking, or something like that.

Predictive coding in machines and brains


The name “predictive coding” has been applied to a number of engineering techniques and scientific theories. All these techniques and theories involve predicting future observations from past observations, but what exactly is meant by “coding” differs in each case. Here is a quick tour of some flavors of “predictive coding” and how they’re related.

A contemplation of $\text{logsumexp}$


$\text{logsumexp}$ is an interesting little function that shows up surprisingly often in machine learning. Join me in this post to shed some light on $\text{logsumexp}$: where it lives, how it behaves, and how to interpret it.
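As a quick illustration of how the function behaves, here is a minimal sketch in plain Python. It uses the usual max-subtraction trick for numerical stability; the function name and the choice of the standard library over NumPy or PyTorch are assumptions made for this example, not something taken from the post.

```python
import math

def logsumexp(xs):
    """Compute log(sum(exp(x) for x in xs)) without overflow.

    Assumes xs is a non-empty list of floats.
    """
    m = max(xs)
    # If the largest entry is +/-inf, the answer is that value and
    # subtracting it would produce NaNs, so return it directly.
    if math.isinf(m):
        return m
    # Shifting by the max keeps every exponent <= 0, so exp() cannot overflow.
    return m + math.log(sum(math.exp(x - m) for x in xs))

# A naive math.log(math.exp(1000) + math.exp(1000)) overflows,
# while the shifted version returns roughly 1000.693.
stable = logsumexp([1000.0, 1000.0])
```

The result always lies between `max(xs)` and `max(xs) + log(len(xs))`, which is one way to read the function as a "soft" maximum.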

Notebook: Fun with Hidden Markov Models


I’ve written a notebook introducing Hidden Markov Models (HMMs) with a PyTorch implementation of the forward algorithm, the Viterbi algorithm, and training a model on a text dataset—check it out here!
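For a taste of what the forward algorithm does, here is a minimal sketch in plain Python. It is not the notebook's PyTorch code: the function name, the list-of-lists parameter layout, and the toy numbers are all assumptions made for illustration, and it works with raw probabilities, so it is only suitable for short sequences.

```python
def forward_likelihood(pi, A, B, obs):
    """Probability of an observation sequence under a discrete HMM.

    pi[i]   : probability of starting in state i
    A[i][j] : probability of moving from state i to state j
    B[i][o] : probability of emitting symbol o in state i
    obs     : non-empty list of observed symbol indices
    """
    n = len(pi)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    # Marginalize over the final state.
    return sum(alpha)

# Toy two-state, two-symbol model (made-up numbers).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
B = [[0.9, 0.1], [0.3, 0.7]]
likelihood = forward_likelihood(pi, A, B, [0, 1, 1])
```

A practical implementation would usually run the same recursion on log probabilities to avoid underflow on long sequences.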

An introduction to sequence-to-sequence learning


Many interesting problems in artificial intelligence can be described in the following way:

Map a sequence of inputs $\mathbf{x}$ to the correct sequence of outputs $\mathbf{y}$.


Neural Offset Min-Sum Decoding

ISIT, 2017

This paper is about neural offset min-sum decoding (NOMS), a generalization of the offset min-sum algorithm used in practical channel decoders for LDPC codes.
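For readers unfamiliar with the baseline, the check node update of conventional offset min-sum can be sketched as follows. This shows the classical algorithm with a single fixed offset, not the neural generalization from the paper; the function name, the argument `beta`, and the example values are assumptions made for illustration.

```python
def offset_min_sum_check_update(incoming, beta):
    """Offset min-sum update for one check node.

    incoming : list of variable-to-check messages (LLRs), one per edge,
               with at least two edges
    beta     : non-negative offset subtracted from the magnitude
    Returns the check-to-variable message for each edge.
    """
    outgoing = []
    for i in range(len(incoming)):
        # Each outgoing message depends on all the *other* incoming edges.
        others = incoming[:i] + incoming[i + 1:]
        sign = 1.0
        for m in others:
            if m < 0:
                sign = -sign
        # Min-sum magnitude, reduced by the offset and clipped at zero.
        magnitude = max(min(abs(m) for m in others) - beta, 0.0)
        outgoing.append(sign * magnitude)
    return outgoing

messages = offset_min_sum_check_update([2.0, -1.0, 3.0], beta=0.5)
# messages == [-0.5, 1.5, -0.5]
```

In this sketch `beta` is one hand-picked constant shared by every edge and iteration.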

Recommended citation: L. Lugosch and W. J. Gross, "Neural offset min-sum decoding," 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, 2017, pp. 1361-1365.

Deep Learning Methods for Improved Decoding of Linear Codes

IEEE Journal of Selected Topics in Signal Processing, 2018

A collaboration between researchers at Tel-Aviv University and McGill University describing a family of neural belief propagation algorithms for channel decoding.

Recommended citation: E. Nachmani, E. Marciano, L. Lugosch, W. J. Gross, D. Burshtein and Y. Be’ery, "Deep learning methods for improved decoding of linear codes," in IEEE Journal of Selected Topics in Signal Processing, special issue on "Machine Learning for Cognition in Radio Communications and Radar", vol. 12, no. 1, pp. 119-131, Feb. 2018.

Learning Algorithms for Error Correction

Master's thesis, 2018

Channel coding enables reliable communication over unreliable, noisy channels: by encoding messages with redundancy, it is possible to decode the messages in such a way that errors introduced by the channel are corrected. Modern channel codes achieve very low error rates at long block lengths, but long blocks are often not acceptable for low-latency applications. While there exist short block codes with excellent error-correction performance when decoded optimally, designing practical, low-complexity decoding algorithms that can achieve close-to-optimal results for short codes is still an open problem. In this thesis, we explore an approach to decoding short block codes in which the decoder is recast as a machine learning algorithm. After providing the background concepts on error-correcting codes and machine learning, we review the literature on learning algorithms for error correction, with a special emphasis on the recently introduced “neural belief propagation” algorithm. We then describe a set of modifications to neural belief propagation which improve its performance and reduce its implementation complexity. We also propose a new syndrome-based output layer for neural error-correcting decoders which takes the code structure into account during training to yield decoders with lower frame error rate. Finally, we suggest some future work.

Recommended citation: L. Lugosch, “Learning algorithms for error correction”, Master's thesis, McGill University, 2018.

Tone Recognition Using Lifters and CTC

Interspeech, 2018

An acoustic model for tonal languages that achieves better tone recognition performance by applying convolutional lifters to the cepstrogram representation of the input speech signal and using CTC to map inputs to outputs.

Recommended citation: L. Lugosch and V. S. Tomar, “Tone recognition using lifters and CTC”, Interspeech, Hyderabad, India, pp. 2305-2309, September 2018.

Learning from the Syndrome

Asilomar, 2018

Use a differentiable relaxation of the syndrome as the loss function for training neural network channel decoders, improving frame error rate and enabling online unsupervised learning. (Invited paper.)
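To make the idea concrete, the sketch below contrasts the ordinary (hard) syndrome with one possible soft version computed from log-likelihood ratios. The summary above does not spell out the relaxation used in the paper, so everything here is an assumption made for illustration: the min-sum style soft parity, the hinge penalty, the convention that a positive LLR favors bit 0, and all function names.

```python
def hard_syndrome(H, bits):
    """Syndrome H * bits (mod 2); all zeros means every parity check passes."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]

def soft_syndrome(H, llrs):
    """Assumed relaxation: one signed reliability per parity check.

    Positive means the check looks satisfied, negative means violated.
    Each row of H is assumed to contain at least one 1.
    """
    out = []
    for row in H:
        vals = [l for h, l in zip(row, llrs) if h]
        sign = 1.0
        for v in vals:
            if v < 0:
                sign = -sign
        out.append(sign * min(abs(v) for v in vals))
    return out

def syndrome_loss(H, llrs):
    """Hinge penalty that is zero when every check is confidently satisfied."""
    return sum(max(1.0 - s, 0.0) for s in soft_syndrome(H, llrs))

H = [[1, 1, 0], [0, 1, 1]]                      # toy parity-check matrix
good = syndrome_loss(H, [2.0, 3.0, 1.5])        # 0.0, both checks satisfied
bad = syndrome_loss(H, [2.0, -3.0, 1.5])        # 5.5, both checks violated
```

The loss depends only on the decoder output and the parity-check matrix, not on the transmitted codeword, which is what makes the unsupervised setting mentioned above possible in principle.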

Recommended citation: L. Lugosch and W. J. Gross, "Learning from the syndrome," in Asilomar Conference on Signals, Systems, and Computers, special session on "Machine Learning for Wireless Systems", Oct. 2018.

DONUT: CTC-based Query-by-Example Keyword Spotting

NeurIPS IRASL Workshop, 2018

Train your device to wake up for any phrase you want by recording the phrase three times, estimating the label sequence using a beam search, and computing the log probability of the label sequence at test time using the forward algorithm.
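The scoring step can be sketched with the standard CTC forward recursion, shown below in plain Python. This is not the DONUT code: the function name, the `probs[t][k]` layout, and the blank index of 0 are assumptions made for illustration, and the sketch uses raw probabilities where a real system would work in the log domain.

```python
def ctc_label_probability(probs, labels, blank=0):
    """Total probability of all CTC alignments that collapse to `labels`.

    probs  : probs[t][k] = model probability of symbol k at frame t
             (at least one frame)
    labels : target label sequence, without blanks
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    S = len(ext)

    # An alignment may start in the first blank or the first label.
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for frame in probs[1:]:
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                    # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]           # advance by one position
            # Skip over a blank, allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * frame[ext[s]]
        alpha = new

    # An alignment may end in the last label or the trailing blank.
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)

# Two frames, uniform over {blank, label 1}: the alignments "1 _", "_ 1"
# and "1 1" each have probability 0.25, so the total is 0.75.
p = ctc_label_probability([[0.5, 0.5], [0.5, 0.5]], [1])
```

Taking the logarithm of this value gives the kind of score described above, which can then be compared against a detection threshold.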

Recommended citation: L. Lugosch, S. Myer, and V. S. Tomar, “DONUT: CTC-based query-by-example keyword spotting”, NeurIPS Workshop on Interpretability and Robustness in Audio, Speech, and Language, Montreal, Canada, December 2018.