Ajay Subramanian, Sharad Chitlangia, Veeky Baths. (2021). Reinforcement Learning and its Connections with Neuroscience and Psychology [Elsevier Neural Networks]

Reinforcement learning methods have recently been very successful in complex sequential tasks like playing Atari games, Go and Poker. Through minimal input from humans, these algorithms are able to learn to perform complex tasks from scratch, just through interaction with their environment. While there certainly has been considerable independent innovation in the area, many core ideas in RL are inspired by animal learning and psychology. Moreover, these algorithms are now helping advance neuroscience research by serving as a computational model for many characteristic features of brain functioning. In this context, we review a number of findings that establish evidence of key elements of the RL problem and solution being represented in regions of the brain.

Vedant Shah, Anmol Agarwal, Tanmay Tulsidas Verlekar, Raghavendra Singh. (2021). Adapting Deep Learning Models for Pedestrian-Detection to Low-Light Conditions without Re-training [Accepted at the 1st TradiCV Workshop, ICCV 2021]

Neuromorphic computing is emerging as a disruptive computational paradigm that attempts to emulate various facets of the underlying structure and functionalities of the brain in the algorithm and hardware design of next-generation machine learning platforms. This work goes beyond the focus of current neuromorphic computing architectures on computational models for neuron and synapse to examine other computational units of the biological brain that might contribute to cognition and especially self-repair. We draw inspiration and insights from computational neuroscience regarding functionalities of glial cells and explore their role in the fault-tolerant capacity of Spiking Neural Networks (SNNs) trained in an unsupervised fashion using Spike-Timing Dependent Plasticity (STDP). We characterize the degree of self-repair that can be enabled in such networks with varying degrees of faults, ranging from 50% to 90%, and evaluate our proposal on the MNIST and Fashion-MNIST datasets.

Ashwin Vaswani*, Rijul Ganguly*, Het Shah*, Sharan Ranjit S.*, Shrey Pandit, Samruddhi Bothara. (2020). Whatif Challenge: An Autoencoder Based Approach to Simulate Football Games [Accepted at 7th Workshop on Machine Learning and Data Mining for Sports Analytics 2020]

Propaganda spreads the ideology and beliefs of like-minded people, brainwashing their audiences, and sometimes leading to violence. SemEval 2020 Task-11 aims to design automated systems for news propaganda detection. Task-11 consists of two sub-tasks, namely, Span Identification - given any news article, the system tags those specific fragments which contain at least one propaganda technique; and Technique Classification - correctly classify a given propagandist statement amongst 14 propaganda techniques. For sub-task 1, we use contextual embeddings extracted from pre-trained transformer models to represent the text data at various granularities and propose a multi-granularity knowledge sharing approach. For sub-task 2, we use an ensemble of BERT and logistic regression classifiers with linguistic features. Our results reveal that linguistic features are strong indicators for covering minority classes in a highly imbalanced dataset.

In this paper, we assess the ability of BERT and its derivative models (RoBERTa, DistilBERT, and ALBERT) for short-edits based humor grading. We test these models for humor grading and classification tasks on the Humicroedit and the FunLines dataset. We perform extensive experiments with these models to test their language modeling and generalization abilities via zero-shot inference and cross-dataset inference based approaches. Further, we also inspect the role of self-attention layers in humor-grading by performing a qualitative analysis over the self-attention weights from the final layer of the trained BERT model. Our experiments show that all the pre-trained BERT derivative models show significant generalization capabilities for humor-grading related tasks.

In this paper, we describe an approach for modelling causal reasoning in natural language by detecting counterfactuals in text using multi-head self-attention weights. We use pre-trained transformer models to extract contextual embeddings and self-attention weights from the text. We show the use of convolutional layers to extract task-specific features from these self-attention weights. Further, we describe a fine-tuning approach with a common base model for knowledge sharing between the two closely related sub-tasks for counterfactual detection. We analyze and compare the performance of various transformer models in our experiments. Finally, we perform a qualitative analysis with the multi-head self-attention weights to interpret our models' dynamics.

Srivatsan Krishnan*, Sharad Chitlangia*, Maximilian Lam*, Zishen Wan, Alexandra Faust, Vijay Janapa Reddi. (2020). Quantized Reinforcement Learning [Accepted at ReCoML Workshop, MLSys 2020]

Recent work has shown that quantization can help reduce the memory, compute, and energy demands of deep neural networks without significantly harming their quality. However, whether these prior techniques, applied traditionally to image-based models, work with the same efficacy to the sequential decision making process in reinforcement learning remains an unanswered question. To address this void, we conduct the first comprehensive empirical study that quantifies the effects of quantization on various deep reinforcement learning policies with the intent to reduce their computational resource demands. We apply techniques such as post-training quantization and quantization aware training to a spectrum of reinforcement learning tasks (such as Pong, Breakout, BeamRider and more) and training algorithms (such as PPO, A2C, DDPG, and DQN). Across this spectrum of tasks and learning algorithms, we show that policies can be quantized to 6-8 bits of precision without loss of accuracy. We also show that certain tasks and reinforcement learning algorithms yield policies that are more difficult to quantize due to their effect of widening the models' distribution of weights and that quantization aware training consistently improves results over post-training quantization and oftentimes even over the full precision baseline. Finally, we demonstrate real-world applications of quantization for reinforcement learning. We use half-precision training to train a Pong model 50% faster, and we deploy a quantized reinforcement learning based navigation policy to an embedded system, achieving an 18× speedup and a 4× reduction in memory usage over an unquantized policy.

Ajay Subramanian*, Rajaswa Patil*, Veeky Baths. (2019). Word2Brain2Image: Visual Reconstruction from Spoken Word Representations [Accepted for Poster Presentation, ACCS 2019]

Recent work in cognitive neuroscience has aimed to better understand how the brain responds to external stimuli. Extensive study is being done to gauge the involvement of various regions of the brain in the processing of external stimuli. A study by Ostarek et al. has produced experimental evidence of the involvement of low-level visual representations in spoken word processing, using Continuous Flash Suppression (CFS). For example, hearing the word ‘car’ induces a visual representation of a car in extrastriate areas of the visual cortex that seems to have a spatial resolution of some kind. Though the structure of these areas of the brain has been extensively studied, research hasn’t really delved into the functional aspects. In this work, we aim to take this a step further by experimenting with generative models such as Variational Autoencoders (VAEs) (Kingma et al. 2013) and Generative Adversarial Networks (GANs) (Goodfellow et al. 2014) to generate images purely from the EEG signals induced by listening to spoken words of objects.

Rajaswa Patil*, Siddhant Mahurkar. (2019). Citta: A Lite Semantic Recommendation Framework for Digital Libraries [Best Student Poster Award, KEDL 2019]

Most of the recommendation and search frameworks in Digital Libraries follow a keyword-based approach to resolve text-based search queries. Keyword-based methods usually fail to capture the semantic aspects of the user’s query and often lead to a misleading set of results. In this work, we propose an efficient and content-sentiment aware semantic recommendation framework, Citta. The framework is designed with the BERT language model. It is designed to retrieve semantically related reading recommendations with short input queries and shorter response times. We test the proposed framework on the CMU Book Summary Dataset and discuss the observed advantages and shortcomings of the framework.

Souradeep Chakroborty. (2019). Capturing financial markets to apply deep reinforcement learning [Accepted at 9th India Finance Conference held at IIM-A]

In this paper we explore the usage of deep reinforcement learning algorithms to automatically generate consistently profitable, robust, uncorrelated trading signals in any general financial market. In order to do this, we present a novel Markov decision process (MDP) model to capture the financial trading markets. We review and propose various modifications to existing approaches and explore different techniques like the usage of technical indicators, to succinctly capture the market dynamics to model the markets. We then go on to use deep reinforcement learning to enable the agent (the algorithm) to learn how to take profitable trades in any market on its own, while suggesting various methodology changes and leveraging the unique representation of the FMDP (financial MDP) to tackle the primary challenges faced in similar works. Through our experimentation results, we go on to show that our model could be easily extended to two very different financial markets and generates a positively robust performance in all conducted experiments.


Recent work has shown that distributed word representations can encode abstract semantic and syntactic information from child-directed speech. In this paper, we use diachronic distributed word representations to perform temporal modeling and analysis of lexical development in children. Unlike all previous work, we use a temporally sliced speech corpus to learn distributed word representations of child and child-directed speech. Through our modeling experiments, we demonstrate the dynamics of growing lexical knowledge in children over time, as compared against a saturated level of lexical knowledge in child-directed adult speech. We also fit linear mixed-effects models with the rate of semantic change in the diachronic representations and word frequencies. This allows us to inspect the role of word frequencies towards lexical development in children. Further, we perform a qualitative analysis of the diachronic representations from our model, which reveals the categorization and word associations in the mental lexicon of children.

We present a collated set of algorithms to obtain objective measures of synchronisation in brain time-series data. The algorithms are implemented in MATLAB; we refer to our collated set of 'tools' as SyncBox. Our motivation for SyncBox is to understand the underlying dynamics in an existing population neural network, commonly referred to as neural mass models, that mimic Local Field Potentials of the visual thalamic tissue. Specifically, we aim to measure the phase synchronisation objectively in the model response to periodic stimuli; this is to mimic the condition of Steady-state-visually-evoked-potentials (SSVEP), which are scalp Electroencephalograph (EEG) corresponding to periodic stimuli. We showcase the use of SyncBox on our existing neural mass model of the visual thalamus. Following our successful testing of SyncBox, it is currently being used for further research on understanding the underlying dynamics in enhanced neural networks of the visual pathway.

Het Shah, Avishree Khare*, Neelay Shah*, Khizir Siddiqui*. (2020). KD-Lib: A PyTorch library for Knowledge Distillation, Pruning and Quantization [arXiv]

In recent years, the growing size of neural networks has led to a vast amount of research concerning compression techniques to mitigate the drawbacks of such large sizes. Most of these research works can be categorized into three broad families: Knowledge Distillation, Pruning, and Quantization. While there has been steady research in this domain, adoption and commercial usage of the proposed techniques have not progressed at the same rate. We present KD-Lib, an open-source PyTorch based library, which contains state-of-the-art modular implementations of algorithms from the three families on top of multiple abstraction layers. KD-Lib is model and algorithm-agnostic, with extended support for hyperparameter tuning using Optuna and Tensorboard for logging and monitoring. The library can be found at - Lib