Reza Davari PhD Student @ Mila and Concordia University AI enthusiast, currently focused on NLP, CV, and CL. A watermelon connoisseur. Based in Montreal, QC.

Variational Auto Encoders (VAE)

Github Report

This is a report on the development of a VAE model and the experiments performed on it to better understand its functionality and properties. The first two sections are dedicated to the development of the model and the design choices that came with it. Section 1 gives a general overview of the architecture. Section 2 demonstrates the different methods that can be used to increase the feature map size, analyzed both visually and mathematically. Section 3 explores the properties of the VAE and compares the vanilla VAE with IWAE. Finally, section 4 evaluates the model quantitatively to allow future comparisons.
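As a rough illustration of the VAE versus IWAE comparison, the sketch below computes both bounds from the same set of log importance weights, log w_k = log p(x, z_k) - log q(z_k | x). The function names and the example weights are hypothetical and are not taken from the report. This is a minimal sketch of the standard estimators, not the report's implementation.

```python
import math

def log_mean_exp(values):
    # Numerically stable log of the average of exp(values).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def elbo_estimate(log_weights):
    # Vanilla VAE bound: average of the log importance weights.
    return sum(log_weights) / len(log_weights)

def iwae_estimate(log_weights):
    # IWAE bound: log of the average importance weight.
    return log_mean_exp(log_weights)

# Hypothetical log weights for K = 3 samples of z drawn from q(z | x).
example_log_weights = [-1.0, -2.0, -3.0]
elbo = elbo_estimate(example_log_weights)
iwae = iwae_estimate(example_log_weights)
```

By Jensen's inequality the IWAE estimate is never below the ELBO estimate computed from the same weights, and the two coincide when K = 1.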

Reproducing: On The Convergence of ADAM and Beyond

Github Report

This is a study based on the paper On The Convergence of ADAM and Beyond. In this report we present a short summary of the paper and reproduce the results of the experiments shown in it. In addition to these experiments, we designed an experiment of our own and investigated the behaviour of the method proposed by the paper in a different setting.
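The method proposed by the paper is AMSGrad, which differs from ADAM by keeping a running maximum of the second-moment estimate. The sketch below shows one scalar update step without bias correction. The function name, the state dictionary, and the default hyperparameters are assumptions made for illustration and do not come from the report.

```python
import math

def amsgrad_step(x, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update for a scalar parameter (no bias correction)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * state["m"] + (1 - beta1) * grad
    v = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Key difference from ADAM: the denominator uses the running maximum
    # of v, so the effective step size cannot grow over time.
    v_hat = max(state["v_hat"], v)
    new_state = {"m": m, "v": v, "v_hat": v_hat}
    new_x = x - lr * m / (math.sqrt(v_hat) + eps)
    return new_x, new_state

# Hypothetical usage on f(x) = x**2, whose gradient is 2 * x.
x = 5.0
state = {"m": 0.0, "v": 0.0, "v_hat": 0.0}
for _ in range(3):
    x, state = amsgrad_step(x, 2 * x, state)
```

Replacing `v_hat` with `v` in the denominator gives the corresponding ADAM step (again without bias correction), which makes it easy to compare the two side by side.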

Reinforcement Learning in Sports: Cricket

Github Report

Reinforcement learning has already proven itself in two-player strategy games such as chess and Go, which require sequential decision making. Cricket is also a game between two high-level entities (teams) and involves sequential decision making under uncertainty. Each team has to adapt its strategy to the situation it is facing. At its core, the game can be cast as a similar reinforcement learning problem in which the teams are modeled as agents. In this project we aim to find the optimal strategy for the two teams playing against each other, using reinforcement learning techniques. If you are not familiar with cricket, you can read more about the game here.

General Value Functions and Successor Representation

Github Report

This is a report on the Successor Representation (SR). It briefly discusses the relation between SR and General Value Functions, then explores the properties of SR and its benefits over vanilla TD(0) methods. All experiments in this study are run on the FrozenLake environment, with slight modifications that are described in the report.
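For readers unfamiliar with SR, a minimal tabular sketch of the usual TD(0)-style update is shown below. It assumes a fixed policy, states indexed from 0, and a matrix `M` where `M[s][j]` estimates the expected discounted number of future visits to state `j` when starting in `s`. The function names and hyperparameters are hypothetical and are not taken from the report.

```python
def sr_td_update(M, s, s_next, alpha=0.1, gamma=0.95):
    """Move row s of the SR matrix toward one_hot(s) + gamma * M[s_next]."""
    for j in range(len(M)):
        indicator = 1.0 if j == s else 0.0
        target = indicator + gamma * M[s_next][j]
        M[s][j] += alpha * (target - M[s][j])
    return M

def value_from_sr(M, rewards, s):
    """Recover V(s) as the dot product of the SR row and the reward vector."""
    return sum(m_sj * r_j for m_sj, r_j in zip(M[s], rewards))

# Hypothetical 3-state example with one observed transition 0 -> 1.
M = [[0.0] * 3 for _ in range(3)]
sr_td_update(M, s=0, s_next=1, alpha=0.5)
```

Because the value is a dot product of the SR row with the reward vector, a change in the rewards only requires updating `rewards`, not relearning `M`. This is one commonly cited advantage of SR over learning values directly with TD(0).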

Policy Gradient Methods in Reinforcement Learning

Github Report

This is a report on the use of policy gradient methods in Reinforcement Learning. The focus is an experimental comparison of 3 types of Actor-Critic methods, mainly investigating the effect of eligibility traces. The methods are as follows:

  1. Actor-Critic with Eligibility Traces
  2. Actor-Critic with Eligibility Traces only on the Critic but not the Actor
  3. Actor-Critic Without any Eligibility Traces using one-step returns

The experiments are carried out on the FrozenLake environment.
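One way to express the three variants listed above is a single tabular update with separate trace-decay parameters for the actor and the critic. The sketch below assumes a softmax policy over tabular action preferences and accumulating traces. All names, the `gamma_t` argument (the accumulated discount used in the episodic textbook form), and the default hyperparameters are assumptions for illustration, not the report's code.

```python
import math

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_step(theta, w, z_theta, z_w, s, a, r, s_next, done,
                      gamma_t=1.0, alpha_theta=0.1, alpha_w=0.1,
                      gamma=0.99, lam_theta=0.9, lam_w=0.9):
    """Update tabular actor (theta) and critic (w) in place; return the TD error."""
    v_next = 0.0 if done else w[s_next]
    delta = r + gamma * v_next - w[s]

    # Critic trace: decay everything, then add the gradient of v(s),
    # which is a one-hot vector in the tabular case.
    for i in range(len(z_w)):
        z_w[i] *= gamma * lam_w
    z_w[s] += 1.0

    # Actor trace: decay everything, then add grad log pi(a | s),
    # which for a softmax policy is one_hot(a) - pi(. | s).
    pi = softmax(theta[s])
    for row in z_theta:
        for b in range(len(row)):
            row[b] *= gamma * lam_theta
    for b in range(len(pi)):
        z_theta[s][b] += gamma_t * ((1.0 if b == a else 0.0) - pi[b])

    # Move both sets of parameters along their traces, scaled by the TD error.
    for i in range(len(w)):
        w[i] += alpha_w * delta * z_w[i]
    for i in range(len(theta)):
        for b in range(len(theta[i])):
            theta[i][b] += alpha_theta * delta * z_theta[i][b]
    return delta
```

Under these assumptions, method 1 uses nonzero `lam_theta` and `lam_w`, method 2 sets `lam_theta=0` while keeping `lam_w` nonzero, and method 3 sets both to 0, which reduces the update to one-step Actor-Critic. The traces `z_theta` and `z_w` should be reset to zero at the start of each episode.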

Neural Turing Machines

Github Report

This is a report on the implementation of the paper Neural Turing Machines. The paper presents the idea of Neural Turing Machines (NTM) and the results of using NTM on a few tasks. However, it does not provide the details needed to reproduce those results. In this report we try to fill in those gaps and provide an analysis of the experiments we performed while implementing NTM. For this report we chose to implement the NTM on the copy task only.

Function Approximation in Reinforcement Learning

Github Report

This is a report on the use of function approximation in Reinforcement Learning. In this report, we implement Kernel Based Reinforcement Learning (KBRL) as explained in Ormoneit and Sen (2002), using a nearest neighbor measure where the neighbors are weighted via a Gaussian kernel. The report assumes the reader has already read Ormoneit and Sen (2002) and is familiar with the concepts introduced there. It therefore has no introduction section and may be hard to follow otherwise, so reading the paper first is highly recommended.
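The Gaussian weighting of nearest neighbors can be sketched as follows. This is a minimal illustration, assuming Euclidean distance, a fixed bandwidth, and weights normalized to sum to one over the `k` nearest stored samples. The function names, the `bandwidth` and `k` parameters, and the example data are hypothetical and are not taken from the report.

```python
import math

def gaussian_kernel_weights(query, samples, bandwidth=1.0, k=None):
    """Return {sample_index: weight} over the k nearest samples to query."""
    # Squared Euclidean distance from the query state to each stored state.
    d2 = [sum((q - x) ** 2 for q, x in zip(query, s)) for s in samples]
    nearest = sorted(range(len(samples)), key=lambda i: d2[i])
    if k is not None:
        nearest = nearest[:k]
    # Subtract the smallest distance before exponentiating so that the
    # closest neighbor gets raw weight 1 and the sum never underflows to 0.
    d2_min = d2[nearest[0]]
    raw = {i: math.exp(-(d2[i] - d2_min) / (2 * bandwidth ** 2)) for i in nearest}
    total = sum(raw.values())
    return {i: weight / total for i, weight in raw.items()}

def kernel_average(weights, targets):
    """Kernel-weighted average of per-sample targets."""
    return sum(weight * targets[i] for i, weight in weights.items())

# Hypothetical 1-D states; targets stand in for r_i + gamma * V(s'_i).
states = [(0.0,), (1.0,), (10.0,)]
targets = [1.0, 3.0, 100.0]
weights = gaussian_kernel_weights((0.0,), states, bandwidth=1.0, k=2)
estimate = kernel_average(weights, targets)
```

In a KBRL-style backup, the targets would be computed from the stored transitions for a given action, and the kernel average would serve as the action-value estimate at the query state.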

Road Crossing Aid for Visually Impaired

Github Report

There are over 37 million people across the globe who are visually impaired. Out of this population, over 15 million are from India. Living in an underdeveloped country with little infrastructure for the disabled can be very harsh and isolating. We wanted to make a difference and not let their disabilities hinder their day-to-day life. This motivation drove us to develop the project presented here. In this project, we designed, built, and tested an application that uses AI to guide the visually impaired in crossing the street. In our first attempt at solving this problem, we consider the case of zebra crossings. A demo of the application running on the Android platform can be seen in this video.