Publications

To Each Optimizer a Norm, To Each Norm its Generalization. S. Vaswani, R. Babanezhad, J. Gallego, A. Mishkin, S. Lacoste-Julien, N. Le Roux. arXiv Preprint, 2020. [arXiv]
Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates. S. Vaswani, A. Mishkin, I. Laradji, M. Schmidt, G. Gidel, S. Lacoste-Julien. NeurIPS, 2019. [arXiv] [code] [video]
SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient. A. Mishkin, F. Kunstner, D. Nielsen, M. Schmidt, M. E. Khan. NeurIPS, 2018. [arXiv] [code] [video]
Web ValueCharts: Analyzing Individual and Group Preferences with Interactive, Web-based Visualizations. A. Mishkin. Review of Undergraduate Computer Science, 2018. [pdf]

Talks

Better Optimization via Interpolation: a short interview at Caltech on interpolation and the Armijo line-search. [slides]
Painless SGD: A longer version of the same talk for a research exchange with the PLAI lab. [slides] [src]
Painless SGD: Slides from a video for MLSS 2020. [slides] [video] [src]

Instrumental Variables, DeepIV, and Forbidden Regressions: learning to evaluate counterfactuals via instrumental variables. Talk for MLRG 2019W2. [slides] [src]
Why Does Deep Learning Work? An intuitive outline of the role "implicit regularization" plays in deep neural networks. Introductory talk for MLRG 2019W1. [slides] [src]
Generative Adversarial Networks: an intro from the perspective of GANs as probabilistic models with intractable density functions. Talk for MLRG 2018W2. [slides] [src]
Standard and Natural Policy Gradients for Discounted Rewards: an intro to policy-gradient algorithms. Integral-heavy. Talk for MLRG 2018W1. [slides] [src]

CUCSC 2017: Web ValueCharts: Exploring Individual and Group Preferences Through Interactive Web-based Visualizations.
MURC 2017: Web ValueCharts: Supporting Decision Makers with Interactive, Web-Based Visualizations. [slides]