Research

Research Interests

I am currently interested in nonlinear optimization with applications to machine learning. Most machine learning and statistical learning problems may be posed as large-scale stochastic optimization problems over large datasets, where the size of the data constrains our ability to solve the problem efficiently. In addition, these methods must discover solutions corresponding to models that generalize well to new data. This motivates the development of new learning algorithms that admit solutions that generalize well, exploit modern computer architectures and parallel programming effectively, and scale to increasingly large datasets. By developing better algorithms for learning from massive amounts of data, I hope to design better intelligent systems for tasks such as image and speech recognition and image and video recovery.

Current Projects

Optimization for Deep Learning

Stochastic gradient descent (SGD) is often considered the prototypical optimization algorithm for machine learning applications due to its computational efficiency and the generalizability of its solutions. SGD and its variants have been particularly effective in training deep neural networks, which have spurred many recent advances in artificial intelligence. I am interested in the intersection between deep learning and optimization: why do certain optimization algorithms learn and generalize better than others? Can we develop better optimization algorithms for deep learning?

Coordinate Descent Methods
