Zhenxun Zhuang

Machine Learning

Hello! I joined Meta as a Research Scientist after obtaining my Ph.D. in Computer Science at Boston University, where I was a member of the Optimization and Machine Learning Lab advised by Professor Francesco Orabona. My research focuses on designing algorithms for non-convex optimization problems, with a special interest in Stochastic Gradient Descent (SGD) and its many variants. Besides proving theoretical guarantees for these algorithms, I am also passionate about investigating their empirical performance in fields like deep learning.

  • Email: oldboymls [at] gmail dot com
Download CV

Education.

  • 2018-2022

    Boston University, MA, U.S.

    Ph.D. in Computer Science

    Dissertation: Adaptive Strategies in Non-convex Optimization

    Adviser: Francesco Orabona

  • 2016-2018

    Stony Brook University, NY, U.S.

    Ph.D. in Computer Science

    Adviser: Francesco Orabona

  • 2012-2016

    University of Science and Technology of China
    Hefei, Anhui, China

    B.Eng. in Electronic Information Engineering

    Thesis: Prediction & Transform Combined Intra Coding in HEVC

    Adviser: Feng Wu

[Photo: Boston University Castle]

Publications.

Robustness to Unbounded Smoothness of Generalized SignSGD

Michael Crawshaw, Mingrui Liu, Francesco Orabona, Wei Zhang, Zhenxun Zhuang (alphabetical order).

Conference on Neural Information Processing Systems, November, 2022.

Paper        Code

A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks

Mingrui Liu, Zhenxun Zhuang, Yunwei Lei, Chunyang Liao.

Conference on Neural Information Processing Systems, November, 2022.

Paper        Code

Understanding AdamW through Proximal Methods and Scale-freeness.

Zhenxun Zhuang, Mingrui Liu, Ashok Cutkosky, and Francesco Orabona.

Transactions on Machine Learning Research, August, 2022.

Paper        Code

A Second look at Exponential and Cosine Step Sizes: Simplicity, Convergence, and Performance

Xiaoyu Li*, Zhenxun Zhuang*, Francesco Orabona (*equal contribution).

Proceedings of the 38th International Conference on Machine Learning, July, 2021

Paper        Code

No-regret Non-convex Online Meta-learning.

Zhenxun Zhuang, Yunlong Wang, Kezi Yu, and Songtao Lu.

Proceedings of the 45th International Conference on Acoustics, Speech, and Signal Processing, May, 2020

Paper

Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization

Zhenxun Zhuang, Ashok Cutkosky, and Francesco Orabona.

Proceedings of the 36th International Conference on Machine Learning, June, 2019

Paper        Code

Preprints and Workshops.

Online Meta-learning on Non-convex Setting.

Zhenxun Zhuang, Kezi Yu, Songtao Lu, Lucas Glass, Yunlong Wang.

NeurIPS Workshop on Meta-Learning (MetaLearn 2019), December, 2019

Paper

Projects.

Generalized SignSGD

PyTorch implementation of the paper: Robustness to Unbounded Smoothness of Generalized SignSGD.

Traditional analyses in non-convex optimization typically rely on the smoothness assumption, namely requiring the gradients to be Lipschitz. However, recent evidence shows that this smoothness condition does not capture the properties of some deep learning objective functions, including those involving Recurrent Neural Networks and LSTMs. Instead, they satisfy a much more relaxed condition, with potentially unbounded smoothness. Under this relaxed assumption, it has been shown, both theoretically and empirically, that gradient-clipped SGD has an advantage over vanilla SGD. In this paper, we show that clipping is not indispensable for Adam-type algorithms in such scenarios: we theoretically prove that a generalized SignSGD algorithm can obtain convergence rates similar to those of SGD with clipping, but without needing explicit clipping at all. On one end, this family of algorithms recovers SignSGD; on the other, it closely resembles the popular Adam algorithm. Our analysis underlines the critical role that momentum plays in analyzing SignSGD-type and Adam-type algorithms: it not only reduces the effects of noise, thus removing the need for the large mini-batches used in previous analyses of SignSGD-type algorithms, but it also substantially reduces the effects of unbounded smoothness and gradient norms. We also compare these algorithms with popular optimizers on a set of deep learning tasks, observing that we can match the performance of Adam while beating the others.
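
To make the interpolation described above concrete, here is a minimal PyTorch-style sketch of a single momentum-normalized step in the spirit of the paper. It is my own simplification, not the reference implementation linked below; the function name, hyperparameter defaults, and the exact normalization are assumptions.

import torch

def generalized_signsgd_step(params, grads, state, lr=1e-3,
                             beta1=0.9, beta2=0.999, eps=1e-8):
    # Illustrative sketch only: keep a momentum buffer m and a second-moment
    # buffer v for each parameter, and move along m / (sqrt(v) + eps).
    # If v happened to equal m * m exactly, the step would reduce (ignoring
    # eps) to lr * sign(m), i.e., SignSGD with momentum; with the Adam-style
    # v below, the step closely resembles an Adam update. In neither case is
    # any explicit gradient clipping applied.
    with torch.no_grad():
        for p, g in zip(params, grads):
            if p not in state:
                state[p] = (torch.zeros_like(p), torch.zeros_like(p))
            m, v = state[p]
            m.mul_(beta1).add_(g, alpha=1 - beta1)          # first moment
            v.mul_(beta2).addcmul_(g, g, value=1 - beta2)   # second moment
            p.add_(m / (v.sqrt() + eps), alpha=-lr)         # normalized step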

Check out my project.

Distributed SGDClipGrad

PyTorch implementation of the paper: A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks.

In distributed training of deep neural networks, people usually run Stochastic Gradient Descent (SGD) or its variants on each machine and communicate with other machines periodically. However, SGD might converge slowly when training some deep neural networks (e.g., RNNs, LSTMs) because of the exploding gradient issue. Gradient clipping is usually employed to address this issue in the single-machine setting, but exploring this technique in the distributed setting is still in its infancy: it remains mysterious whether the gradient clipping scheme can take advantage of multiple machines to enjoy parallel speedup. The main technical difficulty lies in dealing with a nonconvex loss function, a non-Lipschitz-continuous gradient, and skipped communication rounds simultaneously. In this paper, we explore a relaxed-smoothness assumption on the loss landscape, which LSTMs were shown to satisfy in previous works, and design a communication-efficient gradient clipping algorithm. This algorithm can be run on multiple machines, where each machine employs a gradient clipping scheme and communicates with the other machines only after multiple steps of gradient-based updates. Our algorithm is proved to have O(1/(N ε^4)) iteration complexity and O(1/ε^3) communication complexity for finding an ε-stationary point in the homogeneous data setting, where N is the number of machines. This indicates that our algorithm enjoys linear speedup and reduced communication rounds. Our proof relies on novel analysis techniques for estimating truncated random variables, which we believe are of independent interest. Our experiments on several benchmark datasets and various scenarios demonstrate that our algorithm indeed exhibits fast convergence in practice, thus validating our theory.
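
As a rough illustration of the local-update-then-average pattern just described, below is a hedged PyTorch sketch. It is my own simplification rather than the paper's exact algorithm: torch.distributed is assumed to be already initialized, and data_iter, loss_fn, and the hyperparameter names are placeholders.

import torch
import torch.distributed as dist

def clipped_local_sgd(model, data_iter, loss_fn, num_steps,
                      sync_every=8, lr=0.1, clip_norm=1.0):
    # Each worker runs clipped SGD steps on its local data and communicates
    # only every `sync_every` iterations by averaging parameters.
    world_size = dist.get_world_size()
    for t in range(num_steps):
        x, y = next(data_iter)
        model.zero_grad()
        loss_fn(model(x), y).backward()
        # Clip the local stochastic gradient before taking the step.
        torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
        with torch.no_grad():
            for p in model.parameters():
                p.add_(p.grad, alpha=-lr)
            # Communication round: average the parameters across workers.
            if (t + 1) % sync_every == 0:
                for p in model.parameters():
                    dist.all_reduce(p, op=dist.ReduceOp.SUM)
                    p.div_(world_size)
    return model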

Check out my project.

AdamW and Scale-freeness

PyTorch implementation of the paper: Understanding AdamW through Proximal Methods and Scale-freeness.

Adam has been widely adopted for training deep neural networks due to less hyperparameter tuning and remarkable performance. To improve generalization, Adam is typically used in tandem with a squared l2 regularizer (referred to as Adam-l2). However, even better performance can be obtained with AdamW, which decouples the gradient of the regularizer from the update rule of Adam-l2. Yet, we are still lacking a complete explanation of the advantages of AdamW. In this paper, we tackle this question from both an optimization and an empirical point of view. First, we show how to re-interpret AdamW as an approximation of a proximal gradient method, which takes advantage of the closed-form proximal mapping of the regularizer instead of only utilizing its gradient information as in Adam-l2. Next, we consider the property of "scale-freeness" enjoyed by AdamW and by its proximal counterpart: their updates are invariant to component-wise rescaling of the gradients. We provide empirical evidence across a wide range of deep learning experiments showing a correlation between the problems in which AdamW exhibits an advantage over Adam-l2 and the degree to which we expect the gradients of the network to exhibit multiple scales, thus motivating the hypothesis that the advantage of AdamW could be due to the scale-free updates.
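
The decoupling itself fits in a few lines. Below is a minimal single-tensor sketch, my own illustration rather than the official torch.optim code, contrasting Adam-l2, where the weight-decay term passes through the adaptive 1/sqrt(v) scaling, with AdamW, where the decay acts on the weights directly.

import torch

def adam_like_step(p, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                   eps=1e-8, weight_decay=1e-2, decoupled=True):
    # t is the 1-based iteration count.
    # decoupled=False (Adam-l2): the regularizer's gradient is added to g and
    # then rescaled by 1/sqrt(v) like the rest of the gradient.
    # decoupled=True (AdamW): the decay shrinks the weights directly, outside
    # the adaptive scaling.
    with torch.no_grad():
        if not decoupled:
            g = g + weight_decay * p
        m.mul_(beta1).add_(g, alpha=1 - beta1)
        v.mul_(beta2).addcmul_(g, g, value=1 - beta2)
        m_hat = m / (1 - beta1 ** t)              # bias correction
        v_hat = v / (1 - beta2 ** t)
        if decoupled:
            p.mul_(1 - lr * weight_decay)
        p.add_(m_hat / (v_hat.sqrt() + eps), alpha=-lr)
    return p

In this sketch, multiplying each gradient coordinate by a fixed positive constant leaves the decoupled (AdamW-style) update essentially unchanged, up to the small eps term, whereas the Adam-l2 branch rescales the regularizer's contribution along with everything else; this is the scale-freeness distinction discussed above.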

Check out my project.

SGD with Exponential and Cosine Step Sizes

PyTorch implementation of the paper: A Second look at Exponential and Cosine Step Sizes: Simplicity, Convergence, and Performance.

Stochastic Gradient Descent (SGD) is a popular tool for training large-scale machine learning models. Its performance, however, is highly variable, depending crucially on the choice of the step sizes. Accordingly, a variety of strategies for tuning the step sizes have been proposed. Yet, most of them lack a theoretical guarantee, whereas those backed by theory often do not shine in practice. In this paper, we study two heuristic step size schedules whose power has been repeatedly confirmed in practice: the exponential and the cosine step sizes. We conduct a fair and comprehensive empirical evaluation on real-world datasets with deep learning architectures. The results show that, despite requiring at most two hyperparameters to tune, they match or beat the performance of various finely tuned state-of-the-art strategies.
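
For reference, the two schedules studied in the paper can be written in a few lines. The sketch below is my own; in particular, the parameterization of the exponential decay factor is an assumption and may differ from the paper's exact choice.

import math

def exponential_step_size(eta0, t, T, final_fraction=1e-3):
    # eta_t = eta0 * alpha**t, with alpha chosen here so that the step size
    # has decayed by `final_fraction` after T iterations.
    alpha = final_fraction ** (1.0 / T)
    return eta0 * alpha ** t

def cosine_step_size(eta0, t, T):
    # eta_t = (eta0 / 2) * (1 + cos(pi * t / T)): starts at eta0, decays to 0.
    return 0.5 * eta0 * (1.0 + math.cos(math.pi * t / T))

Each schedule exposes at most two knobs (eta0 together with the decay factor or the horizon T), which is what the evaluation above refers to.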

Check out my project.

Stochastic Gradient Descent with Online Learning

PyTorch implementation of SGDOL from the paper: Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization.

Non-convex optimization has attracted lots of attention in recent years, and many algorithms have been developed to tackle this problem. Many of these algorithms are based on Stochastic Gradient Descent (SGD), proposed by Robbins & Monro over 60 years ago. SGD is intuitive, efficient, and easy to implement. However, it requires a hand-picked parameter, the stepsize, for (fast) convergence, which is notoriously tedious and time-consuming to tune. Over the last several years, a plethora of adaptive gradient-based algorithms have emerged to ameliorate this problem. They have proved effective at reducing the labor of tuning in practice, but many of them lack theoretical guarantees even in the convex setting. In this project, I implemented the SGDOL algorithm, which self-tunes its stepsizes and guarantees convergence rates that automatically adapt to the level of noise.
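
The core idea of learning the stepsize online can be sketched briefly. The class below is my own simplified rendering, not the exact FTRL update from the paper: at every iteration two independent stochastic gradients are drawn at the current point, their inner product and squared norm are accumulated, and the next stepsize is the closed-form minimizer of the accumulated quadratic surrogate losses.

import torch

class StepsizeTuner:
    # Simplified sketch in the spirit of SGDOL; `smoothness` is an assumed
    # smoothness constant M of the objective.
    def __init__(self, smoothness=10.0, eps=1e-8):
        self.M = smoothness
        self.sum_inner = 0.0   # running sum of <g_t, g_t'>
        self.sum_sq = eps      # running sum of ||g_t||^2 (eps avoids 0/0)

    def stepsize(self):
        # Minimizer of sum_t [-eta * <g_t, g_t'> + (M * eta^2 / 2) * ||g_t||^2],
        # kept non-negative.
        return max(0.0, self.sum_inner / (self.M * self.sum_sq))

    def update(self, g, g_prime):
        self.sum_inner += torch.dot(g.flatten(), g_prime.flatten()).item()
        self.sum_sq += g.pow(2).sum().item()

A training loop would query stepsize(), take an SGD step with one of the two gradients, and then call update(g, g_prime) with both.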

Check out my project.

Interactive Data Visualization System for University Rankings

An interactive web-based data visualization system using D3.js for analyzing the Times Higher Education World University Ranking.

More or less, we all like rankings, and we all make rankings. One of the most popular subjects for ranking is the reputation of universities. Although people have been arguing over their credibility and validity for years, university rankings are still considered an important factor during the application phase. However, there are already dozens of national and international ranking systems available, and they all somehow disagree with each other. This is because different ranking systems emphasize different things and use different ranking mechanisms. Therefore, before citing a rank, we should be aware of which factors are included and how they affect the results. In this project, I do not intend to develop a novel ranking system; instead, I try to analyze the Times Higher Education World University Rankings to discover interesting patterns.

Check out my project.

More to come soon...

Activities.