I am currently unemployed. I used to research a subfield of AI called deep reinforcement learning, where I was most interested in improving the optimization stability of off-policy deep Q-learning algorithms. I wrote two works towards this goal:
-
Stabilizing Q-Learning for Continuous Control
David Yu-Tung Hui
MSc Thesis, University of Montreal, 2022
I derived a deep reinforcement learning algorithm from mathematical first principles: the SACLite loss functions follow from the principle of maximum entropy, and the use of LayerNorm is justified by a neural-tangent-kernel-inspired analysis (a minimal architecture sketch appears after this list). Unlike baseline actor-critic algorithms, my algorithm did not diverge on high-dimensional continuous-control tasks.
[.pdf] [Errata]
-
Double Gumbel Q-Learning
David Yu-Tung Hui, Aaron Courville, Pierre-Luc Bacon
Spotlight at NeurIPS 2023
We showed that Q-learning with function approximation is affected by two previously unnoticed sources of heteroscedastic Gumbel noise (a loose numerical illustration of Gumbel-shaped max noise appears after this list). An algorithm that accounts for these noise sources attained just under twice the aggregate performance of the popular SAC baseline.
[.pdf] [Reviews] [Poster (.png)] [5-min talk] [1-hour seminar] [Code (GitHub)] [Errata]
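For readers curious what the LayerNorm choice from the thesis looks like in code, below is a minimal PyTorch sketch of a continuous-control Q-network (critic) with LayerNorm after each hidden layer. The layer widths, activation, and class name are illustrative assumptions, not the exact architecture used in the thesis.

```python
# Minimal sketch of a continuous-control Q-network (critic) with LayerNorm.
# Layer widths, activation, and names are illustrative assumptions.
import torch
import torch.nn as nn


class LayerNormCritic(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden_dim),
            nn.LayerNorm(hidden_dim),   # normalize hidden features for optimization stability
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.LayerNorm(hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),   # scalar Q-value estimate
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        # Q(s, a): concatenate observation and action, return one scalar per sample.
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)


# Usage example with random data (batch of 32 transitions).
critic = LayerNormCritic(obs_dim=17, act_dim=6)
q_values = critic(torch.randn(32, 17), torch.randn(32, 6))
print(q_values.shape)  # torch.Size([32])
```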
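As a loose illustration of why Gumbel-shaped noise is worth looking for in Q-learning targets (this is not the paper's derivation; see the .pdf above for that), the snippet below shows a standard extreme-value-theory effect: the maximum over many independently perturbed value estimates has a positively skewed, approximately Gumbel-shaped error. The sample sizes are arbitrary assumptions.

```python
# Loose illustration: the max over many i.i.d. noisy value estimates has a
# positively skewed, approximately Gumbel-shaped error. Sizes are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
num_actions, num_trials = 64, 100_000

# True action values are all zero; estimates are corrupted by Gaussian noise.
noisy_estimates = rng.normal(loc=0.0, scale=1.0, size=(num_trials, num_actions))

# Error introduced by taking a max over the noisy estimates (greedy target).
max_error = noisy_estimates.max(axis=1)

# The empirical error is positively biased and skewed, unlike Gaussian noise.
print(f"mean = {max_error.mean():.3f}")  # > 0: systematic overestimation
skew = ((max_error - max_error.mean()) ** 3).mean() / max_error.std() ** 3
print(f"skew = {skew:.3f}")              # > 0: Gumbel-like right tail
```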
More of my research is listed on Google Scholar.
The best way to contact me is by email. You can find my address in my works.
* There are multiple ways to write my name. In Latin script, my surname is "Hui" and my first name is "David Yu-Tung." In Traditional Chinese characters, my family name is "許" and my given name is "宇同." Most people call me "David." Others call me "宇同" or "Yu-Tung."