
Jan Buys
Lecturer
Computer Science Building Room 314
Department of Computer Science
University of Cape Town
South Africa
jbuys -at- cs.uct.ac.za
Google Scholar | GitHub
About
I am a Lecturer (≈ Assistant Professor) in the Department of Computer Science at the University of Cape Town. My research area is Natural Language Processing and Machine Learning. My current research focusses on text generation, linguistic structure prediction, and low-resource language processing.
Previously, I was a postdoctoral researcher at the University of Washington, working with Yejin Choi. I completed my PhD at the University of Oxford, supervised by Phil Blunsom. Before that, I obtained a Master's degree and undergraduate degrees in Computer Science at the University of Stellenbosch in South Africa.
Reviewing (2020): ACL, EMNLP, NeurIPS, SACAIR, TACL.
Teaching (2021): CSC1019 Intro Programming; CSC5031 Natural Language Processing; CSC4025Z Artificial Intelligence.
Research Group
Current postgraduate students:
- Shane Acton (MSc Computer Science)
- Maxwell Mojapelo (MPhil Information Technology)
- Neil Sinclair (MSc Data Science)
- Yassin Nurmahomed (MSc Computer Science)
- Francois Meyer (PhD Computer Science)
- Sello Ralethe (PhD Computer Science)
News
- April 2021: Two papers accepted for presentation at the AfricaNLP workshop at EACL 2021.
- February 2021: I gave a tutorial on Deep Learning of Natural Language Processing at SACAIR 2020.
- January 2021: I have a funded position available for a full-time Masters by Dissertation student, starting in March 2021. See this page for details on how to apply.
- January 2021: I have been awarded a Thuthuka grant from the South African National Research Foundation.
Preprints
-
Canonical and Surface Morphological Segmentation for Nguni Languages.
Tumi Moeng, Sheldon Reay, Aaron Daniels, Jan Buys.
AfricaNLP workshop at EACL 2021. [Code]
-
Low-Resource Language Modelling of South African Languages.
Stuart Mesham, Luc Hayward, Jared Shapiro, Jan Buys.
AfricaNLP workshop at EACL 2021. [Code]
Publications
-
Discourse Understanding and Factual Consistency in Abstractive Summarization.
Saadia Gabriel, Antoine Bosselut, Jeff Da, Ari Holtzman, Jan Buys, Kyle Lo, Asli Celikyilmaz, Yejin Choi.
EACL 2021.
-
The Curious Case of Neural Text Degeneration.
Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes and Yejin Choi.
ICLR 2020. [Code]
-
BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle.
Peter West, Ari Holtzman, Jan Buys and Yejin Choi.
EMNLP 2019. [Code]
-
Neural Text Generation from Rich Semantic Representations.
Valerie Hajdik, Jan Buys, Michael Wayne Goodman and Emily M. Bender.
NAACL 2019. [Code]
-
Benchmarking Hierarchical Script Knowledge.
Yonatan Bisk, Jan Buys, Karl Pichotta and Yejin Choi.
NAACL 2019. [Code]
-
Bridging HMMs and RNNs through Architectural Transformations.
Jan Buys, Yonatan Bisk and Yejin Choi.
IRASL NeurIPS Workshop 2018. [Code]
-
Learning to Write with Cooperative Discriminators.
Ari Holtzman, Jan Buys, Maxwell Forbes, Antoine Bosselut, David Golub and Yejin Choi.
ACL 2018. [Code]
-
Neural Syntactic Generative Models with Exact Marginalization.
Jan Buys and Phil Blunsom.
NAACL 2018. [Code]
-
Robust Incremental Neural Semantic Graph Parsing.
Jan Buys and Phil Blunsom.
ACL 2017. [Code]
-
Oxford at SemEval-2017 Task 9: Neural AMR Parsing with Pointer-Augmented Attention.
Jan Buys and Phil Blunsom.
SemEval 2017 Shared Task.
-
Online Segment to Segment Neural Transduction.
Lei Yu, Jan Buys and Phil Blunsom.
EMNLP 2016.
-
Cross-Lingual Morphological Tagging for Low-Resource Languages.
Jan Buys and Jan Botha.
ACL 2016.
-
Generative Incremental Dependency Parsing with Neural Networks.
Jan Buys and Phil Blunsom.
ACL 2015. [Code]
-
A Bayesian Model for Generative Transition-based Dependency Parsing.
Jan Buys and Phil Blunsom.
Depling 2015. [Code]
-
A Tree Transducer Model for Grammatical Error Correction.
Jan Buys and Brink van der Merwe.
CoNLL 2013 Shared Task.
-
Chorale Harmonization with Weighted Finite-state Transducers.
Jan Buys and Brink van der Merwe.
PRASA 2012.
-
Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution.
Ben Murrell, Thomas Weighill, Jan Buys, Robert Ketteringham, Sasha Moola, Gerdus Benade, Lise du Buisson, Daniel Kaliski, Tristan Hands and Konrad Scheffler.
PLoS ONE 2011.
Theses
-
Incremental Generative Models for Syntactic and Semantic Natural Language Processing.
DPhil thesis, University of Oxford, 2018.
-
Probabilistic Tree Transducers for Grammatical Error Correction.
MSc thesis, University of Stellenbosch, 2013.
-
Generative Models of Music for Style Imitation and Composer Recognition.
Honours project report, University of Stellenbosch, 2011.