Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View

Details PDF Code arXiv

Efficient Training of BERT by Progressively Stacking

Details PDF Code

Towards Binary-Valued Gates for Robust LSTM Training

Details PDF Slides Video Code arXiv Blog Post (Chinese)

Reproducing Vectorization of the Tersoff Multi-Body Potential on the Intel Broadwell Architecture

Details PDF DOI

ParConnect Reproducibility Report

Details PDF DOI