"How Many Layers and Why? An Analysis of the Model Depth in Transformers."

Antoine Simoulin, Benoît Crabbé (2021)

DOI: 10.18653/v1/2021.acl-srw.23

access: open

type: Conference or Workshop Paper

metadata version: 2022-01-20
