Paper ID: 2402.14313

Learning to Kern: Set-wise Estimation of Optimal Letter Space

Kei Nakatsuru, Seiichi Uchida

Kerning is the task of setting appropriate horizontal spaces for all possible letter pairs of a certain font. One of the difficulties of kerning is that the appropriate space differs for each letter pair. Therefore, for a total of 52 capital and small letters, we need to adjust $52 \times 52 = 2704$ different spaces. Another difficulty is that there is neither a general procedure nor criterion for automatic kerning; therefore, kerning is still done manually or with heuristics. In this paper, we tackle kerning by proposing two machine-learning models, called pairwise and set-wise models. The former is a simple deep neural network that estimates the letter space for two given letter images. In contrast, the latter is a transformer-based model that estimates the letter spaces for three or more given letter images. For example, the set-wise model simultaneously estimates 2704 spaces for 52 letter images for a certain font. Among the two models, the set-wise model is not only more efficient but also more accurate because its internal self-attention mechanism allows for more consistent kerning for all letters. Experimental results on about 2500 Google fonts and their quantitative and qualitative analyses show that the set-wise model has an average estimation error of only about 5.3 pixels when the average letter space of all fonts and letter pairs is about 115 pixels.

Submitted: Feb 22, 2024