Proposal: OpenType math font support

What

  • Investigate the possibility of removing the old TeX fonts and using only modern OpenType font containers (at least for the default fonts).

Why

At the moment Mogan (and TeXmacs) can use only a few math fonts (like Stix and TeX Gyre). This support is hardwired in the C++ files and does not take into account all the information available in the font files.

These font files contain the positions of extensible brackets and other extensible glyphs (arrows, square roots, etc.), as well as the metrics needed to position superscripts and subscripts correctly.
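To make the idea of size variants concrete, here is a minimal sketch of how a typesetter might pick a pre-drawn variant of a bracket for a given height. The struct mirrors the two fields of an OpenType MathGlyphVariantRecord, but the names `GlyphVariant` and `pick_variant` are hypothetical and are not taken from the TeXmacs or Mogan sources. It assumes the variants are listed by increasing size.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One size variant of a glyph: glyph id and advance in font design units.
// (Hypothetical type, modelled on the MATH table's variant record.)
struct GlyphVariant {
  std::uint16_t glyph_id;
  std::uint16_t advance;
};

// Return the id of the smallest variant whose advance covers `target`.
// Assumes `variants` is sorted by increasing advance.
// Returns -1 when no pre-drawn variant is large enough; at that point a
// typesetter would have to assemble the glyph from extensible parts.
int pick_variant(const std::vector<GlyphVariant>& variants, int target) {
  assert(target >= 0);  // a negative target size makes no sense
  for (const GlyphVariant& v : variants)
    if (v.advance >= target) return v.glyph_id;
  return -1;
}
```

With three variants of advances 1000, 1400 and 2000, a target of 1500 selects the third one, and a target of 2500 yields -1, signalling that glyph assembly is needed.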

Some time ago I started to implement some OpenType font support for TeXmacs, here: https://github.com/mgubi/texmacs/tree/wip-unicode-math

If some other developer is interested, we could try to make it work in Mogan. TeXmacs still depends very much on old font formats, while many free fonts are now available in OpenType format, which makes it easier to support professional math typesetting. In particular, the Latin Modern fonts can replace our old TeX fonts, and possibly improve speed since there will be less font searching.

One side goal could be to get rid of all TeX fonts and distribute only the OpenType variants. This is, for example, what ConTeXt does:

https://wiki.contextgarden.net/ConTeXt_distribution's_Fonts

For a discussion see here: https://github.com/XmacsLabs/mogan/issues/523

How

  • Study and document the font selection mechanism in TeXmacs, in particular how math glyphs are selected from the various fonts. Possibly write some debugging tools that let developers check which glyphs are selected, which metrics are used, etc.

Could you give a few clues on where glyph selection and metric calculation happen? I'm working on a preliminary OTF parser; will that help in some way?

Also, there are some font files with ids, offsets, etc. hardwired in TeXmacs/fonts/font-*. That should also be handled on the C++ side, right?

Well, it's complicated, because TeXmacs has a sophisticated automatic way to select glyphs for a given font and font parameter setting (style, size, ...). This is needed to cope with the fact that most fonts do not have all the glyphs, especially those necessary to typeset math. Joris developed a notion of smart_font, which is a kind of virtual font composed of a collection of various fonts from which sets of glyphs are taken.
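As a rough illustration of the "virtual font made of several fonts" idea, the sketch below resolves a character by trying member fonts in priority order. It is not the actual smart_font logic, which uses much richer rules; `SimpleFont` and `CompositeFont` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a real font: a name plus the code points it covers.
struct SimpleFont {
  std::string name;
  std::set<char32_t> coverage;
  bool supports(char32_t c) const { return coverage.count(c) != 0; }
};

// Hypothetical composite font: members are tried in the order they were added.
class CompositeFont {
  std::vector<SimpleFont> members_;
public:
  void add(SimpleFont f) {
    assert(!f.name.empty());  // "" is reserved to mean "no font found"
    members_.push_back(std::move(f));
  }
  // Name of the first member covering `c`, or "" when none does.
  std::string resolve(char32_t c) const {
    for (const SimpleFont& f : members_)
      if (f.supports(c)) return f.name;
    return "";
  }
};
```

A text font added first wins for ordinary letters, while a symbol such as U+2211 (summation) falls through to a math font added after it.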

Some useful initial reading is the paper 'Mathematical Font Art' here: https://www.texmacs.org/joris/fontart/fontart-abs.html

I learned some of the internals by reading the code while trying to implement the wip-unicode-math branch and Vau. Most of the work is done in the C++ part, since glyph selection must be really fast and can be complicated, as just said.

A nice way to learn is to look for interesting commits on GitHub and see which parts of the code are modified and how. For example, you can have a look at my changes here:

To summarise the situation, the entry point to font selection is the function edit_env_rep::update_font in src/Typeset/Edit/env_semantics.cpp, where the smart_font constructor is called to create an appropriate font. Note that TeXmacs has a notion of resource: resources are objects cached at creation, and fonts are resources, so the construction does real work only on the first invocation. All the smart font logic is in src/Graphics/Fonts/smart_font.cpp, which implements a virtual font object as I said. This in turn instantiates other font objects according to several kinds of rules, until some real platform font is created which contains the metric and glyph information. These are accessed by the typesetter to make text boxes.
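The "construction does work only on the first invocation" behaviour can be sketched as a name-keyed cache. This is only an illustration of the caching idea under simplified assumptions; `FontResource` and `ResourceCache` are hypothetical and do not reproduce the TeXmacs resource machinery.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical resource: expensive to build, identified by a unique name.
struct FontResource {
  std::string name;
  explicit FontResource(std::string n) : name(std::move(n)) {}
};

// Hypothetical cache: builds a resource on first request, reuses it afterwards.
class ResourceCache {
  std::map<std::string, std::shared_ptr<FontResource>> table_;
  int builds_ = 0;  // counts real constructions, for illustration only
public:
  std::shared_ptr<FontResource> get(const std::string& name) {
    assert(!name.empty());  // resources are looked up by name
    auto it = table_.find(name);
    if (it != table_.end()) return it->second;  // cache hit: nothing is built
    ++builds_;
    auto res = std::make_shared<FontResource>(name);
    table_.emplace(name, res);
    return res;
  }
  int builds() const { return builds_; }
};
```

Requesting the same name twice returns the same object and increments the build counter only once, which is why repeated font selection with identical parameters stays cheap.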

There are also other virtual fonts whose glyphs are composed of pieces of other glyphs. They are defined on the Scheme side in $TEXMACS_PATH/fonts/virtual and declared in a small S-expression sublanguage interpreted in src/Graphics/Fonts/virtual_font.cpp.

I think this covers the basics. To better understand how math font support works, you can also search for stix or pagella (or variants of these) in smart_font.cpp and similar files.

Feel free to ask more. We can use this thread to gather information on this task and on font handling. Note, however, that I have not worked on this for some time.


Much work has already been done. We mostly use FreeType to handle font files and extract information, but we parse the MATH font tables ourselves because this is not supported by FreeType.
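For readers new to the format, here is a minimal sketch of reading the fixed 10-byte header of the MATH table from a raw byte buffer: two version fields followed by three 16-bit offsets, all big-endian. The names `MathHeader` and `parse_math_header` are hypothetical and this is not the parser from the wip-unicode-math branch. How the bytes are obtained is left open; one option, as far as I know, is FreeType's FT_Load_Sfnt_Table with the 'MATH' tag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fields of the MATH table header, in file order (hypothetical struct name).
struct MathHeader {
  std::uint16_t major_version;
  std::uint16_t minor_version;
  std::uint16_t constants_offset;   // offset to MathConstants
  std::uint16_t glyph_info_offset;  // offset to MathGlyphInfo
  std::uint16_t variants_offset;    // offset to MathVariants
};

// Read a big-endian 16-bit value starting at byte `pos`.
static std::uint16_t read_u16(const std::vector<std::uint8_t>& d, std::size_t pos) {
  assert(pos + 1 < d.size());
  return static_cast<std::uint16_t>((d[pos] << 8) | d[pos + 1]);
}

// Fill `out` from the first 10 bytes of `d`; return false if `d` is too short.
bool parse_math_header(const std::vector<std::uint8_t>& d, MathHeader& out) {
  if (d.size() < 10) return false;
  out.major_version = read_u16(d, 0);
  out.minor_version = read_u16(d, 2);
  out.constants_offset = read_u16(d, 4);
  out.glyph_info_offset = read_u16(d, 6);
  out.variants_offset = read_u16(d, 8);
  return true;
}
```

The three offsets are relative to the start of the MATH table, so the sub-tables can be parsed from the same buffer once the header is known.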


The MATH table parser is almost done, but it provides an overwhelming amount of information. I may first try to implement the features mentioned in Joris' article when I get some time after the end of the semester. Hopefully we'll make some progress in the summer.


As I said (a few times), the parsing code already exists, so I'm not sure why you want to reimplement it. The difficult part is not parsing the table, but integrating the information into the way TeXmacs searches for and uses glyphs.

See here

for the code parsing of the MATH table.

Another generally useful feature to implement would be the ability to select variants of a given font (e.g. old-style numerals). This information is present in OpenType fonts, but we currently do not use it.

But we parse the MATH font tables ourselves because this is not supported by FreeType.

I thought you meant that we (should) parse it ourselves. I didn't notice the sentence below several images, sorry for that.

Complete and debug the initial implementation of the MATH table parser here: https://github.com/mgubi/texmacs/tree/wip-unicode-math

Sorry for the misunderstanding. I usually use "we" to mean the TeXmacs developers. So yes, we have to parse them ourselves, but I already wrote some of the code (actually a lot) long ago in that branch, so I did not understand why you wanted to rewrite it. Please have a look at my modifications in that branch; maybe this will speed up your work. The easiest way is to compare that branch with another one from more or less the same period. For example here:

This link shows a diff between the head of the wip-unicode branch and commit 992408a48297cc11b8fd8132a16754846d56c591. It contains essentially all the modifications I made to the font code to support some of the information I parsed from the MATH table (like large parentheses). The work is not complete and maybe this is not the right way to go, but I suggest you have a look because you will very likely have to modify the same areas of the code. That said, I believe I made some progress: the code I wrote was working and was able to use some of the fonts. At some point I stopped for other reasons, but the project still interests me, and if you also want to work on it we can discuss.

Of course you can do whatever you like, especially if you do not like my code, but parsing tables is quite dumb and annoying and not the difficult part of the task. Integrating the information in the way TeXmacs handles the additional typesetting requirements for math is more complex.

Maybe in Mogan we can try to be less conservative and redesign the code a bit to be more generic. Also, as became apparent in another discussion, we might want to use other information from the font files which we currently don't, for example "features" like selecting old-style numerals, etc.

This is a very important proposal. Could you turn it into a project and act as its mentor? @mgubi

I think we'd better find a student with a strong math background to complete this project.

@mgubi This proposal looks interesting and important. Could you turn it into a project for Summer of Code? I'd like to contribute to it this summer if possible.

I'm quite busy at the moment, but I agree this is an interesting and useful project. @darcy, when is the deadline for SoC?

Maybe these papers and the references therein can help you.


From https://summer-ospp.ac.cn/help/en/mentor/#2-how-to-become-a-mentor

  • After the community liaison submits project information, mentors need to log in to the mentor system, fill in personal information, and complete mentor identity verification. The deadline for verification is April 28th, 24:00 UTC+8. Projects that have not completed mentor identity verification will not be published on the official website.

Please log in at https://summer-ospp.ac.cn/ and get yourself verified again for OSPP 2024.

The deadline is April 28th, 24:00 UTC+8, so you'd better log in and confirm your personal info before 2024/04/25.

I'm quite busy at the moment, but I agree this is an interesting and useful project.

Let me convert this proposal into a project; you just need to review the project info on this forum.

And as the community organizer, I have to submit the projects to OSPP before 04/27 18:00 UTC+8.

Two projects proposed by me and @Oyyko are ready to be submitted:

The project proposed by @jingkaimori still needs to be polished:

This proposal is informative, and I recently tuned the font support via project 11 in Mogan: https://codeberg.org/XmacsLabs/mogan/src/branch/branch-1.2/devel/11.tm

It is not that hard for me to write a project based on your proposal.

@ssm Do you have any problem with this proposal?

Thanks, if you could help, that would be valuable. I can act as supervisor for the project. I need to refresh my memory of what I did in the wip_unicode branch, but IIRC the work was quite advanced. Proper support needs deeper changes, and maybe it is better to implement it first in Mogan and then backport it. It would also allow us to replace the TeX fonts with the Latin Modern variants.

I list here some references I gathered researching for the project.

Resources

I have no objections to this proposal.

Remember to log in and confirm your personal info. Once you have confirmed it, please notify me via OSPP SoC 2024 Stage 2: Project Submission

This is because there is no way for me, as community organizer, to learn whether a registered mentor has updated their personal info.

It's ready now, with a few modifications: