Apostrophes in system fonts

Hello @jonsterling and welcome to the TeXmacs community!

The forum has definitely been more active than the users mailing list lately :slight_smile:

Could you try and see if ' Tab (apostrophe followed by Tab) is what you need?

Hi @jeroen, thank you so much! That worked perfectly. I really appreciate your help!

That said, JvdH responds much more on the mailing list than on the forum. In the few years I have been following the mailing list, he has sometimes been absent for a few months and then come back and responded regularly.

I noticed another problem related to this, now involving BibTeX. Have a look at the following screenshot:

In the above, the apostrophe in the [Grothendieck, 1957] entry is rendered incorrectly. I think this is probably not noticeable to most TeXmacs users because the builtin TeXmacs fonts render even a straight apostrophe as a curly apostrophe. But other fonts, such as Libertine (which is used by the acmart styles), do not have this rendering behavior.

Because BibTeX databases usually come from LaTeX projects, where the behavior is to automatically fix apostrophes, it seems that a more desirable behavior for TeXmacs would be to copy that behavior — but I of course don’t know the implications.

Does anyone have any thoughts, or suggestions for how I might move forward? I can’t change the .bib file, because I have to share that with LaTeX projects.

This is odd, as it works correctly in the builtin bibliography database. If I try a linked bibtex file, both U+0027 (plain, straight apostrophe) and U+2019 (right single quotation mark, curly) get rendered straight. Somewhere along the way there seems to be a conversion happening.
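
To check which of the two characters a given entry actually contains before it reaches TeXmacs, a small Python sketch like the following may help. The function name and the sample string are made up for illustration; nothing here is part of TeXmacs or BibTeX.

```python
import unicodedata

# Characters that are easily confused with an apostrophe.
QUOTE_LIKE = {"\u0027", "\u0060", "\u2018", "\u2019"}

def describe_quotes(text):
    """Return (char, code point, Unicode name) for each quote-like character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if ch in QUOTE_LIKE
    ]

# Illustrative sample: one curly and one straight apostrophe.
sample = "l\u2019exemple and l'exemple"
for entry in describe_quotes(sample):
    print(entry)
```

Running this on a line copied from the .bib file shows whether the source already has U+0027, or whether it has U+2019 and the change happens later.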

If you want to try it with my database just in case, here is the file: https://github.com/jonsterling/bibtex-references/blob/master/refs-bibtex.bib

According to https://www.johndcook.com/unicode_latex.html the LaTeX command \rq should give the curly apostrophe, but this also gives a straight apostrophe in TeXmacs. This seems to happen in line 763 of src/Data/Convert/Tex/fromtex.cpp. I don’t know if this will fix the other conversion.

Are you using bibtex or the internal bibliography compiler? (this is determined by whether the bibliography style starts with “tm-” or not)

@mgubi Thanks for your question. Judging from the code depicted below, it appears that I am using bibtex:

You could also find out by putting your cursor in the bibliography, before the first bibitem, and looking at the focus bar: it will show you the arguments to bibliography. What happens if you replace “plainnat” with “tm-plainnat”? (I do not remember whether it exists, however.)

I think tm-plainnat doesn’t exist; I tried it and received the following result:

Indeed, I checked and it is not there. Creating your own style file in Scheme is not difficult. See e.g. $TEXMACS_PATH/progs/bibtex/plain.scm. In the post “About bibliography in TeXmacs” you will find an example of a custom bibliography style. You can install such styles in your $TEXMACS_HOME_PATH.

Anyway the problem with the apostrophe is real, so we need to understand how to properly address it. Suggestions? @jeroen?

I’ve been doing some debugging and this is what I’ve got so far. The actual conversion happens in src/Data/Convert/BibTeX/parsebib.cpp. There the function western_to_cork is called on the input bibtex string. This function is defined in src/Data/String/wencoding.cpp, where the apostrophe changes in the function convert (s, "UTF-8", "Cork"). This uses utf8_to_cork.

I guess the conversion tables are those defined in $TEXMACS_PATH/langs/encoding/. In particular there is

("#2019"	"'")		; right single quotation mark
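
To see why an entry like this makes the two characters indistinguishable afterwards, here is a deliberately simplified Python model. It is not how TeXmacs implements the conversion: the dictionary and the function `to_internal` are hypothetical stand-ins for the real table lookup.

```python
# Hypothetical one-way table, mirroring the ("#2019" "'") entry above.
ONE_WAY = {"\u2019": "'"}

def to_internal(text, table=ONE_WAY):
    """Map each character through the table; unmapped characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)

curly = "d\u2019Alembert"
straight = "d'Alembert"

# With the entry present, both inputs collapse to the straight apostrophe.
print(to_internal(curly) == to_internal(straight))

# In this model, without the entry the curly apostrophe passes through unchanged.
print(to_internal(curly, table={}) == curly)
```

Because the mapping is one-way, nothing downstream can tell which character the .bib file originally contained.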

Yes, I just came to the same conclusion :slight_smile:
Removing that line gives the right apostrophe.

I think it would be useful to submit a bug report. I would like to check with Joris the reason for that conversion.

Yes, definitely, this is a frequently used function, so this would need to be thoroughly checked.

For reference:
https://savannah.gnu.org/bugs/index.php?62396

Hi there, I wonder if there’s any update on that? I am also confused by the default behavior.

It seems that TeXmacs even automatically replaces the following characters in any text you paste into it (for longer paragraphs, I usually write in other text editors with better spelling/grammar checking and formatting features and then paste into TeXmacs):

  • \u2018, the opening curly single quote (‘), is replaced by the backtick (`)
  • \u2019, the closing curly single quote (’), is replaced by the straight single quote (')
  • the curly double quotes (“”) and backticks (`) are left untouched

However, if you type directly in texmacs:

This looks quite inconsistent, and I don’t understand why the default is set up this way.

I am sorry, but I have to admit that I’m not familiar with how the mailing list works and don’t know how to bring a question, discussion, issue, or feature request to the developer team. Can someone please help me understand whether this is a bug or expected behavior? And if it is not an intentional design decision, how can we file a bug report?

Thank you so much in advance.

— edit —

Sorry for disturbing…! I was going to ask how you made things work, but I somehow managed to get it working myself. And I don’t know how to cancel my reply to you in this thread…

Just adding my notes here for someone else in need.

You can find $TEXMACS_PATH on your machine by opening TeXmacs, choosing Insert > Session > Shell from the menu, and running echo $TEXMACS_PATH in the inserted shell. On macOS, for example, the encoding folder is then /Applications/TeXmacs.app/Contents/Resources/share/TeXmacs/langs/encoding/.

In the $TEXMACS_PATH/langs/encoding/ folder, there are several Scheme (.scm) files involved in the automatic substitution of ‘ and ’ in pasted text.

  1. unicode-cork-oneway.scm
     • ("#2019" "'") ; right single quotation mark
  2. corktounicode.scm
     • ("#60" "#2018") ; typographic backquote
  3. utf8tolatex-onedir.scm
     • ("#2018" "`")
     • ("#2019" "'")

You can delete those lines or comment them out by adding a leading semicolon (;), and then restart TeXmacs. After that, the curly single quotes ‘ and ’ should be kept as is in pasted text, and won’t be replaced by the backtick ` and the straight single quote '.
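
If you would rather not edit the files by hand, the commenting-out step could be scripted roughly as follows. This is only a sketch: it assumes each mapping sits on its own line in the form shown above, the function `comment_out` is a made-up helper, and the third line of the sample is invented. Back up the original files first.

```python
import re

# Matches a table line of the form ("source" "target"), with an optional trailing comment.
ENTRY = re.compile(r'^\s*\(\s*"([^"]*)"\s+"([^"]*)"\s*\)')

def comment_out(text, pairs):
    """Prefix ';' to every line whose (source, target) pair is listed in `pairs`."""
    out = []
    for line in text.splitlines(keepends=True):
        m = ENTRY.match(line)
        if m and (m.group(1), m.group(2)) in pairs:
            line = ";" + line
        out.append(line)
    return "".join(out)

# Illustrative input; in practice, read the text from one of the .scm files.
sample = '("#2018" "`")\n("#2019" "\'")\n("#00E9" "e")\n'
print(comment_out(sample, {("#2018", "`"), ("#2019", "'")}))
```

Lines that are already commented out are left alone, because they no longer start with an opening parenthesis.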