How to enter accented strings from .scm modules without encoding problems?

I’m trying to personalize the exam style and everything around it to look like a style I built for myself in LaTeX. While editing the markup file to automate some things, I decided to provide “default” values for some data tags.

The problem I have is that, whenever I enter accented strings, TeXmacs typesets them with encoding problems…

For example, while updating the make-doc-data macro, I typed:

(tm-define (make-doc-data)
  (:applicable (not (selection-active-non-small?)))
  (insert-go-to
    '(doc-data
      (doc-title "Título"))
    '(0 0 0)))

but when rendered, it looks like this:

I saw a post suggesting Cork encoding, but when I use it (for the “í” character), the character is not affected by the “small-caps” style… It stays in lowercase.

So, what I wanted was a way to insert accented characters in Scheme code so that they behave as if I had typed them within TeXmacs.

Thank you.

I tested this with

(kbd-map 
  (:mode in-text?)
  ("t i t l e tab"
   (insert
    '(doc-data
      (doc-title "Título")))))

and I obtain the same output.
I also tried Edit -> Copy to -> TeXmacs Scheme on the title, obtaining again (doc-data (doc-title "Título")). For this reason, I think there is at least one bug worth reporting in the Savannah bug tracker (if writing “í” is disallowed in the Scheme code, then the result of copying should not be this).

Could you please show me how you do the Cork encoding in Scheme?

I have managed to do it with the help of the “Accents problem” post and https://en.wikipedia.org/wiki/Cork_encoding.

I placed this in my my-init-texmacs.scm:

(kbd-map 
  (:mode in-text?)
  ("t i t l e tab"
   (insert
    '(doc-data
      (doc-title (with "font-shape" "small-caps" "T<#00ED>tulo"))))))

obtaining

[Image: Cork-encoded-small-caps]

The accented i looks thicker than the other characters, so that may be a bug too.

Moreover, if one now copies “to TeXmacs Scheme”, one gets back the <#00ED> Cork-encoded character.

I am going to post about this on the mailing list. Joris van der Hoeven reads the list and only rarely the forum, so maybe this way we can figure out what is happening.

Edit: the link to the mailing-list post is https://lists.texmacs.org/wws/arc/texmacs-users/2021-12/msg00062.html

G.

You might want to try using a function like utf8->cork. I’m not sure what is going on here, but probably your text file is saved in UTF-8 and then read as such by Guile (I do not remember the details here). So putting something like (utf8->cork "Título") instead of just "Título" should do the job.

I tried your suggestion and it worked like a charm!

@pireddag, could you please keep us informed if something comes up on the mailing list? I’m not subscribed to it…

Thank you, guys.

Edit: Just as an addendum, doc-title (apparently) needs a string, so it won’t work if I leave the utf8->cork call in the markup for the GUI to execute. This means that, after copying to TeXmacs Scheme, I get the same output that pireddag got, so anyone who reused the markup would run into the same problem.

Ok!

I do not understand this (I have not tried it yet); I expect (utf8->cork "Título") to return a string. Can you explain again?

What I tried was something like this:

;; ...
(insert-go-to
  '(doc-data
    (doc-title (utf8->cork "Título")) ;; Notice it was not unquoted
    ;; ...
  )
;; ...

And thus the GUI rejected the code. The remaining problem is that, since the conversion happens in the Scheme code, when I use the “Copy to -> TeXmacs Scheme” option, the source I obtain is exactly the problematic version (without the conversion).

It is probably a bug, really.

This worked for me, using quasiquoting and unquoting.

(kbd-map
  ("t i t l e tab"
   (insert
    `(doc-data
      (doc-title ,(utf8->cork "Título")) ;; Notice it was unquoted
      ;; ...
      ))))
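
To make the difference between the two attempts concrete, here is a minimal sketch in plain Scheme. It does not call utf8->cork, which only exists inside TeXmacs; the helper `mark` is a hypothetical stand-in I made up so that the snippet runs in any Scheme interpreter such as Guile.

```scheme
;; Hypothetical stand-in for utf8->cork: it only wraps the string so that
;; we can see whether the call was actually evaluated.
(define (mark s) (string-append "<" s ">"))

;; Plain quote: the whole form is data, so (mark "abc") stays an
;; unevaluated list inside the tree.
(define quoted '(doc-title (mark "abc")))

;; Quasiquote with unquote: the form after the comma is evaluated and its
;; result (a string) is placed in the tree.
(define unquoted `(doc-title ,(mark "abc")))

(write (cadr quoted))   (newline) ; a list: (mark "abc")
(write (cadr unquoted)) (newline) ; a string: "<abc>"
```

In the first case the second element of doc-title is a list rather than a string, which would explain why the quoted version was rejected, while the quasiquoted version hands TeXmacs an already converted string.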

I have the feeling I still do not understand your