Howto: literate programming in TeXmacs

jeroen · June 5, 2023, 9:16am

Recently, @mgubi pointed out to me the existence of the literate style package in TeXmacs, which enables the “literate programming” style proposed by Knuth. See also this video (HT @mgubi).

I decided to experiment a bit with the package to see how it can be used. It was not completely straightforward, so I decided to post here some of the notes I made along the way. Perhaps it can be useful to others. Comments and additions are most welcome.

Literate programming is a style of programming that puts emphasis on the description of the large scale structure of a complex program and the interconnection of its constituent parts (the “why” of the program), as opposed to code documentation, which describes fine-grained procedures (the “how” of the program).

To use the package:

Add the literate package via Document->Style->Add package->Utilities->literate
Write an explanation of your code in the document.
Add a first “chunk” of code in the language of your choice via Literate->First chunk
Note: The top field of a chunk is the filename for that chunk and has to contain a dot! The code itself goes in the bottom part.
Write some more explanation.
Subsequent chunks can be inserted using “Literate->Next chunk”. All the chunks in one sequence are concatenated into a single source file whose file name is the “title” of the chunk.
Once you’re satisfied with the code, clicking Literate->Build buffer will extract all the chunks in the document into source files in the same directory.
The “Build buffer in” option extracts the source files in another directory.
The “Build directory” option extracts source code from all the tm files in a directory, including its subdirectories. “Build directory in” does the same, but writes the source files to another destination directory, where the same subdirectory structure is recreated.
One chunk can be included into another by using “Literate->Reference”
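
The concatenation and reference steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the literate package is implemented: the list of (name, code) pairs, the `<<name>>` reference marker, and the function names `collect`, `expand` and `tangle` are all assumptions made for this sketch. Treating only names that contain a dot as files follows the note above.

```python
import re
from collections import defaultdict

# Hypothetical reference marker: a line consisting only of <<chunk name>>.
# TeXmacs uses its own markup for references; this syntax is an assumption.
REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def collect(chunks):
    """Group chunk lines by name, keeping document order.

    `chunks` is a list of (name, code) pairs; chunks that share a name
    form one sequence and are concatenated.
    """
    bodies = defaultdict(list)
    for name, code in chunks:
        bodies[name].extend(code.splitlines())
    return dict(bodies)


def expand(name, bodies, indent=""):
    """Return the lines of chunk `name` with references replaced by their bodies.

    The indentation of the reference line is applied to the included chunk.
    Cyclic references are not detected in this sketch.
    """
    out = []
    for line in bodies[name]:
        match = REF.match(line)
        if match:
            out.extend(expand(match.group(2), bodies, indent + match.group(1)))
        else:
            out.append(indent + line)
    return out


def tangle(chunks):
    """Map each file name (a chunk name containing a dot) to its source text."""
    bodies = collect(chunks)
    return {
        name: "\n".join(expand(name, bodies)) + "\n"
        for name in bodies
        if "." in name
    }


if __name__ == "__main__":
    document = [
        ("hello.py", "def main():\n    <<body>>"),
        ("body", "print('hi')"),
        ("hello.py", "main()"),
    ]
    for filename, source in tangle(document).items():
        print(f"--- {filename} ---")
        print(source, end="")
```

Here the two `hello.py` chunks end up in one file, and the `body` chunk is only included by reference, so it does not produce a file of its own.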

I am not sure yet what the Invisible openings and endings are for.

I have used the package successfully in combination with a live session in the same document. This has been useful to experiment with the code being written. The “stable” code goes into literate programming chunks, intertwined with equations and explanation, while the more experimental parts go into the session. While text fields (potentially with equations) can be added as documentation into sessions as well, it’s not possible to intertwine text and code in the same way as can be done with literate, for example to document a long function.

Note: The source files will be extracted in the same directory as the tm file, or in a directory structure which mirrors that of the source directory. It does not seem possible yet to write a tm document to describe a mix of code spanning multiple directories. This would be useful to describe, for example, related Scheme and C++ code in the TeXmacs source code.
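
To make the mirroring concrete, here is a minimal sketch of how an output path could be derived when extracting into another directory. The function name `mirrored_target` and the example paths are hypothetical; the sketch only illustrates the path arithmetic described above, not the package's actual code.

```python
from pathlib import Path


def mirrored_target(tm_file, source_root, dest_root, chunk_name):
    """Return where a chunk extracted from `tm_file` would be written.

    The subdirectory of `tm_file` relative to `source_root` is recreated
    under `dest_root`, and the chunk name is used as the file name.
    """
    relative_dir = Path(tm_file).parent.relative_to(source_root)
    return Path(dest_root) / relative_dir / chunk_name


if __name__ == "__main__":
    target = mirrored_target("docs/core/parser.tm", "docs", "build", "parser.cpp")
    print(target.as_posix())
```

Since the destination depends only on where the tm file sits, a single document cannot place its chunks in several unrelated directories, which is the limitation mentioned in the note.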

mgubi · June 5, 2023, 10:25am

Great post @jeroen!

I was planning to use the tool at smaller scales, before trying it on TeXmacs itself (or some parts of it). At that point I think one needs to customise the literate package to the specific needs, e.g. by adding an explicit tree structure for the extracted files.

On the other hand, one could argue that a tree structure for the source code is only needed if the source code is meant to be read by a human, so this goes against the general design principle. One can have a very structured literate program which outputs all the source files in a flattened way, i.e. without any particular structure, since the compiler will not have problems with this. Generally, the division of a program into many source files is an optimisation that allows the compiler to cache some compilation results and spares editors from having to deal with million-line sources. Of course this is a very radical point of view. I do not mean one should do it, just that the literate approach gives you other ways to deal with large codebases, and one should keep this in mind when designing tools.

samjam · February 14, 2024, 10:26am

I’ve done a few very large literate programming documents.

The first one was a Windows device driver using LyX. After that I moved to TeXmacs and did a large systems management project (using the fangle literate programming style), which is a bit of a dog's dinner even if I did do it myself, and it was too complicated for me to clean up at the time.

I developed a quoting system so that one source file type could incorporate programs for another, to allow full composing for shell-script systems.

I was then forced to develop this Makefile spell so that Makefiles could be composed in this manner, so that valid shell script could be incorporated into a Makefile fragment and still work perfectly.

It’s time to dig it out and integrate the enhanced-syntax untangler properly into Scheme instead of writing it in awk based on the text export from TeXmacs.

darcy · February 15, 2024, 1:53pm

I think literate programming with ChatGPT would be a good practice in the future.