Reproducible Research "Plugin"

tmbb · April 13, 2021, 10:47pm

I usually analyze data with Python, and I’d like to write a paper based on that data in a completely reproducible way. I think the simplest thing would be to add slots to my document (using a macro, for example), which could be filled in later. My Python code would generate something like a map from variable names to TeXmacs trees (in Scheme format), and I’d have TeXmacs read that file to populate the tags according to their names.

For example, my python code would generate:

(("p_value" "0.032")
 ("nr_of_successes" "245")
 ("ratio" '(frac "a" "b")))

And at the click of a button, TeXmacs would read the file generated by the Python code and replace each occurrence of (tag "id" placeholder) with the matching value, e.g. (tag "p_value" "0.032").
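A minimal sketch of the Python side of this idea, assuming results live in a plain dict and that TeXmacs trees are written as nested tuples (both are illustrative choices, and the helper names are hypothetical). The inner quote mark is left out here on the assumption that the file would be read as data rather than evaluated:

```python
def scheme_string(text):
    """Quote text as a Scheme string literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_scheme(value):
    """Render a tuple (tag, child, ...) as a Scheme list, anything else as a string."""
    if isinstance(value, tuple):
        head, *children = value
        return "(" + " ".join([head] + [to_scheme(c) for c in children]) + ")"
    return scheme_string(str(value))


def to_alist(results):
    """Render a dict as a Scheme association list, one entry per line."""
    entries = [
        f"({scheme_string(name)} {to_scheme(value)})"
        for name, value in results.items()
    ]
    return "(" + "\n ".join(entries) + ")"


# Hypothetical analysis results; the tuple stands for the tree (frac "a" "b").
results = {"p_value": 0.032, "nr_of_successes": 245, "ratio": ("frac", "a", "b")}
print(to_alist(results))
```

The TeXmacs side (reading the file and replacing the tags) is not covered by this sketch.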

Without expecting anyone to write this for me, what are the steps I should take to write a plugin that does this? Another possibility would be for the Python code to do the replacement, but that would require saving the file as TeXmacs Scheme, which is somewhat disruptive.

mgubi · April 14, 2021, 6:37am

There are various ways to achieve this, I think, though I’m not an expert myself.

One possibility is the one you suggest: traverse the document tree and replace the fields in certain tags with the values computed by Python. You will find more information in Help->Scheme extensions->Programming routines for editing documents. There is an API from Scheme to do this, and you can also look around in $TEXMACS_PATH/progs to see examples of how that API is invoked. Also post here if you want more specific help.
Another approach is to use the spreadsheet facilities in TeXmacs; see Help->Manual->TeXmacs as an interface, in particular Sections 4 and 5. You can have regions of the document which display the results of computations from a series of plugins; I think Python is included. So you can just write a small Python snippet which reads the values and formats them nicely, and TeXmacs will take care of executing the snippet and putting the result in the typeset document. There is also a spreadsheet system which reacts to changes of values by recomputing other values, always using the plugin. However, I’m not sure if this is implemented yet for Python, since it needs some more complex plugin infrastructure. It is certainly implemented for Mathemagix (the language Joris develops).

fbob · April 14, 2021, 6:42am

I confirm this: spreadsheet is working with Python

mgubi · April 14, 2021, 6:52am

You might also want to take a look at Help->Interfacing->Background evaluation, which gives some examples of how an external program can be invoked to modify the content of a document.

jeroen · April 14, 2021, 8:17am

This kind of substitution can be implemented using a “locus” in TeXmacs. You can insert a locus by first enabling the linking tool via Tools->Linking Tool. Then click Link->New Locus. Go to Document->Source->Edit Source Tree to find the “id” of that locus. Now copy the locus to wherever you want your value to appear.

Now the locus can be set at every location where it appears by using locus-set in Scheme. You could, in principle, do the following:

(locus-set "+nWR9oal1xAc6vbB" (plugin-eval "python" "default" "import numpy as np; np.sqrt(2)"))

However, there is currently a bug causing plugin-eval to hang.

Presumably some of this should be doable via the menu of the linking tool, but I can’t figure out how. Perhaps there are some bugs there as well.

pireddag · April 14, 2021, 11:21am

I would propose a different approach from those of @mgubi and @jeroen, assuming that I understood correctly what you would like to do and what Python, which I have used only superficially, can do.

The thing that I understood is that you would like to have a document where certain values are read in from the results of calculations.

I would get Python to generate data in a different form:

 (tm-define p_value "0.032")
 (tm-define nr_of_successes "245")
 (tm-define ratio '(frac "a" "b"))
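A sketch of how the Python side could emit definitions of this shape, assuming the results are held in a dict and trees are written as nested tuples (the helper names are made up for illustration, and the symbol names are taken as given):

```python
from pathlib import Path


def scheme_value(value):
    """Render a tuple (tag, child, ...) as a Scheme list, anything else as a string."""
    if isinstance(value, tuple):
        head, *children = value
        return "(" + " ".join([head] + [scheme_value(c) for c in children]) + ")"
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + text + '"'


def tm_define_line(name, value):
    """Build one (tm-define name value) form; trees are quoted, strings are not."""
    rendered = scheme_value(value)
    if isinstance(value, tuple):
        rendered = "'" + rendered
    return f"(tm-define {name} {rendered})"


def definitions_text(results):
    """Join one definition per result, each on its own line."""
    return "".join(tm_define_line(n, v) + "\n" for n, v in results.items())


def write_definitions(results, path):
    """Write the definitions to a file such as data-python.scm."""
    Path(path).write_text(definitions_text(results), encoding="utf-8")


# Hypothetical analysis results.
results = {"p_value": 0.032, "nr_of_successes": 245, "ratio": ("frac", "a", "b")}
print(definitions_text(results), end="")
```

Calling write_definitions(results, "data-python.scm") at the end of the analysis script would refresh the file each time the analysis is rerun.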

To load these definitions into the TeXmacs document, I would use a Scheme session in a support TeXmacs document, where you execute (you have to put the full path in place of data-python.scm)

(load "data-python.scm")

Finally, I would define environment variables in TeXmacs, one for each Scheme symbol, in the following way

<assign|ratio|<extern|(lambda () ratio)>>

and use them in the document (I defined these variables in the preamble of the main document; maybe it works as well if one does that in the support document, but I did not test it).

I tried the procedure I described and you can update the document when you change the file by re-executing the Scheme command in the support document and then doing “Update all” in either document (I tried first in one, then in the other and it worked in both cases).

Maybe there is a way to insert the Scheme session in the same document so that it can be shown and hidden through a command, but I did not investigate.
Or maybe one can automate even more, having to do only “Update all”, but I did not achieve that.

Finally, welcome to the forum.

jeroen · April 14, 2021, 11:18am

That can be done using an executable fold: Insert->Fold->Executable->Scheme

@tmbb, you may also be interested in
Insert->Fold->Executable->Python

pireddag · April 14, 2021, 11:20am

Right, thanks, I had not connected the thought.
The Insert->Fold->Executable->Python is also good, but I do not know if you can do typesetting with it. I could investigate what happens if Python outputs a TeXmacs tree, for example.

jeroen · April 14, 2021, 11:37am

I don’t think that would work. Plugins need to request that their output be formatted. As generic Python doesn’t know about mathematical objects, it doesn’t request to format them. In SageMath this is possible, since it has a SageObject class that represents mathematical objects.

pireddag · April 14, 2021, 12:13pm

Answering my own question: after a bit of testing I found out that

<extern|(lambda () (load "data-python.scm"))>

in the preamble works. With that, you can “Update all” when you have a new data-python.scm file.

tmbb · April 14, 2021, 12:34pm

Thanks, I really have to try this solution! It’s very simple and works without any fancy texmacs plugins.

mgubi · April 14, 2021, 1:13pm

Taking this nice approach of @pireddag to the extreme, one can then have just TeXmacs macros for the parameters, and include in the preamble a small external document which is automatically generated by Python and whose only job is to define these macros with the right computed values. This would avoid using Scheme and extern macros for this job.
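As a rough sketch of what the Python side of this could look like, the snippet below only builds the <assign|name|value> lines. The wrapping of those lines in a complete TeXmacs document and the way that document is included in the preamble are not shown. The helper name is hypothetical, and values containing characters that would need escaping in the TeXmacs format are rejected rather than escaped, which is a deliberate simplification:

```python
import re

# Conservative patterns: anything outside them is rejected instead of escaped.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9-]*")
VALUE_PATTERN = re.compile(r"[A-Za-z0-9 .,+-]*")


def assign_line(name, value):
    """Build one <assign|name|value> line for a plain-text value."""
    text = str(value)
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"unsupported macro name: {name!r}")
    if not VALUE_PATTERN.fullmatch(text):
        raise ValueError(f"value needs escaping, not handled here: {text!r}")
    return f"<assign|{name}|{text}>"


# Hypothetical analysis results with plain-text values only.
results = {"p-value": 0.032, "nr-of-successes": 245}
for name, value in results.items():
    print(assign_line(name, value))
```

Structured values such as a fraction would need the corresponding TeXmacs markup as well, which this sketch leaves out.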

jeroen · April 14, 2021, 3:01pm

I can’t seem to find a Scheme function to define TeXmacs macros or to set environment variables. Is this possible?

pireddag · April 14, 2021, 3:51pm

I do not know if that exists or not, but if it does not exist, maybe the reason is that you can pass the Scheme tree for the assignment as the argument of a stree->tree function.

(stree->tree '(assign "b" "bb"))

I thought that this would make the code for the “reproducible research” document easier.

I would get Python to generate data in a different form:

(stree->tree '(assign "p-value" "0.032"))
(stree->tree '(assign "nr-of-successes" "245"))
(stree->tree '(assign "ratio" (frac "a" "b")))

but it did not work in the test that I did.
It worked in a Scheme session, but not loading a Scheme file.

jeroen · April 14, 2021, 3:59pm

Ah, I see the problem now. I was using texmacs-expand, which has a similar effect, but the effects only appear below the session, not for the whole document. It is the same with stree->tree.

pireddag · April 14, 2021, 4:23pm

I think the reason why it did not work for me with load, both in the preamble within an extern and inside the Scheme session, is that the command needs to insert the tree inside the document, and neither extern nor load do that.
I do not know (did not yet learn) how to do that with the tree-manipulation routines. If one does that, one could obtain the definitions from a file written with the Scheme syntax alone.

pireddag · April 14, 2021, 4:26pm

I find it “unnatural” that the scope of an environment variable starts from the point in the document where it is defined; I would find it more natural if it were the whole document.

tmbb · April 14, 2021, 4:27pm

This way I could just add

<extern|p-value>

to my document, right? If so, then this is very elegant! Sorry for not trying it out, I’m at a work computer without texmacs right now…

The only disadvantage of this option is that it doesn’t make it easy to distinguish visually “fixed” values from dynamic values. The <tag|id|value> macro adds a little flag that is useful. But I could have my expressions expand to things using the <tag|...|...> macros, so it’s ok. I’m quite happy to generate (simple) Scheme code from Python so that I don’t have to duplicate functionality between Scheme and Python.

I should namespace these identifiers somehow, though, to avoid overwriting default macros.
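Something like the following could do the namespacing on the Python side. It is only a sketch: the "rr" prefix and the helper name are arbitrary choices of mine, and converting underscores to hyphens is an assumption based on the hyphenated names used earlier in the thread:

```python
def macro_name(python_name, prefix="rr"):
    """Turn a Python identifier into a prefixed, hyphenated macro name."""
    cleaned = python_name.strip().lower().replace("_", "-")
    allowed = all((ch.isascii() and ch.isalnum()) or ch == "-" for ch in cleaned)
    if not cleaned or not allowed:
        raise ValueError(f"cannot derive a macro name from {python_name!r}")
    return f"{prefix}-{cleaned}"


# Hypothetical variable names from the analysis.
for name in ["p_value", "nr_of_successes", "ratio"]:
    print(name, "->", macro_name(name))
```

Keeping the renaming in one function means the prefix can be changed later without touching the analysis code.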

pireddag · April 14, 2021, 4:38pm

Either <p-value> or <value|p-value> would work, and also <tag|1|<value|p-value>> (I do not know what the id argument of tag does, though, I put 1 as a plausible value that crossed my mind).

The obstacle that I am not able to overcome is how to inject the trees you generate with the Scheme commands in the external files into TeXmacs. I know where it is described (the Scheme developer manual); I tried some time ago and did not succeed (it crashed TeXmacs). But maybe @jeroen and @mgubi know.

Edit: on second thought, this (once it works) looks equivalent to the suggestion of @mgubi above.

mgubi · April 14, 2021, 5:45pm

You can use a prefix for your macros. Also look at \merge and \compound, which allow you to create new tags programmatically; that is, you can still define your \tag macro with an argument myval which internally just invokes a \myprefix-$myval macro. This can be done using \merge{myprefix-}{\myval}.