
Rob Hyndman is a Professor of Statistics at Monash University, Australia. He is Editor-in-Chief of the International Journal of Forecasting and author of over 100 research papers in statistical science. He also maintains an active consulting practice, assisting hundreds of companies and organizations. His recent consulting work has involved forecasting electricity demand, tourism demand, the Australian government health budget and case volume at a US call centre.

Makefiles for R/LaTeX Projects

03.08.2013

Make is a marvellous tool used by programmers to build software, but it can be used for much more than that. I use make whenever I have a large project involving R files and LaTeX files, which means I use it for almost all of the papers I write, and almost all of the consulting reports I produce.

If you are using a Mac or Linux, you will already have make installed. If you are using Windows and have Rtools installed, then you will also have make. Otherwise, Windows users will need to install it. One implementation is in GnuWin.

A typical project of mine will include several R files containing code that fits some models and generates tables and graphs. I try to set things up so I can re-create all the results by simply running the R files. Then I will have a LaTeX file which contains the paper or report I am writing. The tables and graphs produced by R are pulled into the LaTeX file. Consequently, all I need to do is run all the R files, and then process the tex file, and the paper/report is generated.

Make relies on a Makefile to determine what it must do. Essentially, a Makefile specifies what files must be generated first, and how to generate them. So I need a Makefile that specifies that all the R files must be processed first, and then the LaTeX file.

The beauty of a Makefile is that it will only process the files that have been updated. It is smart enough not to re-run code if it has already been run. So if nothing has changed, running make does nothing. If only the tex file changes, running make will re-compile the tex document. If the R code has changed, running make will re-run the R code to generate the new tables and graphs, and then re-compile the tex document. All I do is type make and it figures out what is required.
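
As a rough sketch of that logic, here is a two-rule Makefile for an imaginary project. The file names analysis.R and paper.tex are made up for illustration, and the commands are the same ones used later in this post. Each rule says "rebuild the file on the left if any file on the right is newer than it".

```make
# Hypothetical two-step project: analysis.R produces output used by paper.tex.
# Recipe lines (the indented commands) must start with a tab character.

# paper.pdf is rebuilt when paper.tex or analysis.Rout is newer than it
paper.pdf: paper.tex analysis.Rout
	latexmk -pdf -quiet paper

# analysis.Rout is rebuilt when analysis.R is newer than it
analysis.Rout: analysis.R
	R CMD BATCH analysis.R
```

With this sketch, editing only paper.tex triggers just the latexmk step. Editing analysis.R triggers both steps, because the freshly written analysis.Rout is then newer than paper.pdf.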

A Makefile for LaTeX

It is easy to tell if the LaTeX document needs compiling: make simply has to check whether the pdf version of the document is older than the tex version. Here is a simple Makefile that will just handle a LaTeX document.

TEXFILE= paper
$(TEXFILE).pdf: $(TEXFILE).tex
	latexmk -pdf -quiet $(TEXFILE)

The first line specifies the name of my file, in this case paper.tex. The second line specifies that the pdf file must be created from the tex file, and the last line explains how to do that. MiKTeX users might prefer pdftexify instead of latexmk.

To use the above Makefile, copy the code into a plain text file called Makefile and store it in the same directory as your tex file. Make sure the command line under the rule starts with a tab character rather than spaces, since make requires it. Change the first line so the name of your tex file (without the extension) is used. Then type make from a command prompt within the same directory as the tex file, and it should do whatever is necessary to convert your tex to pdf.

Of course, you wouldn’t normally bother with a Makefile if that is all it did. But throw in a whole lot of R files, and it becomes very worthwhile.

A Makefile for R and LaTeX

We need a way for make to tell whether an R file has been run. If the R files are run using

R CMD BATCH file.R

then the output is saved as file.Rout. Then make only has to check if file.Rout is older than file.R.

I also like to strip out all the white space from the pdf figures created in R before I put them in a LaTeX document. There is a nice command pdfcrop which does that. (You should already have it on a Mac or Linux, and also on Windows provided you are using MiKTeX.) So I also want my Makefile to crop all images if they have not already been done. Once an image is cropped, an empty file of the form file.pdfcrop is created to indicate that file.pdf has already been cropped.

OK, now we are ready for my marvellous Makefile.

# Usually, only these lines need changing
TEXFILE= paper
RDIR= .
FIGDIR= ./figs
 
# list R files
RFILES := $(wildcard $(RDIR)/*.R)
# pdf figures created by R
PDFFIGS := $(wildcard $(FIGDIR)/*.pdf)
# Indicator files to show R file has run
OUT_FILES:= $(RFILES:.R=.Rout)
# Indicator files to show pdfcrop has run
CROP_FILES:= $(PDFFIGS:.pdf=.pdfcrop)
 
all: $(TEXFILE).pdf $(OUT_FILES) $(CROP_FILES)
 
# May need to add something here if some R files depend on others.
 
# RUN EVERY R FILE
# Name the output explicitly so the .Rout lands in RDIR, where make looks for it
$(RDIR)/%.Rout: $(RDIR)/%.R $(RDIR)/functions.R
	R CMD BATCH $< $@
 
# CROP EVERY PDF FIG FILE
$(FIGDIR)/%.pdfcrop: $(FIGDIR)/%.pdf
	pdfcrop $< $< && touch $@
 
# Compile main tex file and show errors
$(TEXFILE).pdf: $(TEXFILE).tex $(OUT_FILES) $(CROP_FILES)
	latexmk -pdf -quiet $(TEXFILE)
 
# Run R files
R: $(OUT_FILES)
 
# View main tex file
view: $(TEXFILE).pdf
	evince $(TEXFILE).pdf &
 
# Clean up stray files
clean:
	rm -fv $(OUT_FILES) 
	rm -fv $(CROP_FILES)
	rm -fv *.aux *.log *.toc *.blg *.bbl *.synctex.gz
	rm -fv *.out *.bcf *blx.bib *.run.xml
	rm -fv *.fdb_latexmk *.fls
	rm -fv $(TEXFILE).pdf
 
.PHONY: all clean R view

Download the file here. For most projects I copy this file into the main directory of my project, then all I have to do is modify the first few lines. RDIR specifies where the R files are kept and FIGDIR specifies where the figures are kept. Normally I keep these together, but sometimes they might be in separate directories.

Now make will do everything necessary: run the R files, crop the pdf graphics, and process the LaTeX document. But it won’t do any steps that don’t need doing.

make R will only process the R files.

make view will run the pdf viewer, after updating the pdf file if necessary.

make clean will delete all the files generated by latex or by make, so that the entire process must be run again at the next make command.

Notice that my R files all depend on functions.R. This is a file that contains project-specific functions. If this file is updated, all the other R files will need to be re-run as well.

For many projects, some R files will depend on some others having already run. For example, read.R may read in the data and reformat it for analysis, while plot.R might produce some graphs assuming that read.R has already run. To ensure make knows about this dependency, we need to add a rule:

# plot.R needs read.R to have been run, so depend on read.Rout (not read.R)
$(RDIR)/plot.Rout: $(RDIR)/plot.R $(RDIR)/functions.R $(RDIR)/read.Rout
	R CMD BATCH $< $@

This should be inserted where I have the comment # May need to add something here if some R files depend on others.
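
A shorter variant is possible in GNU make, and I offer it only as a sketch. A rule with no command line simply adds prerequisites to a target, and the command is still taken from the %.Rout pattern rule. The prerequisite here is read.Rout rather than read.R, on the assumption that the .Rout file is what shows that read.R has actually been run.

```make
# Extra prerequisite only: no recipe here, so make still uses the
# $(RDIR)/%.Rout pattern rule to run plot.R. In that recipe, $< remains
# plot.R, because it is the first prerequisite named by the pattern rule.
$(RDIR)/plot.Rout: $(RDIR)/read.Rout
```

This keeps the R CMD BATCH command in one place, which helps if you later change how the R files are run.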

This Makefile works on Linux. Mac and Windows users will need to replace evince by whatever pdf viewer they prefer.
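
One way to avoid editing the rule on each machine is to put the viewer in a variable. This is a sketch rather than part of my original file, and the variable name VIEWER is my own choice. It would replace the existing view rule.

```make
# Default pdf viewer; ?= means a value from the command line or the
# environment takes precedence. VIEWER is a made-up variable name.
VIEWER ?= evince

# View main tex file
view: $(TEXFILE).pdf
	$(VIEWER) $(TEXFILE).pdf &
```

On a Mac, for example, make view VIEWER=open should then hand the file to the default pdf application. Windows users would substitute whatever command launches their viewer.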

Published at DZone with permission of Rob J Hyndman, author and DZone MVB. (source)
