Documentation:WeBWorK/The WeBWorKiR Project: Integrating WeBWorK with R/Authoring Guide

From UBC Wiki

WeBWorKiR User Guide

This guide is for problem authors who want to use R's facilities in their WeBWorK questions. We assume that you already know how to create problems with the PG language (which itself is a subset of Perl) and are familiar with the concept of macros in PG. We also assume that your WeBWorKiR environment has been set up as described in the installation guide.

Background

WeBWorKiR is a system that involves two separate servers: WeBWorK and R. In normal operation, when a student accesses WeBWorK via a web browser to work on homework problems, WeBWorK runs the PG code defining the problem to compute:

a) values used in the question, for example a random vector of five integers

b) the text of the question, typically given in a format resembling LaTeX and optionally displaying values computed in (a)

c) values of the correct answer, for example, the mean of the vector calculated in (a)

WeBWorKiR adds to this the ability to run arbitrary computations in R and make their results available to the rest of the PG code as if they were constructed using the standard PG functions. (In our running example, we could construct the random vector in R with `sample`, and/or calculate its mean with `mean`.) The reason to do this is R's rich library of high-quality statistical functions as well as its graphical abilities. While in theory both could be replicated in PG, that would take a huge effort that is better spent simply using the functionality already available in R.

This works as follows: WeBWorK uses a Perl module that can talk to a server running the Rserve software, which allows remote clients to execute R code on the Rserve server and returns the result of the execution in response. The Perl module converts this response from R's native values (e.g., a generic vector, aka "list") to values that Perl understands (e.g., an array), making them available to the rest of the PG code.

Loading the macros

To use R code in a problem, include "RserveClient.pl" in the "loadMacros" call at the start of the question. For example:

loadMacros(
  "PGstandard.pl",     # Standard macros for PG language
  "MathObjects.pl",
  "RserveClient.pl"    # <--- R integration
);

Basic Rserve macros

The Rserve software creates an R session for each remote client. This means that clients' interactions with R are kept separate from each other, just as if you started R twice on your local computer. A session persists as long as the client is connected, so that multiple calls from the client using the same session see the objects created in previous calls. (This behaviour mirrors what happens in a local session, where each R command you execute at the console after pressing ENTER sees the results of earlier commands.) When a session is closed, its contents are wiped without a trace, just like quitting an R application run locally.

The RserveClient offers macros to start and finish a session, and execute R commands in the current session:

  1. rserve_eval("some R code"): sends the R code given as its string argument to Rserve for execution and returns an array representation of the result. For instance, the value of rserve_eval("pi") is an array with the single element 3.14159265358979. To keep this value and use it in the rest of the problem, assign it to an array variable. For example:

    @pi = rserve_eval("pi");

    Note: Multiple calls within the same problem share the R session and the object workspace, so you can break up your R code into as many rserve_eval statements as you'd like.

  2. rserve_start(), rserve_finish(): Start up and close the current connection to the Rserve server. In normal use, these functions are completely optional because the first call to rserve_eval will start the Rserve session if one is not already open. Similarly, the current session will be automatically closed at the end of the problem. Other than backward compatibility, the only reason for using these functions is to start a new clean session within a single problem, which shouldn't be a common occurrence.
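As a minimal sketch of the running example from the Background section, the fragment below splits the R work across two rserve_eval calls that share one session. It is a PG fragment, so it assumes a working WeBWorKiR setup with RserveClient.pl loaded and will not run as standalone Perl. The variable names @x and $mean and the sampling range 1:20 are illustrative choices only.

```perl
# First call: create the vector in R and return it, so that @x receives
# the five sampled integers as a Perl array.
@x = rserve_eval('x <- sample(1:20, 5); x');

# Second call: the R object `x` is still available because both calls
# share the same session. The list assignment picks out the single
# element of the returned array.
($mean) = rserve_eval('mean(x)');
```

The parentheses around $mean matter: rserve_eval returns an array, and the list assignment stores its first element in the scalar.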

A note on Perl quoting rules

Beware of Perl's quoting rules when writing R code. Text in double quotes is interpreted for escape sequences (e.g., "\n" represents a newline) and variables (e.g., "The value of pi is $pi[0]" will be interpolated into "The value of pi is 3.14159265358979", given the code above). This is a problem if you're trying to extract an element of a list by name using the "$" operator in R, because the text following it will be interpreted as a variable. For instance, running rserve_eval("cars$speed") will not return the "speed" column of the standard "cars" dataset, because "$speed" in the string will be replaced by the value of the PG variable $speed. If that variable is not yet defined, its value is an empty string, so the R code that actually gets executed is simply "cars". Instead, use single quotes, which prevent variable and escape sequence interpolation and keep the string exactly as entered: rserve_eval('cars$speed').
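The difference between the quoting styles can be checked in plain Perl, without any R server. The sketch below only builds the strings that would be sent to R; the variable $speed and its value are hypothetical, chosen to make the interpolation visible.

```perl
use strict;
use warnings;

# Hypothetical PG variable that happens to share its name with an R column.
my $speed = 42;

# Double quotes: Perl interpolates $speed before R ever sees the string.
my $double = "cars$speed";

# Single quotes: the string is passed through exactly as typed.
my $single = 'cars$speed';

# Double quotes with an escaped dollar sign also keep the R code intact.
my $escaped = "cars\$speed";

print "double:  $double\n";     # prints "double:  cars42"
print "single:  $single\n";     # prints "single:  cars$speed"
print "escaped: $escaped\n";    # prints "escaped: cars$speed"
```

Only the single-quoted and escaped forms contain the R expression cars$speed as intended.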

On the other hand, sometimes you might actually want variable interpolation, for instance to construct R code that uses the values of variables created with PG functions. For instance:

Context("Numeric");

$pi = Real("pi");
@difference = rserve_eval("pi - $pi");

will calculate the difference between R's and PG's values of "pi" and put the result in the @difference array. Note that the same R code can be constructed using Perl's string concatenation operator, the dot ("."): rserve_eval('pi - ' . $pi). Personally, I recommend sticking with single quotes to prevent unwanted surprises, and using the dot operator when you need to include the value of a PG variable.
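A small plain-Perl sketch of the recommended style follows. It only assembles the R code as a string, so it runs without WeBWorK or Rserve. The variables $mean and $sd, their values, and the choice of R's pnorm function are illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical values that a problem might have generated with PG functions.
my $mean = 1.5;
my $sd   = 2;

# Single-quoted R fragments joined to Perl values with the dot operator.
my $r_code = 'pnorm(3, mean=' . $mean . ', sd=' . $sd . ')';

# sprintf is an alternative when several values are spliced in.
my $r_code_sprintf = sprintf('pnorm(3, mean=%s, sd=%s)', $mean, $sd);

print "$r_code\n";            # prints "pnorm(3, mean=1.5, sd=2)"
print "$r_code_sprintf\n";    # prints "pnorm(3, mean=1.5, sd=2)"
```

In a problem, the assembled string would then be passed on with rserve_eval($r_code).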

Displaying R graphics

R has excellent facilities for creating production-quality statistical graphics, from simple scatterplots to complex spatial visualizations overlaid on geographical maps. These graphics can be produced in a variety of formats (in R parlance, devices), from the user's monitor to PDF or JPG files. The RserveClient allows the author to present these graphics in the question by bracketing the R graphing code with the macros rserve_start_plot and rserve_finish_plot, and then showing the produced image in the question with PG's image macro (see the WeBWorK documentation).

The following code is a complete example:

DOCUMENT();

loadMacros(
   "PGstandard.pl",     # Standard macros for PG language
   "MathObjects.pl",
   "RserveClient.pl",
);

# Print problem number and point value (weight) for the problem
TEXT(beginproblem());

#  Setup
Context("Numeric");

$mean = random(-2, 2, .5);

$img = rserve_start_plot('png');
rserve_eval('curve(dnorm(x, mean=' . $mean . '), xlim=c(-4, 4)); 0');
$image_path = rserve_finish_plot($img);

#  Text
Context()->texStrings;
BEGIN_TEXT

What is the mean of the normal distribution shown in the figure below:
\{ ans_rule(5) \}

$PAR

\{ image($image_path, width=>300, height=>300) \}
END_TEXT

Context()->normalStrings;

#  Answers
ANS(Real($mean)->cmp);

ENDDOCUMENT();

The four key lines are as follows:

  1. $img = rserve_start_plot('png'): sets up R to plot to a 'PNG' file and returns a unique plot identifier to be used later.

  2. rserve_eval('curve(...)'): runs the plotting commands on the R server.

  3. $image_path = rserve_finish_plot($img): completes the plotting to the PNG file and transfers it to a location on the WeBWorK server. Returns the path of the file, which is stored in the Perl variable $image_path and later used as the first argument to the image macro.

  4. \{ image($image_path, width=>300, height=>300) \}: inserts the image into the web page.

Transferring files from the R server

Sometimes it may be convenient to make a file from the R server available to the student via a link in WeBWorK, for instance when R is used to generate a (potentially randomized) data file that the student can download to work on the problem offline. The macro rserve_get_file REMOTE_NAME [, LOCAL_NAME] transfers the file REMOTE_NAME from the R server to WeBWorK's temporary file area and returns the name of the local file, which can then be used by the htmlLink macro. Specifying LOCAL_NAME is optional; if it is not specified, the filename portion of REMOTE_NAME is used.
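To illustrate what "the filename portion" means, here is a plain-Perl sketch using the core File::Basename module. It is not the macro's implementation, only a demonstration of how a default name could be derived from a path. The remote path shown is hypothetical.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical path of the kind R's tempfile() might produce on the R server.
my $remote_file = '/tmp/RtmpAbc123/file1a2b3c.csv';

# The "filename portion" is what remains after the directories are dropped.
my $default_name = basename($remote_file);

print "$default_name\n";    # prints "file1a2b3c.csv"

# In a PG problem, an explicit LOCAL_NAME would be passed as the second
# argument, e.g. rserve_get_file($remote_file, 'cars.csv'), following the
# signature described above.
```

Because temporary file names generated by R are not meaningful to students, passing a descriptive LOCAL_NAME may make the download easier to recognize.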

The following code is a complete example:

DOCUMENT();

loadMacros(
   "PGstandard.pl",     # Standard macros for PG language
   "MathObjects.pl",
   "RserveClient.pl",
);

# Print problem number and point value (weight) for the problem
TEXT(beginproblem());

#  Setup
Context("Numeric");

my ($intercept, $slope) = rserve_eval('coef(lm(log(dist)~log(speed), data = cars))');

my ($remote_file) = rserve_eval('filename <- tempfile(fileext=".csv"); write.csv(cars, filename); filename');
my $local_file = rserve_get_file($remote_file);

($local_url = $local_file) =~ s|$tempDirectory|$tempURL|;

#  Text
Context()->texStrings;
BEGIN_TEXT

What is the slope of the linear regression of log-transformed stopping distance vs. log-transformed car speed in the dataset linked below:
\{ ans_rule(5) \}

$PAR

\{ htmlLink($local_url, "Download") \} the problem data (CSV file).

END_TEXT

Context()->normalStrings;

#  Answers
ANS(Real($slope)->cmp);

ENDDOCUMENT();

The four key lines are as follows:

  1. my ($remote_file) = rserve_eval('filename <- tempfile(fileext=".csv"); write.csv(cars, filename); filename'): stores the desired dataset into a temporary CSV file on the R server and returns its path, which is stored in Perl variable $remote_file.

  2. my $local_file = rserve_get_file($remote_file): transfers the file from the R server to WeBWorK's temporary file area and returns its path, which is stored in Perl variable $local_file.

  3. ($local_url = $local_file) =~ s|$tempDirectory|$tempURL|: converts the local file path into a URL that can be used as an argument to the htmlLink macro, saving it in Perl variable $local_url.

  4. \{ htmlLink($local_url, "Download") \}: inserts the link to the downloaded file into the web page.
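The path-to-URL substitution in step 3 can be tried in plain Perl. The sketch below is a slightly more defensive variant, not the form used in the example above: it anchors the match at the start of the path and quotes the directory with \Q...\E so that it is matched literally. The directory, URL, and file names are hypothetical stand-ins for the values PG provides in $tempDirectory and $tempURL.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for PG's $tempDirectory and $tempURL variables.
my $tempDirectory = '/opt/webwork/webwork2/htdocs/tmp/Stats101/';
my $tempURL       = '/webwork2_files/tmp/Stats101/';

# Hypothetical path of the kind rserve_get_file might return.
my $local_file = $tempDirectory . 'data/cars.csv';

# Copy the path, then swap the directory prefix for the URL prefix.
# ^ anchors the match at the start; \Q...\E treats the directory literally,
# so characters such as "." or "+" in the path are not read as regex syntax.
(my $local_url = $local_file) =~ s|^\Q$tempDirectory\E|$tempURL|;

print "$local_url\n";    # prints "/webwork2_files/tmp/Stats101/data/cars.csv"
```

The parenthesized assignment copies $local_file first, so the substitution changes only $local_url and leaves the original path untouched.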