NeXL: Excel Connector for .NET

January 31, 2017

Using R in Excel with NeXL Connector and F# RProvider

Filed under: F#, R — Adam Mlocek @ 11:08 pm

Introduction

Excel is probably the most popular environment for data and statistical analysis but its functionality is quite limited when compared to R or other statistical software. In this post I would like to present a new way of accessing R functions in Excel which is easier and more productive than any other approach so far. I will start with showing a simple example of fitting a multiple linear model to a sample R data and then explain how it works.

Linear model with R lm() function

If you have Excel 2007 or later and R, e.g. Microsoft R Open, you can try it yourself. Simply download this spreadsheet: RNet.xlsm, right click to Unblock it and follow steps below.

In this case scenario I will try to fit a linear model to R mtcars dataset. This dataset is of course available locally if you have R installed on your machine but I will access it from a remote location here.

My RNet.xlsm spreadsheet has a few User Defined Functions which start with RNet. One of them is RNet.readTable which maps to a similar function in R. I will use it to download the data into Excel, passing as argument the remote location of my dataset:

readtable

As you can see the UDF does not actually return the data directly. Instead, it creates a unique string identifier for R DataFrame object. So how is it useful? We can use this object handle as an argument in other UDFs. Let’s start with showing the data:

showtable

I have passed the data frame object handle to RNet.asXlTable() UDF. This UDF can return the whole data frame without using Excel array formula which is very convenient in the usual case when we do not know the size of the data upfront.

You might be wondering now what will happen if we point to a different dataset? Let’s try that:

ShowTable2.png

I have changed the path from mtcars to airquality which is just another R sample dataset. As you can see a new object handle has been created and Excel has automatically cleared the previous data and updated it to show the new data frame. So in practice you can think of RNet.asXlTable as being an array function with automatic resizing.

Now, how do we add a linear model? In R we would use lm() function. In the spreadsheet there is a RNet.lm() UDF which maps to the R counterpart. It needs a string formula which defines the model and a data frame. Let’s revert to mtcars dataset and try it:

Model.png

We get a new object handle, this time of type SymbolicExpression. This is how R objects are usually represented in .NET and returned to Excel. So how can we use this handle? First, we can peek inside:

ModelMembers.png

I have used the RNet.asXlTable() UDF again. You can pass any SymbolicExpression object to it and it will try to show it in a table format. In this case our model is an R list and we can view all list members and their type. We can also drill down and show a specific list member, e.g. “coefficients”:

coefficients

And so we get a whole R based model with just a few formulas. We can now use Excel recalculation engine to fit different models by just changing the formula in B4:

Coefficients2.png

As you can see the model object handle has been recreated and a new set of coefficients has been automatically shown.

R has of course many more functions. We could expose all or some of them by creating a UDF for each but there is also another way of doing that. For any R function we can create a function object handle and then apply it to input arguments with special UDFs:

Function.png

RNet.eval() is a special UDF which can parse and evaluate R function, including R function inline definition, and return it as an object handle to Excel. In this case I get an object which represents R lm() function. We can now apply it and recreate our linear model:

ModelInvoked.png

I have simply passed my formula and data frame as named arguments to RNet.invokeNamed() UDF. The first argument needs to be a function handle. Alternatively, I can pass the arguments by position:

modelinvoked2

I have used RNet.invoke() UDF which expects the arguments to be passed by position. The function handle is still passed as the first argument. We can now verify that the model has actually been created:

Coefficients3.png

The UDFs in my RNet spreadsheet are embedded in the spreadsheet and will not be available once I close it. If you want to use them in other spreadsheets then you can simply activate it as Excel Customization:

Menu.png

How does it work?

The spreadsheet is powered by NeXL Connector and F# R Type Provider. NeXL is available as a Nuget package and allows me to embed .NET functions as UDFs in Excel spreadsheets.It is free to use, including commercial use (CC 4.0 License), but requires permission to redistribute. With F# R Type Provider I can easily access R functionality from .NET.

All UDFs in RNet spreadsheet are defined in an F# project. The project is open source and you can find it on Github. You can clone and build it on your local machine.

I have used VS Code and F# Ionide plugin to create the project. Any R distribution should be fine, I have used Microsoft R Open.

The UDFs are defined in a module RNet. The module is marked with [<XlQualified(true)>] attribute so that their name in Excel always starts with RNet for easy discovery.

Most UDFs are very simple and call R functions directly, e.g.:

let lm(formula : string, dataFrame : DataFrame) =
    R.lm(formula = formula, data = dataFrame)

F# R Type Provider exposes R function lm() and I simply pass input arguments to it. It returns a SymbolicExpression object which is a .NET representation of the R model object. Note that the 2nd argument is a DataFrame so in Excel we first have to create an object handle of that type. This is of course done in readTable() function:

let readTable(file : string, separator : string option,
              headers : bool option, rowNamesCol : int option) =
    let separator = defaultArg separator ","
    let headers = defaultArg headers true
    let rowNamesCol = defaultArg rowNamesCol 1
    let prms = namedParams ["file", box file; "sep", box separator;
                            "header", box headers; "row.names", box rowNamesCol]
    let frame = R.read_table(prms)
    match frame with
        | DataFrame(df) -> df
        | _ -> raise (new InvalidOperationException("Not a data frame"))

Again, this function simply calls R read_table(). Note there are some optional arguments. If you do not specify those in Excel then None is automatically passed to the function. The R function is called with named parameters this time so I create them with a convenience function namedParams. It returns a DataFrame as expected.

There are a few similar functions which map directly to R, including the higher order functions for applying a function handle to arguments: invoke() and invokeNamed().

The most interesting is the function asXlTable(). As you can see it expects a SymbolicExpression and optionally a list member. Using F# active patterns we can then process the object and convert it to XlTable object. XlTable represents a table of data which can be shown in an Excel range and is defined in NeXL. It is a special type which signals to NeXL runtime that the result is a table and should be sent to a range below and to the right of the calling cell. A comment is also automatically added to the calling cell, e.g. XlTable 2 x 3.

Finally, the project is built with Fake. The script is in build.fsx. The “Build” target will build the project so that we get a .NET assembly with all UDFs inside. Then “Embed” target will package NeXL Runtime, UDF assembly and RProvider assemblies into a macro enabled Excel spreadsheet. This spreadsheet is self contained and can be used on any machine with Excel 2007 or later, .NET and R.

7 Comments »

  1. Very nice! Using the repo as a template, I was able to embed my own user-defined functions written in F# into an Excel file.
    (I had to delete the previous FCell addins, otherwise Excel would crash).
    One question: Say I have an F# function that has as input a Deedle Frame. Is it possible to pass a table in Excel as an argument?

    Comment by Jesus González — February 1, 2017 @ 7:34 am

    • Great:)
      Ideally you would have a function which returns a Deedle frame into Excel as an object handle. You can pass it then to other functions directly.
      If your data is already in Excel range then you will need to create a conversion function similar to convToSymExpr : XlData -> SymbolicExpression in NeXL.RNet project.
      You can also create a function which converts Deedle frame into NeXL XlTable. There is an example in NeXL.IMF repo on Github.

      Adam

      Comment by Adam Mlocek — February 1, 2017 @ 11:38 am

      • Thanks! Again, it worked! I’m able to use functions that require a Deedle.Frame
        One more question, is there support for Excel dates? I don’t see a XlDate, but I see an XlConverter… Do you have an example of parsing Excel dates?

        Comment by Jesús González — February 3, 2017 @ 6:31 am

      • Great:)
        All primitive data types (DateTime, bool, string, float, int…) and their vectors/2d arrays/Nullables/options are converted automatically.

        Adam

        Comment by Adam Mlocek — February 3, 2017 @ 8:19 am

      • “All primitive data types (DateTime, bool, string, float, int…) and their vectors/2d arrays/Nullables/options are converted automatically.”
        But, if I need to do pattern matching (from an excel value to an F# value), what is the similar of, say, XlNumeric(number) but for an Excel Date?

        Comment by Jesús González — February 3, 2017 @ 8:22 am

  2. Dates in Excel are just numbers with optional cell format. You cannot get the cell format into UDFs because Excel C API only gets the cell value which is simply numeric.
    So you can only guess if it is a date or a number.

    Comment by Adam Mlocek — February 3, 2017 @ 8:30 am

  3. […] Using R in Excel with NeXL Connector and F# RProvider – Adam Mlocek […]

    Pingback by F# Weekly #6, 2017 – Sergey Tihon's Blog — February 4, 2017 @ 8:02 pm


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: