Build a R Package with Rcpp in 30 minutes

Example : Poisson LASSO Regression

Rcpp is package with which you could write c++ code within R.

You could simply write a inline C++ code in R like this:

library(Rcpp)
# Define a Rcpp function
add_vec <- Rcpp::cppFunction(
  
  'Rcpp::NumericVector add_vec( Rcpp::NumericVector a, Rcpp::NumericVector b){
  
  return a+b;
  
  }'
  
)
# See the result
t(add_vec(c(1,2,3), c(2,3,4)))

Or you could save your .cpp file and call the function from R. With this powerful tool, you gain the productivity of R and the speed of C++.

Let's try to build a small package with Rcpp . Take the Poisson LASSO regression as an example. I'm gonna to install this model with ADMM algorithm and through RcppArmadillo package.

So let's get start:

Prerequisite

  • R
  • Rcpp
  • RcppArmadillo

Build the Skeleton of the Package

Rcpp provides function that could help you build a package very quickly. Execute next line will build a package skeleton named “PoisLASSO” in you working directory:

Rcpp.package.skeleton('PoisLASSO') # Your package name should go there

In the folder PoisLASSO, there is a folder called src where you should put all your C++ files there. Let's build a new C++ file called "*PoisLASSO.cpp*". And we'll put our function in this file.

Poisson LASSO Regression and ADMM algorithm

I'll skip the detail of this algorithm. But the main idea is to iteratively optimize the objective function and its conjugate problem. The objective function is solved with Lagrange multipler method with augmented variable. And the algorithm is presented as below:

  
  #include <iostream>
  #include <RcppArmadillo.h>
  //[[Rcpp::depends(RcppArmadillo)]]
  using namespace Rcpp;
  using namespace arma;
  using namespace std;
  
  arma::vec ADMM_Lasso( arma::mat X, arma::colvec Y, double lambda, double eta){
  int nVar  = X.n_cols;
  int marker = 1;
  double tol   = 0.01;
  double tol_1 = 0.1;
  double tol_2 = 0.1;
  int terminate = 0;
  int converge  = 0;
  // Initialization
  arma::vec b_gd    = rnorm(X.n_cols, 0, 1) ;
  arma::vec z_admm  = rnorm(X.n_cols, 0, 1) ;
  arma::vec z_old = z_admm;
  arma::vec mu_admm = rnorm(X.n_cols, 0, 1) ;
  // Using ADAM GD method
  NumericVector m_temp;
  NumericVector v_temp;
  arma::vec m;
  arma::vec v;
  arma::vec steps;
  
  while(!terminate){
    m_temp = rep(0,X.n_cols);
    m = as<arma::vec>(m_temp); 
    v_temp = rep(0,X.n_cols);
    v = as<arma::vec>(v_temp);
    converge  = 0;
    int marker_inner = 0;
    
    // Optimize Beta
    while(!converge){
      arma::vec G_1(X.n_cols);
      arma::vec G_2(X.n_cols);
      arma::vec G_3(X.n_cols);
      arma::vec G(X.n_cols);
      for(int i=0; i<nVar; i++){
        G_1(i) = sum(X.col(i) % Y);
        G_2(i) = sum(exp(X * b_gd) % X.col(i));
      }
      G_3 = (b_gd - z_admm) + mu_admm;
      G = G_1 - G_2 + G_3;
      m = eta * 0.1*m + eta * 0.9*G;
      v = 0.1*v + 0.9*G % G;
      steps = m % pow(v, -0.5);
      b_gd  = b_gd  +  steps;
      Rcout << "b_gd : " << trans(b_gd) << std::endl;
      marker_inner = marker_inner + 1;
      
      if( max(steps) < tol || marker_inner > 20 ){
        converge = 1;
        break;
      }
    }
    // Soft Threshhold Z
    for(int i=0; i<nVar; i++){
      if(b_gd(i)+mu_admm(i) > lambda){
        z_admm(i) = b_gd(i) + mu_admm(i) - lambda;
      }
      else if(b_gd(i)+mu_admm(i) < -lambda){
        z_admm(i) = b_gd(i) + mu_admm(i) + lambda;
      }
      else{
        z_admm(i) = 0;
      }
    }
    
    // For monitor reason
    Rcout << "z_admm : "  << trans(z_admm) << endl;
    // Updating augmented mu
    mu_admm = b_gd - z_admm + mu_admm;
    // For monitor reason
    Rcout << "mu_admm : " << trans(mu_admm) << endl;
    // Check Convergence
    int cond_1 = (max(abs(b_gd - z_admm))  < tol_1);
    int cond_2 = (max(abs(z_admm - z_old)) < tol_2);
    if((cond_1==1) && (cond_2==1)){
      terminate = 1;
      break;
    }
    z_old = z_admm;
    marker = marker + 1;
    Rcout << "Iteration : " << marker << std::endl;
  }
  Rcpp::Rcout << trans(b_gd)    << std::endl ;
  Rcpp::Rcout << trans(z_admm)  << std::endl ;
  Rcpp::Rcout << trans(mu_admm) << std::endl ;
  return(b_gd);
}

Build the Package and Upload to Github

Go to the upper directory of your current directory, which contain the PoissLASSO folder. And open your command line (you could do it with devtools as well), and write:

R CMD build PoissLASSO
R CMD check PoissLASSO

If everything is OK, you could now install your package locally devtools::install('PoissLASSO') .

And upload your package to Github with:

git init
git add .
git remote add -v "Your Repo URL go here"
git commit -m "First Commit"
git push -u origin master

And after finishing this, you should be able to download your own package with

devtools::install_github('Your Git Account Name/Your repo name')
library('PoissLASSO')
comments powered by Disqus