
28 February 2012

Adventures in RStudio Server: Apache2, HTTPS, Security, and Amazon EC2.

I just put a fresh install of Ubuntu Server (10.04.4 LTS) on one of our machines.  As I was doing some post-install config, I accidentally installed RStudio Server, and subsequently fell down an exciting little rabbit-hole of server configuration and "ooooh-lala!" playtime.

A friend sang the wonders of RStudio Server to me recently, and I filed it under "things to ignore for now".  Just another thing to learn, right?  Turns out, the RStudio folks do *great* work and write good docs, so I hardly had to learn anything.  I just had to dust off my sysadmin skills and fire up some Google.

I'm a little concerned about running web services on public-facing machines.  Even more so, given that R provides fairly low-level access to operating system services.  Still, I was impressed to see system user authentication.

I followed the docs for running apache2 as a proxy server, and learned a little about Apache in the process.  Since I made it this far, I figured I'd run it through HTTPS/SSL, add some memory limitations, etc.  I'm still not entirely convinced this is secure -- it seems that running it in a virtual machine or chroot jail would be ideal.
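
For anyone curious what the proxy-plus-SSL piece looks like, here is a minimal sketch rather than my exact config. It assumes the proxy, proxy_http, and ssl modules are enabled, that RStudio Server is listening on its default port 8787, and that a certificate already exists; the file name, hostname, and certificate paths are all placeholders I made up for illustration. Memory limits live on the RStudio side and aren't shown here.

```apache
# Hypothetical site file, e.g. /etc/apache2/sites-available/rstudio-ssl
# (file name, ServerName, and certificate paths are placeholders).
# Assumes: a2enmod proxy proxy_http ssl
<VirtualHost *:443>
    ServerName rstudio.example.com

    # Terminate HTTPS at Apache.
    SSLEngine on
    SSLCertificateFile    /etc/ssl/certs/rstudio.example.com.crt
    SSLCertificateKeyFile /etc/ssl/private/rstudio.example.com.key

    # Reverse proxy only; never act as an open forward proxy.
    ProxyRequests Off

    # Apache 2.2 access syntax (what Ubuntu 10.04 ships);
    # Apache 2.4 would use "Require all granted" instead.
    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    # Hand every request to RStudio Server on its default port.
    ProxyPass        / http://localhost:8787/
    ProxyPassReverse / http://localhost:8787/
</VirtualHost>
```

With something like this in place, it also makes sense to set www-address=127.0.0.1 in /etc/rstudio/rserver.conf so that RStudio only answers on the loopback interface and all outside traffic has to come through Apache.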

On the other hand, I ran across this post on running RStudio Server inside Amazon EC2 instances.  Nighttime EC2 spot prices on "Quadruple Extra Large" instances (68.4 GB of memory, 8 virtual cores with 3.25 EC2 Compute Units each) fell below $1 an hour tonight, which is cheap enough to play with for an hour or two -- take it through some paces and see how well it does with a *very* *large* *job* or two.  Instances can now be stopped and saved to EBS (Elastic Block Store), and so only need to be configured once, which really simplifies matters. In fact, I'm wondering if RStudio (well, R, really) is my "killer app" for EC2.

Overall, I was really impressed at how fast and easy this was to get up and running. Fun times ahead!

1 comment:

  1. Currently I work for Dell and thought your blog is really impressive. I think a server is a computer or device on a network that manages network resources.