13 June 2013

Secure webserver on the cheap: free SSL certificates

Setting up an honest, fully-certified secure web server (i.e. https) on the cheap can be tricky, mainly due to certificates. Certificates are only issued to folks who can prove they are who they say they are. This verification generally takes time and energy, and therefore money. But the great folks at https://www.startssl.com/ have an automated system that verifies identity and automatically issues the associated SSL certificates for free.

Validating an email is easy enough, but validating a domain is trickier -- it requires a receiving mailserver that startssl can mail a verification code to. Inbound port 25 (mail server) is blocked by my ISP, the University of New Mexico (and honestly, I'd rather not run an inbound mail server).

I manage my personal domain through http://freedns.afraid.org/. They provide full DNS management, as well as some great dynamic DNS tools. They're wonderful. But they don't provide any fine-grained email management, just MX records and the like.

The perfect companion of afraid.org is https://www.e4ward.com/. They have mail servers that will conditionally accept mail for specific addresses at a personal domain, and forward that mail to an email account. This lets me route specific addresses @mydomain.com, things like postmaster@mydomain.com, to my personal gmail account. E4ward is a real class act. They manually moderate/approve new accounts, so there's a bit of time lag. To add a domain, they also require proof of control via a TXT record (done through afraid.org).

This whole setup allowed me to prove to startssl.com that I owned my domain without running a mail server or paying for anything other than the domain. The result is my own SSL certificates. I'm running a Pylons webapp with Apache2 and mod_wsgi. In combination with Python's repoze.what, I get secure user authentication over https without any snakeoil.

Hat-tip to this writeup, which introduced me to e4ward.com and their mail servers.

Finally, there are a number of online tools to query domains. dnsstuff.com was one of the better ones I found. It takes longer to load, but gives a detailed report of domain configuration, along with suggestions. A nice tool to verify that everything is working as expected.
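For a quick local check alongside the web tools, here is a minimal sketch using dig (assuming it is installed). The helper name mx_points_to and the host mx.forwarder.example are hypothetical placeholders, not e4ward's real server names; substitute whatever MX host the forwarding service tells you to use.

```shell
# Hypothetical helper: reads `dig +short MX` output on stdin and succeeds
# if any record names the expected mail host (placeholder names below).
mx_points_to() {
    awk -v host="$1" '
        { mx = tolower($2); sub(/\.$/, "", mx)   # drop the trailing dot dig prints
          if (mx == tolower(host)) found = 1 }
        END { exit !found }'
}

# Usage (needs network access):
#   dig +short MX mydomain.com | mx_points_to mx.forwarder.example && echo "MX ok"
#   dig +short TXT mydomain.com    # should list the proof-of-control TXT record
```

The helper only inspects text, so it can be tried on pasted dig output before any DNS changes have propagated.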


11 June 2013

Learning new fileserver tricks: RAID + LVM

I've finally gotten comfortable with Linux's software RAID, aka mdadm. I've been hearing about LVM, and I finally took the plunge and figured out how to get the two to play together. Of course, a benefit of RAID is data redundancy. The big benefit I see from LVM is getting to add/remove disk space without repartitioning. Once RAID is working, stacking LVM on top was easy enough, especially for my use case of a single big filesystem. I was able to move all my data onto one RAID array, build a new filesystem on top of a logical volume spanning the other two arrays, move the data to the new filesystem, and then add the final RAID array to the volume group and resize the logical volume and filesystem. Thus, I end up with 3 separate RAID arrays glommed together into a single, large filesystem.

## Tell LVM about RAID arrays 
sudo pvcreate /dev/md2
sudo pvcreate /dev/md3

## Create a volume group from empty RAID arrays
sudo vgcreate VolGroupArray /dev/md2 /dev/md3

## Create a logical volume named "archive", using all available space 
sudo lvcreate -l 100%FREE -n archive VolGroupArray
sudo lvdisplay

## and create a filesystem on the new logical volume 
sudo mkfs.ext4 /dev/VolGroupArray/archive

## mount the new filesystem
## and move files from the mount-point of /dev/md1 to /dev/VolGroupArray/archive
## then unmount /dev/md1

## Add the last RAID array to the volume group
sudo pvcreate /dev/md1
sudo vgextend VolGroupArray /dev/md1

## Update the logical volume to use all available space 
sudo lvresize -l +100%FREE /dev/VolGroupArray/archive
## And resize the filesystem -- rather slow, maybe faster to unmount it first...
sudo resize2fs /dev/VolGroupArray/archive

## Finally, get blkid and update /etc/fstab with UUID and mount options (here, just noatime)
sudo blkid
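
The fstab step is only described in a comment above, so here is a minimal sketch of what the entry might look like. The helper name fstab_entry and the mount point /mnt/archive are hypothetical placeholders, and the trailing "0 2" (no dump, fsck after the root filesystem) is an assumption rather than something taken from the original setup.

```shell
# Sketch: print an /etc/fstab line for an ext4 filesystem mounted with noatime.
# Arguments: the filesystem UUID and the mount point (both supplied by the caller).
fstab_entry() {
    printf 'UUID=%s %s ext4 noatime 0 2\n' "$1" "$2"
}

# Usage (blkid needs root to read the device; review the line before appending it):
#   uuid="$(sudo blkid -s UUID -o value /dev/VolGroupArray/archive)"
#   fstab_entry "$uuid" /mnt/archive | sudo tee -a /etc/fstab
```

Running sudo mount -a afterwards is a cheap way to confirm the new entry parses before the next reboot.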

I probably should have made backups before I did this, but everything went smoothly...
Also, I discovered this Python tool to do conversions in place. Again, this appears non-destructive, but backups never hurt. Also of interest for a file server is Smartmontools to monitor for hardware/disk failures: a nice review is here.

[REFS]
* http://home.gagme.com/greg/linux/raid-lvm.php
* https://wiki.archlinux.org/index.php/Software_RAID_and_LVM
* http://webworxshop.com/2009/10/10/online-filesystem-resizing-with-lvm

07 June 2013

Symmetric set differences in R

My .Rprofile contains a collection of convenience functions and function abbreviations. These are either functions I use dozens of times a day and prefer not to type in full:

## my abbreviation of head()
h <- function(x, n=10) head(x, n)
## and summary()
ss <- summary

Or problems that I'd rather figure out once, and only once:

## example:
## between( 1:10, 5.5, 6.5 )
between <- function(x, low, high, ineq=FALSE) {
    ## like SQL BETWEEN, return logical index;
    ## endpoints are excluded unless ineq=TRUE
    if (ineq) {
        x >= low & x <= high
    } else {
        x > low & x < high
    }
}

One of these "problems" that's been rattling around in my head is the fact that setdiff(x, y) is asymmetric, and has no option to change this. With some regularity, I want to know if two sets are equal, and if not, which elements differ. setequal(x, y) gives me a boolean answer to the first question. It would *seem* that setdiff(x, y) would identify those elements. However, I find the following result rather counterintuitive:

> setdiff(1:5, 1:6) 
integer(0)

setdiff(x, y) only returns the elements of x that are missing from y, so the extra 6 in the second set goes unreported. I dislike having to type both setdiff(x, y) and setdiff(y, x) to identify the differing elements, as well as having to remember which is the reference set (here, the second argument, which I find counterintuitive). With this in mind, here's a snappy little function that returns the symmetric set difference:

symdiff <- function(x, y) { setdiff(union(x, y), intersect(x, y)) }
> symdiff(1:5, 1:6) == symdiff(1:6, 1:5)
[1] TRUE

Tada! A new function for my .Rprofile!