package-cache

Available in a git repository.
Repository: package-cache
Browsable repository: package-cache
Author: W. Trevor King

I've been building a lot of Docker containers recently, using my Dockerfile framework. Building Gentoo containers from a seed stage3 is nice, but debugging iterations are a bit slow when you have to fetch all the source from the distant mirrors. To work around this problem, I've written package-cache, which handles on-demand caching for content from the Gentoo mirrors. Now you can set up a local distfiles cache as easily as you can already set up an rsync mirror for the Portage tree. I've added a net-proxy/package-cache package to my wtk overlay, and the README (on PyPI) has instructions on setting this up locally. There are also some Docker-specific notes in my dockerfile repository.
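
To make "on-demand caching" concrete, here is a minimal Python sketch of the idea. It is not the package-cache implementation: the cached_fetch helper and the fetch_upstream callable (a stand-in for a request to a remote mirror) are hypothetical names made up for this illustration.

```python
import os
import tempfile


def cached_fetch(name, cache_dir, fetch_upstream):
    """Return the bytes for `name`, contacting upstream only on a cache miss.

    `fetch_upstream` is a hypothetical callable (name -> bytes) standing in
    for a download from a remote Gentoo mirror.
    """
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        # Cache hit: serve the local copy without touching the mirror.
        with open(path, "rb") as f:
            return f.read()
    # Cache miss: fetch once, then store the result for later requests.
    data = fetch_upstream(name)
    with open(path, "wb") as f:
        f.write(data)
    return data


# Usage with a fake mirror that records how often it is consulted.
requests = []


def fake_mirror(name):
    requests.append(name)
    return b"tarball contents"


with tempfile.TemporaryDirectory() as cache:
    cached_fetch("foo-1.0.tar.gz", cache, fake_mirror)
    cached_fetch("foo-1.0.tar.gz", cache, fake_mirror)
print(len(requests))  # number of times the fake mirror was asked
```

The second request is served from the local directory, which is what makes repeated container builds faster.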

Bower

Bower is a package manager for the user-facing website content (JavaScript, CSS, media, …). It's less JavaScript focused than Jam, so it can't compile optimized JavaScript. That's not a big loss though, because you can always use the RequireJS optimizer directly.

Following the docs (with my already-configured ~/.local prefix):

$ npm install -g bower

Install some stuff in your project:

$ bower install bootstrap d3 requirejs

This automatically installs dependencies like jQuery, as you'd expect from a package manager:

$ bower list
bower check-new     Checking for new versions of the project dependencies..
tmp /tmp
├─┬ bootstrap#3.0.0 extraneous
│ └── jquery#2.0.3
├── d3#3.3.4 extraneous
└── requirejs#2.1.8 extraneous

The package tree is flat:

$ ls bower_components/
bootstrap  d3  jquery  requirejs

Unlike Jam, you get a whole host of stuff along with the JavaScript:

$ tree bower_components/bootstrap
bower_components/bootstrap/
├── …
├── LICENSE
├── README.md
├── …
├── assets
│   ├── css
│   │   └── …
│   ├── ico
│   │   └── …
│   └── js
│       └── …
├── …
├── dist
│   ├── css
│   │   └── …
│   ├── fonts
│   │   └── …
│   └── js
│       ├── bootstrap.js
│       └── bootstrap.min.js
├── examples
│   ├── carousel
│   │   ├── carousel.css
│   │   └── index.html
│   └── …
├── fonts
│   └── …
├── getting-started.html
├── index.html
├── javascript.html
├── js
│   └── …
├── less
│   └── …
└── package.json

That's a lot of stuff! Define your dependencies for reuse with a bower.json file:

{
  "name":  "myproject",
  "version": "0.0.1",
  "dependencies": {
    "bootstrap": "3.0.0",
    "d3": "3.3.4",
    "requirejs": "latest"
  }
}

Now that you've listed the dependencies, Bower no longer thinks they are “extraneous”:

$ bower list
bower check-new     Checking for new versions of the project dependencies..
myproject#0.0.1 /tmp
├─┬ bootstrap#3.0.0
│ └── jquery#2.0.3
├── d3#3.3.4
└── requirejs#2.1.8
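
Here is a small Python sketch of one way to think about the "extraneous" label. It is an assumption chosen to reproduce the two listings above, not Bower's actual algorithm: a top-level installed package (one that no other installed package depends on) is flagged unless bower.json declares it. The find_extraneous name is made up for this example.

```python
def find_extraneous(declared, installed):
    """Return top-level installed packages that bower.json does not declare.

    declared:  iterable of package names listed under "dependencies"
    installed: dict mapping each installed package to the names it depends on
    """
    depended_on = {dep for deps in installed.values() for dep in deps}
    top_level = set(installed) - depended_on
    return sorted(top_level - set(declared))


# Dependency data transcribed from the `bower list` output above.
installed = {
    "bootstrap": ["jquery"],
    "jquery": [],
    "d3": [],
    "requirejs": [],
}

print(find_extraneous([], installed))
print(find_extraneous(["bootstrap", "d3", "requirejs"], installed))
```

With no declared dependencies the three top-level packages are reported, and once they are declared nothing is.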

Salt Stack

Salt is a remote execution and automated deployment system. It's great for running your own clusters once your website outgrows a single box. If you get bored of running your own boxes, you can use salt-cloud to provision minions on someone else's cloud (Amazon EC2, Linode, …). You can install Salt on Gentoo with:

# USE=git emerge -av app-admin/salt

Usually you'll have one master, and a host of minions running salt daemons that locally execute commands sent from the master. After setting up BIND so salt (the default master name) resolves to your development box, you should be able to run:

# rc-service salt-master start
# rc-service salt-minion restart
# salt-key -L
Accepted Keys:
Unaccepted Keys:
devbox.example.net
Rejected Keys:
# salt-key -A
The following keys are going to be accepted:
Unaccepted Keys:
devbox.example.net
Proceed? [n/Y] y
Key for minion devbox.example.net accepted.

If you were not confined to the local box, it would be wise to compare the proposed key:

# salt-key -p devbox.example.net

with that on the minion itself:

# cat /etc/salt/pki/minion/minion.pub

before accepting the key.
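
If you'd rather not compare the two keys by eye, a few lines of Python can do it. This sketch assumes you have already copied the PEM text out of each command's output; the same_public_key helper is a made-up name, not part of Salt.

```python
def same_public_key(master_copy, minion_copy):
    """Compare two PEM-encoded public keys, ignoring stray whitespace."""

    def normalize(text):
        # Drop leading/trailing blank space on the block and on each line.
        return "\n".join(line.strip() for line in text.strip().splitlines())

    return normalize(master_copy) == normalize(minion_copy)


# Placeholder key material, not a real key.
key = "-----BEGIN PUBLIC KEY-----\nAAAA\n-----END PUBLIC KEY-----"
print(same_public_key(key, "  " + key + "\n"))
```

Only accept the key when the comparison succeeds.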

Once you have accepted the minion, ping it:

# salt '*' test.ping
devbox.example.net:
    True

Then you can browse through all of the available goodies:

# salt '*' sys.doc

Once you've had some fun reading about those, it's time to configure your state tree. For a quick intro, we can just borrow the salt-states example repository. Salt state data is conventionally kept in /srv/salt, which seemed odd to me, but does indeed follow the FHS.

# mkdir /srv/
$ git clone git://github.com/saltstack/salt-states.git
# mv salt-states /srv/salt

This leaves /srv/salt owned by my personal user (instead of root), because as much as I love Git, I'm not going to run it as root.

Once you've got a state tree in /srv/salt, you can mock-install the configured state for each node. It's always a good idea to test your commands before you run them, to make sure they won't do something wonky.

# salt '*' state.highstate test=True
devbox.example.net:
----------
    State: - file
    Name:      /etc/hosts
    Function:  comment
        Result:    None
        Comment:   File /etc/hosts is set to be updated
        Changes:
----------
    State: - file
    Name:      /etc/hosts
    Function:  uncomment
        Result:    True
        Comment:   Pattern already uncommented
        Changes:
----------
    State: - cmd
    Name:      date > /tmp/date
    Function:  run
        Result:    None
        Comment:   Command "date > /tmp/date" would have been executed
        Changes:
----------
…

You can also install a particular sub-state on a particular minion (again, I'm showing the testing version):

# salt 'devbox.example.net' state.sls python,ssh.server test=True
devbox.example.net:
----------
    State: - pkg
    Name:      openssh
    Function:  installed
        Result:    False
        Comment:   Package category missing for "openssh" (possible matches: net-misc/openssh).
        Changes:   
----------
    State: - pkg
    Name:      python-mako
    Function:  installed
        Result:    False
        Comment:   Package category missing for "python-mako" and no match found in portage tree.
        Changes:   

----------
…

The comments (Package category missing for…) mean that the salt-states repository hasn't been updated to recent versions of Salt (0.12+), which require fully qualified package names.

You can also install a particular ID declaration on a particular minion (again, I'm showing the testing version):

# salt 'devbox.example.net' state.sls python,ssh.server test=True

For single-box testing, you can also skip the master node, running commands on a masterless minion by using salt-call --local instead of salt '<target>' in your Salt invocations:

# salt-call --local state.highstate test=True
local:
----------
    State: - file
    Name:      /etc/hosts
    Function:  comment
        Result:    None
        Comment:   File /etc/hosts is set to be updated
        Changes:   
----------
…

Because you don't have a master passing you state, --local calls require you to have the state stored on your local box (in /srv/salt by default). It's hard to imagine using Salt without storing state anywhere ;).

It's also possible to run Salt as a non-root user, but I haven't looked into that yet.

Docker

Docker is a tool for managing Linux containers (LXC). LXC allows you to provision multiple, isolated process-spaces on the same kernel, which is useful for creating disposable development/staging/production environments. Docker provides a sleek frontend for container management. Like Salt Stack, it uses a daemonized manager with a command-line client.

Installation

To avoid the aufs3 dependency, I wanted Docker v0.7+. The main portage tree is currently at Docker v0.6.6, so I installed the docker overlay:

# layman -a docker

I also wanted Go v1.2+ to avoid a “slice bounds out of range” on /usr/lib/go/src/pkg/archive/tar/writer.go:233 (fixed upstream in 0c7e4c4, archive/tar: Fix support for long links and improve PAX support, 2013-08-18. Thanks to Jonathan Stoppani for help figuring this out). That version is not in the main Portage tree yet, but it is in the OSSDL overlay:

# layman -a OSSDL

After that, installing Docker on Gentoo is the usual:

# emerge -av app-emulation/docker

After reading through a few docs about IP forwarding, I determined that my internal development box could safely enable IP forwarding, which containers use to connect to the outside network. I enabled it for subsequent reboots:

# echo net.ipv4.ip_forward = 1 > /etc/sysctl.d/docker.conf

I also had to enable the following additional kernel options (as suggested by app-emulation/lxc): CONFIG_CGROUP_DEVICE, CONFIG_USER_NS, CONFIG_DEVPTS_MULTIPLE_INSTANCES, CONFIG_VETH, and CONFIG_MACVLAN. There's also the convenient lxc-checkconfig script distributed with app-emulation/lxc, which pointed out the need for CONFIG_VLAN_8021Q and CONFIG_CGROUP_MEM_RES_CTLR. The latter was renamed to CONFIG_MEMCG in c255a45 (memcg: rename config variables, 2012-07-31), released in Linux v3.6; lxc-checkconfig was updated in c93c7b1 (Fix checkconfig to handle kernel memory cgroup name change, 2012-11-14), released in LXC v0.9.0. On top of those, app-emulation/docker recommended CONFIG_BRIDGE, CONFIG_NETFILTER_XT_MATCH_ADDRTYPE, CONFIG_NF_NAT, CONFIG_NF_NAT_NEEDED, CONFIG_IP_NF_TARGET_MASQUERADE (since 045eb9f (another necessary kernel flag, 2013-12-09), in a fast response to my comment), and CONFIG_DM_THIN_PROVISIONING. These are the new docker-supporting lines in my .config for Linux v3.10:

CONFIG_CGROUP_DEVICE=y
CONFIG_MEMCG=y
CONFIG_USER_NS=y
CONFIG_UIDGID_STRICT_TYPE_CHECKS=y
CONFIG_MM_OWNER=y
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=y
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_SIP=m
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
CONFIG_NF_NAT_IPV4=m
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_NF_NAT_IPV6=m
CONFIG_IP6_NF_TARGET_MASQUERADE=m
CONFIG_STP=m
CONFIG_BRIDGE=m
CONFIG_BRIDGE_IGMP_SNOOPING=y
CONFIG_VLAN_8021Q=m
CONFIG_LLC=m
CONFIG_DM_BUFIO=m
CONFIG_DM_BIO_PRISON=m
CONFIG_DM_PERSISTENT_DATA=m
CONFIG_DM_THIN_PROVISIONING=m
CONFIG_MACVLAN=m
CONFIG_VETH=m
CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
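
For a quick check along the lines of lxc-checkconfig, something like the following Python sketch works on the text of a kernel .config. The missing_options helper and the short required list are illustrative choices of mine, and the sample text is invented for the demonstration.

```python
def missing_options(config_text, required):
    """Return the required CONFIG_* names that are not set to y or m."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Skips blank lines and "# CONFIG_FOO is not set" comments.
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return [name for name in required if name not in enabled]


# A few of the options named above; extend the list as needed.
required = ["CONFIG_MEMCG", "CONFIG_VETH", "CONFIG_BRIDGE", "CONFIG_MACVLAN"]

# Invented sample text; in practice, read your real .config instead.
sample = """\
CONFIG_MEMCG=y
CONFIG_VETH=m
# CONFIG_BRIDGE is not set
"""
print(missing_options(sample, required))
```

On a running kernel built with /proc/config.gz support, you could feed it the decompressed contents of that file instead.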

After that, I rebuilt my kernel, rebooted into the new kernel, and started the daemon with:

# /etc/init.d/docker start

Command line usage

Docker does a good job with interactive help:

$ docker
Usage: Docker [OPTIONS] COMMAND [arg...]
-H="127.0.0.1:4243": Host:port to bind/connect to

A self-sufficient runtime for linux containers.

Commands:

attach    Attach to a running container
build     Build a container from a Dockerfile
commit    Create a new image from a container's changes
…

Docker images are archived for easy reuse, so it's likely someone has the image you need (or a good start) already built. Docker's images consist of a number of layers on top of the base image:

$ docker pull learn/tutorial
Pulling repository learn/tutorial from https://index.docker.io/v1
Pulling image 8dbd9e392a964056420e5d58ca5cc376ef18e2de93b5cc90e868a1bbc8318c1c (precise) from ubuntu
Pulling image b750fe79269d2ec9a3c593ef05b4332b1d1a02a62b4accb2c21d589ff2f5f2dc (12.10) from ubuntu
Pulling image 27cf784147099545 () from tutorial

The image has everything needed by a process (filesystem, system libraries, …), so you spin up the container by specifying an image and target command:

$ docker run learn/tutorial echo hello world
hello world

Changes made during a run are preserved in the container that run created:

$ docker run learn/tutorial apt-get install -y ping

You can checkpoint changes by committing, which adds a new layer:

$ docker ps -l
ID            IMAGE         COMMAND               CREATED       STATUS  PORTS
6982a9948422  ubuntu:12.04  apt-get install ping  1 minute ago  Exit 0
$ docker commit 6982a99 learn/ping
effb66b31edb
$ docker run learn/ping ping www.google.com
64 bytes from nuq05s02-in-f20.1e100.net (74.125.239.148): icmp_req=1 ttl=55 time=2.23 ms
64 bytes from nuq05s02-in-f20.1e100.net (74.125.239.148): icmp_req=2 ttl=55 time=2.30 ms
^C

You can list running containers:

$ docker ps
ID            IMAGE              COMMAND              CREATED         STATUS         PORTS
efefdc74a1d5  learn/ping:latest  ping www.google.com  37 seconds ago  Up 36 seconds
$ docker inspect efefdc7
[2013/07/30 01:52:26 GET /v1.3/containers/efef/json
{
  "ID": "efefdc74a1d5900d7d7a74740e5261c09f5f42b6dae58ded6a1fde1cde7f4ac5",
  "Created": "2013-07-30T00:54:12.417119736Z",
  "Path": "ping",
  "Args": [
      "www.google.com"
  ],
  "Config": {
      "Hostname": "efefdc74a1d5",
      "User": "",
      "Memory": 0,
      "MemorySwap": 0,
      "CpuShares": 0,
      "AttachStdin": false,
      "AttachStdout": true,
      "AttachStderr": true,
      "PortSpecs": null,
      "Tty": false,
      "OpenStdin": false,
      "StdinOnce": false,
      "Env": null,
      "Cmd": [
          "ping",
          "www.google.com"
      ],
      "Dns": null,
      "Image": "learn/ping",
      "Volumes": null,
      "VolumesFrom": "",
      "Entrypoint": null
  },
  "State": {
      "Running": true,
      "Pid": 22249,
      "ExitCode": 0,
      "StartedAt": "2013-07-30T00:54:12.424817715Z",
      "Ghost": false
  },
  "Image": "a1dbb48ce764c6651f5af98b46ed052a5f751233d731b645a6c57f91a4cb7158",
  "NetworkSettings": {
      "IPAddress": "172.16.42.6",
      "IPPrefixLen": 24,
      "Gateway": "172.16.42.1",
      "Bridge": "docker0",
      "PortMapping": {
          "Tcp": {},
          "Udp": {}
      }
  },
  "SysInitPath": "/usr/bin/docker",
  "ResolvConfPath": "/etc/resolv.conf",
  "Volumes": {},
  "VolumesRW": {}
}
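
If you want to pull a single value out of that output from a script, here is a minimal Python sketch. It assumes output shaped like the listing above (a possible log line followed by one JSON object with a NetworkSettings key); the container_address helper is a made-up name and the sample is trimmed down from the listing.

```python
import json


def container_address(inspect_output):
    """Extract (IP address, prefix length) from `docker inspect` text."""
    # Skip any log line printed before the JSON object starts.
    start = inspect_output.index("{")
    info = json.loads(inspect_output[start:])
    network = info["NetworkSettings"]
    return network["IPAddress"], network["IPPrefixLen"]


# Trimmed-down copy of the output shown above.
sample = """[2013/07/30 01:52:26 GET /v1.3/containers/efef/json
{
  "ID": "efefdc74a1d5",
  "NetworkSettings": {
      "IPAddress": "172.16.42.6",
      "IPPrefixLen": 24
  }
}
"""
print(container_address(sample))
```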

And you can push your local images to the repository:

$ docker images
ubuntu          latest  8dbd9e392a96  4 months ago    131.5 MB (virtual 131.5 MB)
learn/tutorial  latest  8dbd9e392a96  2 months ago    131.5 MB (virtual 131.5 MB)
learn/ping      latest  effb66b31edb  10 minutes ago  11.57 MB (virtual 143.1 MB)
$ docker push learn/ping

You can also run interactive shells:

$ docker pull ubuntu
$ docker run -i -t ubuntu /bin/bash

You can also set environment variables, which are useful for customizing generic images:

$ docker run -e HOST=example.net -e PORT=1234 ubuntu
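
On the receiving end, the process inside the container just reads its environment. A minimal Python sketch, where the read_settings helper and its default values are assumptions for illustration:

```python
import os


def read_settings(environ=os.environ):
    """Read HOST and PORT, falling back to defaults when they are unset."""
    # The defaults here are arbitrary placeholders.
    host = environ.get("HOST", "localhost")
    port = int(environ.get("PORT", "8080"))
    return host, port


# Simulate the environment set by the `docker run -e …` call above.
print(read_settings({"HOST": "example.net", "PORT": "1234"}))
```

Keeping the defaults in the image and the overrides on the command line is what lets one generic image serve several deployments.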

Dockerfiles

Instead of building containers manually, you can also specify them with a Dockerfile (spec):

$ cat gentoo-portage/Dockerfile
FROM tianon/gentoo
MAINTAINER W. Trevor King, wking@tremily.us
RUN echo 'GENTOO_MIRRORS="http://mirror.mcs.anl.gov/pub/gentoo/"' >> /etc/portage/make.conf
#RUN echo 'SYNC="rsync://rsync.us.gentoo.org"' >> /etc/portage/make.conf
RUN mkdir -p /usr/portage
RUN emerge-webrsync
RUN emerge --sync --quiet
RUN eselect news read new
$ docker build -t wking/gentoo-portage gentoo-portage
…
$ cat gentoo-portage-en-us/Dockerfile
FROM wking/gentoo-portage
MAINTAINER W. Trevor King, wking@tremily.us
RUN echo en_US ISO-8859-1 > /etc/locale.gen
RUN echo en_US.UTF-8 UTF-8 >> /etc/locale.gen
RUN locale-gen
RUN echo 'LANG="en_US.UTF-8"' >> /etc/env.d/02locale
RUN env-update
$ docker build -t wking/gentoo-portage-en-us gentoo-portage-en-us
$ cat gentoo-portage-en-us-syslog/Dockerfile
FROM wking/gentoo-portage-en-us
MAINTAINER W. Trevor King, wking@tremily.us
RUN emerge -v sys-process/vixie-cron app-admin/syslog-ng app-admin/logrotate
RUN rc-update add syslog-ng default
RUN rc-update add vixie-cron default
$ docker build -t wking/gentoo-portage-en-us-syslog gentoo-portage-en-us-syslog

You can't currently set the description (returned by docker search …) from the Dockerfile, although there are some proposals to add this feature.

You can mount volumes from the Dockerfile using VOLUME, but it doesn't support host mounts. For those you have to use the -v option when you invoke docker run …:

$ docker run -i -t -v /usr/portage:/usr/portage:ro -v /usr/portage/distfiles:/usr/portage/distfiles:rw wking/gentoo-portage /bin/bash

Bridging

Docker sets up a docker0 bridge between the host's network and the containers. Docker tries to guess an IP range that does not conflict with your local network, but it's not omniscient. Until 1558 is fixed, your best bet is to set up your own bridge. If you already have a docker0 bridge:

$ ip addr show dev docker0
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN 
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet 10.0.42.1/16 scope global docker0
       valid_lft forever preferred_lft forever
…

then you can just change its address:

# ip addr del 10.0.42.1/16 dev docker0
# ip addr add 172.31.42.1/16 dev docker0
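
Before settling on a replacement address, you can check a candidate against the networks you already route. This Python sketch uses the standard ipaddress module; the conflicts helper and the local network list are hypothetical, so substitute whatever your own routing table shows.

```python
import ipaddress


def conflicts(candidate, local_networks):
    """Return the local networks that overlap the candidate bridge network."""
    # "10.0.42.1/16" is an address on a network; take the network part.
    bridge = ipaddress.ip_interface(candidate).network
    return [
        net for net in local_networks
        if bridge.overlaps(ipaddress.ip_network(net))
    ]


# Hypothetical local networks for the demonstration.
local = ["10.0.0.0/8", "192.168.1.0/24"]
print(conflicts("10.0.42.1/16", local))    # old bridge address
print(conflicts("172.31.42.1/16", local))  # replacement address
```

An empty list means the candidate range is free as far as the listed networks are concerned.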

If the docker0 bridge doesn't already exist, you'll have to create it yourself:

# brctl addbr docker0
# ip addr add 172.31.0.1/16 dev docker0
# ip link set dev docker0 up

If you want to start over from scratch, you can stop docker and remove the bridge:

# /etc/init.d/docker stop
# ip link set dev docker0 down
# brctl delbr docker0

Linking

You can link containers by name:

$ docker run -d -name redis crosbymichael/redis
$ sudo docker run -link redis:db -d -name webapp me/someapp

This links webapp to redis using the alias db. DB_PORT (and similar) environment variables are set in the webapp container, which it can use to connect to the redis container. You can also use ambassadors to link containers on separate hosts.
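
Inside the webapp container, connecting comes down to parsing that variable. The Python sketch below assumes DB_PORT holds a URL of the form tcp://host:port; the link_endpoint helper and the example address are made up for illustration.

```python
from urllib.parse import urlsplit


def link_endpoint(environ, alias):
    """Return (host, port) for a linked container, given its link alias."""
    url = urlsplit(environ[alias.upper() + "_PORT"])
    return url.hostname, url.port


# Example value only; the real address is assigned by Docker.
example_env = {"DB_PORT": "tcp://172.17.0.5:6379"}
print(link_endpoint(example_env, "db"))
```

In a real container you would pass os.environ instead of the example dictionary.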

eCryptfs

eCryptfs is an encrypted filesystem for Linux. You'll need to have a kernel with the ECRYPT_FS module configured to use eCryptfs. Once you have the kernel set up, install the userspace tools (sys-fs/ecryptfs-utils on Gentoo, where you may want to enable the suid USE flag to allow non-root users to mount their private directories).

$ zcat /proc/config.gz | grep ECRYPT_FS
CONFIG_ECRYPT_FS=m
# echo 'sys-fs/ecryptfs-utils suid' >> /etc/portage/package.use/ecryptfs
# echo 'sys-fs/ecryptfs-utils ~amd64' >> /etc/portage/package.accept_keywords/ecryptfs
# emerge -av sys-fs/ecryptfs-utils
# modprobe ecryptfs

eCryptfs is usually used to maintain encrypted home directories, which you can set up with ecryptfs-setup-private. I used --noautomount because I'm not using the PAM module for automounting. Other than that, just follow the instructions. This sets up a directory with encrypted data in ~/.Private, which you mount with ecryptfs-mount-private. Mounting exposes the decrypted filesystem under ~/Private, which you should use for all of your secret stuff. If you don't like the ~/Private path, you can tweak ~/.ecryptfs/Private.mnt as you see fit.

$ ecryptfs-setup-private --noautomount
$ ecryptfs-mount-private
$ mkdir ~/Private/my-secret-stuff

To encrypt stuff that is bound to a specific path (e.g. ~/.mozilla), you can move the source into ~/Private and add symlinks from the canonical location to the encrypted location:

$ mv ~/.mozilla ~/Private/mozilla
$ ln -s ~/Private/mozilla ~/.mozilla
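
If you have several directories to relocate, the same move-and-symlink step is easy to script. A minimal Python sketch, with move_into_private being a hypothetical helper of my own and the demonstration running in a scratch directory rather than your real home:

```python
import os
import shutil
import tempfile


def move_into_private(source, private_dir, name):
    """Move `source` under `private_dir` and leave a symlink in its place."""
    target = os.path.join(private_dir, name)
    if os.path.lexists(target):
        # Refuse to clobber something already stored in the private tree.
        raise FileExistsError(target)
    shutil.move(source, target)
    os.symlink(target, source)
    return target


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as home:
    private = os.path.join(home, "Private")
    dotdir = os.path.join(home, ".mozilla")
    os.mkdir(private)
    os.mkdir(dotdir)
    move_into_private(dotdir, private, "mozilla")
    print(os.path.islink(dotdir))
```

Make sure the private directory is mounted before running anything like this, or the data will land in the unencrypted mount point.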

Encrypting arbitrary directories

You can also encrypt arbitrary directories using mount. This is useful if you have private information in a PostgreSQL database.

# /etc/init.d/postgresql-9.2 stop
# mv /var/lib/postgresql{,-plain}
# mkdir /var/lib/{.,}postgresql
# chown postgres:postgres /var/lib/{.,}postgresql
# mount -t ecryptfs /var/lib/{.,}postgresql
Passphrase: 
Select cipher: 
…
Would you like to proceed with the mount (yes/no)? : yes
Would you like to append sig [REDACTED] to
[/root/.ecryptfs/sig-cache.txt] 
in order to avoid this warning in the future (yes/no)? : yes
Successfully appended new sig to user sig cache file
Mounted eCryptfs
# mv /var/lib/postgresql{-plain/*,/}
# rmdir /var/lib/postgresql-plain
# /etc/init.d/postgresql-9.2 start

You can also specify mount options explicitly instead of entering them interactively. To figure out the proper incantation, look at the mtab entry after an interactive mount:

$ grep postgresql /etc/mtab
/var/lib/.postgresql /var/lib/postgresql ecryptfs rw,ecryptfs_sig=REDACTED,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_fnek_sig=REDACTED,ecryptfs_unlink_sigs 0 0

You should also look over the mount helper options in ecryptfs(7). Then run future mounts with:

# mount -t ecryptfs -o rw,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_enable_filename_crypto=y,ecryptfs_passthrough=n,ecryptfs_sig=REDACTED,ecryptfs_fnek_sig=REDACTED,ecryptfs_unlink_sigs /var/lib/{.,}postgresql

You can also add a line like:

/var/lib/.postgresql /var/lib/postgresql ecryptfs rw,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_enable_filename_crypto=y,ecryptfs_passthrough=n,ecryptfs_sig=REDACTED,ecryptfs_fnek_sig=REDACTED,ecryptfs_unlink_sigs,key=passphrase:passphrase_passwd_file=/home/wking/Private/ecryptfs/postgresql,noauto 0 0

to your /etc/fstab. With a passphrase file containing:

passphrase_passwd=[passphrase]

Add the user option to allow non-root mounts (see “The non-superuser mounts” section in mount(8)). Once you've set up your fstab, you can mount the directory more intuitively with:

# mount /var/lib/postgresql

Jam

Jam is a package manager for front-end JavaScript. While you want to use npm for server-side stuff, Jam is the tool to use for JavaScript that you'll be sending to your users. Following the docs (with my already-configured ~/.local prefix):

$ npm install -g jamjs

Integrating with Django is a bit tricky, especially since Jam doesn't manage the CSS, images, … that are often associated with JavaScript libraries. If you need that, you probably want to look at Bower instead.

Write a package.json defining your project's dependencies. npm dependencies go in the top-level dependencies object, and Jam dependencies go in jam.dependencies:

{
  "name": "myproject",
  "version": "0.0.1",
  "description": "An example project",
  "dependencies": {
    "async": "0.1.22"
  },
  "jam": {
    "packageDir": "apps/my-app/static/js/",
    "baseUrl": "apps/beehive_common",
    "dependencies": {
      "jquery": "1.7.1",
      "underscore": null
    }
  }
}

Install your dependencies with:

$ jam install

Upgrade with:

$ jam upgrade

Compile just the bits you use into a single require-able replacement.

$ jam compile apps/my-app/static/js/require.js

This last bit is really cool, and where a less JavaScript-oriented tool like Bower falls short. Jam is using the RequireJS optimizer under the hood for the task, so if you don't use Jam you can always run the optimizer directly.

Comcast

A while back I posted about Comcast blocking outgoing traffic on port 25. We've spent some time with Verizon's DSL service, but after our recent move we're back with Comcast. Luckily, Comcast now explicitly lists the ports they block. Nothing I care about, except for port 25 (incoming and outgoing). For incoming mail, I use Dyn to forward mail to port 587. For outgoing mail, I had been using stunnel through outgoing.verizon.net for my SMTP connections. Comcast takes a similar approach, forcing outgoing mail through port 465 on smtp.comcast.net.

Node

Node is a server-side JavaScript engine (i.e. it executes JavaScript without using a browser). This means that JavaScript developers can now develop tools in their native language, so it's not a surprise that the Bootstrap folks use Grunt for their build system. I'm new to the whole Node ecosystem, so here are my notes on how it works.

Start off by installing npm, the Node package manager. On Gentoo, that's:

# USE=npm emerge -av net-libs/nodejs

Configure npm to make "global" installs in your personal space:

$ npm config set prefix ~/.local/

Install the Grunt command line interface for building Bootstrap:

$ npm install -g grunt-cli

That installs the libraries under ~/.local/lib/node_modules and drops symlinks to binaries in ~/.local/bin (which is already in my PATH thanks to my dotfiles).

Clone Bootstrap and install its dependencies:

$ git clone git://github.com/twbs/bootstrap.git
$ cd bootstrap
$ npm install

This looks in the local package.json to extract a list of dependencies, and installs each of them under node_modules. Node likes to isolate its packages, so every dependency for a given package is installed underneath that package. This leads to some crazy nesting:

$ find node_modules/ -name graceful-fs
node_modules/grunt/node_modules/glob/node_modules/graceful-fs
node_modules/grunt/node_modules/rimraf/node_modules/graceful-fs
node_modules/grunt-contrib-clean/node_modules/rimraf/node_modules/graceful-fs
node_modules/grunt-contrib-qunit/node_modules/grunt-lib-phantomjs/node_modules/phantomjs/node_modules/rimraf/node_modules/graceful-fs
node_modules/grunt-contrib-watch/node_modules/gaze/node_modules/globule/node_modules/glob/node_modules/graceful-fs

Sometimes the redundancy is due to different version requirements, but sometimes the redundancy is just redundant :p. Let's look with npm ls.

$ npm ls graceful-fs
bootstrap@3.0.0 /home/wking/src/bootstrap
├─┬ grunt@0.4.1
│ ├─┬ glob@3.1.21
│ │ └── graceful-fs@1.2.3
│ └─┬ rimraf@2.0.3
│   └── graceful-fs@1.1.14
├─┬ grunt-contrib-clean@0.5.0
│ └─┬ rimraf@2.2.2
│   └── graceful-fs@2.0.1
├─┬ grunt-contrib-qunit@0.2.2
│ └─┬ grunt-lib-phantomjs@0.3.1
│   └─┬ phantomjs@1.9.2-1
│     └─┬ rimraf@2.0.3
│       └── graceful-fs@1.1.14
└─┬ grunt-contrib-watch@0.5.3
  └─┬ gaze@0.4.1
    └─┬ globule@0.1.0
      └─┬ glob@3.1.21
        └── graceful-fs@1.2.3
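
To separate the "different version" cases from the "just redundant" ones, you can group the copies by version. A small Python sketch, with the data transcribed by hand from the listing above and shared_versions being a made-up helper:

```python
from collections import defaultdict


def shared_versions(copies):
    """Group install locations by version, keeping versions installed twice or more."""
    by_version = defaultdict(list)
    for location, version in copies:
        by_version[version].append(location)
    return {v: locs for v, locs in by_version.items() if len(locs) > 1}


# (parent chain, graceful-fs version) pairs from the `npm ls` output above.
copies = [
    ("grunt/glob", "1.2.3"),
    ("grunt/rimraf", "1.1.14"),
    ("grunt-contrib-clean/rimraf", "2.0.1"),
    ("grunt-contrib-qunit/grunt-lib-phantomjs/phantomjs/rimraf", "1.1.14"),
    ("grunt-contrib-watch/gaze/globule/glob", "1.2.3"),
]
for version, locations in sorted(shared_versions(copies).items()):
    print(version, len(locations))
```

Versions that show up here are installed more than once for no version-related reason; 2.0.1 is the only copy that is there because of a different requirement.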

Regardless of on-disk duplication, Node caches each module by its resolved path, so a given copy only loads once (separate on-disk copies are still separate modules). If it really bothers you, you can avoid some duplicates by installing duplicated packages higher up in the local tree:

$ rm -rf node_modules
$ npm install graceful-fs@1.1.14
$ npm install
$ npm ls graceful-fs
bootstrap@3.0.0 /home/wking/src/bootstrap
├── graceful-fs@1.1.14  extraneous
├─┬ grunt@0.4.1
│ └─┬ glob@3.1.21
│   └── graceful-fs@1.2.3 
├─┬ grunt-contrib-clean@0.5.0
│ └─┬ rimraf@2.2.2
│   └── graceful-fs@2.0.1 
└─┬ grunt-contrib-watch@0.5.3
  └─┬ gaze@0.4.1
    └─┬ globule@0.1.0
      └─┬ glob@3.1.21
        └── graceful-fs@1.2.3 

This is probably not worth the trouble.

Now that we have Grunt and the Bootstrap dependencies, we can build the distributed libraries:

$ ~/src/node_modules/.bin/grunt dist
Running "clean:dist" (clean) task
Cleaning dist...OK

Running "recess:bootstrap" (recess) task
File "dist/css/bootstrap.css" created.

Running "recess:min" (recess) task
File "dist/css/bootstrap.min.css" created.
Original: 121876 bytes.
Minified: 99741 bytes.

Running "recess:theme" (recess) task
File "dist/css/bootstrap-theme.css" created.

Running "recess:theme_min" (recess) task
File "dist/css/bootstrap-theme.min.css" created.
Original: 18956 bytes.
Minified: 17003 bytes.

Running "copy:fonts" (copy) task
Copied 4 files

Running "concat:bootstrap" (concat) task
File "dist/js/bootstrap.js" created.

Running "uglify:bootstrap" (uglify) task
File "dist/js/bootstrap.min.js" created.
Original: 58543 bytes.
Minified: 27811 bytes.

Done, without errors.

Wohoo!

Unfortunately, like all language-specific packaging systems, npm has trouble installing packages that aren't written in its native language. This means you get things like:

$ ~/src/node_modules/.bin/grunt
…
Running "jekyll:docs" (jekyll) task
`jekyll build` was initiated.

Jekyll output:
Warning: Command failed: /bin/sh: jekyll: command not found
 Use --force to continue.

Aborted due to warnings.

Once everybody wises up and starts writing packages for Gentoo Prefix, we can stop worrying about installation and get back to work developing :p.

Relative submodules

I like Git submodules quite a bit, but they often get a bad rap. Most of the problems involve bad git hygiene (e.g. not developing in feature branches) or limitations in the current submodule implementation (e.g. it's hard to move submodules). Other problems involve not being able to fetch submodules with git:// URLs (due to restrictive firewalls).

This last case is easily solved by using relative submodule URLs in .gitmodules. I've been through the relative-vs.-absolute URL argument a few times now, so I thought I'd write up my position for future reference. I prefer the relative URL in:

[submodule "some-name"]
  path = some/path
  url = ../submod-repo.git

to the absolute URL in:

[submodule "some-name"]
  path = some/path
  url = git://example.net/submod-repo.git

Arguments in favor of relative URLs:

  • Users get submodules over their preferred transport (ssh://, git://, https://, …). Whatever transport you used to clone the superproject will be recycled when you use submodule init to set submodule URLs in your .git/config.
  • No need to tweak .gitmodules if you mirror (or move) your superproject Git hosting somewhere else (e.g. from example.net to elsewhere.com).
  • As a special case of the mirror/move situation, there's no need to tweak .gitmodules in long-term forks. If I setup a local version of the project and host it on my local box, my lab-mates can clone my local superproject and use my local submodules without my having to alter .gitmodules. Reducing trivial differences between forks makes collaboration on substantive changes more likely.
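
To see why the transport and host get recycled, here is a rough Python sketch of how a relative submodule URL is resolved against the URL the superproject was cloned from. It is a simplification of what Git does, assuming URL-style remotes (it does not handle scp-like user@host:path remotes), and resolve_submodule_url is a name I made up.

```python
def resolve_submodule_url(superproject_url, submodule_url):
    """Resolve a ./ or ../ submodule URL against the superproject's URL."""
    if not submodule_url.startswith(("./", "../")):
        return submodule_url  # absolute URLs are used unchanged
    base = superproject_url.rstrip("/")
    rel = submodule_url
    while True:
        if rel.startswith("../"):
            # Each "../" strips the last path component from the base.
            base = base.rsplit("/", 1)[0]
            rel = rel[3:]
        elif rel.startswith("./"):
            rel = rel[2:]
        else:
            break
    return base + "/" + rel


print(resolve_submodule_url("git://example.net/superproject.git",
                            "../submod-repo.git"))
print(resolve_submodule_url("https://elsewhere.com/superproject.git",
                            "../submod-repo.git"))
```

The same .gitmodules entry follows the superproject to whatever host and transport it was cloned from, which is the whole point of the arguments above.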

The only argument I've heard in favor of absolute URLs is Brian Granger's GitHub workflow:

  • If a user forks upstream/repo to username/repo and then clones their fork for local work, relative submodule URLs will not work until they also fork the submodules into username/.

This workflow needs absolute URLs, but relative URLs are fine if you also fork the submodule(s).

Personally, I only create a public repository (username/repo) after cloning the central repository (upstream/repo). Several projects I contribute to (such as Git itself) prefer changes via send-email, in which case there is no need for contributors to create public repositories at all. Relative URLs are also fine here.

Once you understand the trade-offs, picking absolute/relative is just a political/engineering decision. I don't see any benefit to the absolute-URL-only repo relationship, so I favor relative URLs. The IPython folks felt that too many devs already used the absolute-URL-only relationship, and that the relative-URL benefits were not worth the cost of retraining those developers.

Open hardware analog I/O

Over at Software Carpentry, Greg Wilson just posted some thoughts about a hypothetical open science framework. He uses Ruby on Rails and similar web frameworks as examples where frameworks can leverage standards and conventions to take care of most of the boring boilerplate that has to happen for serving a website. Greg points out that it would be useful to have a similar open science framework that small projects could use to get off the ground and collaborate more easily.

My thesis is about developing an open source framework for single molecule force spectroscopy, so this is an avenue I'm very excited about. However, it's difficult to get this working for experimental labs with a diversity of the underlying hardware. If different labs have different hardware, it's hard to write a generic software stack that works for everybody (at least at the lower levels of the stack). Our lab does analog control and acquisition via an old National Instruments card. NI no longer sells this card, and developing Comedi drivers for new cards is too much work for many to take on pro bono. This means that new labs that want to use my software can't get started with off the shelf components; they'll need to find a second-hand card or rework the lower layers of my stack to work with a DAQ card that they can source.

I'd be happy to see an inexpensive, microprocessor-based open hardware project for synchronized, multi-channel, near-MHz analog I/O to serve as a standard interface between software and the real world, but that's not the sort of thing I can whip out over a free weekend (although I have dipped my toe in the water). I think the missing component is a client-side version of libusb, to allow folks to write the firmware for the microprocessor without dealing with the intricacies of the USB specs. It would also be nice to have a standard USB protocol for Comedi commands, so a single driver could interface with commodity DAQ hardware—much like the current situation for mice, keyboards, webcams, and other approved classes. Then the software stack could work unchanged on any hardware, once the firmware supporting the hardware had been ported to a new microprocessor. There are two existing classes (a physical interface device class and a test and measurement class), but I haven't had time to dig through those with an eye toward Comedi integration yet. So much to do, so little time…

