In the grand tradition of publishing my little building-block shell scripts of interest, here goes another one. This is a simple cron job that I run daily on a number of hosts to record storage usage, so that I can track growth over time. (This is in addition to Cacti and Nagios, which already poll some of this data, but for different reasons and at a different granularity.)

The FILES variable should be populated with a whitespace-separated list of files, directories, and block devices to track.

The DB_HOST, DB_USER, and DB_PASS variables should be populated with the credentials for your MySQL server.

The actual script looks something like this:

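#!/bin/bash

# Whitespace-separated list of files, directories, and block devices to track.
FILES='/var/lib/mysql/ibdata1 /var/lib/mysql/db/table.ibd /dev/sda1 /var/log/mysql'

LOCAL=$(hostname -s)
DB_HOST='aaa'
DB_USER='bbb'
DB_PASS='ccc'

# Store one sample: insert LABEL SIZE_KB
function insert {
    local file=$1
    local size=$2
    local query="replace into metrics.storage_usage (ts, host, file, size) values (now(), '$LOCAL', '$file', $size)"
    mysql --host="${DB_HOST}" --user="${DB_USER}" --password="${DB_PASS}" -e "${query}"
}

# $FILES is deliberately unquoted so that it splits on whitespace.
for FILE in $FILES
do
    if [ -d "$FILE" ]; then
        # Directory: total size of the tree, labelled by its full path.
        BASE=$FILE
        SIZE=$(du -ks "$FILE/" | awk '{print $1}')
    elif [ -b "$FILE" ]; then
        # Block device: used kbytes (column 3), labelled by mount point (column 6).
        TMP=$(df -k -P "$FILE" | tail -n1 | awk '{print $3 " " $6}')
        SIZE=$(echo "$TMP" | awk '{print $1}')
        BASE=$(echo "$TMP" | awk '{print $2}')
    else
        # Regular file: disk usage, labelled by file name only.
        BASE=$(basename "$FILE")
        SIZE=$(du -k "$FILE" | awk '{print $1}')
    fi

    # Skip entries whose size could not be read (for example, a missing file).
    if [ -z "$SIZE" ]; then
        echo "could not determine size of $FILE" >&2
        continue
    fi

    echo "$BASE = $SIZE"
    insert "$BASE" "$SIZE"
done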
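
If you want to try the measurement logic without touching the database, the size lookup can be pulled into a function of its own. The following is only a sketch of that idea: measure is a hypothetical name that does not appear in the script above, and it prints the same label and size in kbytes that the loop works out for each entry.

```shell
#!/bin/bash

# measure PATH: print "LABEL SIZE_KB" for a directory, block device, or file.
# Hypothetical helper; it mirrors the three branches of the loop above.
measure() {
    local path=$1 base size tmp
    if [ -d "$path" ]; then
        base=$path
        size=$(du -ks "$path/" | awk '{print $1}')
    elif [ -b "$path" ]; then
        # df -P columns: filesystem, blocks, used, available, capacity, mount point
        tmp=$(df -k -P "$path" | tail -n 1 | awk '{print $3, $6}')
        size=${tmp%% *}
        base=${tmp#* }
    else
        base=$(basename "$path")
        size=$(du -k "$path" | awk '{print $1}')
    fi
    # Refuse to report anything that is not a plain number.
    case $size in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$base $size"
}
```

Running something like measure /var/log/mysql by hand prints the label and the size on one line, and the function returns a non-zero status when no size could be read.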

I am putting my data into a table called “storage_usage” in a database called “metrics”:

CREATE TABLE `storage_usage` (
  `ts` date NOT NULL,
  `host` varchar(25) NOT NULL,
  `file` varchar(64) NOT NULL,
  `size` int(10) unsigned NOT NULL COMMENT 'in kbytes',
  PRIMARY KEY (`ts`,`host`,`file`)
);
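
Since the point of all this is to track growth, here is a sketch of how the day-over-day change might be pulled back out of the table. growth_query is a hypothetical helper that only builds the SQL text. It assumes one row per file per host per day, which the primary key above enforces, and the commented usage line reuses the DB_HOST, DB_USER, and DB_PASS variables from the script.

```shell
#!/bin/bash

# growth_query HOST: print a query that lists, for HOST, each file's change
# in kbytes between one daily snapshot and the previous day's snapshot.
# Hypothetical helper; HOST must be a trusted value because it is placed
# into the SQL text unescaped.
growth_query() {
    local host=$1
    cat <<EOF
SELECT cur.ts, cur.file,
       CAST(cur.size AS SIGNED) - CAST(prev.size AS SIGNED) AS growth_kb
FROM metrics.storage_usage AS cur
JOIN metrics.storage_usage AS prev
  ON prev.host = cur.host
 AND prev.file = cur.file
 AND prev.ts = cur.ts - INTERVAL 1 DAY
WHERE cur.host = '$host'
ORDER BY cur.file, cur.ts
EOF
}

# Usage (not run here):
#   mysql --host="$DB_HOST" --user="$DB_USER" --password="$DB_PASS" \
#         -e "$(growth_query "$(hostname -s)")"
```

The casts to SIGNED are there because size is an unsigned column, and subtracting a larger unsigned value from a smaller one is an error in MySQL's default mode, so a file that shrank would otherwise break the query.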

Obviously, this could be tweaked in any number of ways, based on your needs. One tweak to consider if you’re running it from a daily cron is to remove the echo so you don’t get an email report of every run. You might also want to record more than one snapshot per file per host per day, in which case you probably need to change the type of the ts column to datetime. Or there might be cases where you want to change the replace to an insert or… whatever ;)
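
As one possible take on the echo tweak, the sketch below keeps the output when the script is run by hand but stays silent under cron, by checking whether standard output is a terminal. report is a hypothetical name, not part of the script above.

```shell
#!/bin/bash

# report MESSAGE: print MESSAGE only when stdout is a terminal.
# Under cron, stdout is not a terminal, so nothing is printed and no
# email is generated. Hypothetical helper.
report() {
    if [ -t 1 ]; then
        echo "$1"
    fi
}
```

To use it, the echo line in the loop would become report "$BASE = $SIZE". Error messages from mysql or du still go to standard error, so cron would continue to mail you when something actually fails.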