Ok, so it’s 4am and I’m awake. Nothing new there. I was scrounging sites and looking for some bit of information obscure enough that I’d actually forgotten what my original goal was. Thus, it was only natural that I got sidetracked by whatever shiny object floated past me as I clicked URLs at random.

I landed on a French anime wallpaper site. I’d not been there before, and I started looking at a decently sized collection of images that I’d also never before encountered. Naturally, I decided to acquire them – but the sheer scope of the site lent a non-trivial amount of difficulty to this. At least, it did until I realized that their images were stored in sequentially numbered files in sequentially numbered directories. No zero padding or anything.
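To make that layout concrete, here's a quick sketch that prints the sort of URLs I mean. `host` is a stand-in for the real site, `list_urls` is just an illustrative helper name, and the counts are made up for the example.

```bash
#!/usr/bin/env bash
# list_urls DIRS IMGS: print URLs for directories 0..DIRS-1, each holding
# images 0..IMGS-1. "host" is a placeholder, not the real site.
list_urls() {
    local dirs=$1 imgs=$2 d=0 i
    while [ "$d" -lt "$dirs" ]; do
        i=0
        while [ "$i" -lt "$imgs" ]; do
            # Plain integers, no zero padding: host/0/0.jpg, host/0/1.jpg, ...
            echo "host/$d/$i.jpg"
            i=$((i + 1))
        done
        d=$((d + 1))
    done
}
```

Running `list_urls 2 2` prints `host/0/0.jpg`, `host/0/1.jpg`, `host/1/0.jpg` and `host/1/1.jpg`, one per line.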

This led to the following script, which is currently quite happily churning its way through the 36th directory – whatever show that may actually happen to be ;)

I am making one assumption here: that they haven’t divided things up into more than 500 directories. I could just as easily have used the same sort of while loop on the outside as I did on the inside, but this is also meant as demonstrative code for future reference.

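#!/usr/bin/bash
# "host" is a placeholder for the site's base URL.
for DIR in $(seq 0 500)
do
        mkdir -p "$DIR"
        cd "$DIR" || exit 1
        IMG=0
        # Keep fetching until wget fails (it exits non-zero when a file is missing).
        while wget -c "host/$DIR/$IMG.jpg"
        do
                ((IMG += 1))
        done
        cd ..

        # rmdir only succeeds on an empty directory, i.e. nothing was downloaded.
        if rmdir "$DIR" 2>/dev/null
        then
                echo "removed $DIR, assuming we're done"
                exit
        fi
done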
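For the record, here's roughly what the while-loop-on-the-outside version might look like, which drops the 500-directory assumption. It's a sketch, not something I've run against the site: `BASE`, `fetch_dir` and `fetch_all` are names I made up, and it assumes wget exits non-zero when an image doesn't exist.

```bash
#!/usr/bin/env bash
# BASE is a placeholder for the site's base URL, like "host" in the script.
BASE=${BASE:-host}

# Fetch BASE/<dir>/0.jpg, 1.jpg, ... into ./<dir> until wget fails.
# Returns 0 if the directory ended up with files in it, 1 if it stayed empty.
fetch_dir() {
    local dir=$1 img=0
    mkdir -p "$dir" || return 1
    # The subshell keeps the cd from leaking into the caller.
    while (cd "$dir" && wget -c "$BASE/$dir/$img.jpg"); do
        img=$((img + 1))
    done
    # rmdir only succeeds on an empty directory, so success here
    # means nothing was downloaded.
    if rmdir "$dir" 2>/dev/null; then
        return 1
    fi
    return 0
}

# Walk directories 0, 1, 2, ... with no upper bound, stopping at the
# first one that comes back empty.
fetch_all() {
    local dir=0
    while fetch_dir "$dir"; do
        dir=$((dir + 1))
    done
    echo "stopped at directory $dir"
}
```

Calling `fetch_all` from the directory where the images should land kicks it off. The stopping rule is the same as before: the first directory that comes back empty is taken to mean we've run off the end of the site.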

Why not just use wget’s spider functionality? I have in the past, but not for something this fun. Besides, this way I don’t have any cleanup to do (other than identifying shows and renaming their folders when all is done).

It’s working on directory number 45 now ;)
