
How to use your unlimited quota 2AM-8AM

Fun with BASH. Disclaimer: this may contain typos, thinkos, etc. Use (or don’t use) entirely at your own risk. The intent here is to be educational, not to invite chapter-&-verse unquestioning acceptance. If the clock on the computer running this is off by more than five minutes, the downloads may (partially) eat into your monthly quota; in that case, install rdate & aim it at a suitable time server, then install/use hwclock to update the PC’s battery-backed clock.
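If the clock does need correcting, the commands look roughly like this (the time server hostname is a placeholder, and rdate option letters vary between implementations, so check your man page):

rdate -s time.example.org   # set the system clock from a time server (placeholder hostname)
hwclock --systohc           # copy the corrected system time into the battery-backed clock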

Behold, the grabqueue.sh file:

#!/bin/sh
# Work through every URL-list file in the pending/ subdirectory.
for list in pending/*; do
    [ -f "$list" ] || continue    # skip the literal glob if pending/ is empty
    # Count the lines (one URL per line) in this list file.
    N=$(wc -l < "$list")
    for url in $(seq $N); do
        # Pull out line number $url of the file.
        L=$(tail -n +$url "$list" | head -n 1)
        # Ignore empty lines; fetch everything else, resuming any partial file.
        if [ "$L" != "" ]; then
            wget -c "$L"
        fi
    done
    # Every URL in this list has been fetched, so the list can go.
    rm -f "$list"
done
# Finished normally: the 07:55 kill job has nothing left to do.
rm -f /tmp/grabqueue.pid


Run this (in cron) at 02:05 (a little clock slackness):

#!/bin/sh
# 02:05 cron job: start the downloader & note its PID for the kill job.
cd /path/to/download/directory || exit 1
sh grabqueue.sh &
echo $! >/tmp/grabqueue.pid


Run this (in cron) at 07:55 (again, clock slackness):

#!/bin/sh
# 07:55 cron job: stop the downloader if it is still running.
if [ -f /tmp/grabqueue.pid ]; then
    kill $(cat /tmp/grabqueue.pid)
    rm -f /tmp/grabqueue.pid
fi
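
For reference, the matching crontab entries might look like this (the script paths are placeholders for wherever you saved the two wrapper scripts above):

5 2 * * *   /path/to/start-grabqueue.sh
55 7 * * *  /path/to/stop-grabqueue.sh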


Discussion: put the URLs of whatever you wish to fetch into text files inside the pending subdirectory of your downloads directory, one URL per line. Empty lines in these files are ignored.
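
For example, a list file such as pending/isos.txt (the name and URLs here are placeholders) would simply read:

http://example.com/distro-disc-1.iso
http://example.com/distro-disc-2.iso
http://example.com/lecture-video.mp4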

The “-c” option (to wget) tells it to continue an existing download (or start a fresh one if the file does not yet exist locally). This means that attempting to download a file you already have takes a fraction of a second and little or no traffic.
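
A quick illustration (placeholder URL):

wget -c http://example.com/big-file.iso   # starts the download
# (killed at 08:00, leaving big-file.iso incomplete)
wget -c http://example.com/big-file.iso   # next night: resumes where it left off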

The script deletes each URL-list file after the last URL in it has been fetched.

The $! phrase in BASH expands to the PID (Process ID) of the most recently launched background process. The first cron job starts the script, then records that PID in a temporary file. The second cron job (if said file still exists) kills the process so listed, then deletes the temporary file. Anything not completely downloaded at that point will be resumed the next morning, until every download is complete.
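
A tiny demonstration at a shell prompt (the PID shown is illustrative):

sleep 60 &      # launch a background process
echo $!         # prints its PID, e.g. 12345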

Within the script, $list is the name of the URL-list file currently being processed, and $L is the line containing the URL currently being downloaded.

To download videos, make sure you have the URL of the video file itself (typically ending in .mpg, .flv or .mp4), rather than the URL of the web page on which the video is presented (so typically not ending in .html, .htm, .asp or .php).
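
In other words (hypothetical URLs):

http://example.com/clips/talk.mp4      # the video file itself: wget fetches the video
http://example.com/watch.php?v=talk    # the page around it: wget fetches only HTML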

Comments

kundip@hotmail.com said…
Here is my attempt
#!/bin/sh
for list in /home/k~p/grabqueue/*; do
    N=$(wc -l $list | sed -e 's/ .*$//')
    for url in $(seq $N); do
        L=$(tail -n +$url $list | head -n 1 -)
        if [ "$L" != "" ]; then
            wget -c "$L"
        fi
    done
    rm -f $list
done
rm -f /tmp/grabqueue.pid

I put a file named "list" in /home/k~p/grabqueue. Do I need
rm -f /home/k~p/grabqueue/list
as the last line so it does not download over and over each night?
Leon RJ Brooks said…
Brett, $list is a variable name, not a filename. Within the “for” loop, it represents each text file in the /home/k~p/grabqueue directory, one file at a time.

Each file is read; each non-empty line in the file is treated as a URL & downloaded.

YouTube URLs will only fetch the page around the video, rather than the video file itself, hence the need for the grabyoutube.sh mucking around.

This process is only useful if your ISP has a no-quota download period (which ExeTel does from 02:00 to 08:00).
Unknown said…
Thanks. I have 20 gig off-peak, 95% unused to date; I usually use the "at" command and wget with a text file. That is prone to running past 8am if links are slow to download. The best for me would be if each line of the text file were deleted as it is done, then at 8am something killed wget. Then I could grab the rest on following nights.
Leon RJ Brooks said…
Yo, that’s what the second cron job does: kills off the downloading process.

One amendment needed for the grabyoutube.sh script: ${RANDOM} fails (resolves to an empty string); it needs to be $RANDOM instead (to resolve to a random number).
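
For the delete-each-line-as-it-completes behaviour asked about above, here is a minimal sketch (not part of the original scripts; it assumes GNU sed for the -i option):

#!/bin/sh
# sketch: pop each URL off the top of its list as soon as it finishes,
# so a run killed at 08:00 loses no bookkeeping
for list in pending/*; do
    [ -f "$list" ] || continue
    while [ -s "$list" ]; do
        L=$(head -n 1 "$list")
        if [ -z "$L" ] || wget -c "$L"; then
            sed -i '1d' "$list"   # drop the line: empty, or fetched in full
        else
            exit 1                # download failed or was killed; keep the URL for tomorrow
        fi
    done
    rm -f "$list"                 # this list is now empty
done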
