I have a decent ebook library of several thousand books, all of which are assumed, for the rest of this article, to come from Project Gutenberg. I wanted to put them all on ipfs to help take some load off Anna's Archive and make it more resilient to sudden shutdowns, as they asked for help with.
The ipfs project's recommended way of interacting with the network is kubo. Unfortunately, it's no longer packaged in Alpine Linux, and the prebuilt binaries depend on glibc, so they can't run on musl. Building from source it is:
$ git clone https://github.com/ipfs/kubo.git
$ cd kubo
$ make build CGO_ENABLED=0
$ /home/ipfs/kubo/cmd/ipfs/ipfs version
ipfs version 0.19.0-dev
$
My collection of ebooks is around 50G and sits on my NAS, so I really don't want
it duplicated just to be put into ipfs, which is what kubo does by default.
Fortunately, the --nocopy option was added in 2017 to allow exactly this
behaviour, although it's still marked as experimental. Enabling "Accelerated DHT"
is also handy to speed things up, so make sure to turn it on as well.
I'm using Proxmox as my hypervisor, so I could mount my ebooks read-only
inside my ipfs container by adding
mp0: /mnt/nfs/books/,mp=/home/ipfs/books/,ro=1
to its configuration file.
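A quick way to confirm, from inside the container, that the mount really is read-only is to look at its options in /proc/mounts. The function below is a minimal sketch of that check: the name mount_is_ro and its optional second argument (an alternative mounts table, handy for trying it out) are illustrative, not something Proxmox or kubo provides.

```sh
#!/bin/sh
# mount_is_ro MOUNTPOINT [MOUNTS_FILE]
# Succeeds (exit 0) if MOUNTPOINT is listed with the "ro" option.
# MOUNTS_FILE defaults to /proc/mounts; the function name and the
# second argument are illustrative, not part of Proxmox or kubo.
mount_is_ro() {
    awk -v mp="${1%/}" '
        $2 == mp {
            # Later entries shadow earlier ones, so reset on each match.
            ro = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") ro = 1
        }
        END { exit !ro }
    ' "${2:-/proc/mounts}"
}

if mount_is_ro /home/ipfs/books; then
    echo "books are mounted read-only"
else
    echo "books are NOT mounted read-only (or not mounted at all)"
fi
```

A trailing slash on the mount point is stripped before comparing, since /proc/mounts lists paths without one.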
Because I'm using calibre to manage my virtual library, book covers (jpg, png, …)
and metadata files (opf) have to be excluded. Also make sure to use
--hash=blake2b-256 --chunker=size-1048576, since this is what Anna's Archive
uses. "Interestingly", one can pick from hundreds of hashing primitives to
generate a CID on ipfs.
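To preview which files survive that exclusion before publishing anything, the same three globs can be approximated with find. This is only a sketch: kubo's --ignore takes gitignore-style rules while find -name matches base names, which should be equivalent for simple extension patterns like these, and list_publishable is a hypothetical helper name.

```sh
#!/bin/sh
# list_publishable DIR
# Prints the regular files under DIR that are not covers (*.jpg, *.png)
# or calibre metadata (*.opf). Hypothetical helper; it only approximates
# the --ignore rules passed to "ipfs add".
list_publishable() {
    find "$1" -type f \
        ! -name '*.jpg' ! -name '*.png' ! -name '*.opf' | sort
}

# Example: count how many files would be published from ./books/
if [ -d ./books ]; then
    list_publishable ./books | wc -l
fi
```

Anything unexpected in that listing (other image formats, stray temporary files) is a hint that another --ignore rule is needed.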
$ /home/ipfs/kubo/cmd/ipfs/ipfs init --profile server
$ /home/ipfs/kubo/cmd/ipfs/ipfs config --json Experimental.AcceleratedDHTClient true
$ /home/ipfs/kubo/cmd/ipfs/ipfs config --json Experimental.FilestoreEnabled true
$ /home/ipfs/kubo/cmd/ipfs/ipfs add -r --pin=true --hash=blake2b-256 --chunker=size-1048576 --nocopy --ignore='*.jpg' --ignore='*.png' --ignore='*.opf' ./books/
[……]
added bafykbzacedbkjcavohu3tghnva6v3nwgpdtz3umtt3v7s5tgmjyeltrjsftmw books/Pierre-Joseph Proudhon/De la justice dans la Revolution et dans l'Eglise (515)
added bafykbzacebwarcecieaoeeftrex4r5yisicztrzyjdarpw3ijgypsq4fpnzlm books/Pierre-Joseph Proudhon/Qu'est-ce que la propriete _ (197)
[……]
added bafykbzacec7xpt2ojfyfeyqhw76yyzukeym4nvwhukcmyincefuyp5jzrfnhq books/William Shakespeare/Macbeth (389)
[……]
48.36 GiB / 48.36 GiB [===============================================] 100.00%
$ /home/ipfs/kubo/cmd/ipfs/ipfs swarm peers | wc -l
1337
$ /home/ipfs/kubo/cmd/ipfs/ipfs stats bw --poll=true --interval=1s
Total Up    Total Down    Rate Up    Rate Down
2.9 GB      5.2 GB        27 MB/s    51.3 MB/s
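For a rough sense of scale, --chunker=size-1048576 cuts each file into fixed 1 MiB (1048576-byte) pieces, so the number of data chunks for a file is its size divided by 1048576, rounded up. A small sketch of that arithmetic follows; chunk_count is an illustrative name, and it deliberately ignores the extra intermediate nodes that link the chunks together.

```sh
#!/bin/sh
# chunk_count SIZE_IN_BYTES [CHUNK_SIZE]
# Prints how many fixed-size chunks a file of that size is split into.
# CHUNK_SIZE defaults to 1048576, the value passed to --chunker.
chunk_count() {
    size=$1
    chunk=${2:-1048576}
    # Integer ceiling division: (size + chunk - 1) / chunk
    echo $(( (size + chunk - 1) / chunk ))
}

chunk_count 1048576   # exactly 1 MiB: one chunk
chunk_count 1048577   # one byte over: two chunks
chunk_count 5000000   # a 5 MB file: five chunks
```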
Now that things are working, it's only a matter of writing a simple OpenRC
service script, running rc-update add ipfs and rebooting:
#!/sbin/openrc-run
name="ipfs"
command="/home/ipfs/kubo/cmd/ipfs/ipfs"
command_args="daemon"
pidfile="/run/ipfs.pid"
command_user="ipfs"
command_background=true
depend() {
    need net
}
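Putting the script in place might look like the sketch below. It assumes the script above was saved as ./ipfs in the current directory and that the commands are run as root; the run helper and the DRY_RUN variable are illustrative additions that print each command instead of executing it, so the steps can be reviewed first. It also starts the service with rc-service rather than rebooting.

```sh
#!/bin/sh
# Print commands instead of running them when DRY_RUN is set
# (illustrative helper, not part of OpenRC).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# enable_ipfs_service SCRIPT
# Installs SCRIPT as /etc/init.d/ipfs, registers it in the default
# runlevel and starts it right away, without waiting for a reboot.
enable_ipfs_service() {
    run install -m 0755 "$1" /etc/init.d/ipfs &&
    run rc-update add ipfs default &&
    run rc-service ipfs start
}

# Review the steps first; unset DRY_RUN to actually apply them.
DRY_RUN=1
enable_ipfs_service ./ipfs
```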
And here we go, my whole library on ipfs for the whole world to enjoy:
$ curl -s -i https://ipfs.io/ipfs/bafykbzacedbkjcavohu3tghnva6v3nwgpdtz3umtt3v7s5tgmjyeltrjsftmw/ | grep epub -m 1
>De la justice dans la Revolution et dans l - Pierre-Joseph Proudhon.epub
$ curl -s -i https://ipfs.io/ipfs/bafykbzacebwarcecieaoeeftrex4r5yisicztrzyjdarpw3ijgypsq4fpnzlm/ | grep epub -m 1
>Qu'est-ce que la propriete _ - Pierre-Joseph Proudhon.epub
Don't forget to add a crontab entry to automatically add new books:
# echo "0 0 * * 0 /home/ipfs/kubo/cmd/ipfs/ipfs add -r --pin=true --hash=blake2b-256 --chunker=size-1048576 --nocopy --ignore='*.jpg' --ignore='*.png' --ignore='*.opf' /home/ipfs/books/" >| /etc/crontabs/ipfs
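Since that long command now lives in two places, one option is to keep it in a small wrapper script and point the crontab at that instead. The sketch below is one way to do it; the add_books function and the IPFS_BIN and BOOKS_DIR variables are hypothetical, with defaults taken from the paths used earlier in this article.

```sh
#!/bin/sh
# Hypothetical wrapper around the "ipfs add" invocation used in this article.
# IPFS_BIN and BOOKS_DIR can be overridden from the environment.
IPFS_BIN=${IPFS_BIN:-/home/ipfs/kubo/cmd/ipfs/ipfs}
BOOKS_DIR=${BOOKS_DIR:-/home/ipfs/books/}

add_books() {
    # Same flags as the manual run: Anna's Archive hash and chunker settings,
    # no duplication of the files, covers and calibre metadata ignored.
    "$IPFS_BIN" add -r --pin=true \
        --hash=blake2b-256 --chunker=size-1048576 --nocopy \
        --ignore='*.jpg' --ignore='*.png' --ignore='*.opf' \
        "$BOOKS_DIR"
}

# Report a missing binary explicitly instead of attempting the add.
if [ -x "$IPFS_BIN" ]; then
    add_books
else
    echo "ipfs binary not found at $IPFS_BIN" >&2
fi
```

The crontab entry then shrinks to the schedule followed by the path of wherever this script is saved, and the flags only have to be kept in sync in one place.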