git-annex is magic

a shell adventure

 

Open your Terminal

This is a shell adventure

You can (safely) follow along by typing or copying
all the commands into your terminal.

You'll need a relatively recent Linux or Mac system, and maybe
a USB stick. The stick can be simulated by any folder
on the computer, so you can also try it out on any machine.

Commands are lines that start with a '$' sign;
the '$' is not part of the command.

Try it out now:

date "+%s"
1385670018
^ There are snippets of the expected output, where it is important.

Alias

For educational purposes, we define an alias,
so it is clearer whether we are using git or git annex.

alias magic="git annex"

We will be defining more variables in this adventure,
so make sure not to close your shell (or add the definitions to your .bashrc).
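
One caveat if you later move these commands into a script: bash only expands aliases in interactive shells by default. A minimal sketch of an alternative, assuming you want the same name to work in scripts too, is a shell function. The function is our own illustration, not something git-annex ships.

```shell
# Hypothetical drop-in for the alias: a function also works in
# non-interactive shells, where bash does not expand aliases by default.
unalias magic 2>/dev/null || true   # avoid a clash with the alias above

magic() {
    git annex "$@"   # pass all arguments through unchanged
}
```

After this, magic version behaves the same whether you type it or run it from a script.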

Download and Install

Official instructions and downloads for every OS are here:

https://git-annex.branchable.com/install/

Install

Linux:

tar xvzf git-annex-standalone-i386.tar.gz
APP="$HOME/git-annex.linux"  # assumes you unpacked in your home folder

OS X:

open ~/anx.dmg

Drag and Drop App to /Applications

APP="/Applications/git-annex.app\
/Contents/MacOS"

$PATH

  • It should work via shell and ssh!

  • Linux: at beginning of '/etc/bash.bashrc'!
    • user must have shell /bin/bash in /etc/passwd
  • OSX: ~/.bashrc and ??? for ssh
  • If it doesn't work: ~/.bashrc is enough for this adventure

export PATH="$PATH":"$APP"
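
If you paste the export line more than once, $PATH grows a duplicate entry each time. A small sketch of a repeatable variant, assuming APP is set as above (the fallback to the Linux path is our assumption), which also reports whether the binary can be found:

```shell
# Assumption: APP points at the unpacked git-annex directory (see above).
APP="${APP:-$HOME/git-annex.linux}"

# Append $APP only if it is not already a PATH entry.
case ":$PATH:" in
    *":$APP:"*) ;;                        # already there, nothing to do
    *) PATH="$PATH:$APP"; export PATH ;;
esac

# Report whether the shell can now find the git-annex binary.
if command -v git-annex >/dev/null 2>&1; then
    echo "found: $(command -v git-annex)"
else
    echo "git-annex is not on PATH yet"
fi
```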

Save Point #1:
check install

  • in a shell
magic version
git-annex version: 5.20131117-gbd514dc
build flags: Assistant Webapp …
key/value backends: SHA256E WORM …
remote types: git directory…
  • via ssh (optional)
ssh localhost 'git annex version'
git-annex version: 5.20131117-gbd514dc
…

Works?




Workflow

  1. init
  2. add $FILES
  3. clone
  4. setup & sync
  5. use

magic init

We start with a new folder. This could be your new Dropbox.

mkdir ~/magicfolder
cd ~/magicfolder
git init
Initialized empty Git repository …
magic init 'My Laptop'
init My Laptop ok
(Recording state in git...)

magic add

We make a sub-folder with a new text file inside.

cd ~/magicfolder
mkdir foo && echo 'simsalabim' > foo/bar.txt
magic add foo/bar.txt
add foo/bar.txt (checksum...) ok
(Recording state in git...)
git commit -m 'added'
[master (root-commit) 67a2114] added
 1 file changed, 1 insertion(+)
 create mode 120000 foo/bar.txt

magic sync

To make things easier, you can always use sync.

Run it after every change (or at intervals with cron).

It does the following:

  • auto-commit
  • push to all remotes
  • fetch from all remotes
  • merge with all remotes
cd ~/magicfolder
echo '1' > one.txt
magic sync
commit  
ok
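
For the cron idea, a sketch of a small wrapper could look like this. The helper name and the DRY_RUN switch are made up for illustration. Note that cron knows nothing about our alias, so the wrapper calls git annex directly.

```shell
# Hypothetical helper: sync one repository, e.g. from a cron job.
# Set DRY_RUN=1 to print the command instead of running it.
sync_repo() {
    repo=$1
    if [ ! -d "$repo" ]; then
        echo "skip: $repo (not a directory)"   # e.g. stick not plugged in
        return 0
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "would run: git annex sync in $repo"
    else
        ( cd "$repo" && git annex sync )
    fi
}

# Try it without touching anything:
DRY_RUN=1 sync_repo "${TMPDIR:-/tmp}"
```

Saved as a script, it could be called from a crontab entry for ~/magicfolder; a missing folder is skipped instead of causing an error.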

git clone

Now clone this repo using any git transport (file, ssh, https,…).

Note that file works with everything you can mount somehow,
including Dropbox, SFTP, SMB, NFS, etc.

STICK="/Volumes/USBSTICK"
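
As mentioned at the start, the stick can be simulated by any folder. A minimal sketch, assuming the mount point above and a fallback folder name we made up (~/fake-usbstick):

```shell
# Use the real stick if it is mounted, otherwise simulate it with a folder.
STICK="/Volumes/USBSTICK"
if [ ! -d "$STICK" ]; then
    STICK="$HOME/fake-usbstick"   # hypothetical folder name
    mkdir -p "$STICK"
fi
echo "using stick at: $STICK"
```

Either way, "$STICK/magicfolder" in the following commands points somewhere that exists.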

Clone:

git clone file://$HOME/magicfolder "$STICK/magicfolder"

Alternative Clone: SSH

git clone user@host:magicfolder "$STICK/magicfolder"

Init:

cd "$STICK/magicfolder"
magic init 'My USB Stick'
init My USB Stick
ok
(Recording state in git...)

magic sync #2

magic sync
(merging origin/git-annex into git-annex...)
(Recording state in git...)
commit  
ok
pull origin
ok
push origin
Writing objects: 100% (8/8), 776 bytes …
 * [new branch] git-annex -> synced/git-annex
ok

look what we've got

We seem to have our file…

ls foo/bar.txt
foo/bar.txt

(bear with me for a second)

connect the remotes

Standard git setup.

For the Laptop:

cd ~/magicfolder
git remote add stick "$STICK/magicfolder"

For the stick:

cd "$STICK/magicfolder"

The origin remote points at the same laptop repository, so it is redundant here and can be removed or kept.

git remote remove origin # optional
git remote add laptop ~/magicfolder

magic sync #3

With the configured remotes, sync now also does pull and push.

cd ~/magicfolder
magic sync
(merging synced/git-annex into git-annex...)
commit  
ok
pull stick
From /Volumes/USBSTICK/magicfolder
 * [new branch] git-annex  -> stick/git-annex
 * [new branch] master     -> stick/master
 * [new branch] synced/master -> stick/synced/master
ok

Now what?

We have just cloned the metadata.

All the files and folders are on the stick, as symlinks!

magic takes care of the symlinks.

Metadata includes info about the remotes, and what files they have available!

cd "$STICK/magicfolder"
ls foo/bar.txt # foo/bar.txt
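
So ls finds the name, but the symlink points at content that is not there yet. A sketch of how to tell the two states apart in plain shell. The helper name is ours, and the demo fakes a dangling symlink in a scratch folder so it runs without git-annex.

```shell
# Hypothetical helper: report whether a file's content is available locally.
content_state() {
    if [ -L "$1" ] && [ ! -e "$1" ]; then
        echo "missing"        # dangling symlink: content not here
    elif [ -e "$1" ]; then
        echo "present"
    else
        echo "no such file"
    fi
}

# Simulation only: fake an annexed file whose content has not been fetched.
demo=$(mktemp -d)
ln -s .git/annex/objects/not-fetched-yet "$demo/bar.txt"
content_state "$demo/bar.txt"   # prints "missing"
rm -rf "$demo"
```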

magic get a file

So, how do we get the content?

To manually transfer files, use get.

cd "$STICK/magicfolder"
magic get foo/bar.txt
cat foo/bar.txt
simsalabim

magic drop a file

To delete a file's content from just one repository, use drop.

This is useful for freeing up space on laptops with small hard drives, etc.

cd "$STICK/magicfolder"
magic drop foo/bar.txt
cat foo/bar.txt # > "No such file or directory"
magic sync

wait…, what?

dropping?

Isn't that a fancy word to say DELETE?

is this safe?

YES it is safe!

(unless you use the --force)

This is why we want to have a record of all files and of which remotes hold them.

Let's try 'dropping' it on our Laptop, too:

cd ~/magicfolder/
magic drop foo/bar.txt
drop foo/bar.txt (merging synced/git-annex into git-annex...)
(unsafe)
  Could only verify the existence of 0 out of 1 necessary copies
  (Use --force to override this check, or adjust annex.numcopies.)
failed
git-annex: drop: 1 failed
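
The refusal above comes down to a copy count: before dropping, magic wants to verify that enough other copies exist. A toy model of that rule follows. It is not git-annex's actual code, and the function name and arguments are made up for illustration.

```shell
# Toy model: a drop is safe only if at least $2 copies could be
# verified in OTHER repositories ($1 = number of verified copies).
can_drop() {
    verified=$1
    needed=$2
    if [ "$verified" -ge "$needed" ]; then
        echo "ok"
    else
        echo "unsafe: could only verify $verified out of $needed necessary copies"
    fi
}

can_drop 0 1   # the situation on the laptop above
can_drop 1 1   # once another repository holds a copy
```

The first call reports unsafe, the second ok, which is why copying the file to the stick comes next.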

ok, let's copy

What if I want to push a file to the stick?

  • copy --to a remote
  • also: move --to!
cd ~/magicfolder/
magic copy foo/bar.txt --to stick
copy foo/bar.txt (to stick...)
SHA256E-s11--9062535d58….txt
  11 100%    0.00kB/s    0:00:00 (xfer#1, to-check=0/1)

sent 173 bytes  received 42 bytes  430.00 bytes/sec
total size is 11  speedup is 0.05
ok
(Recording state in git...)
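
The odd-looking name in the transfer output is the file's key. Judging from that output, a SHA256E key is built from the file size, the SHA-256 hash and the file extension. A sketch that rebuilds such a name by hand, assuming sha256sum or shasum is installed; the helper name is ours, and git-annex's own rules for extensions may be more subtle.

```shell
# Hypothetical helper: build a SHA256E-style key for a file.
# Format as seen above: SHA256E-s<size>--<sha256><.ext>
annex_style_key() {
    file=$1
    size=$(wc -c < "$file" | tr -d ' ')
    if command -v sha256sum >/dev/null 2>&1; then
        hash=$(sha256sum < "$file" | cut -d ' ' -f 1)
    else
        hash=$(shasum -a 256 < "$file" | cut -d ' ' -f 1)
    fi
    case "${file##*/}" in
        *.*) ext=".${file##*.}" ;;   # keep the extension, if any
        *)   ext="" ;;
    esac
    echo "SHA256E-s${size}--${hash}${ext}"
}

tmp=$(mktemp -d)
echo 'simsalabim' > "$tmp/bar.txt"
annex_style_key "$tmp/bar.txt"    # size part is s11, as in the output above
rm -rf "$tmp"
```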

magic whereis

We can also just list the remotes where a file is available,
without trying to drop it:

cd ~/magicfolder
magic whereis foo/bar.txt
whereis foo/bar.txt (2 copies)
    ab8aad2f-f87e-440e-baef-… -- here (My Laptop)
    f7feb954-0250-4117-a368-… -- stick (My USB Stick)
ok
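
If you want just the remote names, for example in a script, the output can be filtered. A sketch that works on the sample text above, assuming the line format shown there (other git-annex versions may print it differently); the helper name is ours.

```shell
# Sample text copied from the output above (UUIDs are truncated there too).
sample='whereis foo/bar.txt (2 copies)
    ab8aad2f-f87e-440e-baef-… -- here (My Laptop)
    f7feb954-0250-4117-a368-… -- stick (My USB Stick)
ok'

# Print the remote name of every location line (the word after " -- ").
list_locations() {
    awk -F ' -- ' 'NF == 2 { split($2, words, " "); print words[1] }'
}

printf '%s\n' "$sample" | list_locations
```

On real data you would pipe magic whereis foo/bar.txt into list_locations instead of the sample.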

Save Point #2:

  • OK so far? Cool.

  • Not ok?

    • remove all
    cd ; \
    sudo rm -rf \
    ~/magicfolder \
    "$STICK/magicfolder"

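The sudo is there because magic keeps stored content read-only, so a plain rm -rf can fail with permission errors. A sketch of an alternative without sudo, assuming you own the files: make everything writable first. The helper name is ours, and the demo runs on a throwaway folder rather than on your magicfolder.

```shell
# Hypothetical helper: remove a folder even if parts of it are read-only.
remove_tree() {
    [ -d "$1" ] || return 0
    chmod -R u+w "$1" && rm -rf "$1"
}

# Demo on a scratch folder with a read-only subfolder.
scratch=$(mktemp -d)
mkdir -p "$scratch/objects"
echo 'data' > "$scratch/objects/file"
chmod -R a-w "$scratch/objects"
remove_tree "$scratch"
[ -e "$scratch" ] || echo "removed"
```
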
Questions?




editing Files

Files are locked by default!

Also, while the content is not tracked in git by magic,
you can still commit with git, and magic will take care of it (w/ git-hooks).

cd ~/magicfolder
echo 'fail' > foo/bar.txt
Permission denied

editing Files

We need to unlock it first!

magic unlock foo/bar.txt
echo 'booya' > foo/bar.txt  # No error!

Now "commit and push".

magic add foo/bar.txt
git commit -m 'changed foo bar'
magic sync

Note: We didn't really have to git commit!
magic sync would have done that for us,
but committing let us put in our own message.

Syncing the edits

To get the file to the stick,

cd "$STICK/magicfolder"

we sync, which fetches the metadata with git,
and then get . (everything) to actually transfer the file.

magic sync && magic get .
cat foo/bar.txt
booya

…"locking", eh?

Doesn't this locking/unlocking seem complicated
in comparison to, say, Dropbox?

YES. That is why we use 'direct mode'.

direct mode

You may now forget the slides about locking/editing.

cd ~/magicfolder
magic direct
…
direct  ok

direct mode

All available files are normal files;
all unavailable files are symlinks
(and we can use magic get to make them available).

Do something:

echo 'pow' > foo/bar.txt
magic sync

Sync it!

cd "$STICK/magicfolder"
magic sync && magic get .
cat foo/bar.txt #
pow

Note: We could also have used copy --to stick.

special remotes

Usage: If you just want to copy somewhere, for backup or transfer.

If you look at the directory, you will see just a bunch of object files.

Special remotes are (GPG) encrypted by default,
so you have to explicitly turn it off (encryption=none).

Once set up with initremote,
magic can use them like any other remote – it just works™.

encrypted folder in Dropbox

The remote will have a name and a GPG key.

BOX=dropbox # name of the remote
BOX_PATH="$HOME/Dropbox/BACKUP/magic/" # "~" would not expand inside quotes
KEY=965113EA # YOUR GPG keyid - don't take mine ;)

Make a new folder in the Dropbox:

mkdir -p "$BOX_PATH"
cd ~/magicfolder
magic initremote $BOX type=directory \
directory="$BOX_PATH" \
keyid=$KEY
initremote dropbox (encryption setup)
(hybrid cipher with gpg key 3771835A3BADB56D) ok

Alternative if you don't have a key handy (unencrypted):

magic initremote $BOX type=directory \
directory="$BOX_PATH" \
encryption=none

encrypted copy to Dropbox

It works like any other remote, plus a passphrase prompt for the key.

cd ~/magicfolder
magic copy . --to $BOX
copy foo/bar.txt (gpg)
You need a passphrase to unlock the secret key for
user: "Max F. Albrecht <1@178.is>"
4096-bit RSA key, ID 965113EA, created 2013 (main key ID FOO)

(to dropbox...)
ok

more special remotes

  • xmpp (jabber!)
  • rsync
  • bup
  • web
  • webdav
  • (Amazon) S3
  • Amazon Glacier
  • More in the Docs

more backends

  • how files are matched with their metadata.
    obviously very important.

  • default is SHA256E, a file hash plus the file extension

  • WORM: alternative if you don't want hashing, uses just file name, date, etc. useful if:
    1. you trust the disk, or
    2. you don't care, or
    3. you need performance (2 TB of videos on a Raspberry Pi)
cd ~/magicfolder
echo '* annex.backend=WORM' > .gitattributes
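
Note that the line above overwrites any existing .gitattributes and switches every file to WORM. A sketch of a gentler variant, assuming you only want WORM for large videos (the *.mkv pattern is just an example) and want to keep existing attribute lines; the helper name is ours.

```shell
# Hypothetical helper: add an attribute line only if it is not there yet.
add_attr_line() {
    line=$1
    file=${2:-.gitattributes}
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo in a scratch folder; in real use, run it inside ~/magicfolder.
scratch=$(mktemp -d)
add_attr_line '*.mkv annex.backend=WORM' "$scratch/.gitattributes"
add_attr_line '*.mkv annex.backend=WORM' "$scratch/.gitattributes"   # no duplicate
cat "$scratch/.gitattributes"
rm -rf "$scratch"
```

Files that match no pattern keep the default backend.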

Save Point #3:

Use the direct mode for normal folders,
and the indirect mode for normal git repos + magic.

Use sync to sync metadata between remotes.

Use get, drop, copy and move to sync content between remotes.

Questions?




assistant

assistant

  • daemon
  • watches folders (also: just magic watch)
  • show repos, remotes, transfers
  • remote settings
cd ~/magicfolder
magic assistant # or 'magic webapp' to re-open

screenshot

Git Annex Assistant (watching)

assistant:
internals

  • uses direct mode by default
  • uses only special remotes by default
    • good because they are encrypted
  • apart from that, everything done manually still works!

tips & tricks

  • --fast: do stuff faster.
    For example, rely on local data instead of updating before checking.

  • magic fsck: for the data conscious

  • magic describe: change the description
    that was set with magic init 'My Laptop'.

  • always try to use the faster machine for hashing…

fin

made with pandoc and reveal.js.
