mirror of
https://github.com/usatiuk/backup.git
synced 2025-10-26 17:37:47 +01:00
init
102
README.md
Normal file
# Backup app

## Key features

- Deduplicated backups with additional compression and encryption
    - Files are split into chunks and stored in the repository as lists of
      pointers to these chunks
    - Chunks are shared by all files in the repository whenever their
      hashes match
    - When an archive is created, a file that hasn't changed is reused
      from a previous archive
    - Content-defined chunking algorithm based on a rolling hash
        - Unlike splitting a file into fixed-size chunks, this still finds
          matching chunks after bytes are inserted or deleted at the
          beginning or in the middle of the file
- Files, Chunks and Archives are stored in an object storage Repository
    - Default implementation: `FileRepository`, which stores these objects
      in the file system
    - Other possible implementations: database, cloud object storage…
- Multithreading
- Directories are ignored if they contain a `.nobackup` file
- Files in a directory are ignored recursively based on a `.ignore`
  file: each line contains one regex rule, and any directory entry that
  matches a rule is ignored

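
The idea behind content-defined chunking with a rolling hash can be sketched as follows. This is a minimal illustration and not the chunker that `sembackup` actually uses: the polynomial hash, the window size, the cut mask and the size limits are all assumptions picked for the example. A chunk ends wherever the hash of the last `WINDOW` bytes has its low bits equal to zero, so cut points depend on the content rather than on absolute offsets.

```python
from typing import List

# All constants below are illustrative assumptions, not sembackup defaults.
WINDOW = 16                # number of bytes covered by the rolling hash
BASE = 257                 # polynomial base
MOD = (1 << 61) - 1        # large prime modulus
MASK = (1 << 8) - 1        # cut when the low 8 bits of the hash are zero
MIN_SIZE = 64              # never cut before this many bytes
MAX_SIZE = 2048            # always cut at this many bytes
POW_OUT = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def split_chunks(data: bytes) -> List[bytes]:
    """Split data at positions chosen by a rolling hash of the last WINDOW bytes."""
    chunks = []
    start = 0  # index where the current chunk begins
    h = 0      # rolling hash, reset at every chunk boundary
    for i, byte in enumerate(data):
        pos = i - start
        if pos >= WINDOW:
            # Drop the byte that slides out of the window.
            h = (h - data[i - WINDOW] * POW_OUT) % MOD
        # Shift the window and take in the new byte.
        h = (h * BASE + byte) % MOD
        size = pos + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because each cut depends only on the preceding `WINDOW` bytes (plus the minimum and maximum sizes), prepending a few bytes to the input typically changes only the first chunk or two; the later chunks come out identical and can be deduplicated by hash.
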
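
The `.nobackup` and `.ignore` rules can be pictured with a short directory walk. This is only a sketch under assumptions: the function names are hypothetical, rules are matched against the entry name with a full match, and Python's `re` syntax stands in for whatever regex dialect `sembackup` uses.

```python
import os
import re
from typing import Iterator, List, Sequence


def load_rules(directory: str) -> List[re.Pattern]:
    """Read one regex per non-empty line from <directory>/.ignore, if present."""
    path = os.path.join(directory, ".ignore")
    if not os.path.isfile(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [re.compile(line.strip()) for line in f if line.strip()]


def walk(root: str, inherited: Sequence[re.Pattern] = ()) -> Iterator[str]:
    """Yield the files under root that would be backed up."""
    # A directory containing a .nobackup file is skipped entirely.
    if os.path.exists(os.path.join(root, ".nobackup")):
        return
    # Rules from parent directories keep applying further down the tree.
    rules = list(inherited) + load_rules(root)
    for name in sorted(os.listdir(root)):
        # Assumption: a rule must match the whole entry name.
        if any(rule.fullmatch(name) for rule in rules):
            continue
        path = os.path.join(root, name)
        if os.path.isdir(path) and not os.path.islink(path):
            yield from walk(path, rules)
        else:
            yield path
```

With a `.ignore` file containing the single line `.*\.log` in the top directory, every `*.log` entry in that directory and in all of its subdirectories is skipped.
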
# Quick start

Initialize a repository with compression and encryption. You can also
specify other options here, such as `--from`; they are written to the
repository where possible. Changing the options of an existing
repository is not implemented yet.

    sembackup init --repo <target dir> --compression zlib --compression-level 4 --encryption aes --password <password> --salt <random salt>

Run a backup

    sembackup run --from <source dir> --repo <target dir> --password <password>

List available archives

    sembackup list --repo <repo dir> --password <password>

List files in an archive

    sembackup list-files --repo <repo dir> --password <password> --aid <archive id>

Restore a backup

    sembackup restore --repo <repo dir> --password <password> --aid <archive id> --to <destination>

Compare the source dir with the latest archive

    sembackup diff --repo <repo dir> --password <password> --from <source dir>

Compare the source dir with a specific archive

    sembackup diff --repo <repo dir> --password <password> --from <source dir> --aid <archive id>

Compare two archives

    sembackup diff --repo <repo dir> --password <password> --from <source dir> --aid <archive id> --aid2 <archive id>

Compare a subdirectory of the source dir with the latest archive

    sembackup diff --repo <repo dir> --password <password> --from <source dir> --prefix <subdir>

# Data format

All data is represented as objects stored in a repository.

An `Archive` object represents a snapshot of the file system at the
moment of its creation, and consists of a list of pointers (object ids)
to `File` objects.

A `File` object consists of its basic metadata and a list of chunks,
identified by their ids, which can be shared between multiple files (and
within the same file) if their MD5 hashes match.

A `Chunk` object is a binary blob, identified by its MD5 hash.

These objects are children of `Object`, which provides a `getKey()`
method (returning the name for an `Archive`, the path for a `File`, and
the MD5 hash for a `Chunk`) that `Repository` uses to make them easily
accessible.
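
The relationship between these objects can be sketched in a few lines. This is an illustrative in-memory model, not the project's actual classes: `MemoryRepository`, `store_file` and the field names are hypothetical, `get_key()` stands in for `getKey()`, and file metadata other than the path is left out.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Chunk:
    data: bytes

    def get_key(self) -> str:
        return hashlib.md5(self.data).hexdigest()  # chunks are keyed by MD5


@dataclass
class File:
    path: str
    chunk_ids: List[int] = field(default_factory=list)  # pointers to Chunks

    def get_key(self) -> str:
        return self.path


@dataclass
class Archive:
    name: str
    file_ids: List[int] = field(default_factory=list)  # pointers to Files

    def get_key(self) -> str:
        return self.name


class MemoryRepository:
    """Hypothetical repository: object id -> object, plus a key index."""

    def __init__(self) -> None:
        self.objects: Dict[int, object] = {}
        self.by_key: Dict[Tuple[str, str], int] = {}

    def put(self, obj) -> int:
        oid = len(self.objects) + 1
        self.objects[oid] = obj
        self.by_key[(type(obj).__name__, obj.get_key())] = oid
        return oid

    def find(self, kind: str, key: str) -> Optional[int]:
        return self.by_key.get((kind, key))


def store_file(repo: MemoryRepository, path: str, pieces: List[bytes]) -> int:
    """Store a file as a list of chunk ids, reusing chunks that already exist."""
    chunk_ids = []
    for piece in pieces:
        chunk = Chunk(piece)
        oid = repo.find("Chunk", chunk.get_key())
        if oid is None:  # only new content is written
            oid = repo.put(chunk)
        chunk_ids.append(oid)
    return repo.put(File(path, chunk_ids))
```

Storing two files that contain an identical piece writes that piece once; both `File` objects then point at the same chunk id.
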

In the default (and so far the only) repository implementation,
`FileRepository`, these objects are grouped together into files of
approximately `repo-target` MB (128 by default). A key-value index of
the stored objects is written to `index`, and an `offsets` file records
the location and offset of each object in the file system.
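
As a rough picture of this grouping, the sketch below appends objects to pack files and remembers where each one landed. The on-disk layout is not described here, so everything in the example is an assumption: the `PackWriter` class, the `pack-N` file names, and keeping the offsets table in memory instead of in an `offsets` file.

```python
import os
from typing import Dict, Tuple


class PackWriter:
    """Hypothetical writer; assumes it starts with an empty directory."""

    def __init__(self, root: str, target_bytes: int) -> None:
        self.root = root
        self.target_bytes = target_bytes
        # object id -> (pack number, offset within the pack, length)
        self.offsets: Dict[int, Tuple[int, int, int]] = {}
        self._pack_no = 0
        self._pack_size = 0

    def _pack_path(self, pack_no: int) -> str:
        return os.path.join(self.root, f"pack-{pack_no}")

    def write(self, object_id: int, payload: bytes) -> None:
        # Start a new pack once the current one has reached the target,
        # so packs end up approximately (not exactly) target-sized.
        if self._pack_size >= self.target_bytes:
            self._pack_no += 1
            self._pack_size = 0
        with open(self._pack_path(self._pack_no), "ab") as f:
            f.write(payload)
        self.offsets[object_id] = (self._pack_no, self._pack_size, len(payload))
        self._pack_size += len(payload)

    def read(self, object_id: int) -> bytes:
        pack_no, offset, length = self.offsets[object_id]
        with open(self._pack_path(pack_no), "rb") as f:
            f.seek(offset)
            return f.read(length)
```

Reading an object then takes one lookup in the offsets table and one seek into the right pack file.
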