salvage
This tool will salvage data from a (potentially corrupt) Hypertable database by recursively walking the /hypertable/tables directory in the brokered filesystem looking for CellStore files and extracting data from them. The recovered data is written into a tree of .tsv files that can be re-loaded into a clean database at a later time. For each unique table directory encountered, a .hql file will also be generated which contains the HQL required to re-create the table.
Prerequisites
This tool requires Hyperspace to be intact and running, and an FS broker to be running on localhost. These services can be started as follows:
$ ht cluster start_hyperspace
$ ht start-fsbroker hadoop
This tool extracts data only from CellStores. To extract data from commit logs, use the log_player tool.
Basic Usage
The simplest usage is to run the tool with no options, supplying only the output directory argument:
$ ht salvage output
After the tool completes successfully, the output directory will be populated with the namespace hierarchy and the .tsv and .hql files. For example:
$ tree output
output
├── alerts
│   ├── create-realtime.hql
│   └── realtime.tsv
├── cache
│   ├── create-image.hql
│   └── image.tsv
└── search
    ├── blog.tsv
    ├── create-blog.hql
    ├── create-image.hql
    ├── create-news.hql
    ├── image.tsv
    └── news.tsv
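As a sanity check before re-loading, it can help to confirm that every recovered .tsv file has its companion create-<table>.hql file. The following is a minimal sketch and not part of the salvage tool: the function name check_salvage_output is hypothetical, and it assumes only the naming convention visible in the listing above (create-<table>.hql stored next to <table>.tsv).

```shell
#!/bin/sh
# Hypothetical helper (not part of Hypertable): list every .tsv file under
# a salvage output directory that has no create-<table>.hql next to it.
# Returns 0 when every table has a schema file, 1 otherwise.
check_salvage_output() {
  missing=$(
    find "$1" -type f -name '*.tsv' | sort | while IFS= read -r tsv; do
      table=$(basename "$tsv" .tsv)                 # blog.tsv -> blog
      hql="$(dirname "$tsv")/create-$table.hql"     # expected companion file
      [ -f "$hql" ] || printf '%s\n' "$tsv"
    done
  )
  if [ -n "$missing" ]; then
    printf 'missing schema for:\n%s\n' "$missing"
    return 1
  fi
  return 0
}
```

With the function defined in the current shell, running check_salvage_output output prints nothing and returns 0 when the tree is complete.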
Include and Exclude
To exclude a specific namespace or table, the --exclude option may be used:
$ ht salvage --exclude search output
$ tree output
output
├── alerts
│   ├── create-realtime.hql
│   └── realtime.tsv
└── cache
    ├── create-image.hql
    └── image.tsv
To include a specific namespace or table, the --include option may be used:
$ ht salvage --include search/blog output
$ tree output
output
└── search
    ├── blog.tsv
    └── create-blog.hql
Restricting Row Space
To restrict the salvaged data to a specific row key range, use the --start-key and --end-key options. For example:
$ ht salvage --start-key "019999999" --end-key "030000000" output
Path Regex
With the --path-regex option, a regular expression can be supplied to specify which directories in the brokered filesystem should be included in the recovery. For example:
$ ht salvage --verbose --path-regex "/hypertable/tables/[2-3]" output
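To get a rough idea of which directories a pattern would select before running a recovery, the expression can be tried against a list of candidate paths with grep. This is only an approximation: it assumes the tool applies the regular expression as an unanchored search and that the pattern is valid POSIX extended syntax, neither of which is confirmed here. The function name preview_path_regex and the sample paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: filter candidate paths (one per line on stdin)
# by an extended regular expression, using unanchored matching.
# Assumption: this approximates how --path-regex selects directories.
preview_path_regex() {
  grep -E -- "$1"
}

# Sample paths are made up for illustration.
printf '%s\n' \
  /hypertable/tables/1/default \
  /hypertable/tables/2/default \
  /hypertable/tables/3/default \
  /hypertable/tables/25/default |
  preview_path_regex '/hypertable/tables/[2-3]'
```

Under these assumptions the pattern keeps the paths under 2 and 3 but also the one under 25, because [2-3] constrains only the first character after the final slash. Ending the pattern with a slash, as in "/hypertable/tables/[2-3]/", would narrow the preview to 2 and 3 alone.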