Wednesday, 19 July 2017

HBase - Identifying and moving corrupted HFiles


Identifying corrupted files:


HBase's hbck utility can check for corrupted HFiles:
hbase hbck -checkCorruptHFiles

The above command checks every table in HBase for corrupted files.
To check a specific table, use the following command:
hbase hbck -checkCorruptHFiles <table_name>

If any files are corrupted, the output log will contain something similar to the following:

Checked 17 hfile for corruption
  HFiles corrupted:                  2
    HFiles moved while checking:     0
Summary: CORRUPTED


The logs also contain details about which files are corrupted.
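
When the check runs from a script, the corrupted-file count can be pulled out of the summary. The following is only a minimal shell sketch, assuming the report has the same layout as the sample above; the function name corrupt_hfile_count is made up for illustration and is not part of HBase.

```shell
# Hypothetical helper, not part of HBase.
# Reads hbck output on stdin and prints the number found at the end of the
# first "HFiles corrupted:" line. Prints nothing if no such line exists.
corrupt_hfile_count() {
  awk '/HFiles corrupted:/ && !seen { print $NF; seen = 1 }'
}
```

It could be used as: hbase hbck -checkCorruptHFiles 2>&1 | corrupt_hfile_count
For the sample output above it prints 2.
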
Another way to check whether a single file is corrupted is to read it with the HFile tool:
hbase org.apache.hadoop.hbase.io.hfile.HFile -f <path_to_hfile>
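
To run this check over several files, the command can be wrapped in a loop. The sketch below is an illustration only. It assumes the HFile tool exits with a non-zero status when it cannot read a file, which should be verified on your HBase version. The names HFILE_CMD and list_unreadable_hfiles are hypothetical.

```shell
# Command used to read one HFile. It is overridable (an assumption made for
# this sketch) so the loop can be tried without a cluster.
HFILE_CMD="${HFILE_CMD:-hbase org.apache.hadoop.hbase.io.hfile.HFile}"

# Hypothetical helper: reads HFile paths on stdin, one per line, and prints
# the paths for which the command fails.
# ASSUMPTION: a failed read is signalled by a non-zero exit status.
list_unreadable_hfiles() {
  while IFS= read -r hfile; do
    # Skip blank lines in the input list.
    [ -n "$hfile" ] || continue
    # </dev/null stops the command from consuming the path list on stdin.
    if ! $HFILE_CMD -f "$hfile" >/dev/null 2>&1 </dev/null; then
      printf '%s\n' "$hfile"
    fi
  done
}
```

Given a file hfiles.txt with one path per line, it could be used as: list_unreadable_hfiles < hfiles.txt
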


Sidelining corrupted files:


HBase also provides a utility for sidelining (quarantining) corrupted files:
hbase hbck -sidelineCorruptHFiles

The above command checks every table in HBase.
To apply it to a specific table, use the following command:
hbase hbck -sidelineCorruptHFiles <table_name>

You will see information similar to the following in the command output log:

Checked 17 hfile for corruption
  HFiles corrupted:                  2
    HFiles successfully quarantined: 2
      maprfs:/hbase/corrupt/hcrt/1090c602c005a4ca76fda4ec7bd2865c/f/97fcf7fee25c469a81e7a0aa567a4627
      maprfs:/hbase/corrupt/hcrt/1090c602c005a4ca76fda4ec7bd2865c/f/97fcf7fee25c469a81e7a0aa567a4628
    HFiles failed quarantine:        0
    HFiles moved while checking:     0
Summary: CORRUPTED => OK


The corrupted files are moved to the '/hbase/corrupt' folder.
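
To keep a record of what was sidelined, the quarantined paths can be extracted from the report. This is a rough sketch, assuming the paths are listed one per line between the "HFiles successfully quarantined:" and "HFiles failed quarantine:" lines as in the sample above; the function name quarantined_paths is hypothetical.

```shell
# Hypothetical helper, not part of HBase.
# Reads hbck output on stdin and prints the paths listed under
# "HFiles successfully quarantined:", one per line.
quarantined_paths() {
  awk '
    /HFiles successfully quarantined:/ { inside = 1; next }  # list starts after this line
    /HFiles failed quarantine:/        { inside = 0 }        # list ends here
    inside && NF                       { print $1 }          # non-empty line inside the list
  '
}
```

It could be used as: hbase hbck -sidelineCorruptHFiles 2>&1 | quarantined_paths > sidelined_hfiles.txt
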
