arundhaj

regression towards the datascience

HBase row filter with regex

 

To query HBase rows by regular expression with the spring-data-hadoop framework, the following code snippet should help.

To keep it simple, I query the table for only those row keys that match my INPUT. In this case, the row key structure is 10 digits followed by a string.

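import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.RegexStringComparator;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.util.Bytes;
import org.springframework.data.hadoop.hbase.ResultsExtractor;

// Row key pattern: ten digits followed by INPUT, anchored at both ends.
String name = String.format("^\\d{10}%s$", INPUT);

// Keep only the rows whose key matches the regex.
RegexStringComparator keyRegEx = new RegexStringComparator(name);
RowFilter rowFilter = new RowFilter(CompareOp.EQUAL, keyRegEx);
Scan rowScan = new Scan();
rowScan.setFilter(rowFilter);

List<String> children = getHbaseTemplate().find(TABLE_NAME,
                            rowScan, new ResultsExtractor<List<String>>() {
    @Override
    public List<String> extractData(ResultScanner rs)
            throws Exception {
        List<String> children = new ArrayList<String>();

        // Collect the row key of every row that passed the filter.
        for (Result result : rs) {
            String key = Bytes.toString(result.getRow());
            children.add(key);
        }

        return children;
    }
});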
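
The row key pattern can be sanity-checked locally with plain java.util.regex before it is sent to a cluster. The sketch below is only an illustration: the class RowKeyRegexDemo and its methods buildRegex, buildQuotedRegex and matchingKeys are hypothetical names, not part of HBase or spring-data-hadoop, and the sample keys are made up. It assumes the comparator searches for the pattern inside the key rather than requiring a full match, which is my understanding of how RegexStringComparator behaves with its default Java engine and is why the ^ and $ anchors matter. It also shows an optional variant that wraps INPUT in Pattern.quote, in case INPUT may contain regex metacharacters such as a dot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for local experiments; not part of the original snippet or of HBase.
class RowKeyRegexDemo {

    // Same pattern as in the post: ten digits, then the input, anchored at both ends.
    static String buildRegex(String input) {
        return String.format("^\\d{10}%s$", input);
    }

    // Variant that treats the input as literal text
    // (assumption: INPUT may contain regex metacharacters).
    static String buildQuotedRegex(String input) {
        return "^\\d{10}" + Pattern.quote(input) + "$";
    }

    // Returns the keys in which the regex is found (find-style search, not a full match).
    static List<String> matchingKeys(String regex, List<String> keys) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matched = new ArrayList<String>();
        for (String key : keys) {
            if (pattern.matcher(key).find()) {
                matched.add(key);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Made-up sample row keys.
        List<String> keys = Arrays.asList(
                "0123456789alpha",
                "0123456789alphabet",
                "123alpha",
                "0123456789a.b",
                "0123456789axb");

        // prints [0123456789alpha]
        System.out.println(matchingKeys(buildRegex("alpha"), keys));
        // prints [0123456789a.b, 0123456789axb] because the dot matches any character
        System.out.println(matchingKeys(buildRegex("a.b"), keys));
        // prints [0123456789a.b]
        System.out.println(matchingKeys(buildQuotedRegex("a.b"), keys));
    }
}
```

If INPUT is meant to be a regex fragment itself, the unquoted form from the post is the right one; the quoted form only makes sense when INPUT is a literal suffix.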

For reference, visit the project on GitHub.
