[HBASE-3342] Server-side Row-level Inverted Index Join via Coprocessors - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Later
Affects Version/s: None
Fix Version/s: None
Component/s: Coprocessors
Labels: None

Description

A common schema in HBase is to create an inverted index per row (a la inbox search), where a row is a user/entity, each column is a word, and versions are instances of that word in documents. Values can be empty or can contain additional scoring information such as position or count.

When querying indexes like this, we may want to do something like: give me the N most recent documents that contain the word "foo" (exact word matching) and contain a word that starts with "bar" (prefix matching).
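To make the layout and the query concrete, here is a minimal sketch in plain Python. It is not HBase client code: the row is modeled as a dictionary, and `build_row` and `query` are hypothetical names. It also assumes that each version timestamp identifies exactly one document, which the description implies but does not state.

```python
from collections import defaultdict


def build_row(documents):
    """Model one index row as word -> {version timestamp: value}.

    documents: iterable of (timestamp, text) pairs. Assumption: the
    timestamp doubles as the document identifier.
    """
    row = defaultdict(dict)
    for ts, text in documents:
        for position, word in enumerate(text.lower().split()):
            # Store the first position of the word as optional scoring info.
            row[word].setdefault(ts, position)
    return dict(row)


def query(row, exact_word, prefix, n):
    """Return the n newest timestamps that contain exact_word and also
    contain at least one word starting with prefix."""
    exact_hits = set(row.get(exact_word, {}))
    prefix_hits = set()
    for word, versions in row.items():
        if word.startswith(prefix):
            prefix_hits.update(versions)
    # Newest first, truncated to n results.
    return sorted(exact_hits & prefix_hits, reverse=True)[:n]
```

This reference version materializes every posting before intersecting, which is exactly the over-read that the rest of this issue is about; it only pins down what the query should return.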

Currently this join has to be done on the client side, so we may have to read far more than N documents for each word in order to find N documents that match both words. This gets worse as the number of words increases.

We could implement this join on the server-side in a coprocessor.
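One way such a join could work is a streaming merge over the version lists, which HBase returns newest first. The sketch below is plain Python and does not use the coprocessor API; `descending_union` and `join_newest` are hypothetical names, and the inputs are assumed to be timestamp sequences already sorted newest first. The point it illustrates is that the join can stop as soon as N matches are found instead of reading every version.

```python
import heapq
from itertools import groupby


def descending_union(posting_lists):
    """Lazily merge several newest-first timestamp lists into one
    newest-first stream with duplicates removed."""
    merged = heapq.merge(*posting_lists, reverse=True)
    return (ts for ts, _ in groupby(merged))


def join_newest(exact_postings, prefix_postings, n):
    """Intersect the exact-word postings with the union of all
    prefix-matching postings, returning at most n newest timestamps.

    exact_postings: newest-first timestamps for the exact word.
    prefix_postings: list of newest-first timestamp lists, one per
    column whose word matches the prefix.
    """
    left = iter(exact_postings)
    right = descending_union(prefix_postings)
    results = []
    a = next(left, None)
    b = next(right, None)
    while len(results) < n and a is not None and b is not None:
        if a == b:
            results.append(a)
            a = next(left, None)
            b = next(right, None)
        elif a > b:
            # a is newer than anything left on the right side: skip it.
            a = next(left, None)
        else:
            b = next(right, None)
    return results
```

For example, `join_newest([9, 7, 5, 3, 1], [[9, 4, 3], [7, 6, 1]], 2)` stops after the two newest matches and never touches the older versions. Running this next to the data would avoid shipping the unmatched postings to the client, which is the motivation given above.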

Issue Links

is related to: HBASE-2000 Coprocessors (Closed)

People

Assignee: Unassigned

Reporter: Jonathan Gray

Votes: 0

Watchers: 8

Dates

Created: 13/Dec/10 19:43

Updated: 12/Jun/22 00:37

Resolved: 29/Dec/14 19:44