Description
The method QueryResult.getRows().getSize() is supposed to return the number of rows of the result, but can return -1 if the size is unknown.
For both Jackrabbit 2.0 and Oak, in many cases, the size is unknown, because data is read on demand, possibly using a configurable fetch size to reduce the number of (network) roundtrips.
For Jackrabbit 2.0, as far as I'm aware, the size is always available if 'order by' is used. This is currently also the case for Oak. However, this might change in the future, as there is no need to read all data into memory if the index itself returns the data in the correct sort order.
For some existing applications, it might be a problem if getSize() returns -1 in those cases. For compatibility with Jackrabbit 2.0, and for ease of use, it would be good to have a clearly defined way to get the size of the result. One option is a configurable setting that defines how many rows to fetch at most when getSize() is called: if the end of the result has been reached within the given fetch limit, getSize() can return the correct value; if not, getSize() returns -1. A possible way to configure such a setting is:
SimpleCredentials cred = new SimpleCredentials(user, pwd.toCharArray());
cred.setAttribute("queryFetchSizeLimit", "20");
Session s = getRepository().login(cred);
return s;
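To make the proposed getSize() behaviour concrete, here is a minimal sketch that is independent of how the limit is configured. The class BoundedSizeRows and its fields are hypothetical and are not part of the Jackrabbit or Oak API; the sketch only assumes that rows are read lazily from an iterator.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Queue;

// Hypothetical wrapper around a lazily read result; not an Oak or Jackrabbit class.
class BoundedSizeRows<T> implements Iterator<T> {
    private final Iterator<T> source;      // reads rows on demand
    private final int fetchSizeLimit;      // max rows to read on behalf of getSize()
    private final Queue<T> buffer = new ArrayDeque<>();
    private long returned;                 // rows already handed out by next()

    BoundedSizeRows(Iterator<T> source, int fetchSizeLimit) {
        this.source = source;
        this.fetchSizeLimit = fetchSizeLimit;
    }

    // Returns the exact size if the end of the result is reached within
    // the fetch limit, otherwise -1 (size unknown).
    long getSize() {
        while (returned + buffer.size() < fetchSizeLimit && source.hasNext()) {
            buffer.add(source.next());     // prefetched rows are kept for later iteration
        }
        return source.hasNext() ? -1 : returned + buffer.size();
    }

    @Override
    public boolean hasNext() {
        return !buffer.isEmpty() || source.hasNext();
    }

    @Override
    public T next() {
        // Serve prefetched rows first so that calling getSize() loses no rows.
        T row = buffer.isEmpty() ? source.next() : buffer.poll();
        returned++;
        return row;
    }

    public static void main(String[] args) {
        BoundedSizeRows<String> rows =
                new BoundedSizeRows<>(Arrays.asList("a", "b", "c").iterator(), 20);
        System.out.println(rows.getSize());
    }
}
```

Rows read while computing the size are buffered rather than discarded, so iterating after a getSize() call still returns every row. Note that deciding whether the end has been reached may require the underlying source to look one row past the limit.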
Another way to configure the setting is to use an option within the query. We would have to invent a special syntax, as this doesn't seem to be the same as "limit" or "fetch size", maybe:
select text from [nt:base] where id = $id fetch size limit 20
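This ("fetch size limit 20") is somewhat similar to the SQL "fetch first 10 rows only" clause (http://en.wikipedia.org/wiki/Select_(SQL)#FETCH_FIRST_clause), but it is not really the same: it would only bound how many rows are read when getSize() is called, not how many rows the query returns.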
I don't consider this as a very urgent issue, but we need to keep it in mind.