Description
Enabling the global fast query size setting (so that QueryEngineSettingsService.getFastQuerySize() returns true) is currently the only way to let service users leverage JCR queries to collect accurate repository count metrics in a performant way. However, doing so in a multiuser repository may be inadvisable: the fast result size is returned to the caller without considering the caller's read permissions over the paths in the result, which may allow less privileged users to discover the presence of nodes that are not otherwise visible to them.
See https://jackrabbit.apache.org/oak/docs/query/query-engine.html#result-size
As an alternative to the global setting, Oak should provide a query option alongside TRAVERSAL, OFFSET / LIMIT, and INDEX TAG, such as "INSECURE RESULT SIZE".
Similarly, IndexDefinition.SecureFacetConfiguration.MODE.INSECURE (insecure facets) can provide extremely valuable counts for property value distribution in large repositories. At the moment, it can only be set on an index definition, even though it governs the facet counts at query time and has no effect on the persisted content of the index at all. As with fastQuerySize, Oak should provide a query option such as "INSECURE FACETS", so that permitted system users can leverage insecure facets even when the query execution plan uses an index definition that only allows secure or statistical facets.
For example,
select a.[jcr:path] from [nt:base] as a where contains(a.[text], 'Hello World') option(insecure result size, insecure facets, offset 10)
To address the security risk, the application should also provide a configuration of some kind to restrict the ability to effectively leverage this option to permitted system users, which could be implemented as a JCR repository privilege or an allowlist property in the QueryEngineSettingsService configuration.
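The allowlist variant of that restriction can be sketched as below. This is only an illustration under stated assumptions: InsecureOption, InsecureOptionGate and effectiveOptions are hypothetical names that do not exist in Oak, and silently dropping the options for non-permitted users (rather than failing the query) is one possible behaviour, not necessarily what the PR does.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical types for illustration only; none of these exist in Oak.
enum InsecureOption { RESULT_SIZE, FACETS }

final class InsecureOptionGate {
    // User IDs permitted to use insecure query options (assumed to come from configuration).
    private final Set<String> allowedUserIds;

    InsecureOptionGate(Set<String> allowedUserIds) {
        this.allowedUserIds = new HashSet<>(allowedUserIds);
    }

    // Returns the options that actually take effect for the given user.
    // Users not on the allowlist get an empty set, so the query falls back
    // to secure result sizes and the index-defined facet mode.
    Set<InsecureOption> effectiveOptions(String userId, Set<InsecureOption> requested) {
        EnumSet<InsecureOption> effective = EnumSet.noneOf(InsecureOption.class);
        if (allowedUserIds.contains(userId)) {
            effective.addAll(requested);
        }
        return effective;
    }
}
```

With this shape, a query that requests both options from a non-permitted session still runs, but behaves as if the options had not been specified.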
I have provided a PR that adds support for an INSECURE RESULT SIZE query option and an INSECURE FACETS query option, as well as a "rep:insecureQueryOptions" repository privilege. I think the JCR privilege-based approach to configuring this permission is more aligned with how system users are defined in practice, but this approach requires a minor version increase in the following oak-security-spi packages:
- org.apache.jackrabbit.oak.spi.security.authorization.permission
- org.apache.jackrabbit.oak.spi.security.privilege
Because all registered permissions are serialized into a long bitset, there is clearly a premium on adding another built-in privilege. I therefore chose a name for the privilege that makes it applicable to both of these new options, as well as to any future query options that involve a tradeoff between security and performance.
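To make the bitset constraint concrete, here is a minimal sketch of flags packed into a single long. The class name, constant names and bit positions are hypothetical and do not reflect Oak's actual layout; the only point is that a Java long has 64 bits, so each new built-in flag consumes one of a small, fixed number of slots.

```java
// Illustrative only: hypothetical flag names and bit positions, not Oak's real ones.
final class LongBitsetSketch {
    static final long READ = 1L << 0;
    static final long WRITE = 1L << 1;
    // A single shared flag covering every "insecure" query option.
    static final long INSECURE_QUERY_OPTIONS = 1L << 2;

    // True if every bit in 'required' is also set in 'granted'.
    static boolean includes(long granted, long required) {
        return (granted & required) == required;
    }

    // Number of flag slots still unused in a 64-bit long.
    static int freeSlots(long allDefinedFlags) {
        return Long.SIZE - Long.bitCount(allDefinedFlags);
    }
}
```

Using one flag for both options, instead of one flag per option, leaves more slots free for unrelated future additions.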