Details
-
Task
-
Status: Resolved
-
Major
-
Resolution: Duplicate
-
None
-
None
-
None
Description
Motivation
This could be useful for certain cases:
1) Simple queries where partition can be determined easily in advance, either automatically (IGNITE-4509, IGNITE-4510), or manually.
2) Spark data frame integration (IGNITE-3084)
Proposed API
class Query {
int[] partitions();
void partitions(int...);
}
Important points:
1) Partitions are defined in the very base Query class because we already has this feature for ScanQuery and potentially any query type can benefit from it. If query doesn't support partitions, exception should be thrown.
2) User should be able to specify multiple partitions, not only one. This will make our API more flexible for 3-rd party integrations like Spark. Also it will help users with fine-grained tuning. E.g. if user has a query ... WHERE attribute IN (?, ?, ...), he can determine partitions for IN arguments in advance.
Probably this feature should not be supported for distributed joins. On the other hand - why not? Query is always created from some cache, so the first map step can be executed only against target partitions, and the rest execution flow can go through all partitions of other caches.
Attachments
Issue Links
- duplicates
-
IGNITE-4523 Allow distributed SQL query execution over explicit set of partitions
- Resolved
- is related to
-
IGNITE-3084 Spark Data Frames Support in Apache Ignite
- Closed
-
IGNITE-4523 Allow distributed SQL query execution over explicit set of partitions
- Resolved
-
IGNITE-4509 SQL: query with condition on affinity columns and without joins and subqueries should go to affinity node only
- Resolved
-
IGNITE-4510 SQL: co-located query may calculate target partition in advance in some cases
- Closed