Description
Dataflow should run the following load tests on Java 11 with runner v2 to track the performance of both batch and streaming pipelines:
- GBK variants
- CoGBK variants
- ParDo variants
The variants can be found in the 'Performance tests metrics' section of http://metrics.beam.apache.org/
The tests should report their metrics to these dashboards. The task here is to configure the existing tests to also run on runner v2, not to introduce new tests.
cwiki entry on load tests: https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide#ContributionTestingGuide-TestsofCoreApacheBeamOperations
cwiki entry on how metrics are collected: https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring
Jenkins config for a load test: https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_ParDo_Java.groovy
Java file for that load test: https://github.com/apache/beam/blob/master/sdks/java/testing/load-tests/src/main/java/org/apache/beam/sdk/loadtests/ParDoLoadTest.java
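
A rough sketch of the configuration change this task asks for, assuming the existing load-test jobs pass their pipeline options as simple key/value pairs: take the options an existing job already uses and add the Dataflow `use_runner_v2` experiment, leaving everything else untouched. The class `RunnerV2Options` and the method `withRunnerV2` are hypothetical names used only for illustration and are not part of Beam. The real Jenkins jobs may need further options or experiments that are not shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Beam): derives runner v2 options from existing ones. */
final class RunnerV2Options {
  // Dataflow experiment that opts a job into runner v2.
  static final String RUNNER_V2_EXPERIMENT = "use_runner_v2";

  private RunnerV2Options() {}

  /**
   * Returns a copy of the given options whose "experiments" entry contains
   * use_runner_v2. The input map is not modified, and existing experiments are kept.
   */
  static Map<String, String> withRunnerV2(Map<String, String> options) {
    Map<String, String> result = new LinkedHashMap<>(options);
    String existing = result.get("experiments");
    if (existing == null || existing.isEmpty()) {
      result.put("experiments", RUNNER_V2_EXPERIMENT);
    } else if (!Arrays.asList(existing.split(",")).contains(RUNNER_V2_EXPERIMENT)) {
      // Experiments are passed as a comma-separated list.
      result.put("experiments", existing + "," + RUNNER_V2_EXPERIMENT);
    }
    return result;
  }

  public static void main(String[] args) {
    // Assumed, simplified options for an existing load-test job.
    Map<String, String> existingJob = new LinkedHashMap<>();
    existingJob.put("runner", "DataflowRunner");
    existingJob.put("streaming", "false");

    // Print the derived options as command-line style flags.
    withRunnerV2(existingJob).forEach((k, v) -> System.out.println("--" + k + "=" + v));
  }
}
```

Deriving the runner v2 options from the existing ones keeps the two runs of each variant directly comparable, since only the experiment flag differs between them.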