Description
Update: the original idea was to update only Curator and keep the old ZooKeeper version in Hadoop. However, we ran into run-time backward incompatibilities during unit tests with Curator 4.2.0 and ZooKeeper 3.5.5. We did not investigate these issues in depth; instead we upgraded to ZooKeeper 3.5.5 (and later to 3.5.6). This required some minor fixes in the unit tests and replacing some deprecated Curator API calls, but the latest PR seems to be stable.
ZooKeeper 3.5.6 was released while we were working on this. (I think the official announcement will go out maybe tomorrow, but it is already available in Maven Central and on the Apache ZooKeeper FTP site.) It is considered a stable version and contains some minor fixes and improvements, plus some CVE fixes. See the release notes.
Currently in Hadoop we are using ZooKeeper version 3.4.13. ZooKeeper 3.5.5 is the latest stable Apache ZooKeeper release. It contains many new features, including SSL-related improvements that can be very important for production use (see the release notes).
Apache Curator is a high-level ZooKeeper client library that makes it easier to use the low-level ZooKeeper API. Currently in Hadoop we are using Curator 2.13.0 and in Ozone we use Curator 2.12.0.
Curator 2.x supports only the ZooKeeper 3.4.x releases, while Curator 3.x is compatible only with the new ZooKeeper 3.5.x releases. Fortunately, the latest Curator 4.x versions are compatible with both ZooKeeper 3.4.x and 3.5.x (see the relevant Curator page). Many Apache projects have already migrated to Curator 4 (e.g. HBase, Phoenix, Druid), and other components are doing so right now (e.g. Hive).
The aims of this task are to:
- change Curator version in Hadoop to the latest stable 4.x version (currently 4.2.0)
- also make sure we don't have multiple ZooKeeper versions on the classpath, to avoid runtime problems (it is recommended to exclude the ZooKeeper dependency that comes with Curator, so that only a single ZooKeeper version is used at runtime in Hadoop)
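One way to sanity-check the second aim is to list every location from which the ZooKeeper client class can be loaded. The following is only a minimal sketch using the plain JDK, assuming it is run with the same classpath as the Hadoop process under test; `ZooKeeperClasspathCheck` and `locate` are hypothetical names chosen for illustration and are not part of Hadoop, Curator or ZooKeeper.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Hadoop): reports where a class file
// is visible on the classpath, to spot duplicate ZooKeeper jars.
final class ZooKeeperClasspathCheck {

    // Resource path of the main ZooKeeper client class.
    static final String ZK_CLASS = "org/apache/zookeeper/ZooKeeper.class";

    // Returns every URL under which the given resource is visible to the loader.
    public static List<URL> locate(String resourceName, ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> copies = locate(ZK_CLASS, ClassLoader.getSystemClassLoader());
        for (URL url : copies) {
            System.out.println("ZooKeeper found at: " + url);
        }
        if (copies.size() > 1) {
            // More than one hit suggests that a second ZooKeeper jar
            // (for example one pulled in transitively) is on the classpath.
            System.err.println("WARNING: " + copies.size()
                    + " copies of ZooKeeper are visible on the classpath");
        }
    }
}
```

If this reports more than one location, the extra copy is a candidate for a dependency exclusion in the build.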
In this ticket we still don't want to change the default ZooKeeper version in Hadoop. We only want to make it possible for the community to build and use Hadoop with the new ZooKeeper (e.g. if they need to secure the ZooKeeper communication with SSL, which is only supported in the new ZooKeeper version). Upgrading to Curator 4.x should keep Hadoop compatible with both ZooKeeper 3.4 and 3.5.
Issue Links
- is related to
  - YARN-11468 Zookeeper SSL/TLS support (Resolved)
  - HADOOP-18709 Add curator based ZooKeeper communication support over SSL/TLS into the common library (Resolved)
  - HADOOP-17612 Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0 (Resolved)
- relates to
  - BIGTOP-3803 Fix Hive3.1.3 Metastore service compatible issue with Hadoop3.3.x when kerberos enabled (Resolved)
  - HADOOP-16763 Make Curator 4 run in soft-compatibility mode with ZooKeeper 3.4 (Open)
  - HADOOP-16982 Update Netty to 4.1.48.Final (Resolved)
- requires
  - YARN-9783 Remove low-level zookeeper test to be able to build Hadoop against zookeeper 3.5.5 (Resolved)