Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- Reviewed
Changes the manner in which file space consumption is reported to the Master for the purposes of space quota tracking, to reduce the latency with which system space utilization is observed. This will have a positive effect on how quickly HBase reacts to changes in filesystem usage related to file archiving.
Description
Related to the work proposed on HBASE-17748 and building on the same idea as HBASE-18133, we can make the space quota tracking for HBase snapshots faster to respond.
When snapshots are in play, the location of a file (whether in the data or archive directory) is a factor in the realized size of a table. Like flushes, compactions, etc., moving files from the data directory to the archive directory is done by the RegionServer. We can hook into this call and send the necessary information to the Master so that it can more quickly update the size of a table when there are snapshots in play.
This will require the RegionServer to report the full coordinates of the file being moved (table+region+family+file) so that the SnapshotQuotaObserverChore running in the Master can avoid some or all of the HDFS lookups needed to compute the location of a Region's hfiles.
This may also require some refactoring of the SnapshotQuotaObserverChore to decouple the receipt of these file archival reports from RegionServers (e.g. from HRegionFileSystem.removeStoreFiles(..)) from the Master's processing of snapshot sizes.
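To make the "full coordinates" idea concrete, here is a minimal sketch of what one entry in such a report could carry. The class name `ArchivedFileReport`, its fields, and the simplified `table/region/family/file` path (which ignores namespaces) are all hypothetical and are not taken from the patch; the committed change touches RegionServerStatus.proto, so the real message shape lives there.

```java
import java.util.Objects;

/** Hypothetical value type: one archived store file, identified by its full coordinates. */
final class ArchivedFileReport {
  final String table;
  final String region;      // assumed to be the encoded region name
  final String family;
  final String file;
  final long sizeInBytes;   // assumed: the RegionServer already knows the file size

  ArchivedFileReport(String table, String region, String family, String file, long sizeInBytes) {
    this.table = Objects.requireNonNull(table, "table");
    this.region = Objects.requireNonNull(region, "region");
    this.family = Objects.requireNonNull(family, "family");
    this.file = Objects.requireNonNull(file, "file");
    if (sizeInBytes < 0) {
      throw new IllegalArgumentException("sizeInBytes must be non-negative");
    }
    this.sizeInBytes = sizeInBytes;
  }

  /**
   * Simplified relative location of the file (illustrative layout only).
   * With all four coordinates present, the receiver does not need to list
   * directories to discover where the file lives.
   */
  String relativePath() {
    return table + "/" + region + "/" + family + "/" + file;
  }
}
```

Carrying the size alongside the coordinates is a design assumption of this sketch: it would let the Master adjust a snapshot's size without asking HDFS for the file length.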
Attachments
- HBASE-18135.005.patch (111 kB, Josh Elser)
- HBASE-18135.004.patch (112 kB, Josh Elser)
- HBASE-18135.002.patch (111 kB, Josh Elser)
- HBASE-18135.001.patch (223 kB, Josh Elser)
Issue Links
- depends upon: HBASE-17748 Include HBase Snapshots in Space Quotas (Resolved)
- is blocked by: HBASE-18133 Low-latency space quota size reports (Resolved)
- links to
Activity
My thinking is this: for the archive "notices" sent by RS to Master, these can be pushed to the Quota table using Increments. Row-locks server-side provide the level of exclusion we want/need. Then, every so often, the SnapshotQuotaObserverChore will recompute a "total" size for a snapshot which needs to be updated in the table exclusively.
These two processes may race, but this can be solved with a RWLock. In this case the "read" lock would be grabbed by the threads issuing Increments, as they can happen in parallel. The "write" lock would be grabbed by the Chore when it's updating the "total" size of the snapshot, which prevents the "readers" (the Incrementers) from submitting new Increments that may screw up the total. Just... ignore the fact that we're writing while holding a "read" lock.
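A minimal sketch of that locking scheme, using plain JDK classes. `SnapshotSizeTracker` and its methods are hypothetical names, and the `AtomicLong` merely stands in for the snapshot's size cell in the quota table (where the atomicity would come from server-side row locks on the Increment); this is not the code from the patch.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical in-memory stand-in for one snapshot's size cell in the quota table. */
final class SnapshotSizeTracker {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final AtomicLong size = new AtomicLong();

  /** Archive notices: any number of incrementers may hold the "read" lock at once. */
  void increment(long delta) {
    lock.readLock().lock();
    try {
      // Stands in for an atomic Increment against the quota table.
      size.addAndGet(delta);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Chore: the "write" lock excludes every incrementer while the total is replaced. */
  void replaceTotal(long recomputedTotal) {
    lock.writeLock().lock();
    try {
      size.set(recomputedTotal);
    } finally {
      lock.writeLock().unlock();
    }
  }

  long current() {
    return size.get();
  }
}
```

The "read" lock here means "shared", not "read-only": the shared holders do write, but each write is individually atomic, so they only need to be excluded from the one operation that overwrites the total.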
.001 Parking an initial implementation for reviews to catch up.
This moves the logic for computing the size of a snapshot out of a Chore and into its own in-memory state in the Master. This state can be updated via the Chore or via RPCs from a RS. The computed snapshot sizes are still persisted to the hbase:quota table. The actual code change is much smaller than the size of the patch suggests, because a bit of code was just moved directly.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 0s | Docker mode activated. |
-1 | patch | 0m 3s | |
Subsystem | Report/Notes |
---|---|
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12873310/HBASE-18135.001.patch |
JIRA Issue | |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7204/console |
Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |
This message was automatically generated.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 0s | Docker mode activated. |
-1 | patch | 0m 4s | |
Subsystem | Report/Notes |
---|---|
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12873310/HBASE-18135.001.patch |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11586/console |
Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
yuzhihong@gmail.com, one more for ya when you have a moment
Will put it on RB shortly too.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 19s | Docker mode activated. |
Prechecks | |||
+1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
master Compile Tests | |||
0 | mvndep | 0m 23s | Maven dependency ordering for branch |
+1 | mvninstall | 3m 54s | master passed |
+1 | compile | 1m 25s | master passed |
+1 | checkstyle | 1m 31s | master passed |
+1 | shadedjars | 5m 37s | branch has no errors when building our shaded downstream artifacts. |
+1 | findbugs | 4m 34s | master passed |
+1 | javadoc | 0m 57s | master passed |
Patch Compile Tests | |||
0 | mvndep | 0m 12s | Maven dependency ordering for patch |
+1 | mvninstall | 3m 57s | the patch passed |
+1 | compile | 1m 22s | the patch passed |
+1 | cc | 1m 22s | the patch passed |
+1 | javac | 1m 22s | the patch passed |
-1 | checkstyle | 0m 25s | hbase-client: The patch generated 2 new + 7 unchanged - 1 fixed = 9 total (was 8) |
-1 | checkstyle | 1m 1s | hbase-server: The patch generated 26 new + 333 unchanged - 2 fixed = 359 total (was 335) |
-1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
+1 | shadedjars | 4m 11s | patch has no errors when building our shaded downstream artifacts. |
-1 | hadoopcheck | 6m 7s | The patch causes 10 errors with Hadoop v2.6.5. |
-1 | hadoopcheck | 8m 6s | The patch causes 10 errors with Hadoop v2.7.4. |
-1 | hadoopcheck | 10m 13s | The patch causes 10 errors with Hadoop v3.0.0. |
+1 | hbaseprotoc | 1m 5s | the patch passed |
+1 | findbugs | 4m 58s | the patch passed |
+1 | javadoc | 0m 56s | the patch passed |
Other Tests | |||
+1 | unit | 0m 27s | hbase-protocol-shaded in the patch passed. |
+1 | unit | 2m 57s | hbase-client in the patch passed. |
+1 | unit | 94m 13s | hbase-server in the patch passed. |
+1 | asflicense | 1m 0s | The patch does not generate ASF License warnings. |
140m 53s |
Subsystem | Report/Notes |
---|---|
Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12912478/HBASE-18135.002.patch |
Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile cc hbaseprotoc |
uname | Linux 792848ccbd32 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
Build tool | maven |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
git revision | master / bdedcc5631 |
maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
Default Java | 1.8.0_151 |
findbugs | v3.1.0-RC3 |
checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11726/artifact/patchprocess/diff-checkstyle-hbase-client.txt |
checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11726/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
whitespace | https://builds.apache.org/job/PreCommit-HBASE-Build/11726/artifact/patchprocess/whitespace-eol.txt |
Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11726/testReport/ |
Max. process+thread count | 5025 (vs. ulimit of 10000) |
modules | C: hbase-protocol-shaded hbase-client hbase-server U: . |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11726/console |
Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 17s | Docker mode activated. |
Prechecks | |||
+1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
master Compile Tests | |||
0 | mvndep | 0m 12s | Maven dependency ordering for branch |
+1 | mvninstall | 3m 57s | master passed |
+1 | compile | 1m 24s | master passed |
+1 | checkstyle | 1m 36s | master passed |
+1 | shadedjars | 5m 50s | branch has no errors when building our shaded downstream artifacts. |
+1 | findbugs | 4m 39s | master passed |
+1 | javadoc | 0m 56s | master passed |
Patch Compile Tests | |||
0 | mvndep | 0m 14s | Maven dependency ordering for patch |
+1 | mvninstall | 4m 5s | the patch passed |
+1 | compile | 1m 28s | the patch passed |
+1 | cc | 1m 28s | the patch passed |
+1 | javac | 1m 28s | the patch passed |
+1 | checkstyle | 0m 10s | The patch hbase-protocol-shaded passed checkstyle |
+1 | checkstyle | 0m 26s | hbase-client: The patch generated 0 new + 7 unchanged - 1 fixed = 7 total (was 8) |
-1 | checkstyle | 1m 2s | hbase-server: The patch generated 6 new + 332 unchanged - 3 fixed = 338 total (was 335) |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | shadedjars | 4m 17s | patch has no errors when building our shaded downstream artifacts. |
-1 | hadoopcheck | 6m 13s | The patch causes 10 errors with Hadoop v2.6.5. |
-1 | hadoopcheck | 8m 15s | The patch causes 10 errors with Hadoop v2.7.4. |
-1 | hadoopcheck | 10m 22s | The patch causes 10 errors with Hadoop v3.0.0. |
+1 | hbaseprotoc | 1m 6s | the patch passed |
+1 | findbugs | 4m 56s | the patch passed |
+1 | javadoc | 0m 52s | the patch passed |
Other Tests | |||
+1 | unit | 0m 28s | hbase-protocol-shaded in the patch passed. |
+1 | unit | 3m 0s | hbase-client in the patch passed. |
+1 | unit | 105m 33s | hbase-server in the patch passed. |
+1 | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
152m 39s |
Subsystem | Report/Notes |
---|---|
Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12912800/HBASE-18135.004.patch |
Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile cc hbaseprotoc |
uname | Linux f050c33d7813 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
Build tool | maven |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
git revision | master / 1d25b60831 |
maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
Default Java | 1.8.0_151 |
findbugs | v3.1.0-RC3 |
checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11781/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11781/testReport/ |
Max. process+thread count | 5022 (vs. ulimit of 10000) |
modules | C: hbase-protocol-shaded hbase-client hbase-server U: . |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11781/console |
Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
.005 should fix the checkstyle problems. Need to look at the Hadoop errors.
-1 overall |
Vote | Subsystem | Runtime | Comment |
---|---|---|---|
0 | reexec | 0m 15s | Docker mode activated. |
Prechecks | |||
+1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
+1 | @author | 0m 0s | The patch does not contain any @author tags. |
+1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
master Compile Tests | |||
0 | mvndep | 0m 10s | Maven dependency ordering for branch |
+1 | mvninstall | 4m 11s | master passed |
+1 | compile | 1m 24s | master passed |
+1 | checkstyle | 1m 36s | master passed |
+1 | shadedjars | 5m 50s | branch has no errors when building our shaded downstream artifacts. |
+1 | findbugs | 4m 18s | master passed |
+1 | javadoc | 0m 56s | master passed |
Patch Compile Tests | |||
0 | mvndep | 0m 14s | Maven dependency ordering for patch |
+1 | mvninstall | 4m 0s | the patch passed |
+1 | compile | 1m 19s | the patch passed |
+1 | cc | 1m 19s | the patch passed |
+1 | javac | 1m 19s | the patch passed |
+1 | checkstyle | 0m 7s | The patch hbase-protocol-shaded passed checkstyle |
+1 | checkstyle | 0m 28s | hbase-client: The patch generated 0 new + 7 unchanged - 1 fixed = 7 total (was 8) |
+1 | checkstyle | 1m 0s | hbase-server: The patch generated 0 new + 332 unchanged - 3 fixed = 332 total (was 335) |
+1 | whitespace | 0m 0s | The patch has no whitespace issues. |
+1 | shadedjars | 4m 10s | patch has no errors when building our shaded downstream artifacts. |
-1 | hadoopcheck | 6m 5s | The patch causes 10 errors with Hadoop v2.6.5. |
-1 | hadoopcheck | 7m 59s | The patch causes 10 errors with Hadoop v2.7.4. |
-1 | hadoopcheck | 10m 2s | The patch causes 10 errors with Hadoop v3.0.0. |
+1 | hbaseprotoc | 1m 3s | the patch passed |
+1 | findbugs | 4m 47s | the patch passed |
+1 | javadoc | 0m 51s | the patch passed |
Other Tests | |||
+1 | unit | 0m 28s | hbase-protocol-shaded in the patch passed. |
+1 | unit | 2m 55s | hbase-client in the patch passed. |
+1 | unit | 103m 47s | hbase-server in the patch passed. |
+1 | asflicense | 1m 0s | The patch does not generate ASF License warnings. |
149m 55s |
Subsystem | Report/Notes |
---|---|
Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
JIRA Issue | |
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12913084/HBASE-18135.005.patch |
Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile cc hbaseprotoc |
uname | Linux 70eb8aac3018 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
Build tool | maven |
Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
git revision | master / b7b8683925 |
maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
Default Java | 1.8.0_151 |
findbugs | v3.1.0-RC3 |
Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11812/testReport/ |
Max. process+thread count | 4905 (vs. ulimit of 10000) |
modules | C: hbase-protocol-shaded hbase-client hbase-server U: . |
Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11812/console |
Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-install-plugin:2.5.2:install (default-install) on project hbase-thrift: Failed to install metadata org.apache.hbase:hbase-thrift:3.0.0-SNAPSHOT/maven-metadata.xml: Could not parse metadata /home/jenkins/.m2/repository/org/apache/hbase/hbase-thrift/3.0.0-SNAPSHOT/maven-metadata-local.xml: in epilog non whitespace content is not allowed but got / (position: END_TAG seen ...</metadata>\n/... @25:2) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hbase-thrift
The hadoopcheck failures are all due to the above. Checking it locally, but I think it's just some issue on the build machine. busbey, does this ring any bells for you?
FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4697 (See https://builds.apache.org/job/HBase-Trunk_matrix/4697/)
HBASE-18135 Implement mechanism for RegionServers to report file (elserj: rev 4a4c0120494757539d680c2d7d44fe6ab3d71d27)
- (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/MasterQuotaManager.java
- (add) hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FileArchiverNotifier.java
- (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
- (add) hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FileArchiverNotifierFactoryImpl.java
- (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
- (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/SnapshotQuotaObserverChore.java
- (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestSnapshotQuotaObserverChore.java
- (add) hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestFileArchiverNotifierImpl.java
- (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
- (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java
- (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/quotas/QuotaTableUtil.java
- (edit) hbase-protocol-shaded/src/main/protobuf/RegionServerStatus.proto
- (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
- (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestLowLatencySpaceQuotas.java
- (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
- (add) hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FileArchiverNotifierFactory.java
- (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RegionServerSpaceQuotaManager.java
- (add) hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FileArchiverNotifierImpl.java
- (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
This one is a little tricky. The problem is that all of the reporting/tracking of Table sizes is done on the Region level. For snapshots, we're working at the file level. We definitely don't want to go tracking files themselves as that's a recipe for bugs. It would be really miserable code to understand, implement correctly, and maintain.
I'm thinking about the following scenario which will help the "average" case.
1. RS1 compacts R1 from T1: the files [file1, file2, file3] into [file4].
2. RS1 moves [file1, file2, file3] from the data/ directory to the archive/ directory in HDFS
3. RS1 reports [file1, file2, file3] for T1 to the Master (only if T1 has quotas enabled)
4. If T1 has snapshots, for each file in the list reported by RS1, the Master finds the first Snapshot against T1 that references that file.
5. For each file that the Snapshot references, the snapshot size is updated directly in the hbase:quota table.
This gets the "visible" quota size updated quickly and avoids interfering with the SnapshotQuotaObserverChore. When the quota table is updated, the QuotaObserverChore will see the new size, so no additional source of latency is introduced for quota usage updates.
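Steps 4 and 5 of the scenario above can be sketched as a small pure function. Everything here is hypothetical (the class `SnapshotSizeUpdater`, the use of plain maps, and treating "first" as map iteration order); it only illustrates attributing each archived file to the first snapshot that references it, and it returns the per-snapshot deltas rather than writing to hbase:quota.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: attribute archived files to the first snapshot referencing them. */
final class SnapshotSizeUpdater {
  /**
   * @param snapshotFiles snapshot name to the file names it references,
   *                      iterated in the order that defines "first"
   * @param archived      archived file name to its size in bytes
   * @return snapshot name to the number of bytes to add to that snapshot's size
   */
  static Map<String, Long> attribute(Map<String, Set<String>> snapshotFiles,
                                     Map<String, Long> archived) {
    Map<String, Long> deltas = new LinkedHashMap<>();
    for (Map.Entry<String, Long> file : archived.entrySet()) {
      for (Map.Entry<String, Set<String>> snapshot : snapshotFiles.entrySet()) {
        if (snapshot.getValue().contains(file.getKey())) {
          // Only the first referencing snapshot is charged, so a file shared
          // by several snapshots is not counted more than once.
          deltas.merge(snapshot.getKey(), file.getValue(), Long::sum);
          break;
        }
      }
      // Files referenced by no snapshot contribute nothing here.
    }
    return deltas;
  }
}
```

In a real implementation the returned deltas would then be applied to the snapshot sizes in the hbase:quota table, which is where the race with the SnapshotQuotaObserverChore has to be handled.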
The effort here would be making sure the SnapshotQuotaObserverChore doesn't race against this hypothetical new process. We might be able to push this down to "HBase" itself (use some kind of compareAndSet) or just synchronize access in the master via some new class.