Description
As part of the change to hold on to terminal unacknowledged tasks in the master, we introduced a performance regression in the following patch:
https://github.com/apache/mesos/commit/0760b007ad65bc91e8cea377339978c78d36d247
commit 0760b007ad65bc91e8cea377339978c78d36d247
Author: Benjamin Mahler <bmahler@twitter.com>
Date: Thu Sep 11 10:48:20 2014 -0700

    Minor cleanups to the Master code.

    Review: https://reviews.apache.org/r/25566
Rather than keeping a running count of allocated resources, we now compute resources on demand. This was done in order to ignore the resources of terminal tasks.
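To illustrate the trade-off described above, here is a minimal sketch contrasting the two strategies. The `Task` and `Slave` types, their fields, and the method names are hypothetical simplifications made up for this example; they are not the actual Mesos master code. The on-demand version walks every task on each call, while the running count is adjusted when a task is added or becomes terminal, so reads stay constant-time.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-ins; NOT the real Mesos types.
struct Task {
  double cpus;
  bool terminal;  // true once the task has reached a terminal state
};

struct Slave {
  std::map<std::string, Task> tasks;
  double runningCpus = 0.0;  // running count, adjusted on every change

  void addTask(const std::string& id, double cpus) {
    bool inserted = tasks.emplace(id, Task{cpus, false}).second;
    assert(inserted);  // task ids are assumed unique in this sketch
    (void)inserted;
    runningCpus += cpus;
  }

  // Mark a task terminal but keep it around (e.g. until acknowledged).
  // The running count drops immediately, so terminal tasks are ignored.
  void markTerminal(const std::string& id) {
    auto it = tasks.find(id);
    assert(it != tasks.end());
    if (!it->second.terminal) {
      it->second.terminal = true;
      runningCpus -= it->second.cpus;
    }
  }

  // On-demand computation: O(number of tasks) on every call.
  double usedOnDemand() const {
    double total = 0.0;
    for (const auto& entry : tasks) {
      if (!entry.second.terminal) {
        total += entry.second.cpus;
      }
    }
    return total;
  }

  // Running count: O(1) on every call.
  double usedCached() const { return runningCpus; }
};
```

With many slaves and many tasks per slave, an endpoint that calls the on-demand variant for each slave does work proportional to the total number of tasks per request, which is consistent with the slowdown reported here. Whether the eventual fix takes this exact shape is not stated in this report.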
As a result of this change, the /stats.json and /metrics/snapshot endpoints on the master have slowed down substantially on large clusters.
$ time curl localhost:5050/health
real 0m0.004s
user 0m0.001s
sys 0m0.002s

$ time curl localhost:5050/stats.json > /dev/null
real 0m15.402s
user 0m0.001s
sys 0m0.003s

$ time curl localhost:5050/metrics/snapshot > /dev/null
real 0m6.059s
user 0m0.002s
sys 0m0.002s
perf top shows some of the resource computation taking place during a request to stats.json:
Events: 36K cycles
 10.53%  libc-2.5.so         [.] _int_free
  9.90%  libc-2.5.so         [.] malloc
  8.56%  libmesos-0.21.0.so  [.] std::_Rb_tree<process::ProcessBase*, process::ProcessBase*, std::_Identity<process::ProcessBase*>, std::less<process::ProcessBase*>, std::allocator<process::ProcessBase*> >::
  8.23%  libc-2.5.so         [.] _int_malloc
  5.80%  libstdc++.so.6.0.8  [.] std::_Rb_tree_increment(std::_Rb_tree_node_base*)
  5.33%  [kernel]            [k] _raw_spin_lock
  3.13%  libstdc++.so.6.0.8  [.] std::string::assign(std::string const&)
  2.95%  libmesos-0.21.0.so  [.] process::SocketManager::exited(process::ProcessBase*)
  2.43%  libmesos-0.21.0.so  [.] mesos::Resource::MergeFrom(mesos::Resource const&)
  1.88%  libmesos-0.21.0.so  [.] mesos::internal::master::Slave::used() const
  1.48%  libstdc++.so.6.0.8  [.] __gnu_cxx::__atomic_add(int volatile*, int)
  1.45%  [kernel]            [k] find_busiest_group
  1.41%  libc-2.5.so         [.] free
  1.38%  libmesos-0.21.0.so  [.] mesos::Value_Range::MergeFrom(mesos::Value_Range const&)
  1.13%  libmesos-0.21.0.so  [.] mesos::Value_Scalar::MergeFrom(mesos::Value_Scalar const&)
  1.12%  libmesos-0.21.0.so  [.] mesos::Resource::SharedDtor()
  1.07%  libstdc++.so.6.0.8  [.] __gnu_cxx::__exchange_and_add(int volatile*, int)
  0.94%  libmesos-0.21.0.so  [.] google::protobuf::UnknownFieldSet::MergeFrom(google::protobuf::UnknownFieldSet const&)
  0.92%  libstdc++.so.6.0.8  [.] operator new(unsigned long)
  0.88%  libmesos-0.21.0.so  [.] mesos::Value_Ranges::MergeFrom(mesos::Value_Ranges const&)
  0.75%  libmesos-0.21.0.so  [.] mesos::matches(mesos::Resource const&, mesos::Resource const&)