Description
When we started using the log collation ATS stopped writing to the errors.log file. Traffic Server thinks that the configured log space is exhausted:
[Sep 18 23:00:40.817] Server {0x2b1bdcc51700} WARNING: Access logging to local log directory suspended - configured space allocation exhausted. [Sep 18 23:01:45.008] Server {0x2b05c4e93700} WARNING: Access logging to local log directory suspended - configured space allocation exhausted. [Sep 19 20:57:05.766] Server {0x2b05c4c91700} NOTE: logging space exhausted, failed to write orphan file, drop(1832) bytes [Sep 19 20:57:15.743] Server {0x2b05c4c91700} NOTE: logging space exhausted, failed to write orphan file, drop(4360) bytes
LogConfig::update_space_used() emits the exhaustion message when LogConfig::space_to_write(headroom) fails:
$ sudo /opt/bazinga/bin/traffic_line -r proxy.config.log.max_space_mb_headroom 1000 $ sudo gdb -p 28375 ... (gdb) p (bool)(Log::config->space_to_write((int64_t)(1000*1024))) $1 = false (gdb) p *(Log::config) $1 = {initialized = true, reconfiguration_needed = false, logging_space_exhausted = true, m_space_used = 28313954, m_partition_space_left = 46870138880, roll_log_files_now = false, log_object_manager = {_objects = 0x2b1b68758d00, _numObjects = 1, _maxObjects = 8, _APIobjects = 0x2b1b68758d50, _numAPIobjects = 0, _maxAPIobjects = 8, _APImutex = 0x2b1b6874ee40}, global_filter_list = {m_filter_list = {<DLL<LogFilter, LogFilter::Link_link>> = {head = 0x0}, tail = 0x0}, m_does_conjunction = true}, global_format_list = {m_format_list = {<DLL<LogFormat, LogFormat::Link_link>> = {head = 0x2b1b68764810}, tail = 0x2b1b68767e10}}, log_buffer_size = 9216, max_secs_per_buffer = 5, max_space_mb_for_logs = 102400, max_space_mb_for_orphan_logs = 25, max_space_mb_headroom = 1000, logfile_perm = 420, squid_log_enabled = false, xuid_logging_enabled = true, squid_log_is_ascii = false, common_log_enabled = false, common_log_is_ascii = true, extended_log_enabled = false, extended_log_is_ascii = false, extended2_log_enabled = false, extended2_log_is_ascii = true, separate_icp_logs = false, separate_host_logs = false, collation_mode = 2, collation_port = 8085, collation_host_tagged = false, collation_preproc_threads = 1, collation_retry_sec = 5, collation_max_send_buffers = 16, rolling_enabled = 2, rolling_interval_sec = 86400, rolling_offset_hr = 0, rolling_size_mb = 10240, auto_delete_rolled_files = true, custom_logs_enabled = true, search_log_enabled = false, search_rolling_interval_sec = 86400, search_server_ip_addr = 0, search_server_port = 8080, search_top_sites = 100, search_log_filters = 0x0, search_url_filter = 0x0, search_log_file_one = 0x2b1b68758e00 "search_log1", search_log_file_two = 0x2b1b68758e20 "search_log2", sampling_frequency = 1, file_stat_frequency = 32, space_used_frequency = 2, ascii_buffer_size = 36864, max_line_size = 9216, hostname = 0x2b1b68758f20 "localhost", logfile_dir = 0x2b1b6823fcd0 "/var/log/trafficserver", squid_log_name = 0x2b1b68240960 "squid", squid_log_header = 0x0, common_log_name = 0x2b1b68225a90 "common", common_log_header = 0x0, extended_log_name = 0x2b1b682419f0 "extended", extended_log_header = 0x0, extended2_log_name = 0x2b1b6874f7b0 "extended2", extended2_log_header = 0x0, collation_host = 0x2b1b6874fe50 "none", collation_secret = 0x2b1b6874fa20 "Aapl_Baz1ng4!", xml_config_file = 0x2b1b68758da0 "logs_xml.config", hosts_config_file = 0x2b1b68758dc0 "log_hosts.config", use_orphan_log_space_value = true, m_log_collation_accept = 0x0, m_dir_entry = 0x2b1b68769e30, m_pDir = 0x0, m_disk_full = true, m_disk_low = false, m_partition_full = false, m_partition_low = false, m_log_directory_inaccessible = false}
Note that use_orphan_log_space_value = true.
In proxy/logging/LogConfig.h:
169 170 int get_max_space_mb() { 171 return (use_orphan_log_space_value ? max_space_mb_for_orphan_logs : max_space_mb_for_logs); 172 } 173
So the orphan log space is being used to determine the maximum log space allowed. This is determined in LogConfig::init():
742 // determine if we should use the orphan log space value or not 743 // we use it if all objects are collation clients, or if some are and 744 // the specified space for collation is larger than that for local files 745 // 746 size_t num_collation_clients = log_object_manager.get_num_collation_clients(); 747 use_orphan_log_space_value = 748 (num_collation_clients == 0 ? false : 749 (log_object_manager.get_num_objects() == num_collation_clients ? true : 750 max_space_mb_for_orphan_logs > max_space_mb_for_logs)); 751
Our configuration has 1 log object that is a collation client, so log_object_manager.get_num_objects() == log_object_manager.get_num_collation_clients(), and we end up using the orphan log space.
The fix is to treat the error.log like a standard log object and manage it like all the others. This makes the assumptions around logging thresholds turn out to be true.