feat: enhance logging capabilities and configuration #393

Open
niteshpurohit wants to merge 16 commits into main from feat/telemetry-and-recovery

@niteshpurohit (Member) commented May 24, 2026

  • Introduced structured logging support to improve log readability and parsing.
  • Added access and error log file paths to the configuration for better log management.
  • Implemented logging for runtime events, including worker lifecycle and health state changes.
  • Created methods for logging access events and runtime errors to separate concerns.
  • Updated runtime configuration to include new logging options, ensuring they are configurable via environment variables and Ruby options.
  • Enhanced the control plane to serve metrics and stats endpoints, providing observability into the system's performance.
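
As an illustration of how an option might be resolved from a Ruby option or an environment variable, here is a minimal sketch. The precedence order (explicit value, then environment, then default), the helper name `resolve_option`, and any environment variable name passed to it are assumptions for illustration; the PR's actual loader in `runtime_config.cpp` is not shown on this page.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper (not from the PR): resolve one option with the assumed
// precedence explicit value > environment variable > built-in default.
std::string resolve_option(const std::optional<std::string> &explicit_value,
                           const char *env_name,
                           const std::string &fallback)
{
  assert(env_name != nullptr);
  if (explicit_value.has_value())
  {
    return *explicit_value;
  }

  // std::getenv returns nullptr when the variable is unset.
  const char *from_env = std::getenv(env_name);
  if (from_env != nullptr && from_env[0] != '\0')
  {
    return from_env;
  }
  return fallback;
}
```

An empty environment value is treated as unset here, which is a design choice of the sketch rather than documented behaviour.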

closes: #124
closes: #137
closes: #150

@niteshpurohit niteshpurohit self-assigned this May 24, 2026
Copilot AI review requested due to automatic review settings May 24, 2026 20:40
Copilot AI left a comment

Pull request overview

This PR expands Vajra’s operator-facing observability surface by adding configurable access/error log destinations, introducing worker health state tracking, and exposing native control-plane endpoints for stats and Prometheus-style metrics.

Changes:

  • Added new runtime configuration options (access_log, error_log, structured_logs, stats_path, metrics_endpoint) and plumbed them from Ruby → native runtime.
  • Implemented runtime/access/error logging helpers and added access logging for both app and control-plane responses.
  • Added worker health state tracking plus control-plane /stats (JSON) and /metrics (text) endpoints, with new E2E coverage.
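
For readers unfamiliar with the Prometheus text format mentioned above, a minimal sketch of rendering a `/metrics` payload follows. The struct, the function name, and the metric names (`vajra_requests_total`, `vajra_workers_healthy`) are hypothetical; the metrics the PR actually emits are not listed on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical counters (not from the PR).
struct RuntimeCounters
{
  std::uint64_t requests_total = 0;
  std::uint64_t workers_healthy = 0;
};

// Render counters in the Prometheus text exposition format:
// a "# HELP" line, a "# TYPE" line, then one "name value" sample per metric.
std::string render_metrics(const RuntimeCounters &counters)
{
  std::ostringstream out;
  out << "# HELP vajra_requests_total Requests handled since boot.\n"
      << "# TYPE vajra_requests_total counter\n"
      << "vajra_requests_total " << counters.requests_total << "\n"
      << "# HELP vajra_workers_healthy Workers currently classified healthy.\n"
      << "# TYPE vajra_workers_healthy gauge\n"
      << "vajra_workers_healthy " << counters.workers_healthy << "\n";
  return out.str();
}
```

A counter only ever increases, while a gauge can go up and down, which is why the worker count is typed as a gauge in this sketch.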

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| gems/vajra/spec/support/documented_server_options.rb | Updates documented/native option fixtures to include the new logging + control-plane config. |
| gems/vajra/spec/e2e/vajra/configuration_spec.rb | Adds E2E coverage for configured stats/metrics endpoints and adjusts an ordering assertion. |
| gems/vajra/lib/vajra.rb | Extends documented/native start option key lists to include new logging/control-plane options. |
| gems/vajra/ext/vajra/vajra.hpp | Extends native start signature to accept logging + control-plane parameters. |
| gems/vajra/ext/vajra/vajra.cpp | Passes newly loaded runtime config fields into the native start call. |
| gems/vajra/ext/vajra/runtime/worker_pool.hpp | Adds worker health state enum and shared telemetry/transition counters. |
| gems/vajra/ext/vajra/runtime/runtime_logging.hpp | Adds declarations for logging configuration and new access/error logging helpers. |
| gems/vajra/ext/vajra/runtime/runtime_logging.cpp | Implements file-backed logging helpers, access/error logging, and enriches lifecycle logs. |
| gems/vajra/ext/vajra/runtime/runtime_config.hpp | Extends RuntimeConfig with logging + control-plane fields. |
| gems/vajra/ext/vajra/runtime/runtime_config.cpp | Loads/validates new options from Ruby/env and returns them in RuntimeConfig. |
| gems/vajra/ext/vajra/runtime/native_runtime.hpp | Adds health policy data and a method to refresh worker health. |
| gems/vajra/ext/vajra/runtime/native_runtime.cpp | Implements health refresh, tracks worker telemetry, wires control-plane config, configures logging. |
| gems/vajra/ext/vajra/request/request_processor.cpp | Logs access events and handles control-plane responses before Rack execution. |
| gems/vajra/ext/vajra/request/request_executor.hpp | Adds control_response virtual hook for control-plane handling. |
| gems/vajra/ext/vajra/request/request_executor.cpp | Provides default control_response implementation returning nullopt. |
| gems/vajra/ext/vajra/rack/rack_request_executor.hpp | Adds control-plane config plumbing and default stats/metrics payload hooks. |
| gems/vajra/ext/vajra/rack/rack_request_executor.cpp | Implements control-plane request matching and emits stats/metrics payloads. |
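
The `control_response` hook described for `request_executor.hpp` and `rack_request_executor.cpp` can be pictured roughly as below. This is a sketch under assumptions: the `ControlResponse` type, the method signature, the class name `ControlPlaneExecutor`, and the placeholder bodies are all hypothetical, since the summary only says that the default returns `nullopt` and that the Rack executor matches control-plane requests.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical response type (not from the PR).
struct ControlResponse
{
  int status_code;
  std::string content_type;
  std::string body;
};

class RequestExecutor
{
public:
  virtual ~RequestExecutor() = default;

  // Default: nothing matches, so the caller falls through to the application.
  virtual std::optional<ControlResponse> control_response(const std::string & /*method*/,
                                                          const std::string & /*target*/) const
  {
    return std::nullopt;
  }
};

class ControlPlaneExecutor : public RequestExecutor
{
public:
  ControlPlaneExecutor(std::string stats_path, std::string metrics_path)
      : stats_path_(std::move(stats_path)), metrics_path_(std::move(metrics_path))
  {
    assert(!stats_path_.empty() && !metrics_path_.empty());
  }

  // Answer GET requests for the configured paths; everything else falls through.
  std::optional<ControlResponse> control_response(const std::string &method,
                                                  const std::string &target) const override
  {
    if (method != "GET")
    {
      return std::nullopt;
    }
    if (target == stats_path_)
    {
      return ControlResponse{200, "application/json", "{}"}; // placeholder body
    }
    if (target == metrics_path_)
    {
      return ControlResponse{200, "text/plain; version=0.0.4", ""}; // placeholder body
    }
    return std::nullopt;
  }

private:
  std::string stats_path_;
  std::string metrics_path_;
};
```

Returning `std::optional` lets the request processor check for a control-plane answer first and only invoke the Rack application when the hook yields nothing.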

Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp
Comment thread gems/vajra/ext/vajra/runtime/runtime_config.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/runtime_config.cpp
Comment thread gems/vajra/lib/vajra.rb
Comment thread gems/vajra/spec/e2e/vajra/configuration_spec.rb Outdated
Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (3)

gems/vajra/ext/vajra/runtime/runtime_logging.cpp:392

  • logging_config.structured_logs is accessed without synchronization here. Since configure_runtime_logging can reset the config/streams under logging_mutex, this should read the flag (and any derived state) under the same mutex (or via atomics) to avoid undefined behavior in multi-threaded logging.
void Vajra::runtime::log_runtime_error(const std::string &message)
{
  if (logging_config.structured_logs)
  {
    write_error_line(
        "{\"component\":\"error\",\"timestamp\":\"" + utc_timestamp() + "\",\"message\":" +
        escaped_log_value(message) + "}");
    return;
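
One way to apply the suggestion above is to copy the config while holding the mutex and branch on the copy. The sketch below uses simplified stand-ins for `logging_config` and `logging_mutex`; the struct layout and the helper names `logging_config_snapshot` and `format_error_line` are assumptions, and JSON escaping and timestamps are omitted for brevity.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-ins for the file-local state named in the review comment.
struct LoggingConfig
{
  bool structured_logs = false;
};

static std::mutex logging_mutex;
static LoggingConfig logging_config;

void configure_structured_logs(bool enabled)
{
  const std::lock_guard<std::mutex> lock(logging_mutex);
  logging_config.structured_logs = enabled;
}

// Copy the config under the lock; callers branch on the copy, never on the global.
LoggingConfig logging_config_snapshot()
{
  const std::lock_guard<std::mutex> lock(logging_mutex);
  return logging_config;
}

std::string format_error_line(const std::string &message)
{
  const LoggingConfig snapshot = logging_config_snapshot();
  if (snapshot.structured_logs)
  {
    // Escaping of `message` is intentionally left out of this sketch.
    return "{\"component\":\"error\",\"message\":\"" + message + "\"}";
  }
  return "error: " + message;
}
```

The lock is released before any formatting happens, so the critical section stays short even when the log line is expensive to build.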

gems/vajra/ext/vajra/runtime/runtime_logging.cpp:407

  • logging_config.structured_logs is read without locking, but is written under logging_mutex in configure_runtime_logging. To avoid a data race during concurrent access logging, read a synchronized snapshot (or make the flag atomic) before branching.
void Vajra::runtime::log_access_event(const std::string &method, const std::string &target, int status_code)
{
  if (logging_config.structured_logs)
  {
    std::ostringstream line;
    line << "{\"component\":\"access\""
         << ",\"timestamp\":\"" << utc_timestamp() << "\""
         << ",\"method\":" << escaped_log_value(method)
         << ",\"target\":" << escaped_log_value(target)
         << ",\"status\":" << status_code
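
The alternative mentioned in the comment, making the flag atomic, might look like the following sketch. The names `structured_logs_enabled`, `set_structured_logs`, and `format_access_line` are hypothetical, and escaping and timestamps are omitted. Note that an atomic flag only removes the race on the flag itself; any streams that are reset during reconfiguration would still need the mutex.

```cpp
#include <atomic>
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical flag (not from the PR).
static std::atomic<bool> structured_logs_enabled{false};

void set_structured_logs(bool enabled)
{
  structured_logs_enabled.store(enabled, std::memory_order_release);
}

std::string format_access_line(const std::string &method, const std::string &target, int status_code)
{
  // Load once; the branch below only looks at the local copy.
  const bool structured = structured_logs_enabled.load(std::memory_order_acquire);

  std::ostringstream line;
  if (structured)
  {
    line << "{\"component\":\"access\",\"method\":\"" << method << "\",\"target\":\"" << target
         << "\",\"status\":" << status_code << "}";
  }
  else
  {
    line << method << " " << target << " " << status_code;
  }
  return line.str();
}
```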

gems/vajra/ext/vajra/runtime/native_runtime.cpp:1434

  • The runtime writes log lines (potentially to buffered std::ofstreams) and then forks workers shortly afterwards. Any buffered-but-unflushed data at fork time can be duplicated (flushed by both parent and child) or lost. Consider flushing log streams after the boot banner/configure step, or deferring opening/initializing file streams until after fork() in each process.
    const bool debug_logging = debug_logging_enabled(config.log_level);
    {
      const std::lock_guard<std::mutex> lock(server_mutex_);
      health_policy_ = health_policy_for(config);
      debug_logging_.store(debug_logging, std::memory_order_release);
    }
    configure_runtime_logging(config.structured_logs, config.access_log, config.error_log);
    log_runtime_banner_start(config.host, config.port, config.workers, config.min_threads, config.max_threads);
    const BootContractResult master_boot_result = BootContract::run(
        BootContractConfig{config.port, config.max_request_head_bytes, kMasterPreloadRuntimeRole});
    BootContract::ensure_ready(master_boot_result);

    std::vector<std::shared_ptr<SharedWorkerState>> booted_worker_states;
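
The fork hazard described above can be reproduced and avoided in a few lines. This is a minimal POSIX sketch, not the PR's code: the function name and log contents are made up, and it shows only the first option from the comment, flushing before `fork()`.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Write a banner, flush it, then fork. Without the flush the child would inherit
// a copy of the unflushed buffer and could write the banner a second time.
// Returns the number of lines found in the log afterwards, or -1 on failure.
int write_banner_then_fork(const std::string &path)
{
  assert(!path.empty());
  std::ofstream log(path, std::ios::trunc);
  if (!log.is_open())
  {
    return -1;
  }
  log << "boot banner\n";
  log.flush(); // drain the user-space buffer before fork()

  const pid_t pid = fork();
  if (pid < 0)
  {
    return -1;
  }
  if (pid == 0)
  {
    log << "worker started\n";
    log.flush();
    _exit(0); // skip destructors and atexit handlers in the child
  }

  int status = 0;
  if (waitpid(pid, &status, 0) != pid)
  {
    return -1;
  }
  log.close();

  // Count the lines that ended up in the file.
  std::ifstream in(path);
  int lines = 0;
  for (std::string line; std::getline(in, line);)
  {
    ++lines;
  }
  return lines;
}
```

Parent and child share one file offset after `fork()`, so the child's line is appended after the banner rather than overwriting it.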

Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/native_runtime.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp Outdated
Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated 5 comments.

Comment thread gems/vajra/spec/e2e/vajra/support/http_helpers.rb Outdated
Comment thread gems/vajra/ext/vajra/runtime/native_runtime.cpp
Comment thread gems/vajra/ext/vajra/runtime/native_runtime.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp
Comment thread gems/vajra/spec/e2e/spec_helper.rb
Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated 5 comments.

Comment thread gems/vajra/ext/vajra/runtime/native_runtime.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/native_runtime.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/native_runtime.cpp
Comment thread gems/vajra/ext/vajra/rack/rack_request_executor.cpp Outdated
Comment thread gems/vajra/sig/vajra.rbs Outdated
Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

gems/vajra/spec/e2e/vajra/configuration_spec.rb:165

  • wait_for_runtime_output is now implemented both here and in spec/e2e/vajra/support/http_helpers.rb (included via VajraE2EHttpHelpers). Keeping two copies increases drift risk; consider removing this local definition and using the shared helper method instead.
  def wait_for_runtime_output(output, runtime_output, pattern, count: 1, timeout: 2)
    Timeout.timeout(timeout) do
      loop do
        runtime_output << read_available_output(output)
        break if runtime_output.scan(pattern).size >= count

        sleep 0.01
      end
    end
  end

Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp Outdated
Comment thread gems/vajra/ext/vajra/runtime/native_runtime.cpp
Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 26 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

gems/vajra/ext/vajra/runtime/native_runtime.cpp:2061

  • NativeRuntime::stop() no longer calls stop_worker_processes(). As a result, calling Vajra.stop can leave worker processes running (and the runtime may not actually shut down cleanly) since only the server is stopped here. Consider stopping worker processes (and/or waiting for them to exit) as part of stop() again, consistent with the shutdown path in start().
void Vajra::runtime::NativeRuntime::stop()
{
  const bool had_runtime = runtime_running();
  if (had_runtime)
  {
    begin_runtime_shutdown();
  }

  Vajra::Server *server = nullptr;
  std::shared_ptr<Vajra::Server> server_handle;
  {
    std::lock_guard<std::mutex> lock(server_mutex_);
    server_handle = server_instance_;
    server = server_handle.get();
  }

  if (server != nullptr)
  {
    server->stop();
  }
}
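
To make the concern concrete, a stop path that also terminates and reaps workers could follow the shape below. This is a sketch under assumptions: `stop_worker_processes_sketch` and `spawn_idle_worker` are hypothetical names, and the real `stop_worker_processes()` in the PR is not shown here. A production version would likely add a timeout and escalate to `SIGKILL`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: send SIGTERM to every worker, then reap each one so no
// child outlives the stop call. Returns the number of workers reaped.
std::size_t stop_worker_processes_sketch(const std::vector<pid_t> &worker_pids)
{
  for (const pid_t pid : worker_pids)
  {
    assert(pid > 0);
    kill(pid, SIGTERM);
  }

  std::size_t reaped = 0;
  for (const pid_t pid : worker_pids)
  {
    int status = 0;
    pid_t result;
    do
    {
      result = waitpid(pid, &status, 0); // retry if interrupted by a signal
    } while (result < 0 && errno == EINTR);
    if (result == pid)
    {
      ++reaped;
    }
  }
  return reaped;
}

// Hypothetical stand-in for a worker: a child that idles until it is signalled.
pid_t spawn_idle_worker()
{
  const pid_t pid = fork();
  if (pid == 0)
  {
    signal(SIGTERM, SIG_DFL); // make sure SIGTERM terminates the child
    for (;;)
    {
      pause();
    }
  }
  return pid;
}
```

Signalling all workers before waiting on any of them lets them shut down in parallel instead of one at a time.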

Comment thread gems/vajra/ext/vajra/runtime/native_runtime.cpp
- Introduced tracing capabilities using OpenTelemetry to monitor request and runtime lifecycle spans.
- Added configuration options for enabling tracing, specifying the tracing endpoint, and service name.
- Implemented methods to manage tracing state and lifecycle callbacks.
- Enhanced logging to include tracing status and details in worker lifecycle events.
- Created a new Tracing module to encapsulate tracing logic and state management.
- Added tests to ensure proper functionality and handling of tracing options.
@niteshpurohit niteshpurohit requested a review from Copilot May 26, 2026 04:57
Copilot AI left a comment

Pull request overview

Copilot reviewed 29 out of 30 changed files in this pull request and generated 2 comments.

Comment thread gems/vajra/spec/vajra/internal/tracing_spec.rb
Comment thread gems/vajra/ext/vajra/rack/rack_request_executor.cpp Outdated
Copilot AI review requested due to automatic review settings May 27, 2026 03:46
Copilot AI left a comment

Pull request overview

Copilot reviewed 29 out of 30 changed files in this pull request and generated 4 comments.

Comment thread gems/vajra/Gemfile Outdated
Comment thread gems/vajra/lib/vajra/internal/tracing.rb
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp
Comment thread gems/vajra/ext/vajra/runtime/runtime_logging.cpp
Copilot AI review requested due to automatic review settings May 27, 2026 04:07
Copilot AI left a comment

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 2 comments.

Comment thread gems/vajra/sig/vajra/internal/tracing.rbs
Copilot AI left a comment

Pull request overview

Copilot reviewed 32 out of 33 changed files in this pull request and generated 3 comments.

module Vajra
  module Internal
    module Tracing
      type start_options = Hash[Symbol, bool | String]
      TRACE_STATE: TraceState

      def self.install_from_start_options!: (start_options) -> bool
      def self.with_request_span: [T] (Hash[String, String]) { () -> T } -> T
Comment on lines +807 to +811
#if defined(__APPLE__)
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wdeprecated-declarations"
#endif
pid = fork();
Development

Successfully merging this pull request may close these issues.

  • Metrics, traces, and structured operational output
  • Automated recovery and lifecycle management
  • Worker telemetry and health classification

2 participants