Buffer Logs and Flush Automatically on Error with Powertools for Lambda

The latest release of AWS Powertools for Lambda makes it easier to extend the Logger with custom functionality:

Logger class is now more extensible
You can now overwrite the Logger methods createAndPopulateLogItem, printLog, and processLogItem, which were previously private. This allows you to extend the Logger and add new functionality, e.g., implement your own message buffer.
Release v2.12.0

I'm using this new extensibility to implement a useful feature: log buffering and flushing. The idea is simple: in a production environment, we usually only log important information, such as warnings and errors, because log space can get expensive and it creates noise. However, when an error occurs, we want every possible piece of information: all those debug and info logs scattered throughout a function should be available. But they are not because we set our log level too low.

Buffer and Flush

What if we collect all these debug and info logs internally, and if an important event occurs like an error, we print them to the console? I classify logs into two categories: low-level logs and high-level logs. If our configured log level is WARN, a DEBUG or INFO log would be low-level, and ERROR would be a high-level log.

Now when we print a low-level log, instead of discarding it as it is now, we buffer the log in an internal list. As soon as we have a high-level log, we flush all buffered logs to the console.

Implementation

To add this functionality, we create a new class that extends the Logger class from Powertools and override processLogItem(). This is the central method which is called by the different log methods like logger.debug(). The original implementation prints the log item to the console if it's at the right level. By overriding this method, we can add our special logic of buffering and flushing logs depending on the log level.

import { LogItem, Logger as PowertoolsLogger } from '@aws-lambda-powertools/logger';
import type { LogItemExtraInput, LogItemMessage } from '@aws-lambda-powertools/logger/types';

export class Logger extends PowertoolsLogger {
  #buffer: Record<string, Array<[number, string]>> = {};

  get buffer(): Record<string, Array<[number, string]>> {
    return this.#buffer;
  }

  protected override processLogItem(logLevel: number, input: LogItemMessage, extraInput: LogItemExtraInput): void {
    const xRayTraceId = this['envVarsService'].getXrayTraceId() as string;

    // Flush buffer when log level is higher than the configured log level
    if (logLevel > this.level && xRayTraceId) {
      const buffer = this.#buffer[xRayTraceId] ?? [];

      // Print all log items in the buffer
      if (buffer.length) this.info(`Flushing buffer with ${buffer.length} log items`);

      for (const [bufferLogLevel, bufferLogItem] of buffer) {
        // Create a new LogItem from the stringified log item
        this.printLog(bufferLogLevel, new LogItem(JSON.parse(bufferLogItem)));
      }

      // Clear the buffer after flushing
      // This also removes entries from other X-Ray trace IDs
      this.#buffer = {};
    }

    // Buffer the log item when log level is lower than the configured log level
    if (logLevel < this.level && xRayTraceId) {
      const buffer = this.#buffer[xRayTraceId] ?? [];
      // Add the stringified log item to the buffer
      // Serializing the log item ensures it is not mutated after being added to the buffer
      buffer.push([logLevel, JSON.stringify(this.createAndPopulateLogItem(logLevel, input, extraInput))]);

      // Update the buffer with the new log item
      // This also removes other X-Ray trace IDs from the buffer
      this.#buffer = {
        [xRayTraceId]: buffer,
      };
    }

    // Call the parent method to ensure the log item is processed
    super.processLogItem(logLevel, input, extraInput);
  }
}

You might ask why we use the X-Ray Trace ID here. It's common to instantiate the Logger outside of the handler function. However, because the Lambda execution environment is re-used for potentially multiple invocations, the buffer could contain log items from previous invocations. That's the reason the buffer is implemented as an object rather than a simple array. We use the X-Ray Trace ID as an identifier to only buffer log items from the same invocation. The buffer is implemented as an object rather than a simple array. When the buffer is flushed, we can simply reset the object and therefore purge items from other invocations.

Test Locally

Let's quickly verify the implementation locally:

// set X-Ray Trace ID manually if running locally
process.env._X_AMZN_TRACE_ID = '1-abcdef12-3456abcdef123456abcdef12';

// log level = WARN
const logger = new Logger({ logLevel: 'WARN' });

logger.debug('debug'); // < log level
logger.info('info');   // < log level
logger.warn('warn');   // = log level
logger.error('error'); // > log level

Here's the output we get:

{"level":"WARN","message":"warn","sampling_rate":0,"service":"service_undefined","timestamp":"2024-12-18T11:01:55.260Z","xray_trace_id":"1-abcdef12-3456abcdef123456abcdef12"}
{"level":"INFO","message":"Flushing buffer with 2 log items","sampling_rate":0,"service":"service_undefined","timestamp":"2024-12-18T11:01:55.261Z","xray_trace_id":"1-abcdef12-3456abcdef123456abcdef12"}
{"level":"DEBUG","message":"debug","sampling_rate":0,"service":"service_undefined","timestamp":"2024-12-18T11:01:55.254Z","xray_trace_id":"1-abcdef12-3456abcdef123456abcdef12"}
{"level":"INFO","message":"info","sampling_rate":0,"service":"service_undefined","timestamp":"2024-12-18T11:01:55.259Z","xray_trace_id":"1-abcdef12-3456abcdef123456abcdef12"}
{"level":"ERROR","message":"error","sampling_rate":0,"service":"service_undefined","timestamp":"2024-12-18T11:01:59.228Z","xray_trace_id":"1-abcdef12-3456abcdef123456abcdef12"}

The warning is the first message because the debug and info logs were buffered. When the error was logged, we flushed the buffered logs (and printed an info) before the error was actually printed.

Request For Comments

My naive implementation has a few caveats. Most importantly, the buffer size is not limited, which means it could cause memory issues if the buffer grows too large. There are a few approaches to mitigate this issue, for example, by implementing the buffer as a sliding window which only keeps the most recent logs or limiting the total buffer size.

Furthermore, the buffered logs are only flushed in controlled cases like on logger.error(), but not on unhandled errors. This behavior could easily be achieved if we make the buffer public and use a middleware like Middy.js. Middy exposes an onError event that we could utilize to flush the buffer.

I have written about this more extensively in this Request For Comment on the official AWS Powertools for Lambda repository.

If you'd like to see this feature become a part of Powertools for Lambda, please share your ideas and feedback there 🙏