Python Log Rotation: A Comprehensive Guide to Better Log Management
In production applications, logs are crucial for debugging, monitoring, and maintaining system health. However, without proper management, logs can quickly become overwhelming, consuming excessive disk space and degrading system performance. Enter log rotation: a vital practice that helps manage log files efficiently while retaining important historical data.
The Why and What of Log Rotation
Log rotation is more than just a good practice; it's essential for any production application. By implementing log rotation, you can:
Prevent your disk from filling up with endless log files
Maintain optimal system performance
Keep logs organized for easier troubleshooting
Meet compliance and retention requirements
Let's dive into four powerful strategies for implementing log rotation in Python, each with its unique advantages and use cases.
1. Size-Based Rotation: The Space Guardian
Size-based rotation is like having a vigilant guard watching your disk space. Using Python's RotatingFileHandler, you can specify exactly how large each log file should grow before it's rotated.
from logging.handlers import RotatingFileHandler

handler = RotatingFileHandler(
    filename="app.log",
    maxBytes=10 * 1024 * 1024,  # 10MB
    backupCount=5,
    encoding='utf-8'
)
When your log file hits the specified size limit, it's renamed to app.log.1, and a fresh app.log file is created. Existing backups shift up first in a predictable cascade: app.log.1 becomes app.log.2, and so on. Once the number of backups reaches backupCount, the oldest one is deleted.
This approach shines in environments where disk space is at a premium or when dealing with applications that have unpredictable logging patterns.
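The cascade is easy to observe with deliberately tiny limits. The following is a minimal sketch, not a production configuration: the temporary directory, the logger name rotation_demo, and the 200-byte limit are assumptions chosen only so that rotation happens after a few records.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

# Tiny limits (illustrative only) so rotation is visible after a handful of records
handler = RotatingFileHandler(log_path, maxBytes=200, backupCount=2, encoding="utf-8")
logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

for i in range(50):
    logger.info("record %03d with some padding text", i)

handler.close()
rotated_files = sorted(os.listdir(log_dir))
print(rotated_files)  # ['app.log', 'app.log.1', 'app.log.2']
```

Only the active file and two backups remain, however many records are written; older records are discarded along with the oldest backup.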
2. Time-Based Rotation: The Calendar Keeper
Time-based rotation, implemented through TimedRotatingFileHandler, organizes logs based on time intervals rather than size. This approach is particularly elegant for applications requiring date-based log organization.
from logging.handlers import TimedRotatingFileHandler

handler = TimedRotatingFileHandler(
    filename="app.log",
    when='midnight',
    interval=1,
    backupCount=30,
    encoding='utf-8'
)
# Already the default suffix for 'midnight'; rotated files look like app.log.2024-01-31
handler.suffix = "%Y-%m-%d"
This creates a clean, chronological archive of your logs, making it trivial to locate logs from specific dates. Whether you need rotation by seconds, minutes, hours, or days, time-based rotation has you covered.
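To see the naming scheme without waiting for midnight, the sketch below forces a rollover by calling doRollover() by hand. That call is for illustration only; in normal use the handler triggers it itself. The temporary directory and the logger name timed_demo are assumptions.

```python
import logging
import os
import tempfile
from logging.handlers import TimedRotatingFileHandler

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

handler = TimedRotatingFileHandler(log_path, when="midnight", backupCount=7, encoding="utf-8")
logger = logging.getLogger("timed_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.info("written before the rollover")
handler.doRollover()  # forced here for illustration; normally happens at midnight
logger.info("written after the rollover")
handler.close()

# Everything except the active file is a dated backup: app.log.YYYY-MM-DD
backups = [f for f in os.listdir(log_dir) if f != "app.log"]
print(backups)
```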
3. Watched File Handler: The System Integrator
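For teams preferring system-level tools like logrotate, Python's WatchedFileHandler offers straightforward integration. Before each write, the handler checks whether the log file has been moved or replaced and, if so, reopens it at the original path, so logging continues in the fresh file after an external rotation. It is intended for Unix-like systems and is not appropriate on Windows.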
from logging.handlers import WatchedFileHandler

handler = WatchedFileHandler(
    filename="app.log",
    encoding='utf-8'
)
While this approach requires more setup, it provides excellent flexibility and plays well with existing infrastructure.
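The reopen behaviour can be simulated without logrotate. In this sketch, os.rename stands in for an external tool that moves the file aside, which is an assumption about how your rotation tool is configured. It is written for a Unix-like system, and the logger name watched_demo is hypothetical.

```python
import logging
import os
import tempfile
from logging.handlers import WatchedFileHandler

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

handler = WatchedFileHandler(log_path, encoding="utf-8")
logger = logging.getLogger("watched_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.info("before external rotation")

# Stand-in for an external tool moving the log file aside
os.rename(log_path, log_path + ".1")

# On the next write the handler sees the path changed and reopens app.log
logger.info("after external rotation")
handler.close()

with open(log_path + ".1", encoding="utf-8") as f:
    rotated_text = f.read()
with open(log_path, encoding="utf-8") as f:
    current_text = f.read()
print(repr(rotated_text), repr(current_text))
```

Each record ends up in the file that was current when it was written, with no signal handling or restart needed in the application.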
4. Compressed Rotation: The Space Optimiser
For applications generating substantial logs that need to be retained for extended periods, compressed rotation offers an elegant solution. Here's how you can implement it:
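import gzip
import os
import shutil
from logging.handlers import RotatingFileHandler

class CompressedRotatingFileHandler(RotatingFileHandler):
    def rotation_filename(self, default_name: str) -> str:
        # Rotated files are stored as app.log.1.gz, app.log.2.gz, ...
        return default_name + ".gz"

    def rotate(self, source: str, dest: str) -> None:
        # Compress the finished log into the backup file
        with open(source, 'rb') as f_in:
            with gzip.open(dest, 'wb') as f_out:
                shutil.copyfileobj(f_in, f_out)
        # Remove the original so the next app.log starts empty
        os.remove(source)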
This strategy significantly reduces storage requirements while maintaining log accessibility, albeit with a small CPU overhead for compression.
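An alternative that avoids subclassing is to set the handler's namer and rotator attributes, which the rotating handlers call when naming and moving a backup. The sketch below takes that route and reads a compressed backup back with gzip.open. The function names gzip_namer and gzip_rotator, the temporary directory, and the tiny 200-byte limit are assumptions for illustration.

```python
import gzip
import logging
import os
import shutil
import tempfile
from logging.handlers import RotatingFileHandler


def gzip_namer(default_name):
    # Rotated files get a .gz extension: app.log.1 -> app.log.1.gz
    return default_name + ".gz"


def gzip_rotator(source, dest):
    # Compress the finished log, then remove the uncompressed original
    with open(source, "rb") as f_in, gzip.open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    os.remove(source)


log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

handler = RotatingFileHandler(log_path, maxBytes=200, backupCount=2, encoding="utf-8")
handler.namer = gzip_namer
handler.rotator = gzip_rotator

logger = logging.getLogger("gzip_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

for i in range(20):
    logger.info("record %03d with some padding text", i)
handler.close()

# Compressed backups can be read back as text with gzip.open
with gzip.open(log_path + ".1.gz", "rt", encoding="utf-8") as f:
    newest_backup = f.read().splitlines()

listing = sorted(os.listdir(log_dir))
print(listing)  # ['app.log', 'app.log.1.gz', 'app.log.2.gz']
```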
Best Practices and Implementation Tips
1. Right-Size Your Rotation
When implementing size-based rotation, consider your application's logging patterns:
handler = RotatingFileHandler(
    filename="app.log",
    maxBytes=50 * 1024 * 1024,  # 50MB
    backupCount=10  # 10 backups + the active file: up to about 550MB in total
)
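When budgeting disk space, remember that the active file counts alongside the backups. A small helper makes the arithmetic explicit; the function name worst_case_disk_usage is hypothetical, and the result is an approximate upper bound, since one unusually large record can push a file slightly past maxBytes.

```python
def worst_case_disk_usage(max_bytes: int, backup_count: int) -> int:
    """Approximate upper bound in bytes: the active file plus every retained backup."""
    return max_bytes * (backup_count + 1)


MB = 1024 * 1024
total = worst_case_disk_usage(max_bytes=50 * MB, backup_count=10)
print(total // MB)  # 550
```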
2. Match Retention to Requirements
For applications with specific retention requirements:
handler = TimedRotatingFileHandler(
    filename="app.log",
    when='midnight',
    interval=1,
    backupCount=90  # 90 days retention
)
3. Consider a Hybrid Approach: The Best of Both Worlds
While individual rotation strategies each have their merits, some applications require a more sophisticated approach that combines the benefits of multiple strategies. The hybrid approach combines time-based and size-based rotation, so a log file is rotated on schedule and also whenever it grows too large in between.
Here's a detailed implementation of a hybrid handler:
from logging.handlers import TimedRotatingFileHandler
import os

class HybridRotatingHandler(TimedRotatingFileHandler):
    """
    A handler that rotates logs based on both time and size constraints.
    Inherits from TimedRotatingFileHandler to maintain its time-based functionality
    while adding size-based checks.
    """

    def __init__(self, filename, max_bytes=0, when='midnight', interval=1,
                 backup_count=0, encoding='utf-8', **kwargs):
        """
        Initialize the handler with both time and size parameters.

        Args:
            filename (str): Base name of the log file
            max_bytes (int): Maximum size in bytes before rotation (0 = no size limit)
            when (str): Type of interval ('S', 'M', 'H', 'D', 'midnight')
            interval (int): Number of intervals before rotation
            backup_count (int): Maximum number of backup files to keep
            encoding (str): File encoding to use
        """
        self.max_bytes = max_bytes
        super().__init__(filename, when, interval, backup_count,
                         encoding=encoding, **kwargs)

    def shouldRollover(self, record):
        """
        Determine if rollover should occur based on both time and size conditions.

        Args:
            record: Log record to be written
        Returns:
            bool: True if rollover should occur, False otherwise
        """
        # Check time-based rotation first (parent class implementation)
        if super().shouldRollover(record):
            return True
        # Check size-based rotation if max_bytes is set
        if self.max_bytes > 0:
            try:
                if self.stream is None:
                    # delay=True: the file has not been opened yet
                    self.stream = self._open()
                # Measure the record as written: formatted text plus line terminator
                # (byte length assumes UTF-8, the handler's default encoding)
                msg = self.format(record) + self.terminator
                self.stream.seek(0, os.SEEK_END)
                if self.stream.tell() + len(msg.encode()) >= self.max_bytes:
                    return True
            except OSError:
                # If there's any error seeking/telling, err on the side of rotation
                return True
        return False

    def getFilesToDelete(self):
        """
        Get list of files to delete when backup_count is reached.
        Size-triggered rollovers go through the inherited doRollover(), so their
        backups carry the same time suffix as time-triggered ones and the parent
        implementation already finds them. The active log file is never included.
        """
        return super().getFilesToDelete()
This enhanced hybrid handler offers several key features:
Dual Triggers: Rotation occurs when either:
The specified time interval is reached (e.g., midnight)
The file size exceeds the maximum bytes limit
Flexible Configuration:
# Example configuration
handler = HybridRotatingHandler(
    filename="app.log",
    max_bytes=10 * 1024 * 1024,  # 10MB size limit
    when='midnight',             # Rotate at midnight
    interval=1,                  # Every day
    backup_count=30,             # Keep 30 backups
    encoding='utf-8'
)
- Smart File Management:
# Example usage in a logging configuration
import logging
logger = logging.getLogger('hybrid_logger')
logger.setLevel(logging.INFO)
handler = HybridRotatingHandler(
    filename="app.log",
    max_bytes=10 * 1024 * 1024,
    when='midnight',
    interval=1,
    backup_count=30
)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
# The logger will now rotate files based on both time and size
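One caveat when a time-based handler rolls over more than once per interval: TimedRotatingFileHandler names each backup from the time suffix alone, so a second rollover on the same day targets the same filename as the first, and the earlier backup may be overwritten. The sketch below shows one possible way around this: overriding rotation_filename to append a counter when the name is taken. The class name UniqueBackupTimedHandler is hypothetical, and the rollovers are forced by hand for illustration. Whether backupCount pruning picks up the counter-suffixed files depends on the Python version, so verify that before relying on it.

```python
import os
import tempfile
from logging.handlers import TimedRotatingFileHandler


class UniqueBackupTimedHandler(TimedRotatingFileHandler):
    """Hypothetical helper: never reuse an existing backup filename."""

    def rotation_filename(self, default_name):
        name = super().rotation_filename(default_name)
        if not os.path.exists(name):
            return name
        # Same interval, same date suffix: append a counter instead
        counter = 1
        while os.path.exists(f"{name}.{counter}"):
            counter += 1
        return f"{name}.{counter}"


log_dir = tempfile.mkdtemp()
handler = UniqueBackupTimedHandler(os.path.join(log_dir, "app.log"), when="midnight")

# Two rollovers within the same day, forced here for illustration
handler.doRollover()
handler.doRollover()
handler.close()

# Two distinct backups; the second carries a ".1" counter after the date
backups = sorted(f for f in os.listdir(log_dir) if f != "app.log")
print(backups)
```

The same override could be added to a handler that combines time and size triggers, since both kinds of rollover pass through rotation_filename.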
Use Cases for Hybrid Rotation
The hybrid approach is particularly valuable in scenarios such as:
Variable Load Applications
Applications with unpredictable logging patterns
Systems that experience periodic spikes in activity
Services that need guaranteed rotation intervals while preventing runaway file sizes
Compliance Requirements
Systems that must maintain daily log files for auditing
Applications that need to ensure logs don't exceed certain size limits
Environments where both retention period and storage space are strictly regulated
High-Availability Services
Mission-critical applications that can't risk running out of disk space
Services that require both time-based organization and size control
Systems where log management needs to be both predictable and protective
Best Practices for Hybrid Rotation
When implementing hybrid rotation, consider these guidelines:
- Size Threshold Selection
# Calculate size based on average daily volume
average_daily_volume = 5 * 1024 * 1024 # 5MB
peak_factor = 3
handler = HybridRotatingHandler(
    filename="app.log",
    max_bytes=average_daily_volume * peak_factor,  # 15MB
    when='midnight'
)
- Monitoring Integration
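class MonitoredHybridHandler(HybridRotatingHandler):
    def doRollover(self):
        """Override to add monitoring"""
        # Measure the file before rolling over; afterwards it is new and nearly empty
        notify = self.shouldNotifyOps()
        super().doRollover()
        if notify:
            # notify_ops_team is a placeholder for your own alerting hook
            notify_ops_team("Log rotation occurred", self.baseFilename)

    def shouldNotifyOps(self):
        """Determine if ops team should be notified"""
        try:
            size = os.path.getsize(self.baseFilename)
        except OSError:
            # File missing (for example with delay=True): nothing to report
            return False
        return self.max_bytes > 0 and size >= self.max_bytes * 0.9  # 90% threshold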
Choosing Your Strategy
The right rotation strategy depends on your specific needs:
Choose size-based rotation when disk space is limited and log volume is unpredictable
Opt for time-based rotation when logs need date-based organization or retention policies are time-driven
Use watched file handler when integrating with system-level tools
Implement compressed rotation when long-term storage is a priority
Remember to consider your application's requirements, operational environment, compliance needs, and resource constraints when making your choice.
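The decision above can be captured in a small factory so that the strategy becomes a configuration value rather than a code change. This is a sketch under assumptions: the function name make_rotating_handler, the strategy labels, and the limits are all hypothetical, and only the three built-in handlers are covered.

```python
import os
import tempfile
from logging import Handler
from logging.handlers import (
    RotatingFileHandler,
    TimedRotatingFileHandler,
    WatchedFileHandler,
)


def make_rotating_handler(strategy: str, filename: str) -> Handler:
    """Hypothetical helper mapping a strategy name to a configured handler."""
    if strategy == "size":
        # Limited disk space, unpredictable volume
        return RotatingFileHandler(
            filename, maxBytes=10 * 1024 * 1024, backupCount=5, encoding="utf-8"
        )
    if strategy == "time":
        # Date-based organization, time-driven retention
        return TimedRotatingFileHandler(
            filename, when="midnight", backupCount=30, encoding="utf-8"
        )
    if strategy == "external":
        # Rotation handled by a system-level tool such as logrotate
        return WatchedFileHandler(filename, encoding="utf-8")
    raise ValueError(f"unknown rotation strategy: {strategy!r}")


log_dir = tempfile.mkdtemp()
handlers = {
    name: make_rotating_handler(name, os.path.join(log_dir, f"{name}.log"))
    for name in ("size", "time", "external")
}
for h in handlers.values():
    h.close()
```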
Conclusion
Log rotation is a crucial aspect of maintaining healthy production applications. By choosing the right strategy and implementing it properly, you can ensure your logs remain manageable, accessible, and useful without overwhelming your system resources. Whether you opt for size-based, time-based, watched, or compressed rotation, or even a hybrid approach, the key is to align your choice with your specific needs and constraints.