function parse_log_line

Maturity: 44

Parses a structured log line string and extracts timestamp, logger name, log level, and message components into a dictionary.

File: /tf/active/vicechatdev/SPFCsync/monitor.py
Lines: 15 - 34
Complexity: simple

Purpose

This function is designed to parse log lines that follow a specific format (timestamp - logger_name - level - message) and convert them into structured data. It's useful for log analysis, monitoring systems, and log aggregation tools where raw log strings need to be converted into queryable data structures. The function handles malformed lines gracefully by returning None when the pattern doesn't match or timestamp parsing fails.
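The expected layout matches what Python's standard logging module emits when configured with the format string '%(asctime)s - %(name)s - %(levelname)s - %(message)s'. Whether the monitored application is configured this way is an assumption, not something the source states. A minimal sketch that produces one such line in memory (the logger name and message are illustrative):

```python
import io
import logging

# Capture log output in memory instead of writing to a file or stderr.
stream = io.StringIO()
handler = logging.StreamHandler(stream)

# Assumed format string; the default asctime looks like "2024-01-15 14:30:45,123".
handler.setFormatter(
    logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
)

logger = logging.getLogger('myapp.module')  # illustrative logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the record out of any root handlers

logger.error('Connection timeout')

line = stream.getvalue()
print(line, end='')
```

A line produced this way has the four ' - '-separated fields that parse_log_line looks for.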

Source Code

def parse_log_line(line):
    """Parse a log line and extract information."""
    # Expected format: timestamp - name - level - message
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {
                'timestamp': timestamp,
                'logger': logger_name,
                'level': level,
                'message': message
            }
        except ValueError:
            pass
    
    return None

Parameters

Name   Type   Default   Kind
line   -      -         positional_or_keyword

Parameter Details

line: A string containing a single log line in the format 'YYYY-MM-DD HH:MM:SS,mmm - logger_name - LEVEL - message'. Leading and trailing whitespace is stripped before matching. Any string is accepted, but only lines that begin with this format parse successfully. A non-string argument such as None raises AttributeError, because the function calls line.strip().

Return Value

Returns a dictionary with keys 'timestamp' (datetime object), 'logger' (string), 'level' (string), and 'message' (string) if the log line is successfully parsed. Returns None if the line doesn't match the expected pattern or if the timestamp cannot be parsed. The timestamp is converted from string to a datetime object using the format '%Y-%m-%d %H:%M:%S,%f'.
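One detail worth seeing concretely: the '%f' directive stores the three millisecond digits as microseconds, padding with zeros on the right. A short sketch using only the format string from the source (the sample timestamp is illustrative):

```python
from datetime import datetime

# Same format string the function passes to strptime.
ts = datetime.strptime('2024-01-15 14:30:45,123', '%Y-%m-%d %H:%M:%S,%f')

print(ts.microsecond)  # 123000 (123 ms expressed in microseconds)
print(ts.isoformat())  # 2024-01-15T14:30:45.123000
```

To recover the millisecond value, use ts.microsecond // 1000.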

Dependencies

  • re
  • datetime

Required Imports

import re
from datetime import datetime

Usage Example

import re
from datetime import datetime

def parse_log_line(line):
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {
                'timestamp': timestamp,
                'logger': logger_name,
                'level': level,
                'message': message
            }
        except ValueError:
            pass
    
    return None

# Example usage
log_line = '2024-01-15 14:30:45,123 - myapp.module - ERROR - Connection timeout'
result = parse_log_line(log_line)
if result:
    print(f"Timestamp: {result['timestamp']}")
    print(f"Logger: {result['logger']}")
    print(f"Level: {result['level']}")
    print(f"Message: {result['message']}")
else:
    print("Failed to parse log line")

# Example with invalid line
invalid_line = 'This is not a valid log line'
result = parse_log_line(invalid_line)
print(result)  # Output: None

Best Practices

  • Always check whether the return value is None before accessing dictionary keys; subscripting None raises TypeError
  • The function expects milliseconds in the timestamp (3 digits after comma). Ensure your log format matches this expectation
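  • The regex pattern uses non-greedy matching (.*?) for the logger name, so the name ends at the first ' - LEVEL - ' sequence. Hyphens inside the logger name (for example my-app.worker) and ' - ' inside the message do not break parsing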
  • The function strips whitespace from input, so leading/trailing spaces won't cause parsing failures
  • For batch processing of log files, skip or count None results rather than treating them as errors. The function does not raise for unmatched string input, and continuation lines such as stack traces return None
  • The log level is expected to be a word character sequence (\w+), typically INFO, DEBUG, ERROR, WARNING, etc.
  • If you need to parse logs with different formats, consider modifying the regex pattern or creating format-specific variants of this function
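
The batch-processing advice above can be sketched as follows. The function body is copied from the Source Code section; the helper iter_parsed and the sample lines are hypothetical and exist only for illustration.

```python
import re
from collections import Counter
from datetime import datetime


def parse_log_line(line):
    """Parse a log line and extract information (copied from the source)."""
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {
                'timestamp': timestamp,
                'logger': logger_name,
                'level': level,
                'message': message,
            }
        except ValueError:
            pass
    return None


def iter_parsed(lines):
    """Hypothetical helper: yield parsed entries and skip unparsable lines."""
    for line in lines:
        entry = parse_log_line(line)
        if entry is not None:
            yield entry


# Illustrative input: two valid lines and one traceback continuation line.
sample = [
    '2024-01-15 14:30:45,123 - myapp.module - ERROR - Connection timeout',
    'Traceback (most recent call last):',
    '2024-01-15 14:30:46,001 - myapp.module - INFO - Retry - attempt 2',
]

entries = list(iter_parsed(sample))
level_counts = Counter(entry['level'] for entry in entries)

print(len(entries))
print(level_counts)
print(entries[1]['message'])
```

The traceback line is dropped silently, and the ' - ' inside the last message stays in the message field because only the logger-name group is non-greedy. For a real file, passing an open file object to iter_parsed would work the same way, since it only needs an iterable of strings.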

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function parse_directory_listing_debug 49.0% similar

    A debug version of a directory listing parser that extracts and categorizes file entries with detailed console output for troubleshooting.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/debug_rm_parsing.py
  • function parse_datetime 49.0% similar

    Parses a datetime string in YYYY-MM-DD HH:MM:SS format into a Python datetime object, returning None if parsing fails.

    From: /tf/active/vicechatdev/CDocs/utils/__init__.py
  • function parse_datetime_v2 47.9% similar

    Parses a datetime string by normalizing fractional seconds and timezone format, then converts it to a datetime object using ISO format parsing.

    From: /tf/active/vicechatdev/rmcl/utils.py
  • function parse_datetime_v1 46.5% similar

    Converts various date representations (string, integer, pandas Timestamp) into a numpy datetime64 object using pandas datetime parsing capabilities.

    From: /tf/active/vicechatdev/patches/util.py
  • function parse_date 46.2% similar

    Parses a date string in YYYY-MM-DD format into a datetime object, returning None if parsing fails or input is empty.

    From: /tf/active/vicechatdev/CDocs/utils/__init__.py