My Python logging setup

Working on various Python projects has taught me the importance of consistent logging, especially when dealing with distributed computing frameworks like Spark. Logging is not just about keeping track of errors or information; it’s about having a detailed and systematic record of the operations to understand the flow of your program and quickly diagnose issues.

The Essentials: Timestamp and Function Name

For any logging setup, two pieces of information are absolutely crucial:

Timestamp: especially vital in distributed systems like Spark, where tasks run in parallel and the timestamps are often the only way to reconstruct the sequence of events.

Function Name: to trace back the exact point of failure or point of interest.


My Python Logging Setup

Here’s a basic setup I use to ensure both the timestamp and the function name are always logged:

import logging

logging.basicConfig(
    level=logging.INFO, 
    format='%(asctime)s [%(levelname)s] - %(funcName)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)

logger = logging.getLogger(__name__)

def sample_function():
    logger.info("This is a sample log entry.")

# Call the function so the log entry is actually emitted
sample_function()


Calling sample_function() produces output like this:

2023-08-19 14:30:25 [INFO] - sample_function: This is a sample log entry.
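
The format above records local time with one-second resolution. When log lines from several machines are merged, it can help to log UTC with milliseconds instead. A minimal sketch, assuming a dedicated logger named utc_example and a helper called build_utc_handler (both hypothetical names, not part of the setup above); it writes to an in-memory buffer only so the result can be inspected:

```python
import io
import logging
import time


def build_utc_handler(stream):
    """Return a stream handler whose timestamps are UTC with milliseconds."""
    formatter = logging.Formatter(
        fmt='%(asctime)s.%(msecs)03dZ [%(levelname)s] - %(funcName)s: %(message)s',
        datefmt='%Y-%m-%dT%H:%M:%S',
    )
    # Formatter uses time.localtime by default; switch it to UTC
    formatter.converter = time.gmtime
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    return handler


# Hypothetical logger used only for this illustration
buffer = io.StringIO()
utc_logger = logging.getLogger('utc_example')
utc_logger.setLevel(logging.INFO)
utc_logger.propagate = False  # keep the demo output out of the root handlers
utc_logger.addHandler(build_utc_handler(buffer))


def load_partition():
    utc_logger.info("partition loaded")


load_partition()
line = buffer.getvalue()
print(line, end='')
```

The captured line is shaped like YYYY-MM-DDTHH:MM:SS.mmmZ [INFO] - load_partition: partition loaded, which sorts correctly as plain text across machines.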

Different stages of development or different scenarios might require varying levels of log detail. Sometimes, you want to capture every single event (debug mode), other times just informational messages, and on certain occasions, only errors. Having the ability to dynamically adjust the log level provides this flexibility.


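import logging

def setup_logger(level='INFO'):
    """Configure root logging at the given level name and return a module logger."""
    # Translate a name such as 'debug' into logging.DEBUG; None if the name is unknown
    numeric_level = getattr(logging, level.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError(f'Invalid log level: {level}')

    # Note: basicConfig does nothing if the root logger already has handlers
    logging.basicConfig(
        level=numeric_level,
        format='%(asctime)s [%(levelname)s] - %(funcName)s: %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    return logging.getLogger(__name__)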

# Example usage:
logger = setup_logger('DEBUG')

def sample_function():
    logger.debug("This is a debug message.")
    logger.info("This is an info message.")
    logger.error("This is an error message.")

# With the level set to DEBUG, all three messages are emitted
sample_function()
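
One caveat with setup_logger: logging.basicConfig does nothing when the root logger already has handlers, so calling the function a second time will not change the level (Python 3.8+ accepts force=True to override this). To change the level while the program is running, setting the level on the root logger directly is one option. A sketch, where set_log_level and the level_demo logger are hypothetical names introduced for illustration:

```python
import logging


def set_log_level(level_name):
    """Set the root logger's level from a name such as 'DEBUG'."""
    numeric_level = getattr(logging, level_name.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError(f'Invalid log level: {level_name}')
    logging.getLogger().setLevel(numeric_level)
    return numeric_level


# A logger without its own level inherits the root logger's level
demo_logger = logging.getLogger('level_demo')

set_log_level('ERROR')
info_enabled = demo_logger.isEnabledFor(logging.INFO)    # False: INFO is below ERROR

set_log_level('debug')
debug_enabled = demo_logger.isEnabledFor(logging.DEBUG)  # True: everything is enabled
```

Because loggers created with logging.getLogger(__name__) leave their own level unset by default, they pick up the new root level without being reconfigured.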

Whenever I use AWS Glue, I configure a job parameter called LOG_LEVEL and set it to the value that suits the development stage I’m at. The logging module in Python accepts text values for log levels, such as ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, and ‘CRITICAL’.

import sys
import logging
from awsglue.utils import getResolvedOptions

def setup_logger(level='INFO'):
    numeric_level = getattr(logging, level.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError(f'Invalid log level: {level}')

    logging.basicConfig(
        level=numeric_level, 
        format='%(asctime)s [%(levelname)s] - %(funcName)s: %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    return logging.getLogger(__name__)

# Read log level from the job arguments
args = getResolvedOptions(sys.argv, ['LOG_LEVEL'])
log_level = args['LOG_LEVEL']

logger = setup_logger(log_level)

def sample_function():
    logger.debug("This is a debug message.")
    logger.info("This is an info message.")
    logger.error("This is an error message.")
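
As far as I know, getResolvedOptions raises an error when a requested argument such as LOG_LEVEL was not passed to the job, so a default can be handy. The sketch below uses the standard argparse module as a stand-in so it runs outside Glue; read_log_level is a hypothetical helper, and the sample argument lists only mimic the --KEY value style of job arguments:

```python
import argparse


def read_log_level(argv, default='INFO'):
    """Return the value of --LOG_LEVEL from argv, or the default if it is absent."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument('--LOG_LEVEL', default=default)
    # parse_known_args ignores every argument other than --LOG_LEVEL
    known, _unknown = parser.parse_known_args(argv)
    return known.LOG_LEVEL.upper()


# Assumed example argument lists, not real Glue job arguments
with_level = read_log_level(['script.py', '--JOB_NAME', 'demo', '--LOG_LEVEL', 'debug'])
without_level = read_log_level(['script.py', '--JOB_NAME', 'demo'])
```

In a Glue script, the result could be passed on as setup_logger(read_log_level(sys.argv)), so the job still starts when the parameter is not set.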

By Daniel Pradilla

I am a software architect and I help people improve their lives using technology. On this site, I try to write about how to take advantage of the opportunities that a hyperconnected world offers us: how to live simpler lives and be more productive at the same time. I invite you to check out the featured articles section. If you want to contact me or work with me, you can use the social links you will find below.
