
Log Analysis Techniques

Effective techniques for analyzing logs during testing


Log Analysis Techniques for Testers

1. Understanding Log Levels

TRACE   - Finest-grained detail, step-by-step execution flow
DEBUG   - Detailed information for debugging
INFO    - General informational messages
WARN    - Warning messages, potential issues
ERROR   - Error events, but application still running
FATAL   - Severe errors causing application shutdown

What to look for at each level:

  • ERROR/FATAL: Critical issues causing test failures
  • WARN: Potential problems or deprecated usage
  • INFO: Application flow and major events
  • DEBUG: Detailed execution flow for troubleshooting

2. Command Line Tools

grep - Find specific patterns:

# Find all errors
grep "ERROR" application.log
 
# Find errors in the last hour (with timestamp)
grep "2024-01-30 14:" application.log | grep ERROR
 
# Case-insensitive search
grep -i "exception" application.log
 
# Show 3 lines before and after match (context)
grep -A 3 -B 3 "NullPointerException" application.log
 
# Multiple patterns
grep -E "ERROR|FATAL|Exception" application.log
 
# Count occurrences
grep -c "ERROR" application.log
 
# Show line numbers
grep -n "ERROR" application.log

tail - Monitor logs in real-time:

# Show last 50 lines
tail -n 50 application.log
 
# Follow log in real-time
tail -f application.log
 
# Follow and filter
tail -f application.log | grep ERROR
 
# Follow multiple files
tail -f app.log error.log

awk - Extract specific columns:

# Log format: [timestamp] [level] [thread] message
# Extract only ERROR messages with timestamp
awk '/ERROR/ {print $1, $2, $NF}' application.log
 
# Count errors by hour
awk '/ERROR/ {print $2}' application.log | cut -d':' -f1 | sort | uniq -c
 
# Average response times (if present in log; count only matching lines)
awk '/response_time/ {sum+=$NF; n++} END {print "Avg:", sum/n}' api.log

sed - Filter and transform:

# Extract lines between timestamps
sed -n '/2024-01-30 14:00/,/2024-01-30 15:00/p' application.log
 
# Remove DEBUG lines
sed '/DEBUG/d' application.log
 
# Extract only the two bracketed fields (log level and thread)
sed 's/.*\[\(.*\)\].*\[\(.*\)\].*/\1 \2/' application.log

3. Finding Common Issues

Null Pointer Exceptions:

grep -A 10 "NullPointerException" application.log

Database Connection Issues:

grep -i "connection" application.log | grep -i "error\|timeout\|refused"

Performance Issues:

# Find slow API calls (> 1000ms)
grep "response_time" api.log | awk '$NF > 1000 {print}'
 
# Find memory warnings
grep -i "memory\|heap\|OutOfMemoryError" application.log

Authentication/Authorization Failures:

grep -E "401|403|Unauthorized|Forbidden" application.log

API Errors:

# Find 5xx status codes (word boundaries avoid matching digits inside timestamps)
grep -E "\b50[0-9]\b" application.log
 
# Find specific endpoint errors
grep "/api/users" application.log | grep ERROR

4. Log Pattern Analysis

Sample Log Entry:

2024-01-30 14:23:45.123 [ERROR] [http-nio-8080-exec-5] 
c.e.service.UserService - Failed to create user: user@example.com
java.sql.SQLException: Duplicate entry 'user@example.com'
    at com.example.dao.UserDao.create(UserDao.java:45)
    at com.example.service.UserService.createUser(UserService.java:78)

Extract useful information:

# Extract timestamp
awk '{print $1, $2}' application.log
 
# Count entries per log level
awk '{print $3}' application.log | tr -d '[]' | sort | uniq -c

# Count entries per thread
awk '{print $4}' application.log | tr -d '[]' | sort | uniq -c
 
# Extract class name
awk '{print $5}' application.log

5. Analyzing Test Failures

Find failed test context:

# Get context around test failure
grep -B 20 "Test Failed" test-output.log
 
# Extract all failed tests
grep "FAILED" testng-results.xml
 
# Find stack traces
awk '/Exception/,/^$/' application.log

Correlate test execution with application logs:

# Test started at 14:23:40
# Find application logs during test
sed -n '/2024-01-30 14:23:40/,/2024-01-30 14:23:50/p' application.log

6. Log Aggregation Patterns

Group errors by type:

# Count by exception type
grep "Exception" application.log | \
  awk '{for(i=1;i<=NF;i++) if($i ~ /Exception/) print $i}' | \
  sort | uniq -c | sort -rn

Timeline of errors:

# Errors by hour
grep ERROR application.log | \
  awk '{print $2}' | cut -d':' -f1 | sort | uniq -c

Top error messages:

grep ERROR application.log | \
  awk -F'- ' '{print $2}' | sort | uniq -c | sort -rn | head -10

7. Performance Analysis from Logs

API Response Time Analysis:

# Sample log: ... response_time=245ms ...
 
# Extract all response times
grep "response_time" api.log | \
  grep -oP 'response_time=\K[0-9]+' | \
  awk '{
    sum+=$1; 
    if($1>max) max=$1; 
    if(min=="" || $1<min) min=$1
  } 
  END {
    print "Count:", NR;
    print "Avg:", sum/NR, "ms";
    print "Min:", min, "ms";
    print "Max:", max, "ms"
  }'
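Averages hide outliers, so a rough percentile is often more useful for spotting slow requests. This sketch sorts the extracted times and indexes at the 95% mark (same assumed `response_time=<millis>` token as above; `-oP` requires GNU grep):

```shell
# Approximate p95 of response times (assumes "response_time=<millis>" tokens)
grep -oP 'response_time=\K[0-9]+' api.log | sort -n | \
  awk '{v[NR]=$1}
       END {idx=int(NR*0.95); if (idx<1) idx=1; print "p95:", v[idx], "ms"}'
```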

Slow Query Detection:

# Find queries taking > 1 second, slowest last
# (prefix each line with the time so sort has a fixed key to work on)
grep "query_time" db.log | awk '$NF > 1000 {print $NF "\t" $0}' | sort -n

8. Structured Log Analysis (JSON)

Sample JSON Log:

{"timestamp":"2024-01-30T14:23:45.123Z","level":"ERROR","thread":"http-8080-1","logger":"UserService","message":"User creation failed","userId":123,"error":"DuplicateKeyException"}

Using jq for analysis:

# Filter ERROR level logs
cat application.json | jq 'select(.level == "ERROR")'
 
# Count by log level
cat application.json | jq -r '.level' | sort | uniq -c
 
# Extract specific fields
cat application.json | jq '{time: .timestamp, msg: .message, error: .error}'
 
# Filter by time range
cat application.json | \
  jq 'select(.timestamp >= "2024-01-30T14:00:00Z" and 
             .timestamp <= "2024-01-30T15:00:00Z")'
 
# Group and count errors
cat application.json | \
  jq -r 'select(.level == "ERROR") | .error' | \
  sort | uniq -c | sort -rn

9. Creating Log Summaries

Automated test run summary:

#!/bin/bash
LOG_FILE="test-execution.log"

echo "=== Test Execution Summary ==="
echo "Total Tests: $(grep -c "Test:" "$LOG_FILE")"
echo "Passed: $(grep -c "PASSED" "$LOG_FILE")"
echo "Failed: $(grep -c "FAILED" "$LOG_FILE")"
echo "Skipped: $(grep -c "SKIPPED" "$LOG_FILE")"
echo ""
echo "=== Errors Summary ==="
echo "Total Errors: $(grep -c "ERROR" "$LOG_FILE")"
echo ""
echo "Top 5 Errors:"
grep ERROR "$LOG_FILE" | \
  awk -F'- ' '{print $2}' | \
  sort | uniq -c | sort -rn | head -5
echo ""
echo "=== Failed Tests ==="
grep "FAILED" "$LOG_FILE" | awk '{print $NF}'

10. Best Practices

Quick Log Analysis Checklist:

  1. Identify the time window of the issue

    grep "2024-01-30 14:2[3-5]" application.log
  2. Find error context (before/after)

    grep -B 5 -A 10 "ERROR" application.log
  3. Look for patterns in errors

    grep ERROR application.log | awk '{print $NF}' | sort | uniq -c
  4. Correlate across log files

    # Application logs
    grep "request_id=12345" app.log
    # Database logs
    grep "12345" db.log
  5. Check for cascading failures

    grep -A 20 "Connection refused" application.log
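Step 4 can be wrapped in a small helper so one request can be traced across every log at once. Everything here is a sketch: the `request_id=` token and the file names are assumptions about your own log format.

```shell
#!/bin/bash
# trace-request.sh -- show every line for one request id across several logs.
# Assumes entries carry a "request_id=<id>" token; adapt to your format.
REQUEST_ID="$1"
for f in app.log db.log; do
  echo "=== $f ==="
  grep "request_id=${REQUEST_ID}" "$f" || echo "(no matches)"
done
```

Run as `./trace-request.sh 12345` to see the application and database view of the same request side by side.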

Quick Reference Commands

# Essential log analysis commands
 
# Real-time monitoring
tail -f app.log | grep --line-buffered ERROR
 
# Find in time range
sed -n '/14:00:00/,/15:00:00/p' app.log
 
# Top errors
grep ERROR app.log | awk '{print $NF}' | sort | uniq -c | sort -rn
 
# Error count by hour
grep ERROR app.log | awk '{print $2}' | cut -d: -f1 | uniq -c
 
# Extract stack traces (exception line plus the indented frames under it)
awk '/Exception/{print; p=1; next} /^[[:space:]]/ && p {print; next} {p=0}' app.log
 
# Combine multiple logs
cat app-*.log | grep ERROR | sort
 
# Filter out noise
grep ERROR app.log | grep -v "SocketTimeout" | grep -v "Expected"
 
# Performance summary
grep "duration" app.log | awk '{sum+=$NF; count++} 
  END {print "Avg:", sum/count, "Count:", count}'

Tools for Advanced Analysis

  • Splunk: Enterprise log management
  • ELK Stack: Elasticsearch, Logstash, Kibana
  • Grafana Loki: Log aggregation and visualization
  • Papertrail: Cloud-based log management
  • Loggly: SaaS log management
  • grep/awk/sed: Command-line basics
  • jq: JSON log processing
  • logparser: Windows log analysis

Key Takeaways

  • Use grep with context (-A, -B, -C) for errors
  • tail -f for real-time log monitoring
  • Extract patterns with awk and sed
  • Analyze JSON logs with jq
  • Find time-specific issues with timestamp filtering
  • Look for error patterns and correlations
  • Create automated summaries for test runs
  • Keep logs from multiple sources for correlation
