I keep summarizing error logs, and one of the tasks is to sum up all occurrences of all errors, warnings, notices, etc. in the hourly error log sent via email. Instead of adding them one by one, I’ve created a small script to sum them all up automatically.
UPDATE: I’ve made a JavaScript version, available here: www.lysender.com/extra/tools/sumfirstcol. It has similar features and is aimed at smaller files, since it is just a JavaScript implementation.
Sample file
Not sure if there is a bash/awk combo equivalent for this, but what I’m trying to achieve is to sum up the first column of a file, since it is numeric. See below for a sample error log email.
5 PHP Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /data/local/... on line 285
2 PHP Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /data/local/... on line 278
2 PHP Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /data/local/... on line 268
2 PHP Notice: Undefined index: payment_method in /data/local/... on line 176
1 PHP Warning: array_key_exists() [<a href='function.array-key-exists'>function.array-key-exists</a>]: The second argument should be either an array or an object in /data/www/html/sites/... on line 358
1 PHP Warning: Invalid argument supplied for foreach() in /data/www/html/sites/... on line 415
1 PHP Warning: Invalid argument supplied for foreach() in /data/www/html/sites/... on line 395
1 PHP Warning: Invalid argument supplied for foreach() in /data/www/html/sites/... on line 372
As you can see, it is easy to sum up by hand. However, the error logs are usually hundreds of lines or more. Therefore, I’ve created a simple script to sum up the first numeric column. The result is the grand total of all error occurrences. Below is the Python script. (Not that I don’t know how to write it in PHP; I just wanted to practice more Python.)
#!/usr/bin/python
import sys
import os
import string

def parse_file(filename):
    # Open the file, hand it to the parser, and report any I/O error
    try:
        f = open(filename)
        parse_now(f)
        f.close()
    except IOError as e:
        print 'Unable to open file %s' % filename
        print e

def parse_now(f):
    total = 0
    lines = 0
    for line in f:
        lines = lines + 1
        # The first space-separated chunk is the occurrence count
        chunks = line.strip().split(' ', 2)
        n = chunks[0]
        # Lines that do not start with a number are counted but not summed
        if n.isdigit():
            total = total + int(n)
    print 'Total lines: %d' % lines
    print 'First col sum: %d' % total

if __name__ == '__main__':
    input_file = None
    if (len(sys.argv) == 2):
        input_file = sys.argv[1]
        if os.path.isfile(input_file):
            parse_file(input_file)
        else:
            print 'Input file does not exist'
    else:
        print 'sum-first-col <file>'
Save this script as sum-first-col, for example, and put it on your environment path. Be sure to set the execute bit so you can run it as a script directly from your terminal. You may run it directly or by passing it to the python executable.
# This style
sum-first-col sample-error-log.txt
# Or by this style
python sum-first-col sample-error-log.txt
And below is a sample output.
lysender@darkstar:~$ sum-first-col sample-error-log.php
Total lines: 8
First col sum: 15
lysender@darkstar:~$
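For anyone on Python 3, where the print statements in the script above are no longer valid syntax, the same idea could be sketched roughly as follows. This is only a sketch under a few assumptions: the function name sum_first_col and the sample lines are made up for illustration, and it splits on any run of whitespace instead of a single space, so tab-separated counts would be picked up too.

```python
def sum_first_col(lines):
    """Return (line_count, total) for an iterable of text lines."""
    total = 0
    count = 0
    for line in lines:
        count += 1
        # split(None, 1) splits on any whitespace and yields an
        # empty list for a blank line, so guard before indexing
        fields = line.split(None, 1)
        # isdecimal() only accepts characters that int() can parse
        if fields and fields[0].isdecimal():
            total += int(fields[0])
    return count, total


# Hypothetical, abbreviated sample lines (not a real log)
sample = [
    '5 PHP Warning: ...',
    '2 PHP Notice: ...',
    'a line without a leading count',
]
count, total = sum_first_col(sample)
print('Total lines: %d' % count)    # Total lines: 3
print('First col sum: %d' % total)  # First col sum: 7
```

Returning the two numbers instead of printing them inside the function makes it easier to reuse, for example by passing an open file object: sum_first_col(open('sample-error-log.txt')).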
Note: I’m trying to create a JavaScript version and publish it on the web so it would be a copy and paste job instead of running stuff on the terminal. Let’s see.
Update: Here is the JavaScript version.
Enjoy.