English

How To: Convert GeoEvent logs to Excel and summarize unique messages by count using Python

Summary

In some cases, the GeoEvent log files are converted to Excel for multiple purposes. For example, for filtering different types of errors, or to create a count summary based on unique error messages.

Procedure

The following steps demonstrate how to convert GeoEvent logs to Excel and summarize unique message by count using a Python script.

Note:
The script creates two files, analyzed and message_count.
  1. Import the required libraries.
import os
import pandas as pd
  1. Define the log file directory.
log_dir = r'C:\Program Files\ArcGIS\Server\GeoEvent\data\log'
log_path = [os.path.join(log_dir, log_file) for log_file in os.listdir(log_dir)]
  1. Iterate the conversion to Excel process for each log in the directory.
i=1
for path in log_path:
    if path.endswith('karaf.log'):
        df = pd.read_table(path, 
                           sep='|',  
                           header=None, 
                           names=['level', 'unknown1', 'unknown2','logger', 'message'], 
                           parse_dates=True, error_bad_lines = False, warn_bad_lines = True)
        df.to_excel(os.path.join(log_dir, 'analyzed{}.xlsx'.format
  1. Create the summarize count.
count = df['message'].value_counts()
count.to_excel(os.path.join(log_dir, 'message_count.xlsx'))
The following code sample demonstrates a full working script:
import os
import pandas as pd

# Define Log Directory
log_dir = r'C:\Program Files\ArcGIS\Server\GeoEvent\data\log'
#for log_file in os.listdir(log_dir):
    #log_path = os.path.join(log_dir, log_file)
    #print(log_path)
log_path = [os.path.join(log_dir, log_file) for log_file in os.listdir(log_dir)]

# Convert to Excel
i=1
for path in log_path:
    if path.endswith('karaf.log'):
        df = pd.read_table(path, 
                           sep='|',  
                           header=None, 
                           names=['level', 'unknown1', 'unknown2','logger', 'message'], 
                           parse_dates=True, error_bad_lines = False, warn_bad_lines = True)
        df.to_excel(os.path.join(log_dir, 'analyzed{}.xlsx'.format(i)))
        i+=1

# Summarize by count
count = df['message'].value_counts()
count.to_excel(os.path.join(log_dir, 'message_count.xlsx'))
Note:
The code can be used as a standalone script. To do so, copy the full script into a text editor, for example, Notepad++, and save the file as a .py file. To run the code, right-click the .py file and select Edit with IDLE > Select Run > Run Module.

Related Information