Python Matplotlib plot_date Wrapper
I’ve been doing a lot of work lately with data mining and presentation. Specifically, I’ve been parsing Nagios log files to gather outage statistics and do trending on the data. I came across the amazing Matplotlib in my search for Python graphing modules. I’m sure anyone who does coding and graphing in the scientific community has used this tool. I wrote a pretty useful wrapper to graph date plots and I thought I’d share it.
Earlier in the year I wrote a pretty useful Nagios log file parser to run availability reports from the command line. My parser basically duplicates the availability report in the Nagios CGI. I’ve since extended it to create PDF files of Matplotlib graphs and to build KML files of network devices from a database with color coding based on availability. It also gives a detailed summary of notifications and downtime events in the plot balloon.
I then took this a step further and created a MySQL backend to store daily statistics for all the devices that had at least one downtime event in Nagios. I’m storing cumulative data for:
- Downtime In Minutes
- Downtime Events
- Down Notifications
- Up Notifications
- Device Count
I then backfilled a year’s worth of data from the logs and got to the graphing bit. I needed to create yearly, monthly and weekly graphs, for each device type, in all five categories. I created some simple queries to get the data from the database and populate lists. Once I had those lists, I needed a generic way to create ten graphs, with two (sometimes three) plots per graph, and write them all into a single PDF file.
Here’s what I came up with…
Matplotlib depends on NumPy, so if you don’t have Matplotlib installed yet, make sure NumPy is installed first.
You’ll then need to import the following modules into your script.
#!/usr/bin/env python
from datetime import datetime
from datetime import timedelta
from datetime import date
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib.backends.backend_pdf import PdfPages
Next we need a function to create the datetime objects that Matplotlib needs to build ranges of dates for the graphs. I decided to only use keyword arguments for this function.
# Return datetime objects to feed graphs and SQL queries
def getDtObjects(**kw):
    # Build a datetime for today at midnight
    str_date_today = date.today()
    today_dt_string = "%s 00:00:00" % (str_date_today)
    dt_today = datetime.strptime(today_dt_string, '%Y-%m-%d %H:%M:%S')
    dt_yesterday = dt_today - timedelta(days=1)

    # 'in' replaces dict.has_key(), which no longer exists in Python 3
    if 'yearly' in kw:
        dt_last_year = dt_yesterday - timedelta(days=365)
        return (dt_yesterday, dt_last_year)
    if 'monthly' in kw:
        # A "month" here is four weeks (28 days)
        dt_last_month = dt_yesterday - timedelta(weeks=4)
        return (dt_yesterday, dt_last_month)
    if 'weekly' in kw:
        dt_last_week = dt_yesterday - timedelta(weeks=1)
        return (dt_yesterday, dt_last_week)
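The same date arithmetic can be done without the round trip through a string. This is only a minimal sketch, not the code my parser uses: `date_range`, `SPANS` and the `today` parameter are names made up for illustration, the fixed date is only there to make the output reproducible, and unlike getDtObjects it returns the start of the range first.

```python
from datetime import date, datetime, time, timedelta

# Hypothetical lookup table: how far back each kind of graph reaches
SPANS = {
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(weeks=4),   # four weeks, i.e. 28 days
    "yearly": timedelta(days=365),
}


def date_range(period, today=None):
    """Return (start, end) datetimes at midnight; end is yesterday."""
    today = today or date.today()
    end = datetime.combine(today, time.min) - timedelta(days=1)
    return end - SPANS[period], end


start, end = date_range("weekly", today=date(2024, 3, 15))
print(start, end)  # 2024-03-07 00:00:00 2024-03-14 00:00:00
```

Passing `today` explicitly also makes it easy to regenerate a report for a past date.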
Next is the function that does the graphing.
def createGraph(plots, dt1, dt2, pdf, **kw):
    # Define list of colors (one per plot, up to three plots per graph)
    colors = ['b', 'g', 'r']
    plot_index = 0

    # Create figure and subplot
    fig = plt.figure()
    ax1 = fig.add_subplot(111)

    # Create range of date values for the graph, one per day
    delta = timedelta(days=1)
    dates = mpl.dates.drange(dt1, dt2, delta)

    # items() replaces iteritems(), which no longer exists in Python 3
    for label, subplot in plots.items():
        ax1.plot_date(dates, subplot, linestyle='-', marker='None',
                      label=label, color=colors[plot_index])
        plot_index += 1

    # Set formatting for x axes
    if kw['scale'] == "year":
        majorLoc = mpl.dates.MonthLocator()
        minorLoc = mpl.dates.DayLocator()
    elif kw['scale'] == "month":
        # Set ticks on Fridays (the integer 5 would select Saturday)
        majorLoc = mpl.dates.WeekdayLocator(byweekday=mpl.dates.FR)
        minorLoc = mpl.dates.DayLocator()
    elif kw['scale'] == "week":
        # One major tick per day, minor ticks every hour
        majorLoc = mpl.dates.DayLocator()
        minorLoc = mpl.dates.HourLocator()

    ax1.xaxis.set_major_locator(majorLoc)
    ax1.xaxis.set_minor_locator(minorLoc)
    majorFmt = mpl.dates.DateFormatter(kw['majorFmt'])
    ax1.xaxis.set_major_formatter(majorFmt)

    # Set axes labels and graph title
    ax1.set_ylabel(kw['ylabel'])
    ax1.set_title(kw['title'])

    # Auto format and print legend
    fig.autofmt_xdate()
    ax1.legend(loc='best')

    # Save graph as a page in the PDF, then clear and close the figure
    pdf.savefig(fig)
    fig.clf()
    plt.close(fig)
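One detail of createGraph worth seeing in isolation is how drange builds the x values. Here is a small sketch, with fixed dates chosen purely for illustration, showing that the range includes the start date but stops before the end date, so a one-week range yields seven points and the last one falls on the day before the end.

```python
from datetime import datetime, timedelta

import matplotlib.dates as mdates

# Illustrative dates only; in the wrapper these come from getDtObjects
start = datetime(2024, 3, 7)
end = datetime(2024, 3, 14)

# drange is half-open: it includes start but stops before end
dates = mdates.drange(start, end, timedelta(days=1))

print(len(dates))                          # 7
print(mdates.num2date(dates[0]).date())    # 2024-03-07
print(mdates.num2date(dates[-1]).date())   # 2024-03-13
```

This is why each list passed to createGraph needs exactly as many values as there are whole days between the two datetimes.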
Here’s a code snippet giving a weekly example. It shows a multi-plot line graph for two device types: routers and switches.
if __name__ == '__main__':
    # Set name of PDF to write to
    pdf_filename = "date-plot.pdf"

    # Create datetime objects
    (dt_yesterday, dt_last_week) = getDtObjects(weekly=True)

    # You would have pre-filled these lists with your data
    router_downtime = [7, 11, 2, 1, 15, 4, 90]
    switch_downtime = [13, 1, 9, 5, 33, 17, 2]

    # Create a plot dictionary containing:
    #   Key: label for graph legend
    #   Value: list of values to plot
    downtime_plot = {'Routers' : router_downtime, 'Switches' : switch_downtime }

    # Create PDF
    pdf = PdfPages(pdf_filename)

    # Call the createGraph function
    #   1) Plot dictionary
    #   2) datetime object start
    #   3) datetime object end
    #   4) pdf
    #   5) Graph Title
    #   6) Y axes Label
    #   7) Scale - < week | month | year >
    #   8) Date format for x axes ticks
    createGraph(
        downtime_plot,
        dt_last_week,
        dt_yesterday,
        pdf,
        title="Downtime",
        ylabel="Downtime (mins)",
        scale="week",
        majorFmt="%Y-%m-%d"
    )

    # Close PDF or else it won't save
    pdf.close()
Here’s how you would call createGraph for year- and month-based graphs. This example assumes you have already populated your lists with exactly as many data points as there are days in the range: 28 for the monthly graph (getDtObjects treats a month as four weeks) and 365 for the yearly one. If you have any more or fewer data points than days in your date range, the graphs will barf on you.
if __name__ == '__main__':
    # Set name of PDF to write to
    pdf_filename = "date-plot.pdf"

    # Create datetime objects
    (dt_yesterday, dt_last_month) = getDtObjects(monthly=True)
    (dt_yesterday, dt_last_year) = getDtObjects(yearly=True)

    # Create PDF
    pdf = PdfPages(pdf_filename)

    # You've already populated router_down_alerts and switch_down_alerts
    # with 28 data points each (four weeks)
    down_alerts_plot = {'Routers' : router_down_alerts, 'Switches' : switch_down_alerts }
    createGraph(
        down_alerts_plot,
        dt_last_month,
        dt_yesterday,
        pdf,
        title="Down Notifications",
        ylabel="Notifications",
        scale="month",
        majorFmt="%Y-%m-%d"
    )

    # You've already populated router_events and switch_events
    # with 365 data points each
    events_plot = {'Routers' : router_events, 'Switches' : switch_events }
    createGraph(
        events_plot,
        dt_last_year,
        dt_yesterday,
        pdf,
        title="Down Events",
        ylabel="Events",
        scale="year",
        majorFmt="%b"
    )

    # Close PDF or else it won't save
    pdf.close()
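Since a list of the wrong length is the easiest way to break these graphs, it can help to check the lists before plotting. The following is a sketch of one way to do that; `check_plots` is a hypothetical helper, not part of the wrapper above, and it assumes both datetimes sit at midnight so that the number of whole days between them equals the number of points the graph expects.

```python
from datetime import datetime, timedelta


def check_plots(plots, dt1, dt2):
    """Raise ValueError unless every series has one value per day."""
    expected = (dt2 - dt1).days
    for label, values in plots.items():
        if len(values) != expected:
            raise ValueError(
                "%s: expected %d points, got %d"
                % (label, expected, len(values))
            )
    return expected


# Illustrative data: the second list is one value short on purpose
dt_yesterday = datetime(2024, 3, 14)
dt_last_week = dt_yesterday - timedelta(weeks=1)
plots = {
    'Routers': [7, 11, 2, 1, 15, 4, 90],
    'Switches': [13, 1, 9, 5, 33, 17],
}

try:
    check_plots(plots, dt_last_week, dt_yesterday)
except ValueError as err:
    print(err)  # Switches: expected 7 points, got 6
```

Calling something like this right before createGraph turns a confusing plotting error into a message that names the offending list.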
That’s it. If you don’t like the formatting of the graphs, you can play around with the scale handling in createGraph. Just tweak the locators below; the mpl.dates module has plenty of others to choose from.
# Set formatting for x axes
if kw['scale'] == "year":
    majorLoc = mpl.dates.MonthLocator()
    minorLoc = mpl.dates.DayLocator()
elif kw['scale'] == "month":
    # Set ticks on Fridays (the integer 5 would select Saturday)
    majorLoc = mpl.dates.WeekdayLocator(byweekday=mpl.dates.FR)
    minorLoc = mpl.dates.DayLocator()
elif kw['scale'] == "week":
    majorLoc = mpl.dates.DayLocator()
    minorLoc = mpl.dates.HourLocator()
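If you would rather not pick locators by hand, Matplotlib can choose tick positions for you. This is a sketch of that alternative, not what my wrapper does: it swaps in AutoDateLocator and ConciseDateFormatter, assumes a reasonably recent Matplotlib (ConciseDateFormatter arrived in 3.1), and renders into an in-memory buffer with made-up data just to show that it runs.

```python
import io
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Illustrative data: seven daily values
start = datetime(2024, 3, 7)
dates = [start + timedelta(days=n) for n in range(7)]
values = [7, 11, 2, 1, 15, 4, 90]

fig, ax = plt.subplots()
ax.plot(dates, values, label="Routers")

# Let Matplotlib pick tick positions and compact labels for the range
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
ax.legend(loc="best")

# Render to memory instead of a file
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)
```

The trade-off is less control: you no longer decide that ticks land on Fridays or on month boundaries, but the same code works for weekly, monthly and yearly ranges.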