Integrating Graphite with NAV

NAV uses Graphite to store, retrieve and graph time-series data. Installing Graphite itself is out of scope for this guide. Assuming you already have a working Graphite installation, you need to change some configuration options in both Graphite and NAV to ensure that your time-series data is stored and retrieved properly:

  1. NAV needs to know the details of where to send data.

  2. NAV needs to know how to retrieve time-series data from graphite-web (Graphite’s front-end web service for retrieving stored data).

  3. carbon-cache (Graphite’s backend component for receiving and storing time-series data) needs to know how it should store data received from NAV.

Configuring NAV

NAV must be configured with the IP address and port of your Graphite installation’s Carbon backend, and with the URL of the graphite-web front end used for graphing. Both are set in the graphite.conf configuration file.

Note

NAV requires the Carbon backend’s UDP listener to be enabled, as it will only transmit metrics over UDP.

For a simple, local Graphite installation, you may not need to touch this configuration file at all, but at its simplest it looks like this:

[carbon]
host = 127.0.0.1
port = 2003

[graphiteweb]
base = http://localhost:8000/
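
As a rough illustration of what the two sections mean, the following Python sketch reads the example configuration above, formats a data point in Carbon’s plaintext line format ("path value timestamp"), and builds a graphite-web render URL for the same metric. It is not NAV’s own code. The metric name nav.test.example and all function names are hypothetical, chosen only for this example.

```python
import configparser
import socket
import time
from urllib.parse import urlencode, urljoin

# Example graphite.conf contents, copied from the listing above.
GRAPHITE_CONF = """
[carbon]
host = 127.0.0.1
port = 2003

[graphiteweb]
base = http://localhost:8000/
"""


def load_settings(conf_text):
    """Return ((carbon_host, carbon_port), graphiteweb_base)."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    address = (parser.get("carbon", "host"), parser.getint("carbon", "port"))
    return address, parser.get("graphiteweb", "base")


def format_metric(path, value, timestamp):
    """Encode one data point as a Carbon plaintext line."""
    return f"{path} {value} {int(timestamp)}\n".encode("ascii")


def send_metric(address, path, value, timestamp=None):
    """Send one data point to Carbon's UDP listener (fire and forget)."""
    if timestamp is None:
        timestamp = time.time()
    payload = format_metric(path, value, timestamp)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)
    return payload


def render_url(base, target, since="-1h"):
    """Build a graphite-web render API URL that returns JSON data."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return urljoin(base, "render") + "?" + query
```

Calling send_metric(load_settings(GRAPHITE_CONF)[0], "nav.test.example", 42) would emit a single UDP datagram. Because UDP gives no delivery confirmation, fetching the URL from render_url() afterwards is one way to check that the data point actually arrived. Note that a test metric like this causes Carbon to create a new Whisper file.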

Configuring Graphite

You will need to make some configuration changes to carbon-cache before letting NAV send data to Graphite:

  1. First and foremost, you will need to enable the UDP listener in the configuration file carbon.conf (the ENABLE_UDP_LISTENER option in the [cache] section).

    For performance reasons, Carbon also limits the number of new Whisper files that can be created per minute. This limit is fairly low by default, and when NAV is started for the first time, it may send a large number of new metrics in a short time. With a limit of 50, it will take a long time before all of the Whisper files have been created. You might want to increase the MAX_CREATES_PER_MINUTE option, or temporarily set it to inf.

  2. You should add the suggested storage-schema configurations for the various nav prefixes listed in etc/graphite/storage-schemas.conf:

    # Recommended Whisper schema definitions for using Graphite with NAV.
    #
    # If you already have a Graphite installation you wish to use, use these
    # examples to adapt your own config.
    #
    
    # Carbon's internal metrics. This entry should match what is specified in
    # CARBON_METRIC_PREFIX and CARBON_METRIC_INTERVAL settings
    [carbon]
    pattern = ^carbon\.
    retentions = 60:90d
    
    # Statistics - storing data for a long time is more important than short
    # intervals.
    [nav-statistics]
    pattern = ^nav\.stats\.
    retentions = 300s:10d, 1h:100d, 1d:6y
    
    # NAV device/system metrics
    [nav-system]
    pattern = ^nav\..*(system|cpu|memory|services|ipdevpoll|sensors)\.
    retentions = 60s:1d, 300s:7d, 30m:12d, 2h:50d, 1d:600d
    
    # NAV multicast metrics
    [nav-multicast]
    pattern = ^nav\.multicast\.groups\.
    retentions = 60s:1d, 300s:7d, 30m:12d, 2h:50d, 1d:600d
    
    # NAV pping metrics. The default pping configuration pings in 20 second
    # intervals; the most detailed retention archive should match up with this.
    [nav-pping]
    pattern = ^nav\..*\.ping\.
    retentions = 20s:6h, 60s:1d, 300s:7d, 30m:12d, 2h:50d, 1d:600d
    
    # NAV IP prefix utilization metrics. Default ARP/ND collection interval is
    # 30 minutes.
    [nav-prefix]
    pattern = ^nav\.prefixes\.
    retentions = 30m:30d, 2h:90d, 6h:600d
    
    # NAV generic metric retention archive
    [nav-generic]
    pattern = ^nav\.
    retentions = 300s:7d, 30m:12d, 2h:50d, 1d:600d
    
    # A not-very-sane default for any metric not caught by the above.
    [default_1min_for_1day]
    pattern = .*
    retentions = 60s:1d
    

    The highest precision retention archives are the most important ones here, as their data point interval must correspond with the collection intervals of various NAV processes. Other than that, the retention periods and the precision of any other archive can be freely experimented with.

    Remember, these schemas only apply to new Whisper files at the time they are created. Do not start NAV until the schemas have been configured; otherwise, the Whisper files will be created with the global Graphite defaults, your data may be mangled or inaccurate, and your graphs will be spotty.

  3. You should add the suggested storage-aggregation configurations listed in the file etc/graphite/storage-aggregation.conf:

    # Recommended Whisper aggregation methods for using Graphite with NAV.
    #
    # If you already have a Graphite installation you wish to use, use these
    # examples to adapt your own config.
    #
    
    # ipdevpoll jobs don't necessarily run very often; an xFilesFactor of 0 ensures
    # we roll up everything into the lower precision archives no matter how often
    # runs are logged.
    [ipdevpoll]
    pattern = ^nav\..*\.ipdevpoll\..*runtime$
    xFilesFactor = 0
    aggregationMethod = average
    
    # Any kind of event counter NAV uses will log the number of events since the
    # last time the metric was logged, so the appropriate aggregation is to sum the
    # counts.
    [event-counts]
    pattern = ^nav\..*-count$
    xFilesFactor = 0
    aggregationMethod = sum
    
    # NAV stores the raw octet/packet/error/etc counters of interfaces in Graphite.
    # Since these counter values are absolute, and the rates are calculated using
    # the difference of the port counter and the time difference between two
    # counter numbers, the appropriate aggregation method is to pick the
    # last counter value when rolling up.
    [port-counters]
    pattern = ^nav\..*ports\..*
    aggregationMethod = last
    

    These will ensure that time-series data sent to Graphite by NAV is aggregated properly when Graphite rolls it up into lower-precision archives.
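
To make the effect of xFilesFactor and aggregationMethod more concrete, here is a simplified Python sketch of how a single roll-up window could be reduced to one lower-precision data point. It only approximates Whisper’s behaviour and is not its actual code. The rollup() function and the sample values are hypothetical, and the default xFilesFactor of 0.5 is assumed from Whisper’s documented default.

```python
def rollup(points, method="average", x_files_factor=0.5):
    """Reduce one window of data points to a single value.

    None marks an interval for which no value was stored. The result is
    None when too few points are known to satisfy x_files_factor.
    """
    known = [p for p in points if p is not None]
    if not known:
        return None  # nothing to aggregate at all
    if len(known) / len(points) < x_files_factor:
        return None  # too sparse: the lower-precision slot stays empty
    if method == "average":
        return sum(known) / len(known)
    if method == "sum":
        return sum(known)
    if method == "last":
        return known[-1]
    raise ValueError(f"unsupported aggregation method: {method}")


# An ipdevpoll job that logged a single runtime in this window.
sparse_runtimes = [None, None, 12.0, None, None, None]
print(rollup(sparse_runtimes, "average", 0.5))  # dropped: only 1 of 6 known
print(rollup(sparse_runtimes, "average", 0))    # kept: xFilesFactor = 0

# Event counts are deltas, so they are summed when rolled up.
print(rollup([3, 0, 2, None, 1], "sum", 0))

# Port counters are absolute, so only the last value is meaningful.
print(rollup([1000, 1500, 2100, None], "last"))
```

With the default factor, the sparse ipdevpoll window would roll up to nothing at all, which is why the [ipdevpoll] and [event-counts] entries set xFilesFactor to 0.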

Restart carbon-cache for these changes to take effect before you add devices to monitor in your NAV installation.
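
Before restarting, it can be useful to check which storage-schemas.conf entry a given metric will fall under, since Carbon uses the first entry whose pattern matches and the order of the entries therefore matters. The Python sketch below mimics that lookup for the recommended entries from step 2 and reports the data point interval of the highest-precision archive. It is an approximation for illustration only. The metric names and function names are hypothetical.

```python
import re

# (section name, pattern, retentions), in the same order as the file.
SCHEMAS = [
    ("carbon", r"^carbon\.", "60:90d"),
    ("nav-statistics", r"^nav\.stats\.", "300s:10d, 1h:100d, 1d:6y"),
    ("nav-system",
     r"^nav\..*(system|cpu|memory|services|ipdevpoll|sensors)\.",
     "60s:1d, 300s:7d, 30m:12d, 2h:50d, 1d:600d"),
    ("nav-multicast", r"^nav\.multicast\.groups\.",
     "60s:1d, 300s:7d, 30m:12d, 2h:50d, 1d:600d"),
    ("nav-pping", r"^nav\..*\.ping\.",
     "20s:6h, 60s:1d, 300s:7d, 30m:12d, 2h:50d, 1d:600d"),
    ("nav-prefix", r"^nav\.prefixes\.", "30m:30d, 2h:90d, 6h:600d"),
    ("nav-generic", r"^nav\.", "300s:7d, 30m:12d, 2h:50d, 1d:600d"),
    ("default_1min_for_1day", r".*", "60s:1d"),
]

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400,
                "w": 604800, "y": 31536000}


def to_seconds(spec):
    """Convert '20s', '30m', '2h' or a bare number of seconds to an int."""
    spec = spec.strip()
    if spec.isdigit():
        return int(spec)
    return int(spec[:-1]) * UNIT_SECONDS[spec[-1]]


def match_schema(metric):
    """Return (name, retentions) of the first entry matching the metric."""
    for name, pattern, retentions in SCHEMAS:
        if re.search(pattern, metric):
            return name, retentions
    return None


def finest_step(retentions):
    """Data point interval, in seconds, of the highest-precision archive."""
    precision, _duration = retentions.split(",")[0].split(":")
    return to_seconds(precision)


for metric in ("nav.devices.example-sw.ping.roundTripTime",
               "nav.devices.example-sw.ports.Gi1_1.ifInOctets"):
    name, retentions = match_schema(metric)
    print(metric, "->", name, f"({finest_step(retentions)}s)")
```

If the catch-all [nav-generic] entry were listed before [nav-pping], the ping metric in this example would be stored at 300-second intervals instead of 20-second intervals, which no longer matches the default pping collection interval.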