I've been working on a script that some of you might find useful if you want to get a basic level of network monitoring up and running very quickly. I spent ages setting up probe discovery jobs for all our locations, then going through all the discovered devices to set their device type, give them a name, tick the "Send alert when unit is down" checkbox and specify an alert template, only to find that (a) tickets don't close automatically when devices come back online, and (b) we got tickets about MAC addresses changing due to ARP tables on routers. The result was a lot of noise on the service board.
Essentially, all we care about at the moment is: "Is this switch pinging? Yes or no."
So I ended up writing a script which I schedule on probes every 15 minutes. The script pulls a SQL dataset of network devices that have the "Send alert when unit is down" checkbox ticked, along with whether the probe scan has detected each one as online. If a device is down, the script performs a ping test against the device IP (the packet count is configurable via an EDF: 1, 5, 10 or 20). A ticket is then created with the results of the ping test as a comment. If the device is found to be online during a later run of the script, the ticket is closed.
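To make the flow above concrete, here is a rough Python sketch of the open/close decision. It is not the actual script: the `Device` class, `plan_actions`, and the idea of tracking open tickets by IP are all assumptions made for illustration, and the ping test is passed in as a plain function so the logic can be tried without a network.

```python
from dataclasses import dataclass


@dataclass
class Device:
    ip: str
    name: str
    online: bool  # what the probe scan last reported (hypothetical field)


def plan_actions(devices, open_tickets, ping):
    """Decide which tickets to open and which to close for one run.

    devices      -- devices with the "Send alert when unit is down" box ticked
    open_tickets -- set of IPs that already have an open ticket (assumption)
    ping         -- callable taking an IP and returning the ping test output
    """
    to_open = {}   # ip -> ping output to attach as the ticket comment
    to_close = []  # ips whose ticket can be closed
    for dev in devices:
        has_ticket = dev.ip in open_tickets
        if dev.online:
            if has_ticket:
                to_close.append(dev.ip)
        elif not has_ticket:
            # Device is down and has no ticket yet: run the ping test and
            # keep its output for the new ticket's comment.
            to_open[dev.ip] = ping(dev.ip)
    return to_open, to_close
```

Skipping devices that already have an open ticket is my guess at how duplicate tickets are avoided between 15-minute runs; the original script may handle that differently.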
You can turn on the alerting for devices very easily:
UPDATE networkdevices SET alert = 1 WHERE devicetype = X
Where X is:
4 = NAS Disk
5 = VoIP device
10 = Network switch
11 = WiFi access point
12 = Multimedia device
13 = Home automation
You don't need to set an alert template on devices for the script to work. It just works on the "Send alert when unit is down" checkbox.
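If you want to tick the box for several device types in one go, the same UPDATE can be written with an IN list. The sketch below is Python with the standard `sqlite3` module purely as a stand-in so it can be run anywhere; the real database is not SQLite, and the `enable_alerts` helper is a made-up name. Only the `networkdevices` table and the `alert` and `devicetype` columns come from the statement above.

```python
import sqlite3


def enable_alerts(conn, device_types):
    """Set alert = 1 for every device whose type is in device_types.

    Returns the number of rows updated. conn is any DB-API connection
    that uses "?" placeholders (sqlite3 here, as a stand-in).
    """
    device_types = tuple(device_types)
    if not device_types:
        return 0  # nothing to do, and avoids building an empty IN ()
    placeholders = ",".join("?" for _ in device_types)
    cur = conn.execute(
        f"UPDATE networkdevices SET alert = 1 "
        f"WHERE devicetype IN ({placeholders})",
        device_types,
    )
    conn.commit()
    return cur.rowcount
```

For example, `enable_alerts(conn, (10, 11))` would turn alerting on for network switches and WiFi access points only.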
UPDATE 13th Dec: it now supports flapping detection and creates a ticket when a device repeatedly goes down and comes back up over the course of a day. This is configurable via an EDF.
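The post doesn't say how flapping is detected, so the following is only one plausible way to do it: count how many times a device's online state changes within a time window and compare that to a threshold. The function name, the 24-hour window and the threshold of 4 are all assumptions, standing in for whatever the EDF actually configures.

```python
from datetime import datetime, timedelta


def is_flapping(samples, now, window=timedelta(hours=24), max_transitions=4):
    """Return True if the device changed state too often within the window.

    samples -- list of (timestamp, online) tuples, oldest first, one per
               script run (hypothetical history; the real script's storage
               is not described in the post)
    """
    # Keep only the states observed inside the window.
    recent = [online for ts, online in samples if now - ts <= window]
    # A transition is any pair of consecutive samples with different states.
    transitions = sum(1 for a, b in zip(recent, recent[1:]) if a != b)
    return transitions >= max_transitions
```

With a 15-minute schedule this would look at up to 96 samples per device per day, so a device that drops and recovers twice already counts as 4 transitions.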
If anyone's interested, drop me an email and I'm happy to share.
This idea was very interesting to me for its ease of use. I went ahead and mocked up a pair of scripts to set up and monitor network probes, based on a new group whose membership comes from the autojoin search "Agents\Agent Probe".
Under the location, in a new tab, I've added 15 EDFs, one for each device type. The setup script checks these EDFs, then sets the "Send alert when unit is down" setting for each device type. This script is set to run daily against the group. I'd like to add some logic to check whether the script has already been run, maybe a "setup complete" EDF similar to onboarding. The monitor script gets a SQL dataset with the columns IPAddress, DeviceType and DeviceName, loops through each row, attempts to ping the device and, if the ping fails, creates a ticket.
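For anyone who wants to try the monitor loop outside of the scripting engine, here is a minimal Python sketch of the same idea. The row keys match the columns named above; everything else (`ping`, `find_down_devices`, the `runner` parameter, the default of 5 packets) is a hypothetical name or value for illustration, and ticket creation is left out since it depends on your PSA.

```python
import platform
import subprocess


def ping(ip, count=5, runner=subprocess.run):
    """Ping ip with the OS ping command; return (success, raw output)."""
    # Windows ping uses -n for the packet count, most other systems use -c.
    flag = "-n" if platform.system() == "Windows" else "-c"
    result = runner(["ping", flag, str(count), ip],
                    capture_output=True, text=True)
    # Caveat: the exit code alone can be misleading on Windows, so a real
    # script may want to inspect the output text as well.
    return result.returncode == 0, result.stdout


def find_down_devices(rows, ping_fn=ping):
    """Loop over dataset rows and collect the ones that fail the ping test.

    rows -- dicts with IPAddress, DeviceType and DeviceName keys
    """
    down = []
    for row in rows:
        ok, output = ping_fn(row["IPAddress"])
        if not ok:
            # Keep the ping output so it can go into the ticket body.
            down.append({**row, "ping_output": output})
    return down
```

Passing the ping function in as a parameter makes it easy to swap in a fake one when testing the loop without touching the network.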
This is a very basic script and it's untested. I'm still interested in seeing your script, as I'm curious about the logic you added for flapping and how you went about closing tickets when devices come back online. If anybody has any input, it would be appreciated! Thanks
EDIT: I uploaded the wrong script version; it omitted a setup script step.
Disclaimer: You are downloading a script that's only been through limited testing - download at your own risk