After running a search, you can click on any graph in the results. You can do the same from the graph browser, reports, alarms, or even some widgets. Clicking on a graph takes you to the troubleshooter.

Once in the troubleshooter, you will see various time and graph controls on the right of the screen. On the left of the screen is a list of actions that you can take, followed by a details section.


TROUBLESHOOT: Show Related Graphs

When you click on Show Related Graphs, the related graphs appear in the tabs at the bottom of the graph. If you click on any of the related graphs, it becomes the main graph. You can also zoom in to a specific period on the graph, and all the related graphs zoom as well. You can zoom out by right-clicking on the graph.


The sub-elements tab shows the sub-element graphs, all of which still relate to the same interface.


Similar graphs for this device is different: it shows exactly the same graph, but for all the other interfaces on the device. This is useful because a spike on one interface usually has a corresponding spike on another interface, as devices don’t exist in isolation. You can therefore trace where the traffic goes through the network by following the incoming and outgoing spikes on consecutive interfaces.

Similar graphs in this view works the same way, except that it shows the same traffic graphs for the entire view.


Similar graphs in the segment shows you graphs that have the same circuit identifier. Point-to-point circuits have an A and a B side and are usually identified by the same circuit identifier. This could be issued by a third-party telecoms provider, or it could be an internal reference, a VLAN number, or any other made-up identifier.


Similarly, by tracing a traffic spike from one side of the circuit to the other, or through a specific VLAN, you can troubleshoot the origin or destination of the spike. For this to be effective, you have to set up circuit identifiers in Iris.


Service tags are another way to group graphs. You can give each mnemonic a specific service tag. If you have a CRM system that uses an identifier per customer service, you can use that. Alternatively, you can use the tag purely for information, for example to record whether this is an ADSL service, a fibre-to-home service, or something similar. When you click this tab, all graphs with the same service tag are grouped together. There is no hard and fast rule for service tags, and some Iris implementations don’t use them at all.


Let's quickly look at the time controls as they are applicable here before continuing with the Troubleshoot options.


TIME CONTROLS

You can change the start and end time and all the graphs will update as you adjust the time controls.


You can set the time to Now, and you can click update so that the graphs update to the new times. When you click reset, the time controls return to what they were when you first opened the troubleshooter.


TROUBLESHOOT: Show Configuration

Note that for this to work, you need to have Config backup enabled for your device.

If you click config of the device, you get the config for the whole device.


If you click inventory for the device, you will see inventory data such as the serial numbers and part numbers of all the parts that have been installed, as well as the chassis serial numbers.


The raw backup data is essentially the unprocessed data that comes back from the device. If there is a problem with parsing the config or with extracting some information such as a serial number, checking the raw data will help you track it down.


Revision history gives you the list of all the past config changes so you can see what has changed.


TROUBLESHOOT: Show Events

If you click show events, you will see a list of all the events that happened in the time frame you are looking at. Often, you will see a spike or a drop in traffic that you can tie directly to some event that happened on the network.

Directly related events are those tied directly to this interface. You can broaden your search to all events on the device. An example of where this is relevant is when somebody made a config change that caused a drop in traffic. You won’t find that under the interface itself; you will find it under the device logs. As a rule, the broader your event search, the narrower your time frame needs to be. This may help you find the problem.


TROUBLESHOOT: Show Flow Data

Another incredibly useful tool is NetFlow data. Again, this has to be enabled on the device and interface for you to see it.

It gives you a breakdown of the traffic that you see on the graph: AS number, protocol, application, source IP, destination IP, conversation (which is a combination of source and destination IP), and even the raw detail. Note that you can only see the detail if you have selected a small enough time period, as pulling detail for longer periods puts too much strain on the system and is too much data to read anyway. If you click on any of these, you go through to the flow explorer, which allows you to do further debugging.
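
To make the conversation breakdown concrete, here is a minimal Python sketch that totals bytes per source and destination pair. The record layout and the sample values are assumptions made for illustration; this is not the Iris flow format, nor how Iris itself computes the breakdown.

```python
from collections import Counter

# Hypothetical flow records: (source IP, destination IP, bytes). Illustrative only.
flows = [
    ("10.0.0.1", "10.0.0.2", 1200),
    ("10.0.0.1", "10.0.0.3", 300),
    ("10.0.0.1", "10.0.0.2", 800),
    ("10.0.0.4", "10.0.0.2", 500),
]

def top_conversations(records, limit=3):
    """Sum bytes per (source, destination) pair and return the largest first."""
    totals = Counter()
    for src, dst, nbytes in records:
        totals[(src, dst)] += nbytes
    return totals.most_common(limit)

conversations = top_conversations(flows)
for (src, dst), nbytes in conversations:
    print(f"{src} -> {dst}: {nbytes} bytes")
```

Grouping on only the source or only the destination instead of the pair would give the source IP and destination IP breakdowns respectively.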


TROUBLESHOOT: Time Shift

Time Shift is an infrequently used Iris feature, but it is quite useful. It takes the pattern of the graph, shifts it by one day, and shows the comparison between the two patterns. It works best for patterns that are very consistent, such as interfaces that move high volumes of traffic. At the top, you will see a line that is red or green. This shows whether the deviation in pattern was acceptable or not. The more consistent the pattern, the narrower the allowed deviation; the more erratic the pattern, the more lenient the allowed deviation. You can also use Time Shift to compare by week and by month.
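
The idea of an allowed deviation that depends on how consistent the pattern is can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the `slack` and `floor` parameters, and the tolerance rule are all hypothetical, and the source does not describe how Iris actually calculates the deviation.

```python
from statistics import mean

def day_deviation(day, previous_day):
    """Mean absolute difference between two days, relative to the previous day's mean."""
    baseline = mean(previous_day)
    if baseline == 0:
        raise ValueError("previous day has no traffic to compare against")
    return mean(abs(a - b) for a, b in zip(day, previous_day)) / baseline

def time_shift_check(days, slack=2.0, floor=0.05):
    """Compare the latest day with the day before it.

    days: equal-length lists of samples, one list per day, oldest first.
    Returns (deviation, allowed, acceptable). The tolerance rule is an
    assumption: it scales with how much earlier days differed from each other.
    """
    if len(days) < 3:
        raise ValueError("need at least three days of samples")
    history = [day_deviation(days[i], days[i - 1]) for i in range(1, len(days) - 1)]
    allowed = max(floor, slack * mean(history))
    deviation = day_deviation(days[-1], days[-2])
    return deviation, allowed, deviation <= allowed

# Hypothetical samples: identical days, then a day with one large spike.
normal_day = [100, 200, 300, 200]
spike_day = [100, 200, 600, 200]
steady = time_shift_check([normal_day, normal_day, normal_day, normal_day])
spiked = time_shift_check([normal_day, normal_day, normal_day, spike_day])
print(steady[2], spiked[2])
```

In this sketch a history of near-identical days produces a narrow tolerance, so a single spike is flagged, while a history of erratic days would widen the tolerance.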


TROUBLESHOOT: Live Graph

Live Graph is a very useful feature when you need to do real-time troubleshooting on a graph. It lets you refresh a particular data source on the router as often as every 5 seconds.

If refreshing every 5 seconds is too frequent, you can select 10 or 15 seconds instead.


TROUBLESHOOT: Export Data

You can also export the data in PDF or CSV format.


CSV gives you the raw values that make up the graph. You can then use Excel to do your own calculations on the data.
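
If you prefer a script to a spreadsheet, the same kind of calculation can be sketched in Python. The column names `timestamp` and `bps_in` and the sample rows below are assumptions made for illustration; check the header row of your actual export and adjust accordingly.

```python
import csv
import io

# Stand-in for an exported file. The column names "timestamp" and "bps_in"
# are assumptions; adjust them to match the header of your actual export.
sample_export = io.StringIO(
    "timestamp,bps_in\n"
    "2024-01-01 00:00,1000\n"
    "2024-01-01 00:05,3000\n"
    "2024-01-01 00:10,2000\n"
)

def summarise(csv_file, column):
    """Return (average, maximum) of one numeric column in a CSV file object."""
    values = [float(row[column]) for row in csv.DictReader(csv_file)]
    if not values:
        raise ValueError("no data rows found")
    return sum(values) / len(values), max(values)

average, peak = summarise(sample_export, "bps_in")
print(f"average={average:.0f} peak={peak:.0f}")
```

To read a real export, replace `sample_export` with a file opened via `open(path, newline="")`.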


PDF is useful when you need to send something to a customer. It shows all the related graphs, and you can include the NetFlow data to produce a neat report that you can send off via email.


DETAILS

The details section shows you the device and interface name. If you have permission, you can click on the device to edit it. You can also see all the mnemonic details, including the monitored IP, and edit the mnemonic to change any of them.


GRAPH CONTROLS

This panel contains the graph controls, which determine what is displayed on the graph image. Show timeframe adds the timeframe to the bottom of the image. This is useful if you want to send just an image in a mail, as you can simply drag the image from your browser into the mail. Be aware that you need to drag it from the bottom of the image; if you try to drag it from the zoom area, it will try to zoom.


Show peak traffic is the max of the max: it shows the peak values hidden inside the averages. Each averaged point on the graph gives you the average for its period, not the maximum value reached, so enabling this control shows you the peak value within each average.
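
The difference between an averaged point and the peak inside it can be sketched in a few lines of Python. The sample values and the bucket size are hypothetical, and the sketch illustrates the general idea of averaging versus keeping the maximum rather than how Iris stores its data.

```python
def consolidate(samples, bucket_size):
    """Group raw samples into buckets and return (average, maximum) per bucket."""
    buckets = [samples[i:i + bucket_size] for i in range(0, len(samples), bucket_size)]
    return [(sum(bucket) / len(bucket), max(bucket)) for bucket in buckets]

# Hypothetical raw samples containing one short burst.
raw = [10, 12, 11, 95, 9, 10, 11, 10]
points = consolidate(raw, 4)

highest_average = max(average for average, _ in points)
peak = max(maximum for _, maximum in points)  # the "max of the max"
print(points, highest_average, peak)
```

Here the burst of 95 is flattened to an average of 32 for its bucket, which is why the averaged line alone understates the peak.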


Another useful control is business hours. This shades the areas of the graph that are excluded, including any maintenance that may have been added. In this example, the business hours average is higher than the daily average. Nightly backups also tend to skew the averages, so it is useful to exclude them.
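
As a rough illustration of why the two averages differ, the Python sketch below computes an overall average and a business hours average over the same samples. The business hours window (Monday to Friday, 08:00 to 17:00), the function names, and the traffic values are all assumptions, and maintenance windows are not modelled.

```python
from datetime import datetime

def split_averages(samples, start_hour=8, end_hour=17):
    """Return (overall_average, business_hours_average).

    samples: list of (datetime, value) pairs.
    Business hours are assumed to be Monday to Friday, from start_hour up to
    (but not including) end_hour; maintenance windows are not modelled.
    """
    business = [
        value for when, value in samples
        if when.weekday() < 5 and start_hour <= when.hour < end_hour
    ]
    if not samples or not business:
        raise ValueError("need at least one sample inside business hours")
    overall = sum(value for _, value in samples) / len(samples)
    return overall, sum(business) / len(business)

def hypothetical_load(hour):
    """Made-up traffic: a backup at 02:00, busy office hours, quiet otherwise."""
    if hour == 2:
        return 400
    return 100 if 8 <= hour < 17 else 10

# One hourly sample for Wednesday 3 January 2024.
samples = [(datetime(2024, 1, 3, hour), hypothetical_load(hour)) for hour in range(24)]
overall, business = split_averages(samples)
print(overall, business)
```

With these made-up values, the backup at 02:00 lifts the overall average while the quiet hours pull it down, so only the business hours figure reflects what users experience during the working day.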


Trends are a good way of seeing the rate at which traffic is increasing. For the trends to be reasonably accurate, you need to select a time period of at least 6 months to a year.
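
One common way to estimate a growth rate is a least-squares straight line through the data, sketched below in Python. Whether Iris uses a linear fit is not stated here, so treat this as an assumption; the monthly values are hypothetical.

```python
def linear_trend(values):
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Hypothetical monthly average traffic in Mbps over six months.
monthly_avg_mbps = [100, 104, 108, 112, 116, 120]
slope, intercept = linear_trend(monthly_avg_mbps)
print(f"growth of about {slope:.1f} Mbps per month")
```

With only a few weeks of data, a single busy period can dominate the slope, which is why a longer time period gives a more reliable trend.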


Most of these controls are also available as report metrics, so you can just as easily present these stats in a report.


