Dashboard - Standard view

FAQ posts are linked from the application at ocp.outages.io and therefore appear in no particular order in this forum. This is a work in progress.

Post by KelAuth

The Standard view is an optional view that members can pick from.

The Standard view tries to show everything needed on one single page while trying not to make it overwhelming at the same time.
If the Light view does not provide enough information, the Standard view should do the trick in helping to troubleshoot Internet connection problems.
To switch between the two modes, click on the Set dashboard option and pick the preferred view.

At a glance
The right-hand column, which we call the 'At a glance' section, contains basic information about the agent along with useful network details.
An Options area shows available options, some of which only come with hardware agents.

A Links section is also shown if there is a camera connected to the hardware agent and if DDNS is enabled. This allows members to click on the link to be taken directly to their camera or to their network using the DDNS URL that was set.

When reports are in Extended mode, additional items become available, such as Remote Controls that give direct access to camera settings.

For Organizations, a remote access service (RAS) link is shown for both quick/direct access and for configuration.
RAS, when configured, gives the admin secured, encrypted direct access to any device on the LAN where the agent is installed.
No need to open firewall ports, nothing seen by hackers.

When the heartbeat is green, the agent is actively communicating with Outages.io. If the agent cannot reach Outages.io or there is an outage in progress, the heartbeat turns red.

This is mainly used as a quick visual prompt to know whether agents are communicating, which is useful when troubleshooting remotely.
Below this area are helpful tips that are changed now and then to help members.

Neighborhood map
The neighborhood map shows if others in the area are experiencing related problems with the same Internet provider.
Please see this post for more information. viewtopic.php?f=34&t=89

Recent events
The recent events section gives a brief summary of the last few outages and communications between the agent and Outages.io.

Important: If there is no message under 'Agent communications' called 'Agent sending updated hops to the Outages.io network', the agent is not able to communicate properly with Outages.io and is therefore unable to log and confirm outages. Typically, this is because ICMP is blocked on the PC the agent is running on, at the firewall, or possibly even by the Internet provider. This must be solved.
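One rough way to check whether ICMP is getting out at all is to run the operating system's own ping tool from the PC where the agent is installed. The sketch below is only an illustration and is not part of the agent; the function names are hypothetical, and a ping return code is only a rough indicator (some systems report success for certain error replies).

```python
import platform
import subprocess

def build_ping_command(host, count=3, system=None):
    """Build a ping command line for the current (or given) operating system."""
    system = system or platform.system()
    # Windows ping uses -n for the count; Linux and macOS use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]

def icmp_reachable(host, count=3):
    """Return True if the system ping tool exits successfully for host."""
    result = subprocess.run(
        build_ping_command(host, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

# Example (needs network access, so it is left commented out):
# print(icmp_reachable("example.com"))
```

If this fails for well-known hosts while web browsing still works, ICMP is probably being filtered somewhere between the PC and the Internet.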

The outages graph shows the last 50 outages the connection has experienced along with detailed information about each. Mousing over a bar displays the accumulated details for that particular outage. Older information can be found in the historical menu when reports are in Extended mode.

Outages avg time
As outages build up, so does the averages graph. This graph shows when most of the outages are occurring so that, over time, trends emerge showing when problems are happening. Older information can be found in the historical menu when reports are in Extended mode.
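As a rough illustration of how a graph like this could be built, the sketch below groups outage start times by hour of day. The function name and sample timestamps are hypothetical, and this is not a description of how Outages.io actually computes the graph.

```python
from collections import Counter
from datetime import datetime

def outages_by_hour(start_times):
    """Count outages per hour of day (0-23) from a list of datetimes."""
    counts = Counter(t.hour for t in start_times)
    # Return a full 24-slot list so quiet hours show up as zero.
    return [counts.get(hour, 0) for hour in range(24)]

# Hypothetical sample data, for illustration only.
sample = [
    datetime(2020, 3, 5, 2, 15),
    datetime(2020, 3, 6, 2, 40),
    datetime(2020, 3, 6, 19, 5),
]
histogram = outages_by_hour(sample)
busiest_hour = max(range(24), key=lambda h: histogram[h])
print(busiest_hour, histogram[busiest_hour])
```

With more outages logged, the tallest slots point to the times of day when the connection is most likely to have problems.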

Speed test graph
Bandwidth testing is an interesting topic which is often misunderstood because it is not solely about bandwidth. A location can have a high bandwidth service yet users may find themselves barely able to reach resources on the Internet.

Here is an article that tries to explain commercial speed testing services. viewtopic.php?f=19&t=45

If speed testing is enabled, the agent algorithm will trigger speed testing based on a variety of fluctuations, trying to test at the best possible moment. This helps to better visualize how speeds (bandwidth) and, in fact, throughput are doing on the connection, in a way that a person trying to test at the right moment could not.

The test is not completely accurate because by the time it completes, things could have changed drastically.

The Outages.io solution tries to show ongoing averages (baseline) and when speeds become lower. The result is a graph which gives a visual representation of how speeds are doing and which tests were conducted. Mousing over the graph will display dates/times and types of tests. Different tests are shown in different colors so that the overall report is easier to compare with the outages and pings reports.

Colors and meanings

Various colors help to visualize which tests are baseline and which are triggered based on certain events.

Green: Baseline test. The agent software is running a speed test on a regular basis in order to establish a baseline or average.

Blue: Latency trigger. This test is triggered when the latency of the connection begins to fluctuate outside of the measured averages.

Orange: Slowdown trigger. This test is triggered when short burst speed tests are run and the results show slower than usual speeds.

Black: Outage trigger. This test is run moments after an outage ends to determine if speed is back to normal or if it remained slower than the calculated average before the outage.

Note: Black speed test bars can be related either to outages or to an algorithm-triggered test by the agent. This is something we plan to address.
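The color scheme above can be pictured as a simple lookup from the reason a test ran to the color of its bar. The sketch below is only an illustration: the trigger names, the function, and the order in which conditions are checked are assumptions, not the agent's actual logic.

```python
# Colors as listed in this post; the trigger names are hypothetical labels.
TRIGGER_COLORS = {
    "baseline": "green",
    "latency": "blue",
    "slowdown": "orange",
    "outage": "black",
}

def pick_trigger(outage_just_ended, short_test_slow, latency_unusual):
    """Pick one trigger label for a speed test.

    The priority order (outage, then slowdown, then latency) is an
    assumption made for this example.
    """
    if outage_just_ended:
        return "outage"
    if short_test_slow:
        return "slowdown"
    if latency_unusual:
        return "latency"
    return "baseline"

trigger = pick_trigger(outage_just_ended=False, short_test_slow=True, latency_unusual=True)
print(trigger, TRIGGER_COLORS[trigger])
```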

IMPORTANT: Speed testing uses data
This feature is experimental and the algorithm continues to be in development.

Speed vs throughput: Internet 'speed' is technically bandwidth. Bandwidth is the max amount of data the connection will allow based on the purchased plan.

Throughput is the amount of data this connection can actually move at any given time. Bandwidth and throughput are very different things.
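A small worked example may help to show the difference. The numbers below are made up for illustration: a hypothetical 100 Mbps plan on which 25 MB of data took 10 seconds to transfer.

```python
def throughput_mbps(bytes_transferred, seconds):
    """Measured throughput in megabits per second."""
    return bytes_transferred * 8 / seconds / 1_000_000

plan_bandwidth_mbps = 100                    # hypothetical purchased plan
measured = throughput_mbps(25_000_000, 10)   # 25 MB moved in 10 seconds
utilization = measured / plan_bandwidth_mbps
print(f"{measured:.1f} Mbps, {utilization:.0%} of plan")
# prints "20.0 Mbps, 20% of plan"
```

The plan allows 100 Mbps, but at that moment the connection only moved data at 20 Mbps. The first figure is bandwidth, the second is throughput.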

Monthly data plan vs Unlimited plans: Speed testing uses data. If the data plan is large or unlimited, this may not be an issue but if it is a capped data plan, speed testing should be conducted conservatively. Outages.io understands capped plans and tries to optimize this test to make it useful without wasting data.
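To get a feel for how much data speed testing could use on a capped plan, here is a back-of-the-envelope calculation. The test length, speed and number of tests per day are hypothetical values chosen for the example and do not reflect what the agent actually does.

```python
def monthly_test_data_gb(test_seconds, speed_mbps, tests_per_day, days=30):
    """Approximate data used by saturating speed tests over a month, in GB."""
    megabits_per_test = test_seconds * speed_mbps
    gigabytes_per_test = megabits_per_test / 8 / 1000   # Mb -> MB -> GB
    return gigabytes_per_test * tests_per_day * days

# Hypothetical: 10 second tests at 100 Mbps, 4 times a day.
usage = monthly_test_data_gb(test_seconds=10, speed_mbps=100, tests_per_day=4)
print(f"{usage:.1f} GB per month")
# prints "15.0 GB per month"
```

On an unlimited plan this would not matter, but on a small capped plan it could be a noticeable share of the monthly allowance.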

Disable (default setting): No speed testing will be done by the agent.

Allow: The agent will run speed tests to determine if bandwidth has fallen below a certain threshold. An algorithm controls this function based on a variety of conditions such as latency and slowdowns. The slowdown check is a short, non-saturating speed test that triggers a full saturation speed test if the results are poor.

Speed Limit (Internal and experimental): A speed limit may be imposed to conserve bandwidth. The test tries to determine when bandwidth drops considerably, not what the full speed is. The agent's job is to report when speeds fall below average or even below a useful level, which is difficult because it cannot know when something else is actually using bandwidth, such as watching movies or downloading files. The speed limit is still in development.
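The two-stage idea described under the Allow setting (a short test first, and a full test only if the short one looks poor) can be sketched as follows. The function name, the 50% threshold and the stand-in test functions are all assumptions made for illustration.

```python
def run_speed_check(short_test, full_test, average_mbps, slow_ratio=0.5):
    """Run a short test and escalate to a full test only if it looks slow.

    short_test and full_test are callables that return a speed in Mbps.
    slow_ratio is a hypothetical threshold relative to the average.
    """
    quick = short_test()
    if quick >= average_mbps * slow_ratio:
        # Result looks normal, so no full saturation test (and no extra data used).
        return {"escalated": False, "mbps": quick}
    return {"escalated": True, "mbps": full_test()}

# Stand-in test functions with fixed results, for illustration only.
result_ok = run_speed_check(lambda: 90.0, lambda: 95.0, average_mbps=100.0)
result_slow = run_speed_check(lambda: 30.0, lambda: 35.0, average_mbps=100.0)
print(result_ok, result_slow)
```

The point of the design is that the expensive, data-hungry test only runs when there is already a hint that something is wrong.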

Network stats
This section offers overall statistics about the performance of this Internet service and where most of the problems might be.
MOD means 'most often down'.

% Affected networks: Percentage of Internet problems with LAN, ISP or beyond.
Top MOD hops: The most problematic hops and where they are: LAN, ISP or beyond.
Top MOD orgs: Organizations experiencing the most problems relating to this connection.

It is important to note that all references to 'Beyond ISP' are informational only. The most important information is how the ISP is performing. Anything beyond the ISP is a test point that Outages.io uses to monitor the performance of the service. In some cases, problems at these points could have affected services, but the main point is to monitor the Internet service provider. Older information can be found in the historical menu.

MOD means 'Most Often Down' and here applies to hops and organizations. A hop is a piece of networking hardware, such as a router, a modem or one of the provider's switches, that packets must travel across in order to reach Internet sites or services. Each device that data travels across is called a hop.

If any one of these hops prevents data from getting to the next device, the local connection could suffer slow, sluggish or even unreachable services until that device is fixed. In most cases, such a loss can be attributed to a bad cable, malfunctioning hardware, an improperly configured interface or device, or of course human error such as a cable being disconnected.

In today's real time world, such problems can affect VoIP phone calls, live video and other services not to mention constantly getting disconnected from servers and other devices.

The Network stats section shows the last 50 outages broken down by LAN, ISP (Internet provider) and Beyond. The top 5 hops show where most of the hop problems have been, and the top 5 orgs show which organizations are involved when the problems are beyond the local network.
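For readers who like to see the arithmetic, the sketch below shows one way such a breakdown could be computed from a list of outage records. The record fields, the function name and the sample data (which uses reserved documentation addresses) are hypothetical and do not describe the actual Outages.io data format.

```python
from collections import Counter

def network_stats(outages, last_n=50, top_n=5):
    """Summarize recent outages by network area, hop and organization."""
    recent = outages[-last_n:]
    total = len(recent)
    by_network = Counter(o["network"] for o in recent)
    percent = {name: 100 * count / total for name, count in by_network.items()}
    top_hops = Counter(o["hop"] for o in recent).most_common(top_n)
    # Organizations only matter when the problem is beyond the local network.
    top_orgs = Counter(
        o["org"] for o in recent if o["network"] != "LAN"
    ).most_common(top_n)
    return percent, top_hops, top_orgs

# Hypothetical sample records, for illustration only.
sample = [
    {"network": "LAN", "hop": "192.168.1.1", "org": "local"},
    {"network": "ISP", "hop": "198.51.100.1", "org": "Example ISP"},
    {"network": "ISP", "hop": "198.51.100.1", "org": "Example ISP"},
    {"network": "Beyond", "hop": "203.0.113.9", "org": "Example Transit"},
]
percent, top_hops, top_orgs = network_stats(sample)
print(percent, top_hops, top_orgs)
```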

Along with other tests, pings are used to establish a pattern and averages. Pings are not directed at any nearby point and are used only to generate averages so that the algorithm can do its job. Older information can be found in the historical menu.

That said, if there is a sudden change in pings, this information should be considered as being a potential problem.
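As an illustration of what a 'sudden change' against an average could look like, the sketch below keeps a rolling average of recent pings and flags any ping that is far above it. The class name, window size and 2x threshold are assumptions for the example, it only flags increases, and it is not the algorithm the agent uses.

```python
from collections import deque
from statistics import mean

class PingBaseline:
    """Keep a rolling average of ping times and flag sudden increases."""

    def __init__(self, window=20, change_ratio=2.0, min_samples=5):
        self.samples = deque(maxlen=window)
        self.change_ratio = change_ratio
        self.min_samples = min_samples

    def add(self, ping_ms):
        """Record a ping and return True if it is far above the rolling average."""
        sudden = (
            len(self.samples) >= self.min_samples
            and ping_ms > mean(self.samples) * self.change_ratio
        )
        self.samples.append(ping_ms)
        return sudden

# Hypothetical ping times in milliseconds; the last one is a sudden jump.
baseline = PingBaseline()
flags = [baseline.add(ms) for ms in [20, 22, 19, 21, 20, 21, 95]]
print(flags)
```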