Differences

This shows you the differences between two versions of the page.

app-notes:supervision [2015/04/05 13:50]
hess
app-notes:supervision [2015/04/20 09:43] (current)
fachet [Introduction]
Line 1: Line 1:
 ====== Supervision ====== ====== Supervision ======
 ===== Introduction ===== ===== Introduction =====
-This note descibes ​which provisions are taken by NetModule Routers to guarantee always-on and always-connected ​+{{:​app-notes:​24-7.png?​nolink&​200 |}} 
 +This note describes ​which provisions are taken by NetModule Routers to guarantee always-on and always-connected ​
 functionality. functionality.
  
Line 10: Line 11:
 Wireless LAN (WLAN), Ethernet or PPP over Ethernet (PPPoE) connections. ​ Wireless LAN (WLAN), Ethernet or PPP over Ethernet (PPPoE) connections. ​
  
-The highes ​level of monitoring is the Ping Supervision which is individually configurable for each active (up)  +The highest ​level of monitoring is the Ping Supervision which is individually configurable for each active (up)  
-link as well generic for any link based on the routing table. The user can define up to two host which are pinged  +link as well generic for any WWAN link. The user can define up to two host which are pinged  
-every "ping interval"​ seconds. If there is no respones ​of at least one of the two hosts within "ping timeout" ​+every "ping interval"​ seconds. If there is no response ​of at least one of the two hosts within "ping timeout" ​
 milliseconds a retry is done using the "retry interval"​ time up to "max. number of trial fails"​. ​ milliseconds a retry is done using the "retry interval"​ time up to "max. number of trial fails"​. ​
 If still there is no answer the defined emergency action is executed. It's recommended to set "​restart link services" ​ If still there is no answer the defined emergency action is executed. It's recommended to set "​restart link services" ​
Line 19: Line 20:
 Emergency action "​none"​ means that the link is defined as (inactive) down.  Emergency action "​none"​ means that the link is defined as (inactive) down. 
 If you are using a VPN you should also enable the dead peer detection or keep-alive mechanisms.  If you are using a VPN you should also enable the dead peer detection or keep-alive mechanisms.
-For the generic link "​ANY" ​it's recommended to use the application host in your office.  +"​ANY"​ is a shortcut for all WAN interfaces.
-This ping will ensure that even interruptions in the back-bone can be detected and a full point-to-point connection  +
-is guaranteed.+
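
The retry behaviour of the Ping Supervision described above can be sketched roughly as follows. This is only an illustration and not the router's actual implementation: the function name, the injected `ping` callable and the `sleep` parameter are hypothetical, and the sketch assumes that a trial counts as successful as soon as at least one configured host answers within the ping timeout.

```python
import time
from typing import Callable, Sequence


def ping_supervision_cycle(
    ping: Callable[[str, int], bool],   # hypothetical: ping(host, timeout_ms) -> answered?
    hosts: Sequence[str],               # up to two hosts, as in the text
    ping_timeout_ms: int,               # "ping timeout"
    retry_interval_s: float,            # "retry interval"
    max_trial_fails: int,               # "max. number of trial fails"
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if a host answered, False if the emergency action is due."""
    fails = 0
    while True:
        # Assumption: one answering host is enough for the trial to succeed.
        if any(ping(host, ping_timeout_ms) for host in hosts):
            return True
        fails += 1
        if fails >= max_trial_fails:
            return False
        # Wait for the retry interval before the next trial.
        sleep(retry_interval_s)
```

A caller would run one such cycle every "ping interval" seconds and execute the configured emergency action whenever the function returns `False`.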
  
 The** WWAN Manager** who is responsible for the control of the mobile network modems checks every 15 seconds ​ The** WWAN Manager** who is responsible for the control of the mobile network modems checks every 15 seconds ​
-if the connection is still established. Different informations like configurable signal ​strenght+if the connection is still established. Different informations like configurable signal ​strength
 network registration status of service type are used to determine the status of the modem links. network registration status of service type are used to determine the status of the modem links.
  
Line 30: Line 29:
 link as the so-called hotlink which holds the default route for outgoing packets. ​ link as the so-called hotlink which holds the default route for outgoing packets. ​
 Every 5 seconds he checks for each active link if the following conditions are met: Every 5 seconds he checks for each active link if the following conditions are met:
 +
 +^Condition^ WWAN^ WLAN^ ETH^ PPPoE^
 +|Modem is registered |  X  | | | |
 +|Registered with valid service type |  X  | | | |
 +|Valid SIM state |  X  | | | |
 +|Sufficient signal strength |  X  |  X  | | |
 +|Client is associated |  X  | | | |
 +|Client is authenticated |  X  | | | |
 +|Valid DHCP address retrieved |  X  |  X  |  X  |  X  |
 +|Link is up and holds address |  X  |  X  |  X  |  X  |
 +|Ping check succeeded |  X  |  X  |  X  |  X  |
 +
 +If at least one condition of a link is not met, the link manager counts the number of consecutive failures.
 +WWAN links are marked as down after five consecutive failures, PPPoE links after three, WiFi and Ethernet links after one.
 +The highest-priority link that is up becomes the so-called hotlink, which holds the default route for
 +outgoing packets. The link manager changes the hotlink as soon as a higher-priority interface comes up.
 +
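The failure counting and hotlink selection described above could look like the following minimal sketch. The thresholds (5, 3, 1, 1) come from the text; everything else is an assumption made for illustration: the `Link` class, the function names, the convention that a lower `priority` number means higher priority, and the idea that a successful check resets the counter and marks the link as up again.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Consecutive failures before a link is marked down (values from the text).
DOWN_AFTER = {"WWAN": 5, "PPPoE": 3, "WLAN": 1, "ETH": 1}


@dataclass
class Link:
    name: str
    kind: str          # one of the keys of DOWN_AFTER
    priority: int      # assumption: lower number means higher priority
    failures: int = 0
    up: bool = True


def record_check(link: Link, all_conditions_met: bool) -> None:
    """Update one link after a periodic check (every 5 seconds in the text)."""
    if all_conditions_met:
        # Assumption: a good check clears the counter and brings the link up.
        link.failures = 0
        link.up = True
    else:
        link.failures += 1
        if link.failures >= DOWN_AFTER[link.kind]:
            link.up = False


def select_hotlink(links: Iterable[Link]) -> Optional[Link]:
    """Return the highest-priority link that is up, or None if all are down."""
    candidates = [link for link in links if link.up]
    return min(candidates, key=lambda link: link.priority, default=None)
```

Calling `select_hotlink` after every round of checks reproduces the described behaviour: the default route moves to a lower-priority link when the preferred one goes down, and moves back once the preferred link is up again.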
 +Finally, a **Hardware Watchdog** is available, which restarts the device if it is not retriggered within x seconds.
 +The Router Supervision daemon continuously checks whether all components of the router application are still alive.
 +In the unlikely case that the operating system or the router software has an internal error, the system restarts
 +completely by rebooting. This is the only condition under which the system reboots, unless "reboot system"
 +is configured in the link supervision.
 +
 +===== Summary =====
 +  - Level 4: VPN dead peer detection/​keep alive
 +      * Is VPN link over WAN link healthy? See also [[sdk:​vpn-supervision|Supervision of a VPN Tunnel]].
 +  - Level 3: Link level Ping Supervision
 +      * Is link to next network node healthy?
 +  - Level 2: Modem/WiFi Supervision
 +      * Is the modem OK, the device registered and a sufficient signal available?
 +  - Level 1: Link Supervision
 +      * Is physical link up or physical connection ok?
 +  - Level 0: Hardware Watchdog
 +      * Are the router firmware and operating system running OK?
    
    
 ===== Configuration =====  ===== Configuration ===== 
  {{ :​pictures:​supervision.png?​nolink |}}  {{ :​pictures:​supervision.png?​nolink |}}