====== Improving Fault Tolerance - Bypass Mode Network Cards ====== {{indexmenu_n>15}} **Why do you recommend purchasing SILICOM cards?**\\ The reasons are as follows: * The cards have bypass functionality. * The delivery package can include licenses for the required drivers for maximum performance - DPDK & Libzero. These cards are marked with -SQ1 (example for a 10GbE card). **Does bypass work on SILICOM network cards when the power is off?** * Optical bypass - works when the power is off (tested on the PE210G2BPI9-SR-SQ1 short-range/fiber card). * Copper bypass - works when the power is off (tested on the [[http://www.silicom-usa.com/Networking_Bypass_Adapters/PEG6BPi6-Six_Port_Copper_Gigabit_Ethernet_PCI_Express_Bypass_Server_Adapter_Intel_based_58|PEG6BPi6]] card). **Is there manual control of bypass on SILICOM network cards?**\\ DPI manages the bypass automatically.\\ If necessary, manual control of the bypass is possible using the bpctl_util utility:\\ * bpctl_util all get_bypass - get the state of bypass * bpctl_util all set_bypass on - activate bypass * bpctl_util all set_bypass off - deactivate bypass **Problem: we purchased a used card and the bypass does not work, what should we do?**\\ The problem is related to the card being reconfigured as a standard one, i.e., with bypass functionality disabled.\\ Diagnosis:\\ bpctl_util all get_std_nic 07:00.0 standard 07:00.1 slave 07:00.2 standard 07:00.3 slave It should be non-standard.\\ \\ To set the card to bypass mode, perform the following:\\ bpctl_util all set_std_nic off This command switches the mode to non-standard, i.e., with bypass mode enabled. **Explanation regarding bypass switching time?**\\ Activating the bypass takes a short time interval of about 0.5 seconds (by default), but due to interface renegotiation, it can ultimately take a longer time interval. Below is an explanation regarding bypass switching from the manufacturer.\\ Such switching duration can affect BGP, OSPF, and other mechanisms due to a brief connection interruption (duration may vary, see the description below) or multiple interruptions as in the case of server or service reboot ( //service ->interruption->bypass->interruption-> service//). In this case, the session recovery time (BGP, OSPF) depends on their settings and can take up to several tens of seconds. To reduce this interval, you need to configure the session recovery time after a connection break. **For example**, on Juniper equipment, set hold-timer down to 500ms to avoid BGP session termination and routing table reconfiguration: set interfaces hold-time up 500 down 500 where 500 ms is the wait timeout before the operational status of the interface changes. Basically, the time for the bypass mechanism to switch from one mode to another is 10mS. The timing that you are seeing relates to re-establishing the link and then re-establishing the connection (with new routing tables in switches and devices). This switch to bypass mode is done in our product by physically connecting the pair of ports together (wire to wire). This means that when this happens, our product is actually out of the picture, and the start of the traffic with this new connection will depend on the two networking devices (router / switch / device) on how they link together and how they establish the connection again. You can try to force fix mode (not auto-neg, change to force 1G FD or so) this might reduce the time needed for the negotiation. Not sure how much. For the change from bypass mode to normal mode - all the above also stand as well. The networking devices (router / switch / device) lose the link with each other and start establishing the connection with the Silicom NIC. Here you have more control as the link is done between the two devices and your system (Check that all the devices are set to the same speed settings). From our customer and our experience, a 1-3 second time is reasonable to get the Copper 1G link to be established between 2 network devices. **Everything is fine except that one port on the network works in bypass mode and does not filter traffic.**\\ If it is configured (in/out_dev) but does not switch, try resetting the bypass switch on the card to its initial state: bpctl_util all set_bypass off bpctl_util all set_dis_bypass off bpctl_util all set_bypass_pwoff on bpctl_util all set_bypass_pwup on bpctl_util all set_std_nic off bpctl_util all get_bypass_change on bpctl_util all get_tx on bpctl_util all get_tpl off bpctl_util all get_wait_at_pwup off bpctl_util all get_hw_reset off bpctl_util all get_disc off bpctl_util all get_disc_change off bpctl_util all get_dis_disc off bpctl_util all get_disc_pwup off bpctl_util all get_wd_exp_mode bypass bpctl_util all get_wd_autoreset disable If it doesn't help, the card is defective, replace it under warranty. **Configuring Juniper so that switching to and from bypass does not lead to route reconfiguration.**\\ set interfaces hold-time up 500 down 500 show xe-5/2/0 description "-= 20G UPLINK LAGG =-"; hold-time up 1000 down 1000; gigether-options { 802.3ad ae1; } **Configuring Cisco so that switching to and from bypass does not lead to route reconfiguration.**\\ int fa0/0 ip bgp fast-external-fallover deny Note: The BGP Fast-external-fallover command terminates external BGP sessions of any directly adjacent peer if the link used to reach the peer goes down without waiting for the hold-down timer to expire. **List of all dna interfaces and their MAC addresses**\\ grep ^ /sys/class/net/dna?/address **How to check if the card is equipped with bypass**\\ To check for the presence of bypass, run the command: lspci -v|grep -A1 Eth For cards with bypass, the Subsystem field will indicate: Subsystem: Silicom Ltd. Device