en:dpi:dpi_components:platform:faq:administrator:start [VAS Experts Documentation]

Differences

This shows you the differences between two versions of the page.

en:dpi:dpi_components:platform:faq:administrator:start [2021/07/27 14:02] edrudichgmailcom
en:dpi:dpi_components:platform:faq:administrator:start [2024/07/29 12:42] (current) – removed elena.krasnobryzh
Line 1: Line 1:
-====== 5 Administration issues ====== 
-{{indexmenu_n>5}} 
  
-===== How do I know the current release (CCC)? ===== 
- 
-  fastdpi -re 
- 
-===== How do I know the current version? ===== 
- 
-  fastdpi -ve 
- 
-===== How do I downgrade to the previous version? ===== 
- 
-  Example of rollback from 2.7 version to 2.6: 
-  yum downgrade fastdpi-2.6 
- 
-===== In the log I found an error "error loading DSCP settings, res=-4" ===== 
-The error appears because there is no DSCP on standalone systems. You can ignore it.
- 
-===== Not all the commands are always processed, the following error appears: Can't connect to 127.0.0.1:29000, errcode=99 : Cannot assign requested address Autodetected fastdpi params : dev='lo', port=29000 connecting 127.0.0.1:29000 ... I suspect that our way of loading the subscribers to the SSG is not quite good for it (we load each subscriber separately, which leads to >50000 commands during initialization, which we do once a day) ===== 
- 
-fdpi_ctrl connects to the DPI through the standard Linux network stack, so the tuning recommendations are the same as for web servers (such as nginx) under high load.\\
-
-Put the settings in the /etc/sysctl.conf file so that they persist after a reboot:\\
- 
-  # The OS network stack optimization 
-  net.core.netdev_max_backlog=10000 
-  net.core.somaxconn=262144 
-  net.ipv4.tcp_syncookies=1 
-  net.ipv4.tcp_max_syn_backlog = 262144 
-  net.ipv4.tcp_max_tw_buckets = 720000 
-  net.ipv4.tcp_tw_recycle = 1 
-  net.ipv4.tcp_timestamps = 1 
-  net.ipv4.tcp_tw_reuse = 1 
-  net.ipv4.tcp_fin_timeout = 30 
-  net.ipv4.tcp_keepalive_time = 1800 
-  net.ipv4.tcp_keepalive_probes = 7 
-  net.ipv4.tcp_keepalive_intvl = 30 
-  net.core.wmem_max = 33554432 
-  net.core.rmem_max = 33554432 
-  net.core.rmem_default = 8388608 
-  net.core.wmem_default = 4194394 
-  net.ipv4.tcp_rmem = 4096 8388608 16777216 
-  net.ipv4.tcp_wmem = 4096 4194394 16777216 
- 
-for a 1Gbit interface:\\ 
-   net.core.netdev_max_backlog=10000 
-for a 10Gbit interface:\\ 
-   net.core.netdev_max_backlog=30000 
- 
-To avoid rebooting, you can change a parameter on the fly with the command:\\
-sysctl -w parameter=value\\
-e.g. sysctl -w net.ipv4.tcp_tw_reuse=1\\
- 
-This should solve the problem\\ 
- 
-**For CentOS 7.* **\\ 
-Example: 
-  # The OS network stack optimization 
-  net.core.netdev_max_backlog=65536 
-  net.core.optmem_max=25165824 
-  net.core.somaxconn=1024 
-  net.ipv4.tcp_max_orphans = 60000 
-  net.ipv4.tcp_no_metrics_save = 1 
-  net.ipv4.tcp_window_scaling = 1 
-  net.ipv4.tcp_timestamps = 1 
-  net.ipv4.tcp_sack = 1 
-  net.ipv4.tcp_syncookies=1 
-  net.ipv4.tcp_max_syn_backlog = 262144 
-  net.ipv4.tcp_max_tw_buckets = 720000 
-  net.ipv4.tcp_tw_recycle = 1 
-  net.ipv4.tcp_timestamps = 1 
-  net.ipv4.tcp_tw_reuse = 1 
-  net.ipv4.tcp_fin_timeout = 30 
-  net.ipv4.tcp_keepalive_time = 1800 
-  net.ipv4.tcp_keepalive_probes = 7 
-  net.ipv4.tcp_keepalive_intvl = 30 
-  net.core.wmem_max = 33554432 
-  net.core.rmem_max = 33554432 
-  net.core.rmem_default = 8388608 
-  net.core.wmem_default = 4194394 
-  net.ipv4.tcp_rmem = 4096 8388608 16777216 
-  net.ipv4.tcp_wmem = 4096 4194394 16777216 
- 
-Update command:
-  sysctl --system
-[[https://christophermonzon.wordpress.com/2016/10/04/centos-7-network-performance/|More information on CentOS 7]]
- 
- 
- 
-[[http://www.vasexperts.ru/upload/SCESM2СКАТ.zip|Scripts for migration from SCE SM to Stingray DB, description inside]] 
- 
-===== How to check the load by cores and why they are loaded unevenly ===== 
- 
-To view the CPU load per core in the top utility, press 1.\\
-To view the load per DPI thread, run the command
-  ps -p `pidof fastdpi` H -o %cpu,lwp,pri,psr,comm
-**Output example**:
-<code bash>  %CPU   LWP PRI PSR COMMAND 
- 0.0  23141  41   0 fastdpi_main 
- 0.0  23146  41   0 fastdpi_dl 
- 0.3  23147  41   0 fastdpi_ctrl 
- 35.8 23148  41   0 fastdpi_ajb 
- 32.7 23152  41   1 fastdpi_rx_1 
- 34.1 23165  41   2 fastdpi_wrk0 
- 34.1 23170  41   3 fastdpi_wrk1</code> 
- In the DPI, threads (the COMMAND column) are distributed across cores (the PSR column) by function so that they do not interfere with each other, which is why the cores are loaded unevenly:\\
- - The wrk threads analyze data in network packets\\ 
- - The rx thread is responsible for the transit of data between network ports\\ 
- - Other threads perform application and auxiliary tasks (netflow generation, control command reception, list loading, pcap writing, etc.) and can cause CPU peak loads, so they are moved to a separate core. 
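
To see how the per-thread figures add up on each core, the ps output can be summed by the PSR column. The following is a minimal sketch, not part of the product tooling: the function name load_by_core is hypothetical, and it assumes the column order %cpu,lwp,pri,psr,comm used in the command above.

```shell
#!/bin/sh
# load_by_core: read `ps ... -o %cpu,lwp,pri,psr,comm` output on stdin
# and print the total %CPU per core (PSR), one "core total" pair per line.
# The first line is assumed to be the ps header and is skipped.
load_by_core() {
    awk 'NR > 1 { load[$4] += $1 }
         END { for (core in load) printf "%s %.1f\n", core, load[core] }' | sort -n
}

# Usage on a host where fastdpi is running (not executed here):
#   ps -p "$(pidof fastdpi)" H -o %cpu,lwp,pri,psr,comm | load_by_core
```

A core that hosts several auxiliary threads shows their combined load, while each rx and wrk core shows the load of its single thread.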
- 
- 
-===== We got an error in fastdpi_alert.log, how to solve the issue? [CRITICAL][2017/10/06-16:36:44:616019][0x7fdb297ac700] metadata_storage : Can't allocate memory [repeat 1], cntr=188889, allocated=188889 ===== 
- 
-In the DPI, all memory is preallocated; by default it is sized for a fixed number of subscribers.
-The size is controlled by the configuration parameter mem_ip_metadata_recs.
-**For example**, to increase the limit to 500000 subscribers, change the configuration in /etc/dpi/fastdpi.conf:
-  mem_ip_metadata_recs=500000
-A restart is required:
-  service fastdpi restart
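
If the parameter has to be changed on several servers, the edit can be scripted. The sketch below is an illustration only: set_conf_param is a hypothetical helper, and it assumes the parameter is written as key=value at the start of a line with no spaces around the equals sign.

```shell
#!/bin/sh
# set_conf_param FILE KEY VALUE
# Replace an existing "KEY=..." line in FILE, or append "KEY=VALUE"
# if the key is not present yet.
set_conf_param() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # Write the edited copy to a temporary file first, then copy it
        # back so that the permissions of the original file are kept.
        tmp=$(mktemp) || return 1
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
        status=$?
        rm -f "$tmp"
        return $status
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (not executed here):
#   set_conf_param /etc/dpi/fastdpi.conf mem_ip_metadata_recs 500000
#   service fastdpi restart
```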
- 
-===== Which files do you recommend to archive? ===== 
- 
-  cp -r /etc/pf_ring/ /BACKUPDIR/pf_ring
-  cp -r /etc/dpi /BACKUPDIR/etc/
-  mdb_copy /var/db/dpi /BACKUPDIR/db/
-mdb_copy can make the backup while fastdpi is running.
- 
-===== IPMI takes up 100% of CPU and degrades the DPI performance =====
-
-  echo 100 > /sys/module/ipmi_si/parameters/kipmid_max_busy_us
-Add this command to /etc/rc.local so that the setting is not lost when the server restarts.
- 
-===== Error in the alert log: [ERROR   ] bpm : thread #1 - does not change self-monitoring counters, and the DPI restarted, and created a core file (or went into bypass) ===== 
- 
-The DPI performs self-diagnostics during operation. If a worker thread freezes and can no longer process traffic, the DPI detects this condition and restarts, generating a core file on the Abort signal.\\
-<note important>**Important:** the trace and dbg settings in fastdpi.conf are intended for troubleshooting and debugging, not for continuous operation. E.g.:\\
-if disk writes are blocked by another process (e.g. by log rotation, which usually happens between 3 and 4 am) while tracing is on, a worker thread may block while writing to the diagnostic (slave) log, which puts the DPI into bypass or restarts it. Remember to turn these settings off once diagnostics are done.</note>
-The problem occurs only on some servers. If your server is among them, we recommend changing the default disk scheduler to deadline:
- 
-<code bash>echo deadline > /sys/block/sda/queue/scheduler 
-echo deadline > /sys/block/sdb/queue/scheduler</code>  
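
To confirm which scheduler is active, read the same sysfs file back: the kernel lists the available schedulers and marks the active one with square brackets. A minimal sketch, where active_scheduler is a hypothetical helper name:

```shell
#!/bin/sh
# active_scheduler: read the contents of a queue/scheduler file on stdin
# (for example "noop [deadline] cfq") and print the bracketed, active entry.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage (not executed here):
#   active_scheduler < /sys/block/sda/queue/scheduler
```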
- 
-===== Why does process memory consumption grow during work? ===== 
- 
-The DPI allocates memory statically: at process startup and when some service profiles (such as NAT, blacklists and whitelists) are created. No additional memory is allocated during operation. So why does memory consumption grow?\\
-Linux distinguishes between the resident memory of a process (shown in top as RES) and its virtual memory (shown in top as VIRT). Allocated memory that has not yet been initialized (in practice, filled with zeros) is not counted as resident; it moves into resident memory only as it is initialized.\\
-By setting mem_preset=1 in /etc/dpi/fastdpi.conf you can make the DPI initialize all (or nearly all) of the allocated memory at startup, so the resident part does not grow as the process runs. This option slows down startup and is suitable only when there is enough physical RAM, so it is usually better simply to keep this behavior in mind and monitor virtual memory consumption (VIRT) and resident memory consumption (RES) separately.
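
Both figures can also be read outside top, for example for periodic logging. The sketch below assumes a Linux host and takes the values from the VmSize and VmRSS fields of /proc/PID/status; print_mem is a hypothetical helper name.

```shell
#!/bin/sh
# print_mem PID
# Print the virtual (VmSize) and resident (VmRSS) memory of a process in kB,
# as reported by the Linux kernel in /proc/PID/status.
print_mem() {
    awk '/^VmSize:/ { virt = $2 }
         /^VmRSS:/  { res = $2 }
         END { printf "VIRT=%s kB RES=%s kB\n", virt, res }' "/proc/$1/status"
}

# Usage (not executed here):
#   print_mem "$(pidof fastdpi)"
```

If RES grows towards VIRT over time while VIRT stays constant, this matches the behavior described above and does not indicate a leak.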
- 
-===== One of the SSGs has a lot of zombie processes named wd_*. Only a restart can help? ===== 
-<code bash> 
-166206 ?        Z      0:00  \_ [wd_fastdpi.sh] <defunct> 
-166219 ?        Z      0:00  \_ [wd_fastpcrf.sh] <defunct> 
-</code> 
-Restarting the watchdog is enough:
-  service watchdog restart
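
To check whether the zombies are gone after the restart, the process list can be filtered by state. A minimal sketch, assuming the procps ps column order pid,ppid,stat,comm; list_wd_zombies is a hypothetical helper name.

```shell
#!/bin/sh
# list_wd_zombies: read `ps -eo pid,ppid,stat,comm` output on stdin and print
# "PID PPID COMMAND" for every zombie (state Z) whose name starts with "wd_".
# The first line is assumed to be the ps header and is skipped.
list_wd_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ && $4 ~ /^wd_/ { print $1, $2, $4 }'
}

# Usage (not executed here):
#   ps -eo pid,ppid,stat,comm | list_wd_zombies
```

The PPID column shows the parent process that has not yet reaped the zombie.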