diff --git a/docs/source/advanced/performance_tuning/database_tuning.rst b/docs/source/advanced/performance_tuning/database_tuning.rst
index 29e0997ee..d9919d815 100644
--- a/docs/source/advanced/performance_tuning/database_tuning.rst
+++ b/docs/source/advanced/performance_tuning/database_tuning.rst
@@ -1,13 +1,13 @@
 Tuning the Database Server
 ==========================
-1. MariaDB database
+#. MariaDB database
-MariaDB: `Tuning Server Parameters `_
+   MariaDB: `Tuning Server Parameters `_
-According to this documentation, the two most important variables to configure are key_buffer_size and table_open_cache.
+   According to this documentation, the two most important variables to configure are ``key_buffer_size`` and ``table_open_cache``.
-2. PostgreSQL database
+#. PostgreSQL database
-PostgreSQL: `Server Configuration `_
+   PostgreSQL: `Server Configuration `_
diff --git a/docs/source/advanced/performance_tuning/linux_os_tuning.rst b/docs/source/advanced/performance_tuning/linux_os_tuning.rst
index 5a7a9833d..315c83e36 100644
--- a/docs/source/advanced/performance_tuning/linux_os_tuning.rst
+++ b/docs/source/advanced/performance_tuning/linux_os_tuning.rst
@@ -4,42 +4,43 @@ System Tuning Settings for Linux
 Adjusting Operating System tunables can improve large scale cluster performance, avoid bottlenecks, and prevent failures. The following sections are a collection of suggestions that have been gathered from various large scale HPC clusters. You should investigate and evaluate the validity of each suggestion before applying them to your cluster.
-1. Tuning Linux ulimits:
+#. Tuning Linux ulimits:
-The open file limits are important to high concurrence network services, such as *xcatd*. For a large cluster, it is required to increase the number of open file limit to avoid **Too many open files** error. The default value is *1024* in most OS distributions, to add below configuration in ``/etc/security/limits.conf`` to increase to *14096*.
-::
+   The open file limit matters for highly concurrent network services such as ``xcatd``. For a large cluster, increase the open file limit to avoid **Too many open files** errors. The default value is *1024* in most OS distributions; add the configuration below to ``/etc/security/limits.conf`` to raise it to *14096*.
+   ::
- * soft nofiles 14096
- * hard nofiles 14096
+      * soft nofile 14096
+      * hard nofile 14096
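+
+   After editing ``/etc/security/limits.conf``, the new limit applies to new login sessions. A minimal check, assuming a Bash login shell on the management node::
+
+      # soft and hard open file limits for the current shell; both should report 14096 after re-login
+      ulimit -Sn
+      ulimit -Hn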
-2. Tuning Network kernel parameters:
+#. Tuning Network kernel parameters:
-There might be hundreds of hosts in a big network for large cluster, tuning the network kernel parameters for optimum throughput and latency could improve the performance of distributed application. For example, adding below configuration in ``/etc/sysctl.conf`` to increase the buffer.
+   A large cluster may have hundreds of hosts on a single network; tuning the network kernel parameters for optimum throughput and latency can improve the performance of distributed applications. For example, add the configuration below to ``/etc/sysctl.conf`` to increase the network buffers (see the reload example below).
-::
+   ::
- net.core.rmem_max = 33554432
- net.core.wmem_max = 33554432
- net.core.rmem_default = 65536
- net.core.wmem_default = 65536
-
- net.ipv4.tcp_rmem = 4096 33554432 33554432
- net.ipv4.tcp_wmem = 4096 33554432 33554432
- net.ipv4.tcp_mem= 33554432 33554432 33554432
- net.ipv4.route.flush=1
- net.core.netdev_max_backlog=1500
+      net.core.rmem_max = 33554432
+      net.core.wmem_max = 33554432
+      net.core.rmem_default = 65536
+      net.core.wmem_default = 65536
+
+      net.ipv4.tcp_rmem = 4096 33554432 33554432
+      net.ipv4.tcp_wmem = 4096 33554432 33554432
+      net.ipv4.tcp_mem = 33554432 33554432 33554432
+      net.ipv4.route.flush = 1
+      net.core.netdev_max_backlog = 1500
-And if you encounter **Neighbour table overflow** error, it meams there are two many ARP requests and the server cannot reply. Tune the ARP cache with below parameters.
+   If you encounter a **Neighbour table overflow** error, it means there are too many ARP requests and the server cannot reply to them all. Tune the ARP cache with the parameters below.
-::
+   ::
- net.ipv4.conf.all.arp_filter = 1
- net.ipv4.conf.all.rp_filter = 1
- net.ipv4.neigh.default.gc_thresh1 = 30000
- net.ipv4.neigh.default.gc_thresh2 = 32000
- net.ipv4.neigh.default.gc_thresh3 = 32768
- net.ipv4.neigh.ib0.gc_stale_time = 2000000
+      net.ipv4.conf.all.arp_filter = 1
+      net.ipv4.conf.all.rp_filter = 1
+      net.ipv4.neigh.default.gc_thresh1 = 30000
+      net.ipv4.neigh.default.gc_thresh2 = 32000
+      net.ipv4.neigh.default.gc_thresh3 = 32768
+      net.ipv4.neigh.ib0.gc_stale_time = 2000000
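+
+   These kernel parameters can be reloaded without a reboot. A minimal example, assuming the settings above were added to ``/etc/sysctl.conf`` on the management node::
+
+      # re-read /etc/sysctl.conf and apply the values
+      sysctl -p
+      # spot-check one of the values afterwards
+      sysctl net.core.rmem_max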
-For more tunable parameters, you can refer to "Linux System Tuning Recommendations".
+
+   For more tunable parameters, you can refer to `Linux System Tuning Recommendations `_.
diff --git a/docs/source/advanced/performance_tuning/xcatd_tuning.rst b/docs/source/advanced/performance_tuning/xcatd_tuning.rst
index ce276ed02..577038326 100644
--- a/docs/source/advanced/performance_tuning/xcatd_tuning.rst
+++ b/docs/source/advanced/performance_tuning/xcatd_tuning.rst
@@ -1,27 +1,25 @@
 Tuning xCAT Daemon Attributes
 ==================================
-For large clusters, you consider changing the default settings in `site` table to improve the performance on a large-scale cluster or if you are experiencing timeouts or failures in these areas:
+For large clusters, you should consider changing the default settings in the ``site`` table to improve performance, especially if you are experiencing timeouts or failures in these areas:
-*consoleondemand* : When set to 'yes', conserver connects and creates the console output for a node only when the user explicitly opens the console using rcons or wcons. Default is 'no' on Linux, 'yes' on AIX. Setting this to 'yes' can reduce the load conserver places on your xCAT management node. If you need this set to 'no', you may then need to consider setting up multiple servers to run the conserver daemon, and specify the correct server on a per-node basis by setting each node's conserver attribute.
+**consoleondemand** : When set to ``yes``, conserver connects and creates the console output for a node only when the user explicitly opens the console using ``rcons`` or ``wcons``. Default is ``no`` on Linux, ``yes`` on AIX. Setting this to ``yes`` can reduce the load conserver places on your xCAT management node. If you need this set to ``no``, you may then need to consider setting up multiple servers to run the conserver daemon, and specify the correct server on a per-node basis by setting each node's conserver attribute.
-*nodestatus* : If set to 'n', the nodelist.status column will not be updated during the node deployment, node discovery and power operations. Default is 'y', always update nodelist.status. Setting this to 'n' for large clusters can eliminate one node-to-server contact and one xCAT database write operation for each node during node deployment, but you will then need to determine deployment status through some other means.
+**nodestatus** : If set to ``n``, the ``nodelist.status`` column will not be updated during node deployment, node discovery, and power operations. Default is ``y``, always update ``nodelist.status``. Setting this to ``n`` for large clusters can eliminate one node-to-server contact and one xCAT database write operation for each node during node deployment, but you will then need to determine deployment status through some other means.
-*precreatemypostscripts* : (yes/1 or no/0, only for Linux). Default is no. If yes, it will instruct xcat at nodeset and updatenode time to query the database once for all of the nodes passed into the command and create the mypostscript file for each node, and put them in a directory in tftpdir(such as: /tftpboot). This prevents xcatd from having to create the mypostscript files one at a time when each deploying node contacts it, so it will speed up the deployment process. (But it also means that if you change database values for these nodes, you must rerun nodeset.) If precreatemypostscripts is set to no, the mypostscript files will not be generated ahead of time. Instead they will be generated when each node is deployed.
+**precreatemypostscripts** : (``yes/1`` or ``no/0``, only for Linux). Default is ``no``. If ``yes``, xCAT will, at ``nodeset`` and ``updatenode`` time, query the database once for all of the nodes passed into the command, create the ``mypostscript`` file for each node, and put them in a directory under ``site.tftpdir`` (such as ``/tftpboot``). This prevents ``xcatd`` from having to create the ``mypostscript`` files one at a time when each deploying node contacts it, so it speeds up the deployment process. (But it also means that if you change database values for these nodes, you must rerun ``nodeset``.) If **precreatemypostscripts** is set to ``no``, the ``mypostscript`` files will not be generated ahead of time. Instead, they will be generated when each node is deployed.
-*svloglocal* : if set to 1, syslog on the service node will not get forwarded to the mgmt node. The default is to forward all syslog messages. The tradeoff on setting this attribute is reducing network traffic and log size versus having local management node access to all system messages from across the cluster.
+**svloglocal** : If set to ``1``, syslog on the service node will not be forwarded to the management node. The default is to forward all syslog messages. The tradeoff on setting this attribute is reduced network traffic and log size versus having local access on the management node to all system messages from across the cluster.
-*skiptables* : a comma separated list of tables to be skipped by dumpxCATdb. A recommended setting is "auditlog,eventlog" because these tables can grow very large. Default is to skip no tables.
+**skiptables** : A comma-separated list of tables to be skipped by ``dumpxCATdb``. A recommended setting is ``auditlog,eventlog`` because these tables can grow very large. Default is to skip no tables.
-*useNmapfromMN* : When set to yes, nodestat command should obtain the node status using nmap (if available) from the management node instead of the service node. This will improve the performance in a flat network. Default is 'no'.
+**dhcplease** : The lease time for the DHCP client. The default value is *43200*.
-*dhcplease* : The lease time for the dhcp client. The default value is 43200.
+**xcatmaxconnections** : Number of concurrent xCAT protocol requests before requests begin queueing. This applies to both client command requests and node requests, e.g. to get postscripts. Default is ``64``.
-*xcatmaxconnections* : Number of concurrent xCAT protocol requests before requests begin queueing. This applies to both client command requests and node requests, e.g. to get postscripts. Default is 64.
+**xcatmaxbatchconnections** : Number of concurrent xCAT connections allowed from the nodes. This number must be less than **xcatmaxconnections**.
-*xcatmaxbatchconnections* : Number of concurrent xCAT connections allowed from the nodes. Number must be less than xcatmaxconnections. See useflowcontrol attribute.
-
-*useflowcontrol* : If yes, the postscript processing on each node contacts xcatd on the MN/SN using a lightweight UDP packet to wait until xcatd is ready to handle the requests associated with postscripts. This prevents deploying nodes from flooding xcatd and locking out admin interactive use. This value works with the *xcatmaxconnections* and *xcatmaxbatch* attributes. If the value is no, nodes sleep for a random time before contacting xcatd, and retry. The default is no.. Not supported on AIX.
+**useflowcontrol** : If ``yes``, the postscript processing on each node contacts ``xcatd`` on the management node or service node (MN/SN) using a lightweight UDP packet and waits until ``xcatd`` is ready to handle the requests associated with postscripts. This prevents deploying nodes from flooding ``xcatd`` and locking out admin interactive use. This value works with the **xcatmaxconnections** and **xcatmaxbatchconnections** attributes. If the value is ``no``, nodes sleep for a random time before contacting ``xcatd``, and retry. The default is ``no``. Not supported on AIX.
-These attributes may be changed based on the size of your cluster. For a large cluster, it is better to enable *useflowcontrol* and set *xcatmaxconnection = 128*, *xcatmaxbatchconnections = 100*. Then the daemon will only allow 100 concurrent connections from the nodes. This will allow 28 connections still to be available on the management node for xCAT commands (e.g nodels).
+These attributes may be changed based on the size of your cluster. For a large cluster, it is better to enable **useflowcontrol** and set ``xcatmaxconnections = 128`` and ``xcatmaxbatchconnections = 100``. The daemon will then allow only 100 concurrent connections from the nodes, leaving 28 connections available on the management node for xCAT commands (e.g., ``nodels``).
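+
+For example, the values suggested above can be set and verified from the management node with ``chdef`` and ``lsdef`` (``clustersite`` is the object name for the ``site`` table; depending on the attribute, you may need to restart ``xcatd`` for a change to take effect). A minimal sketch::
+
+   # enable flow control and raise the connection limits in the site table
+   chdef -t site -o clustersite useflowcontrol=yes xcatmaxconnections=128 xcatmaxbatchconnections=100
+   # verify the new values
+   lsdef -t site -o clustersite -i useflowcontrol,xcatmaxconnections,xcatmaxbatchconnections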