.. _Creating the Storage Cluster:

Creating the Storage Cluster
----------------------------

Before you create the storage cluster, enable high availability of the management node as described in :ref:`Enabling High Availability`.

To create a storage cluster, you need to create a basic storage cluster on one (first) node, then populate it with more nodes.

If network adapters on your nodes support RDMA (via RoCE, iWARP, or InfiniBand) and you want to enable this functionality, you must do so before creating the storage cluster, as explained in :ref:`Enabling RDMA`.

.. _Creating the Storage Cluster on the First Node:

Creating the Storage Cluster on the First Node
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#. .. include:: /includes/managing-storage-cluster-part1.inc

#. .. include:: /includes/managing-storage-cluster-part2.inc

#. .. include:: /includes/managing-storage-cluster-part3.inc

#. From the **Storage interface** drop-down list, select a node network interface connected to a network with the traffic type **Storage**.

   .. include:: /includes/managing-storage-cluster-part5.inc

#. If required, enable data encryption. To do this, check the **Encryption** box (see :ref:`Managing Tier Encryption`) and proceed to create the cluster. Encryption will be enabled for all tiers by default. To enable encryption only for particular tiers, click the cogwheel icon to open the **Encryption Configuration** panel, select the tiers to encrypt, and click **Done**. You can later disable encryption for new chunk services (CS) on the **SETTINGS** > **Advanced settings** panel.

#. Click **New cluster** to have |product_name| assign the roles to disks automatically. Alternatively, click **Advanced configuration** to assign the roles to each drive manually and tweak other settings.

   .. include:: /includes/managing-storage-cluster-part4.inc

.. _Adding Nodes to Storage Cluster:

Adding Nodes to Storage Cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add an unassigned node to a cluster, do the following:

#. On the **INFRASTRUCTURE** > **Nodes** screen, click an unassigned node.

#. On the node overview screen, click **Join cluster**.

#. Make sure a network interface that is connected to a network with the traffic type **Storage** is selected from the **Storage interface** drop-down list.

   .. include:: /includes/managing-storage-cluster-part5.inc

   .. only:: ac

      .. image:: /images/stor_image24_1_ac.png
         :align: center
         :class: align-center

   .. only:: vz

      .. image:: /images/stor_image24_1_vz.png
         :align: center
         :class: align-center

#. Click **Join cluster** to have |product_name| assign the roles to disks automatically and add the node to the current cluster. Alternatively, click **Advanced configuration** to assign the roles to each drive manually (see :ref:`Assigning Disk Roles Manually`).

.. _Assigning Disk Roles Manually:

Assigning Disk Roles Manually
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you clicked **Advanced configuration** while creating a cluster or adding nodes to it, you will be taken to the list of drives on the node, where you can manually assign roles to these drives. Do the following:

#. On the **Join cluster** or **New cluster** panel, select a drive or check multiple drives in the list, and click **Configure**.

#. On the **Choose role** screen, select one of the following roles for the disk:

   .. only:: ac

      .. image:: /images/stor_image24_2_ac.png
         :align: center
         :class: align-center

   .. only:: vz

      .. image:: /images/stor_image24_2_vz.png
         :align: center
         :class: align-center

   - **Storage**. Use the disk to store chunks and run a chunk service on the node. From the **Caching and checksumming** drop-down list, select one of the following:

     - **Use SSD for caching and checksumming**. Available and recommended only for nodes with SSDs.
     - **Enable checksumming** (default). Recommended for cold data, as it provides better reliability.
     - **Disable checksumming**. Recommended for hot data, as it provides better performance.

     Data caching improves cluster performance by placing frequently accessed data on an SSD. Data checksumming generates a checksum each time data in the cluster is modified. When this data is later read, a new checksum is computed and compared with the stored one. If the two are not identical, the read operation is performed again, thus providing better data reliability and integrity.

     If a node has an SSD, it will automatically be configured to keep checksums when you add the node to a cluster. This is the recommended setup. However, if a node does not have an SSD drive, checksums will be stored on a rotational disk by default. This means that the disk will have to handle double the I/O, because each data read/write operation entails a corresponding checksum read/write operation. For this reason, you may want to disable checksumming on nodes without SSDs to gain performance at the expense of checksums. This can be especially useful for hot data storage.

     To add an SSD to a node that is already in the cluster (or replace a broken SSD), you will need to release the node from the cluster, attach the SSD, join the node to the cluster again, and, while doing so, select **Use SSD for caching and checksumming** for each disk with the role **Storage**.

     With the **Storage** role, you can also select a tier from the **Tier** drop-down list. To make better use of data redundancy, do not assign all the disks on a node to the same tier. Instead, make sure that each tier is evenly distributed across the cluster, with only one disk per node assigned to it. For more information, see the *Installation Guide*.

     .. note:: If the disk contains old data that was not placed there by |product_name|, the disk will not be considered suitable for use in |product_name|.

   - **Metadata**. Use the disk to store metadata and run a metadata service on the node.

   - **Cache**. Use the disk to store write cache. This role is only for SSDs. To cache a specific storage tier, select it from the drop-down list. Otherwise, all tiers will be cached.

   - **Metadata+Cache**. A combination of the two roles described above.

   - **Unassigned**. Remove the roles from the disk.

   Take note of the following:

   - If a physical server has a system disk with a capacity greater than 100 GB, that disk can additionally be assigned the **Metadata** or **Storage** role. In this case, a physical server can have as few as two disks.
   - It is recommended to assign the **System+Metadata** role to an SSD. Assigning both these roles to an HDD will result in mediocre performance suitable only for cold data (e.g., archiving).
   - The **System** role cannot be combined with the **Cache** and **Metadata+Cache** roles. The reason is that I/O generated by the operating system and applications would contend with I/O generated by journaling, negating its performance benefits.

#. Click **Done**.

#. Repeat steps 1 to 3 for every disk you want to be used in the storage cluster.

#. Click **NEW CLUSTER** or **JOIN CLUSTER**. On the **Configuration summary** screen, check the number of disks per each configuration category.

   .. only:: ac

      .. image:: /images/stor_image24_3_ac.png
         :align: center
         :class: align-center

   .. only:: vz

      .. image:: /images/stor_image24_3_vz.png
         :align: center
         :class: align-center

#. Click **PROCEED**. You can monitor disk configuration progress in the **HEALTHY** list on the **INFRASTRUCTURE** > **Nodes** screen.
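The checksumming behavior described above (compute a checksum on every write, verify it on every read, and retry the read on a mismatch) can be sketched as follows. This is a minimal conceptual illustration in Python, not the actual chunk service implementation: the `ChecksummedStore` class and the choice of CRC32 are assumptions made purely for demonstration.

```python
import zlib


class ChecksummedStore:
    """Toy in-memory store illustrating checksum-on-write, verify-on-read.

    A conceptual sketch only; the real chunk service works on disks,
    not dictionaries, and its checksum format is not documented here.
    """

    def __init__(self):
        self._data = {}       # chunk id -> payload bytes
        self._checksums = {}  # chunk id -> CRC32 of the payload

    def write(self, chunk_id, payload):
        # Every data write entails a corresponding checksum write --
        # the "double I/O" the text mentions when both live on one HDD.
        self._data[chunk_id] = payload
        self._checksums[chunk_id] = zlib.crc32(payload)

    def read(self, chunk_id, retries=1):
        # Recompute the checksum on read and compare it with the stored
        # one; on a mismatch, perform the read operation again.
        for _ in range(retries + 1):
            payload = self._data[chunk_id]
            if zlib.crc32(payload) == self._checksums[chunk_id]:
                return payload
        raise IOError(f"checksum mismatch for chunk {chunk_id}")


store = ChecksummedStore()
store.write("chunk-0001", b"frequently accessed block")
data = store.read("chunk-0001")
```

The sketch also shows why disabling checksumming trades reliability for performance: dropping the `_checksums` bookkeeping halves the store's I/O per operation but removes the corruption check on read.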