2.3. Planning node hardware configurations

Acronis Cyber Infrastructure works on top of commodity hardware, so you can create a cluster from regular servers, disks, and network cards. Still, to achieve optimal performance, a number of requirements must be met and a number of recommendations should be followed.

Note

If you are unsure of what hardware to choose, consult your sales representative. You can also use the online hardware calculator. If you want to avoid the hassle of testing, installing, and configuring hardware and/or software, consider using Acronis Appliance. Out of the box, you will get an enterprise-grade, fault-tolerant, five-node infrastructure solution with great storage performance that runs in a 3U form factor.

2.3.1. Hardware limits

The following table lists the current hardware limits for Acronis Cyber Infrastructure servers:

Table 2.3.1.1 Server hardware limits

Hardware | Theoretical | Certified
RAM | 64 TB | 1 TB
CPU | 5120 logical CPUs | 384 logical CPUs

In multi-core (multi-threaded) processors, a logical CPU is a core (thread).

2.3.2. Hardware requirements

The following table lists the minimum and recommended disk requirements according to the disk roles (refer to Storage architecture overview):

Table 2.3.2.1 Disk requirements

Disk role | Quantity | Minimum | Recommended
System | One disk per node | 100 GB SATA/SAS HDD | 250 GB SATA/SAS SSD
Metadata | One disk per node (five disks per cluster are recommended) | 100 GB enterprise-grade SSD with power loss protection and at least 1 DWPD endurance
Cache | Optional: one SSD disk per 4-12 HDDs | 100+ GB enterprise-grade SSD with power loss protection and 75 MB/s sequential write performance per serviced HDD; at least 1 DWPD endurance, 10 DWPD recommended
Storage | Optional: at least one per cluster | At least 100 GB (at most 16 TB) is recommended; SATA/SAS HDD or SATA/SAS/NVMe SSD (enterprise-grade, with power loss protection and at least 1 DWPD endurance)

The following table lists the amount of RAM and CPU cores that will be reserved on one node, according to the services you will use:

Table 2.3.2.2 CPU and RAM requirements

Service | RAM | CPU cores*
System | 6 GB | 2 cores
Storage services: each disk with the storage or cache role (any size)** | 1 GB | 0.2 cores
Compute service*** | 8 GB | 3 cores
Load balancer service*** | 1 GB | 1 core
Kubernetes service*** | 2 GB | 2 cores
S3 | 4.5 GB | 3 cores
Backup Gateway**** | 1 GB | 2 cores
NFS: service | 4 GB | 2 cores
NFS: each share | 0.5 GB | 0.5 cores
iSCSI: service | 1 GB | 1 core
iSCSI: each volume | 0.1 GB | 0.5 cores

* 64-bit x86 AMD-V or Intel VT processors with hardware virtualization extensions enabled. For Intel processors, enable VT-x with “Unrestricted guest” and “Extended page tables” in the BIOS. It is recommended to have the same CPU model on each node, to avoid VM live migration issues. A CPU core here is a physical core of a multi-core processor (hyper-threading is not taken into account).

** For clusters larger than 1 PB of physical space, please add an additional 0.5 GB of RAM per Metadata service.

*** The compute, load balancer, and Kubernetes service requirements only refer to the management node.

**** When working with public clouds and NFS, Backup Gateway consumes as much RAM and CPU as with local storage.

As for the network, at least two 10 GbE interfaces are recommended, for internal and external traffic; 25 GbE, 40 GbE, and 100 GbE are even better. Bonding is also recommended. For external traffic, you can start with 1 GbE links, but they may limit cluster throughput under modern loads.

Let’s consider some examples and calculate the requirements for particular cases; a short calculation sketch follows the examples below.

  • If you have 1 node (1 system disk and 4 storage disks) and want to use it for Backup Gateway, see the table below for the calculations.

    Table 2.3.2.3 Example: 1 node for Backup Gateway

    Service | Each node
    System | 6 GB, 2 cores
    Storage services | 4 storage disks with 1 GB and 0.2 cores, that is 4 GB and 0.8 cores
    Backup Gateway | 1 GB, 2 cores
    Total | 11 GB of RAM and 4.8 cores

  • If you have 3 nodes (1 system disk and 4 storage disks) and want to use them for the compute service, see the table below for the calculations.

    Table 2.3.2.4 Example: 3 nodes for the compute service

    Service | Management node | Each secondary node
    System | 6 GB, 2 cores | 6 GB, 2 cores
    Storage services | 4 storage disks with 1 GB and 0.2 cores, that is 4 GB and 0.8 cores | 4 storage disks with 1 GB and 0.2 cores, that is 4 GB and 0.8 cores
    Compute | 8 GB, 3 cores |
    Load balancer | 1 GB, 1 core |
    Kubernetes | 2 GB, 2 cores |
    Total | 21 GB of RAM and 8.8 cores | 10 GB of RAM and 2.8 cores

  • If you have 5 nodes (1 system+storage disk and 10 storage disks) and want to use them for Backup Gateway, see the table below for the calculations. Note that if the compute cluster is not deployed, the requirements are the same for the management and the secondary nodes.

    Table 2.3.2.5 Example: 5 nodes for Backup Gateway

    Service | Each node
    System | 6 GB, 2 cores
    Storage services | 11 storage disks with 1 GB and 0.2 cores, that is 11 GB and 2.2 cores
    Backup Gateway | 1 GB, 2 cores
    Total | 18 GB of RAM and 6.2 cores

  • If you have 10 nodes (1 system disk, 1 cache disk, 3 storage disks) and want to use them for the compute service, see the table below for the calculations. Note that three nodes are used for the management node high availability, and each of them meets the requirements for the management node.

    Table 2.3.2.6 Example: 10 nodes for the compute service with MN HA

    Service | Each management node | Each secondary node
    System | 6 GB, 2 cores | 6 GB, 2 cores
    Storage services | 3 storage + 1 cache disks with 1 GB and 0.2 cores, that is 4 GB and 0.8 cores | 3 storage + 1 cache disks with 1 GB and 0.2 cores, that is 4 GB and 0.8 cores
    Compute | 8 GB, 3 cores |
    Load balancer | 1 GB, 1 core |
    Kubernetes | 2 GB, 2 cores |
    Total | 21 GB of RAM and 8.8 cores | 10 GB of RAM and 2.8 cores
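
The totals in these examples are simple sums over Table 2.3.2.2. As a rough planning aid, here is a minimal Python sketch of that arithmetic; the helper name and structure are illustrative only and are not part of the product:

    # Rough per-node resource reservation estimate, mirroring Table 2.3.2.2.
    SERVICE_RESERVATIONS = {          # (RAM in GB, CPU cores)
        "system": (6, 2),
        "storage_disk": (1, 0.2),     # per disk with the storage or cache role
        "compute": (8, 3),            # management node only
        "load_balancer": (1, 1),      # management node only
        "kubernetes": (2, 2),         # management node only
        "backup_gateway": (1, 2),
        "s3": (4.5, 3),
    }

    def node_reservation(storage_disks, services):
        """Return (RAM in GB, CPU cores) reserved on one node."""
        ram, cores = SERVICE_RESERVATIONS["system"]
        ram += storage_disks * SERVICE_RESERVATIONS["storage_disk"][0]
        cores += storage_disks * SERVICE_RESERVATIONS["storage_disk"][1]
        for service in services:
            extra_ram, extra_cores = SERVICE_RESERVATIONS[service]
            ram += extra_ram
            cores += extra_cores
        return round(ram, 2), round(cores, 2)

    # Management node from Table 2.3.2.4: 4 storage disks plus the compute stack
    print(node_reservation(4, ["compute", "load_balancer", "kubernetes"]))  # (21, 8.8)
    # Secondary node from the same example: 4 storage disks, no extra services
    print(node_reservation(4, []))  # (10, 2.8)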

In general, the more resources you provide to the cluster, the better it performs. All extra RAM is used for caching disk reads, and extra CPU cores improve performance and reduce latency.

2.3.3. Hardware recommendations

In general, Acronis Cyber Infrastructure works on the same hardware that is recommended for Red Hat Enterprise Linux 7, including AMD EPYC processors; refer to Server components.

The following recommendations further explain the benefits added by specific hardware in the hardware requirements tables. Use them to configure your cluster in an optimal way.

2.3.3.1. Storage cluster composition recommendations

Designing an efficient storage cluster means finding a compromise between performance and cost that suits your purposes. When planning, keep in mind that a cluster with many nodes and few disks per node offers higher performance, while a cluster with the minimum number of nodes (3) and a lot of disks per node is cheaper. See the following table for more details.

Table 2.3.3.1.1 Cluster composition recommendations

Design considerations | Minimum nodes (3), many disks per node | Many nodes, few disks per node (all-flash configuration)
Optimization | Lower cost. | Higher performance.
Free disk space to reserve | More space to reserve for cluster rebuilding, as fewer healthy nodes will have to store the data from a failed node. | Less space to reserve for cluster rebuilding, as more healthy nodes will have to store the data from a failed node.
Redundancy | Fewer erasure coding choices. | More erasure coding choices.
Cluster balance and rebuilding performance | Worse balance and slower rebuilding. | Better balance and faster rebuilding.
Network capacity | More network bandwidth is required to maintain cluster performance during rebuilding. | Less network bandwidth is required to maintain cluster performance during rebuilding.
Favorable data type | Cold data (for example, backups). | Hot data (for example, virtual environments).
Sample server configuration | Supermicro SSG-6047R-E1R36L (Intel Xeon E5-2620 v1/v2 CPU, 32 GB RAM, 36 x 12 TB HDDs, a 500 GB system disk). | Supermicro SYS-2028TP-HC0R-SIOM (4 x Intel E5-2620 v4 CPUs, 4 x 16 GB RAM, 24 x 1.9 TB Samsung PM1643 SSDs).

Take note of the following:

  1. These considerations only apply if the failure domain is the host.

  2. The speed of rebuilding in the replication mode does not depend on the number of nodes in the cluster.

  3. Acronis Cyber Infrastructure supports hundreds of disks per node. If you plan to use more than 36 disks per node, contact our sales engineers who will help you design a more efficient cluster.

2.3.3.2. General hardware recommendations

  • At least five nodes are required for a production environment. This is to ensure that the cluster can survive the failure of two nodes without data loss.

  • One of the strongest features of Acronis Cyber Infrastructure is scalability: the bigger the cluster, the better it performs. For production scenarios, it is recommended to create clusters of at least ten nodes, for improved resilience, performance, and fault tolerance.

  • Even though a cluster can be created from dissimilar hardware, using nodes with similar hardware will result in better cluster performance, capacity, and overall balance.

  • Any cluster infrastructure must be tested extensively before it is deployed to production. Common points of failure, such as SSD drives and network adapter bonds, must always be thoroughly verified.

  • It is not recommended to run Acronis Cyber Infrastructure in production on top of SAN/NAS hardware that has its own redundancy mechanisms. Doing so may negatively affect performance and data availability.

  • To achieve the best performance, keep at least 20 percent of the cluster capacity free.

  • During disaster recovery, Acronis Cyber Infrastructure may need additional disk space for replication. Make sure to reserve at least as much space as any single storage node has.

  • It is recommended to have the same CPU models on each node to avoid VM live migration issues. For more details, refer to the Administrator Command Line Guide.

  • If you plan to use Backup Gateway to store backups in the cloud, make sure the local storage cluster has plenty of logical space for staging (keeping backups locally before sending them to the cloud). For example, if you perform backups daily, provide enough space for at least 1.5 days’ worth of backups. For more details, refer to the Administrator Guide.

  • It is recommended to use UEFI instead of BIOS if your hardware supports it, particularly if you use NVMe drives.
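
For example, the free-space and disaster-recovery reservations mentioned above can be estimated up front. The sketch below is a hypothetical planning aid, not an official sizing tool; it simply applies the two rules to a list of node capacities:

    # Hypothetical capacity-planning helper illustrating the reservation
    # recommendations above; not an official Acronis sizing formula.

    def capacity_reserves(node_capacities_tb):
        """node_capacities_tb: raw storage capacity of each node, in TB."""
        total = sum(node_capacities_tb)
        performance_reserve = 0.2 * total          # keep at least 20% of the cluster free
        rebuild_reserve = max(node_capacities_tb)  # at least the capacity of any single node
        return total, performance_reserve, rebuild_reserve

    # Example: five nodes with 48 TB of raw storage each
    total, perf, rebuild = capacity_reserves([48, 48, 48, 48, 48])
    print(f"raw: {total} TB, keep free: {perf} TB, rebuild reserve: {rebuild} TB")
    # raw: 240 TB, keep free: 48.0 TB, rebuild reserve: 48 TB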

2.3.3.3. Storage hardware recommendations

  • You can use disks of different sizes in the same cluster. However, keep in mind that, given the same IOPS, smaller disks offer higher performance per terabyte of data than larger ones. It is recommended to group disks with the same IOPS per terabyte in the same tier.

  • Using the recommended SSD models may help you avoid data loss. Not all SSD drives can withstand enterprise workloads, and they may break down in the first months of operation, resulting in a soaring TCO.

    • SSD memory cells can withstand a limited number of rewrites. An SSD drive should be viewed as a consumable that you will need to replace after a certain time. Consumer-grade SSD drives can withstand a very low number of rewrites (so low, in fact, that these numbers are not shown in their technical specifications). SSD drives intended for storage clusters must offer at least 1 DWPD endurance (10 DWPD is recommended). The higher the endurance, the less often SSDs will need to be replaced, and this will improve TCO.

    • Many consumer-grade SSD drives can ignore disk flushes and falsely report to operating systems that data was written while it, in fact, was not. Examples of such drives include OCZ Vertex 3, Intel 520, Intel X25-E, and Intel X-25-M G2. These drives are known to be unsafe in terms of data commits, they should not be used with databases, and they may easily corrupt the file system in case of a power failure. For these reasons, use enterprise-grade SSD drives that obey the flush rules (for more information, refer to http://www.postgresql.org/docs/current/static/wal-reliability.html). Enterprise-grade SSD drives that operate correctly usually have the power loss protection property in their technical specification. Some of the market names for this technology are Enhanced Power Loss Data Protection (Intel), Cache Power Protection (Samsung), Power-Failure Support (Kingston), and Complete Power Fail Protection (OCZ).

    • It is highly recommended to check the data flushing capabilities of your disks as explained in Checking disk data flushing capabilities.

    • Consumer-grade SSD drives usually have unstable performance and are not suited for sustained enterprise workloads. For this reason, pay attention to sustained load tests when choosing SSDs.

    • Performance of SSD disks may depend on their size. Lower-capacity drives (100 to 400 GB) may perform much slower (sometimes up to ten times slower) than higher-capacity ones (1.9 to 3.8 TB). Check the drive performance and endurance specifications before purchasing hardware.

  • Using NVMe or SAS SSDs for write caching improves random I/O performance and is highly recommended for all workloads with heavy random access (for example, iSCSI volumes). In turn, SATA disks are best suited for SSD-only configurations but not write caching.

  • Using shingled magnetic recording (SMR) HDDs is strongly not recommended, even for backup scenarios. Such disks have unpredictable latency, which may lead to unexpected temporary service outages and sudden performance degradations.

  • Running metadata services on SSDs improves cluster performance. The same SSDs can also be used for write caching, to minimize CAPEX.

  • If capacity is the main goal and you need to store infrequently accessed data, choose SATA disks over SAS ones. If performance is the main goal, choose NVMe or SAS disks over SATA ones.

  • The more disks per node, the lower the CAPEX. As an example, a cluster created from ten nodes with two disks in each will be less expensive than a cluster created from twenty nodes with one disk in each.

  • Using SATA HDDs with one SSD for caching is more cost-effective than using only SAS HDDs without such an SSD.

  • Create hardware or software RAID1 volumes for system disks by using RAID or HBA controllers, respectively, to ensure their high performance and availability.

  • Use HBA controllers, as they are less expensive and easier to manage than RAID controllers.

  • Disable all RAID controller caches for SSD drives. Modern SSDs have good performance that can be reduced by a RAID controller’s write and read cache. It is recommended to disable caching for SSD drives and leave it enabled only for HDD drives.

  • If you use RAID controllers, do not create RAID volumes from HDDs intended for storage. Each storage HDD needs to be recognized by Acronis Cyber Infrastructure as a separate device.

  • If you use RAID controllers with caching, equip them with backup battery units (BBUs), to protect against cache loss during power outages.

  • Disk block size (for example, 512b or 4K) is not important and has no effect on performance.

2.3.3.4. Network hardware recommendations

  • Use separate networks (and, preferably though optionally, separate network adapters) for internal and public traffic. Doing so will prevent public traffic from affecting cluster I/O performance and also prevent possible denial-of-service attacks from the outside.

  • Network latency dramatically reduces cluster performance. Use quality network equipment with low-latency links. Do not use consumer-grade network switches.

  • Do not use desktop network adapters like Intel EXPI9301CTBLK or Realtek 8129, as they are not designed for heavy loads and may not support full-duplex links. Also, use non-blocking Ethernet switches.

  • To avoid intrusions, Acronis Cyber Infrastructure should be on a dedicated internal network inaccessible from outside.

  • Use one 1 Gbit/s link per each two HDDs on the node (rounded up). For one or two HDDs on a node, two bonded network interfaces are still recommended for high network availability. The reason for this recommendation is that 1 Gbit/s Ethernet networks can deliver 110-120 MB/s of throughput, which is close to the sequential I/O performance of a single disk. Since several disks on a server can deliver higher throughput than a single 1 Gbit/s Ethernet link, networking may become a bottleneck (see the bandwidth sketch after this list).

  • For maximum sequential I/O performance, use one 1 Gbit/s link per each hard drive or one 10 Gbit/s link per node. Even though I/O operations are most often random in real-life scenarios, sequential I/O is important in backup scenarios.

  • For maximum overall performance, use one 10 Gbit/s link per node (or two bonded links for high network availability).

  • It is not recommended to configure 1 Gbit/s network adapters to use non-default MTUs (for example, 9000-byte jumbo frames). Such settings require additional configuration of switches and often lead to human error. 10+ Gbit/s network adapters, on the other hand, need to be configured to use jumbo frames to achieve full performance.

  • The currently supported Fibre Channel host bus adapters (HBAs) are QLogic QLE2562-CK and QLogic ISP2532.

  • It is recommended to use Mellanox ConnectX-4 and ConnectX-5 InfiniBand adapters. Mellanox ConnectX-2 and ConnectX-3 cards are not supported.

  • Adapters using the BNX2X driver, such as Broadcom Limited BCM57840 NetXtreme II 10/20-Gigabit Ethernet / HPE FlexFabric 10Gb 2-port 536FLB Adapter, are not recommended. They limit the MTU to 3616, which affects cluster performance.
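
To make the bandwidth reasoning above concrete, here is a small illustrative calculation; the per-disk and per-link throughput figures are assumptions for the sake of the example, not measured values:

    # Illustrative check: can a single link keep up with the disks in a node?
    LINK_MBPS = 115   # ~1 Gbit/s Ethernet delivers roughly 110-120 MB/s
    DISK_MBPS = 110   # assumed sequential throughput of one HDD

    def network_is_bottleneck(hdd_count, link_mbps=LINK_MBPS, disk_mbps=DISK_MBPS):
        """True if the aggregate sequential disk throughput exceeds a single link."""
        return hdd_count * disk_mbps > link_mbps

    print(network_is_bottleneck(1))   # False: one disk roughly matches one 1 Gbit/s link
    print(network_is_bottleneck(4))   # True: four disks easily saturate a single link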

2.3.4. Hardware and software limitations

Hardware limitations:

  • Each management node must have at least two disks (one for system and metadata, one for storage).

  • Each secondary node must have at least three disks (one for system, one for metadata, one for storage).

  • Three servers are required to test all of the product features.

  • The system disk must have at least 100 GB of space.

  • The admin panel requires a Full HD monitor to be displayed correctly.

  • The maximum supported physical partition size is 254 TiB.

Software limitations:

  • One node can be a part of only one cluster.

  • Only one S3 cluster can be created on top of a storage cluster.

  • Only the predefined redundancy modes are available in the admin panel.

  • Thin provisioning is always enabled for all data and cannot be configured otherwise.

  • The admin panel has been tested to work at resolutions 1280x720 and higher in the following web browsers: latest Firefox, Chrome, Safari.

For network limitations, refer to Network limitations.

2.3.5. Minimum storage configuration

The minimum configuration described in this table will let you evaluate the features of the storage cluster. It is not meant for production.

Table 2.3.5.1 Minimum cluster configuration

Node # | 1st disk role | 2nd disk role | 3rd and later disk roles | Access points
1 | System | Metadata | Storage | iSCSI, S3 private, S3 public, NFS, Backup Gateway
2 | System | Metadata | Storage | iSCSI, S3 private, S3 public, NFS, Backup Gateway
3 | System | Metadata | Storage | iSCSI, S3 private, S3 public, NFS, Backup Gateway
3 nodes in total | | 3 MDSs in total | 3+ CSs in total | Access point services run on three nodes in total.

Note

SSD disks can be assigned the System, Metadata, and Cache roles at the same time, freeing up more disks for the storage role.

Even though three nodes are recommended for the minimum configuration, you can start evaluating Acronis Cyber Infrastructure with just one node and add more nodes later. At the very least, a storage cluster must have one running metadata service and one chunk service. A single-node installation will let you evaluate services such as iSCSI and Backup Gateway. However, such a configuration will have two main limitations:

  1. Just one MDS will be a single point of failure. If it fails, the entire cluster will stop working.

  2. Just one CS will be able to store only one chunk replica. If it fails, the data will be lost.

Important

If you deploy Acronis Cyber Infrastructure on a single node, you must take care of making its storage persistent and redundant, to avoid data loss. If the node is physical, it must have multiple disks so you can replicate the data among them. If the node is a virtual machine, make sure that this VM is made highly available by the solution it runs on.

Note

Backup Gateway works with the local object storage in the staging mode. This means that the data to be replicated, migrated, or uploaded to a public cloud is first stored locally and only then sent to the destination. It is vital that the local object storage is persistent and redundant, so that the local data does not get lost. There are multiple ways to ensure the persistence and redundancy of the local storage. You can deploy Backup Gateway on multiple nodes and select a good redundancy mode. If the gateway is deployed on a single node in Acronis Cyber Infrastructure, you can make its storage redundant by replicating it among multiple local disks. If the entire Acronis Cyber Infrastructure installation is deployed in a single virtual machine with the sole purpose of creating the gateway, make sure that this VM is made highly available by the solution it runs on.

2.3.7. Raw disk space considerations

When planning your infrastructure, keep in mind the following to avoid confusion:

  • The capacity of HDDs and SSDs is measured and specified with decimal, not binary prefixes, so “TB” in disk specifications usually means “terabyte.” The operating system, however, displays drive capacity using binary prefixes, meaning that “TB” is “tebibyte,” which is a noticeably larger number. As a result, disks may show a capacity smaller than the one marketed by the vendor. For example, a disk marketed with 6 TB in its specifications may be shown to have 5.45 TB of actual disk space in Acronis Cyber Infrastructure.

  • 5 percent of disk space is reserved for emergency needs.

Therefore, if you add a 6 TB disk to a cluster, the available physical space should increase by about 5.2 TB.
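
The 5.2 TB figure follows directly from these two points. Here is a quick check of the arithmetic; the helper below is only an illustration:

    # Quick check of the raw-space arithmetic above.
    TIB = 2**40                                 # one binary tebibyte, in bytes

    def usable_space_tib(marketed_tb, reserve=0.05):
        """Convert a vendor-marketed decimal-TB figure to usable TiB after the 5% reserve."""
        raw_bytes = marketed_tb * 10**12        # decimal terabytes
        return raw_bytes / TIB * (1 - reserve)

    print(round(6 * 10**12 / TIB, 2))           # ~5.46, shown as about 5.45 TB by the OS
    print(round(usable_space_tib(6), 2))        # ~5.18, roughly the 5.2 TB added to the cluster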

2.3.8. Checking disk data flushing capabilities

It is highly recommended to ensure that all storage devices you plan to include in your cluster can flush data from cache to disk if the power goes out unexpectedly. This way, you will find devices that may lose data in a power failure.

Acronis Cyber Infrastructure ships with the vstorage-hwflush-check tool, which checks how a storage device flushes data to disk in an emergency. The tool is implemented as a client/server utility:

  • The client continuously writes blocks of data to the storage device. When a data block is written, the client increases a special counter and sends it to the server, which keeps it.

  • The server keeps track of counters incoming from the client and always knows the next counter number. If the server receives a counter smaller than the one it has (for example, because the power has failed and the storage device has not flushed the cached data to disk), the server reports an error.
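
To make the counter handshake easier to picture, here is a minimal, purely illustrative sketch of the server-side check. It is not the vstorage-hwflush-check source code; the class and method names are assumptions:

    # Illustrative sketch of the server-side counter check; not the actual tool.
    class FlushCheckServer:
        def __init__(self):
            self.latest = {}            # thread id -> latest counter received

        def report(self, thread_id, counter):
            """Compare an incoming counter with the latest one known for this thread."""
            latest = self.latest.get(thread_id, -1)
            if counter < latest:
                # The device acknowledged writes it never persisted: data was lost.
                raise RuntimeError(
                    f"id{thread_id}: {counter} -> {latest}: device lost flushed data")
            # counter == latest: the data was flushed and reported (safe to use).
            # counter > latest: the data was flushed, but the client did not manage
            # to report it before the power was cut (also safe).
            self.latest[thread_id] = counter

    server = FlushCheckServer()
    server.report(0, 0)
    server.report(0, 1)
    # After a power cut, the client re-reads the last valid counter on disk and
    # reports it; a value lower than the server's latest counter signals lost data.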

To check that a storage device can successfully flush data to disk when the power fails, follow these steps:

  1. On one node, run the server:

    # vstorage-hwflush-check -l
    
  2. On a different node that hosts the storage device you want to test, run the client. For example:

    # vstorage-hwflush-check -s vstorage1.example.com -d /vstorage/stor1-ssd/test -t 50
    

    Where:

    • vstorage1.example.com is the hostname of the server.

    • /vstorage/stor1-ssd/test is the directory to use for data flush testing. During execution, the client creates a file in this directory and writes data blocks to it.

    • 50 is the number of threads for the client to write data to disk. Each thread has its own file and counter. You can increase the number of threads (max. 200) to test your system in more stressful conditions. You can also specify other options when running the client. For more information on available options, refer to the vstorage-hwflush-check manual page.

  3. Wait for at least 10-15 seconds, cut power from the client node (either press the Power button or pull the power cord out), and then power it on again.

  4. Restart the client:

    # vstorage-hwflush-check -s vstorage1.example.com -d /vstorage/stor1-ssd/test -t 50
    

Once started, the client reads all previously written data, determines the version of the data on the disk, and restarts the test from the last valid counter. It then sends this valid counter to the server, and the server compares it with the latest counter it has. The output you see may look like this:

id<N>:<counter_on_disk> -> <counter_on_server>

It means one of the following:

  • If the counter on the disk is lower than the counter on the server, the storage device has failed to flush the data to the disk. Avoid using this storage device in production, especially for CS or journals, as you risk losing data.

  • If the counter on the disk is higher than the counter on the server, the storage device has flushed the data to the disk but the client has failed to report it to the server. The network may be too slow or the storage device may be too fast for the set number of load threads, so consider increasing it. This storage device can be used in production.

  • If the two counters are equal, the storage device has flushed the data to the disk, and the client has reported it to the server. This storage device can be used in production.

To be on the safe side, repeat the procedure several times. Once you have checked your first storage device, continue with the remaining devices you plan to use in the cluster: SSD disks used for CS journaling, as well as disks used for MDS journals and chunk servers.