Data storage is one of the most important areas of computing, and it emerged with the advent of non-volatile storage devices. Storage systems of different scales are used everywhere: in banks, shops, and enterprises. As storage requirements grow, so does the complexity of the storage systems themselves.
Reliably storing data in large volumes, as well as withstanding physical media failures, is a very interesting and complex engineering challenge.
Storage is usually understood as recording data on storage devices so that the data can be used later. Let's skip the history of how storage has been organized and take a closer look at how storage systems are classified by different criteria.
The connection method has the following options:
- Internal. This is the classic way of connecting disks in computers: the drives are installed directly in the same case as the system that uses them. Typical buses are SATA and SAS; IDE and (parallel) SCSI are outdated.
- External. The drives are connected over an external bus such as FC (Fibre Channel), SAS, or IB (InfiniBand), or through high-speed network cards.
If we consider the form of data storage, then the following are clearly distinguished:
- Files (named data areas). The most popular form of data storage: the user and the storage system work with the same structure of named files.
- Blocks. Fixed-size areas whose internal data structure is defined by the user. Access is faster because there is no file-to-block translation layer, which the file form requires.
- Objects. Data is stored in a flat structure, without a directory hierarchy, as objects with metadata.
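The difference between the three forms is easiest to see in how data is addressed. The following is a minimal in-memory sketch, not a real storage implementation: the class names, the 4-byte block size, and the method names are all assumptions made for illustration.

```python
# Toy, in-memory models of the three storage forms.
# All names and sizes here are illustrative assumptions.

BLOCK_SIZE = 4  # bytes per block, kept tiny for the example


class BlockDevice:
    """Fixed-size blocks addressed by number; the caller defines the layout."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.blocks[lba] = data

    def read_block(self, lba):
        return self.blocks[lba]


class FileStore:
    """Named data areas addressed by a hierarchical path."""

    def __init__(self):
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]


class ObjectStore:
    """Flat key space; every object is stored together with its metadata."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data, metadata):
        self.objects[key] = (data, dict(metadata))

    def get(self, key):
        return self.objects[key]


dev = BlockDevice(num_blocks=8)
dev.write_block(2, b"abcd")               # addressed by block number

fs = FileStore()
fs.write("/reports/2024/q1.txt", b"ok")   # addressed by path

store = ObjectStore()
store.put("q1-report", b"ok", {"content-type": "text/plain"})  # key + metadata
```

Note that the block device knows nothing about what the bytes mean, while the object store keeps descriptive metadata next to the data itself.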
By implementation, clear boundaries are quite difficult to draw, but two groups can be noted:
- Hardware, such as RAID and HBA controllers and specialized storage systems.
- Software. For example, RAID implementations (including those built into file systems such as Btrfs), specialized network file systems (NFS) and protocols (iSCSI), and software-defined storage (SDS).
Let’s take a closer look at some technologies, their advantages, and their disadvantages.
Direct Attached Storage (DAS) is historically the first way of connecting media and is still used today. The computer in which the drive is installed has exclusive use of it and accesses it block by block, which gives the maximum data exchange speed with minimal latency. It is also the cheapest way to organize a data storage system, but it is not without drawbacks.
For example, if enterprise data has to be stored across several servers, DAS does not allow the servers to share their disks with one another, so the storage system will not be used optimally: some servers will run short of disk space, while others will not fully use what they have.
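A small calculation makes this utilization problem concrete. The capacities and demands below are made-up numbers, and the function names are hypothetical; the sketch only compares per-server disks with the same disks treated as one pool.

```python
def unplaced_demand(capacity_tb, demand_tb):
    """Data (TB) that does not fit when each server can use only its own disks."""
    return sum(max(d - c, 0) for c, d in zip(capacity_tb, demand_tb))


def stranded_free_space(capacity_tb, demand_tb):
    """Free space (TB) left on servers that need less than they have."""
    return sum(max(c - d, 0) for c, d in zip(capacity_tb, demand_tb))


# Illustrative numbers: three servers with 10 TB of local disk each.
capacity = [10, 10, 10]
demand = [14, 3, 5]

shortfall = unplaced_demand(capacity, demand)     # server 0 lacks 4 TB
stranded = stranded_free_space(capacity, demand)  # 7 TB + 5 TB sit idle
pooled_free = sum(capacity) - sum(demand)         # as one pool, everything fits

print(shortfall, stranded, pooled_free)
```

With local disks only, one server cannot store 4 TB of its data even though 12 TB is free elsewhere; as a shared pool, the same hardware would hold all the data with 8 TB to spare.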
Single-drive configurations are most commonly used for non-demanding workloads, typically at home. For professional and industrial applications, several drives are usually combined into a RAID array, either in software or with a hardware RAID card, to achieve fault tolerance and/or higher speed than a single drive can offer.
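One common way an array achieves fault tolerance is parity: an extra block is computed as the XOR of the data blocks, and any single lost block can be rebuilt from the rest. The sketch below shows only this idea on three tiny in-memory "drives"; it is not a RAID implementation, and the function name and data are assumptions for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One block per data drive (illustrative contents).
data_blocks = [b"disk", b"fail", b"safe"]

# The parity block would be stored on an additional drive.
parity = xor_blocks(data_blocks)

# Simulate losing the second drive and rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)
```

The rebuild works because XOR-ing a value with itself cancels out, so combining the parity with the surviving blocks leaves exactly the missing one.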
It is also possible to cache the most frequently used data on a faster but less capacious solid-state drive, achieving both high capacity and high speed for the computer's disk subsystem.
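The caching idea can be sketched as a small, fast cache in front of a large, slow backing store. The example below assumes a least-recently-used (LRU) eviction policy purely for illustration; real caching layers may use other policies, and the class and attribute names are hypothetical.

```python
from collections import OrderedDict


class CachedDisk:
    """Read cache (standing in for an SSD) in front of a slow backing store."""

    def __init__(self, backing, cache_blocks):
        self.backing = backing          # dict: block number -> data (the "HDD")
        self.cache = OrderedDict()      # ordered from least to most recently used
        self.cache_blocks = cache_blocks
        self.hits = 0
        self.misses = 0

    def read(self, lba):
        if lba in self.cache:
            self.cache.move_to_end(lba)     # mark as most recently used
            self.hits += 1
            return self.cache[lba]
        self.misses += 1
        data = self.backing[lba]            # slow path: go to the backing store
        self.cache[lba] = data
        if len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data


backing = {i: bytes([i]) for i in range(10)}
disk = CachedDisk(backing, cache_blocks=2)

for lba in [1, 2, 1, 3, 2]:
    disk.read(lba)

print(disk.hits, disk.misses)
```

The more the workload keeps returning to the same blocks, the larger the share of reads served from the fast device.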
Storage Area Network (SAN) is a technology for organizing a storage system over a dedicated network, which lets you connect disks to servers using specialized equipment. This solves the problem of uneven disk space utilization across servers and also eliminates the single points of failure that are inevitably present in DAS-based storage systems.
SAN most often uses Fibre Channel (FC), but it is not tied to any particular transport technology. Drives are accessed in block mode, and the SCSI and NVMe protocols used to communicate with them are encapsulated in FC frames or in standard TCP packets, as in an iSCSI-based SAN.
The disadvantages of such a system are high cost and complexity: to ensure fault tolerance, servers need several access paths (multipath) to the disk shelves, which means, at a minimum, duplicated fabrics. Also, although in theory any number of devices can be connected to each other, physical limitations (the speed of light in general and the capacity of the switching fabric inside switches in particular) mean that in practice there are usually limits on the number of connections (including between switches), the number of disk shelves, and the like.
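The multipath idea can be illustrated with a simple failover loop: the server tries one path to the disk shelf and, if that path is down, repeats the request over another. This is only a sketch of the principle; the exception type, the function names, and the two "fabrics" are hypothetical and do not correspond to any real multipath driver.

```python
class PathDown(Exception):
    """Raised when a path to the disk shelf is unavailable."""


def read_with_failover(paths, lba):
    """Try each (name, read_fn) path in order; return (name, data) from the first that works."""
    errors = []
    for name, read_fn in paths:
        try:
            return name, read_fn(lba)
        except PathDown as exc:
            errors.append(f"{name}: {exc}")
    raise PathDown("all paths failed: " + "; ".join(errors))


def fabric_a(lba):
    # Hypothetical failed fabric: every request through it fails.
    raise PathDown("switch unreachable")


def fabric_b(lba):
    # Hypothetical healthy fabric: returns dummy data for the block.
    return f"block-{lba}".encode()


used_path, data = read_with_failover(
    [("fabric-A", fabric_a), ("fabric-B", fabric_b)], 7
)
print(used_path, data)
```

The read succeeds only because a second, independent path exists, which is exactly why the fabrics have to be duplicated and why the cost grows.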
Network-attached storage (NAS), or network file storage, presents disk resources as files (or objects) over network protocols such as NFS, SMB, and others. It is essentially built on DAS, but the key difference is shared file access.
Since the work is carried out over a network, the storage system itself can be as far away from consumers as you like (within reasonable limits, of course). In enterprises and data centers this is also a disadvantage, since storage traffic consumes the bandwidth of the main network, although this can be mitigated by using dedicated NICs to access the NAS.
Also, compared to SAN, the work of clients is simplified, since the NAS server takes care of sharing and related issues.
What to choose: NAS or SAN?
The answer depends on the capabilities and needs of the company. In addition, these two forms of storage solve different problems.
NAS provides file access and information sharing for applications on heterogeneous server platforms on a local area network. SAN provides high-performance block access for databases, as well as storage consolidation that improves reliability and efficiency.
In reality, SAN and NAS often coexist in a company's distributed IT infrastructure, which inevitably raises the problem of managing storage resources and using them optimally.
Today, manufacturers are looking for ways to combine both technologies into a single network storage infrastructure that will consolidate data, centralize backups, and simplify overall administration, scaling, and data protection. The convergence of NAS and SAN is one of the most important recent trends.
The fact that storage resources should be unified and networked is no longer in doubt today. Disk arrays connected directly to individual servers no longer meet the needs of large organizations with complex distributed IT infrastructures. Ways of optimal consolidation are being sought. In this regard, building single network systems that combine the capabilities of SAN and NAS is just one of the steps toward the global integration of enterprise storage systems.