Weeks 4-5: Storage: Redundancy, Network Storage Protocols, & Performance
Mentor : Merve

Storage Redundancy

1. What is RAID?

RAID (Redundant Array of Independent Disks) is a data storage virtualization technology that combines multiple physical disk drive components into one or more logical units. The purpose of RAID is to enhance data redundancy and/or performance. RAID achieves these goals by distributing data across multiple disks in various ways, known as "RAID levels," each with its own balance of performance, data availability, and storage capacity.

Concepts of RAID:

Data Redundancy: RAID can provide fault tolerance by duplicating data across multiple disks, allowing the system to continue operating even if one disk fails. This is crucial for maintaining data integrity and continuous operation in environments like servers and data centers.

Performance Improvement: Some RAID configurations can enhance performance. For example, RAID can allow multiple disks to read and write data concurrently, speeding up these operations significantly compared to a single disk.

Capacity Utilization: RAID allows the combination of multiple disks into a single logical unit, which can be seen by the operating system as one large disk, simplifying storage management.

2. Important concepts (Mirroring, Striping and Metadata)

Mirroring is a data redundancy technique used in RAID configurations, specifically in RAID 1 and RAID 10. In mirroring, the same data is written simultaneously to two or more disks. 
This ensures that there is a complete copy of the data on each disk.

Striping is a technique used to improve the performance and efficiency of data storage. 
In striping, data is divided into equally-sized blocks (stripes) and distributed across multiple disks in a sequential manner. This technique is used in RAID 0, RAID 5, RAID 6, and RAID 10.

Metadata refers to the data that describes the structure, configuration, and state of the RAID array. 
This metadata is essential for the RAID controller (hardware or software) to manage the RAID array and to ensure data integrity and redundancy. Metadata includes information about how the data is organized across the disks in the array, and it helps in rebuilding the array in case of disk failures.

3. Common types of RAID configurations.

RAID 0 (Striping): This level splits data across multiple disks, increasing performance by allowing reads and writes to be performed simultaneously on all disks. 
However, it offers no redundancy, and if one disk fails, all data in the array is lost.

RAID 1 (Mirroring): Data is copied identically to two or more disks. 
This provides high data redundancy, as data can be read from any of the mirrored drives. If one drive fails, the system can switch to a backup drive. The downside is that it requires double the storage capacity to store the data.

RAID 5 (Striping with Parity): This level uses striping (as in RAID 0) along with parity information, which is distributed among the disks. 
It provides a good balance of performance, storage efficiency, and data security. If one disk fails, the data can be reconstructed from the parity information contained on the other disks.

RAID 6 (Striping with Double Parity): Similar to RAID 5, but it includes two parity blocks instead of one. 
This allows the array to withstand the failure of two disks simultaneously without data loss.

RAID 10 (or 1+0): This level combines mirroring and striping to provide both redundancy and improved performance. 
It requires a minimum of four disks and offers better fault tolerance and rebuild performance than RAID 5.
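A quick capacity check helps when choosing between these levels. As a rough worked example (using the eight 4 GB virtual disks from the lab later in this document), approximate usable capacities are:

RAID 0: n × disk size = 8 × 4 GB = 32 GB (no redundancy)
RAID 1: the size of one disk per mirror set (a two-disk mirror of 4 GB disks yields 4 GB usable)
RAID 5: (n - 1) × disk size = 7 × 4 GB = 28 GB (one disk's worth of parity)
RAID 6: (n - 2) × disk size = 6 × 4 GB = 24 GB (two disks' worth of parity)
RAID 10: (n / 2) × disk size = 4 × 4 GB = 16 GB (every stripe is mirrored)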

Software vs. Hardware RAID:
Hardware RAID: Managed by a dedicated processor on the RAID card or motherboard. It provides better performance, especially for high-demand systems, as it does not use CPU resources from the host.

Software RAID: Managed by the operating system's disk management tools. It's typically cheaper because it doesn't require additional hardware. However, it may consume more CPU resources compared to hardware RAID.

Applications of RAID:
RAID is commonly used in servers and data centers where data availability and speed are critical. It is also used in NAS (Network Attached Storage) devices, enterprise storage systems, and by individuals needing robust data protection for their critical data.

Considerations:
While RAID can protect against disk failures, it is not a substitute for regular backups. RAID does not protect against data corruption, virus attacks, or site-related disasters (such as fire or flooding). Thus, maintaining separate backups and disaster recovery plans is crucial for comprehensive data protection.

4. Thought Exercise:
Many of BioHPC's storage solutions implement RAID 6 arrays. Why?
RAID provides redundancy, but does it provide high availability? Revisit this after digesting the concepts related to NFS and parallel file systems.

BioHPC likely implements RAID 6 arrays due to the following reasons:

High Fault Tolerance: RAID 6 can withstand the failure of two disks simultaneously, providing a higher level of data protection which is crucial for critical HPC (High-Performance Computing) applications where data integrity is paramount.

Cost-Effective Redundancy: Compared to RAID 10, RAID 6 offers a better compromise between redundancy and usable storage capacity, making it more cost-effective for large storage arrays.

Scalability: RAID 6 is suitable for larger storage arrays, which are common in HPC environments where large datasets need to be stored and processed.

RAID and High Availability

RAID provides redundancy, which helps protect against data loss due to disk failures. However, redundancy alone does not equate to high availability. 

High availability involves ensuring that systems are operational and accessible most of the time, which requires additional measures beyond RAID, such as:

NFS (Network File System): Allows multiple clients to access shared storage over a network, contributing to high availability by providing continuous access to data even if some servers are down.

Parallel File Systems: Enhance performance and reliability by distributing data and workloads across multiple servers and storage devices. Examples include Lustre and GPFS (IBM Spectrum Scale), which are commonly used in HPC environments to ensure high availability and performance.

In summary, while RAID provides essential redundancy and fault tolerance, achieving high availability requires a combination of networked storage solutions, load balancing, and failover mechanisms.



RAID6 Configuration (on NFS Server)


1. Note: the VM configured with RAID will be your test NFS Server
for convenience's sake: make your VMs from the same base install template (identical UID & GID for local admin user(s) )

On the VM 'Centos 7 Clone', I created a group called adminuser:
sudo groupadd -g 1729 adminuser

then added the admin user with:
sudo useradd -u 1729 -g 1729 -m adminuser

and checked it using:
id adminuser
uid=1729(adminuser) gid=1729(adminuser) groups=1729(adminuser)

I then cloned 'Centos 7 Clone' to create another VM called 'Centos 7 Clone 2' (the names were later changed; the RAID/NFS server VM became 'Centos 7 Clone (NFS)').

2. Create RAID Array on NFS Server:

2.1 Attach eight 4GB virtual SATA drives via the VM's Storage settings in VirtualBox

Steps to Create a RAID6 Array on an NFS Server

Attach Virtual SATA Drives in VirtualBox


Open VirtualBox and select your VM.

Open VM Settings:

Click on Settings.
Go to the Storage tab.

Add AHCI SATA Controller:

In the Storage Tree, click on the Controller: IDE or Add Controller icon (hard disk with a plus sign).
Select Add SATA Controller and choose AHCI for SATA.


Add Virtual Drives:

Click on the Add Hard Disk icon next to the SATA controller.
Select Create a new disk.
Choose VDI (VirtualBox Disk Image) and click Next.
Choose Dynamically allocated and click Next.
Set the size to 4GB and specify the location, then click Create.
Repeat this process to create and attach 8 virtual SATA drives.
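If you prefer the command line over the Storage settings dialog, the same drives can be created and attached with VBoxManage. A minimal sketch, assuming the VM is named "Centos 7 Clone (NFS)" and its SATA controller is named "SATA" (both names are assumptions; adjust to your setup):

for i in $(seq 1 8); do
  VBoxManage createmedium disk --filename "raid_disk_$i.vdi" --size 4096 --format VDI
  VBoxManage storageattach "Centos 7 Clone (NFS)" --storagectl "SATA" \
    --port $i --device 0 --type hdd --medium "raid_disk_$i.vdi"
done

This assumes SATA ports 1-8 are free (port 0 typically holds the OS disk).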


2.2 Inside the VM, configure the newly added disks to a RAID6 array
What's the minimum # of disks needed for RAID6? Could you create RAID6 with fewer disks?
(RAID 6 requires a minimum of four disks, since each stripe stores two parity blocks; it cannot be created with fewer.)


Configure the Newly Added Disks in the VM


Start the VM and log in.


View the list of block devices:

lsblk

This shows the block devices, including the new 4GB drives (in this setup they appeared as /dev/sde through /dev/sdl; the exact names depend on how many disks were already attached).


Create Partitions on Each New Drive:


For each drive (/dev/sde, /dev/sdf, ..., /dev/sdl), use fdisk to create a primary partition:

sudo fdisk /dev/sde

Press n to create a new partition
Press p to select primary
Press 1 to select partition number 1
Press Enter to accept the default for the first sector
Press Enter to accept the default for the last sector
Press w to write changes and exit

Repeat this process for /dev/sdf, /dev/sdg, ..., /dev/sdl

verify the partitions: lsblk
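If you would rather script the partitioning than step through fdisk interactively, a rough equivalent using parted is shown below (same assumption about the /dev/sd[e-l] device names; parted is told to create the same MBR-style layout fdisk produces by default):

for d in /dev/sd{e..l}; do
  sudo parted -s "$d" mklabel msdos          # MBR partition table, matching fdisk's defaults
  sudo parted -s "$d" mkpart primary 0% 100% # one primary partition spanning the disk
done
lsblk                                        # confirm sde1 ... sdl1 now exist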


Create the RAID6 Array:

sudo mdadm --create --verbose /dev/md8 --level=6 --raid-devices=8 /dev/sd[e-l]1

View the RAID Array Details:

sudo mdadm -D /dev/md8
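While the array initializes, the sync progress can be watched in /proc/mdstat, and the array definition can be recorded so it assembles with the same name after a reboot. A short sketch (assuming the /dev/md8 name used above):

cat /proc/mdstat                                          # shows md8, its member disks, and resync progress
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf  # persist the array definition for boot-time assembly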

Create a Filesystem on the RAID Array (optional here: if you will add the array to LVM in step 2.3, the filesystem is created on the logical volume instead, and pvcreate will ask to wipe any existing ext4 signature on /dev/md8):

sudo mkfs.ext4 /dev/md8


2.3 Incorporate the RAID6 block device into a new or existing LVM setup: add it as a PV, use the PV to expand a VG, and finally create or extend an LV
mount the RAID6-backed LV to `/new_home` (or any path of your choice; this walkthrough uses /mnt/lv_md8) -> this will be your NFS File Share

Initialize the RAID device as a PV:

sudo pvcreate /dev/md8

Extend the existing VG:

sudo vgextend centos /dev/md8

Verify the VG extension:

sudo vgdisplay (the centos VG should now show additional capacity equal to the size of md8)

Create a new LV named lv_md8:

sudo lvcreate -n lv_md8 -l 100%FREE centos

sudo mkfs.ext4 /dev/centos/lv_md8

Mount the new LV:
sudo mkdir -p /mnt/lv_md8
sudo mount /dev/centos/lv_md8 /mnt/lv_md8

Add to fstab to mount at boot:
echo '/dev/centos/lv_md8 /mnt/lv_md8 ext4 defaults,nofail 0 0' | sudo tee -a /etc/fstab
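To confirm the whole stack (RAID array -> PV -> VG -> LV -> mount) looks sane, a quick verification might be:

sudo pvs              # /dev/md8 should be listed as a physical volume
sudo vgs              # the centos VG should show the added capacity
sudo lvs              # lv_md8 should appear as a logical volume
df -h /mnt/lv_md8     # the mounted RAID6-backed LV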

2.4 Thought Exercise: is your RAID implementation Software or Hardware based? What's the difference?


Software RAID:


Implemented in the operating system: The CPU handles the RAID calculations and management.

Cost-effective: Uses existing system resources.

Flexibility: Can be configured on any standard hardware without the need for special RAID cards.

Performance: May be lower compared to hardware RAID, especially under high I/O loads, as it uses the system's CPU.

Hardware RAID:


Dedicated RAID controller: Uses a dedicated hardware controller card for RAID calculations.

Performance: Generally better performance, especially under

high I/O loads, as the RAID controller offloads the RAID processing from the CPU.


Features: Often includes advanced features such as battery-backed cache, which improves performance and reliability.

Cost: More expensive due to the additional hardware required.

Management: Can offer easier management and monitoring through dedicated firmware or software utilities.

Which Type of RAID Implementation is Used?
In the context of the setup described, the RAID implementation is software-based because we are using mdadm to configure the RAID array. 
mdadm is a software tool that manages RAID arrays at the software level within the operating system, leveraging the CPU for RAID calculations.
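Because this is software RAID, its fault tolerance can be exercised directly with mdadm by marking a member disk as failed and re-adding it. A hedged sketch using the device names assumed above (only try this on the test array, not on anything holding real data):

sudo mdadm --manage /dev/md8 --fail /dev/sde1     # simulate a disk failure
cat /proc/mdstat                                  # the array stays online, but degraded
sudo mdadm --manage /dev/md8 --remove /dev/sde1   # remove the failed member
sudo mdadm --manage /dev/md8 --add /dev/sde1      # re-add it and watch the rebuild in /proc/mdstat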



NFS Server Configuration


1. NFS Overview
NFS (Network File System) allows a system to share directories and files with others over a network. By using NFS, users and programs can access 
files on remote systems almost as if they were local files.


2. NFS Server-Client Exercise Prerequisites:
At least two CentOS VMs, both connected to the same host-only network, with static IPs and hostnames


I cloned the Centos7 Clone to create Centos7 Clone 2

on Centos7 Clone (NFS) server:

ip addr show - lists the network interfaces and their IP addresses. enp0s8 is not yet configured, so it will be the interface assigned the static IP address.

sudo nano /etc/sysconfig/network-scripts/ifcfg-enp0s8

TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
NAME=enp0s8
DEVICE=enp0s8
ONBOOT=yes
IPADDR=192.168.56.101
NETMASK=255.255.255.0
GATEWAY=192.168.56.1
DNS1=8.8.8.8
DNS2=8.8.4.4

Repeat this process for VM2, the client VM (Centos 7 Clone 2):

sudo nano /etc/sysconfig/network-scripts/ifcfg-enp0s8

TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
NAME=enp0s8
DEVICE=enp0s8
ONBOOT=yes
IPADDR=192.168.56.102
NETMASK=255.255.255.0
GATEWAY=192.168.56.1
DNS1=8.8.8.8
DNS2=8.8.4.4
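After saving the ifcfg files, the interfaces need to be restarted for the static addresses to take effect. On CentOS 7 this is typically:

sudo systemctl restart network        # or: sudo ifdown enp0s8 && sudo ifup enp0s8
ip addr show enp0s8                   # confirm the 192.168.56.x address is now assigned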


I renamed Centos7 Clone 2 to 'Centos 7 (NFS Client1)' and the server to 'Centos 7 (NFS Server)'.

set hostname:

sudo hostnamectl set-hostname nfs_server
sudo nano /etc/hosts

192.168.56.101   nfs_server
192.168.56.102   nfs_client1

Do the same on client1 (set its hostname to nfs_client1 and add the same entries to its /etc/hosts).

Try connecting to nfs_server from client1 and vice versa (ping nfs_server).
If it fails, make sure both VMs are running and temporarily disable the firewalls ('sudo systemctl stop firewalld' on each); confirm they can reach each other afterward.



3. Install required packages for NFS server
start NFS Server & check its status
Note: Better stop/disable firewalld for the following exercises

Install Required Packages for NFS Server:

sudo yum install nfs-utils


Start NFS Server and Check Status:

sudo systemctl start nfs-server
sudo systemctl enable nfs-server
sudo systemctl status nfs-server

Disable Firewall (for testing purposes):

sudo systemctl stop firewalld
sudo systemctl disable firewalld

4. Configure your NFS file shares via `/etc/exports`

Edit /etc/exports file to configure the file shares. For example:

sudo nano /etc/exports

/mnt/lv_md8 192.168.56.102(rw,sync,no_root_squash,no_subtree_check)

/mnt/lv_md8: The directory you want to share.
192.168.56.102: The IP address of the nfs_client1 VM.
rw: Read and write access.
sync: Writes are committed to stable storage before replying.
no_root_squash: Allows root on the client to have root privileges on the server.
no_subtree_check: Disables subtree checking (improves performance).

anonuid and anongid (not used in the export above): set the UID and GID applied to squashed (anonymous) requests.


Export the NFS Shares:

sudo exportfs -a
sudo exportfs -r
sudo exportfs -v

sudo touch /mnt/lv_md8/testfile_server.txt
sudo chmod 660 /mnt/lv_md8/testfile_server.txt


NFS Client Configuration

1. On your second VM, install packages needed by NFS client
sudo yum install nfs-utils

2. Make a mount point on the local file system, e.g. `/mnt/shared_home`
sudo mkdir -p /mnt/shared_home

3. Mount the NFS share via the `mount` command

sudo mount 192.168.56.101:/mnt/lv_md8 /mnt/shared_home

cd /mnt/shared_home
ls

you'll see the testfile_server.txt


4. Configure client to mount NFS file share via `/etc/fstab`
Why is this needed?
Familiarize yourself with mount options such as `intr,nfsvers,nosuid,port,rsize,wsize,_netdev`
Reboot the client VM and test whether your mount is still available


Auto-Mounting at Boot
To ensure that the NFS share is automatically mounted at boot, you can add an entry to the /etc/fstab file on the client machine:

Edit /etc/fstab:


sudo nano /etc/fstab
Add the NFS Mount Entry:
Add the following line to ensure the NFS share mounts at boot:

192.168.56.101:/mnt/lv_md8 /mnt/shared_home nfs defaults,_netdev 0 0


Understand the Importance of /etc/fstab:


/etc/fstab is used to configure automatic mounting of filesystems at boot time.
The _netdev option ensures the network is up before attempting to mount the NFS share.



Familiarize with mount Command Options:


intr: Allow signals to interrupt NFS calls.

nfsvers: Specify the NFS protocol version (e.g., nfsvers=4).

nosuid: Disallow set-user-identifier or set-group-identifier bits.

port: Specify the NFS server port.

rsize: Set the read size (e.g., rsize=8192).

wsize: Set the write size (e.g., wsize=8192).

_netdev: Indicate that the device requires a network connection.
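Putting several of these options together, an illustrative /etc/fstab entry for this share could look like the following (the rsize/wsize values are just example tuning numbers, not requirements):

192.168.56.101:/mnt/lv_md8  /mnt/shared_home  nfs  nfsvers=4,nosuid,rsize=8192,wsize=8192,_netdev  0 0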



sudo reboot
df -h 

5. Test your file share:
read & modify existing files, see if changes implemented on server
create new files, see if changes are reflected on server


On the client (inside /mnt/shared_home):

sudo nano testfile_server.txt    (add a line such as "testfile from client")

Verify the change on the server: cd /mnt/lv_md8 and open testfile_server.txt.

Create a new file from the client: sudo nano /mnt/shared_home/testfile_client.txt, then verify it appears on the server under /mnt/lv_md8.


6. Unmount and Remount the File Share:


Unmount:

sudo umount /mnt/shared_home

verify with df -h (the share should no longer be listed)


Remount:

sudo mount -a

verify 

7. Extra Mile: configure auto-mount on your nfs client (via the autofs tool)
test on client: is NFS file share persistently mounted or mounted upon traversing into designated mount point?

sudo yum install autofs -y

sudo nano /etc/auto.master

add the following line: 

/- /etc/auto.nfs
	

sudo nano /etc/auto.nfs

/mnt/shared_home -fstype=nfs,rw,sync 192.168.56.101:/mnt/lv_md8

sudo systemctl start autofs
sudo systemctl enable autofs
sudo systemctl status autofs

cd /mnt/shared_home

df -h | grep /mnt/shared_home

By following these steps, the NFS file share will be auto-mounted upon accessing the designated mount point (/mnt/shared_home). The share is mounted only when it is needed, which can improve performance and reduce unnecessary network traffic. (Remove or comment out the /etc/fstab entry from step 4 first; otherwise the share is already mounted at boot and the on-demand behavior cannot be observed.)


SAMBA Server and Client Configuration

1. Prerequisites: an active NFS server, & a windows VM
for Windows VM: open "BioHPC VirtualBox Image Manager" -> select "Windows10" -> select "Add" -> "Quit" -> launch the regular VirtualBox Manager; the Windows10 VM should now be added to your local system
connect Windows10 VM to the same host-only network as that of your NFS server


On the Windows PC, open File Explorer and enter \\192.168.56.101 (the IP address of the NFS server) in the address bar. It may not connect because the Samba service is not yet installed on the NFS server, so install it as follows:


sudo yum install samba samba-client samba-common -y

sudo systemctl enable smb
sudo systemctl start smb

sudo systemctl enable nmb
sudo systemctl start nmb


sudo systemctl status smb
sudo systemctl status nmb


sudo nano /etc/samba/smb.conf

[lv_md8]
path = /mnt/lv_md8
valid users = @nfsuser
guest ok = no
writable = yes
browsable = yes


Add a Samba user:

sudo smbpasswd -a your_username    (I entered cmurali)
New SMB password:    (I entered the same as the VM password)
Retype new SMB password:
Added user cmurali.


To set up the Windows VM: download Windows10_DVD.iso from /project/biohpcadmin/shared/isos/.
Install the Windows10_DVD.iso virtual machine (username=chinthak, password: time) the same way CentOS 7 was installed (signing up for an online account is not needed; opt for offline setup).

On the NFS server:

sudo useradd sambauser1
sudo smbpasswd -a sambauser1
password=sambauser1

sudo useradd sambauser2
sudo smbpasswd -a sambauser2
password=sambauser2


sudo vi /etc/samba/smb.conf

[shared_home]
path = /mnt/lv_md8
valid users = sambauser1 sambauser2
read only = no
browsable = yes
guest ok = no

[global]
workgroup = WORKGROUP
security = user
map to guest = bad user


sudo systemctl start smb
sudo systemctl start nmb
sudo systemctl enable smb
sudo systemctl enable nmb
sudo systemctl status smb
sudo systemctl status nmb

Test if Samba Share is Available to Localhost:

smbclient -L localhost -U sambauser1
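Alongside the smbclient listing, the configuration syntax can be validated and the share contents read non-interactively. A quick sketch (share and user names as defined above):

testparm -s                                                # parse smb.conf and report any errors
smbclient //localhost/shared_home -U sambauser1 -c 'ls'    # list files in the share after entering the password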


Samba Client Configuration on Windows 10 VM

Map the Network Drive Using Samba Credentials:

Open File Explorer.
Right-click on "This PC" and select "Map network drive...".
Choose a drive letter (e.g., Z:).
In the "Folder" field, enter the path to the Samba share:

\\192.168.56.101\lv_md8



Check "Connect using different credentials" and click "Finish".
When prompted, enter the Samba user credentials (sambauser1 and the password set earlier).


[the following permission is given to sambauser1

sudo chown -R sambauser1:sambauser1 /mnt/lv_md8
sudo chmod -R 770 /mnt/lv_md8]



Troubleshooting:
If access is still blocked, set SELINUX=disabled (a reboot is needed for the change to take effect):

sudo vi /etc/selinux/config
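Disabling SELinux is fine for a lab, but a narrower fix is to leave SELinux enforcing and explicitly allow Samba to serve the directory. A sketch, assuming the share path used above (semanage comes from the policycoreutils-python package on CentOS 7):

sudo setenforce 0                                                # temporary: switch to permissive mode for a quick test
sudo setsebool -P samba_export_all_rw on                         # allow Samba read/write access to exported paths
sudo semanage fcontext -a -t samba_share_t "/mnt/lv_md8(/.*)?"   # or label just this share
sudo restorecon -Rv /mnt/lv_md8                                  # apply the new label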



Testing the Samba Share


Verify Access:

Open the mapped drive in File Explorer.
Create, read, and modify files to verify that changes are reflected on the Samba server.



Check File Permissions and Ownership on the Server:

On the NFS server, verify the files created by the Samba user:

ls -l /mnt/lv_md8

Verified: the files created from the Windows client appear here under the Samba user's ownership.

(I can also switch to the user with sudo su - sambauser1, modify the files on the server, and see the changes reflected on the Windows VM.)



Big Concepts in Network Storage

NAS vs SAN
Network Attached Storage (NAS):


Purpose: Provides file-level storage over a network.

Connectivity: Connected to a local area network (LAN).

Protocol: Uses standard network protocols like NFS, SMB/CIFS.

Use Case: Suitable for file sharing, collaborative projects, and general file storage needs.

Examples: Home media servers, small business file sharing.

Storage Area Network (SAN):


Purpose: Provides block-level storage, typically for large-scale enterprise environments.

Connectivity: Connected via high-speed networks, usually Fibre Channel or iSCSI.

Protocol: Uses block storage protocols like Fibre Channel Protocol (FCP) or iSCSI.

Use Case: Suitable for databases, virtualization, and large-scale, high-performance applications.

Examples: Enterprise data centers, high-availability environments.


Parallel File Systems
Definition:
Parallel file systems allow multiple clients to access and write to files simultaneously, distributing data across multiple storage devices to improve performance and scalability.
Key Characteristics:


Scalability: Can handle large volumes of data and a high number of simultaneous access requests.

Performance: Designed for high throughput and low latency.

Data Distribution: Files are split into chunks and distributed across multiple storage devices.

Comparison with NFS:


NFS:


Architecture: Typically a single server managing multiple clients.

Performance: Can become a bottleneck under high load due to the single server handling all requests.

Scalability: Limited by the performance and capacity of the single server.



Parallel File Systems:


Architecture: Multiple servers (metadata servers and data servers) managing storage and access.

Performance: Higher performance due to parallel access and distributed data handling.

Scalability: More scalable as load is distributed across multiple servers.




Enterprise-Scale Parallel Storage Implementations

Lustre


Overview: High-performance parallel file system widely used in supercomputing and large-scale HPC (High-Performance Computing) environments.

Structure: Comprises multiple metadata servers (MDS) and object storage servers (OSS) managing object storage targets (OST).

Features: Scalability, high throughput, and support for petabyte-scale storage.
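To see how a parallel file system spreads data, Lustre exposes striping directly to users through the lfs tool. A hedged sketch, assuming a Lustre filesystem mounted at the hypothetical path /lustre (option letters can vary between Lustre versions):

lfs setstripe -c 4 -S 1M /lustre/mydir     # stripe new files in mydir across 4 OSTs with a 1 MiB stripe size
lfs getstripe /lustre/mydir/somefile       # show which OSTs hold the file's stripes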


GPFS (IBM Spectrum Scale)


Overview: General Parallel File System developed by IBM, used in both commercial and research environments.

Structure: Distributed architecture with multiple nodes serving as both metadata and data servers.

Features: High availability, disaster recovery, and tiered storage management.


BeeGFS


Overview: Parallel file system originally developed at Fraunhofer (formerly FhGFS), designed for ease of use and high performance in scientific computing environments.

Structure: Separates metadata and data servers, with flexible configuration options.

Features: Scalability, performance optimization, and user-friendly management tools.


Summary
NAS vs SAN:

NAS provides file-level storage over a standard network, suitable for general file sharing.
SAN offers block-level storage over high-speed networks, suitable for enterprise applications requiring high performance and scalability.

Parallel File Systems:

Provide simultaneous access to files by multiple clients, distributing data across multiple storage devices for improved performance and scalability.
Examples include Lustre, GPFS, and BeeGFS, each with unique features and optimal use cases for high-performance and large-scale environments.