Server-Side vs Client-Side Storage: Why Architecture Matters

When designing cloud storage systems, architecture matters as much as raw performance. MayaScale's server-side mirroring approach represents true disaggregated storage—keeping storage complexity where it belongs, on dedicated storage nodes, not stealing resources from your applications.

Let's examine why this architectural choice has profound implications for performance, cost, and operational simplicity, especially for resource-intensive workloads like Oracle databases.

The Client-Side Problem: Hidden Costs of Mirroring on Application Servers

Some cloud storage solutions implement client-side mirroring (using device-mapper RAID1) to provide high availability. While this approach works, it comes with significant hidden costs that compound over time.

1. CPU Overhead: Stealing Cycles from Your Workload

What happens with client-side mirroring:

Mirror writes: Every write must be duplicated to multiple devices by the client CPU
I/O scheduling: Device-mapper must schedule and coordinate I/O across multiple underlying devices
Queue management: Each NVMe device requires dedicated CPU cores for queue processing
Kernel overhead: Device-mapper metadata management consumes CPU cycles

Impact: On a 48-core instance running Oracle, you might lose 2-4 cores to storage overhead—that's 4-8% of your compute capacity unavailable to the database.

2. Memory Overhead: Device-Mapper Metadata and Buffers

Client-side mirroring consumes memory for:

Device-mapper metadata: Kernel memory for tracking mirror state
I/O buffers: Separate buffers for each underlying device in the mirror
Block layer queues: Memory for managing outstanding I/O requests
Bitmap tracking: For mirror rebuild and consistency checking

Impact: On a server with 96GB RAM, 1-2GB consumed by mirroring overhead means less memory for Oracle's SGA/PGA, potentially requiring a larger instance.

3. NVMe Queue Management: The Hidden Core Tax

Modern NVMe devices use multiple I/O queues for parallelism. With client-side mirroring across 4 NVMe devices:

Each NVMe device has 16-32 I/O queues
Each queue typically requires CPU affinity for optimal performance
4 devices × 16 queues = 64 queue-to-core mappings to manage
Inefficient queue assignment = performance loss

Impact: Best practices suggest dedicating 2-4 cores for optimal NVMe queue management when using multiple devices with client-side mirroring.
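To see how many queue mappings a given client actually carries, the hardware queues can be counted from sysfs. This is a minimal sketch, assuming a Linux kernel that exposes blk-mq hardware queues as numbered directories under /sys/block/<device>/mq/. The function name count_nvme_queues and its optional root argument are illustrative and not part of any tool.

```shell
#!/bin/sh
# Count blk-mq hardware queues across all NVMe namespaces under a sysfs root.
# The root defaults to /sys; passing another path lets the function be tried
# against a test directory tree.
count_nvme_queues() {
    sysfs_root="${1:-/sys}"
    total=0
    for dev in "$sysfs_root"/block/nvme*n*; do
        # Skip unmatched globs and devices without an mq/ directory
        [ -d "$dev/mq" ] || continue
        # Each numbered subdirectory of mq/ is one hardware queue
        for q in "$dev"/mq/*/; do
            if [ -d "$q" ]; then
                total=$((total + 1))
            fi
        done
    done
    echo "$total"
}

echo "NVMe hardware queues on this host: $(count_nvme_queues)"
```

Each queue directory also contains a cpu_list file naming the CPUs mapped to that queue, which is the place to look when checking queue-to-core assignment.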

4. Oracle Licensing Cost: The $45K+/Year Storage Tax

Critical Cost Implication

Oracle Database Enterprise Edition licenses are priced per core. When client-side RAID forces you to add cores for storage overhead, you're paying Oracle licensing fees for CPU cycles dedicated to storage management.

Real-world example:

Component	Client-Side RAID	Server-Side (MayaScale)
Cores needed for Oracle workload	48 cores	48 cores
Extra cores for storage overhead	+4 cores	0 cores
Total cores required	52 cores	48 cores
Instance type needed	r7i.16xlarge (64 cores)	r7i.12xlarge (48 cores)
Oracle EE license cost (annual)	$59,280 (52 cores × $95/mo × 12)	$54,720 (48 cores × $95/mo × 12)
Annual savings	$4,560/year on Oracle licensing alone

Note: Oracle EE processor licenses cost approximately $950/core/month, or $47,500/core perpetual. This example uses $95/core/month, which is 10% of the monthly figure. At the full $950 rate, the same 4 extra cores would cost $45,600/year.
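The licensing arithmetic in the table can be reproduced with a few lines of shell. This is a sketch using the example rate of $95/core/month from the table; annual_license_cost is a hypothetical helper name.

```shell
#!/bin/sh
# annual_license_cost CORES MONTHLY_RATE_PER_CORE
# Prints the yearly license cost in whole dollars.
annual_license_cost() {
    echo $(( $1 * $2 * 12 ))
}

client=$(annual_license_cost 52 95)   # 48 workload cores + 4 for storage
server=$(annual_license_cost 48 95)   # workload cores only

echo "Client-side RAID: \$$client/year"
echo "Server-side:      \$$server/year"
echo "Difference:       \$$((client - server))/year"
```

Changing the rate argument shows how sensitive the result is to the assumed per-core price.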

5. Mirror Rebuild Impact: Application Performance Degradation

When a storage node fails and mirror rebuild begins, client-side mirroring impacts the application server directly:

High CPU usage: Mirror rebuild consumes 10-20% CPU during reconstruction
Memory pressure: Increased I/O buffering for rebuild operations
Network saturation: Rebuild traffic competes with application traffic on same NICs
I/O priority: Application I/O competes with rebuild I/O

Impact: Oracle query response times may increase 20-50% during multi-hour mirror rebuild operations.
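With client-side mirroring, rebuild progress has to be watched from the application server itself. As a rough sketch, device-mapper mirror and raid targets report a synced/total ratio in their `dmsetup status` line. The helper below (rebuild_progress, a hypothetical name) converts the first such ratio it finds into a percentage. The sample line is illustrative only, and the exact field layout is an assumption that varies by target type and kernel version.

```shell
#!/bin/sh
# rebuild_progress: read one dmsetup-status-style line on stdin and print
# the first "synced/total" ratio it contains as a percentage.
rebuild_progress() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+\/[0-9]+$/) {
                split($i, r, "/")
                if (r[2] > 0) printf "%.1f\n", 100 * r[1] / r[2]
                exit
            }
        }
    }'
}

# Illustrative status line, not captured from a real system
echo "0 1000 raid raid1 2 Aa 250/1000 recover 0" | rebuild_progress

# On a live system (mirror_name is a placeholder):
#   dmsetup status mirror_name | rebuild_progress
```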

6. Operational Complexity: Mirror Configuration at Scale

With client-side mirroring, every application instance requires:

Device-mapper configuration and initialization
Mirror creation and initial synchronization
Mirror health monitoring and alerting
Manual intervention for device failures
Rebuild coordination and monitoring

Impact: 10 Oracle RAC nodes = 10 mirror configurations to manage. Complexity grows linearly with cluster size.

7. Split-Brain Risk: Fencing and Coordination Challenges

When running cold-standby with client-side mirroring, preventing split-brain requires:

Fencing mechanisms: STONITH (Shoot The Other Node In The Head) via cloud APIs
Quorum management: Cluster coordination to decide which node is authoritative
Manual procedures: Operator verification before failover
Testing complexity: Failover scenarios difficult to validate

Impact: Without proper fencing, simultaneous Oracle startup on both nodes = catastrophic data corruption.
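The quorum rule behind these safeguards is a strict majority of votes. The sketch below is a simplified illustration of that rule only; has_quorum is a hypothetical helper, and a real cluster would rely on its cluster manager and fencing agent rather than a script like this.

```shell
#!/bin/sh
# has_quorum VOTES_SEEN TOTAL_VOTES
# Succeeds only when VOTES_SEEN is a strict majority of TOTAL_VOTES.
has_quorum() {
    [ "$1" -gt $(( $2 / 2 )) ]
}

# A lone survivor in a two-node cluster holds 1 of 2 votes: not a majority.
if has_quorum 1 2; then
    echo "quorum held: safe to start the database"
else
    echo "no quorum: do not start the database"
fi
```

The two-node case is the important one: neither node alone can reach a majority, which is why a cold-standby pair needs fencing or an external tiebreaker before failover.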

8. Instance Sizing: Forced Oversizing

Client-side mirroring overhead forces you to choose larger instances than your workload actually needs:

Oracle needs 40 cores → must get 48 cores (accounting for mirroring overhead)
Need 80GB RAM → must get 96GB (accounting for device-mapper buffers)
Cannot right-size precisely to workload
Pay for compute you don't use

Impact: AWS r7i.12xlarge costs $3.78/hour. A forced upgrade to r7i.16xlarge costs $5.04/hour. Difference: $1.26/hour × 8,760 hours = about $11,038/year wasted.
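The yearly cost of an oversized instance follows directly from the hourly price gap. This sketch assumes a 365-day year (8,760 hours) of continuous on-demand use and takes the two hourly prices quoted above as inputs; annual_cost_delta is a hypothetical helper name.

```shell
#!/bin/sh
# annual_cost_delta SMALLER_HOURLY LARGER_HOURLY
# Prints the extra yearly cost of the larger instance, assuming it runs
# continuously for 8,760 hours.
annual_cost_delta() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (b - a) * 8760 }'
}

annual_cost_delta 3.78 5.04
```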

Server-Side Architecture Advantages: True Separation of Concerns

MayaScale implements server-side mirroring: the storage nodes handle all HA and replication logic. Application servers see simple NVMe-oF block devices with no client-side mirroring overhead.

MayaScale Architecture Principles

Disaggregated storage: Storage logic runs on dedicated storage nodes
Standard protocols: Clients use standard NVMe-oF initiator (no special software)
No client-side mirroring overhead: No RAID, no device-mapper, no CPU or memory spent on replication
Active-Active HA: Both storage nodes serve I/O simultaneously

Complete List of Client-Side Implications Eliminated

Client-Side Mirroring Problem	MayaScale Server-Side Solution
CPU overhead from mirror writes	No mirroring work on application server CPUs
Memory consumed by device-mapper metadata	No device-mapper memory use on application servers
Extra cores needed for NVMe queues	Zero extra cores—right-size for workload
Oracle licensing cost for storage cores	License only workload cores—save $4K-45K/year
Mirror rebuild impacts application performance	Rebuild invisible to application
Complex mirror configuration per instance	Simple NVMe-oF connect command
Split-brain fencing required	Storage layer handles with Pacemaker/STONITH
Forced instance oversizing	Precise right-sizing possible
Client software dependencies (device-mapper)	Standard kernel NVMe-oF driver
Mirror monitoring per instance	Centralized storage monitoring
Network overhead on application NICs	Dedicated storage network on storage nodes
Troubleshooting complexity (storage + app)	Clear separation—storage vs app issues
Security surface (more client software)	Minimal attack surface (standard protocol)
Cold-standby tradeoffs	True Active-Active—no tradeoffs
Scaling complexity grows with cluster	Same simple config on every node

Real-World Impact on Applications

Oracle Database: The Perfect Storm of Costs

Oracle workloads exemplify why server-side architecture matters. Oracle is:

CPU-intensive: Every core stolen by storage = slower queries
Memory-hungry: SGA/PGA want all available RAM
Licensed per-core: Storage overhead = direct licensing cost
Performance-sensitive: RAID rebuild = query slowdown = user complaints

Total Cost of Ownership: 3-Year Oracle Deployment

Scenario: Oracle RAC with 2 nodes, 48-core workload requirement

Cost Component	Client-Side RAID	MayaScale Server-Side	3-Year Savings
Instance sizing (per node)	r7i.16xlarge	r7i.12xlarge	—
Compute cost (2 nodes × 3 years)	$265,680	$199,260	$66,420
Oracle EE licenses (52 vs 48 cores × 3 years at $95/core/month)	$177,840	$164,160	$13,680
Performance degradation costs	~$20,000	$0	$20,000
Total 3-year TCO	$463,520	$363,420	$100,100

MayaScale saves $100,100 over 3 years (22% TCO reduction)
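Any row of the table can be turned into a percentage saving with the same formula: the difference divided by the client-side figure. The sketch below applies it to the compute row; savings_pct is a hypothetical helper name.

```shell
#!/bin/sh
# savings_pct CLIENT_SIDE_COST SERVER_SIDE_COST
# Prints the saving as a percentage of the client-side cost.
savings_pct() {
    awk -v c="$1" -v s="$2" 'BEGIN { printf "%.1f\n", 100 * (c - s) / c }'
}

# Compute row: $265,680 (client-side) vs $199,260 (server-side)
savings_pct 265680 199260
```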

Other Workloads Affected

While Oracle shows the most dramatic impact due to per-core licensing, other workloads also benefit:

SQL Server: Per-core licensing similar to Oracle
High-frequency trading: Cannot tolerate RAID rebuild performance degradation
Analytics workloads: Want all CPU for query processing, not storage
AI/ML training: GPU-accelerated workloads don't want CPU stolen by storage
Large-scale web apps: Simplified ops = faster deploys

MayaScale's Implementation: How It Works

MayaScale implements server-side Active-Active mirroring using a dual-NIC architecture:

Storage Node Architecture

eth0 (Frontend): NVMe-oF client connections (ports 4420-4423)
eth1 (Backend): RAID1 replication traffic between storage nodes (port 4422)
Local NVMe drives: 1-4 drives per node depending on tier
Mayastor engine: Handles NVMe-oF serving, mirroring, and HA

Client Experience

From the application server perspective:

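# Load the NVMe/TCP initiator module if it is not already loaded
modprobe nvme-tcp

# Connect to MayaScale volume (one-time setup)
nvme connect -t tcp -a 10.0.1.100 -s 4420 -n nqn.2019-05.com.zettalane:mayascale-data-node-1

# Volume appears as a standard NVMe device. The device name depends on what
# is already attached (nvme1n1 is used here), so confirm it first.
nvme list
ls /dev/nvme1n1

# Use it like any block device
mkfs.xfs /dev/nvme1n1
mkdir -p /data
mount /dev/nvme1n1 /data

# Alternatively, Oracle ASM sees it as a standard block device
# No special configuration needed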
That's it. No RAID setup, no device-mapper, no special software. Just a standard NVMe-oF connection.
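When the connect step is scripted, it helps to check whether the subsystem is already attached before connecting again. This is a sketch under two assumptions: that nvme-cli is installed, and that `nvme list-subsys` prints each subsystem with an NQN=<name> field. The helper name nqn_connected is hypothetical, and the script only prints the connect command rather than running it.

```shell
#!/bin/sh
# nqn_connected NQN
# Reads `nvme list-subsys` style text on stdin and succeeds if a field
# exactly equal to NQN=<NQN> is present.
nqn_connected() {
    awk -v want="NQN=$1" '
        { for (i = 1; i <= NF; i++) if ($i == want) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

NQN="nqn.2019-05.com.zettalane:mayascale-data-node-1"

# Only query the host when nvme-cli is available
if command -v nvme >/dev/null 2>&1; then
    if nvme list-subsys 2>/dev/null | nqn_connected "$NQN"; then
        echo "already connected: $NQN"
    else
        echo "would run: nvme connect -t tcp -a 10.0.1.100 -s 4420 -n $NQN"
    fi
fi
```

Matching the whole field rather than a substring keeps a volume named node-1 from being confused with node-10.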

Performance Comparison: Overhead Matters

CPU Utilization During Peak I/O

Metric	Client-Side RAID	MayaScale Server-Side
Application server CPU usage	35-40% (app) + 5-8% (mirroring)	35-40% (app only)
Storage server CPU usage	N/A	15-20% (mirroring)
Application CPU available	52-60%	60-65%
During mirror rebuild	42-50% (10% stolen)	60-65% (unchanged)

Memory Allocation

On a 96GB instance running Oracle:

Client-side mirroring: ~94GB available (2GB for device-mapper overhead)
MayaScale: ~95.5GB available (0.5GB for NVMe-oF initiator)

Result: 1.5GB more RAM for Oracle SGA = ~1.6% more buffer cache = fewer disk reads = better performance.

Deployment Simplicity: Operations at Scale

Adding a New Application Server

With client-side mirroring:

Install device-mapper tooling
Configure device-mapper for multiple NVMe devices
Create mirror across devices
Wait for mirror initialization (can take hours for TB+ arrays)
Configure monitoring for mirror health
Set up alerting for degraded mirrors
Test failover and recovery procedures
Install application

With MayaScale:

Run: nvme connect -t tcp -a VIP -s 4420 -n NQN
Install application

Result: MayaScale: 2 steps. Client-side mirroring: 8 steps. Time savings: ~2-4 hours per server.
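Because the MayaScale path is a single command, rolling it out to many application servers reduces to a loop. The sketch below is a dry run that only prints the commands it would issue. The host names app1 and app2, the use of ssh, and the helper name connect_commands are assumptions for illustration.

```shell
#!/bin/sh
# connect_commands VIP NQN HOST...
# Prints one ssh command per application server. Nothing is executed.
connect_commands() {
    vip="$1"
    nqn="$2"
    shift 2
    for host in "$@"; do
        echo "ssh $host nvme connect -t tcp -a $vip -s 4420 -n $nqn"
    done
}

connect_commands 10.0.1.100 nqn.2019-05.com.zettalane:mayascale-data-node-1 app1 app2
```

Reviewing the printed commands before piping them to a shell keeps the rollout auditable.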

Troubleshooting Storage Issues

Client-side mirroring problem scenario:

Is it the application? Or storage?
Check mirror status on affected server
Check kernel logs for device errors
Check device-mapper state
Check network to storage nodes
Intermingled logs make diagnosis harder

MayaScale problem scenario:

Application issue? Check application logs/metrics
Storage issue? Check storage nodes (separate systems)
Clear separation = faster diagnosis

When Architecture Matters Most

Server-side architecture provides the most value when:

Per-core licensing applies: Oracle, SQL Server, commercial software
CPU-intensive workloads: Analytics, ML training, scientific computing
Memory-hungry applications: In-memory databases, caching layers
Performance-sensitive apps: Cannot tolerate RAID rebuild slowdowns
Large deployments: Operational simplicity compounds at scale
Compliance requirements: Clear separation aids auditing
Multi-tenant environments: Each tenant gets clean, isolated storage

Conclusion: Architecture is a Strategic Choice

The choice between client-side and server-side storage architecture isn't just technical—it's strategic:

Key Takeaways

Client-side RAID steals resources from applications (CPU, memory, cores)
Per-core licensing turns storage overhead into direct cost ($4K-45K/year for Oracle)
Operational complexity grows linearly with cluster size
Server-side architecture provides true separation of concerns
MayaScale's approach keeps mirroring work off application servers, leaving nearly all instance resources for applications
TCO reduction of 20-25% for licensed workloads over 3 years

For workloads where every core, every GB of RAM, and every dollar matters—especially Oracle databases—server-side architecture isn't just better. It's the only sensible choice.

MayaScale implements this architectural principle across AWS, GCP, and Azure, delivering validated sub-millisecond performance with no client-side mirroring overhead.
