MayaScale Achieves 2.3M IOPS on Google Cloud N2 Instances

MayaScale's Ultra performance tier delivers a breakthrough on Google Cloud Platform: up to 2.3 million read IOPS and read latency as low as 173 microseconds, turning N2 instances with local SSDs into low-latency shared storage that rivals direct-attached NVMe performance.

Breaking the 2M IOPS Barrier on Google Cloud

Google Cloud's N2 instances with local SSDs have long been the go-to choice for performance-critical workloads. The n2-highcpu-64 instance provides 16 local NVMe drives with exceptional raw performance. However, until now, this performance came with a critical limitation: local storage isn't shared, and you lose all data if the instance terminates.

MayaScale solves this fundamental trade-off by pooling local NVMe storage across multiple instances using NVMe-over-Fabrics (NVMe-oF), delivering shared storage with latency that approaches—and in some cases matches—direct-attached NVMe performance. Our latest validation on GCP demonstrates performance that pushes the boundaries of what's possible with cloud storage.

Ultra Performance Achievement

In October 2025 testing on n2-highcpu-64 instances (64 vCPU, 16x local NVMe SSDs, 100 Gbps network), MayaScale's Ultra tier achieved validated performance that sets a new standard for cloud storage: 2.3 million read IOPS at peak queue depth, and sub-200 microsecond read latency at low queue depths.

Validated Performance Metrics

Read Latency (QD1): 192μs - Real application performance
Write Latency (QD4): 246μs - Optimal real-world performance

Latency Performance Curve

The latency performance curve shows how MayaScale maintains sub-millisecond latency at low and moderate queue depths. The best latency points (173μs read, 203μs write) both occur at QD8, which is the sweet spot for moderately parallel workloads.

IOPS Scaling Performance

Queue depth 1 (QD1) performance matters most for latency-sensitive applications: it is what a database, application server, or analytics engine experiences on each individual request. For workloads that can exploit parallelism, MayaScale's Ultra tier delivers far higher throughput. The performance curve shows IOPS scaling with queue depth, with the sub-1ms latency zone highlighted; the peak values are:

Read IOPS: 2.3M - Peak at QD64 with 1.78ms latency
Write IOPS: 866K - Peak at QD24 with 884μs latency (sub-1ms)
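
The peak figures above can be sanity-checked with Little's Law (average I/Os in flight = throughput x mean latency). The sketch below is only an illustration: it assumes the quoted latencies are mean completion latencies and that "QD" is the per-job queue depth, neither of which is stated explicitly in the results. The implied job counts are an inference, not published test parameters.

```python
def outstanding_ios(iops: float, latency_s: float) -> float:
    """Little's Law: average I/Os in flight = throughput x mean latency."""
    return iops * latency_s


# Peak figures quoted above.
read_inflight = outstanding_ios(2.3e6, 1.78e-3)   # 2.3M IOPS at 1.78 ms
write_inflight = outstanding_ios(866e3, 884e-6)   # 866K IOPS at 884 us

# Dividing by the per-job queue depth gives the implied number of
# concurrent submitters (an inference, not a published parameter).
read_jobs = read_inflight / 64    # QD64
write_jobs = write_inflight / 24  # QD24

print(f"read:  {read_inflight:.0f} I/Os in flight, ~{read_jobs:.0f} jobs at QD64")
print(f"write: {write_inflight:.0f} I/Os in flight, ~{write_jobs:.0f} jobs at QD24")
```

Under these assumptions the read peak corresponds to roughly 4,100 I/Os in flight and the write peak to roughly 770, which is why these figures describe highly parallel workloads rather than a single request stream.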

Sequential Bandwidth Performance

In addition to exceptional random I/O performance, MayaScale delivers outstanding sequential throughput for large-block workloads such as data warehousing, backups, and streaming analytics:

Sequential Read Bandwidth: 11.2 GB/s - 128KB blocks, QD32/NJ16
Sequential Write Bandwidth: 3.6 GB/s - 128KB blocks (with HA replication)

Peak Bandwidth Comparison - Sequential vs 4K Random


Note: Write bandwidth is constrained by synchronous replication—every write must be committed to both storage nodes before acknowledgment. This ensures zero data loss during failover while still delivering 3.6 GB/s sustained throughput, which is more than sufficient for most sequential write workloads.
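
To relate the random and sequential figures, IOPS and bandwidth can be converted into each other at a given block size. The following is a rough sketch that assumes "4K" and "128KB" are binary sizes (4096 and 131072 bytes), that "GB/s" means 10^9 bytes per second, and that the random IOPS figures were measured with 4K blocks; none of these conventions is spelled out in the results.

```python
KB = 1024   # assumption: block sizes are binary kilobytes
GB = 1e9    # assumption: GB/s means 10^9 bytes per second


def bandwidth_gb_s(iops: float, block_bytes: int) -> float:
    """Bandwidth produced by a given IOPS rate at a fixed block size."""
    return iops * block_bytes / GB


def iops_for_bandwidth(gb_s: float, block_bytes: int) -> float:
    """IOPS needed to sustain a given bandwidth at a fixed block size."""
    return gb_s * GB / block_bytes


rand_read_gb_s = bandwidth_gb_s(2.3e6, 4 * KB)       # 4K random read peak
rand_write_gb_s = bandwidth_gb_s(866e3, 4 * KB)      # 4K random write peak
seq_read_iops = iops_for_bandwidth(11.2, 128 * KB)   # 128KB sequential read
seq_write_iops = iops_for_bandwidth(3.6, 128 * KB)   # 128KB sequential write

line_rate_gb_s = 100 / 8                 # 100 Gbps link expressed in GB/s
seq_read_share = 11.2 / line_rate_gb_s   # fraction of line rate used

print(f"4K random read:  {rand_read_gb_s:.2f} GB/s")
print(f"4K random write: {rand_write_gb_s:.2f} GB/s")
print(f"128KB seq read:  {seq_read_iops:,.0f} IOPS")
print(f"128KB seq write: {seq_write_iops:,.0f} IOPS")
print(f"seq read uses {seq_read_share:.0%} of a 100 Gbps link")
```

With these conventions, 11.2 GB/s is close to 90% of the raw 100 Gbps line rate, and the 4K random write peak works out to roughly the same bandwidth as the sequential write figure.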

Why These Numbers Matter

The combination of ultra-low QD1 latency (192μs read) and massive parallel throughput (2.3M IOPS) makes MayaScale suitable for workloads previously confined to on-premises infrastructure or expensive specialized cloud instances:

Single-threaded applications benefit from sub-200μs response times
Highly parallel workloads can leverage millions of IOPS
Database clusters get both low latency for individual queries and high throughput for concurrent operations
AI/ML training experiences fast data loading at massive scale

Technical Architecture

Test Configuration

Our validation used a high-performance N2 instance configuration on Google Cloud:

Instance Type: 2x n2-highcpu-64 (64 vCPU, 64 GB RAM each)
Local Storage: 16x local NVMe SSDs per instance (375 GB each, 6.0 TB per instance)
Network: 100 Gbps tier-1 networking
Configuration: Active-Active with synchronous replication (dual-NIC architecture)
Protocol: NVMe-over-TCP (NVMe-oF)
Policy: zonal-ultra-performance
Location: us-central1-f

Dual-NIC Architecture for Maximum Bandwidth

The Ultra tier uses a dedicated backend network (10.200.0.0/24) for server-side replication traffic, separating replication from client I/O. This dual-NIC design ensures that synchronous replication doesn't compete with client traffic, enabling both high availability and maximum performance.

This architecture is critical for write performance: with synchronous replication, every write operation must be committed to both storage nodes. By dedicating a separate 100 Gbps network path for replication, MayaScale achieves 866K write IOPS while maintaining sub-1ms latency.
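
As a back-of-the-envelope illustration of the replication load on the backend path, the sketch below estimates how much traffic the quoted write rates would generate. It assumes that each client write is forwarded exactly once to the peer node, that the random writes are 4096-byte blocks, and that TCP/IP and NVMe-oF protocol overhead can be ignored; these are simplifications, not a description of MayaScale's replication protocol.

```python
BACKEND_LINK_GBPS = 100  # dedicated replication path quoted above


def to_gbps(bytes_per_s: float) -> float:
    """Convert a byte rate into gigabits per second (10^9 bits)."""
    return bytes_per_s * 8 / 1e9


# Assumption: every write is sent once to the peer; headers are ignored.
random_write_repl = to_gbps(866e3 * 4096)   # 866K IOPS of 4096-byte writes
sequential_write_repl = to_gbps(3.6e9)      # 3.6 GB/s sequential writes

for name, load in [
    ("4K random writes", random_write_repl),
    ("128KB sequential writes", sequential_write_repl),
]:
    share = load / BACKEND_LINK_GBPS
    print(f"{name}: {load:.1f} Gbps, {share:.0%} of the backend link")
```

Both cases land a little under 30 Gbps in this estimate. The same traffic on a shared NIC would compete directly with client I/O, which is the contention the separate backend network avoids.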

How It Compares

vs. Google Cloud Persistent Disk

Metric	MayaScale Ultra	Persistent Disk Extreme	Difference
Read IOPS	2.3M	120K (max)	19x more IOPS
Read Latency	192μs	~1-2ms	5-10x lower latency
Usable Capacity	6 TB	User-defined	Fixed for Ultra tier
Configuration	2x n2-highcpu-64, 32x local SSDs total	Managed service	Self-managed instances
High Availability	Active-Active (synchronous replication)	Regional persistent disk available	Sub-second failover
Cost (monthly estimate)	~$5,500*	~$1,400**	Higher for extreme performance

*MayaScale Ultra: 2x n2-highcpu-64 with 16x 375GB local SSDs each, us-central1. Includes HA infrastructure but requires self-management.
**Persistent Disk Extreme: 6TB @ 120K IOPS (max tier), regional for HA. Managed service with lower operational overhead.
Note: MayaScale is designed for workloads where Persistent Disk's 120K IOPS limit is insufficient. If you need <120K IOPS, Persistent Disk is more cost-effective.
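
The ratios in the table, plus a cost-per-IOPS view, can be reproduced from the quoted figures. This is a minimal sketch using only the numbers in the table and footnotes; the monthly costs are the estimates given above, and the `Option` class and its field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One storage option from the comparison table (illustrative type)."""
    name: str
    read_iops: float
    monthly_cost_usd: float

    @property
    def usd_per_1k_iops(self) -> float:
        return self.monthly_cost_usd / (self.read_iops / 1000)


mayascale = Option("MayaScale Ultra", 2.3e6, 5500)
pd_extreme = Option("Persistent Disk Extreme", 120e3, 1400)

iops_ratio = mayascale.read_iops / pd_extreme.read_iops
# Persistent Disk latency of 1-2 ms versus 192 us, both expressed in us.
latency_ratio_low, latency_ratio_high = 1000 / 192, 2000 / 192

print(f"IOPS ratio: {iops_ratio:.1f}x")
print(f"Latency ratio: {latency_ratio_low:.1f}x to {latency_ratio_high:.1f}x")
for opt in (mayascale, pd_extreme):
    print(f"{opt.name}: ${opt.usd_per_1k_iops:.2f} per 1K read IOPS per month")
```

On these estimates MayaScale Ultra costs more in absolute terms but less per 1K read IOPS, which only matters if the workload can actually use the extra IOPS.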

vs. Local SSD (Direct Attached)

MayaScale pools local SSDs to provide shared storage with near-local performance. While direct-attached local SSDs are slightly faster (no network overhead), they lack:

Data persistence - Local SSDs are ephemeral; data is lost on instance termination
Sharing - Only one instance can access local SSDs
High availability - No automatic failover

MayaScale delivers 80-90% of local SSD performance while adding enterprise-grade availability and shared access.

Perfect Use Cases for Ultra Tier

High-Performance Databases

PostgreSQL, MySQL, Oracle on GCE with 10x better performance than Cloud SQL. Sub-200μs read latency for OLTP workloads. Support for hundreds of thousands of concurrent transactions.

Real-Time Analytics

ClickHouse, Druid, Apache Pinot with sub-millisecond query latency. Process billions of events with ultra-fast aggregations. Support for real-time dashboards and operational analytics.

AI/ML Training on Vertex AI

Fast dataset loading for GPU/TPU instances. Shared storage for distributed training jobs. Eliminate data loading bottlenecks with 2.3M IOPS throughput.

NoSQL at Scale

Cassandra, ScyllaDB, MongoDB with massive IOPS. Low-latency reads for real-time applications. Support for millions of operations per second across clusters.

Getting Started

MayaScale Ultra tier can be deployed from Google Cloud Marketplace or via Terraform. Deployment options and included components:

Terraform deployment - Infrastructure-as-Code with full customization
Policy-based selection - Specify "zonal-ultra-performance" policy for automatic instance selection
Active-Active HA - Automatic failover with no downtime
NVMe-oF client drivers - For Linux (RHEL, Ubuntu, Debian)
CSI driver for GKE - Dynamic volume provisioning for Kubernetes
Free trial available - Evaluate performance on your workload

Sample Terraform Configuration

module "mayascale_ultra" {
  source = "github.com/zettalane/terraform-gcp-mayascale"

  cluster_name        = "ultra-cluster"
  performance_policy  = "zonal-ultra-performance"
  zone                = "us-central1-f"
  project_id          = "your-project-id"

  # Automatically selects n2-highcpu-64 with 16 local SSDs
  # Deploys Active-Active with dual-NIC architecture
  # Target: 2.3M read IOPS, 866K write IOPS
}

Other Performance Tiers

Not every workload needs 2.3M IOPS. MayaScale offers five performance tiers on Google Cloud, from Basic (100K IOPS) to Ultra (2.3M IOPS). All tiers deliver sub-millisecond latency with Active-Active HA:

Basic: 100K IOPS on n2-highcpu-4 - Development and testing
Standard: 380K IOPS on n2-highcpu-8 - General purpose applications
Medium: 700K IOPS on n2-highcpu-16 - Production databases
High: 900K IOPS on n2-highcpu-32 - High-performance databases
Ultra: 2.3M IOPS on n2-highcpu-64 - Maximum performance workloads
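
Choosing a tier from the list above amounts to picking the smallest one whose rated IOPS covers the workload. The helper below is a hypothetical illustration of that rule, not MayaScale's policy engine; the `Tier` type and `smallest_tier_for` function are invented names, and the IOPS values are copied from the list as quoted.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    iops: int
    machine_type: str


# Ordered from smallest to largest, as listed above.
TIERS = [
    Tier("Basic", 100_000, "n2-highcpu-4"),
    Tier("Standard", 380_000, "n2-highcpu-8"),
    Tier("Medium", 700_000, "n2-highcpu-16"),
    Tier("High", 900_000, "n2-highcpu-32"),
    Tier("Ultra", 2_300_000, "n2-highcpu-64"),
]


def smallest_tier_for(required_iops: int) -> Optional[Tier]:
    """Return the first tier whose rated IOPS meets the requirement."""
    for tier in TIERS:
        if tier.iops >= required_iops:
            return tier
    return None  # requirement exceeds every listed tier


choice = smallest_tier_for(500_000)
print(choice.name if choice else "no single tier is large enough")
```

A real sizing exercise would also account for write IOPS, latency targets, and capacity, which this sketch ignores.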

See our MayaScale on GCP page for detailed tier comparisons and performance graphs.

Why MayaScale on Google Cloud?

Validated Performance

All performance numbers were tested and validated with SNIA-compliant FIO benchmarks in October 2025.
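
For readers who want to run comparable measurements, the sketch below builds fio command lines for two of the quoted test points: 4K random read at QD1, and 128KB sequential read at QD32 with 16 jobs. The exact job files used for the published results are not reproduced in this post, so the I/O engine, runtime, and device path here are assumptions; `/dev/nvme1n1` is a hypothetical NVMe-oF device name.

```python
import shlex


def fio_command(name, device, rw, block_size, iodepth, numjobs, runtime_s=60):
    """Build an fio argument list for a time-based direct I/O test."""
    return [
        "fio",
        f"--name={name}",
        f"--filename={device}",
        "--ioengine=libaio",   # assumption: Linux native async I/O
        "--direct=1",          # bypass the page cache
        f"--rw={rw}",
        f"--bs={block_size}",
        f"--iodepth={iodepth}",
        f"--numjobs={numjobs}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
    ]


DEVICE = "/dev/nvme1n1"  # hypothetical device path; adjust for your client

qd1_read = fio_command("qd1-randread", DEVICE, "randread", "4k", 1, 1)
seq_read = fio_command("seq-read", DEVICE, "read", "128k", 32, 16)

# Print the commands instead of running them, so they can be reviewed first.
print(shlex.join(qd1_read))
print(shlex.join(seq_read))
```

Both examples are read-only. Write tests against a raw device destroy its contents, so they should only be run on volumes that hold no data you need.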

Enterprise Availability

Active-Active architecture with sub-second failover. Both nodes serve I/O simultaneously for maximum uptime.

Cost Effective

90% lower cost vs traditional SAN. Pay only for GCE instances and local SSDs you use—no expensive storage arrays.

Conclusion

Google Cloud's N2 instances with local SSDs provide exceptional raw performance, but they've historically come with a trade-off: high performance or high availability, but not both. MayaScale eliminates this compromise.

With validated performance of 2.3 million read IOPS and 192 microsecond QD1 read latency, MayaScale's Ultra tier on Google Cloud delivers shared storage that rivals direct-attached NVMe while providing enterprise-grade high availability. Applications that previously required on-premises infrastructure or expensive specialized hardware can now run efficiently on Google Cloud with performance that wasn't possible with traditional cloud storage.

The combination of Google Cloud's N2 hardware and MayaScale's software creates a compelling solution for databases, analytics, AI/ML training, and any workload that demands both extreme performance and high availability.

Ready to Experience 2M+ IOPS on Google Cloud?

Try MayaScale Ultra tier today with our free trial. Deploy in minutes with Terraform and see the performance difference for yourself.
