Pure Storage ActiveCluster Turbocharges Our OpenStack Cloud

Unleashing Sub-Millisecond Latency and Zero-Downtime Reliability for Finance Workloads

Tiny Big Spark Newsletter: Revolutionizing Our OpenStack with Pure Storage ActiveCluster

Welcome to this edition of Tiny Big Spark, where we ignite transformative ideas that drive significant impacts in our cloud infrastructure. Last week, we showcased the power of OpenStack in our article, Walmart’s Million-Core Cloud Revolution: How OpenStack Defies Myths and Redefines Hybrid Scaling. As a follow-up, we’re launching a series to highlight the core modules and components that form the backbone of our Platform and Software Engineering. These components—backup, observability (formerly monitoring), security, inventory, and networking—reflect countless hours of experience, hard work, and lessons learned, ensuring that our software, infrastructure, and platform deliver unparalleled performance and reliability.

In this series, we’ll dive into each module to share how they empower our 40,000-core OpenStack environment. Today, we focus on a critical aspect of our design, driven by insights from our observability module: the need for higher IOPS to enhance our cloud instances. Our Ceph storage uses a low-cost, open-source design, with bcache on NVMe as the caching device and HDD as the backing device, and it has delivered impressive performance. However, the demands of our customers, particularly in finance, require even greater speed and reliability. This realization led us to transition from Ceph to Pure Storage ActiveCluster, a decision that redefines performance and resilience for our Charmed OpenStack cloud. Let’s explore why we made this change, the technology behind it, and the benefits it brings to our customers and Infrastructure Teams.

Listening to the Voice of Our Customers

Our commitment to our customers is the driving force behind this transition. Our OpenStack cloud, built on Canonical’s Charmed OpenStack, supports a diverse range of workloads, from AI-driven enterprise applications to high-performance computing (HPC). Ceph has been a reliable workhorse, using bcache with NVMe caching and HDD backing to deliver good performance at a low cost. However, our observability module revealed a growing demand for higher IOPS to support latency-sensitive applications, especially in the finance sector.

In finance, where milliseconds can mean millions, fast is not always fast enough. Our customers, while often unable to specify exact IOPS requirements, consistently emphasized the need for faster, more reliable storage to power their mission-critical applications. To address this, we conducted a comprehensive Proof of Concept (PoC), benchmarking Ceph against leading storage solutions, including Dell EMC, Huawei OceanStor, and Pure Storage. The results were clear: in our tests, Pure Storage’s all-flash arrays, combined with ActiveCluster, delivered the best performance and reliability of the solutions we evaluated, making it our choice for the cloud.

By listening to our customers and validating their needs through rigorous testing, we’ve chosen a solution that not only meets their performance demands but also elevates reliability and operational efficiency, ensuring our cloud remains a competitive advantage for our users.

Why Pure Storage ActiveCluster?

Pure Storage ActiveCluster, built on the Purity operating environment and integrated with the Pure1 AIOps management platform, is a fully symmetric, active/active synchronous replication solution that delivers zero downtime and maximum performance. Here’s why ActiveCluster emerged as the top choice in our evaluation and why it’s the perfect fit for our Charmed OpenStack deployment.

1. Unmatched Performance with All-Flash Arrays

Pure Storage’s all-flash arrays are the cornerstone of its performance advantage, delivering sub-millisecond latencies and industry-leading IOPS. In our PoC, Pure Storage outperformed Ceph, Dell EMC, and Huawei OceanStor across both sequential and random I/O workloads. For applications requiring high IOPS—such as financial trading platforms, databases, and machine learning pipelines—Pure Storage’s all-flash arrays provide the speed and responsiveness our customers demand.

Our Ceph deployment, optimized with bcache on NVMe and HDD, achieved respectable IOPS (50,000–100,000 in typical configurations), while Pure Storage’s FlashArray//m delivered up to 300,000 IOPS with sub-millisecond latency. This performance leap translates into faster application response times, critical for finance customers where every microsecond counts. Unlike our hybrid NVMe-and-HDD Ceph design, Pure Storage’s all-flash design eliminates the bottlenecks of spinning disks, ensuring consistent performance under heavy workloads.
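As a quick sanity check on the figures above, the following minimal sketch computes how the quoted peak of 300,000 IOPS compares with the 50,000–100,000 range we saw on Ceph. The function and constant names are illustrative only, and the numbers are simply the ones quoted in this article, not additional measurements.

```python
# Figures quoted in this article; the names are illustrative.
CEPH_IOPS_RANGE = (50_000, 100_000)  # typical bcache-backed Ceph configurations
PURE_IOPS_PEAK = 300_000             # "up to" figure for the all-flash array


def speedup(new_iops: float, old_iops: float) -> float:
    """Return how many times faster new_iops is than old_iops."""
    if old_iops <= 0:
        raise ValueError("old_iops must be positive")
    return new_iops / old_iops


# Compare against the best and the worst Ceph figure to get a range.
min_speedup = speedup(PURE_IOPS_PEAK, CEPH_IOPS_RANGE[1])
max_speedup = speedup(PURE_IOPS_PEAK, CEPH_IOPS_RANGE[0])

if __name__ == "__main__":
    print(f"Peak speedup over Ceph: {min_speedup:.0f}x to {max_speedup:.0f}x")
```

Against the top of the Ceph range the ratio is 3x, and against the bottom of the range it is 6x. Both are ratios of peak figures, so sustained results on a given workload may differ.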

2. ActiveCluster: Reliability Redefined

Performance is only half the equation—reliability is equally critical, especially in finance, where downtime or data loss can have catastrophic consequences. Pure Storage ActiveCluster addresses this with active/active synchronous replication, ensuring data is mirrored across two sites in real time, achieving zero Recovery Point Objective (RPO) and near-zero Recovery Time Objective (RTO).

ActiveCluster’s symmetric design allows both sites to handle read and write operations simultaneously, eliminating the complexity of traditional active/passive failover systems. In our PoC, we simulated site outages and found that ActiveCluster maintained 99.9999% availability, with failover times measured in seconds. Compared to Ceph’s eventual consistency model, which can introduce latency during replication, ActiveCluster’s synchronous replication guarantees data consistency, a non-negotiable requirement for our finance customers.
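To put an availability figure such as 99.9999% into perspective, it helps to convert it into a yearly downtime budget. The sketch below does that arithmetic. The function name is ours, and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def downtime_budget_seconds(availability_percent: float,
                            period_seconds: int = SECONDS_PER_YEAR) -> float:
    """Return the downtime allowed in a period at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * period_seconds


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999, 99.9999):
        seconds = downtime_budget_seconds(pct)
        print(f"{pct}% availability allows about {seconds:.1f} s of downtime per year")
```

At six nines the budget works out to roughly half a minute per year, which is why failover times measured in seconds matter: a single slow failover could consume the entire budget.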

3. Simplified Management with Pure1 AIOps

Managing a 40,000-core OpenStack cloud is a complex task, and storage management can be a significant challenge. While Ceph’s bcache-optimized design reduced costs, it required careful tuning and expertise to maintain performance and reliability. Pure Storage’s Pure1 AIOps platform simplifies storage operations with a cloud-based, AI-driven management interface.

Pure1 leverages machine learning to monitor our storage fleet, predict performance and capacity needs, and proactively resolve issues before they impact operations. For example, Pure1’s Workload Planner helped us simulate the impact of new workloads, ensuring we could scale efficiently to meet customer demands. The platform also supports non-disruptive upgrades, allowing our Infrastructure Teams to keep our storage environment current without impacting running workloads.

In contrast, Ceph’s management tools, while robust, often demand manual intervention and deep technical expertise. Pure1’s intuitive interface and automation capabilities reduce the operational burden on our Infrastructure Teams, enabling us to focus on delivering value to our customers.

4. Seamless Integration with Charmed OpenStack

Our Canonical OpenStack deployment relies on Cinder, the OpenStack block storage service, and its volume drivers to integrate with storage backends. Pure Storage’s Cinder driver, available on Charmhub (charmhub.io/cinder-purestorage), provides native support for OpenStack, ensuring seamless integration with our environment. The driver supports advanced features like snapshots, replication, and thin provisioning, enhancing the functionality of our cloud.

During our proof of concept (PoC), we found that Pure Storage’s Cinder driver was more robust and easier to configure than those of other enterprise storage appliances. The Pure Storage driver also supports ActiveCluster’s replication capabilities, allowing us to extend high availability across our OpenStack clusters. This integration ensures that our customers can leverage Pure Storage’s technology without compromising the flexibility of our Charmed OpenStack platform.
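For readers who want a feel for what the integration involves, here is a minimal sketch that renders a Cinder backend section for a Pure Storage array. It is not our production configuration. In a Charmed deployment the charm generates this file, and the backend name, addresses, and tokens below are placeholders. The iSCSI driver class is an assumption (the article does not say which transport we use), and the `replication_device` line with `type:sync` reflects our reading of the upstream Pure Storage driver documentation, so check it against the driver version you run.

```python
import configparser
import io
from typing import Optional


def render_pure_backend(name: str, san_ip: str, api_token: str,
                        peer: Optional[dict] = None) -> str:
    """Render a cinder.conf backend section for a Pure Storage array.

    peer, if given, describes the second array of a synchronous pair and
    must contain the keys backend_id, san_ip and api_token.
    """
    cfg = configparser.ConfigParser(interpolation=None)
    cfg[name] = {
        "volume_backend_name": name,
        # Assumption: iSCSI transport; the upstream driver also has other variants.
        "volume_driver": "cinder.volume.drivers.pure.PureISCSIDriver",
        "san_ip": san_ip,
        "pure_api_token": api_token,
        "use_multipath_for_image_xfer": "true",
    }
    if peer is not None:
        # Assumption: synchronous replication is requested with type:sync.
        cfg[name]["replication_device"] = (
            f"backend_id:{peer['backend_id']},"
            f"san_ip:{peer['san_ip']},"
            f"api_token:{peer['api_token']},"
            "type:sync"
        )
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values only; never commit real API tokens.
    print(render_pure_backend(
        "pure-site-a", "192.0.2.10", "REPLACE_ME",
        peer={"backend_id": "pure-site-b", "san_ip": "192.0.2.20",
              "api_token": "REPLACE_ME"},
    ))
```

Rendering the section from a function keeps the two sites symmetrical: swapping the primary and peer arguments yields the matching section for the other site.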

Comparing Ceph to Pure Storage ActiveCluster

To understand why we made the switch, let’s compare Ceph and Pure Storage ActiveCluster across key dimensions:

| Feature | Ceph (with bcache on NVMe + HDD) | Pure Storage ActiveCluster |
| --- | --- | --- |
| Performance | Moderate IOPS (50,000–100,000); latency varies with workload | High IOPS (up to 300,000); sub-millisecond latency with all-flash |
| Reliability | Eventual consistency; replication can introduce latency | Synchronous replication; zero RPO, near-zero RTO |
| Management | Complex, requires expertise for tuning and maintenance | Simplified with Pure1 AIOps; AI-driven insights and automation |
| Scalability | Highly scalable, but performance degrades with scale | Linear scalability with consistent performance |
| OpenStack Integration | RBD driver; seamless integration | Cinder driver; seamless integration, supports advanced features |
| Cost Efficiency | Low-cost open-source design with bcache on NVMe | Subscription-based model with predictable costs and high efficiency |

While Ceph’s bcache-optimized design offers a cost-effective, quick solution, its performance and reliability fall short of the demands of our finance customers. Pure Storage ActiveCluster, with its superior IOPS, synchronous replication, and simplified management, is the ideal choice for our high-performance OpenStack cloud.

The PoC That Sealed the Deal

Our decision was grounded in rigorous testing. During our PoC, we benchmarked Pure Storage against Dell EMC, Huawei OceanStor, and our bcache-optimized Ceph in a simulated OpenStack environment. The testbed included latency-sensitive databases and throughput-intensive AI workloads. Key findings:

  • Performance: Pure Storage delivered 3x the IOPS of Ceph and 2x that of Dell EMC and Huawei OceanStor in random read/write tests. For sequential workloads, Pure Storage achieved throughput rates of up to 10 GB/s, compared to Ceph’s 3–5 GB/s.

  • Reliability: ActiveCluster’s synchronous replication ensured zero data loss during simulated site failures, while Ceph’s eventual consistency introduced brief inconsistencies. Dell EMC and Huawei OceanStor offered comparable reliability but lacked ActiveCluster’s simplicity.

  • Ease of Use: Pure1 reduced setup and management time by 50% compared to Ceph. Dell EMC’s tools were robust but less intuitive, while Huawei OceanStor’s interface was less polished.

  • Integration: Pure Storage’s Cinder driver integrated seamlessly with Charmed OpenStack, requiring minimal configuration compared to Ceph’s RBD driver.

The PoC confirmed that Pure Storage ActiveCluster was the best solution for our OpenStack cloud, delivering the performance, reliability, and simplicity our customers and Infrastructure Teams required.

Benefits for Our Customers and Infrastructure Teams

The transition to Pure Storage ActiveCluster delivers significant benefits:

For Customers

  • Blazing-Fast Performance: Sub-millisecond latency and high IOPS accelerate applications, critical for finance workloads.

  • Rock-Solid Reliability: Zero RPO and near-zero RTO ensure mission-critical workloads remain online.

  • Future-Proof Scalability: Linear scalability and Pure1’s predictive analytics support growing demands.

For Our Infrastructure Teams

  • Simplified Operations: Pure1’s AIOps automates tasks and provides actionable insights.

  • Proactive Support: Pure1 resolves over 70% of issues proactively, minimizing downtime.

  • Seamless Upgrades: Non-disruptive upgrades keep our storage environment current.

Pure Storage’s Engineering Excellence: Enhancing Observability

A testament to Pure Storage’s commitment to engineering excellence is their recent firmware update, which introduced the Pure FlashArray OpenMetrics Exporter as part of their observability module. This update, executed flawlessly with zero downtime thanks to ActiveCluster’s robust design, enhances our ability to monitor and manage our storage infrastructure. The exporter extracts data from the Purity API and converts it into OpenMetrics format, making it compatible with observability platforms like Prometheus, which aligns perfectly with our observability module.

The exporter’s stateless design ensures easy configuration and scalability, allowing us to monitor an entire fleet of Pure Storage FlashArrays from a single instance or dedicate it to a single array. Following the multi-target-exporter pattern outlined in Prometheus documentation, the exporter is built using the Prometheus Go client library and Resty, a reliable HTTP/REST client for Go. It’s deployable via Docker and scalable on Kubernetes, offering flexibility for our cloud environment. To use it, we created a dedicated read-only user and API key on our arrays, ensuring secure and efficient monitoring.
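The multi-target-exporter pattern mentioned above can be sketched as a Prometheus scrape job in which each array is listed as a target and relabeling redirects the actual scrape to the exporter. The snippet below builds such a job as a plain Python dictionary. The host names, port, and job name are hypothetical, and the query parameter name is left configurable because the name the Pure exporter expects should be taken from its own documentation (the default here follows the generic example in the Prometheus docs). Passing the read-only API token is deliberately left out.

```python
import json
from typing import Iterable


def multi_target_scrape_job(job_name: str, exporter_address: str,
                            targets: Iterable[str],
                            metrics_path: str = "/metrics",
                            target_param: str = "target") -> dict:
    """Build a scrape job that follows the multi-target exporter pattern."""
    param_label = f"__param_{target_param}"
    return {
        "job_name": job_name,
        "metrics_path": metrics_path,
        "static_configs": [{"targets": list(targets)}],
        "relabel_configs": [
            # 1. Pass the listed array address to the exporter as a URL parameter.
            {"source_labels": ["__address__"], "target_label": param_label},
            # 2. Keep the array address as the instance label on every series.
            {"source_labels": [param_label], "target_label": "instance"},
            # 3. Send the actual scrape to the exporter instead of the array.
            {"target_label": "__address__", "replacement": exporter_address},
        ],
    }


if __name__ == "__main__":
    job = multi_target_scrape_job(
        "purefa",                            # hypothetical job name
        "exporter.example.internal:9490",    # hypothetical exporter address
        ["array01.example.internal", "array02.example.internal"],
    )
    print(json.dumps(job, indent=2))
```

Because the exporter is stateless, adding an array to the fleet only means adding one more entry to the target list; nothing changes on the exporter itself.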

This innovation underscores Pure Storage’s holistic approach to engineering, integrating observability directly into their hardware’s operating design. By providing real-time insights into performance and health, the exporter empowers our Infrastructure Teams to proactively optimize our storage infrastructure, further enhancing the reliability and performance our customers rely on.

Looking Ahead: A New Era for Our OpenStack Cloud

The shift to Pure Storage ActiveCluster marks a new era for our Canonical OpenStack cloud. By addressing our customers’ need for high IOPS and leveraging Pure Storage’s cutting-edge technology, we’re delivering a faster, more reliable, and easier-to-manage platform. ActiveCluster’s all-flash performance, synchronous replication, and AI-driven management pave the way for innovation, particularly in finance, where speed and reliability are paramount.

As we continue this series, we’ll explore our backup, security, inventory, and networking modules, sharing how they contribute to our cloud’s success.

Conclusion: A Tiny Big Spark for Our Cloud

In the spirit of Tiny Big Spark, the transition to Pure Storage ActiveCluster is a small change with a massive impact. By listening to our customers, leveraging insights from our observability module, and choosing a solution that delivers unmatched performance and reliability, we’ve ignited a transformation that benefits our entire ecosystem. Stay tuned for the next edition of Tiny Big Spark, where we’ll dive into our backup module, featuring Trilio's OpenStack data protection project. We'll share insights from our implementation and the lessons learned that are shaping the future of our OpenStack cloud.

That’s it! Keep innovating and stay inspired! If you think your colleagues and friends would find this content valuable, we’d love it if you shared our newsletter with them!

Disclaimer: The "Tiny Big Spark" newsletter is for informational and educational purposes only, not a substitute for professional advice, including financial, legal, medical, or technical. We strive for accuracy but make no guarantees about the completeness or reliability of the information provided. Any reliance on this information is at your own risk. The views expressed are those of the authors and do not reflect any organization's official position. This newsletter may link to external sites we don't control; we do not endorse their content. We are not liable for any losses or damages from using this information.
