Google Enhances Storage Performance with Homebrew Automation

Relying on Hard Disk Drives

Google says it still relies on hard disk drives (HDDs) for most of its storage, while achieving significant performance gains through an in-house automated data tiering system. In a recent announcement, the company described its “Colossus” universal storage platform, which underpins services such as YouTube, Gmail, and Google Cloud Storage.

Colossus Storage Performance

The Colossus platform hosts large filesystems, some exceeding 10 exabytes, and can reach read throughput above 50 TB/s and write throughput of 25 TB/s. According to the announcement, the busiest cluster frequently handles more than 600 million input/output operations per second (IOPS), counting reads and writes together.

Google noted in 2021 that it combines flash and disk storage, keeping frequently accessed data on SSDs to improve efficiency and reduce latency. The remaining challenge is balancing fast SSDs against cost-effective HDDs.

Automated Caching with L4

The automated caching system, known as “L4,” plays a critical role in determining which data is best suited for SSD storage. It creates an index that helps identify whether data is available in cache or on HDD, thus optimizing data access speeds.

  • L4 uses machine learning algorithms to analyze I/O patterns and decide the appropriate caching policy for different workloads.
  • New data is categorized, and its storage location is chosen based on its usage patterns.
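
To make the idea of an index plus a per-workload caching policy more concrete, here is a minimal sketch in Python. It is not Google's implementation: the class `CacheIndex`, the functions `hit_rate` and `choose_policy`, and the `admit_after` parameter are all hypothetical names, and the machine learning step is replaced by something much simpler, namely replaying a recorded access trace under each candidate policy and keeping the one with the best hit rate.

```python
from collections import OrderedDict


class CacheIndex:
    """Toy index that answers "is this block on SSD or on HDD?".

    Hypothetical sketch: blocks are admitted to a small LRU-managed SSD
    cache once they have been read `admit_after` times from HDD.
    """

    def __init__(self, ssd_capacity, admit_after=2):
        self.ssd_capacity = ssd_capacity
        self.admit_after = admit_after
        self.ssd = OrderedDict()      # block_id -> True, oldest entry first
        self.miss_counts = {}         # block_id -> number of HDD reads so far

    def read(self, block_id):
        """Return the tier that serves this read, then update the index."""
        if block_id in self.ssd:
            self.ssd.move_to_end(block_id)   # mark as most recently used
            return "ssd"
        count = self.miss_counts.get(block_id, 0) + 1
        self.miss_counts[block_id] = count
        if count >= self.admit_after:
            self._admit(block_id)
        return "hdd"

    def _admit(self, block_id):
        """Place a block on SSD, evicting least recently used blocks if full."""
        self.ssd[block_id] = True
        self.miss_counts.pop(block_id, None)
        while len(self.ssd) > self.ssd_capacity:
            self.ssd.popitem(last=False)


def hit_rate(trace, ssd_capacity, admit_after):
    """Replay a trace of block IDs and return the fraction served from SSD."""
    index = CacheIndex(ssd_capacity, admit_after)
    hits = sum(1 for block_id in trace if index.read(block_id) == "ssd")
    return hits / len(trace)


def choose_policy(trace, ssd_capacity, candidates=(1, 2)):
    """Pick the admission threshold with the best simulated hit rate."""
    return max(candidates, key=lambda n: hit_rate(trace, ssd_capacity, n))
```

For a workload where one hot block is interleaved with one-off reads, such as `["a", "a", "x", "a", "y", "a", "z", "a"]` with room for a single block on SSD, `choose_policy` prefers admitting on the second access, because admitting on first access lets each one-off read evict the hot block.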

Although L4 has improved IOPS and throughput for frequently accessed data, some kinds of data, such as files that are written and then quickly deleted, are a poor fit for HDDs, prompting consideration of placing such data directly on SSDs.
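
The placement trade-off described here can be illustrated with a small rule-based sketch. Everything in it is an assumption made for illustration: the `FileProfile` fields, the `choose_tier` function, and both thresholds are hypothetical and are not taken from Google's announcement.

```python
from dataclasses import dataclass


@dataclass
class FileProfile:
    """Hypothetical summary of how a new file is expected to behave."""
    expected_lifetime_s: float       # time until the file is deleted
    expected_reads_per_hour: float   # predicted read rate


def choose_tier(profile, short_lived_s=3600.0, hot_reads_per_hour=100.0):
    """Return an illustrative placement for a new file.

    Thresholds are made-up defaults, not values from the source.
    """
    if profile.expected_lifetime_s <= short_lived_s:
        # Written and deleted quickly: write straight to SSD.
        return "ssd"
    if profile.expected_reads_per_hour >= hot_reads_per_hour:
        # Long-lived but hot: store on HDD and let the cache hold a copy.
        return "hdd+ssd-cache"
    # Long-lived and cold: HDD only.
    return "hdd"
```

Under these assumed thresholds, a temporary file expected to live ten minutes would go directly to SSD, while a rarely read archive would stay on HDD only.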

Looking Ahead

Google is still working toward the optimal mix of HDD and SSD storage, and it plans to share more about its storage systems at the Google Cloud Next conference in April. Storage tech lead Larry Greenfield and storage software engineer Seth Pollen recommend that attendees join the sessions on new features and on optimizing storage infrastructure.