Cloud Storage

General Info

  • Binary large-object storage
  • Use cases: any workload that needs to store large binary objects (blobs), for example
    • online content
    • backup and archiving
    • storage of intermediate results in processing workflows
  • Bucket Attributes
    • Globally unique name
    • Storage class
    • Location (region or multi-region)
    • IAM policies or Access Control Lists
    • Object versioning setting
    • Object lifecycle management rules
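Object lifecycle management rules are typically written as a small JSON document attached to the bucket. The sketch below builds one in Python, assuming the JSON layout of a `rule` list whose entries pair an `action` with a `condition`. The helper name `make_lifecycle_config` and the 30-day and 365-day thresholds are hypothetical and chosen only for illustration.

```python
import json


def make_lifecycle_config(nearline_after_days: int, delete_after_days: int) -> dict:
    """Build a two-rule lifecycle configuration (hypothetical helper).

    Rule 1 moves objects to Nearline once they reach a given age.
    Rule 2 deletes objects once they reach a later age.
    """
    if not 0 <= nearline_after_days < delete_after_days:
        raise ValueError("expected 0 <= nearline_after_days < delete_after_days")
    return {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": nearline_after_days},  # age in days
            },
            {
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},  # age in days
            },
        ]
    }


# Example values only: demote after 30 days, delete after one year.
config = make_lifecycle_config(30, 365)
print(json.dumps(config, indent=2))
```

The printed JSON is the kind of document you would then attach to a bucket with your tool of choice. Check the current Cloud Storage documentation for the full set of supported actions and conditions before relying on it.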

Features

Feature Description
Encryption
  • At rest: data is encrypted by default before it is written to disk
  • In transit: data is encrypted by default between Google and the endpoint
Mutability
  • Objects are immutable
  • You do not edit an object in place; instead, you create a new version of it
Versioning
  • If enabled, Cloud Storage keeps a history of modifications (overwrites or deletes) for every object in the bucket
  • You can list the archived versions of an object, restore an object to an older state, or permanently delete a version
  • If disabled, a new write always overwrites the old object, and the old content cannot be recovered
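The difference between the two versioning settings can be illustrated with a toy in-memory model. This is not the Cloud Storage API: the class `ToyBucket`, its methods, and the integer generation counter are all hypothetical and exist only to mirror the behaviour described above (archive on overwrite or delete, list archived versions, restore an older state).

```python
from collections import defaultdict


class ToyBucket:
    """In-memory stand-in for a bucket (hypothetical, for illustration only)."""

    def __init__(self, versioning_enabled: bool):
        self.versioning_enabled = versioning_enabled
        self._live = {}                     # name -> (generation, data)
        self._archived = defaultdict(list)  # name -> [(generation, data), ...]
        self._next_generation = 1

    def _archive_live(self, name: str) -> None:
        # With versioning on, the current live version is kept as archived.
        if self.versioning_enabled and name in self._live:
            self._archived[name].append(self._live[name])

    def write(self, name: str, data: bytes) -> int:
        """Create a new live version; objects are never edited in place."""
        self._archive_live(name)
        generation = self._next_generation
        self._next_generation += 1
        self._live[name] = (generation, data)
        return generation

    def delete(self, name: str) -> None:
        self._archive_live(name)
        self._live.pop(name, None)

    def read(self, name: str) -> bytes:
        return self._live[name][1]

    def archived_generations(self, name: str) -> list:
        return [generation for generation, _ in self._archived[name]]

    def restore(self, name: str, generation: int) -> int:
        """Restore by writing the archived data back as a new live version."""
        for archived_generation, data in self._archived[name]:
            if archived_generation == generation:
                return self.write(name, data)
        raise KeyError(f"no archived generation {generation} for {name!r}")


bucket = ToyBucket(versioning_enabled=True)
first = bucket.write("report.txt", b"v1")
bucket.write("report.txt", b"v2")
bucket.restore("report.txt", first)  # live content is b"v1" again

plain = ToyBucket(versioning_enabled=False)
plain.write("report.txt", b"v1")
plain.write("report.txt", b"v2")     # b"v1" is gone for good
```

In the versioned bucket, restoring does not rewind history: it adds a third version whose content equals the first, while the earlier versions stay in the archive until they are deleted explicitly.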

Import/Export

Solution Description
Offline Media Import/Export
  • 3rd-party solution that allows you to load data into Google Cloud Storage by sending physical media (HDDs, tapes, etc.) to a 3rd-party service provider who uploads data on your behalf
Cloud Storage Transfer Service
  • Lets you import large amounts of online data into Google Cloud Storage
  • You set up a transfer from a data source (Amazon S3, an HTTP/HTTPS location, or another Cloud Storage bucket) to a data sink (a Google Cloud Storage bucket)
Transfer Appliance
  • Storage server that you lease from Google Cloud
  • Connect it to your network, load it with data, and then ship it to an upload facility where the data is uploaded to Cloud Storage

Storage Classes

Class Description
Multi-regional
  • High-performance
  • Geo-redundant: you pick a broad geographical location (US, EU, or Asia), and data is stored in at least 2 geographic locations separated by at least 160 km
  • Appropriate for storing frequently accessed data (website content, interactive workloads, etc.)
Regional
  • High-performance
  • Lets you store data in a specific GCP region (e.g., us-central1, europe-west1, asia-east1)
  • Cheaper than multi-regional storage, but it offers less redundancy
  • Appropriate for storing data close to the Compute Engine virtual machines or GKE clusters that use it, which gives better performance for data-intensive computations
Nearline
  • Backup and archival storage
  • Low-cost, highly durable storage service for storing infrequently accessed data
  • Intended for data that you plan to read or modify on average once a month or less
  • Also incurs an access fee per gigabyte of data read
Coldline
  • Backup and archival storage
  • Very-low-cost, highly durable storage service for data archiving, online backup, and disaster recovery
  • Best choice for data that you plan to access at most once a year, due to its slightly lower availability, 90-day minimum storage duration, costs for data access, and higher per-operation costs
  • Incurs a higher fee per gigabyte of data read than Nearline
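The access-frequency guidance in the table above can be condensed into a small decision helper. This is a rough sketch: the function name `suggest_storage_class` is hypothetical, the thresholds are taken directly from the descriptions above (at most once a year for Coldline, once a month or less for Nearline), and a real decision would also weigh minimum storage duration, retrieval fees, and per-operation costs.

```python
def suggest_storage_class(reads_per_year: float, geo_redundant: bool = False) -> str:
    """Suggest a storage class from expected access frequency (illustrative only)."""
    if reads_per_year < 0:
        raise ValueError("reads_per_year must be non-negative")
    if reads_per_year <= 1:
        # Accessed at most once a year: archiving, disaster recovery.
        return "COLDLINE"
    if reads_per_year <= 12:
        # Accessed on average once a month or less: backups.
        return "NEARLINE"
    # Frequently accessed data: choose by the redundancy you need.
    return "MULTI_REGIONAL" if geo_redundant else "REGIONAL"


print(suggest_storage_class(0.5))                      # yearly disaster-recovery drill
print(suggest_storage_class(12))                       # monthly backup verification
print(suggest_storage_class(365))                      # daily batch job near Compute Engine
print(suggest_storage_class(365, geo_redundant=True))  # website content served worldwide
```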

Access Control

Cloud IAM
  • Roles are inherited from project to bucket to object
Access Control Lists ("ACLs")
  • Define who has access to buckets and objects, as well as what level of access they have
  • Each ACL consists of two parts
    • Scope: defines who can perform the specified actions
    • Permission: defines what actions can be performed
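The scope/permission pairing can be sketched as a small data model. This is a simplified illustration, not the Cloud Storage client library: `AclEntry` and `is_allowed` are hypothetical names, only the bucket-level permissions READER, WRITER, and OWNER are modelled, they are assumed to be concentric (each level includes the ones below it), and the only special scope handled is `allUsers`. IAM roles, groups, and other scope types are ignored.

```python
from dataclasses import dataclass

# Bucket-level ACL permissions ranked from weakest to strongest.
# Assumption: permissions are concentric, so OWNER implies WRITER implies READER.
_RANK = {"READER": 0, "WRITER": 1, "OWNER": 2}


@dataclass(frozen=True)
class AclEntry:
    scope: str       # who: e.g. "user-jane@example.com" or "allUsers"
    permission: str  # what: "READER", "WRITER", or "OWNER"


def is_allowed(acl: list, scope: str, needed: str) -> bool:
    """Return True if any entry grants `scope` at least the `needed` permission."""
    return any(
        entry.scope in (scope, "allUsers")
        and _RANK[entry.permission] >= _RANK[needed]
        for entry in acl
    )


# Example ACL: one named writer, and public read access for everyone else.
acl = [
    AclEntry(scope="user-jane@example.com", permission="WRITER"),
    AclEntry(scope="allUsers", permission="READER"),
]

print(is_allowed(acl, "user-jane@example.com", "WRITER"))
print(is_allowed(acl, "user-bob@example.com", "WRITER"))
```

Here the first check succeeds because Jane has an explicit WRITER entry, while the second fails because Bob is covered only by the public READER entry.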