MithiDocs

Archiving Object Storage to Vaultastic

Overview

Many organizations store large volumes of unstructured data in cloud object storage platforms, including:

  • Amazon S3

  • Azure Blob Storage

  • Google Cloud Storage

These repositories commonly contain:

  • Application-generated data

  • Logs and telemetry archives

  • Media assets

  • Exported database snapshots

  • Compliance records

  • Data lake objects

While object storage platforms provide high durability and scalability, they do not inherently provide:

  • Centralized compliance indexing

  • Unified cross-source search

  • Supervisory workflows

  • Unified governance across communication and file data

Using Data Upload, data from these object storage platforms can be ingested into Vaultastic to:

  • Create a structured archive
  • Apply retention policies
  • Enable compliance workflows

Vaultastic Storage Targets

Object storage data is typically archived into the following Vaultastic storage tiers.

Store      | Use Case
Open Store | Medium-term archival where data may require periodic access or search
Deep Store | Long-term archival optimized for low-frequency access and lower storage cost

The appropriate storage tier depends on:

  • Expected retrieval frequency

  • Regulatory retention requirements

  • Cost sensitivity

  • Legal hold considerations

Object Storage Ingestion Overview

The following table summarizes supported object storage ingestion options.

Data Source                  | Destination Store | Method      | Description
Amazon S3 Buckets            | Open / Deep Store | Data Upload | Copies objects from S3 buckets into Vaultastic
Azure Blob Containers        | Open / Deep Store | Data Upload | Retrieves blobs and transfers them to Vaultastic
Google Cloud Storage Buckets | Open / Deep Store | Data Upload | Extracts objects and uploads them to Vaultastic


During ingestion:

  • Object metadata is preserved
  • Retention policies can be applied

Object Storage Ingestion Process

Data Upload performs object storage archival using the following workflow:

  1. Connects securely to the source object storage platform.

  2. Authenticates using service credentials.

  3. Enumerates selected buckets or containers.

  4. Copies objects and associated metadata.

  5. Uploads the data into Vaultastic Open Store or Deep Store.

  6. Applies indexing and retention policies.

This process enables structured archival while preserving the original object structure.
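The enumerate-and-copy part of this workflow (steps 3 to 5) can be sketched as follows. This is an illustrative in-memory model, not the Data Upload implementation: `SourceObject`, `archive_objects`, and the dictionary standing in for the destination store are hypothetical names, and connection, authentication, and indexing are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class SourceObject:
    """Hypothetical stand-in for one object in a bucket or container."""
    key: str
    data: bytes
    metadata: dict = field(default_factory=dict)


def archive_objects(source_buckets, selected, destination):
    """Copy every object in the selected buckets into `destination`.

    `source_buckets` maps bucket name -> list of SourceObject.
    `destination` is a dict standing in for the target store; entries are
    keyed by "bucket/key" so the original object structure is preserved.
    Returns the number of objects copied.
    """
    copied = 0
    for bucket in selected:                       # step 3: enumerate selection
        for obj in source_buckets[bucket]:
            destination[f"{bucket}/{obj.key}"] = {
                "data": obj.data,                 # step 4: object content
                "metadata": dict(obj.metadata),   # step 4: associated metadata
            }                                     # step 5: write to the store
            copied += 1
    return copied
```

Keying the copy by bucket and original key is one way to keep the source layout recoverable after archival.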

Supported Ingestion Scenarios

Vaultastic supports multiple ingestion patterns:

Full Bucket Migration

Entire buckets or containers can be copied into Vaultastic.

This approach is typically used for:

  • System decommissioning

  • Data consolidation projects

  • Migration from one storage platform to another

Prefix or Folder-Based Archival

Specific paths within a bucket can be archived.

This enables structured archival by:

  • Department

  • Application

  • Workload

  • Project
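Because object storage has no real folders, a "path" is simply a key prefix. A minimal sketch of prefix-based selection, assuming object keys are available as plain strings (`select_by_prefix` is a hypothetical helper, not part of Data Upload):

```python
def select_by_prefix(keys, prefix):
    """Return the object keys that fall under the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


keys = [
    "finance/2023/ledger.csv",
    "finance/2024/ledger.csv",
    "hr/2023/payroll.csv",
]
finance_keys = select_by_prefix(keys, "finance/")
```

Including the trailing slash in the prefix avoids also matching sibling paths such as `finance-archive/`.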

Date-Based Archival

Objects older than a defined threshold can be archived.

Use cases:

  • Lifecycle management
  • Archiving inactive data
  • Cost optimization
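The age threshold can be expressed as a cutoff timestamp compared against each object's last-modified time. The sketch below assumes objects are given as `(key, last_modified)` pairs with timezone-aware timestamps; `older_than` is a hypothetical helper used only for illustration.

```python
from datetime import datetime, timedelta, timezone


def older_than(objects, days, now=None):
    """Return keys of objects last modified more than `days` days ago.

    `objects` is an iterable of (key, last_modified) pairs.
    `now` can be injected for reproducible runs; it defaults to the current UTC time.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [key for key, last_modified in objects if last_modified < cutoff]
```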

Scheduled Incremental Archival

Recurring ingestion jobs archive newly created or modified objects.

Use cases:

  • Continuous compliance archival
  • Active environments generating ongoing data
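One common way to implement incremental selection is a high-water mark: each run picks up objects modified after the previous run's mark and then advances it. The source does not state how Data Upload tracks changes, so the following is only a sketch of that general technique with a hypothetical `select_incremental` function.

```python
from datetime import datetime, timezone


def select_incremental(objects, last_run):
    """Select objects created or modified since the previous run.

    `objects` maps key -> last_modified timestamp.
    Returns (changed_keys, new_watermark), where the new watermark is the
    latest timestamp seen, or `last_run` if nothing is newer.
    """
    changed = sorted(key for key, ts in objects.items() if ts > last_run)
    new_watermark = max([last_run, *objects.values()])
    return changed, new_watermark
```

The returned watermark would be stored and passed as `last_run` on the next scheduled job.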

Why Archive Cloud Object Storage to Vaultastic

Compliance and Governance

Object storage platforms focus primarily on durability and scalability and lack compliance-centric capabilities.

Vaultastic provides:

  • Centralized retention enforcement

  • Indexed search across archived data

  • Audit logging

  • Integration with compliance and investigation workflows

Cost Optimization

Cloud object storage costs can grow rapidly as data volumes increase.

Archiving data into Vaultastic enables organizations to:

  • Reduce active storage footprint

  • Move aging objects into Deep Store

  • Optimize storage lifecycle management

Unified Data Governance

Many organizations operate multi-cloud environments.

Archiving into Vaultastic enables:

  • Consolidation across cloud platforms

  • Centralized search across archived datasets

  • Standardized retention policies

Risk Reduction

Cloud storage environments may be exposed to risks such as:

  • Misconfigured public access

  • Accidental deletion

  • Credential compromise

Vaultastic mitigates these risks by providing:

  • An independent preservation layer
  • Separation from production storage

Security and Access Configuration

LegacyFlo requires secure access credentials to connect to source object storage platforms.

Typical credential models include:

Platform             | Credential Model
AWS S3               | IAM user or role with read-only bucket access
Azure Blob Storage   | Service principal or SAS token
Google Cloud Storage | Service account with object viewer permissions

Guidelines:

  • Use least-privilege access
  • Restrict access to required buckets/containers only
  • Rotate credentials periodically
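For AWS S3, least-privilege read-only access is usually expressed as an IAM policy limited to listing one bucket and reading its objects. The sketch below builds such a policy document; `read_only_s3_policy` is a hypothetical helper, and the exact permission set Data Upload needs is not stated here, so confirm it with the Vaultastic implementation team before use.

```python
import json


def read_only_s3_policy(bucket, prefix=""):
    """Build an IAM policy allowing list and read on a single bucket.

    If `prefix` is given, object reads are limited to keys under that prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
        ],
    }


# Example: bucket and prefix names are placeholders.
policy_json = json.dumps(read_only_s3_policy("finance-exports", "2023/"), indent=2)
```

The policy grants no write or delete actions, which matches the read-only model in the table above.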

Vaultastic implementation teams assist with:

  • Permission scoping
  • Endpoint configuration
  • Transfer validation

Initial Configuration

Follow these steps to archive cloud object storage into Vaultastic.

1. Identify Archival Scope

  • Select buckets/containers
  • Define archival type (full, prefix, date-based)
  • Determine Open vs Deep Store
  • Decide on one-time vs recurring ingestion

2. Provision Source Credentials

  • Create read-only credentials for the source object storage
  • Validate network connectivity
  • Ensure access scope is restricted

3. Configure Data Upload Ingestion

Define the ingestion request with the following details:

  • Object storage endpoint

  • Bucket or container name

  • Prefix or path (if applicable)

  • Destination Vaultastic store

  • Optional filters (date, path, object type)
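The details above can be captured in a small validated structure before a job is submitted. The field names and the `IngestionRequest` class below are hypothetical and do not reflect the actual Data Upload request format; the sketch only shows the kind of checks worth making up front.

```python
from dataclasses import dataclass
from typing import Optional

VALID_STORES = {"open", "deep"}  # assumed labels for Open Store and Deep Store


@dataclass(frozen=True)
class IngestionRequest:
    """Hypothetical description of one ingestion job."""
    endpoint: str
    bucket: str
    destination_store: str
    prefix: str = ""                       # optional path filter
    older_than_days: Optional[int] = None  # optional date filter

    def __post_init__(self):
        if not self.endpoint or not self.bucket:
            raise ValueError("endpoint and bucket are required")
        if self.destination_store not in VALID_STORES:
            raise ValueError(f"destination_store must be one of {sorted(VALID_STORES)}")
        if self.older_than_days is not None and self.older_than_days < 0:
            raise ValueError("older_than_days must not be negative")


# Example values are placeholders.
request = IngestionRequest(
    endpoint="https://s3.example.com",
    bucket="app-logs",
    destination_store="deep",
    prefix="2023/",
    older_than_days=365,
)
```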

4. Execute Initial Migration

Run the first ingestion job and verify that:

  • All objects are transferred successfully

  • Metadata is preserved

  • Indexing is functioning correctly

5. Configure Recurring Archival (Optional)

If ongoing archival is required:

  • Schedule incremental ingestion jobs

  • Monitor transfer logs and execution status

  • Periodically validate completeness


Benefits of Object Storage Archival in Vaultastic

Archiving cloud object storage into Vaultastic provides:

  • Centralized governance across cloud environments

  • Regulatory-compliant retention controls

  • Reduced exposure of primary storage systems

  • Centralized indexing and discovery

  • Durable preservation independent of production workloads

By ingesting data from object storage platforms into Vaultastic Open Store or Deep Store, organizations establish a structured and compliant archival framework across their cloud infrastructure.

Prerequisites

Before configuring ingestion, confirm that:

  • Network connectivity exists from LegacyFlo to the object storage endpoints
  • Required ports and firewall rules are open
  • Credentials with read-only access are created
  • Buckets/containers and prefixes are identified
  • The estimated data volume and ingestion window are defined

Limitations and Considerations

  • Ingestion is copy-based (source data is not deleted automatically)
  • Large buckets may require phased ingestion
  • API rate limits of cloud providers may impact throughput
  • Object versioning behavior depends on source configuration
  • The source encryption mode (server-side or client-side) must be supported and validated before ingestion
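Two of these considerations, phased ingestion and provider rate limits, are commonly handled by splitting the key list into batches and retrying throttled calls with exponential backoff. The sketch below illustrates both techniques; `phases`, `with_backoff`, and `ThrottledError` are hypothetical names, and real provider SDKs raise their own throttling exceptions.

```python
import time


class ThrottledError(Exception):
    """Hypothetical error standing in for a provider rate-limit response."""


def phases(keys, size):
    """Split a list of object keys into consecutive batches of at most `size`."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def with_backoff(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `operation`, retrying on ThrottledError with doubling delays.

    Delays are base_delay, 2 * base_delay, 4 * base_delay, and so on.
    The last failure is re-raised once `max_attempts` is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting.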

Monitoring and Validation

Track ingestion using:

  • Job execution logs
  • Object count comparison (source vs Vaultastic)
  • Error and retry reports
  • Indexing status

Periodic validation:

  • Random object verification
  • Metadata validation
  • Search test queries
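Object count comparison and random object verification can be sketched as a reconciliation of two key listings. This assumes key listings can be exported from both the source and Vaultastic; `reconcile` and `sample_for_review` are hypothetical helpers.

```python
import random


def reconcile(source_keys, archived_keys):
    """Compare source and archive listings and report any differences."""
    source, archived = set(source_keys), set(archived_keys)
    return {
        "source_count": len(source),
        "archived_count": len(archived),
        "missing": sorted(source - archived),      # in source, not archived
        "unexpected": sorted(archived - source),   # archived, not in source
    }


def sample_for_review(keys, k, seed=None):
    """Pick up to `k` keys at random for manual spot checks."""
    ordered = sorted(keys)
    return random.Random(seed).sample(ordered, min(k, len(ordered)))
```

Matching counts alone can hide a missing object offset by an unexpected one, which is why the sketch also reports the key-level differences.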

Data Integrity and Verification

To ensure integrity:

  • Validate object counts post ingestion
  • Verify checksum/hash where applicable
  • Ensure metadata consistency
  • Confirm retention policies are applied correctly
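Checksum verification can be sketched as comparing a digest recorded at the source with one computed on the archived copy. The example uses SHA-256 from the Python standard library; which hash is actually available depends on the source platform and on Vaultastic, so treat `find_mismatches` and the digest maps as illustrative assumptions.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_mismatches(source_digests, archived_digests):
    """Return keys that are missing from the archive or whose digests differ.

    Both arguments map object key -> hex digest.
    """
    return sorted(
        key
        for key, digest in source_digests.items()
        if archived_digests.get(key) != digest
    )
```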