Overview
Applications in Google Workspace—such as Gmail, Google Chat, and Google Drive—generate large volumes of organizational data.
This data often needs to be preserved for:
- Business continuity and disaster recovery
- Regulatory compliance
- Supervision and audit requirements
- Legal hold and investigation workflows
Vaultastic enables organizations to ingest, store, and manage Google Workspace data using configurable ingestion pipelines. Data is archived into storage tiers optimized for access frequency, performance, and long-term retention.
Vaultastic Storage Tiers
Vaultastic organizes archived data into multiple storage tiers.
| Store | Purpose |
|---|---|
| Active Store | High-performance storage for frequently accessed data and supervision workflows |
| Open Store | Medium-term archival with searchable retention |
| Deep Store | Long-term archival optimized for low-cost storage |
Key Notes:
- Data placement is configurable during ingestion.
- Data can be moved between tiers based on access needs.
- Activation workflows allow moving data from Open/Deep → Active for investigation.
Data Ingestion Overview
The following table summarizes how Google Workspace data can be archived into Vaultastic.
| Data Source | Destination Store | Method | Description |
|---|---|---|---|
| Live Email Transactions | Active Store | Gmail Routing Rules | Automatically archives inbound and outbound email |
| Mailbox Email (Existing Data) | Active / Open / Deep | Data Upload Application | Copies historical mailbox email to Vaultastic |
| PST / EML Files | Active / Open / Deep | Manual Upload | Upload existing email archives |
| Google Chat | Active / Open / Deep | Data Upload Application | Converts chat messages into email format before ingestion |
| Google Drive | Open / Deep | Data Upload Application | Uploads files for long-term archival |
Clarification:
- Chat data is normalized into email format to enable consistent indexing and search.
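The source-to-store pairings in the table can be expressed as a small lookup, which may be handy when scripting or reviewing an ingestion plan. This is a minimal sketch: the key names (`live_email`, `pst_eml`, and so on) and the `validate_destination` helper are illustrative and not part of any Vaultastic API.

```python
# Allowed destination stores per data source, transcribed from the table above.
# The key names are hypothetical labels chosen for this sketch.
ALLOWED_STORES = {
    "live_email": {"active"},
    "mailbox_email": {"active", "open", "deep"},
    "pst_eml": {"active", "open", "deep"},
    "google_chat": {"active", "open", "deep"},
    "google_drive": {"open", "deep"},
}


def validate_destination(source: str, store: str) -> bool:
    """Return True if the table permits archiving `source` into `store`."""
    if source not in ALLOWED_STORES:
        raise ValueError(f"unknown data source: {source!r}")
    return store.lower() in ALLOWED_STORES[source]
```

For example, `validate_destination("google_drive", "active")` returns `False`, because the table lists only the Open and Deep Stores for Drive data.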
Email Archival
Vaultastic supports:
- Real-time email capture (journal-based)
- Historical mailbox ingestion
Live Mail Flow
To automatically capture all email transactions:
1. Configure Gmail routing rules in the Google Workspace Admin Console.
2. Route copies of inbound and outbound email to the Vaultastic Active Store.
3. Apply the rule to selected users, groups, or the entire domain.
Important:
- This is a continuous journaling mechanism
- It does not impact user mail delivery
Validation:
- Send test emails internally and externally
- Confirm ingestion in Vaultastic Active Store
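One way to make the validation step easier is to tag each test email with a unique token, so the message can be found unambiguously when searching the Active Store. The sketch below only builds the message using the Python standard library; the helper name `build_probe_message` and the subject wording are assumptions made for illustration.

```python
import uuid
from email.message import EmailMessage


def build_probe_message(sender: str, recipient: str) -> tuple[EmailMessage, str]:
    """Build a test email whose subject carries a unique, searchable token."""
    token = uuid.uuid4().hex  # 32 hexadecimal characters, unique per call
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Archive probe {token}"
    msg.set_content("Test message for verifying archival capture.")
    return msg, token
```

Send one probe to an internal recipient and one to an external recipient through your usual mail client or relay, then search for each token in the Active Store.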
Existing Mailbox Data
Historical email already stored in user mailboxes can be ingested using the Data Upload Application.
Supported ingestion targets:
- Active Store → for supervision and search
- Open Store → for medium-term retention
- Deep Store → for long-term archival
Filtering options:
- Date range
- Users
- Mailbox size thresholds
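The filters above can be pictured as simple predicates applied before ingestion. The sketch below is not the Data Upload Application's logic; the `Mailbox` record, the function names, and the reading of the size threshold as an upper bound are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Set


@dataclass
class Mailbox:
    """Hypothetical record describing one user mailbox."""
    user: str
    size_mb: float


def select_mailboxes(mailboxes: Iterable[Mailbox],
                     users: Optional[Set[str]] = None,
                     max_size_mb: Optional[float] = None) -> list:
    """Keep mailboxes that match the user filter and fit under the size threshold."""
    selected = []
    for box in mailboxes:
        if users is not None and box.user not in users:
            continue
        if max_size_mb is not None and box.size_mb > max_size_mb:
            continue
        selected.append(box)
    return selected


def in_date_range(sent_on: date,
                  start: Optional[date] = None,
                  end: Optional[date] = None) -> bool:
    """Return True if a message date falls inside the inclusive range."""
    return (start is None or sent_on >= start) and (end is None or sent_on <= end)
```

Passing `None` for a filter leaves it disabled, so the same helpers cover a single filter or any combination.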
PST or EML Upload
If email data has already been exported, it can be uploaded directly.
Supported formats:
- PST
- EML
Upload targets:
- Open Store
- Deep Store
Operational Notes:
- Bulk uploads should be staged to avoid performance impact
- Data can later be promoted to Active Store for investigation
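Staging a bulk upload can be as simple as grouping files into batches under a size cap and uploading one batch at a time. The following is a minimal sketch of that grouping; `stage_batches` and the cap value are hypothetical, and the appropriate batch size depends on your environment.

```python
def stage_batches(files, max_batch_bytes):
    """Group (name, size_bytes) pairs, in order, into batches under a size cap.

    A single file larger than the cap is placed in a batch of its own.
    """
    batches = []
    current = []
    current_size = 0
    for name, size in files:
        # Close the current batch if adding this file would exceed the cap.
        if current and current_size + size > max_batch_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

With a cap of 8 units and files sized 4, 3, 5, and 12, the first two files share a batch and the remaining two are each uploaded separately.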
Google Chat Archival
Google Chat data is archived for:
- Compliance supervision
- Investigations & audit workflows
- Long-term retention
Using the Data Upload Application, administrators can archive:
- Direct messages
- Spaces
Filters:
- Date range
- Selected users
- Entire domain
Processing Behavior:
- Chat messages are converted into an email-compatible format
- The conversion ensures uniform indexing and search consistency
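To illustrate what an email-compatible representation of a chat message might look like, the sketch below maps a message onto standard email headers using Python's `email` package. The exact mapping Vaultastic applies is not described here, so the field choices (participants in `To`, space name in `Subject`) and the `chat_to_email` helper are assumptions.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime


def chat_to_email(sender, participants, text, sent_at, space_name=None):
    """Represent one chat message as an email message (illustrative mapping)."""
    msg = EmailMessage()
    msg["From"] = sender
    # Everyone in the conversation except the sender becomes a recipient.
    msg["To"] = ", ".join(p for p in participants if p != sender)
    msg["Date"] = format_datetime(sent_at)
    msg["Subject"] = f"Chat in {space_name}" if space_name else "Direct message"
    msg.set_content(text)
    return msg


example = chat_to_email(
    "a@example.com",
    ["a@example.com", "b@example.com"],
    "Quarterly report is ready.",
    datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    space_name="Finance",
)
```

Once a chat message carries the same headers as email, the same index and search filters (sender, recipient, date) apply to both.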
Google Drive Archival
Google Drive files can be archived using the Data Upload Application.
Supported destinations:
- Open Store
- Deep Store
Configuration Options:
- User-based selection
- Group-based selection
- Date-based filtering
- Scheduled recurring archival
Clarification:
- File metadata (owner, timestamps, permissions) is preserved
- Version history handling depends on API limitations (see Important Considerations)
Initial Configuration
Follow the steps below to configure Google Workspace archival in Vaultastic.
1. Define User Scope
- Create Google Workspace Groups
- Add users to be archived
Why this matters:
- Enables centralized management
- Simplifies onboarding/offboarding
2. Configure Email Routing
- Configure Gmail routing rules
- Route copies of inbound, local (internal), and outbound email to Vaultastic
Validation Checklist:
Verify that mail flow is functioning correctly to ensure continuous capture of email transactions.
- The rule is applied to the correct scope
- No delivery disruption
- Emails visible in Vaultastic
3. Configure API Access
- Generate API credentials in Google Workspace
- Register credentials in Vaultastic using the Setup Connections application
Required API Access:
- Gmail
- Google Chat
- Google Drive
Best Practice:
- Use least privilege access
- Prefer service accounts with domain-wide delegation
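As a quick least-privilege check, you can review the OAuth scopes you plan to delegate and flag any that grant more than read access. The scopes listed below are real Google read-only scopes, but whether they match what Vaultastic requires is an assumption; follow the scope list given in the Setup Connections application. The helper name is hypothetical.

```python
# Example read-only scopes for the three APIs named above.
# Assumption: read-only access suffices for archival; confirm against
# the scopes requested during Vaultastic connection setup.
REQUESTED_SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/chat.messages.readonly",
    "https://www.googleapis.com/auth/chat.spaces.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]


def non_readonly_scopes(scopes):
    """Return the scopes that do not end in '.readonly' for manual review."""
    return [s for s in scopes if not s.endswith(".readonly")]
```

An empty result means every delegated scope is read-only; anything returned deserves a second look before domain-wide delegation is granted.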
4. Configure Automated Archival
Using the Data Upload Application, configure automated archival schedules for:
- Google Chat
- Google Drive
Recommendations:
- Align with operational and compliance requirements
- Define frequency (daily/weekly)
- Avoid peak business hours
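When planning a schedule, it can help to compute the next run time and confirm that it falls outside business hours. The sketch below assumes peak hours of 09:00 to 18:00 local time and uses hypothetical names throughout; the Data Upload Application has its own scheduling controls, which this does not replace.

```python
from datetime import datetime, time, timedelta

# Assumed peak window; adjust to your organization's working hours.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)


def is_off_peak(t: time) -> bool:
    """Return True if the time of day lies outside the assumed peak window."""
    return not (BUSINESS_START <= t < BUSINESS_END)


def next_run(now: datetime, run_at: time, weekday=None) -> datetime:
    """Return the next run strictly after `now`.

    With weekday=None the job is daily; otherwise it runs weekly on that
    weekday (0 = Monday ... 6 = Sunday).
    """
    if not is_off_peak(run_at):
        raise ValueError("run_at falls inside peak business hours")
    candidate = datetime.combine(now.date(), run_at)
    while candidate <= now or (weekday is not None and candidate.weekday() != weekday):
        candidate += timedelta(days=1)
    return candidate
```

For a daily job at 01:00, a call made on Wednesday morning returns 01:00 on Thursday; passing `weekday=6` instead returns 01:00 on the following Sunday.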
5. Upload Historical Data
To eliminate gaps:
- Ingest historical mailbox data
- Archive historical chat data
- Upload legacy Drive data
Outcome:
- Ensures a complete baseline before automation begins
This approach ensures:
- Continuous capture of new data
- Complete historical data coverage
- Efficient storage management across Vaultastic tiers
- Audit-ready compliance and retention capabilities
Security and Access Control
- All data ingestion occurs via authenticated APIs or journaling pipelines
- Access to archived data is controlled via role-based access control (RBAC)
- Audit logs should be enabled for:
  - Data access
  - Search activity
  - Export operations
Recommended Controls:
- Enable MFA for admin accounts
- Restrict API credentials
- Periodically review access permissions
Monitoring and Validation
After configuration, validate:
- Gmail routing rules are active and delivering emails
- API ingestion jobs are running successfully
- Data is searchable in Vaultastic
- No ingestion gaps exist
Suggested Checks:
- Sample user mailbox validation
- Google Chat sampling
- File count comparison (source vs archive)
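The file count comparison can be sketched as a per-user difference between source and archive totals. How the counts are gathered (for example, from an admin report and an archive search) is left open; the dictionaries and the `find_count_gaps` helper below are illustrative.

```python
def find_count_gaps(source_counts, archive_counts):
    """Return {user: missing_count} for users with fewer archived items than source items."""
    gaps = {}
    for user, expected in source_counts.items():
        archived = archive_counts.get(user, 0)  # a user absent from the archive counts as 0
        if archived < expected:
            gaps[user] = expected - archived
    return gaps


source = {"a@example.com": 120, "b@example.com": 45, "c@example.com": 10}
archive = {"a@example.com": 120, "b@example.com": 40}
gaps = find_count_gaps(source, archive)
```

Here `gaps` reports 5 missing items for `b@example.com` and 10 for `c@example.com`. Counts that lag only briefly may reflect jobs still in progress rather than a real gap.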
Important Considerations
- Google Chat data is stored in email-compatible format (not native structure)
- API throttling may impact large-scale ingestion jobs
- Historical ingestion duration depends on tenant size and API limits
- Gmail routing rules capture only email (not Google Chat or Google Drive data)
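For custom scripts that call Google APIs alongside these jobs, a common way to cope with throttling is to retry with exponential backoff and jitter. The sketch below shows the general pattern only; `ThrottledError` and `call_with_backoff` are hypothetical names, and it does not describe how Vaultastic itself handles rate limits.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised when an API reports a rate limit."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on ThrottledError with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` parameter exists so the wait can be replaced in tests or dry runs; in normal use the default `time.sleep` applies.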