Vault Migrations

This article is intended for customers considering a Vault migration project. It provides suggestions and best practices for planning, developing, and executing a Vault migration.

Migration Types

A Vault migration refers to the loading of large volumes of data or documents into Vault. There are two common migration use cases:

Legacy Migration

When replacing a legacy system with Vault, it is common to migrate data from the legacy system into Vault. This type of migration usually occurs during the Vault implementation project for a new Vault, but can also happen when implementing a new application in an existing Vault.

Incremental Migration

Any business event that requires adding new data to Vault may call for an incremental migration, for example, a phased rollout, a system consolidation, or a company acquisition. Incremental migrations occur independently of an application implementation and can affect a live production Vault application.

Approach

At a high level, data migration into Vault generally requires:

  1. Extracting data from the legacy system
  2. Transforming data into the target format
  3. Cleansing data quality issues
  4. Loading data into Vault
  5. Verifying the created data

The complexity, effort, and timing of these steps vary based on the data volume, data complexity, system availability requirements, and the tools that are used. Choosing the correct approach is key to a successful migration project.

It is critical that you select an approach that is appropriate for the size and complexity of your migration project. Migrations can be performed by customers or with assistance from Veeva Migration Services and Certified Migration Partners who use validated tools and have experience performing Vault migrations. It is recommended that you speak with your Veeva account team to understand these options prior to starting your project.

Planning

The migration plan should consider dependencies between the Vault configuration and data preparation, as well as provide adequate time for testing and validation. It is important that you conduct performance testing to properly estimate the time it will take to complete migration activities.

Before carrying out a migration, you must notify Veeva by completing the Vault Migration Planning Form at least three business days in advance, or at least one week in advance for large migrations. This notifies Veeva technical operations and support teams of your migration project and allows them to prepare. Complete the form for any migration that includes more than 10,000 documents, 1,000 large (greater than 1 GB) documents (such as videos), or 500,000 object records, or for any migration where the customer or services team has concerns. Complete the form for each environment you will be migrating into.

Migration Best Practices

This section identifies the common practices that should be considered when migrating documents, objects, or configuration into Vault.

Extracting Data

Source data for a migration can come from legacy applications, file shares, spreadsheets, or even an existing Vault. The details of extracting data from its source format will depend on the system itself. Customers who are migrating from a complex source application often choose to work with a Certified Migration Partner who has experience extracting data from that application.

Batch and Delta Runs

A key consideration for data extraction is minimizing downtime during the cutover from the legacy application to Vault; often the cutover is done over a weekend. To support this, migrate the majority of data or documents in batches beforehand, while the legacy system is still running, and then perform only a delta migration on the cutover weekend once the legacy system is turned off, extracting and loading only the data that has changed since the batch runs. If the target Vault is already live, you can use user access control to hide the batch-loaded documents until the cutover.
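As a simple illustration of the delta step, the sketch below filters extracted rows down to those changed after the batch extract completed. The SourceRow type and its fields are hypothetical stand-ins for whatever your extraction tool actually produces.

import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

public class DeltaFilter {
    // Hypothetical extracted row: only the id and last-modified timestamp matter here
    public record SourceRow(String id, Instant lastModified) {}

    // Keep only rows changed since the batch extract completed
    public static List<SourceRow> delta(List<SourceRow> rows, Instant batchCutoff) {
        return rows.stream()
                .filter(row -> row.lastModified().isAfter(batchCutoff))
                .collect(Collectors.toList());
    }
}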

Data Transformation and Cleansing

Data extracted from the legacy system needs to be transformed before being migrated into Vault. Vault APIs and Vault Loader accept data in comma-separated values (CSV) format. During this process it’s necessary to map data between the legacy system and Vault. Review these Data Transformation Considerations before transforming your data.

Transforming Data References

When populating document or object fields that reference object records or picklists, first ensure the reference data exists in Vault. The reference data can then be linked using either object lookup fields or picklist fields, which eliminates the need for system-generated IDs for related records.

Transforming Document Metadata

Mapping Document Metadata

To understand what document metadata values need to be populated during a migration, review the structure of the Vault Data Model. This can be achieved by running the Vault Configuration Report or via the Document Metadata API.

Versioned Documents

Vault automatically assigns major and minor document version numbers. The major version starts at one and increments each time a new steady state is reached; at that point, the minor version resets to zero and then increments with each minor change. Some legacy systems allow users to manually assign their own version numbers, and others start version numbers at zero instead of one. As a result, version numbers from the legacy system may not match those of documents created in Vault.

State Mapping

Lifecycle names and target states must be considered when mapping states. Source documents in “In” states (In Review, In Approval, etc.), other than In Progress, should not be migrated into Vault, because Vault will not apply workflows to migrated documents.

Legacy Signature Pages

Legacy Signature Pages must be in PDF format to be migrated into Vault.

Legacy Document Audit Trails

If legacy audit trail data must be preserved, it can be brought into Vault in a variety of ways. The recommended approach is to convert the audit data into PDF format, link it to each document, and migrate it as an archived document.

Loading Data and Documents into Vault

Developing Migration Tools or Scripts

We recommend customers use either Vault Loader or a Certified Migration Partner to load data into Vault. These options are tested and certified against migration best practices.

However, if you determine that you will develop your own migration tool using the Vault API, consider the following:

Use Loader API or Command Line

The Vault Loader API endpoints or the Loader command line allow you to automate migration tasks. The Loader service handles processing, batching, and error reporting, and is developed and tested by Veeva. Using the Vault Loader API endpoints or the Loader command line can greatly reduce the migration time.

Use Bulk APIs

Migration should be performed using Bulk APIs for data loading and data verification. Bulk APIs allow you to create a large number of records or documents with a single API call. These APIs are designed for higher data throughput and will minimize the number of calls required. Refer to the table below to see which data types have Bulk APIs.

Set Client ID

In any migrations that use the Vault REST API, it’s recommended to set the Client ID. If any errors occur during the migration, Veeva will be better able to assist in troubleshooting.
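For example, using Java's built-in HTTP client, you can pass the client ID on every request via the X-VaultAPI-ClientID header. The vault DNS, API version, session ID, and client ID below are placeholder values.

import java.net.URI;
import java.net.http.HttpRequest;

public class ClientIdExample {
    // Build a request that identifies the integration via the Client ID header
    public static HttpRequest withClientId(String sessionId) {
        return HttpRequest.newBuilder()
                .uri(URI.create("https://myvault.veevavault.com/api/v24.1/objects/documents"))
                .header("Authorization", sessionId)                   // Vault session ID
                .header("X-VaultAPI-ClientID", "acme-migration-tool") // identifies your tool to Veeva
                .GET()
                .build();
    }
}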

Handle API Rate Limits

When migrating data via the Vault REST API, it’s important to consider API rate limits: if the limits are exceeded, integrations are blocked from using the API. To mitigate this, use bulk versions of APIs whenever possible, and write migration programs to check the remaining limits on each API call. If the burst or daily limit is within 10% of being breached, either wait until the limit resets or stop the migration process.
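Vault returns the remaining burst and daily allowances on each API response in the X-VaultAPI-BurstLimitRemaining and X-VaultAPI-DailyLimitRemaining headers. Below is a minimal sketch of the 10% threshold check described above; the limit constants are placeholders and should match your Vault’s actual configured limits.

import java.net.http.HttpResponse;

public class RateLimitGuard {
    // Placeholder limits; confirm the actual limits configured for your Vault
    private static final long BURST_LIMIT = 2_000;
    private static final long DAILY_LIMIT = 200_000;

    // Inspect the rate-limit headers after each call; wait or stop near the threshold
    public static void checkLimits(HttpResponse<?> response) throws InterruptedException {
        long burstRemaining = headerAsLong(response, "X-VaultAPI-BurstLimitRemaining");
        long dailyRemaining = headerAsLong(response, "X-VaultAPI-DailyLimitRemaining");
        if (burstRemaining < BURST_LIMIT * 0.10) {
            Thread.sleep(60_000); // burst windows are short; pause before continuing
        }
        if (dailyRemaining < DAILY_LIMIT * 0.10) {
            throw new IllegalStateException("Approaching daily API limit; stopping migration run");
        }
    }

    private static long headerAsLong(HttpResponse<?> response, String name) {
        return response.headers().firstValue(name).map(Long::parseLong).orElse(Long.MAX_VALUE);
    }
}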

Migration Service Account

Consider creating a user specifically for performing migration activities so it’s clear that data creation and any related activities were done as part of the migration. Any record or document created will then clearly show that it was created as part of the migration.

Migrating into a Live Vault

Consider the impact on existing users when migrating data into a live Vault.

Scheduling

Migrations can often be a computing-intensive process. For large or complicated migrations, you should schedule migration activities during periods of low user activity such as evenings or weekends.

Configuration Mode

When enabled, Configuration Mode prevents non-Admin users from logging into Vault. Use Configuration Mode if you need to prevent end-users from accessing Vault during a migration.

User Access Control

You can configure user access control to hide migrated data from existing users until the cutover is complete.

Loading Documents

Migrating documents into Vault can be done using the Create Multiple Documents endpoint. An alternative is to use the Vault Loader Command Line Interface (CLI) or API by following the tutorial for Creating & Downloading Documents in Bulk.
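As an illustration, the following minimal Java sketch posts a CSV of document metadata to the Create Multiple Documents endpoint. The vault DNS and API version are placeholders.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class BulkDocumentLoad {
    // Bulk-create documents from a CSV of metadata; source files should already be on staging
    public static String createDocuments(String sessionId, Path csvFile) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://myvault.veevavault.com/api/v24.1/objects/documents/batch"))
                .header("Authorization", sessionId)
                .header("Content-Type", "text/csv")
                .POST(HttpRequest.BodyPublishers.ofFile(csvFile))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body(); // per-row success/failure results
    }
}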

Preload Documents to Staging

When loading documents into Vault, first upload the files to the Vault file staging server, including the primary source files and document artifacts such as versions, renditions, and attachments. Do this well in advance, as the upload can take time. Vault Loader or one of the bulk APIs then carries out the file processing to create documents in Vault.

The same files from sandbox test runs can be reused for subsequent production migrations if the files haven’t changed. To do this, contact Support to re-link the file staging area from one Vault to another. Further details on re-linking can be found in Scalable FTP.

Migration Mode Header

Document Migration Mode is a Vault setting that relaxes some of the constraints Vault typically enforces so that data migrations run more smoothly. Enable this setting via the API using the Create Multiple Documents or Load Data Objects endpoints.

To use this setting, the migration user must have the Vault Owner Actions : Document Migration permission in their security profile’s permission set.
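When calling the API directly, Document Migration Mode is requested per call with the X-VaultAPI-MigrationMode header, as in this minimal sketch (vault DNS and API version are placeholders):

import java.io.FileNotFoundException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.file.Path;

public class MigrationModeExample {
    // Same bulk document load as above, with Document Migration Mode enabled on the request
    public static HttpRequest withMigrationMode(String sessionId, Path csvFile)
            throws FileNotFoundException {
        return HttpRequest.newBuilder()
                .uri(URI.create("https://myvault.veevavault.com/api/v24.1/objects/documents/batch"))
                .header("Authorization", sessionId)
                .header("Content-Type", "text/csv")
                .header("X-VaultAPI-MigrationMode", "true") // requires the Document Migration permission
                .POST(HttpRequest.BodyPublishers.ofFile(csvFile))
                .build();
    }
}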

Disable Custom Functionality

You should disable custom functionality (such as entry actions, custom Java SDK code, or jobs) if required. Ensure reference values, such as Lists of Values (LOVs), exist and are active if they are referenced in migration data.

Document Ingestion Delay

It can take time for documents to appear in Vault searches, renditions, or thumbnails once they have been migrated. For large migrations, document indexing can take several hours. Account for this ingestion delay when verifying the existence of migrated documents in Vault.

Suppress Rendition Generation

It is common to suppress document rendition generation or provide your own renditions for Vault migrations. If you choose not to suppress renditions, Vault will take a significant amount of time to process large quantities of renditions. Use the Rendition Status page to monitor the progress of rendition jobs.

Binder Structures

Bulk APIs don’t exist for migrating binders and folders into Vault; therefore, allocate sufficient time for this to take place. Consider whether the existing structures are still needed after migrating the data into Vault.

Vault Notifications

After documents are migrated, scheduled jobs such as periodic review or expiration may send email notifications to users. Forewarn users in each environment that this may occur; some users may receive a large number of emails.

Loading Objects

Migrate objects into Vault using the Create Object Records endpoint. An alternative is to use the Vault Loader Command Line Interface (CLI) or API by following the tutorial for Loading Object Records.

Record Migration Mode

Record Migration Mode allows the migration of object records in non-initial states within lifecycles. Use the Create Object Records or Load Data Objects endpoints to enable this setting using the API.

To use this setting, the migration user must have the Vault Owner Actions : Record Migration permission in their security profile’s permission set.
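The API exposes this setting as the X-VaultAPI-RecordMigrationMode request header on the Create Object Records endpoint, as in this minimal sketch (vault DNS, API version, and object name are placeholders):

import java.io.FileNotFoundException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.file.Path;

public class RecordMigrationModeExample {
    // Bulk-create object records in non-initial lifecycle states from a CSV
    public static HttpRequest createRecords(String sessionId, String objectName, Path csvFile)
            throws FileNotFoundException {
        return HttpRequest.newBuilder()
                .uri(URI.create("https://myvault.veevavault.com/api/v24.1/vobjects/" + objectName))
                .header("Authorization", sessionId)
                .header("Content-Type", "text/csv")
                .header("X-VaultAPI-RecordMigrationMode", "true") // requires the Record Migration permission
                .POST(HttpRequest.BodyPublishers.ofFile(csvFile))
                .build();
    }
}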

Disable Record Triggers and Actions

Record triggers execute custom Vault Java SDK code whenever a data operation on an object occurs. If custom functionality isn’t needed during the migration, disable the record triggers to prevent them from running. These can be re-enabled once the migration is complete.

Testing Migrations

Create Development/Sandbox Vault

Create a sandbox Vault from the production (or validation) Vault and apply any custom configuration. This is typically done in conjunction with an implementation. At this stage, you can determine the structure of the environment into which the data will be migrated.

Reference data, such as picklists and Vault objects, is included with the sandbox, but you will need to load any other reference data that your migration depends on. Use Test Data packages to create packages of reference data.

Dry Run Migration

Perform a dry run migration to test the migration logic, data, and approximate timings. It’s not necessary to dry run full data volumes. Logic and timings can be validated using smaller datasets. If the migration fails, correct the issues in the development environment before migrating to the test environment.

Data Verification

Once data has been migrated into Vault, verify the data is as expected. This involves a number of different checks, such as comparing record and document counts against the source extracts, spot-checking field values, and confirming that renditions and relationships were created.

Data Transformation Considerations

Several complications can occur when populating Vault metadata. Consider the following best practices to transform data before a migration.

CSV Format

CSV files used to create or update documents using Vault Loader must use UTF-8 encoding and conform to RFC 4180.

Date Formatting

Dates migrated into Vault must use the format YYYY-MM-DD.

Date/Time Formatting

Date/time conversion must use the Coordinated Universal Time (UTC) format YYYY-MM-DDTHH:MM:SS.sssZ, for example 2019-07-04T17:00:00.000Z. The value must end with Z to denote UTC; the millisecond digits can be any value. Ensure that date/time fields map to the correct day, which may differ depending on the source time zone.
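For example, java.time can convert a legacy local timestamp into Vault’s UTC format; the source time zone below is an assumption for illustration.

import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class DateTimeTransform {
    private static final DateTimeFormatter VAULT_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

    // Convert a legacy local timestamp to Vault's UTC date/time format
    public static String toVaultUtc(LocalDateTime legacyLocal, ZoneId sourceZone) {
        return legacyLocal.atZone(sourceZone)
                .withZoneSameInstant(ZoneId.of("UTC"))
                .format(VAULT_FORMAT);
    }

    public static void main(String[] args) {
        // 10:00 in Los Angeles on 2019-07-04 becomes 2019-07-04T17:00:00.000Z in UTC
        System.out.println(toVaultUtc(LocalDateTime.of(2019, 7, 4, 10, 0),
                ZoneId.of("America/Los_Angeles")));
    }
}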

Case

If Vault metadata is case-sensitive, convert it to match the expected format.

Special Characters

Metadata must not contain special characters, for example, tabs and smart quotes. These special characters can be lost when migrating data into Vault.

Character Encodings

Saving Excel™ files in CSV format for use with Vault Loader can silently corrupt the file’s character encoding. If the file is corrupt, your load will fail. Failure logs, accessible by email or Vault notification, record each row that failed. Correct the CSV files and continue loading.
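One way to avoid this is to re-save the CSV with an explicit UTF-8 encoding rather than relying on Excel’s defaults. A minimal sketch, assuming the source file’s encoding is known:

import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class Utf8Csv {
    // Re-save a CSV as UTF-8 so Vault Loader reads special characters correctly
    public static void resaveAsUtf8(Path source, Path target, Charset sourceEncoding)
            throws IOException {
        List<String> lines = Files.readAllLines(source, sourceEncoding);
        Files.write(target, lines, StandardCharsets.UTF_8);
    }
}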

Language

If the data being migrated is multilingual, ensure your Vault is configured to support different languages.

Multi-value Field Comma Separator

When mapping multi-value fields, values that contain commas must be quoted and escaped. For example, the escaped value "veeva,,vault" is loaded as the single value "veeva,vault".

Windows™/macOS™ Formatting

Data formatting can differ by environment. For instance, line separators differ between files created on Windows™ and files created on macOS™.

Boolean Fields

Format Yes/No fields as true or false when migrating using the API. This doesn’t apply to Vault Loader, as it handles boolean values regardless of case.

Trailing Spaces

Remove any trailing spaces from metadata. These are commonly found after commas.

Leading Zeros

Migrate numbers in String fields as String values to preserve leading zeros and prevent their conversion to integers.

Unique Identifiers

On documents or object records where Name is not unique or is system-managed, set the External ID (external_id__v or external_id__c) to relate it to the original ID used in the legacy system. Additionally, this field helps distinguish between records in success and failure logs.

Maximum Field Length

Values in Long Text fields must not exceed the maximum length configured in Vault. Vault Loader does not truncate these values.

References to Users and Persons

Documents and objects can reference user (user__sys) and person (person__sys) records. These records must be active in order to be referenced. If referencing people who have left the company or had a name change, reference a person record, as it does not have to be linked to a Vault user account. User names and person names are not unique; therefore, reference these objects by external ID.

Object Lookups

Many object records have relationships with other records. For example, the Study Site object has a field called study_country__v of data type Parent Object, which links it to the Study Country object. If you create a new Study Site record using Vault Loader or the API and know the ID of the desired Study Country record, you can populate it directly. However, these IDs change between Vault environments. Use a lookup table to obtain the Vault record IDs from the name__v or external_id__v fields, or use an object lookup field in the form study_country__vr.name__v = 'United States'.
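For illustration, a Loader CSV using such a lookup might look like the following; the exact column set is an assumption for this example.

external_id__v,name__v,study_country__vr.name__v
SITE-0001,Example Site A,United States
SITE-0002,Example Site B,Germany

At load time, Vault resolves each study_country__vr.name__v value to the ID of the matching Study Country record.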

Safety Migrations

Overview

This section provides best practices on migrating Safety data into Vault.

Use Case

The primary use case for a Safety Migration is a Legacy Migration. This involves migrating Safety Cases from a legacy system to Vault. This commonly includes migrating the most recent versions of Cases, but may include migrating previous versions as well.

Safety Migration Configuration

Safety Migration Configuration is a feature that allows a designated user to migrate safety data into Vault via ETL tools, Vault Loader, or the API, while ensuring performance and data integrity. It also allows Cases to be migrated into a live Vault without Case processing automation (such as calculations or record auto-creation) altering the migrated data.

Safety Migration User

The term Safety Migration User is used consistently in this article; it always refers to the user selected as the Migration User in the Safety Migration Configuration.

Effects

Safety Migration Configuration bypasses most triggers and actions for the designated Safety Migration User only. Only a small subset of key triggers required for creating object records continues to execute. This improves the performance of loading Safety Cases. See the list of Bypassed Auto-Calculations for more information.

If the user attempts to execute a trigger that is not allowed during migration, the following error message appears:

You do not have permission to modify {0} records. Contact your administrator if changes are required.

Enablement

To enable the Safety Migration Configuration feature, create a Safety Migration Configuration (safety_migration_configuration__v) record, assign it to a Safety Migration User, and change its lifecycle state to Active.

Because the Safety Migration Configuration object is not shown in Business Admin by default, create the record through an alternative method, such as Vault Loader or the API.

Safety Migration Configuration records contain the following fields.

Name       | Description                                                                                      | Type
name__v    | The system automatically generates a name for the record.                                        | System-managed
user__sys  | (Required) Select the User record that corresponds to the migration user.                        | Object reference to the user__sys object (unique)
enabled__v | Set to Yes to activate this configuration and allow this user to bypass triggers for migration. | Yes/No

Considerations

Bypassing Custom SDK Code

Vault Safety product triggers are bypassed by default when they are initiated by the Safety Migration User.

Custom triggers require additional code to behave the same way; failing to add it will cause major performance issues during migrations.

The following code snippet illustrates how to bypass a trigger in migration mode:

import java.util.Objects;

import com.veeva.vault.sdk.api.core.RequestContext;
import com.veeva.vault.sdk.api.core.ServiceLocator;
import com.veeva.vault.sdk.api.core.ValueType;
import com.veeva.vault.sdk.api.data.RecordEvent;
import com.veeva.vault.sdk.api.data.RecordTrigger;
import com.veeva.vault.sdk.api.data.RecordTriggerContext;
import com.veeva.vault.sdk.api.data.RecordTriggerInfo;
import com.veeva.vault.sdk.api.query.QueryResponse;
import com.veeva.vault.sdk.api.query.QueryService;

@RecordTriggerInfo(object = "case_version__v", events = {RecordEvent.BEFORE_INSERT, RecordEvent.BEFORE_UPDATE})
public class SampleTrigger implements RecordTrigger {
    @Override
    public void execute(RecordTriggerContext recordTriggerContext) {
        final QueryService queryService = ServiceLocator.locate(QueryService.class);

        // Get the current user ID from the context
        final RequestContext context = RequestContext.get();
        final String currentUserId = context.getCurrentUserId();

        // Query safety migration configuration for enabled users
        final String queryMigrationUsers = "SELECT user__v FROM safety_migration_configuration__v WHERE enabled__v='true'";
        final QueryResponse queryResponse = queryService.query(queryMigrationUsers);
        final boolean isMigrationUser = queryResponse.streamResults()
                .anyMatch(queryResult -> Objects.equals(currentUserId, queryResult.getValue("user__v", ValueType.STRING)));

        if (isMigrationUser) {
            return;
        }

        // Perform remaining trigger logic
    }
}

Bypassed Auto-Calculations

The following Auto-Calculations do not calculate when using a Safety Migration Configuration.

Clinical Study Migrations

Overview

This section provides best practices for migrating clinical study data into Vault.

Use Case

The primary use case for a Clinical Study migration is an Incremental Migration. This commonly involves having one set of studies and then going live with a second set of studies. In this case, you must migrate additional data to accommodate the second set of studies.

Study Migration Mode

Study Migration Mode helps load studies faster and reduces downtime during migrations. We recommend using Study Migration Mode for all CTMS migrations, particularly when handling large volumes of object data. Study Migration Mode is designed to be used in addition to Record Migration Mode; learn more about Record Migration Mode in Vault Help.

Effects

When a Study enters Study Migration Mode, Vault hides that study’s study-related object data and makes it uneditable for non-Admin users. This locks down the target studies being migrated while allowing users to update documents and enter data for the remaining studies. See the list of objects with the Migration field for more information.

Study Migration Mode also bypasses productized triggers for the target studies, such as calculating metrics and generating related records.

Certain jobs exclude studies that are In Migration from processing.

Standard Vault to Vault Connections exclude studies that are In Migration. Once migration completes, Vault to Vault Connection jobs process the record updates that were skipped while the study was being migrated.

Enablement

If your study uses an object lifecycle, Admins in your Vault must configure user actions that mark a study as In Migration. Learn more about status and archiving studies in Vault Help.

You can enable Study Migration Mode for Study records using the configured user actions described above or by updating the study_migration__v field via the API.

Considerations

Consider the following when conducting a Clinical Study migration:

Bypassing Custom SDK Code

When Study Migration Mode is enabled, Vault also bypasses the Clinical App SDK by default.

You must write additional code for custom SDK logic to behave the same way as the Clinical App SDK. Because Study Migration Mode is controlled by the study_migration__v field on a record, update your custom SDK code to read this field and check whether a study is in Study Migration Mode, as in the sketch below.
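Below is a minimal sketch of such a check, assuming study_migration__v is exposed to custom triggers as a Yes/No (boolean) field on the triggering record; the study__v object name follows the standard Clinical data model.

import com.veeva.vault.sdk.api.core.ValueType;
import com.veeva.vault.sdk.api.data.RecordChange;
import com.veeva.vault.sdk.api.data.RecordEvent;
import com.veeva.vault.sdk.api.data.RecordTrigger;
import com.veeva.vault.sdk.api.data.RecordTriggerContext;
import com.veeva.vault.sdk.api.data.RecordTriggerInfo;

@RecordTriggerInfo(object = "study__v", events = {RecordEvent.BEFORE_UPDATE})
public class StudyMigrationAwareTrigger implements RecordTrigger {
    @Override
    public void execute(RecordTriggerContext recordTriggerContext) {
        for (RecordChange change : recordTriggerContext.getRecordChanges()) {
            // Skip custom logic for studies that are in Study Migration Mode
            Boolean inMigration = change.getNew().getValue("study_migration__v", ValueType.BOOLEAN);
            if (Boolean.TRUE.equals(inMigration)) {
                continue;
            }
            // Perform remaining trigger logic for this record
        }
    }
}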

Objects with the Migration Field

The following objects have a study_migration__v field available for use in a Clinical Study migration: