Relationship Queries (Joins)

The Query API is used to construct simple but powerful queries to retrieve data from Veeva Vault. This article covers advanced query use cases using Join relationships. For general information about the Query API, refer to the Query Syntax & Structure or Vault Query API Overview articles. You may also find additional information and help with Vault queries in the Vault Developer Community.

Note: Relationship queries are supported as of API v10.0

Article Contents

Introduction to Vault Relationship Queries

Object to Object Relationships

Vault Document Relationships Document to Object / Object to Document**

Finding Vault Relationships

Introduction to Vault Relationship Queries

VQL is the object query language for querying data in Vault. You can think of it as an object oriented cousin of SQL, where the data model and the way to use joins differs from SQL. Vault's data model is based on objects that are related to each other by relationships. Navigating these relationships is a fundamental building block in constructing queries that allow you to use data from different objects, thus using implicit joins. In other words, the relationship syntax in VQL allows you to join data using path traversal.

Programmatic access to a result set of a VQL query naturally reflects the object oriented nature of the underlying data model. VQL uses the navigation and relationships and is anchored to a particular object from where the result set is obtained by using the FROM clause. Consequently, the returned result set is a list of objects that reflect the object that constitutes the domain of the VQL query. The result set also contains the list of fields that are included in the SELECT clause, including those that are traversed via the relationships. The SELECT clause can include nested SELECT-FROM statements using relationships and the result set can include nested objects within a result set of objects. This is illustrated in the examples below.

In this tutorial, we'll look at some of the capabilities and patterns in VQL that allow developers to conceptualize familiar concepts like joins in SQL and puts them in perspective with respect to VQL's syntax and capabilities.

Requirements & Limitations

When performing relationship queries, you must include at least one field in each SELECT statement.
You can use a maximum of 10 relationships in a single query. The way in which Vault counts the number of relationships in a query is based on the way VQL constructs the joins in order to process the query. The join is only needed for the following two conditions:

  • Create a join relationship between the main object and a secondary object in a nested query.
  • Create an inner join to exclude records from the main object which doesn't have a match in the nested query.

Top

Object to Object Relationships

In the examples below, we'll use a simplified subset of baseball team rosters.

Object A: Teams team__v

Team ID id Team Name name__v Team City city__v
101 Giants San Francisco
102 Royals Kansas City
103 Yankees New York
104 Cubs Chicago

Object B: Players player__v

Player ID id Player Name name__v Player Position position__v Team team__v
44 Bumgarner Pitcher 101
55 Posey Catcher 101
66 Hosmer First Base 102
77 Perez Catcher 102
88 Beltran Right Field 103
99 Ryan Short Stop 103

We'll represent the Teams object as team__v and the Players object as player__v. Each team has a one-to-many (1:M) relationship with its players (one team has many players and one player can be assigned to only one team). These are Parent-Child relationships, where team__v is the parent of player__v. The objects are connected by an inbound and outbound relationship, always looked at from the perspective of the child object. In other words, the child object has a relationship coming "inbound" from the parent object and another relationship going "outbound" to the parent object. The team__v relationship field on the player__v object establishes its relationship with the team__v object.

Relationship naming conventions always use the plural form of the child object for inbound relationships and the singular form of the parent object for outbound relationships, Relationship names end in __vr here and in Vault for standard __v objects. For example, The inbound relationship name is players__vr for the team__v and player__v objects and the outbound relationship name is team__vr. The relationship names allow us to traverse from parent to child or child to parent. The examples below illustrate their usage in relationship queries (Joins). These are completely analogous to Vault Object queries.

Team Player Object Relationship

NOTE: The team and player objects share some of the same field names: id and name__v. This is merely a naming convention. The field values are different on each object. The id or name retrieved (team or player) depends on the object being queried.

Top

Left Outer Join - Parent to Child (1:M)

Problem

Retrieve the id, name, and city from all teams and the id, name, and position of the players assigned to each team.

Query

SELECT id, name__v, city__v, (SELECT id, name__v, position__v FROM players__vr) FROM team__v

Result

Team ID id Team Name name__v Team City city__v Player ID id Player Name name__v Player Position position__v
101 Giants San Francisco 44 Bumgarner Pitcher
101 Giants San Francisco 55 Posey Catcher
102 Royals Kansas City 66 Hosmer First Base
102 Royals Kansas City 77 Perez Catcher
103 Yankees New York 88 Beltran Right Field
103 Yankees New York 99 Ryan Short Stop
104 Cubs Chicago null null null
104 Cubs Chicago null null null

Discussion

The object of the query is team__v, from which we're directly retrieving the team id, name, and city. Since the team__v object is the parent of the player__v object, we used the inbound relationship name players__vr in a nested SELECT-FROM statement to retrieve the player id, name, and position from the player__v object.

Nested SELECT-FROM statements within the SELECT clause are very useful for obtaining related records by traversing from the "1" side of a 1:M relationship (from parent to child). We know from SQL that the result of a left outer join for objects A and B always contains all records of the "left" object (A), even if the join-condition does not find any matching record in the "right" object (B). Notice in this query that the "Cubs" team returned null results from the player object.

Inner Join - Parent to Child (1:M)

Problem

Retrieve the id, name, and city from all teams. Restrict the results to teams with assigned players.

Query

SELECT id, name__v, city__v FROM team__v WHERE id IN (SELECT name__v FROM players__vr)

Result

Team ID id Team Name name__v Team City city__v
101 Giants San Francisco
101 Giants San Francisco
102 Royals Kansas City
102 Royals Kansas City
103 Yankees New York
103 Yankees New York

Discussion

This is nearly identical to the previous query but we've removed the nested SELECT-FROM statement from the SELECT clause and placed another in the WHERE clause. We're also using the team id (which exists in both objects) and the logical operator IN. The object of the query is team__v. The nested SELECT-FROM statement is using the inbound relationship name players__vr to look at the player names in the player__v object and return results only when the team id exists in a player's row. We're not asking to return the player names (they are not included in the first SELECT clause) but only to filter on them.

Combining a nested SELECT-FROM statement with the IN operator in the WHERE clause allows us to test whether the team id exists and return results only when it does. We know from SQL that the result of an inner join for objects A and B requires that each record in the two joined objects have matching records. This query compared each row of object A with each row of object B and found all pairs of rows which satisfied the join-predicate (non-null values).

Top

Inner Join - Child to Parent (M:1)

Problem

Retrieve the id and name of all players in the Catcher position and the id, name, and city of the team to which each player is assigned.

Query

SELECT id, name__v, team__vr.id, team__vr.name__v, team__vr.city__v FROM player__v WHERE position__v = 'Catcher'

Result

Player ID id Player Name name__v Team ID team__vr.id Team Name team__vr.name__v Team City team__vr.city__v
55 Posey 101 Giants San Francisco
77 Perez 102 Royals Kansas City

Discussion

Unlike the previous two queries in which the object of the query was team__v (parent), we're now querying the object player__v (child). Child to parent queries are many-to-one (M:1) relationships and a nested SELECT-FROM statement can't be used. To retrieve the fields from the parent object, we must combine its field names with the outbound relationship name team__vr using dot-notation.

Look at the basic part of this query: SELECT the player id and name from the player object WHERE the position is "Catcher". Using dot-notation, add team__vr.id, team__vr.name__v, and team__vr.city__v to the SELECT clause. The outbound relationship name team__vr allows us to traverse "outbound" from the child player__v object to the parent team__v object to retrieve the team id, name, and city. The response contains the selected fields from each child object and the related parent object fields.

Top

Lookup

The last type of query we'll discuss is a "lookup".

Object A: Teams team__v

Team ID id Team Name name__v Mascot mascot__v
101 Giants 5
102 Royals 6
103 Yankees 7
104 Cubs 8

Object C: Mascots mascot__v

Mascot ID id Mascot Name name__v Animal animal__v
5 Lou Seal
6 Sluggerrr Lion
7 Dandy Bird
8 Clark Bear

For this example, we'll modify the Teams object team__v and create a new Mascots object mascot__v. Each team has a 1:1 relationship with its mascot (each team has only one mascot and each mascot has only one team). There is no true parent to child or child to parent relationship between these objects. There is what's referred to as a "reference relationship", which is why a "lookup" query must be used. To traverse the relationship, we'll use the reference relationship name mascots__vr.

Team Mascot Object Relationship

NOTE: The team and mascot objects share some of the same field names: id and name__v. This is merely a naming convention. The field values are different on each object. The id or name retrieved (team or mascot) depends on the object being queried.

Problem

Retrieve the team id and name which has a "Bear" as a mascot.

Query

SELECT id, name__v FROM team__v WHERE mascots__vr.animal__v = 'Bear'

Result

Team ID id Team Name name__v
104 Cubs

Discussion

The object of the query is team__v, which includes the mascot__v object field for the mascot but not its name or animal. This type of query is called a "lookup" because we're using the animal__v field record from the mascot__v object and using dot-notation to combine it with the mascots__vr relationship in the WHERE clause, thereby looking up the team id and name from the team__v object.

Top

Vault Document Relationships: Document to Object / Object to Document

In the examples below, we'll use a simplified subset of Vault documents and products.

Object A: Documents documents

Document ID id Document Name name__v Document Type type__v
1 CholeCap Study Study
2 Nyaxa Brochure Promotional Piece
3 VeevaProm Information Reference Document

Object B: Products product__v

Product ID id Product Name name__v Generic Name generic__v
01010 CholeCap cholepriol phosphate
02020 Nyaxa nitroprinaline oxalate
03030 VeevaProm veniladrine
04040 VeevaProm XR veniladrine extended

In Vault, documents documents and products product__v have many-to-many (M:M) relationships (a document may be associated with many products and a product may be associated with many documents). The same applies to document relationships with other Vault objects such as country__v, study__v, site__v, etc. These document-object relationships are referred to as reference relationships. Their reference relationship names (to use in relationship queries) are exposed in the document metadata relationship fields. We'll discuss finding relationships in the sections below.

To traverse the document-product relationship, we'll use the reference relationship name document_product__vr. The two queries shown below illustrate the bidirectional nature of document-product relationships. The direction of the query and structure of results depends on which object is being queried, i.e., the object in the FROM clause.

Document Product Relationship

NOTE: The document and product objects share some of the same field names: id and name__v. This is merely a naming convention. The field values are different on each. The id or name retrieved (document or product) depends on the which is queried.

Top

Left Outer Join - Document to Product (M:M)

Problem

Query the Documents object documents to retrieve document fields. Use the document_product__vr relationship to retrieve product fields associated with each document.

Query

SELECT id, name__v, type__v, (SELECT id, name__v, generic__v FROM document_product__vr) FROM documents

Result

Document ID id Document Name name__v Document Type type__v Product ID id Product Name name__v Generic Name generic__v
1 CholeCap Study Study 01010 CholeCap cholepriol phosphate
2 Nyaxa Brochure Promotional Piece 02020 Nyaxa nitroprinaline oxalate
3 VeevaProm Information Reference Document 03030 VeevaProm veniladrine
3 VeevaProm Information Reference Document 04040 VeevaProm XR veniladrine extended

Top

Left Outer Join - Product to Document (M:M)

Problem

Query the Products object product__v to retrieve the product fields. Use the document_product__vr relationship to retrieve document fields associated with each product.

Query

SELECT id, name__v, generic__v, (SELECT id, name__v, type__v FROM document_product__vr) FROM product__v

Result

Product ID id Product Name name__v Generic Name generic__v Document ID id Document Name name__v Document Type type__v
01010 CholeCap cholepriol phosphate 1 CholeCap Study Study
02020 Nyaxa nitroprinaline oxalate 2 Nyaxa Brochure Promotional Piece
03030 VeevaProm veniladrine 3 VeevaProm Information Reference Document
04040 VeevaProm XR veniladrine extended 3 VeevaProm Information Reference Document

Results in List Format (Document to Product / Product to Document)

Document Product Result

Discussion

As shown in the two examples above, using the document_product__vr relationship in a nested SELECT-FROM statement allows you to retrieve fields from both objects in a single query. The bidirectional nature of this M:M relationship allows you to place either object in the FROM clause, thereby obtaining nested results in two different ways.

Top

Finding Vault Relationships

Relationships (and the relationship names to use in queries) are exposed in the Document and Object Metadata APIs for fields of the type ObjectReference where the referenced object objectType is a Vault Object.

Here are the endpoints:

  • Document Metadata API: /api/{version}/metadata/objects/documents/properties
  • Object Metadata API: /api/{version}/metadata/vobjects/{object name}

To find relationships on a document or object, search the metadata response for fields with the following attributes (an example is provided in the JSON response below):

type : ObjectReference

  • This states that the field is referencing an object.
  • The field is not referencing a "string", "number", "picklist", "Boolean", etc.

objectType : vault_object__v

  • The object being referenced by the field must be a Vault Object.
  • This includes object__v for standard objects or object__c for custom objects, e.g., product__v, study__v, custom_object__c, etc.
  • Vault Objects do not include users, groups, workflows, events, or document relationships.

relationshipType : reference, reference_inbound, reference_outbound, parent, or child.

  • The type of relationship defines how the query must be structured, e.g., Left Outer Join, Inner Join, Lookup, etc.

relationshipName : relationship_name__vr for standard objects or relationship_name__cr for custom objects.

  • This is the relationship name to use in relationship queries, e.g., study_number__vr, sites__vr, document_product__vr, etc.

Top

Finding Document Relationships: Document to Object / Object to Document

All document-object relationships are defined on the documents object and always take the form document_{field_name}__vr for standard objects or document_{field_name}__cr for custom objects.

For example:

  • The product__v relationship field has the relationship name: document_product__vr
  • The country__v relationship field has the relationship name: document_country__vr
  • The custom_object__c relationship has the relationship name: document_custom_object__cr

You can retrieve the metadata of all document properties by sending a GET request to the /api/{version}/metadata/objects/documents/properties endpoint.

For example:

$ curl -X GET -H "Authorization: {SESSION_ID}" \ 
https://{CUSTOMER}.veevavault.com/api/{version}/metadata/objects/documents/properties

JSON Response (abridged)

The response contains the list of all document fields (id, name__v, type__v, etc.) configured across entire document type hierarchy. The response shown below lists the details of the product__v relationship field on the documents object. Note that product__v is both a Vault Object and a relationship field on other objects.

      {
            "name": "product__v",
            "scope": "DocumentVersion",
            "type": "ObjectReference",
            "required": false,
            "repeating": true,
            "systemAttribute": true,
            "editable": true,
            "setOnCreateOnly": false,
            "disabled": false,
            "objectType": "product__v",
            "label": "Product",
            "section": "productInformation",
            "sectionPosition": 1,
            "hidden": false,
            "queryable": true,
            "shared": false,
            "definedInType": "type",
            "definedIn": "base_document__v",
            "relationshipType": "reference",
            "relationshipName": "document_product__vr"
      }

Discussion

The product__v field is just one of many document fields. By searching the complete list of metadata returned for the Document Metadata API, you can find all relationships configured on the documents object which link documents to other objects in Vault. You can also create custom objects with relationships to documents or other objects. Once you have the relationship names, structure queries by using the relationship names in nested SELECT-FROM statements in either the SELECT clause or WHERE clause. You can also use multiple nested statements in a single query.