Programmers Guide – Introduction

This document provides an overview of the EIDR Software Development Kit’s (SDK’s) public APIs for the EIDR Registry and offers examples of how to use the packaged SDKs in Java and .NET (C#) programs.

The SDKs provide the recommended programmatic interfaces for EIDR. Some details are provided about the EIDR HTTP API, which is the Registry interface underpinning the Java and .NET SDKs. For details on the REST API, see the EIDR 2.6 REST API Reference.

When the term API (Application Programming Interface) is used without being qualified with “HTTP”, “REST”, “SDK”, “Java” or “.NET”, it refers collectively to the various expressions of the API or operation being discussed.

This document assumes the reader is familiar with the EIDR Registry Technical Overview. For details on the EIDR content data model, see the EIDR Data Fields Reference. The SDK Source installation option includes additional information (as Javadoc or .NET help) that can be accessed within an IDE or as standalone documentation.

In addition, the SDKs come with a set of Command-Line Tools that perform every operation of the Registry and provide full source code examples for every API call that EIDR developers can follow. The Command-Line Tools Overview provides detailed instructions and examples using the command-line tools that can be very instructive when learning how to develop programs using the EIDR SDKs.

Overview and Structure of the APIs

SDK Overview

The main purpose of the Java and .NET SDKs is to make it easy to build applications that use the Registry HTTP API.

The SDKs contain classes for the following purposes:

  • Representing EIDR objects and requests
  • Establishing and managing a connection with the EIDR Registry
  • Sending requests to the Registry
  • Representing and interpreting Registry responses
  • Converting between object and XML representations
  • Utility classes that simplify working with the classes above
  • Utility classes for working with data formats not native to the Registry, such as ISAN; the EIDR bulk ingest format; and an Excel-compatible, tab-delimited output format

The principal operations that send requests to the Registry are done with the classes in org.eidr.sdk.api:

  • AddRelationship (Establish a lightweight relationship between two content records.)
  • Alias (Deprecate a duplicate record by redirecting it to a surviving record.)
  • Delete (Alias a record to the EIDR Tombstone.)
  • GraphTraversal (Retrieve parent/child records from an identified starting point.)
  • Match (Retrieve information about existing records that match a set of registration data.)
  • ModificationBase (Retrieve the current state of a content record to be modified.)
  • Modify (Submit a content record to be modified.)
  • Promote (Convert the publication Status of a record from “in development” to “valid”.)
  • Query (Retrieve all content records that match a query expression.)
  • Registration (Submit a new content record for inclusion in the Registry.)
  • RemoveRelationship (Delete a lightweight relationship between two content records.)
  • ReplaceRelationship (Modify an existing lightweight relationship.)
  • Resolution (Retrieve content records based on a list of one or more IDs.)
  • StatusLookup (Check the current status of a system or user-defined token.)
  • VirtualFieldsRetrieval (Retrieve system-generated encapsulations of a content record’s free text fields.)

org.eidr.sdk.party:

  • PartyAdmin (Perform Party administrative tasks, including Party creation.)
  • PartyQuery (Retrieve all Party records that match a query expression.)
  • PartyResolution (Retrieve Party records based on a list of one or more IDs.)

org.eidr.sdk.service:

  • ServiceAdmin (Perform Video Service administrative tasks, including Video Service creation.)
  • ServiceGraph (Retrieve parent/child records from an identified starting point.)
  • ServiceQuery (Retrieve all Video Service records that match a query expression.)
  • ServiceResolution (Retrieve Video Service records based on a list of one or more IDs.)

These classes are all derived from the Post class or its UserOverride subclass, and share several common features:[1]

  • The constructor for each class takes a Boolean argument that controls debugging output. Calling constructor(true) causes the SDK to print out timestamps, HTTP header information sent to the Registry, the URL being called, the XML payload (if applicable), the XML returned from the Registry, and extra information about error conditions.
  • All of the methods that send data to the Registry take an EIDRConnection object as an argument.

Write Operations

The Registry APIs have several calls that can modify the content records of the Registry. These are Create, Modify, AddRelationship, RemoveRelationship, Delete, Alias, and Promote. For the rest of this document they are referred to as batchable operations.

Each of these APIs takes a single request, which in turn can be composed of multiple operations. All the operations in a request must be the same type, such as all Registration or all Modify.

The EIDR API is based on two separate concepts that can sometimes be confused with each other: single/batch requests and the immediate/asynchronous (or non-immediate) response flag.

The first thing to be aware of is that all requests to write to the EIDR Registry are really batch requests. All requests contain one or more operations, and what is often called a single or non-batch request is just a request containing one operation.  A “single” request can be immediate or non-immediate.  A request of more than one item (a batch) must be non-immediate.

“Immediate” and “non-immediate” refer to the point at which final results of the operations in the request are known.[2]  “Immediate” means that the result of the request is available in the response to the initial request.  “Non-immediate” means that the result may not be known at the time of the response to the initial request, so the client application has to check for the final result using a request token.

The initial request returns one or more tokens, which are used to track the status of batchable operations using the StatusLookup API. Tokens are returned even for immediate requests; although this may seem like overkill, having a single format can simplify the handling of Registry responses.

In the SDK, all of the batchable operations come in multiple flavors, covering single/batch and immediate/non-immediate. For more information, see “Appendix: Summary of Batchable Calls”.

Single/Batch

All batchable operations are submitted through the HTTP API in a single request. The SDK has separate interfaces for batches of one (single operation requests) and batches with multiple operations (“proper” batches).

All the operations in a batch must be the same (for example, all Create, all Modify, or all Delete). The Registry will return an Invalid Request Error for a batch that violates this constraint.

By default, a status token is represented as a simple 19-digit numeric string.[3] One status token is returned for each operation in the batch, and one for the batch itself. If the batch contains only one operation, the batch token and the single operation token are the same. An application uses the token to query for the current status and final disposition of any non-immediate operations using StatusLookup.

Each item in a batch is processed separately. There is no guarantee that the operations will be processed in the same order as they were submitted in the batch. Thus, batches with operations that depend on each other (for example, creating a series and its seasons in a single batch) will have unpredictable results. However, the status tokens will be returned in order, so you can match the token to the specific operation it represents. Alternatively, you can provide user-defined tokens, introduced in the “Tokens” section.

Submission of the batch is authenticated and additional authorization checks are performed on each individual element of the batch. If an error occurs while processing an item in the batch, that item will fail; querying its status will return appropriate information. Processing of any other operations in the batch is not affected.

Immediate/Non-Immediate (Asynchronous)

In order to guarantee the uniqueness of each record, EIDR subjects all record modification requests to de-duplication review if any of the object’s metadata used in the de-duplication calculation have been changed. In most cases, a result is returned automatically. If there is ambiguity that cannot be resolved by the software, one of two things will happen:

  • If the request is marked as immediate-response, the Registry immediately returns an error to the application, indicating the potential problem(s). In some cases, immediate-response requests return more detailed status information than non-immediate (asynchronous) requests.
  • If the request is not marked as immediate-response (i.e., it is a non-immediate request), it is sent for manual de-duplication. Registry operators will review the request. Their determination is then made available via an API status request using the appropriate system-generated or user-defined token. This process is not real-time; manual review can take up to one business day.

An immediate response applies only to single requests; all multiple-request batches are non-immediate (asynchronous). If an application requests an immediate response for a batch of more than one item, the Registry returns an Invalid Request error.
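
The two rules above (one operation type per request, and no immediate response for batches of more than one operation) can be checked on the client before a request is sent. The following is a minimal sketch only: OpType and BatchPrecheck are hypothetical names invented for illustration, not EIDR SDK classes, and the Registry remains the authority on what it reports as an Invalid Request.

```java
import java.util.List;

// Hypothetical stand-in for the batchable operation types named in this document.
enum OpType { REGISTRATION, MODIFY, ADD_RELATIONSHIP, REMOVE_RELATIONSHIP, DELETE, ALIAS, PROMOTE }

// Hypothetical helper; not part of the EIDR SDK.
final class BatchPrecheck {
    private BatchPrecheck() {}

    /**
     * Returns null when the request satisfies the rules described above,
     * otherwise a short explanation of the first rule it violates.
     */
    static String check(List<OpType> operations, boolean immediate) {
        if (operations == null || operations.isEmpty()) {
            return "a request must contain at least one operation";
        }
        OpType first = operations.get(0);
        for (OpType op : operations) {
            if (op != first) {
                return "all operations in a request must be the same type";
            }
        }
        if (immediate && operations.size() > 1) {
            return "a batch of more than one operation must be non-immediate";
        }
        return null;
    }
}
```

A caller would typically run this check just before building the request and treat a non-null result as a programming error rather than something to retry.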

When to Use Immediate and Non-Immediate

Calls that require confirmation from the de-duplication system, such as registration and modification requests, are generally performed asynchronously, so the calls may not be able to provide an immediate result. All such calls return a token, which can be used to discover and track the status of the associated request. (Automated de-duplication results are generally either an “approval” of the given record as not a duplicate or a reference to a single high-probability duplicate record.)

For non-immediate results, if the de-duplication system can resolve the response without human review, the result will usually be available within a matter of seconds. In some cases, a summary of this information may be available in the initial Registry response, but it is always available via StatusLookup with the token. Because the de-duplication process may require manual review by the EIDR operations team, results may not be available for up to one business day.
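
Because a non-immediate result may arrive within seconds or only after manual review, a client usually polls with the token rather than blocking indefinitely. The sketch below shows one possible polling loop with a doubling delay. It is an illustration under stated assumptions: TokenPoller is a hypothetical helper, and the lookup argument stands in for whatever code performs the StatusLookup call and returns an empty Optional while the operation is still pending.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper; not part of the EIDR SDK.
final class TokenPoller {
    private TokenPoller() {}

    /**
     * Calls lookup until it yields a final result or maxAttempts is reached.
     * The delay between attempts doubles each time, capped at maxDelayMillis.
     * Returns an empty Optional if no final result was seen or the thread was interrupted.
     */
    static <T> Optional<T> pollUntilFinal(Supplier<Optional<T>> lookup,
                                          int maxAttempts,
                                          long initialDelayMillis,
                                          long maxDelayMillis) {
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = lookup.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return Optional.empty();
                }
                delay = Math.min(delay * 2, maxDelayMillis);
            }
        }
        return Optional.empty();
    }
}
```

Since manual review can take up to one business day, a long-running workflow would more likely persist the token and check again later than keep a thread polling.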

In immediate mode, a request that would have gone into manual review either succeeds or fails immediately, providing a set of one or more existing EIDR IDs for the candidate duplicates. Requests that would not have gone into manual review succeed or fail as they would normally. The results are also available using the token returned with the initial registry response.

Most registration and modification workflows should use non-immediate requests. Immediate requests should only be used for particular situations, such as workflows that:

  • Know with very high probability that the outcome will not require manual de-duplication review. For example, when registering new episodes with vetted metadata in an empty series, or when registering records for which almost any metadata differences are considered distinguishing, as with Manifestations or Edits.
  • Manually or automatically review the candidate duplicates and determine that one of the match candidates is a true match for the submitted record; that the submitted metadata should be corrected in a way that better distinguishes it; or that the request should be re-submitted in non-immediate mode for manual processing by EIDR.

The EIDR Web UI supports this last model. By default the initial request is performed immediately so the user can see the duplicate candidates and determine which of the three options outlined above to pursue. Generally, processes that use immediate mode need to treat the return of candidate duplicates as an error condition that requires manual intervention.

Tokens

Any system that does not provide immediate results must provide requestors with a way of tracking the progress of a request, so the Registry returns a status token for its batchable operations: Registration, Modify, AddRelationship, RemoveRelationship, Promote, Delete, and Alias. Multi-item requests generate a token for the batch itself and one for each individual operation in the batch. These are called Request and Operation tokens, respectively:

  • Request tokens refer to the status of the entire request body. These are returned in the /Response/RequestStatus/Token element.[4] These are also called batch tokens.
  • Operation tokens refer to individual requests, and are returned in the /Response/RequestStatusResults/OperationStatus/Token XML element.

By default, tokens are in the form of a system-generated unique 19-digit numeric string, but user-defined tokens are also supported for both Requests and for the individual operations within a request.
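
When logging or debugging, it can help to tell the default token shape apart from user-defined tokens. The check below is a heuristic sketch only: TokenShape is a hypothetical helper, and a user-defined token that happens to be 19 digits long would be indistinguishable from a system-generated one.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of the EIDR SDK.
final class TokenShape {
    // The default token form described above: exactly 19 ASCII digits.
    private static final Pattern NINETEEN_DIGITS = Pattern.compile("[0-9]{19}");

    private TokenShape() {}

    /** True when the token has the default 19-digit numeric shape. */
    static boolean looksSystemGenerated(String token) {
        return token != null && NINETEEN_DIGITS.matcher(token).matches();
    }
}
```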

Besides retrieving information related to the token from a Response, you can also retrieve information relating to a token with a StatusLookup request. Operation tokens can be used to retrieve information about the status of an individual request (for example, a single Create or Delete). Batch (Request) tokens can be used to retrieve information about the status of the batch and the individual items within it, as they become available. This information includes both the operation tokens and the current status of each item in the batch.

The EIDR Registry always generates a token for each Request and Operation.

Batches with a single item generate only a single token. This is treated as an operation token whenever information relating to it is returned from the Registry (as, for example, when it is initially generated or requested via StatusLookup).

While the individual requests within a batch are processed in an indeterminate order, the operation tokens associated with a batch are always returned in the order submitted. This allows the requestor to match the token to the associated transaction.
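
Because operation tokens come back in submission order, a client can pair each token with the record it submitted purely by position. The following is a minimal sketch of that bookkeeping, assuming the caller has already extracted the operation tokens from the response as a list of strings; TokenOrderMatcher is a hypothetical helper and the record keys are whatever identifiers the submitting system uses.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the EIDR SDK.
final class TokenOrderMatcher {
    private TokenOrderMatcher() {}

    /**
     * Pairs the i-th operation token with the i-th submitted record key.
     * Returns a map from operation token to record key, in submission order.
     */
    static Map<String, String> pairByPosition(List<String> submittedRecordKeys,
                                              List<String> operationTokens) {
        if (submittedRecordKeys.size() != operationTokens.size()) {
            throw new IllegalArgumentException(
                "expected one token per submitted record, got "
                + operationTokens.size() + " tokens for "
                + submittedRecordKeys.size() + " records");
        }
        Map<String, String> tokenToRecord = new LinkedHashMap<>();
        for (int i = 0; i < operationTokens.size(); i++) {
            tokenToRecord.put(operationTokens.get(i), submittedRecordKeys.get(i));
        }
        return tokenToRecord;
    }
}
```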

See the “Token Status Lookup” section and “Appendix: Token Use Examples” for more details.

Per-Operation Data

Forced de-duplication and user-defined tokens both require the use of per-operation data. In both cases, these are provided via an array, which must be exactly the same length as the number of Operations in the Request.
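
The length rule above is easy to violate when per-operation arrays are built separately from the operations themselves. The sketch below shows one way to fail fast on the client; PerOperationData is a hypothetical helper, and treating null as "feature not in use" is an assumption based on the SDK's description of null as the default state for these arrays.

```java
// Hypothetical helper; not part of the EIDR SDK.
final class PerOperationData {
    private PerOperationData() {}

    /**
     * Returns the array unchanged when it is null (feature not in use) or has
     * exactly one entry per operation; otherwise throws.
     */
    static <T> T[] requireOnePerOperation(T[] perOperationData, int operationCount, String name) {
        if (perOperationData == null) {
            return null;
        }
        if (perOperationData.length != operationCount) {
            throw new IllegalArgumentException(
                name + " has " + perOperationData.length
                + " entries but the request has " + operationCount + " operations");
        }
        return perOperationData;
    }
}
```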

User-Defined Tokens

In addition to the tokens generated by the Registry, users can provide their own tokens for requests and operations, which can be used in exactly the same way as the system-generated token the Registry returns. A user token is an identifying string, supplied by the requester, which is passed along with each request or record. This will usually consist of a representation of the submitting system’s internal ID or primary key for the records being submitted.

If given, user tokens will be echoed back in status request responses, and you can correlate API-generated tokens to your internal records via the user token. If you do not supply user tokens, you will have to infer the correlation between submitted requests and the EIDR response by the order in which the system-generated operation tokens are returned within a batch.

NOTE: It is strongly advised that systems that use the API establish a mechanism for ensuring that each supplied User token is unique. Otherwise, status requests that refer to these User tokens will return results that are ambiguous.
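
One way to keep user tokens unique is to combine a per-submission run identifier with the submitting system's own record ID. The sketch below illustrates that idea; UserTokenSketch and the "runId:internalId" layout are assumptions made for illustration, and any length or character restrictions the Registry places on user tokens are not modeled here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper; not part of the EIDR SDK.
final class UserTokenSketch {
    private UserTokenSketch() {}

    /** A fresh identifier for one submission run, so resubmitted records get new tokens. */
    static String newRunId() {
        return UUID.randomUUID().toString();
    }

    /**
     * Builds one user token per internal ID and returns a map from token back
     * to internal ID, in submission order. Duplicate internal IDs are rejected
     * because they would produce duplicate tokens.
     */
    static Map<String, String> buildTokens(String runId, List<String> internalIds) {
        Map<String, String> tokenToInternalId = new LinkedHashMap<>();
        for (String id : internalIds) {
            String token = runId + ":" + id;
            if (tokenToInternalId.put(token, id) != null) {
                throw new IllegalArgumentException("duplicate internal ID in batch: " + id);
            }
        }
        return tokenToInternalId;
    }

    /** The tokens as an array, one per operation, in submission order. */
    static String[] asOperationTokenArray(Map<String, String> tokenToInternalId) {
        return tokenToInternalId.keySet().toArray(new String[0]);
    }
}
```

The retained map lets the application translate a token echoed back in a status response into its own record key without relying on response order.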

The following SDK classes support user tokens: Registration, Modify, AddRelationship, RemoveRelationship, Promote, Delete, and Alias. Each provides these methods:

  • public void setRequestUserToken(String userToken);
  • public String getRequestUserToken();
  • public void setOperationUserTokens(String[] userTokens);
  • public String[] getOperationUserTokens();

If a request has more than one operation, the length of the user tokens array must be the same as the number of operations in the request, or the SDK will generate an Error.

See the “Token Status Lookup” section and “Appendix: Token Use Examples” for more details and the source code for RegisterTool provided with the SDK for an example of how to use these calls.

Forced De-duplication

An application can request manual processing of de-duplication for content records with a single match above the applicable high threshold, which otherwise would be returned as duplicates by the automated review system. This applies to the following operations only in non-immediate (asynchronous) mode: Create, AddRelationship, RemoveRelationship, Modify, Delete, Alias, and Promote.

The Registration, Modify, Alias, AddRelationship, and RemoveRelationship classes all implement the following method:

public void setForcedFlags(DedupTypes [] forcedFlags);

forcedFlags allows the following values:

  • null
  • normal
  • manual
  • accept
  • review

Setting these for immediate mode requests causes an EIDR Registry error. As with user-specified tokens, if a request has more than one operation, the length of the flags array must be the same as the number of operations in the request, or the SDK will generate an SDK Error.

The forcedFlags array on an SDK object starts out as null, meaning that no forced de-duplication occurs. Call setForcedFlags(null) or setForcedFlags(normal) after using the feature to turn it off again. Null implies a value of “normal” for the dedupMode attribute.

If you are certain that a submitted record is unique, despite its apparent similarity to the identified duplicate, and you would like to record it in the EIDR Registry, then you must either provide additional or alternate metadata sufficient to disambiguate the record and re-submit the Registry request or request a manual review by setting the force de-dupe flag to “manual”.

Applications should use this feature only in very special circumstances, and users should give EIDR operations staff advance warning of its use.

NOTE: Since the force de-duplication flags are set per operation, not per batch, each individual request within a batch could have a different flag. However, we recommend that they all be the same and that you use separate batches for each different setting.
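
The recommendation to use a separate batch for each flag setting amounts to grouping pending operations by flag before submission. The following sketch shows that grouping. DedupFlag, PendingOp, and FlagBatcher are hypothetical stand-ins (DedupFlag only mirrors the flag values listed earlier and is not the SDK's DedupTypes), and a null flag is treated as normal, as described above.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in mirroring the flag values listed above.
enum DedupFlag { NORMAL, MANUAL, ACCEPT, REVIEW }

// Hypothetical description of one operation waiting to be submitted.
final class PendingOp {
    final String recordKey;
    final DedupFlag flag; // may be null, which is treated as NORMAL

    PendingOp(String recordKey, DedupFlag flag) {
        this.recordKey = recordKey;
        this.flag = flag;
    }
}

// Hypothetical helper; not part of the EIDR SDK.
final class FlagBatcher {
    private FlagBatcher() {}

    /** Splits operations into one list per flag, keeping submission order within each list. */
    static Map<DedupFlag, List<PendingOp>> splitByFlag(List<PendingOp> operations) {
        Map<DedupFlag, List<PendingOp>> batches = new EnumMap<>(DedupFlag.class);
        for (PendingOp op : operations) {
            DedupFlag effective = (op.flag == null) ? DedupFlag.NORMAL : op.flag;
            batches.computeIfAbsent(effective, f -> new ArrayList<>()).add(op);
        }
        return batches;
    }
}
```

Each resulting list can then be submitted as its own request with a uniform flags array.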

See the source code for RegisterTool, ModifyTool, and PromoteTool provided with the SDK for examples of how to use this feature.

Reset Request to Default State

This call sets the Request token, the array of operation tokens, and the array of forced flags to null:

public void clearOperationData();

Read Operations

All Registry read operations are synchronous and blocking – the Registry returns the result as quickly as it can and the calling SDK classes wait for the response before returning to the caller. The Registry read classes are:

  • GraphTraversal
  • Match
  • ModificationBase
  • PartyQuery
  • PartyResolution
  • Query
  • Resolution
  • ServiceGraph
  • ServiceQuery
  • ServiceResolution
  • StatusLookup
  • VirtualFieldsRetrieval

Return Values

The HTTP API returns a variety of XML elements, including <Response> and <AdminResponse>. Some calls return a different XML element for success and failure responses.

In the SDK, all Registry read and write requests return a Response object, defined in org.eidr.sdk.model, or a subclass of Response. The Response object provides a consistent interface across the whole range of Registry return values.

Each SDK call can return one of four Status Types, which are found in org.eidr.sdk.model.StatusTypes (for the Java SDK) or org.eidr.schema.StatusTypeType (for the .NET SDK):[5]

  • Success (The request was processed by the EIDR Registry.)
  • EIDRError (The Registry found an error in the request, such as malformed XML or an authentication error.)
  • SDKError (The SDK was unable to convert the call into a Registry Request, for example, because of missing data or a batch that is too large for the registry to process in a single request.)
  • Error (There was some other problem, for example, a network failure. These errors are usually generated by the Java or .NET runtime libraries.)

For more information on EIDR error codes, see “Error Types” in the EIDR Registry Technical Overview and “Codes and Descriptions” in the EIDR HTTP API Reference.

A response to any Request generally behaves as follows:

On Success:

  • getStatus() returns StatusTypes.Success
  • getMessage() returns a human-readable response
  • getResponseObj() returns the object version of Response
  • getResponsePayload() returns the registry-provided XML associated with that SDK response object

On EIDRError:

  • getStatus() returns StatusTypes.EIDRError
  • getMessage() returns useful info extracted from the Registry response
  • getResponseObj() returns the object version of the error XML
  • getResponsePayload() returns the registry-provided error XML associated with that SDK response object

On SDKError:

  • getStatus() returns StatusTypes.SDKError
  • getMessage() returns a textual description of what went wrong, e.g., “argument lengths must be equal” when two arrays are not the same length
  • getResponseObj() returns null
  • getResponsePayload() returns “” (the empty string, not null)

On Error:

  • getStatus() returns StatusTypes.Error
  • getMessage() returns status information, e.g., information from catching an internal exception
  • getResponseObj() returns null
  • getResponsePayload() returns “” (the empty string, not null)
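
The four cases above suggest a simple dispatch on the status value. The sketch below illustrates one possible triage. SketchStatus, SketchResponse, NextStep, and ResponseTriage are hypothetical stand-ins that only mirror the status names and getter names listed above; they are not the SDK's Response or StatusTypes classes, and the suggested follow-up for each case is a design assumption rather than SDK behavior.

```java
// Hypothetical stand-ins mirroring the names used in this section.
enum SketchStatus { Success, EIDRError, SDKError, Error }

enum NextStep { INSPECT_RESULTS, FIX_REQUEST, FIX_CALL_ARGUMENTS, RETRY_LATER }

interface SketchResponse {
    SketchStatus getStatus();
    String getMessage();
    String getResponsePayload();
}

// Hypothetical helper; not part of the EIDR SDK.
final class ResponseTriage {
    private ResponseTriage() {}

    /** Maps a status to a suggested follow-up action. */
    static NextStep triage(SketchResponse response) {
        switch (response.getStatus()) {
            case Success:
                // The Registry processed the request; individual operations may still have failed.
                return NextStep.INSPECT_RESULTS;
            case EIDRError:
                // The Registry rejected the request itself (for example, malformed XML).
                return NextStep.FIX_REQUEST;
            case SDKError:
                // The request never left the client; the payload is the empty string.
                return NextStep.FIX_CALL_ARGUMENTS;
            case Error:
                // Something else went wrong, such as a network failure.
                return NextStep.RETRY_LATER;
            default:
                throw new IllegalStateException("unknown status: " + response.getStatus());
        }
    }

    /** Convenience factory for building a stand-in response in tests. */
    static SketchResponse of(final SketchStatus status, final String message, final String payload) {
        return new SketchResponse() {
            public SketchStatus getStatus() { return status; }
            public String getMessage() { return message; }
            public String getResponsePayload() { return payload; }
        };
    }
}
```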

Some classes extend Response to include other information. For example, resolution.resolveSimple() returns a SimpleInfoResponse, which also has a getSimpleInfo() method to call if the request was successful. All of these extended Responses are in org.eidr.sdk.model:

  • AllSelfDefinedInfoResponse
  • FullObjectInfoResponse
  • KernelMetadataResponse
  • PartyQueryResponse
  • PartyResolutionResponseType
  • ProvenanceInfoResponse
  • ServiceQueryResponse
  • ServiceResolutionResponseType
  • SimpleInfoResponse

See the source code for the tools included with the SDKs for examples of how to handle the various kinds of responses.

A successful response to a request that contains multiple operations will return at least a token for the request. StatusLookup on that token eventually will return a response that has a list of OperationStatusType objects, one for each operation in the original request, each of which will have its own unique operation token. See the source code for RegisterTool included with the SDK for an example of iterating through this list.

Tips and Tricks

  • All calls to the EIDR Registry must be via HTTPS, not HTTP
  • For EIDRError, you can find the actual Registry error with Java using responseObj.getStatus().getCode(), responseObj.getStatus().getType(), and responseObj.getStatus().getDetails()
  • For .NET, use responseObj.getStatus().Code, responseObj.getStatus().Type, and responseObj.getStatus().Details
  • The standard errors from the Registry APIs are listed in the EIDR Registry Technical Overview and EIDR HTTP API Reference.
  • A getStatus() result of Success means only that the EIDR Registry accepted and processed the Request. The Request itself may have caused errors when the Registry tried to execute the request.

HTTP Return Codes

Generally the Registry will generate an HTTP return code of 200 (“Success”). This means that the request made it to the HTTP API for processing and that the HTTP API could return a result.

A 400-class error indicates that the Registry was contacted but that the request URL was incorrect. This should not happen if you are using the SDK, but may occur with applications that construct HTTP API request URLs themselves.

A 505 error indicates that the requested HTTP API function was found, but that the request could not be executed, usually because of an error in one of the parameters. This can happen because some of the information sent to the SDK API is used to construct the HTTP URL, and so must conform to the HTTP RFC. If something is provided that results in a malformed request, the Registry will return a 505 error. The most common cause of this is passing a malformed EIDR ID to the SDK. For example:

  • Having leading or trailing spaces in the EIDR ID
  • Mistakenly sending in a non-EIDR ID where an EIDR ID is expected
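
Both causes listed above can be caught before the ID reaches the SDK. The sketch below trims whitespace and checks the overall shape of a content ID. It is an illustration under stated assumptions: EidrIdCheck is a hypothetical helper, the pattern assumes the canonical content ID form (prefix 10.5240, five groups of four uppercase hexadecimal characters, and one trailing check character), it does not verify the check character, and it does not cover Party, User, or Video Service IDs.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of the EIDR SDK.
final class EidrIdCheck {
    // Assumed canonical content ID shape: 10.5240/XXXX-XXXX-XXXX-XXXX-XXXX-C
    private static final Pattern CONTENT_ID =
        Pattern.compile("10\\.5240/(?:[0-9A-F]{4}-){5}[0-9A-Z]");

    private EidrIdCheck() {}

    /** Returns the trimmed ID, or throws if it does not look like a content ID. */
    static String normalizeContentId(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("EIDR ID must not be null");
        }
        String trimmed = raw.trim();
        if (!CONTENT_ID.matcher(trimmed).matches()) {
            throw new IllegalArgumentException("not an EIDR content ID: '" + trimmed + "'");
        }
        return trimmed;
    }
}
```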

Encoding and Escaping Conventions

Special XML characters in a Registry response are escaped as appropriate, regardless of the technique (escaping or CDATA) used in the original request. The special characters and their escaped representation are:

Character   XML Escape
"           &quot;
'           &apos;
<           &lt;
>           &gt;
&           &amp;

The parentheses characters are reserved in EIDR Query expressions. These characters need to be escaped with a backslash (e.g., \( in place of () if used in a query.
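
Applications that build XML request strings or query expressions by hand need to apply these conventions themselves. The sketch below shows both escapes. EscapeSketch is a hypothetical helper; the XML part covers only the five predefined XML entities, and the query part assumes that a backslash before each parenthesis is sufficient, without addressing any other characters that query syntax may reserve.

```java
// Hypothetical helper; not part of the EIDR SDK.
final class EscapeSketch {
    private EscapeSketch() {}

    /** Replaces the five special XML characters with their predefined entities. */
    static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Prefixes each parenthesis in a query value with a backslash. */
    static String escapeQueryParens(String value) {
        return value.replace("(", "\\(").replace(")", "\\)");
    }
}
```

For example, escapeXml applied to Tom & "Jerry" yields Tom &amp; &quot;Jerry&quot;, and escapeQueryParens applied to Alien (1979) yields Alien \(1979\).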

XML-based and Object-based SDK Calls

In the SDKs, the primary Registry operations are available in the org.eidr.sdk.api package or namespace. Each class contains methods that permit you to submit requests using either information objects contained within the SDK, or an XML-based request using XML strings.

For example, the AddRelationship class offers both addSingleRelationshipFromObj() and addSingleRelationshipFromXML() methods.

EIDR Schema Files

The versions of the XML schemas that the EIDR SDKs rely upon are bundled with the SDK distribution packages. In addition, those unique to EIDR are available on the Web at http://eidr.org/schema/, with archival copies of older schemas available in appropriately named subdirectories. For example, while the 2.6 EIDR schemas are in http://eidr.org/schema/, a copy is also held at http://eidr.org/schema/2.6.0, while a copy of the (now deprecated) 1.2.1 schema is available in http://eidr.org/schema/archive/1.2.1.

Most breaking changes in the EIDR schemas only occur in “dot” releases, e.g., 2.0, 2.1, etc. EIDR “dot-dot” releases (2.1.1, 2.1.2, etc.) contain non-breaking changes, affecting Registry and de-duplication internal operations, the introduction of new features that cannot be called from older APIs, changes to elements not encapsulated in the schema (such as the text of certain error messages), and additions to existing controlled vocabularies (the latter with a forwards compatibility mode so older schema versions can reference the vocabulary entries of newer schema versions).

NOTE: The EIDR Command-Line Tools, included with the SDKs, can be run from locations other than their original install directory. This means that the tools cannot reliably locate a local copy of the schemas and therefore rely upon the Web copy.

EIDR Configuration File

The EIDR SDK and Tools obtain their Registry credentials from a configuration file found on the local workstation. The important fields in an EIDR configuration file are:

  • <user> This is your user ID, provided by EIDR operations. It looks something like 10.5238/your-name. The prefix is always 10.5238.
  • <party> This is the party name for your user ID. All users are associated with a party. It is of the form 10.5237/9DD9-E249. The prefix is always 10.5237.
  • <url> This is the URL for the registry you are using. The valid choices are
    • https://registry1.eidr.org:443/EIDR for the production registry.
    • https://resolve.eidr.org:443/EIDR for the production read-only mirror.
    • https://sandbox1.eidr.org:443/EIDR for the test registry.

Sometimes a special-purpose registry may be used for pre-release testing, in which case alpha and beta users will be given the appropriate URL.

  • <PageSize> Query results are returned in sets or “pages”, and this gives the maximum number of results to return per page. Larger values of PageSize are generally more efficient since they amortize the cost of the HTTP request over more results. Some of the tools internally use a small PageSize if all they are interested in is the number of results rather than the results themselves.

If you are connecting to the EIDR Registry via a proxy server, then additional configuration parameters may be necessary:

  • <proxyhost> This is the hostname of the proxy. This should be either an IP address or a domain name that is resolvable by the name service running on your machine.
  • <proxyport> This is the port number that the proxy is attached to on the <proxyhost>.
  • <proxyauthuser> The user sent for proxy authentication. Do not use if there is no authentication required.
  • <proxyauthpass> The password sent for proxy authentication. Do not use if there is no authentication required.

NOTE: The Java SDK proxy uses the underlying proxy support in the JVM, and as such, any limitations in the base JVM proxy support will limit the SDK proxy support. 

The Tools and SDK come with a sample configuration file: Doc/Examples/eidr-config-sample.xml

Add your EIDR credentials (partyname, username, and password, obtained from EIDR Operations) to the sample configuration file and place the file in your home directory as eidr-config.xml

  • Unix: $HOME/eidr-config.xml
  • Windows 7, 8, 10: \Users\[windows-username]\eidr-config.xml
  • Mac: /Users/[mac-username]/eidr-config.xml
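
The default locations above all resolve to a file named eidr-config.xml in the user's home directory, so an application can locate and sanity-check the file with standard Java APIs. The sketch below is an illustration only: ConfigPeek is a hypothetical helper, it looks elements up by tag name because the overall layout of the configuration file is not described here, and the SDK's own configuration loading should be preferred in real applications.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper; not part of the EIDR SDK.
final class ConfigPeek {
    private ConfigPeek() {}

    /** The default location: eidr-config.xml in the current user's home directory. */
    static Path defaultConfigPath() {
        return Paths.get(System.getProperty("user.home"), "eidr-config.xml");
    }

    /**
     * Returns the trimmed text of the first element with the given tag name,
     * or null if the document has no such element.
     */
    static String firstElementText(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("configuration is not well-formed XML", e);
        }
    }
}
```

A startup check might read the file at defaultConfigPath(), confirm that the url element points at the intended registry (for example, the sandbox rather than production), and stop early if it does not.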

Tools that use only API calls that do not require authentication (generally limited to Resolve) can run without credentials. For those tools, credentials can be left out of the configuration file, but if present they must be valid.


[1] A developer using the EIDR SDK should never call the Post class directly, but may call methods on the UserOverride subclass, which sits between Post and the classes that perform Registry write operations. Post itself has been significantly extended, including the ability to set user-defined tokens and de-dupe flags, both discussed in later sections.

[2] The Request itself always returns a result synchronously.

[3] The exception being “User-Defined Tokens,” described in their own section below.

[4] Using XPath notation to map the XML schema structure.

[5] This document primarily describes the Java SDK implementation. In the .NET SDK implementation, there are some differences in the names of model constructs and how they are accessed. See “Schema Classes” for more information.

Updated on April 11, 2021
