Catalog Matching and Registration

You can match and register back catalogs with EIDR in several ways, but all of them share these common steps:

  • Preparing data in a standard format, either XML or a spreadsheet
  • Iteratively matching the proposed records against the current EIDR database
  • For records that were not found by the matching process (gap records):
    • Serialization: some records (e.g. episodes) may require that other records be registered first (for example, a series must be registered before its episodes)
    • Registration of new content records
  • Making corrections to existing records based on the matching results
  • Providing the matched and newly registered EIDR IDs to the EIDR user for inclusion in the EIDR user’s metadata systems.
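
The serialization step above can be sketched as a simple parent-before-child ordering. This is a minimal illustration, not EIDR tooling: the record keys, the `parents` mapping, and the `registration_order` function are all hypothetical, and a parent that does not appear in the gap set is assumed to be registered already.

```python
from typing import Dict, List, Optional


def registration_order(parents: Dict[str, Optional[str]]) -> List[str]:
    """Return record keys ordered so that every parent precedes its children.

    `parents` maps a local record key to its parent's key (None for a
    record with no parent). A parent missing from the mapping is assumed
    to exist in the registry already, so it is not scheduled.
    """
    ordered: List[str] = []
    seen = set()

    def visit(key: str, path: tuple) -> None:
        if key in seen:
            return
        if key in path:
            raise ValueError(f"cyclic parent chain at {key!r}")
        parent = parents[key]
        if parent is not None and parent in parents:
            # Schedule the parent first, remembering the chain walked so far.
            visit(parent, path + (key,))
        seen.add(key)
        ordered.append(key)

    for key in parents:
        visit(key, ())
    return ordered


# Hypothetical gap records: an episode, its season, its series, and a movie.
gap_records = {
    "ep-101": "season-1",
    "season-1": "series-a",
    "series-a": None,
    "movie-x": None,
}
order = registration_order(gap_records)
print(order)
```

Registering records in the returned order means a series is always submitted before its seasons, and a season before its episodes.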

Most bulk registrations must go through the matching process, which reduces the chance of duplicate registrations caused by the variable quality of source material across providers.

To prepare large registrations:

  1. Determine hierarchy compatibility. This includes determining the types of records you will be registering and how these fit with the types of records in your system. For example, if you are registering abstract works, determine which records represent the work in its general form. If you are registering digital assets (Manifestations), determine those versions (EIDR Edits) from which your asset originated.
  2. Map your data fields to EIDR’s and validate metadata compatibility.
  3. Evaluate whether your metadata covers the required EIDR fields and follows recommended EIDR practices (for example, Directors/Actors and Alternate IDs). Develop a plan to fill in any missing data.
  4. Evaluate the quality and consistency of use of the resulting metadata (for example, release year, first billed vs. any four actors).
  5. Match common production/distribution companies to those in EIDR’s Party database.
  6. Run the batch through matching before registration. That way, you’ll know whether any records require further preparation and how many of them will require manual review. (Our standard SLA for registry response does not apply to catalog projects, so you must coordinate with EIDR Operations in advance to determine the SLA and the correct submission batch size for your particular project.)
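
As a rough illustration of steps 3 and 4, the sketch below checks each record for missing required fields and for an inconsistent release year. The field names in `REQUIRED_FIELDS` are placeholders chosen for this example, not EIDR's actual required list, and `find_problems` is a hypothetical helper; substitute the fields and rules from EIDR's own documentation.

```python
import re
from typing import Dict, List

# Placeholder field names for illustration only (an assumption, not EIDR's list).
REQUIRED_FIELDS = ["title", "release_year", "referent_type"]

# Accept only four-digit years from 1800 to 2099.
YEAR_PATTERN = re.compile(r"^(18|19|20)\d{2}$")


def find_problems(record: Dict[str, str]) -> List[str]:
    """List the coverage and consistency problems found in one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat absent, empty, and whitespace-only values as missing.
        if not str(record.get(field, "")).strip():
            problems.append(f"missing {field}")
    year = str(record.get("release_year", "")).strip()
    if year and not YEAR_PATTERN.match(year):
        problems.append(f"release_year not a four-digit year: {year!r}")
    return problems


print(find_problems({"title": " ", "release_year": "21"}))
```

Running a check like this over the whole catalog before submission gives an early count of the records that need further preparation.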

The following sections describe the steps for matching and registering records with the EIDR service.

Data Model Alignment Process

The first step in the process is to make sure that any records being registered align with the EIDR data model.

  1. Using the defined and required EIDR data fields needed to register a record, the EIDR user iteratively reviews their data model to identify the corresponding fields and map them to EIDR fields.
  2. If needed, the EIDR user can ask EIDR to review the data model mapping, to help validate that the EIDR user’s data model is properly aligned with EIDR.
  3. The EIDR user supplies a sample set of data (for example, 10 complete records) for the identified fields in an Excel template provided by EIDR.
  4. EIDR reviews this initial sample to confirm the mapping and identify any gaps.
  5. If there are missing data fields or other anomalies, the EIDR user iterates with the EIDR team on how to provide the missing fields.

Output: Mapping of EIDR user’s data model and practices to EIDR’s.
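
One way to picture this output is a lookup table from the EIDR user's column names to target field names. The sketch below is only an assumption about how such a mapping might be applied to a sample record: the source columns, the target names in `FIELD_MAP`, and the `map_record` helper are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from a provider's column names to illustrative target names.
FIELD_MAP = {
    "Title": "title",
    "Year": "release_year",
    "Type": "referent_type",
    "Dir": "directors",
}


def map_record(source: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Rename mapped fields and report the source fields that have no mapping."""
    mapped = {FIELD_MAP[name]: value for name, value in source.items() if name in FIELD_MAP}
    unmapped = sorted(name for name in source if name not in FIELD_MAP)
    return mapped, unmapped


sample = {"Title": "Example", "Year": "2021", "HouseID": "A-1"}
mapped, unmapped = map_record(sample)
print(mapped)
print(unmapped)
```

Source fields reported as unmapped, such as the house ID in this sample, are the ones to raise when iterating with the EIDR team.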

Record Matching Process

The next step in the process is to match the EIDR user’s data against the EIDR data.

  1. If the user wants to use a third-party matching service:
    1. EIDR will provide a copy of the current registry (in XML or flat file) for external matching.
    2. EIDR user reviews the results to identify records that do not exist in the EIDR database.
    3. EIDR user submits these gap records to EIDR for registration.
    4. EIDR will review the provided data to confirm there are no matches and assess the anticipated manual review volumes before registration.
  2. If the user wants to use EIDR’s fuzzy matching service:
    1. EIDR user submits all of their records to EIDR for preliminary match.
    2. EIDR will review the provided data to confirm the match, no match, and manual review rates before registration.
  3. EIDR user then follows the Bulk Registration steps below to register records.

Output: Gap titles identified and verified for Bulk Registration process.
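
The three outcomes mentioned above (match, no match, manual review) can be illustrated with a simple title-similarity triage. This is not EIDR's matching algorithm: it uses Python's standard `difflib.SequenceMatcher`, and the thresholds, the release-year penalty, and the record fields are assumptions made for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def normalize(title: str) -> str:
    """Lower-case a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def score(candidate: Dict[str, str], existing: Dict[str, str]) -> float:
    """Title similarity in [0, 1], reduced when the release years differ."""
    ratio = SequenceMatcher(
        None, normalize(candidate["title"]), normalize(existing["title"])
    ).ratio()
    if candidate.get("year") != existing.get("year"):
        ratio *= 0.8  # assumed penalty, chosen for illustration
    return ratio


def triage(
    candidate: Dict[str, str],
    registry: List[Dict[str, str]],
    match_at: float = 0.95,
    review_at: float = 0.75,
) -> Tuple[str, float]:
    """Classify a candidate by its best score against the registry copy."""
    best = max((score(candidate, record) for record in registry), default=0.0)
    if best >= match_at:
        return "match", best
    if best >= review_at:
        return "manual review", best
    return "no match", best


registry = [{"title": "The Example Movie", "year": "1999"}]
print(triage({"title": "the  example movie", "year": "1999"}, registry))
print(triage({"title": "The Example Movie", "year": "2001"}, registry))
```

Records that land in the "no match" group are the gap records. The size of the "manual review" group is what drives the batch size agreed with EIDR Operations.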

Bulk Registration Process

After Data Model Alignment and Record Matching have been completed, the registration process can start.

  1. EIDR-managed bulk registration:
    1. The EIDR user provides a full set of data fields for the records identified in the output of the Matching Process in a BMR (Bulk Match & Register) template.
    2. EIDR processes them in batches (determined based on the number of anticipated manual reviews), returning the final results once all records have been processed.
  2. User-managed bulk registration:
    1. The user works with EIDR Operations to determine a suitable submission schedule for their gap record set (batch size and frequency), based on the number of anticipated manual reviews.
    2. The user processes their records in batches via their own API/SDK integration until all records have been processed.

Output: Gap data added to EIDR database.

NOTE: The EIDR user is then responsible for adding the newly created EIDR content IDs to its data systems.
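
For user-managed bulk registration, the batching described above might look like the following sketch. The `submit` callable stands in for the user's own API/SDK integration and is purely hypothetical; no real EIDR SDK call is shown, and the batch size must come from the schedule agreed with EIDR Operations.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batches(records: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `batch_size` records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


def process_all(
    records: Sequence[T],
    batch_size: int,
    submit: Callable[[List[T]], List[R]],
) -> List[R]:
    """Submit every batch in order and collect the per-record results."""
    results: List[R] = []
    for batch in batches(records, batch_size):
        # `submit` is a placeholder for the user's own registration call.
        results.extend(submit(batch))
    return results


# Stand-in submit function that just upper-cases each record key.
print(process_all(["a", "b", "c"], 2, lambda batch: [key.upper() for key in batch]))
```

A real integration would also pause between batches according to the agreed submission frequency and record the returned IDs for the step described in the note above.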

Bulk Modification Process

  1. After Record Matching, if the EIDR user has additional fields such as alternate IDs or more accurate metadata, the EIDR user supplies the EIDR IDs and additional metadata fields.
  2. EIDR spot checks metadata for accuracy and alternate IDs for proper granularity.
  3. EIDR updates the records.

Updated on April 9, 2021
