One of the first tasks in an EIDR integration is data model alignment: establishing how an organization’s internal data representation relates to the EIDR representation so that organization data can be included in the EIDR registry and EIDR data can be included in the organization’s internal systems and processes.
If the only EIDR data element of interest is the EIDR ID itself, then this process is simple:
- Provide a text field that can hold 34 characters, drawn from upper-case letters, digits, and the punctuation symbols period (.), forward slash (/), and hyphen (-).
EIDR IDs can then be obtained from external sources, stored locally, and passed on to system users or downstream supply chain partners. Your EIDR integration is complete.
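Before an incoming ID is stored, it can be useful to confirm that it at least has the expected shape. The following is a minimal sketch, not an official validator. It assumes the canonical content ID form (the `10.5240/` prefix, five hyphen-separated groups of four hexadecimal digits, and one trailing check character), which is not spelled out above. The name `looks_like_eidr_id` and the sample ID are hypothetical. The check character is not verified, and a well-formed string is not necessarily a registered ID.

```python
import re

# Shape only (assumed canonical form): "10.5240/" prefix, five groups of four
# upper-case hex digits each followed by a hyphen, then one check character.
# 8 + 5 * 5 + 1 = 34 characters in total.
EIDR_ID_PATTERN = re.compile(r"10\.5240/(?:[0-9A-F]{4}-){5}[0-9A-Z]")


def looks_like_eidr_id(value: str) -> bool:
    """Return True if value has the 34-character shape of an EIDR ID.

    The check character is not validated and the registry is not consulted.
    """
    return EIDR_ID_PATTERN.fullmatch(value) is not None


# Hypothetical, made-up ID used only to exercise the pattern.
sample = "10.5240/1A2B-3C4D-5E6F-7A8B-9C0D-Z"
print(len(sample), looks_like_eidr_id(sample))
```

A check like this catches truncated or lower-cased values at the point of entry, which is usually cheaper than discovering them when the ID is passed downstream.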
Most EIDR integrations will be more involved. Begin by identifying a proof-of-concept use case and focus on only those EIDR elements that are directly involved. Next, start at the root of the EIDR registration tree and master each level before moving on to the next. In some cases, it may be necessary to start at a higher level than the organization commonly deals with. For example, if the local system records only Edit records (as might be the case for an avails system), it will still be necessary to map to the EIDR Abstraction structure before moving on to Edits. Similarly, if the local system starts its episodic hierarchy with Seasons, it will be necessary to first master EIDR Series Collections before moving on to Season Collections and then to Episodes.
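The root-first approach for an episodic hierarchy can be sketched as follows. This is an illustration only: the record layout (`id`, `level`, `parent`), the level names used as dictionary keys, and both function names are hypothetical and would need to be adapted to the local data model. The sketch orders records so that parents come before children and reports any parent that the local data refers to but does not contain, such as a Series that a Seasons-first system never recorded.

```python
# Hypothetical level ranking for an episodic hierarchy, root first.
LEVEL_ORDER = {"Series": 0, "Season": 1, "Episode": 2}


def registration_order(records):
    """Return records sorted so every parent level precedes its children."""
    return sorted(records, key=lambda rec: LEVEL_ORDER[rec["level"]])


def missing_parents(records):
    """Return parent IDs that are referenced but absent from the local data."""
    known = {rec["id"] for rec in records}
    return sorted(
        {
            rec["parent"]
            for rec in records
            if rec["parent"] is not None and rec["parent"] not in known
        }
    )


# A local system that starts its hierarchy at Seasons (made-up records).
local_records = [
    {"id": "ep-101", "level": "Episode", "parent": "season-1"},
    {"id": "season-1", "level": "Season", "parent": "series-A"},
]

print([rec["id"] for rec in registration_order(local_records)])
print(missing_parents(local_records))
```

Any ID reported by `missing_parents` marks a level that has to be matched or created on the EIDR side before its children can be processed.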
At the start of a data model alignment project, several key questions should be answered:
- Is this a one-off catalog project or the basis of an ongoing system integration?
- Can all necessary data for EIDR registration be derived from a single data source?
- If data must be combined from several sources, are they internal, or must external data be acquired to fill in any gaps? Will this require manual matching and data entry, or are there pre-existing common identifiers that bridge the different data sources?
- Will the records only be matched to existing EIDR IDs or will gap records also be registered for new EIDR IDs?
The necessary data preparations are generally more rigorous when EIDR registration is the end goal, rather than matching alone.
Data that describe and identify audiovisual content are stored in a near-infinite variety of formats. As a result, the same organization may need to perform data model alignment separately for each of its systems or acquired catalogs. Even a single catalog project may have more than one data extraction set, since it is much more efficient to process records in similar batches according to the amount and type of preparation and review required. For example, feature films are typically processed separately from episodic works. It may also be helpful to process newer records separately from older ones due to differences in data quality over time. Significant effort can be saved by splitting a project into related sets that are processed iteratively, with each pass tailored to the data set and progressively more refined until all the records have been matched and registered.
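One way to split an extraction into such batches is sketched below. The field names (`work_type`, `release_year`), the set of episodic type labels, and the cutoff year are all assumptions made for illustration. A real project would choose its own grouping criteria based on where preparation and review effort actually differ.

```python
from collections import defaultdict

# Hypothetical labels the local system uses for episodic content.
EPISODIC_TYPES = {"Series", "Season", "Episode"}


def batch_key(record, cutoff_year=2000):
    """Classify a record by kind (episodic or not) and by era."""
    kind = "episodic" if record["work_type"] in EPISODIC_TYPES else "non-episodic"
    year = record.get("release_year")
    if year is None:
        era = "unknown"
    else:
        era = "newer" if year >= cutoff_year else "older"
    return (kind, era)


def split_into_batches(records, cutoff_year=2000):
    """Group records into batches that share the same (kind, era) key."""
    batches = defaultdict(list)
    for record in records:
        batches[batch_key(record, cutoff_year)].append(record)
    return dict(batches)


# Made-up catalog rows.
catalog = [
    {"title": "Feature A", "work_type": "Movie", "release_year": 1985},
    {"title": "Feature B", "work_type": "Movie", "release_year": 2015},
    {"title": "Pilot", "work_type": "Episode", "release_year": 2015},
]

for key, batch in split_into_batches(catalog).items():
    print(key, [rec["title"] for rec in batch])
```

Each batch can then go through its own preparation pass, with the rules for later passes refined from what earlier ones revealed.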
A small-scale proof-of-concept using data representative of the records and source(s) that will be involved in the full-scale project can help refine the data selection, identify gaps in coverage, and define any transformations necessary to produce EIDR-conforming data. The EIDR Excel templates and BMR (Bulk Match and Register) tool have been developed to facilitate this process prior to the development of an automated integration between the client’s source system and the EIDR Registry.
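As a rough illustration of how a proof-of-concept sample can expose gaps in coverage, the sketch below counts how often each field of interest is empty in the sample. The `FIELDS_OF_INTEREST` tuple is a placeholder, not the EIDR list of required fields, and the function name and sample rows are hypothetical. The actual field list should come from the EIDR documentation for the record types involved.

```python
from collections import Counter

# Placeholder field list for illustration only; NOT the EIDR required fields.
FIELDS_OF_INTEREST = ("title", "release_year", "work_type")


def coverage_gaps(sample, fields=FIELDS_OF_INTEREST):
    """Count, per field, how many sample records have no usable value."""
    gaps = Counter()
    for record in sample:
        for field in fields:
            if record.get(field) in (None, ""):
                gaps[field] += 1
    return dict(gaps)


# Made-up sample rows drawn from a source system.
sample_rows = [
    {"title": "Feature A", "release_year": 1985, "work_type": "Movie"},
    {"title": "Feature B", "release_year": None, "work_type": "Movie"},
    {"title": "", "work_type": "Episode"},
]

print(coverage_gaps(sample_rows))
```

Fields with a high gap count point to data that must be sourced elsewhere or entered manually before the full-scale project begins.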