Implementation Classes
The DataInterface Class
Class in which all required OAI data retrieval actions must be implemented. An instance of this class is then passed to the OAI repository.
Attributes:
Name | Type | Description |
---|---|---|
limit | int | Max number of results to return per request for ListSets, ListIdentifiers, ListRecords |
get_identify() -> Identify
Create and return an instantiated Identify object.
Returns:
Type | Description |
---|---|
Identify | The Identify object with all properties set appropriately |
get_metadata_formats(identifier: str | None = None) -> list[MetadataFormat]
Return a list of metadata formats available for the identifier. If no identifier is passed, the list must contain every format the repository supports.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | str \| None | An identifier string | None |
Returns:
Type | Description |
---|---|
list[MetadataFormat] | A list of instantiated MetadataFormat objects with all properties set appropriately for the identifier. If identifier is None, a list of all possible MetadataFormat objects for the entire repository. |
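As a minimal sketch of the two cases described above (one identifier versus the whole repository), the following uses a local stand-in dataclass that mirrors the documented MetadataFormat attributes so it runs without oai_repo installed. The in-memory lookup tables and the example identifiers are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MetadataFormat:
    """Stand-in mirroring the documented attributes of oai_repo.MetadataFormat."""
    metadata_prefix: str
    schema: str
    metadata_namespace: str


# Hypothetical repository data: every supported format, keyed by prefix.
ALL_FORMATS = {
    "oai_dc": MetadataFormat(
        "oai_dc",
        "http://www.openarchives.org/OAI/2.0/oai_dc.xsd",
        "http://www.openarchives.org/OAI/2.0/oai_dc/",
    ),
    "mods": MetadataFormat(
        "mods",
        "http://www.loc.gov/standards/mods/v3/mods-3-7.xsd",
        "http://www.loc.gov/mods/v3",
    ),
}

# Hypothetical mapping of identifier -> prefixes that record can be served in.
RECORD_PREFIXES = {
    "oai:example.edu:1": ["oai_dc", "mods"],
    "oai:example.edu:2": ["oai_dc"],
}


def get_metadata_formats(identifier: str | None = None) -> list[MetadataFormat]:
    """Return formats for one record, or every format when identifier is None."""
    if identifier is None:
        return list(ALL_FORMATS.values())
    return [ALL_FORMATS[prefix] for prefix in RECORD_PREFIXES.get(identifier, [])]
```

In a real DataInterface this would be a method, and the lookups would go to your own datastore rather than module-level dictionaries.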
get_record_abouts(identifier: str) -> list[lxml.etree._Element]
Return a list of XML elements which will populate the <about> tags in GetRecord responses.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | str | A valid identifier string | required |
Returns:
Type | Description |
---|---|
list[_Element] | A list of lxml.etree.Elements to populate the <about> tags |
Important
oai_repo will wrap the response with an <about> tag; do not add it yourself.
Note
If you implement get_records_abouts, you may not need this method implemented. By default, get_records_abouts is the only method which calls get_record_abouts.
get_record_header(identifier: str) -> RecordHeader
Return a RecordHeader instance for the identifier.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | str | A valid identifier string | required |
Returns:
Type | Description |
---|---|
RecordHeader | The RecordHeader object with all properties set appropriately. |
Note
If you implement get_records_header, you may not need this method implemented. By default, get_records_header is the only method which calls get_record_header.
get_record_metadata(identifier: str, metadataprefix: str) -> lxml.etree._Element | None
Return an lxml.etree.Element representing the root element of the metadata found for the given prefix.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | str | A valid identifier string | required |
metadataprefix | str | A metadata prefix | required |
Returns:
Type | Description |
---|---|
_Element \| None | The lxml.etree.Element for the requested record metadata, or None if the record has no metadata for the provided prefix. |
Important
oai_repo will wrap the response with a <metadata> tag; do not add it yourself.
Note
If you implement get_records_metadata, you may not need this method implemented. By default, get_records_metadata is the only method which calls get_record_metadata.
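The wrapping rule above means the returned element should be the root of the metadata itself, with no enclosing <metadata> tag. A minimal sketch of that shape follows. To stay dependency-free it uses the standard library's xml.etree.ElementTree as a stand-in; in a real DataInterface you would build the element with lxml.etree, whose Element and SubElement calls are used the same way here. The TITLES lookup and the identifiers are hypothetical.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET  # stand-in for lxml.etree in this sketch

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical datastore: identifier -> title.
TITLES = {"oai:example.edu:1": "First record"}


def get_record_metadata(identifier: str, metadataprefix: str) -> ET.Element | None:
    """Return the metadata root element, or None if the prefix is unavailable."""
    if metadataprefix != "oai_dc" or identifier not in TITLES:
        return None
    # The root is <oai_dc:dc>; it is NOT wrapped in <metadata> here.
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    title = ET.SubElement(root, f"{{{DC_NS}}}title")
    title.text = TITLES[identifier]
    return root
```

Returning None rather than raising keeps the "no metadata for this prefix" case distinct from a genuine failure in your datastore.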
get_records_abouts(identifiers: list[str]) -> list[list[lxml.etree._Element]]
Return a list of XML elements which will populate the <about> tags in GetRecord responses.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifiers | list | A list of valid identifier strings | required |
Returns:
Type | Description |
---|---|
list[list[_Element]] | A list of lists, each being the lxml.etree.Elements to populate the record in the first list. |
Important
oai_repo will wrap each response with an <about> tag; do not add them yourself.
Note
Implementing this function in your DataInterface is optional. You may want to implement a custom version if pulling record metadata is individually slow and could be accomplished faster in bulk.
get_records_header(identifiers: list[str]) -> list[RecordHeader]
Return a list of RecordHeader instances for the identifiers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifiers | list | A list of valid identifier strings | required |
Returns:
Type | Description |
---|---|
list[RecordHeader] | A list of the RecordHeader objects with all properties set appropriately. |
Note
Implementing this function in your DataInterface is optional. You may want to implement a custom version if pulling record headers is individually slow and could be accomplished faster in bulk.
get_records_metadata(identifiers: list[str], metadataprefix: str) -> list[lxml.etree._Element | None]
Return a list of lxml.etree.Element objects representing the root elements of the metadata found for the requested prefix and identifiers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifiers | list | A list of valid identifier strings | required |
metadataprefix | str | A metadata prefix | required |
Returns:
Type | Description |
---|---|
list[_Element \| None] | A list containing the lxml.etree.Element for each requested record's metadata, or None for records which have no metadata for the provided prefix. |
Note
Implementing this function in your DataInterface is optional. You may want to implement a custom version if pulling record metadata is individually slow and could be accomplished faster in bulk.
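To illustrate why a bulk override can help, here is a rough sketch comparing a per-record loop with a single bulk lookup. Both classes are hypothetical stand-ins rather than oai_repo code: the looping get_records_metadata only approximates the default delegation described in the notes above, plain strings stand in for lxml elements, and the lookups counter is an assumption used to make the number of backend round trips visible.

```python
from __future__ import annotations


class PerRecordInterface:
    """Hypothetical stand-in; a real class would subclass oai_repo.DataInterface."""

    def __init__(self, store: dict[tuple[str, str], str]):
        self.store = store  # (identifier, prefix) -> metadata (strings stand in for elements)
        self.lookups = 0    # counts simulated backend round trips

    def get_record_metadata(self, identifier: str, metadataprefix: str) -> str | None:
        self.lookups += 1  # one round trip per record
        return self.store.get((identifier, metadataprefix))

    def get_records_metadata(self, identifiers: list[str], metadataprefix: str) -> list[str | None]:
        # Assumed default behaviour: call the singular method once per identifier.
        return [self.get_record_metadata(i, metadataprefix) for i in identifiers]


class BulkInterface(PerRecordInterface):
    def get_records_metadata(self, identifiers: list[str], metadataprefix: str) -> list[str | None]:
        self.lookups += 1  # a single round trip fetches everything
        wanted = set(identifiers)
        found = {
            ident: value
            for (ident, prefix), value in self.store.items()
            if prefix == metadataprefix and ident in wanted
        }
        # Preserve input order and use None where a record has no metadata.
        return [found.get(ident) for ident in identifiers]
```

Whichever version you use, the result list stays aligned with the input identifiers, with None marking records that lack the requested prefix.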
get_set(setspec: str) -> Set
Return an instantiated OAI Set object for the provided setSpec string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
setspec | str | A setSpec string | required |
Returns:
Type | Description |
---|---|
Set | The Set object with all properties set appropriately, or None if the setspec is not valid or does not exist. |
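A small sketch of the "Set or None" contract described above. The Set dataclass here is a local stand-in that mirrors the documented spec, name, and description attributes, and the SET_NAMES table is a hypothetical datastore; neither is taken from oai_repo itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Set:
    """Stand-in mirroring the documented attributes of oai_repo.Set."""
    spec: str
    name: str
    description: list = field(default_factory=list)


# Hypothetical datastore: setSpec -> human-readable name.
SET_NAMES = {"maps": "Historic maps", "maps:state": "State maps"}


def get_set(setspec: str) -> Set | None:
    """Return the Set for a known setSpec, or None if it does not exist."""
    name = SET_NAMES.get(setspec)
    if name is None:
        return None
    return Set(spec=setspec, name=name)
```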
is_valid_identifier(identifier: str) -> bool
Determine whether an identifier string has a valid format and exists.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | str | A string to check for being an identifier | required |
Returns:
Type | Description |
---|---|
bool | True if the given string is an identifier that exists. |
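The check has two parts: the string must look like an identifier, and the record must exist. A minimal sketch follows, assuming a hypothetical identifier scheme of the form oai:example.edu:<number> and a hypothetical set of known record numbers; your own repository's scheme and lookup will differ.

```python
import re

# Hypothetical identifier scheme for this sketch: oai:example.edu:<number>
IDENTIFIER_RE = re.compile(r"oai:example\.edu:(\d+)")

# Hypothetical set of record numbers that exist in the datastore.
KNOWN_RECORD_IDS = {1, 2, 3}


def is_valid_identifier(identifier: str) -> bool:
    """True only if the string matches the scheme AND the record exists."""
    match = IDENTIFIER_RE.fullmatch(identifier)
    if match is None:
        return False
    return int(match.group(1)) in KNOWN_RECORD_IDS
```

Checking the format first avoids sending malformed strings to the datastore at all.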
list_identifiers(metadataprefix: str, filter_from: datetime = None, filter_until: datetime = None, filter_set: str = None, cursor: int = 0) -> tuple
Return valid identifier strings, filtered appropriately to passed parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
metadataprefix | str | The metadata prefix to match. | required |
filter_from | datetime | Include only identifiers on or after given datetime. | None |
filter_until | datetime | Include only identifiers on or before given datetime. | None |
filter_set | str | Include only identifiers within the matching setSpec string. | None |
cursor | int | Position in results to start retrieving from | 0 |
Returns:
Type | Description |
---|---|
tuple | A tuple of length 3: |
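The filter parameters and cursor can be sketched as below. The three items of the returned tuple are not spelled out above, so this sketch deliberately stops short of building that tuple: the hypothetical helper select_identifiers returns only the current page and the total match count. The RECORDS list, its layout, and passing limit as a parameter (rather than reading the DataInterface limit attribute) are all assumptions for illustration.

```python
from __future__ import annotations

from datetime import datetime, timezone

# Hypothetical datastore rows: (identifier, datestamp, setspecs)
RECORDS = [
    ("oai:example.edu:1", datetime(2023, 1, 10, tzinfo=timezone.utc), ["maps"]),
    ("oai:example.edu:2", datetime(2023, 6, 1, tzinfo=timezone.utc), ["maps", "photos"]),
    ("oai:example.edu:3", datetime(2024, 2, 20, tzinfo=timezone.utc), ["photos"]),
]


def select_identifiers(
    filter_from: datetime | None = None,
    filter_until: datetime | None = None,
    filter_set: str | None = None,
    cursor: int = 0,
    limit: int = 2,
) -> tuple[list[str], int]:
    """Apply the inclusive date and set filters, then slice one page at the cursor."""
    matches = [
        ident
        for ident, stamp, setspecs in RECORDS
        if (filter_from is None or stamp >= filter_from)
        and (filter_until is None or stamp <= filter_until)
        and (filter_set is None or filter_set in setspecs)
    ]
    # cursor is the offset into the full result set; limit caps the page size.
    return matches[cursor:cursor + limit], len(matches)
```

A real implementation would usually push the filters and the offset into the datastore query instead of filtering in Python.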
list_set_specs(identifier: str = None, cursor: int = 0) -> tuple
Return a list of setSpec strings for the given identifier if one is provided, or the list of all valid setSpec strings for the repository if identifier is None.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | str | A valid identifier string | None |
cursor | int | Position in results to start from | 0 |
Returns:
Type | Description |
---|---|
tuple | A tuple of length 3: |
Classes Returned by DataInterface Methods
Identify
dataclass
The info needed for the Identify verb. In your DataInterface.get_identify() method, create an instance of this class, set appropriate data, and return it.
Attributes:
Name | Type | Description |
---|---|---|
repository_name | str | The name of the OAI repository |
base_url | str | The base URL for this repository |
admin_email | list | A list of email addresses; cannot be empty |
earliest_datestamp | str \| datetime | A string in the granularity format or a datetime object |
deleted_record | str | OAI deleted record value, one of no, persistent, transient |
granularity | str | OAI granularity, either YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ |
compression | list | Compression to be available (typically left empty) |
description | list | Can be bytes data or a pre-loaded lxml Element |
Examples:
ident = oai_repo.Identify()
ident.repository_name = "My Repo"
ident.base_url = "https://example.edu/oai"
ident.deleted_record = "no"
ident.granularity = "YYYY-MM-DDThh:mm:ssZ"
ident.compression = []
... # remaining attributes
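Since earliest_datestamp may be given as a string in the granularity format, it can help to see what that string looks like. The sketch below formats a timezone-aware datetime for each of the two granularities defined by OAI-PMH. The helper name format_datestamp is hypothetical and not part of oai_repo; passing a datetime object directly, as the attribute table allows, avoids the need for it.

```python
from datetime import datetime, timezone

# strftime patterns for the two granularities defined by OAI-PMH.
GRANULARITY_FORMATS = {
    "YYYY-MM-DD": "%Y-%m-%d",
    "YYYY-MM-DDThh:mm:ssZ": "%Y-%m-%dT%H:%M:%SZ",
}


def format_datestamp(value: datetime, granularity: str) -> str:
    """Format a timezone-aware datetime as a UTC string in the given granularity."""
    if value.tzinfo is None:
        # Naive datetimes are ambiguous, so this sketch refuses them.
        raise ValueError("expected a timezone-aware datetime")
    return value.astimezone(timezone.utc).strftime(GRANULARITY_FORMATS[granularity])
```

For example, a datetime of 2 January 2020, 03:04:05 UTC becomes "2020-01-02T03:04:05Z" at second granularity and "2020-01-02" at day granularity.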
MetadataFormat
dataclass
Class to define fields necessary for an OAI metadata format. Your definition of the DataInterface.get_metadata_formats() method should return a list of these.
Attributes:
Name | Type | Description |
---|---|---|
metadata_prefix | str | A metadataPrefix string |
schema | str | The schema for the metadata |
metadata_namespace | str | The namespace for the metadata |
Examples:
mdf = oai_repo.MetadataFormat(
"oai_dc",
"http://www.openarchives.org/OAI/2.0/oai_dc.xsd",
"http://www.openarchives.org/OAI/2.0/oai_dc/"
)
RecordHeader
dataclass
Class to define a record header for an identifier. Your definition of the DataInterface.get_record_header() method should return one of these.
Attributes:
Name | Type | Description |
---|---|---|
identifier | str | The OAI identifier |
datestamp | str \| datetime | The datestamp for when this record was created or last modified |
setspecs | list[str] | A list of setspec strings this record is part of |
status | str | The optional OAI status |
Set
dataclass
Class to define fields for an OAI set. Your definition of the DataInterface.get_set() method should return one of these.
Attributes:
Name | Type | Description |
---|---|---|
spec | str | The setspec string |
name | str | The name associated with the setspec |
description | list | A list of lxml.etree.Elements to populate the <setDescription> tags |