Utils
The utils module provides utility functions for handling XML data in the context of OAI-PMH services.
This module includes functions essential for parsing and transforming XML data obtained from OAI-PMH responses. These utilities facilitate the extraction of namespaces and conversion of XML elements into more accessible data structures.
Functions:
Name | Description |
---|---|
log_response | Log the details of an HTTP response. |
remove_none_values | Remove keys from the dictionary where the value is None. |
filter_dict_except_resumption_token | Filter keys from the dictionary if the resumption token is not None. |
get_namespace | Extract the namespace from an XML element. |
xml_to_dict | Convert an XML tree or element into a dictionary representation. |
filter_dict_except_resumption_token(d)
Filter out keys with None values from a dictionary, with special handling for 'resumptionToken'.
If 'resumptionToken' is present and not None, and there are other non-None keys, log a warning and retain only 'resumptionToken' and 'verb' keys. Otherwise, return a dictionary excluding any keys with None values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
d | dict[str, Any \| None] | The dictionary to filter. | required |
Returns:
Type | Description |
---|---|
dict[str, Any] | The filtered dictionary: entries with None values are removed, and only 'resumptionToken' and 'verb' are kept when a resumption token is set alongside other arguments. |
Source code in src/oaipmh_scythe/utils.py
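The filtering rules described above can be illustrated with a minimal sketch. It is not the code from src/oaipmh_scythe/utils.py: the name filter_params_sketch, the logger, and the warning text are hypothetical, and the sketch assumes that 'verb' does not count as one of the "other" keys that trigger the warning.

```python
from __future__ import annotations

import logging
from typing import Any

logger = logging.getLogger(__name__)


def filter_params_sketch(d: dict[str, Any | None]) -> dict[str, Any]:
    """Hypothetical re-implementation of the documented filtering rules."""
    # Step 1: drop every key whose value is None.
    filtered = {key: value for key, value in d.items() if value is not None}
    # Step 2: a resumption token set alongside other arguments wins;
    # only the token and the verb survive (assumption: 'verb' is always allowed).
    if "resumptionToken" in filtered:
        extra = set(filtered) - {"resumptionToken", "verb"}
        if extra:
            logger.warning("Ignoring %s because resumptionToken is set", sorted(extra))
            filtered = {
                key: value
                for key, value in filtered.items()
                if key in ("resumptionToken", "verb")
            }
    return filtered


with_token = filter_params_sketch(
    {"verb": "ListRecords", "metadataPrefix": "oai_dc", "set": None, "resumptionToken": "abc123"}
)
without_token = filter_params_sketch(
    {"verb": "ListRecords", "metadataPrefix": "oai_dc", "set": None}
)
```

In this sketch, with_token keeps only 'verb' and 'resumptionToken', while without_token keeps 'verb' and 'metadataPrefix' and merely loses the None-valued 'set' entry.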
get_namespace(element)
Return the namespace URI of an XML element.
Extracts and returns the namespace URI from the tag of the given XML element.
The namespace URI is enclosed in curly braces at the start of the tag.
If the element does not have a namespace, None is returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
element | _Element | The XML element from which to extract the namespace. | required |
Returns:
Type | Description |
---|---|
str \| None | The namespace URI as a string if the element has a namespace, otherwise None. |
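To make the "curly braces at the start of the tag" convention concrete, here is a small sketch that uses the standard library's xml.etree.ElementTree instead of the _Element type listed above. The helper name namespace_of is hypothetical, and the sketch assumes the braces themselves are stripped from the result; this page does not say whether the library function keeps them.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET


def namespace_of(element: ET.Element) -> str | None:
    """Hypothetical helper: read the namespace URI out of a '{uri}local' tag."""
    tag = element.tag
    # Namespaced tags are stored as '{namespace-uri}local-name'.
    if isinstance(tag, str) and tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


oai_root = ET.fromstring(
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><request/></OAI-PMH>'
)
plain_root = ET.fromstring("<record/>")

oai_namespace = namespace_of(oai_root)
plain_namespace = namespace_of(plain_root)
```

Here oai_namespace holds the OAI-PMH namespace URI, and plain_namespace is None because the tag carries no namespace.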
log_response(response)
Log the details of an HTTP response.
This function logs the HTTP method, URL, and status code of the response for debugging purposes. It uses the 'debug' logging level to provide detailed diagnostic information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
response | Response | The response object received from an HTTP request. | required |
Returns:
Type | Description |
---|---|
None | Nothing is returned; the function only writes a log entry. |
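The logging behaviour can be sketched without an HTTP client. Everything below is an assumption made for illustration: the stand-in classes FakeRequest and FakeResponse, the logger name, and the message format are hypothetical and are not taken from the library.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("oai_example")  # hypothetical logger name


@dataclass
class FakeRequest:
    """Stand-in for the request attached to a response."""
    method: str
    url: str


@dataclass
class FakeResponse:
    """Stand-in for an HTTP response object."""
    request: FakeRequest
    status_code: int


def log_response_sketch(response: FakeResponse) -> None:
    # Emit method, URL and status code at DEBUG level, as the description states.
    logger.debug(
        "%s %s - status %s",
        response.request.method,
        response.request.url,
        response.status_code,
    )
```

Because the message is emitted at the debug level, it only shows up when the logger (or the root logger) is configured with logging.DEBUG.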
remove_none_values(d)
Remove keys from the dictionary where the value is None.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
d | dict[str, Any \| None] | The input dictionary. | required |
Returns:
Type | Description |
---|---|
dict[str, Any] | A new dictionary containing only the entries of the input dictionary whose value is not None. |
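As a rough illustration of this behaviour, the same effect can be obtained with a dictionary comprehension. The name drop_none is hypothetical and the snippet is a sketch of the documented contract, not the library's source.

```python
from __future__ import annotations

from typing import Any


def drop_none(d: dict[str, Any | None]) -> dict[str, Any]:
    """Hypothetical equivalent: build a new dict without None-valued entries."""
    return {key: value for key, value in d.items() if value is not None}


params = {"verb": "ListRecords", "metadataPrefix": "oai_dc", "from": None, "until": None}
cleaned = drop_none(params)
```

The input dictionary is left untouched, and only None is filtered: other falsy values such as 0 or an empty string are kept by this sketch.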
xml_to_dict(tree, paths=None, nsmap=None, strip_ns=False)
Convert an XML tree to a dictionary, with options for custom XPath and namespace handling.
This function takes an XML element tree and converts it into a dictionary. The keys of the dictionary are the tags of the XML elements, and the values are lists of the text contents of these elements. It offers options to apply specific XPath expressions, handle namespaces, and optionally strip namespaces from the tags in the resulting dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree | _Element | The root element of the XML tree to be converted. | required |
paths | list[str] \| None | An optional list of XPath expressions to apply on the XML tree. If None or not provided, the function will consider all elements in the tree. | None |
nsmap | dict[str, str] \| None | An optional dictionary for namespace mapping, used to provide shorter, more readable paths in XPath expressions. If None or not provided, no namespace mapping is applied. | None |
strip_ns | bool | A boolean flag indicating whether to remove namespaces from the element tags in the resulting dictionary. Defaults to False. | False |
Returns:
Type | Description |
---|---|
dict[str, list[str \| None]] | A dictionary where each key is an element tag (with or without namespace, based on strip_ns) and each value is a list of the text contents of each element with that tag. |
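The conversion described above can be approximated with the standard library. The sketch below is an assumption-laden stand-in, not the library's implementation: the name xml_to_dict_sketch is hypothetical, it uses xml.etree.ElementTree rather than the _Element type listed in the parameters, and it assumes ".//*" as the default path for "all elements in the tree".

```python
from __future__ import annotations

import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def xml_to_dict_sketch(
    tree: ET.Element,
    paths: list[str] | None = None,
    nsmap: dict[str, str] | None = None,
    strip_ns: bool = False,
) -> dict[str, list[str | None]]:
    """Hypothetical sketch: group element text by tag."""
    paths = paths or [".//*"]  # assumed default: every descendant element
    fields: dict[str, list[str | None]] = defaultdict(list)
    for path in paths:
        for element in tree.findall(path, nsmap):
            tag = element.tag
            if strip_ns:
                # Remove the leading '{namespace-uri}' part of the tag.
                tag = re.sub(r"^\{[^}]*\}", "", tag)
            fields[tag].append(element.text)
    return dict(fields)


DC = "http://purl.org/dc/elements/1.1/"
record = ET.fromstring(
    f'<metadata xmlns:dc="{DC}">'
    "<dc:title>First title</dc:title>"
    "<dc:title>Second title</dc:title>"
    "<dc:creator>Doe, Jane</dc:creator>"
    "<dc:subject/>"
    "</metadata>"
)

everything = xml_to_dict_sketch(record, strip_ns=True)
titles_only = xml_to_dict_sketch(record, paths=["dc:title"], nsmap={"dc": DC})
```

In this sketch, everything maps 'title' to both titles, 'creator' to one name, and 'subject' to [None] because the element has no text. titles_only keeps the namespaced key since strip_ns is left at False.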