Sandhill Data Processors
Sandhill routes are composed of a list of data processors. These are single actions that Sandhill may take while processing a request.
Things data processors can do:
- Gathering data by querying an API
- Loading configuration from a file
- Transforming or manipulating data
- Performing some evaluation or computation
If the data processors provided with Sandhill are not sufficient, you can develop your own data processor as well.
Data Processors Included With Sandhill
- evaluate - Evaluate a set of conditions and return a truthy result.
- file - Find and load files from the instance.
- iiif - Calls related to IIIF APIs.
- request - Do generic API calls and redirects.
- solr - Calls to a Solr endpoint.
- stream - Stream data to client from a previously open connection.
- string - Simple string manipulation.
- template - Render files or strings through Jinja templating.
- xml - Load XML or perform XPath queries.
Common Data Processor Arguments
These arguments are valid to pass to all data processors. Data processors should be written to handle these arguments appropriately.
name
- Required
Defines the label under which the data processor will run. Results from the processor will be stored under this key in the data passed to subsequent processors.
processor
- Required
Specifies the processor and method to call within the processor, period delimited.
on_fail
- Optional
Unless specified, the data processor is allowed to fail silently and proceed onto the next processor.
When specified, the value must be the integer of a valid 4xx or 5xx HTTP status code, or 0.
If the data processor fails and on_fail is set, Sandhill will abort the page request and return an error page with the selected code. If set to 0, the processor may choose a code appropriate to the type of failure.
when
- Optional
A string which is first rendered through Jinja and then evaluated for truth. If the value is not truthy, then the given data processor will be skipped.
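To make the common arguments concrete, here is a minimal sketch; it is not Sandhill's actual implementation. The `split_processor` and `is_valid_on_fail` helpers, the `record` label, and the `when` expression are illustrative assumptions. Only the four argument names and the rules they follow come from the descriptions above.

```python
from http import HTTPStatus


def split_processor(value):
    """Split a period-delimited processor string into (processor, method)."""
    processor, _, method = value.rpartition(".")
    if not processor or not method:
        raise ValueError(f"expected 'processor.method', got {value!r}")
    return processor, method


def is_valid_on_fail(value):
    """Check the documented rule: an integer that is 0 or a valid 4xx/5xx code."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        return False
    if value == 0:
        return True
    try:
        HTTPStatus(value)  # raises ValueError for unknown status codes
    except ValueError:
        return False
    return 400 <= value <= 599


# A hypothetical data processor entry using all four common arguments.
entry = {
    "name": "record",
    "processor": "solr.select_record",
    "on_fail": 404,
    "when": "{{ view_args.id is defined }}",
}

print(split_processor(entry["processor"]))
print(is_valid_on_fail(entry["on_fail"]))
```

Checking entries this way before a route runs is one option for catching a mistyped processor string or an out-of-range on_fail value early.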
sandhill.processors.evaluate
Processor for evaluation functions
conditions(data)
Evaluates the conditions specified in the processor section of the configs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
bool \| None | True if the given conditions match according to the parameters, False if they do not, or None on failure. |
Raises:
HTTPException: If abort_on_match is true and the evaluation is truthy.
Source code in sandhill/processors/evaluate.py
sandhill.processors.file
Processing functions for files
create_json_response(data)
Wrapper for load_json that will return a JSON response object.
This can be used to stream JSON instead of loading it to use it as data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
Response | The response object with the JSON data loaded into it. |
Source code in sandhill/processors/file.py
load_json(data)
Search for files at the paths within 'path' and 'paths' keys of data. Will load JSON from the first file it finds and then return the result.
If both 'path' and 'paths' are set, paths from both will be searched starting with 'path' first.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
dict \| None | The loaded JSON data or None if no file was found. |
Note:
Paths must be relative to the instance/ directory.
Source code in sandhill/processors/file.py
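The search order described for load_json can be sketched with the standard library alone. This is an illustration under assumptions, not the code in sandhill/processors/file.py: the `first_json` helper and its explicit `instance_dir` parameter are hypothetical, and `path` is assumed to hold a single string while `paths` holds a list of strings.

```python
import json
import os
import tempfile


def first_json(data, instance_dir):
    """Return JSON from the first existing candidate file, or None if none exist."""
    candidates = []
    if "path" in data:
        candidates.append(data["path"])       # 'path' is searched first
    candidates.extend(data.get("paths", []))  # then each entry of 'paths'
    for rel_path in candidates:
        full_path = os.path.join(instance_dir, rel_path)
        if os.path.isfile(full_path):
            with open(full_path, encoding="utf-8") as handle:
                return json.load(handle)
    return None


# Demonstration with a throwaway directory standing in for instance/.
with tempfile.TemporaryDirectory() as instance_dir:
    with open(os.path.join(instance_dir, "b.json"), "w", encoding="utf-8") as handle:
        json.dump({"source": "b"}, handle)
    found = first_json({"path": "a.json", "paths": ["b.json"]}, instance_dir)
    missing = first_json({"path": "a.json"}, instance_dir)

print(found)
print(missing)
```

Here `a.json` does not exist, so the first call falls through to `b.json`, while the second call has no fallback and yields None.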
load_matched_json(data)
Loads all the config files and returns the file that has the most matched conditions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
dict \| None | The loaded JSON data from the file that most matched its conditions, or None if no files matched. |
Source code in sandhill/processors/file.py
sandhill.processors.iiif
Processor for IIIF
load_image(data, url=None, api_get_function=api_get)
Load and return a IIIF image.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | route data where | required |
url | str | Override the IIIF server URL from the default IIIF_BASE in the configs | None |
api_get_function | function | function to use when making the GET request | api_get |
Returns:
Type | Description |
---|---|
Response \| None | Requested image from IIIF, or None on failure. |
Raises:
Type | Description |
---|---|
HTTPException | On failure if |
Source code in sandhill/processors/iiif.py
sandhill.processors.request
Processor for requests
api_json(data)
Make a call to an API and return the response content as JSON.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
dict | The JSON response from the API call. |
Raises:
Type | Description |
---|---|
HTTPException | On failure if |
Source code in sandhill/processors/request.py
redirect(data)
Trigger a redirect response to specified url.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
* | `location` _str_ | URL to redirect client to. | required |
* | `code` _int, optional_ | HTTP status code to redirect with. Default: 302 | 302 |
Returns:
Type | Description |
---|---|
Response | The flask response object with the included redirect. |
Raises:
Type | Description |
---|---|
HTTPException | If the |
Source code in sandhill/processors/request.py
sandhill.processors.solr
Wrappers for making API calls to a Solr node.
search(data, url=None, api_get_function=api_get)
Perform a configured Solr search and return the result.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
url | str | Overrides the default SOLR_URL normally retrieved from the Sandhill config file. | None |
api_get_function | function | Function used to call Solr with. Used in unit tests. | api_get |
Returns:
Type | Description |
---|---|
dict \| Response | A dict of the loaded JSON response, or a |
Source code in sandhill/processors/solr.py
select(data, url=None, api_get_function=api_get)
Perform a Solr select call and return the loaded JSON response.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
url | str | Overrides the default SOLR_URL normally retrieved from the Sandhill config file. | None |
api_get_function | function | Function used to call Solr with. Used in unit tests. | api_get |
Returns:
Type | Description |
---|---|
dict \| None | The loaded JSON data or None if nothing matched. |
Raises:
Type | Description |
---|---|
HTTPException | If |
Source code in sandhill/processors/solr.py
select_record(data, url=None, api_get_function=api_get)
Perform a Solr select call and return the first result from the response.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
url | str | Overrides the default SOLR_URL normally retrieved from the Sandhill config file. | None |
api_get_function | function | Function used to call Solr with. Used in unit tests. | api_get |
Returns:
Type | Description |
---|---|
Any | The first item matched by |
Raises:
Type | Description |
---|---|
HTTPException | If |
Source code in sandhill/processors/solr.py
sandhill.processors.stream
Processor for streaming data
response(data)
Stream a Requests library response that was previously loaded.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
Response \| None | A stream of the response |
Raises:
Type | Description |
---|---|
HTTPException | If |
Source code in sandhill/processors/stream.py
string(data)
Stream a data variable as string data to the output.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
Response \| None | A stream of the response |
Source code in sandhill/processors/stream.py
sandhill.processors.string
Processor for string functions
replace(data)
For the given name in data, replace all occurrences of an old string with a new string and return the result.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
str \| Response \| None | The same type as |
Source code in sandhill/processors/string.py
sandhill.processors.template
Processor for rendering templates
render(data)
Render the response as a template or directly as a Flask Response.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
Response | The rendered template in a Flask response. |
Raises:
Type | Description |
---|---|
HTTPException | If |
Source code in sandhill/processors/template.py
render_string(data)
Given a Jinja2 template string, it will render that template to a string and set it in the name variable.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
str \| None | The rendered template string, or None if no |
Source code in sandhill/processors/template.py
sandhill.processors.xml
XML Data Processors
load(data: dict) -> etree._Element
Load an XML document.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
_Element \| None | The loaded XML object tree, or None if |
Source code in sandhill/processors/xml.py
xpath(data: dict) -> list
Retrieve the matching xpath content from an XML source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
list | Matching results from XPath query, or None if any required keys are not in data. |
Source code in sandhill/processors/xml.py
xpath_by_id(data: dict) -> dict
For the matching xpath content, organize the results into a dict keyed by the id attribute of the matched tags. Elements without an id attribute will not be returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | Processor arguments and all other data loaded from previous data processors. | required |
Returns:
Type | Description |
---|---|
dict | Dict mapping with keys of id, and values of content within matching elements, or None if missing any required keys in data. |
Source code in sandhill/processors/xml.py
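The grouping that xpath_by_id describes can be illustrated with a small stand-alone sketch. It uses the standard library's xml.etree.ElementTree in place of the etree library referenced in the signatures above, supports only the limited XPath subset that ElementTree offers, and assumes the "content" of an element is its text. The `by_id` helper and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<doc>
  <item id="a">Alpha</item>
  <item>No id here</item>
  <item id="b">Beta</item>
</doc>"""


def by_id(root, query):
    """Map each matched element's id attribute to its text content."""
    results = {}
    for element in root.findall(query):
        element_id = element.get("id")
        if element_id is None:
            continue  # elements without an id attribute are left out
        results[element_id] = element.text
    return results


root = ET.fromstring(XML)
grouped = by_id(root, ".//item")
print(grouped)
```

The middle item matches the query but has no id attribute, so it does not appear in the resulting dict.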
Developing a Data Processor
Developing your own data processors in Sandhill is straightforward, and is perhaps best explained with a simple example.
Simple Processor
Within your instance/ directory, ensure there is a processors/ sub-directory; if not, create it.
Next, create a new Python file in instance/processors/; we'll call our example file myproc.py (the name of the file is up to you). Then create a function in that file which must accept a single parameter, data.
# instance/processors/myproc.py
"""The myproc data processors"""
def shout(data):
    """The shout data processor; will upper case all text and add an exclamation point."""
    ...
The data here is a dict containing all data loaded for the route up until this point. If previous data processors loaded anything, it will be present in data. Sandhill always includes the standard view_args key, which contains any route variables. All arguments set for this data processor call will also be in data.
For our shout() processor, let's say we expect a key words, which will contain the data we want to transform with our processor.
def shout(data):
    """The shout data processor; will upper case all text and add an exclamation point."""
    return data["words"].upper() + "!"
That's mostly it! Now we could include our custom data processor in a route with this entry in our route's JSON data list:
And after the data processor runs, Sandhill will have the following in your route's data dict:
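As a minimal sketch of both steps, the snippet below models the route entry and the resulting data as plain Python dicts. The `loud` label, the `myproc.shout` processor string, and the sample words are assumptions for illustration, and the merging shown here only imitates the behavior described in this guide (arguments passed in via data, result stored under name); it is not Sandhill's own routing code.

```python
def shout(data):
    """Upper case the text under the `words` key and add an exclamation point."""
    return data["words"].upper() + "!"


# Hypothetical route entry; the "loud" label and the words value are made up.
entry = {
    "name": "loud",
    "processor": "myproc.shout",
    "words": "hello sandhill",
}

# Data gathered so far for the route; view_args holds any route variables.
route_data = {"view_args": {}}

# The processor receives the route data plus its own arguments...
result = shout({**route_data, **entry})
# ...and its result is stored under the entry's name for later processors.
route_data[entry["name"]] = result

print(route_data)
```

Later processors and templates in the same route could then read the transformed text from the `loud` key.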
Improving your Processor
But what if someone fails to pass in the words key? Right now that would result in a KeyError.
In Sandhill, best practice for data processors is to return None on most failures, unless the on_fail key is set in data. In that case, we ought to abort with the value of on_fail.
To assist with this, Sandhill provides the dp_abort() function (short for "data processor abort"), which will do most of the heavy lifting for you. Let's rework our function to handle failures.
from sandhill.utils.error_handling import dp_abort

def shout(data):
    """The shout data processor; will upper case all text and add an exclamation point."""
    if "words" not in data:
        # Here we choose HTTP status 500 for default, but `on_fail` value will take precedence.
        dp_abort(500)
        # If no `on_fail` is set, None indicates failure, so always return None after a dp_abort().
        return None
    return data["words"].upper() + "!"
With that, you have a nicely functioning data processor! For more advanced examples, feel free to peek at the source code of the built-in Sandhill data processors above.