Task, Process and Workflow
General
Caliber provides functionality for running Python functions in a structured manner. Its three main classes, Task, Process and Workflow, are designed to let you build a hierarchy around your Python functions.
Task
The Task class is a wrapper for your Python function. When you instantiate a task, you point to the function that the task should wrap, and you can give the task a meaningful name. You also provide the list of arguments and the dict of keyword arguments needed for the function call. The task therefore holds both the function to be called and the arguments and keyword arguments to call it with.
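The idea of bundling a function with its call arguments can be sketched in plain Python. The class below is a hypothetical stand-in: `SketchTask` and `greet` are illustrative names that are not part of Caliber, and Caliber's actual implementation may well differ. It only shows how a function, a list of arguments and a dict of keyword arguments can be stored together and applied later.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class SketchTask:
    """Hypothetical stand-in for a task; this is NOT Caliber's Task class."""

    function: Callable[..., Any]
    name: str = ''
    args: list = field(default_factory=list)
    kwargs: dict = field(default_factory=dict)

    def run(self) -> Any:
        # Call the wrapped function with the stored arguments.
        return self.function(*self.args, **self.kwargs)


def greet(greeting, who='world'):
    """Illustrative function to wrap."""
    return f'{greeting}, {who}!'


task = SketchTask(
    function=greet,
    name='Greet',
    args=['Hello'],
    kwargs={'who': 'Caliber'},
)
result = task.run()
print(result)
```

Because the arguments live on the task object, the call can be deferred until whoever owns the task decides to run it.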
Put your functions in modules
Do you have a set of functions that you want to use in your tasks? Why not collect them in a module and import them in your workflow definition? Check out the example below to see this in action.
Caliber is software agnostic
Caliber implements classes for creating and running workflows and comes with related convenience functionality. In many cases, the tasks in a workflow call external software. The interfaces to such software are not part of Caliber and must be implemented elsewhere. This is why we say that Caliber is software agnostic.
Process
The Process class is a wrapper for your tasks. It takes a list of tasks as input and you can give it a meaningful name. The order of the tasks in the input list decides the order in which the tasks are run.
Under the hood of the process
A Process instance uses the networkx package to construct a directed acyclic graph with all the tasks as its nodes. When the .run() method of the workflow is called, the first task of the process is found, and its .run() method is called. When the task has completed successfully, the next tasks are found and run, until the end of the process is encountered.
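The traversal described above can be illustrated with a small sketch. To stay within the standard library, it uses `graphlib.TopologicalSorter` (Python 3.9 or later) rather than networkx, so it is an analogy and not Caliber's implementation. The task names are placeholders, and each task is assumed to depend only on the one listed before it.

```python
from graphlib import TopologicalSorter

# Placeholder task names; in a linear process each task follows the previous one.
task_names = ['Make dough', 'Rest', 'Roll and shape']

# Map each task to the set of tasks that must finish before it can start.
predecessors = {task_names[0]: set()}
for previous, current in zip(task_names, task_names[1:]):
    predecessors[current] = {previous}

sorter = TopologicalSorter(predecessors)
sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle

run_order = []
while sorter.is_active():
    for name in sorter.get_ready():  # tasks whose predecessors are all done
        run_order.append(name)       # a real runner would call the task here
        sorter.done(name)            # marks completion and unlocks successors

print(run_order)
```

Representing the process as a graph instead of a plain list is what makes it possible for one task to be followed by several next tasks.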
Workflow
The Workflow class is a wrapper for your process. When you instantiate a workflow, you point to the process the workflow should contain. Furthermore, you can give the workflow a meaningful name.
Attach your files
If you have a set of input files that are necessary for the workflow to run properly, the names of these files can be listed as the keyword argument attach_files. Doing this is not strictly necessary for the workflow to run, but is necessary if you use a remote workflow queue, or if you plan to use transports to store your data, input files and output files.
Example
This example demonstrates how to set up a workflow by using the main Caliber classes and a collection of functions in a separate module. The example does not represent a series of engineering tasks, but the basic concepts are shown.
The files are put in a folder with a simple structure.
/pasta
├── main.py
├── pasta_functions.py
pasta_functions.py contains the functions that will be used as basis for the tasks. Note that these functions do not perform any interesting work other than printing or waiting. However, their nature is similar to that of common engineering tasks, e.g. preparing input files, running simulations, and post-processing.
Three functions are defined: make_dough represents making a pasta dough, rest represents letting the pasta dough rest, and roll_and_shape represents rolling and shaping the dough into a desired pasta type.
import time


def make_dough(number_of_persons):
    """Make pasta dough"""
    # The base recipe (185 g eggs, 300 g flour) serves three persons.
    scaling = number_of_persons/3
    eggs = scaling*185.0
    flour = scaling*300.0

    print(f'Makes pasta dough from {eggs:.1f} g eggs and {flour:.1f} g flour.')
    time.sleep(0.5)


def rest():
    """Let the pasta dough rest"""
    print('Wraps the dough in plastic and puts to rest in fridge for 30 min.')
    time.sleep(2)


def roll_and_shape(pasta_type='tagliatelle'):
    """Roll and shape pasta"""
    # Roll the dough through thickness settings 1 to 6.
    for thickness in range(1, 7):
        print(f'Rolls on thickness setting {thickness}')
        time.sleep(0.5)
    print(f'Shapes into {pasta_type}.')
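The quantities printed by make_dough scale linearly with the number of persons. As a quick check of that arithmetic, the helper below repeats the calculation as a pure function, so the numbers can be inspected without printing or waiting. `dough_amounts` is a hypothetical name introduced here for illustration; it is not part of pasta_functions.py.

```python
def dough_amounts(number_of_persons):
    """Return (eggs, flour) in grams, using the same scaling as make_dough."""
    scaling = number_of_persons / 3  # the quantities below are for three persons
    return scaling * 185.0, scaling * 300.0


eggs, flour = dough_amounts(6)
print(f'{eggs:.1f} g eggs, {flour:.1f} g flour')
```

For six persons the scaling factor is 2, which gives 370.0 g eggs and 600.0 g flour.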
main.py is used to define the tasks, add the tasks to a process and to create the workflow. Notice how each task refers to a function from pasta_functions.py and that the arguments and keyword arguments are stored as attributes of the tasks. On the last line, the created workflow is run by calling its run() method.
import caliber

import pasta_functions

# Define tasks
make_dough = caliber.Task(
    function=pasta_functions.make_dough,
    name='Make dough',
    args=[
        3,
    ],
)

rest = caliber.Task(
    function=pasta_functions.rest,
    name='Rest',
)

roll_and_shape = caliber.Task(
    function=pasta_functions.roll_and_shape,
    name='Roll and shape',
    kwargs={
        'pasta_type': 'tagliatelle',
    },
)

# Collect tasks in process
pasta = caliber.Process(
    tasks=[
        make_dough,
        rest,
        roll_and_shape,
    ],
    name='Pasta',
)

# Create workflow
make_pasta = caliber.Workflow(
    process=pasta,
    name='Make pasta',
)

make_pasta.run()
What if you want to run the same workflow several times with only slight variations in the input, say tagliatelle for three persons and ravioli for six? Meet the BranchProcess.