Tutorial#
This tutorial will guide you through writing a simple vortex script. This script runs a Python program whose behavior resembles that of a numerical weather prediction forecast program.
To run successfully, the program must be able to read two files in the current directory:
A file named ICMSHFCSTINIT, containing input data required by the program to run. You can think of it as a file containing forecast initial condition data.
A file named fort.4, describing configuration keys and values required by the program. In this example, this file only specifies the value of a configuration key TERM, which determines how many result files will be written to disk. You can think of this configuration as the analogue of a forecast term.
Therefore, prior to running our fake forecast program, our working directory must look like this:
fake-forecast.py
fort.4
ICMSHFCSTINIT
Supposing the content of the configuration file fort.4 is
# fort.4
TERM=3
after running the fake forecast program, the working directory will contain three extra files and look like this:
fake-forecast.py
fort.4
ICMSHFCSTINIT
ICMSHFCST+01:00.grib
ICMSHFCST+02:00.grib
ICMSHFCST+03:00.grib
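To make the behavior described above concrete, here is a minimal sketch of what a program of this kind might do. It is not the fake-forecast.py shipped with the tutorial data, whose implementation may differ. The function names read_config and run_fake_forecast are hypothetical, and the content written to the output files is a placeholder.

```python
import tempfile
from pathlib import Path


def read_config(path):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    config = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def run_fake_forecast(workdir):
    """Write one output file per forecast hour, from 1 up to TERM."""
    workdir = Path(workdir)
    term = int(read_config(workdir / "fort.4")["TERM"])
    initial_state = (workdir / "ICMSHFCSTINIT").read_text()
    written = []
    for hour in range(1, term + 1):
        target = workdir / f"ICMSHFCST+{hour:02d}:00.grib"
        # Placeholder content: a real forecast would write model fields here.
        target.write_text(f"{initial_state}\nforecast step {hour}\n")
        written.append(target.name)
    return written


# Try it in a throwaway directory so that no other file is touched.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "fort.4").write_text("# fort.4\nTERM=3\n")
    Path(tmp, "ICMSHFCSTINIT").write_text("initial condition data")
    outputs = run_fake_forecast(tmp)
    print(outputs)
```

With TERM=3 the sketch writes three files, matching the directory listing above.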
Initial set up#
To work through this tutorial, you will need to download the
input data archive and extract
its content on your computer. The extraction location doesn’t matter,
as long as it is somewhere where you have read and write access.
The archive contains the following files:
fake-forecast.py: A Python script that reads a data file and a configuration file, both expected to be present in the current directory, and writes a set of files ICMSHFCST+01:00.grib, ICMSHFCST+02:00.grib, ICMSHFCST+03:00.grib. The number of output files is specified by the content of the configuration file.
forecast_configuration_files/main_arpege.nam: A configuration file consisting of one KEY=VALUE pair per line.
vortex_data_tree: The root directory for the vortex data tree, from which input files are fetched and to which output files are written.
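Before going further, it can help to check that the extraction produced the entries this tutorial relies on. The sketch below is only a convenience and not part of vortex. The helper missing_entries is hypothetical, and the entry names are the three paths that the scripts in this tutorial refer to.

```python
import tempfile
from pathlib import Path

# Paths used by the scripts in this tutorial, relative to the
# directory the archive was extracted into.
EXPECTED_ENTRIES = (
    "fake-forecast.py",
    "forecast_configuration_files/main_arpege.nam",
    "vortex_data_tree",
)


def missing_entries(archive_root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under archive_root."""
    root = Path(archive_root)
    return [name for name in expected if not (root / name).exists()]


# Demonstration on a throwaway directory in which one entry is missing.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "fake-forecast.py").touch()
    Path(tmp, "forecast_configuration_files").mkdir()
    Path(tmp, "forecast_configuration_files", "main_arpege.nam").touch()
    missing = missing_entries(tmp)
    print(missing)
```

On your own machine you would call missing_entries with the directory you extracted the archive into, and expect an empty list.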
Note
The examples in this tutorial assume that the tutorial data archive
was extracted in directory /home/user. Be sure to replace this
path with the path to the directory where you extracted the tutorial
data archive.
This tutorial is a guide to writing a Python script, using the
vortex library, to fetch required input data from the data tree
directory, run the fake-forecast.py program, and write output
files to the data tree.
Start by creating an empty directory. The location and name do not
matter and we’ll just call it vortex-tutorial. Using your
favorite text editor, create a new Python file run-forecast.py and
write the following lines to it:
# run-forecast.py
import vortex as vtx
vtx.config.set_config(
    section="data-tree",
    key="rootdir",
    # Be sure to replace "/home/user/" by the path where you
    # extracted the tutorial data archive.
    value="/home/user/vortex-tutorial-data/vortex_data_tree",
)
print(
    "The data tree root is",
    vtx.config.from_config("data-tree", "rootdir"),
)
The call to vortex.config.set_config() specifies the location
of the vortex data tree, a directory hierarchy that holds data files
that vortex can fetch data from and write data to. By default, the
data tree is located in the user’s home directory, but for the purpose
of this tutorial you will configure the data tree root node to be the
directory vortex_data_tree located within the tutorial data files.
Tip
You can run the script with python -i to execute the script in an
interactive Python session.
See also
See Configuring vortex for more information about configuring vortex, including setting an alternative location for the data tree.
Fetching input data#
Use the vortex.input() function to define an input resource
for the initial condition input file:
handlers = vtx.input(
    kind="analysis",
    date="2024082600",
    model="arpege",
    cutoff="production",
    filling="atm",
    geometry="global1798",
    nativefmt="grib",
    vapp="tutorial",
    vconf="fake-forecast",
    experiment="vortex-tutorial",
    block="4dupd2",
    local="ICMSHFCSTINIT",
)
The vortex.input() function returns a list of objects of type
Handler. In our case, this
list contains only a single item mapping to the initial condition
file.
initial_condition = handlers[0]
An instance of Handler is
able to compute the file path to the underlying physical file:
initial_condition.locate()
This path is computed from the values of the arguments passed to the
vortex.input() function. This path can be computed because
the initial condition file is a resource that was stored by another
vortex script. Its location is therefore well defined within a
standardised data tree layout, see The vortex data tree.
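As an illustration of how a location can be derived purely from such arguments, the sketch below rebuilds the block directory that appears in the `ls` commands of this tutorial (tutorial/fake-forecast/vortex-tutorial/20240826T0000P/forecast). The layout and the use of the first letter of cutoff as a suffix are inferred from that single path and are assumptions. The function block_directory is hypothetical and is not how vortex itself computes paths. The file name inside the directory is not covered.

```python
from datetime import datetime
from pathlib import PurePosixPath


def block_directory(vapp, vconf, experiment, date, cutoff, block):
    """Build a relative directory from resource attributes (illustrative)."""
    # "2024082600" (YYYYMMDDHH) becomes "20240826T0000".
    stamp = datetime.strptime(date, "%Y%m%d%H").strftime("%Y%m%dT%H%M")
    # Assumption: "production" is abbreviated to its capitalised initial.
    cutoff_letter = cutoff[0].upper()
    return PurePosixPath(vapp, vconf, experiment, stamp + cutoff_letter, block)


example = block_directory(
    vapp="tutorial",
    vconf="fake-forecast",
    experiment="vortex-tutorial",
    date="2024082600",
    cutoff="production",
    block="forecast",
)
print(example)
```

The point is only that every path component comes from an argument, so two scripts that pass the same values agree on the location without exchanging a file path.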
Note
The vortex.input() function does not actually fetch the
corresponding file into the current working directory. It only
defines one or more Handler
objects that provide access to the get method.
To fetch the file into the current working directory, use the
get method on the
resource handler:
initial_condition.get()
The second step is to fetch the configuration file into the current
working directory, as a file named fort.4 since this is what the
fake forecast program expects.
Similarly to the initial condition file, use the
vortex.input() function again:
config_file = vtx.input(
    kind="namelist",
    model="arpege",
    remote="/home/user/vortex-tutorial-data/forecast_configuration_files/main_arpege.nam",
    local="fort.4",
)[0]
Attention
Be sure to replace /home/user by the path to the directory
where you extracted the tutorial data.
This call to vortex.input() is much simpler than the previous one.
This time, the path to the configuration file is specified explicitly
using the remote argument, instead of being computed by vortex from
the other arguments of vortex.input().
Finally, use the get
method on the config_file handler to fetch the file into the
current working directory.
config_file.get()
You can verify that a new file named fort.4 was created in the
current working directory. This file is in fact a (hard) link pointing
to the location specified as a value to the remote argument to
vortex.input().
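If you want to convince yourself that two paths are hard links to the same file, you can compare what they point to. The sketch below uses only the Python standard library on a throwaway directory. It creates the link itself with os.link as a stand-in for what get left behind, so it does not depend on vortex. The helper is_same_file is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def is_same_file(first, second):
    """True when both paths resolve to the same device and inode."""
    return os.path.samefile(first, second)


with tempfile.TemporaryDirectory() as tmp:
    remote = Path(tmp, "main_arpege.nam")
    remote.write_text("TERM=3\n")
    local = Path(tmp, "fort.4")
    # Create a hard link: both names now refer to the same data on disk.
    os.link(remote, local)
    linked = is_same_file(remote, local)
    # st_nlink counts the directory entries that refer to this inode.
    link_count = os.stat(local).st_nlink
    print(linked, link_count)
```

In your working directory, the equivalent check would compare fort.4 with the path passed as remote. Keep in mind that editing a hard-linked file in place also changes the original.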
Running the fake forecast program#
With the input data files fetched into the current working directory, you are now ready to run the program. You will first fetch the program itself – in this case a Python script – into the current working directory, then instantiate an algorithmic component object which will be responsible for actually running the script.
Fetching the fake forecast program#
The VORTEX library considers programs, whether they are scripts written in interpreted languages or compiled binaries, as executables. Fetching an executable is similar to fetching an input data file:
exe = vtx.executable(
    kind="script",
    language="python",
    # Replace "/home/user" by the path to the directory you
    # extracted the tutorial data to.
    remote="/home/user/vortex-tutorial-data/fake-forecast.py",
    local="fake-forecast.py",
)[0]
Similarly to vortex.input(), vortex.executable() returns a
list of instances of the Handler class, on which you can call get:
# Fetch the Python script into the current working directory
exe.get()
Running the script through an algo component#
The vortex library provides a collection of classes that define how to run specific programs. These classes are referred to as algorithmic components.
Algorithmic component classes are instantiated using the
vortex.task() function:
task = vtx.task(
    interpreter="python",
    engine="exec",
)
With interpreter="python" and engine="exec", vtx.task()
returns an instance of vortex.algo.components.Expresso.
This class encapsulates the behavior required to run a Python script,
potentially setting up environment variables like PYTHONPATH or
switching to a different Python interpreter.
Finally, the script can be run using the run method on the task
object, which takes an executable object as an argument.
task.run(exe)
At this point, the script ran and produced 3 files ICMSHFCST+01:00.grib,
ICMSHFCST+02:00.grib and ICMSHFCST+03:00.grib in the current working
directory. The next step is to store them into the vortex data tree,
so that they can be retrieved later by other vortex scripts.
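To give a rough idea of what running a script with a chosen interpreter and environment involves, here is a sketch based on the standard subprocess module. It is not the implementation of the Expresso class. The function run_python_script and its parameters are hypothetical names introduced for illustration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python_script(script, interpreter=sys.executable,
                      extra_pythonpath=None, cwd=None):
    """Run a script with the given interpreter and an adjusted environment."""
    env = os.environ.copy()
    if extra_pythonpath:
        # Prepend the extra directory to any PYTHONPATH already set.
        existing = env.get("PYTHONPATH")
        parts = [str(extra_pythonpath)] + ([existing] if existing else [])
        env["PYTHONPATH"] = os.pathsep.join(parts)
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(
        [interpreter, str(script)],
        env=env,
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "hello.py")
    script.write_text("import os\nprint('PYTHONPATH' in os.environ)\n")
    completed = run_python_script(script, extra_pythonpath=tmp, cwd=tmp)
    print(completed.stdout.strip())
```

Wrapping these details in an object keeps the calling script short: it only hands over the executable and lets the component handle the interpreter and environment.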
Storing outputs into the data tree#
In this section we use the vortex.output() function to store
the files generated by the fake forecast program into the vortex
data tree. This way, subsequent vortex
scripts will be able to retrieve them using the
vortex.input() function.
Storing files in the data tree is achieved by calling
vortex.output(). Its interface is identical to
vortex.input()’s:
historic_files = vtx.output(
    kind="modelstate",
    date="2024082600",
    model="arpege",
    cutoff="production",
    geometry="global1798",
    nativefmt="grib",
    vapp="tutorial",
    vconf="fake-forecast",
    experiment="vortex-tutorial",
    term=[1, 2, 3],
    block="forecast",
    local="ICMSHFCST+[term].grib",
)
The vortex.output() function returns a list of Handler
instances whose put
method works in the opposite direction of get: instead of reading files from the
data tree, it writes to the data tree the files present in the current
working directory whose names match the value passed to the local argument
of vortex.output().
Note the addition of the argument term, also referenced within the
string passed to local:
historic_files = vtx.output(
    # ...
    term=[1, 2, 3],
    local="ICMSHFCST+[term].grib",
)
Values of arguments to functions such as vortex.input(),
vortex.output() or vortex.executable() can reference
the values of other arguments. Sequences are expanded into as many
elements as they contain. In this case, vtx.output() returns a list
of 3 Handler objects instead
of a single object.
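The expansion mechanism can be pictured with the following sketch, which substitutes bracketed references in a template and produces one result per element of each sequence. It is a simplified stand-in and not the code vortex uses. The functions expand and format_term are hypothetical, and the HH:00 rendering of a term is an assumption chosen to match the file names listed earlier.

```python
from itertools import product


def format_term(hours):
    """Render a term in hours as HH:00 (assumed from the file names)."""
    return f"{hours:02d}:00"


def expand(template, **attributes):
    """Return one string per combination of the attribute values."""
    names = list(attributes)
    # Scalars are treated as one-element sequences.
    choices = [
        value if isinstance(value, (list, tuple)) else [value]
        for value in attributes.values()
    ]
    expanded = []
    for combination in product(*choices):
        name = template
        for key, value in zip(names, combination):
            text = format_term(value) if key == "term" else str(value)
            name = name.replace(f"[{key}]", text)
        expanded.append(name)
    return expanded


local_names = expand("ICMSHFCST+[term].grib", term=[1, 2, 3])
print(local_names)
```

A template with no bracketed reference and only scalar attributes expands to a single name, which is why the earlier calls in this tutorial returned one-element lists.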
Finally, calling put on
the handlers will write the files into the data tree:
for handler in historic_files:
    handler.put()
You can now list the content of the forecast block to check that the
3 files were indeed written there:
DATATREE_ROOT=/home/user/vortex-tutorial-data/vortex_data_tree
ls -l "$DATATREE_ROOT/tutorial/fake-forecast/vortex-tutorial/20240826T0000P/forecast"
Setting default values#
Definitions of vortex inputs and outputs often feature the same
arguments and values. Vortex provides the vortex.defaults()
function, which can be used to avoid repeating arguments to the
functions vortex.input(), vortex.output() or
vortex.executable().
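Conceptually, a defaults mechanism amounts to merging a stored set of values with the arguments of each call. The sketch below illustrates the idea with plain dictionaries. It is not how vortex implements vortex.defaults(). The names set_defaults and describe are hypothetical, and the rule that explicitly passed arguments take precedence over defaults is an assumption.

```python
# Module-level store standing in for the library's default values.
_defaults = {}


def set_defaults(**kwargs):
    """Record values to be reused by later calls to describe()."""
    _defaults.update(kwargs)


def describe(**kwargs):
    """Merge stored defaults with call arguments; call arguments win."""
    return {**_defaults, **kwargs}


set_defaults(model="arpege", nativefmt="grib")
merged = describe(kind="ddh", nativefmt="lfi")
print(merged)
```

Only the arguments that differ between resources then need to be written at each call site.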
Using vortex.defaults(), the script becomes:
import vortex as vtx
vtx.config.set_config(
    section="data-tree", key="rootdir",
    value="/home/user/vortex-tutorial-data/vortex_data_tree",
)
vtx.defaults(
    date="2024082600",
    model="arpege",
    cutoff="production",
    geometry="global1798",
    nativefmt="grib",
    vapp="tutorial",
    vconf="fake-forecast",
    experiment="vortex-tutorial",
)
initial_condition = vtx.input(
    kind="analysis",
    filling="atm",
    local="ICMSHFCSTINIT",
    block="4dupd2",
)[0]
initial_condition.get()
config_file = vtx.input(
    kind="namelist",
    remote="/home/user/vortex-tutorial-data/forecast_configuration_files/main_arpege.nam",
    local="fort.4",
)[0]
config_file.get()
exe = vtx.executable(
    kind="script",
    language="python",
    remote="/home/user/vortex-tutorial-data/fake-forecast.py",
    local="fake-forecast.py",
)[0]
exe.get()
vtx.task(interpreter="python", engine="exec").run(exe)
for output_handler in vtx.output(
    kind="modelstate",
    nativefmt="grib",
    local="ICMSHFCST+[term].grib",
    block="forecast",
    term=[1, 2, 3],
):
    output_handler.put()
Attention
Be sure to replace "/home/user" by the path to the directory
you extracted the tutorial data to.
A post-processing task#
We conclude this tutorial by implementing a second vortex script, which will illustrate the way outputs of one vortex script can be used as inputs of another.
This new vortex script will:
fetch all three forecast output files
concatenate them
write the resulting file back into the data tree
Open a new file aggregate-task.py and start by calling
vortex.input():
import vortex as vtx
vtx.config.set_config(
    section="data-tree",
    key="rootdir",
    # Be sure to replace "/home/user/" by the path where you
    # extracted the tutorial data archive.
    value="/home/user/vortex-tutorial-data/vortex_data_tree",
)
vtx.defaults(
    date="2024082600",
    model="arpege",
    cutoff="production",
    vapp="tutorial",
    vconf="fake-forecast",
    experiment="vortex-tutorial",
    geometry="global1798",
)
# Use vtx.input() here: the files are read from the data tree.
historic_files = vtx.input(
    kind="modelstate",
    nativefmt="grib",
    term=[1, 2, 3],
    local="ICMSHFCST+[term].grib",
    block="forecast",
)
for handler in historic_files:
    handler.get()
Note
Because the location of the data tree root is different from the
default $HOME/.vortex.d, it is necessary to call
vortex.config.set_config() again at the beginning of the
script.
For convenience, we could instead use the default location or specify the location of the data tree in the configuration.
Observe that the arguments specified are identical to those provided
to the vortex.output() function in section
Setting default values.
With the three files present in the working directory, let’s
concatenate them into a new file result.txt:
with open("result.txt", "w") as target:
    for handler in historic_files:
        with open(handler.container.localpath(), "r") as source:
            target.writelines(source.readlines())
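The files produced by the fake forecast are read here as text. If they were real binary forecast outputs, opening them in text mode could fail or alter the data. The sketch below shows a binary-safe variant of the same concatenation using the standard library only. The function concatenate and the part*.bin file names are hypothetical and serve only the demonstration.

```python
import shutil
import tempfile
from pathlib import Path


def concatenate(sources, target):
    """Append the bytes of each source file to target, in order."""
    with open(target, "wb") as out:
        for source in sources:
            with open(source, "rb") as chunk:
                # Copies in fixed-size pieces, so large files are not
                # loaded into memory all at once.
                shutil.copyfileobj(chunk, out)
    return Path(target).stat().st_size


with tempfile.TemporaryDirectory() as tmp:
    parts = []
    for hour in (1, 2, 3):
        part = Path(tmp, f"part{hour}.bin")
        part.write_bytes(bytes([hour]) * 4)
        parts.append(part)
    total = concatenate(parts, Path(tmp, "result.bin"))
    combined = Path(tmp, "result.bin").read_bytes()
    print(total)
```

In the post-processing script, the list of sources would be the local paths of the fetched handlers, as in the loop above.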
For the sake of this tutorial, let’s pretend that we want to interpret
the output file result.txt as a horizontal diagnostics
resource, and write it into the data
tree as such:
# Reminder: vortex.output returns a list, even if
# there is only one element in it.
for output in vtx.output(
    kind="ddh",
    scope="global",
    nativefmt="lfi",
    block="postprocessing",
    term=3,
    local="result.txt",
):
    # Write content of file "result.txt" in current working directory
    # as file ddh.arpege-global.tl1798-c22+0003:00.lfi
    # in the data tree
    output.put()
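You can then list the content of the postprocessing block to check that the file was written there:
DATATREE_ROOT=/home/user/vortex-tutorial-data/vortex_data_tree
ls -l "$DATATREE_ROOT/tutorial/fake-forecast/vortex-tutorial/20240826T0000P/postprocessing"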
Note
The use of the DDH resource in the previous example is arbitrary,
for the sole purpose of illustrating the storage of the result of a
postprocessing operation into the data tree. The content of file
ddh.arpege-global.tl1798-c22+0003:00.lfi is not really that
of a true horizontal diagnostics file.