HWRF  trunk@4391
Run ex-Scripts Manually

This page describes how to run the scripts/ex*.py scripts manually. This is an invaluable way of debugging the HWRF scripts when adding new functionality or testing a port of the system to a new supercomputer. It is a quick way of finding simple bugs, such as syntax errors, without having to wait hours for a job to go through the batch queue. It is best to use the automated methods to run the HWRF as far into the workflow as possible and then run only the job you are debugging manually. Otherwise, you'll lose more time in manual labor than you would save in batch queue wait time.

Doing this requires the bash, ksh, or sh shell because you must source the storm1.holdvars.txt file. That file sets most of the environment variables required by the HWRF Python scripts. A few scripts need more variables set, such as exhwrf_init.py, so make sure you read the instructions specific to those jobs if you want to run them directly.

Manually Running the exhwrf_launch Job

This is the most complicated job to run manually. The syntax is similar to the run_hwrf script but with a few added arguments:

cd /path/to/HWRF/scripts
export HOMEhwrf=/path/to/HWRF
export USHhwrf=/path/to/HWRF/ush
./exhwrf_launch.py 2015092912 11L HISTORY /path/to/HWRF/parm \
     config.EXPT=HWRF ... more configuration options ...

If the launcher job completes successfully, you should see the message "exhwrf_launch completed." The launcher will also print the path to the COM directory. Make sure you remember that path, since you will need it in later commands.
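If you would rather not copy the COM path by hand, you can capture it from the launcher's output. This is only a sketch: the "COM directory:" log-line format and the launch.log filename are assumptions for illustration, so check what your launcher actually prints.

```shell
# Sketch only: the "COM directory:" log-line format and the launch.log
# filename are assumed here, not guaranteed by HWRF.
cat > launch.log <<'EOF'
... other launcher output ...
COM directory: /path/to/CDSCRUB/com/2015092912/11L
exhwrf_launch completed.
EOF

# Pull the COM path into a variable for use in later commands.
COMhwrf=$(sed -n 's/^COM directory: //p' launch.log | head -1)
echo "$COMhwrf"
```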

Serial and OpenMP Jobs

After the launcher job runs, the directory structure is set up to run most of the other HWRF jobs. In particular, the storm1.holdvars.txt file is created. That file contains a sequence of bash/sh/ksh commands that set the environment variables the Python scripts need to start HWRF jobs. A few jobs need more work, described in later sections. Also, if a job requires MPI, you will need to run it in an interactive batch job, or just Control-C the job before it tries to start an MPI program (if you're only checking job setup and syntax errors).

Here is an example of one of the simpler ones: the exhwrf_input job. Note that we have added one path, which you must find manually: the storm1.holdvars.txt file. It is in the COM directory of the cycle in question, and its path was printed by the exhwrf_launch job.

cd /path/to/HWRF/scripts
( . /path/to/CDSCRUB/com/2015092912/11L/storm1.holdvars.txt ; \
    export PYTHONPATH=$USHhwrf ; $EXhwrf/exhwrf_input.py )

If the input job completes successfully, it will print "HWRF input job completed".
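Because the completion message is easy to miss in a long log, you can also test the exit status of the subshell directly. A minimal sketch, with `true` standing in for the real exhwrf_input.py invocation shown above:

```shell
# `true` is a placeholder for the real
# "( . .../storm1.holdvars.txt ; ... exhwrf_input.py )" subshell;
# substitute the actual command.
if ( true ); then
    echo "exhwrf_input succeeded"
else
    echo "exhwrf_input failed; check its log output" >&2
fi
```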

Forecast and Other MPI Jobs

A more complicated job is the exhwrf_forecast job. For this job, and other MPI jobs, you must set the $TOTAL_TASKS environment variable on most platforms so the scripts know how many MPI ranks are available. There are two situations in which you will want to run this job manually:

  1. To see if the inputs to the forecast are set up correctly. In this case, you do not need to run the forecast MPI program.
  2. To test the actual forecast.

In case #1 (no MPI job), you fake the MPI environment by claiming there is only one MPI task, and then kill the process (Control-C) before it tries to start the MPI program. Here are the bash/sh/ksh commands:

cd /path/to/HWRF/scripts
( . /path/to/CDSCRUB/com/2015092912/11L/storm1.holdvars.txt ; \
    export PYTHONPATH=$USHhwrf TOTAL_TASKS=1 ; \
    $EXhwrf/exhwrf_forecast.py )

Eventually, when you see the job try to start the MPI program, quickly hit Control-C. On WCOSS and Jet, the MPI program will abort immediately anyway, but on Theia, it will try to actually run. Depending on what else is going on in the front-end node, that may be a Bad Thing.
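If you would rather not race to hit Control-C by hand, one alternative (a generic workaround, not an HWRF feature) is the standard timeout(1) utility, which kills the command after a fixed limit:

```shell
# timeout(1) sends SIGTERM after the limit expires; exit status 124
# means the limit was reached. `sleep 30` stands in for
# exhwrf_forecast.py in this demo.
timeout 2 sleep 30
echo "exit status: $?"   # 124 here means timeout killed the command
```

The same wrapper can go around the exhwrf_forecast.py line above; pick a limit long enough for the setup phase to finish before the MPI launch.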

In case #2 (wanting to actually run the job) you will need to submit an interactive batch job with the correct task geometry. First, investigate the number of tasks the forecast job actually uses by looking in the [runwrf] section of storm1.conf, or at the job submitted by Rocoto, ecFlow, or the wrapper scripts. Then, start an interactive batch job with the number of processors you require. Suppose the number is 1234. The command, within the interactive batch job, would be this:

cd /path/to/HWRF/scripts
( . /path/to/CDSCRUB/com/2015092912/11L/storm1.holdvars.txt ; \
    export PYTHONPATH=$USHhwrf TOTAL_TASKS=1234 ; \
    $EXhwrf/exhwrf_forecast.py )
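To avoid reading storm1.conf by eye, you can extract the task count with a short awk filter. The key name "nprocs" and the file contents below are assumptions for illustration; check your own storm1.conf's [runwrf] section for the real key name.

```shell
# Illustrative conf file; real storm1.conf key names may differ.
cat > storm1.conf.example <<'EOF'
[runwrf]
nprocs = 1234
[other_section]
nprocs = 99
EOF

# Print the value only while inside the [runwrf] section.
awk -F= '/^\[runwrf\]/{s=1; next} /^\[/{s=0}
         s && $1 ~ /nprocs/ {gsub(/ /,"",$2); print $2}' storm1.conf.example
```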

The INIT and BDY Jobs

Some of the jobs require extra environment variables. The INIT and BDY jobs are among these. The BDY job is actually just the INIT job with different variables set.

Read rocoto/tasks/init.ent and rocoto/tasks/bdy.ent to see which variables are set by the Rocoto init and bdy jobs.
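As a shortcut while reading those files, you can list the variable names a Rocoto task file exports with sed. The <envar> entries below are invented for illustration; they are not the real contents of init.ent, which you should read yourself.

```shell
# Fabricated example of a Rocoto task file; the real init.ent differs.
cat > init.ent.example <<'EOF'
<envar><name>INIT_MODEL</name><value>GDAS</value></envar>
<envar><name>INIT_FHR</name><value>6</value></envar>
EOF

# Print just the variable names from the <envar> entries.
sed -n 's/.*<name>\([^<]*\)<\/name>.*/\1/p' init.ent.example
```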

Archiving and Input Jobs

The HPSS archiving jobs and HWRF input jobs have additional resource requirements. They need to be run on a node with HPSS or network access and a moderate amount of available memory (about 1–3 GB). A front-end node will suffice for this. If you decide to run in an interactive job instead, make sure you submit it to a queue with HPSS or network access and a full OS installation. On NOAA machines, those queues are called "rdtn," "transfer," or "service."