case – describe a modelling case

This module describes a modelling case, which might involve running a set of programs in sequence, transferring output from one program to the input of another, and so on.

Some key features:
  • By default, a unique temporary working directory is created when a case is initiated. This means that several cases can run in parallel without interfering with each other. The methods add_file() and get_file() make it easy to copy input and result files to and from the working directory.
  • All program execution and file manipulation is logged via the logging module in the standard library. By default, a log file for the current case is written to the working directory. All logging is also directly accessible to an outer framework using this module.
  • Provides a clean way to extract data from the output files of one program and update the input files of another program.
  • Provides an easy way to test program execution and check if the output is as expected.

The main component in this module is the Case class, which can be accessed in the following ways:

  1. Create an instance and call its methods (might be useful in interactive sessions and simple scripts), e.g.:

    c = Case('mycase')
    c.start()
    c.add_file('my_inputfile.txt')
    status = c.execute('myprogram')
    assert status.returncode == 0
    c.close()
    

    You have to call the close() method to clean up the working directory. Use try ... finally to ensure that close() is called even if an unhandled exception is raised.

    This method cannot be used to run cases in a background thread using the BgCase class.

  2. Provide a target argument to the constructor (useful in test scripts and simple programs), e.g.:

    def target(c):
        c.add_file('my_inputfile.txt')
        ...
    
    c = Case('mycase', target=target)
    c.start()

    This calls the close() method automatically when target() returns.

  3. Subclass Case and override the run() method (useful in larger frameworks or when you need more control), e.g.:

    class MyCase(Case):
        def run(self):
            self.add_file('my_inputfile.txt')
            ...
    
    c = MyCase('mycase')
    c.start()
    

    This calls the close() method automatically when the run() method returns.
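For the first approach, the try ... finally pattern can be sketched as follows. DemoCase is a hypothetical stand-in, written only so that the example runs without pario; it mimics nothing more than a temporary working directory that close() removes. With the real module, Case would take its place.

```python
import os
import shutil
import tempfile


class DemoCase(object):
    """Hypothetical stand-in for Case; it only mimics start() and close()."""

    def __init__(self, casename):
        self.casename = casename
        self.workdir = None

    def start(self):
        # Create a unique temporary working directory for this case.
        self.workdir = tempfile.mkdtemp(prefix=self.casename + '-')

    def close(self):
        # Remove the working directory if it still exists.
        if self.workdir and os.path.isdir(self.workdir):
            shutil.rmtree(self.workdir)


def run_case(c, fail=False):
    """Starts the case c and guarantees that close() is called."""
    c.start()
    try:
        if fail:
            raise RuntimeError('simulated failure inside the case')
    finally:
        # Always executed, even when the body above raises.
        c.close()
```

Calling run_case(DemoCase('mycase'), fail=True) still propagates the RuntimeError to the caller, but the working directory is removed first.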

This module defines the following objects:

Case object

class pario.case.Case(casename, casedir=None, workdir=None, tmpdir=None, cleanup=None, target=None)[source]

A class for running a modelling case.

Parameters :

casename : string

Name of this case.

casedir : None | string

Directory with all inputfiles for the case. If None, it defaults to the current working directory.

workdir : None | string

The working directory to run the commands in. If None, a unique temporary directory under tmpdir will be used.

tmpdir : None | string

Directory for temporary working directories. If None, the environment variable CASE_TMPDIR will be used. If this isn’t defined, a subdirectory “tmp” in the current working directory will be used (created if it does not exist).

cleanup : None | bool | “on_success”

Whether to remove workdir when run() is finished. If the special value of “on_success” is given, workdir is only removed if no unhandled exception is raised during the execution of run(). This might be very useful for debugging. The default is False if workdir is provided and “on_success” otherwise.

target : None | callable

A callable object that will be invoked when the start() method is called. It will be called as target(case, *args, **kwargs), where case is a reference to the Case object and args and kwargs are the arguments passed to start().

Note

CaseSet objects might call target() with a keyword argument parameters set to a dictionary holding the names and new values of not already handled parameters that should be changed from their default values in this case.

Notes

A new logger named casename is created for each subclass instance, with a default handler that logs to the file casename.log in workdir.

See https://docs.python.org/2/howto/logging.html#configuring-logging for global logging configurations to customize the logging to your needs.
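Since the logger is named after casename, an outer framework can presumably reach it through logging.getLogger(). The sketch below uses only the standard library and assumes a case named 'mycase'; it attaches an extra in-memory handler to that logger name and does not touch pario itself.

```python
import io
import logging

# 'mycase' is the casename assumed for this example.
logger = logging.getLogger('mycase')
logger.setLevel(logging.DEBUG)

# Attach an extra handler that collects the messages in memory.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(name)s %(levelname)s %(message)s'))
logger.addHandler(handler)

# Any message logged under this name now also reaches the handler.
logger.info('starting')
captured = stream.getvalue()
```

A framework would typically replace the StringIO stream with its own handler, for example one that forwards records to a GUI or a central log file.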

add_file(src, dest=None, symlink=False, literal=False)[source]

Adds file to working directory.

Copies the input file src to working directory.

Parameters :

src : string

File to copy. If a relative path is given, it is relative to the case directory.

dest : None | string

If dest is None, src is copied to the working directory. If a relative path is given, it is relative to the working directory.

symlink : bool

If symlink is true and the OS supports symbolic links, a symbolic link is created instead.

literal : bool

If literal is True, (the string representation of) src will be written to dest instead of copying a file.

call(func, args=(), kwargs={}, conditions=None, maxtime=None, full_output=False)

Calls function func in a separate process and waits for it to finish. The current working directory of func is set to self.workdir.

Parameters :

func : callable

Function to call.

args : tuple

Positional arguments passed to func.

kwargs : dict

Additional keyword arguments passed to func.

conditions : None | sequence

A sequence of conditions for considering the output of func up-to-date. Each condition is a tuple with arguments to condition(), i.e. (op, file1) or (op, file1, file2). If conditions is not None and all conditions evaluate to true, then func is considered to have succeeded without being executed.

maxtime : float

Maximum execution time in seconds. If maxtime is neither None nor zero, the child process will be terminated when its execution time exceeds maxtime.

full_output : bool

If true, (retval, returncode, exectime) is returned, otherwise only retval is returned.

Returns :

retval : object

The object returned by func. This is None if the child process was terminated.

returncode : int (optional)

The exit code of the child process. This will be None if func was never called because all conditions evaluated the output as up-to-date. A negative value -N indicates that it was killed with signal N.

exectime : float (optional)

The execution time in seconds.

casedir

Name of case directory (read only).

casename

Name of this case (read only).

casepar

Dict with parameter-value pairs set by the caller. This is e.g. used by caseset.

cleanup

Whether to remove workdir when closing the case.

close(cleanup=None)[source]

Explicitly tears down the case.

This does the following:
  1. closes the logger
  2. optionally removes the working directory
cleanup may take the following values:
  • True : the working directory is removed
  • False : the working directory is not removed
  • “on_success” : the working directory is removed if no unhandled exception was raised. This only makes sense if Case is subclassed or a target was provided in the constructor.
  • None : use the value provided in the constructor.

Notes

The logger will be deleted the first time this function is called. However it is possible to call close() with cleanup=False to close the case but keep the output, and then at a later stage call it again with cleanup=True to clean up the working directory.
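The cleanup values listed above can be read as a small decision rule. The function below is a hypothetical helper that mirrors the documented behaviour; it is not part of pario, and it assumes that the constructor value has already been resolved to True, False or "on_success".

```python
def should_remove_workdir(cleanup, constructor_value, exception_raised):
    """Returns True if the working directory should be removed.

    Hypothetical helper mirroring the documented rules; not part of pario.
    """
    if cleanup is None:
        # Fall back to the value given to the constructor.
        cleanup = constructor_value
    if cleanup == 'on_success':
        # Keep the directory for debugging if something went wrong.
        return not exception_raised
    return bool(cleanup)
```

For example, should_remove_workdir(None, 'on_success', True) is False, so the working directory of a failed case is kept for inspection.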

closed

Whether close() has been called (read only).

critical(msg)[source]

Logs a message with level CRITICAL.

debug(msg)[source]

Logs a message with level DEBUG.

error(msg)[source]

Logs a message with level ERROR.

execute(command, expected_returncode=0, conditions=None, env=None, shell=False, stdin=None, maxtime=None, bufsize=0)[source]

Executes command and waits for it to finish.

The command is executed in the working directory and its standard output and error are stored in the files stdout<n>.txt and stderr<n>.txt in the working directory, where <n> is the number of times execute() has been called for this case instance.

Parameters :

command : string | sequence

Command to run.

expected_returncode : None | int

If expected_returncode is not None, an ExecutionError is raised if the returncode of command does not equal expected_returncode.

conditions : None | sequence

A sequence of conditions for considering the output of command up-to-date. Each condition is a tuple with arguments to condition(), i.e. (op, file1) or (op, file1, file2). If conditions is not None and all conditions evaluate to true, then command is considered to have succeeded without being executed.

env : None | dict

If not None, env defines the environment variables for command.

shell : bool

Whether to execute command through the shell.

stdin : None | string | file-like

Standard input passed to command.

maxtime : float

Maximum execution time in seconds. If maxtime is neither None nor zero, the command will be killed (with SIGKILL) when its execution time exceeds maxtime.

bufsize : int

Same meaning as the corresponding argument to open(): 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size and a negative value means to use the system default, which usually means fully buffered.

Returns :

status : ExeStatus instance

Returns an ExeStatus object holding information about the execution.
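The behaviour described for execute() (run in the working directory, store output in numbered files, check the return code) can be approximated with the standard subprocess module. This is only an illustration under stated assumptions: run_logged is a hypothetical helper, RuntimeError stands in for ExecutionError, and the real method may work differently internally.

```python
import os
import subprocess
import sys
import tempfile


def run_logged(command, workdir, n, expected_returncode=0, maxtime=None):
    """Runs command in workdir, storing output in stdout<n>.txt and stderr<n>.txt.

    Hypothetical helper built on subprocess; not pario's implementation.
    """
    out_name = os.path.join(workdir, 'stdout%d.txt' % n)
    err_name = os.path.join(workdir, 'stderr%d.txt' % n)
    with open(out_name, 'w') as out, open(err_name, 'w') as err:
        # subprocess.run() kills the child and raises TimeoutExpired
        # if it runs longer than `timeout` seconds.
        proc = subprocess.run(command, cwd=workdir, stdout=out, stderr=err,
                              timeout=maxtime)
    if expected_returncode is not None and proc.returncode != expected_returncode:
        # RuntimeError is a placeholder for the ExecutionError named above.
        raise RuntimeError('%r returned %d' % (command, proc.returncode))
    return proc.returncode, out_name, err_name


workdir = tempfile.mkdtemp()
returncode, out_name, err_name = run_logged(
    [sys.executable, '-c', 'print("hello")'], workdir, n=1)
```

Passing expected_returncode=None disables the check, which matches the documented meaning of that argument.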

get_file(src, dest=None)[source]

Retrieves a file from the working directory.

Copies src to dest, where src and dest are relative to the working and case directories, respectively.

If dest is None, src is copied to the case directory.

If dest is an existing directory or ends with “/”, src is copied to this directory (possibly creating it).

getpar(fname, par, cls=None, cwd=None, **kw)[source]

Returns the value of a given parameter from a parameter file.

Parameters :

fname : string | file-like

The file to read the parameter from.

name : string

Name of the parameter to return.

fmt : None | string

File format. If None, it may be deduced from fname.

info(msg)[source]

Logs a message with level INFO.

static md5sum(filename)[source]

Returns the md5 digest of filename.
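A file digest of this kind is usually computed with hashlib. The sketch below is a standalone function, not the pario method; it assumes that the digest is wanted as a hexadecimal string, which the description above does not state.

```python
import hashlib
import os
import tempfile


def md5sum(filename, blocksize=65536):
    """Returns the hexadecimal MD5 digest of the contents of filename."""
    digest = hashlib.md5()
    with open(filename, 'rb') as f:
        # Read in blocks so that large files need not fit in memory.
        for block in iter(lambda: f.read(blocksize), b''):
            digest.update(block)
    return digest.hexdigest()


# Write a small file and compute its digest.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'hello')
checksum = md5sum(path)
os.remove(path)
```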

nexe

Number of times execute() has been called (read only).

readdata(filespec, cwd=None)[source]

Reads data from files and returns it as a list of 1d numpy arrays.

filespec is a comma-separated list of one or more file names, followed by a colon-separated list of column names and option-value pairs.

If the file names are relative, they are interpreted relative to the directory cwd.

The grammar for filespec can formally be written as:

filespec   ::= filename (":" columnspec | ":" option)* ["," filespec]
columnspec ::= column_id [ "[" indexspec "]" ] 
option     ::= name "=" value
indexspec  ::= integer | slice | column_id "=" val ("," val)*
column_id  ::= integer | string

Comma, colon, equal and backslash characters can be backslash-escaped.

Columns can be specified either by name or by number and optionally followed by a pair of bracket ([]) supporting standard indexing.

If your data file has columns ‘year’ and ‘population’ it is also possible to write population[year=2012,2014]. This will return the population at 2012 and 2014 using linear interpolation. The column ‘year’ must be increasing. The options left and right (described below) are used if a specified year is outside the range of the data.

The name=value options are passed to recio.read() as keyword arguments to the reader. However, a few options are interpreted by readdata() and not passed further.

Options interpreted by readdata()
first_column : int
Index of the first column when columns are specified as integers. The default is zero.
left : None | float
Value to return if val is smaller than the data.
right : None | float
Value to return if val is larger than the data.
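The interpolated lookup population[year=2012,2014] can be illustrated in plain Python. The function below is a sketch, not the code used by readdata(), and the data values are made up for the example. When left or right is None it falls back to the first or last data value, which is what numpy.interp does; whether readdata() behaves the same way is an assumption.

```python
from bisect import bisect_right


def interpolate(x, xs, ys, left=None, right=None):
    """Linearly interpolates ys over increasing xs at the point x.

    left and right are returned when x lies below or above the data.
    If they are None, the first or last value of ys is used instead.
    """
    if x < xs[0]:
        return ys[0] if left is None else left
    if x > xs[-1]:
        return ys[-1] if right is None else right
    if x == xs[-1]:
        return ys[-1]
    # Index of the first data point strictly greater than x.
    i = bisect_right(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / float(x1 - x0)


# Made-up example data; 'year' must be increasing.
year = [2010, 2013, 2016]
population = [100.0, 130.0, 160.0]
values = [interpolate(y, year, population) for y in (2012, 2014)]
# values is [120.0, 140.0]
```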

Notes

This function uses eval() for the bracket indexing, which might be a security risk if filespec comes from an untrusted user.

Examples

Read ‘col1’ and ‘col2’ from ‘filename’ and return them:

filename:col1:col2

Read ‘col1’ and ‘col2’ from both ‘file1’ and ‘file2’ (returning four columns) and pass opt=val as keyword argument to recio.read() when reading the files:

file1,file2:col1:col2:opt=val

Read 3 columns, two from file1 and one from ‘file2’, passing opt1=val1 and opt2=val2 to recio.read():

file1:col1:col2:opt1=val1:opt2=val2,file2:col3

Read element 1 to 10 from column ‘col’ in ‘file’:

file:col[1:10]

Read the columns ‘year’ and ‘population’ from ‘population.txt’ and return interpolated population at 2012 and 2014:

population.txt:population[year=2012,2014]

readpar(fname=None, s=None, mode='error', cwd=None, cls=None)[source]

Returns a Parameters object from file or data.

This is a wrapper around pario.recio.read.

If fname is a file name and not an absolute path, it will be relative to cwd. cwd defaults to the working directory.

If cls is given, it must be a subclass of Parameters, usually defining another file format.

See also

parameters.Parameters
for full documentation

readrec(filename, fmt=None, lang=None, names=None, dtype=None, formats=None, titles=None, cwd=None, **kwargs)[source]

Reads filename and returns a NumPy record array.

This is a wrapper around pario.recio.read. If filename is not an absolute path, it will be relative to cwd, where cwd defaults to the working directory.

See also

recio.read
for full documentation

run(*args, **kwargs)[source]

Method defining the case actions.

You can override this method in your subclass. The default implementation calls the target provided in the constructor with arguments (self, *args, **kwargs).

Note

CaseSet objects might call run() with a keyword argument parameters set to a dictionary holding the names and new values of not already handled parameters that should be changed from their default values in this case.

running

Whether run() is running (read only).

seconds()[source]

Returns the number of seconds this case has been running (so far).

setpar(filename, par, value, cls=None, linesep=None, cwd=None, **kw)[source]

Sets the value of a parameter in a parameter file.

Parameters :

filename : string

The file to modify.

name : string

Name of the parameter to set.

value : correct type for this parameter

New value.

fmt : None | string

File format. If None, it may be deduced from filename.

options : keywords

Keywords passed to the writer for the given format.

start(*args, **kwargs)[source]

Starts the case by invoking run() passing it args and kwargs.

started

Whether start() has been called (read only).

test(op, args, prefix=None)[source]

Evaluates a condition.

This function is typically used for checking whether the output of a program is up-to-date.

Parameters :

op : callable | string

The operator. If it is a callable it is called with args as arguments and should return true to indicate up-to-date, otherwise false.

op can also be one of the operator names below, optionally prepended with a “not ”, in which case the test is negated:

“anewer”, file1, file2

True if file1 and file2 exist and file1 is newer (access time) than file2.

“cnewer”, file1, file2

True if file1 and file2 exist and file1 is newer (metadata change) than file2.

“equal”, file1, file2

True if file1 is equal to file2.

“eval”, expr

True if the Python expression expr evaluates to True.

“exists”, file1

True if file1 exists.

“isabs”, file1

True if file1 is an absolute path.

“isdir”, file1

True if file1 exists and is a directory.

“isfile”, file1

True if file1 exists and is a regular file.

“islink”, file1

True if file1 exists and is a symbolic link.

“md5sum”, file1, file2

True if the MD5 message digest of file1 equals the string literal file2.

“newer”, file1, file2

True if file1 and file2 exist and file1 is newer (modification time) than file2.

“samefile”, file1, file2

True if file1 and file2 exist and are the same file.

args : sequence

Arguments passed to the operator.

prefix : string

A directory prefix to prepend file arguments, e.g. the working directory.

Returns :

status: bool

True if the condition evaluates to true, otherwise False.

Notes

The docstring for this method is automatically created from condition(). For this method the prefix argument defaults to workdir.
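Two of the operators above, together with the “not ” prefix and the prefix argument, can be sketched with os.path alone. The names newer, OPERATORS and evaluate are hypothetical and exist only in this example; the sketch follows the descriptions above and is not the implementation of test() or condition().

```python
import os
import tempfile


def newer(file1, file2):
    """True if both files exist and file1 has a later modification time."""
    return (os.path.exists(file1) and os.path.exists(file2)
            and os.path.getmtime(file1) > os.path.getmtime(file2))


OPERATORS = {'exists': os.path.exists, 'newer': newer}


def evaluate(op, args, prefix=None):
    """Evaluates op on file arguments; a leading 'not ' negates the result."""
    negate = op.startswith('not ')
    if negate:
        op = op[len('not '):]
    if prefix is not None:
        # Interpret the file arguments relative to the prefix directory.
        args = [os.path.join(prefix, a) for a in args]
    result = bool(OPERATORS[op](*args))
    return not result if negate else result


workdir = tempfile.mkdtemp()
for name, mtime in (('input.txt', 1000), ('output.txt', 2000)):
    path = os.path.join(workdir, name)
    with open(path, 'w') as f:
        f.write(name)
    # Set access and modification times explicitly to make the order certain.
    os.utime(path, (mtime, mtime))

up_to_date = evaluate('newer', ['output.txt', 'input.txt'], prefix=workdir)
missing = evaluate('not exists', ['missing.txt'], prefix=workdir)
```

Here up_to_date is True because output.txt was given the later modification time, which is the typical check for skipping a program whose output is already newer than its input.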

warning(msg)[source]

Logs a message with level WARNING.

workdir

Name of working directory (read only).

writerec(filename, rec, fmt=None, lang=None, cwd=None, **kwargs)[source]

Writes NumPy record array rec to filename.

Parameters :

filename : string

Name of file to write.

rec : record array

Record array to write.

fmt : None | ‘csv’ | ‘xls’ | ‘txt’

File format. Valid values are:
  • None - fmt is taken from the file extension.
  • ‘csv’ - use matplotlib.mlab.rec2csv(). Requires matplotlib.
  • ‘xls’ - use rec2excel(). Requires xlrd.
  • otherwise use rec2string(), assuming an Alpack table.

lang : None | string

Read localized csv input. In addition to the platform-specific country codes, lang may also be one of: “C”, “american”, “danish”, “english”, “french”, “german”, “norwegian” or “swedish”.

kwargs :

Keyword arguments are passed to the writer.

BgCase object

class pario.case.BgCase(casename, casedir=None, workdir=None, tmpdir=None, cleanup=None, target=None)[source]

Class for running a case in a background thread.

This is a subclass of Case defining a few extra methods for dealing with threads.

BgCase objects must be subclassed or instantiated with a target. If you want to transfer data from a running BgCase instance to a calling framework, you can use the Queue module documented in https://docs.python.org/2/library/queue.html.
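Passing data from a background worker to the caller through a queue can be sketched as follows. A plain threading.Thread stands in for BgCase here, so that the example runs without pario, and the payload is made up. Note that Python 3 names the module queue, whereas the Python 2 documentation linked above calls it Queue.

```python
import queue
import threading


def target(results):
    # Stands in for the work a case would do in the background.
    results.put(('status', 'finished'))


results = queue.Queue()
worker = threading.Thread(target=target, args=(results,))
worker.start()

# Blocks until the worker has put an item (raises queue.Empty after 5 s).
item = results.get(timeout=5)
worker.join()
```

With BgCase, the queue would presumably be handed to the target through the arguments of start(), since those are passed on to run().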

get_thread()[source]

Returns a reference to the thread object. Cannot be called before start().

isAlive()[source]

Returns whether the thread is alive. Cannot be called before start().

isDaemon()[source]

Returns whether this is a daemon thread. The entire Python program exits when no more alive non-daemon threads are left.

join(timeout=None, balancing=True)[source]

Wait until the thread terminates.

This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception or until the optional timeout occurs.

When the timeout argument is present and not None, it should be a floating point number specifying a timeout for the operation in seconds (or fractions thereof). As join() always returns None, you must call isAlive() after join() to decide whether a timeout happened – if the thread is still alive, the join() call timed out.

When the timeout argument is not present or None, the operation will block until the thread terminates.

A thread can be join()ed many times.

join() raises a RuntimeError if an attempt is made to join the current thread as that would cause a deadlock. It is also an error to join() a thread before it has been started and attempts to do so raise the same exception.
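The timeout check described above looks like this with a plain threading.Thread, which is used here as a stand-in for BgCase. In Python 3 the thread method is spelled is_alive(); the isAlive() alias of threading.Thread was removed in Python 3.9, while BgCase documents its own isAlive() method.

```python
import threading

# The worker simply waits until the event is set.
stop = threading.Event()
worker = threading.Thread(target=stop.wait)
worker.start()

# join() returns None after 0.1 s; the thread is still waiting.
worker.join(timeout=0.1)
timed_out = worker.is_alive()

# Let the worker finish and join again, this time without a timeout.
stop.set()
worker.join()
finished = not worker.is_alive()
```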

setDaemon(daemonic)[source]

Sets whether this thread should be daemonic. Must be called before start().

start(*args, **kwargs)[source]

Starts the case in a background thread and invokes run() passing it args and kwargs.

ExeStatus object

class pario.case.ExeStatus(returncode, nexe, stdoutfile, stderrfile, exetime)[source]

The status after program execution returned by Case.execute().

It has the following attributes:

returncode
Program return code, or None if the program was not executed since its output is up-to-date.
nexe
Which execution number this output corresponds to.
stdoutfile
Name of the file stdout is written to or None (if up-to-date).
stderrfile
Name of the file stderr is written to or None (if up-to-date).
exetime
Execution time in seconds.
