Read/write tabular data to/from NumPy record arrays. Alpack tables, CSV and Excel formats are supported.
A table has three components:
- heading - Initial lines starting with a hash (#), not including
the header (only supported with Alpack tables).
- header - The last initial line starting with a hash (#).
- data - The lines following the header. A blank line or a line
starting with a hash (#) ends the data section.
Example of an Alpack table:
# This is a heading...
#
# time(s) T(C) Css PZ(Pa)
0.0 330.0 0.0053138 0.0
1.0 330.0 0.0053138 0.0
5.0 330.0 0.0053138 0.0
10.0 330.0 0.0053138 0.0
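The three components can be illustrated with a small stand-alone parser. This is only a sketch using plain NumPy, not the module's own reader: the function name `parse_simple_table` is hypothetical, all columns are assumed to be floats, and tags and multi-line fieldheads are ignored.

```python
import numpy as np

EXAMPLE = """\
# This is a heading...
#
# time(s) T(C) Css PZ(Pa)
0.0 330.0 0.0053138 0.0
1.0 330.0 0.0053138 0.0
5.0 330.0 0.0053138 0.0
10.0 330.0 0.0053138 0.0
"""


def parse_simple_table(text):
    """Split text into heading lines and a record array (hypothetical helper)."""
    lines = text.splitlines()
    i = 0
    initial = []
    # Collect the initial block of lines starting with a hash.
    while i < len(lines) and lines[i].startswith("#"):
        initial.append(lines[i][1:].strip())
        i += 1
    if not initial:
        raise ValueError("table has no header line")
    # The last initial hash line is the header; the rest is the heading.
    heading, header = initial[:-1], initial[-1].split()
    data = []
    # A blank line or a line starting with a hash ends the data section.
    while i < len(lines) and lines[i].strip() and not lines[i].startswith("#"):
        data.append(tuple(float(tok) for tok in lines[i].split()))
        i += 1
    dtype = [(name, "f8") for name in header]
    return heading, np.array(data, dtype=dtype).view(np.recarray)


heading, table = parse_simple_table(EXAMPLE)
```

Here `table['T(C)']` gives the temperature column and `heading` holds the two heading lines with the hash stripped.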
Splits a string considering escape characters and quotes.
Parameters:
  s : string
  sep : string
  escape : None | string
  keep : bool
  quote : None | string
  endquote : None | string
  maxsplit : None | int
Examples
>>> escaped_split(r'a\:\=b=c', '=')
['a\\:\\=b', 'c']
>>> escaped_split(r'a\:\=b=c', '=', keep=False)
['a\\:=b', 'c']
>>> escaped_split(r'a\:\=b=cd', '=')
['a\\:\\=b', 'cd']
>>> escaped_split(r'a\:\=b=cd', '=', keep=False)
['a\\:=b', 'cd']
>>> escaped_split('col1[col2=u]:opt=v', '=', quote='[', endquote=']')
['col1[col2=u]:opt', 'v']
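The escape handling shown in the first four examples can be sketched as below. This is an illustration only, not the implementation of escaped_split(): the name `split_escaped` is hypothetical, and the quote, endquote and maxsplit arguments are not handled. It assumes, as the examples suggest, that keep=False drops only the escape in front of a separator.

```python
def split_escaped(s, sep, escape="\\", keep=True):
    """Split s at unescaped occurrences of sep (hypothetical sketch)."""
    if not sep:
        raise ValueError("sep must be a non-empty string")
    parts, current, i = [], [], 0
    while i < len(s):
        nxt = i + len(escape)
        if s.startswith(escape, i) and nxt < len(s):
            if s.startswith(sep, nxt):
                # Escaped separator: never split here.
                if keep:
                    current.append(escape)
                current.append(sep)
                i = nxt + len(sep)
            else:
                # Escape in front of any other character is kept verbatim.
                current.append(escape + s[nxt])
                i = nxt + 1
            continue
        if s.startswith(sep, i):
            # Unescaped separator: close the current part.
            parts.append("".join(current))
            current = []
            i += len(sep)
            continue
        current.append(s[i])
        i += 1
    parts.append("".join(current))
    return parts


with_escapes = split_escaped(r"a\:\=b=c", "=")
without_escapes = split_escaped(r"a\:\=b=c", "=", keep=False)
```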
Reads a record array from an Excel document.
Parameters:
  filename : string
  sheetname : None | string
  startcell : string
  endcell : None | string
  has_header : bool
  names : None | sequence of strings | comma-separated string
  dtype : dtype instance
  formats : None | sequence of strings
  titles : None | sequence of strings | comma-separated string
Returns (row, col) tuple given a cell id.
Examples
>>> excel_cell2tuple('B1')
(0, 1)
Returns column number n as string.
Examples
>>> excel_column2string(0)
'A'
>>> excel_column2string(26)
'AA'
Returns column label s as number.
Examples
>>> excel_string2column('A')
0
>>> excel_string2column('AA')
26
Returns cell id given (row, col) tuple.
Examples
>>> excel_tuple2cell((0, 1))
'B1'
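The four conversions above amount to bijective base-26 arithmetic on the column letters plus a shift between one-based Excel rows and zero-based indices. A minimal sketch, using hypothetical function names rather than the module's own:

```python
import re


def column_to_string(n):
    """Zero-based column number to letters: 0 -> 'A', 26 -> 'AA'."""
    n += 1  # work with one-based numbers, since there is no zero digit
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def string_to_column(s):
    """Column letters to zero-based column number: 'A' -> 0, 'AA' -> 26."""
    n = 0
    for ch in s.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n - 1


def cell_to_tuple(cell):
    """Cell id such as 'B1' to a zero-based (row, col) tuple."""
    m = re.fullmatch(r"([A-Za-z]+)([0-9]+)", cell)
    if m is None:
        raise ValueError("invalid cell id: %r" % (cell,))
    return int(m.group(2)) - 1, string_to_column(m.group(1))


def tuple_to_cell(pos):
    """Zero-based (row, col) tuple to a cell id such as 'B1'."""
    row, col = pos
    return column_to_string(col) + str(row + 1)
```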
Returns all available filters. Input formats are returned if reading is true, otherwise output formats are returned.
Returns column col from record array rec, where col is a field name, title or number (counting from first_field). If unit is given, the output will be converted to this unit.
Returns the given column from filename. If index is given, it must be a valid NumPy index, or a string representation of it. **kwargs is passed on to read().
Returns guessed format based on fmt or filename. If no format can be determined, fallback is returned.
Looks up name in the local, global or builtin scopes in turn, returning its value in the scope where it is first found.
Parses the heading and extracts description, tags and fieldheads.
Parameters:
  lines : sequence | string | bytes
  lineno : int
  comment : string
  maxheads : None | int
  converters : None | sequence | string
Returns:
  tags : dict
  fieldheads : list
  lineno : int
Notes
A heading is an initial set of lines before the actual data, each starting with the comment character/string. It may contain:
description - running text describing the data. This is returned as tags['description']. Lines that are neither tags nor fieldheads are treated as description lines.
tags - a more formal description of the data with tag names and associated values. Tag lines are indented and of the form:
key :: value
fieldheads - the last lines in the heading containing the same number of tokens as the number of fields/columns in the following data. Normally it is just the header, but it might span several lines. A good practice is to leave a blank line (except for the initial comment) before the fieldheads, so that the line above is not unintentionally treated as a fieldhead.
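The split between description lines and `key :: value` tag lines can be sketched as follows. This is a simplified illustration, not the module's parser: the name `split_heading` is hypothetical, any line containing `::` is assumed to be a tag regardless of indentation, values are kept as strings, and fieldheads are not detected, so only the lines above the fieldheads should be passed in.

```python
def split_heading(lines, comment="#"):
    """Collect tags and description from heading lines (hypothetical sketch)."""
    tags = {}
    description = []
    for line in lines:
        if not line.startswith(comment):
            break  # the heading ends at the first non-comment line
        body = line[len(comment):]
        key, sep, value = body.partition("::")
        if sep:
            # Tag line of the form "key :: value".
            tags[key.strip()] = value.strip()
        elif body.strip():
            # Any other non-empty line is running description text.
            description.append(body.strip())
    tags["description"] = " ".join(description)
    return tags


example_tags = split_heading([
    "# Temperature log",
    "#   author :: Jane",
    "#",
])
```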
Reads filename and returns a NumPy record array.
Parameters:
  filename : string
  fmt : None | ‘csv’ | ‘xls’ | ‘txt’
  names : None | sequence of strings | comma-separated string
  lang : None | string
  dtype : dtype instance
  formats : None | sequence of strings
  titles : None | sequence of strings | comma-separated string
  kwargs :
Reads data from files and returns it as a list of 1d NumPy arrays.
filespec is a comma-separated list of one or more file names, followed by a colon-separated list of column names and option-value pairs.
If the file names are relative, they are taken relative to the directory cwd.
The grammar for filespec can formally be written as:
filespec ::= filename (":" columnspec | ":" option)* ["," filespec]
columnspec ::= column_id [ "[" indexspec "]" ]
option ::= name "=" value
indexspec ::= integer | slice | column_id "=" val ("," val)*
column_id ::= integer | string
Comma, colon, equal and backslash characters can be backslash-escaped.
Columns can be specified either by name or by number, optionally followed by a pair of brackets ([]) supporting standard indexing.
If your data file has columns ‘year’ and ‘population’ it is also possible to write population[year=2012,2014]. This will return the population at 2012 and 2014 using linear interpolation. The column ‘year’ must be increasing. The options left and right (described below) are used if a specified year is outside the range of the data.
The name=value options are passed to recio.read() as keyword arguments to the reader. However, a few options are interpreted by readdata() and not passed on.
Notes
This function uses eval() for the bracket indexing, which might be a security risk if filespec comes from an untrusted user.
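When filespec may come from an untrusted source, one option is to validate the bracket content before it reaches readdata(). The sketch below is not part of this module: `parse_index` is a hypothetical helper that accepts only a single integer or a slice and builds the index without eval().

```python
def parse_index(text):
    """Turn '3' or '1:10' or '::2' into an int or slice without eval()."""
    text = text.strip()
    if ":" in text:
        parts = text.split(":")
        if len(parts) > 3:
            raise ValueError("too many ':' in index: %r" % (text,))
        # Empty slice fields map to None, as in ordinary Python slicing.
        return slice(*[int(p) if p.strip() else None for p in parts])
    # int() raises ValueError for anything that is not a plain integer.
    return int(text)
```

Anything that is not a plain integer or slice raises ValueError, so arbitrary expressions are rejected instead of evaluated.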
Examples
Read ‘col1’ and ‘col2’ from ‘filename’ and return them:
filename:col1:col2
Read ‘col1’ and ‘col2’ from both ‘file1’ and ‘file2’ (returning four columns) and pass opt=val as keyword argument to recio.read() when reading the files:
file1,file2:col1:col2:opt=val
Read 3 columns, two from file1 and one from ‘file2’, passing opt1=val1 and opt2=val2 to recio.read():
file1:col1:col2:opt1=val1:opt2=val2,file2:col3
Read element 1 to 10 from column ‘col’ in ‘file’:
file:col[1:10]
Read the columns ‘year’ and ‘population’ from ‘population.txt’ and return interpolated population at 2012 and 2014:
population.txt:population[year=2012,2014]
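The interpolated lookup in the last example corresponds to plain linear interpolation, which can be reproduced with np.interp(). The data values and the helper name `interpolate_column` below are made up for illustration, and mapping the left and right options directly onto the np.interp() arguments of the same name is an assumption.

```python
import numpy as np

# Made-up sample data standing in for the columns of 'population.txt'.
year = np.array([2010.0, 2013.0, 2016.0])
population = np.array([100.0, 130.0, 160.0])


def interpolate_column(x, y, targets, left=None, right=None):
    """Linearly interpolate y(x) at targets; x must be strictly increasing."""
    x = np.asarray(x, dtype=float)
    if np.any(np.diff(x) <= 0):
        raise ValueError("the index column must be increasing")
    # left/right are returned for targets below/above the range of x.
    return np.interp(targets, x, y, left=left, right=right)


inside = interpolate_column(year, population, [2012, 2014])
outside = interpolate_column(year, population, [2000], left=0.0)
```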
A subclass of np.recarray containing most of the I/O features provided with this module in addition to:
- field units
- field heads, i.e.
- tags
Parameters:
  array : array_like
  dtype : data-type
  names : None | sequence of strings | string
  units : bool | sequence of strings | string
  titles : None | sequence of strings | string
  tags : None | mapping | string
  fieldheads : None | sequence
  formats : list of data-types
  aligned : bool
  byteorder : {‘<’, ‘>’, ‘=’}
  copy : bool
Examples
>>> rec([(u'a', 1, 1.1), ('b', 2, 2.2)], units=False)
rec([(u'a', 1, 1.1), (u'b', 2, 2.2)],
dtype=[('f0', '<U4'), ('f1', '<i8'), ('f2', '<f8')])
Returns a new rec object with arrays arrs appended as new fields.
Note that arrs must be a sequence, even if only a single field is appended.
The arguments names, formats, and units apply to the new fields. They can be provided as sequences or as comma-separated strings.
fieldheads is a sequence of field heads for each array in arrs. Note that this implies that fieldheads is transposed compared to the fieldheads property.
Returns a view as a numpy record array.
Returns a view of self with the named fields dropped.
Opens a graphical window for editing values.
This is a convenient wrapper around arrayedit.arrayedit() with wait set to true. Refer to this function for a description of arguments and return value.
Direct access to list of fieldheads.
Reads filename and returns a rec instance.
This is a convenient wrapper around read(). Refer to this function for a description of the arguments.
Parses lines and returns a rec instance.
This is a convenient wrapper around string2rec(). Refer to this function for a description of the arguments.
Insert a new field head to the list of field heads (before index).
fieldhead must be a sequence of the same length as the number of fields. If index is “end”, the new field head is appended.
Field labels.
Field names.
Direct access to tags dictionary.
Writes self to filename.
This is a convenient wrapper around write(). Refer to this function for a description of the arguments.
Returns self as string.
This is a convenient wrapper around rec2string(). Refer to this function for a description of arguments and return value.
Field units.
Returns a view of self with only the named fields.
Formats a NumPy record array as an HTML table.
Column labels can be grouped together by prefixing a set of adjacent labels with the same prefix followed by a colon.
Parameters:
  rec : record array
  indent : int
  columns : None | sequence
  formatters : None | str | callable | dict
  tableclass : None | str
  rowclasses : None | sequence of strings
  columnclasses : None | sequence of strings
  linesep : str
Examples
>>> import numpy as np
>>> rec = np.rec.fromarrays([(1, 2, 3), ('a', 'b', 'c'), (1.2, 2.3, 4.3)],
... names=('i', 's', 'f'))
>>> columns = [('s', 'string'), ('i', 'number:int'), ('f', 'number:float')]
>>> print(rec2html(rec, columns=columns))
<table>
<tr>
<th rowspan="2">string</th>
<th colspan="2">number</th>
</tr>
<tr>
<th>int</th>
<th>float</th>
</tr>
<tr>
<td>a</td>
<td>1</td>
<td>1.2</td>
</tr>
<tr>
<td>b</td>
<td>2</td>
<td>2.3</td>
</tr>
<tr>
<td>c</td>
<td>3</td>
<td>4.3</td>
</tr>
</table>
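The two-row header in this output follows from grouping adjacent labels that share a prefix before the colon. The grouping step alone can be sketched as below; `group_labels` is a hypothetical helper, not a function of this module, and it does not generate any HTML.

```python
def group_labels(labels):
    """Group adjacent 'prefix:sub' labels; returns (top_label, sub_labels) pairs."""
    groups = []
    for label in labels:
        prefix, sep, rest = label.partition(":")
        if not sep:
            # Ungrouped label: it would span both header rows.
            groups.append((label, []))
        elif groups and groups[-1][0] == prefix and groups[-1][1]:
            # Same prefix as the previous grouped label: extend that group.
            groups[-1][1].append(rest)
        else:
            groups.append((prefix, [rest]))
    return groups


header_groups = group_labels(["string", "number:int", "number:float"])
```

For the example above this gives one ungrouped label, 'string', and one group 'number' with the two sub-labels 'int' and 'float', matching the rowspan and colspan attributes in the generated table.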
Convert record array rec to a string.
Parameters:
  heading : None | sequence of strings
  header : None | sequence of strings | comma-separated string | nested seq
  sep : string
  linesep : None | string
  comment : string
  alignments : None | string | sequence of strings
  precisions : None | int | sequence of None/int
  types : None | string | sequence of strings
Notes
See https://docs.python.org/2/library/string.html#format-specification-mini-language for more information about the type characters.
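As a reminder of how alignment, precision and type characters combine in the format specification mini-language, the sketch below builds a format spec for a single value. The helper `format_cell` and its explicit width argument are hypothetical; how rec2string() actually assembles its specs is not shown here.

```python
def format_cell(value, width, alignment=">", precision=None, type_char=""):
    """Format one value using [align][width][.precision][type]."""
    spec = alignment + str(width)
    if precision is not None:
        spec += "." + str(precision)
    spec += type_char
    return format(value, spec)


right_aligned = format_cell(3.14159, 10, ">", precision=3, type_char="f")
left_aligned = format_cell("abc", 6, "<", type_char="s")
```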
Returns a new record array with the arrays in arrs appended as new fields. The arguments names, formats, and titles apply to the new fields. They can be provided as sequences or as comma-separated strings.
If rec has titles, the default title is the initial part of name up to the first left parenthesis if name contains one, otherwise name with “()” appended to it.
Returns a list with names converted to proper field names of rec. names may be a sequence or a comma-separated string with field names, titles or field numbers. first_field is the number of the first field.
Returns a list of titles. Unset titles will be:
LABEL if name is “LABEL(UNIT)” [i.e. name contains ‘(‘]
LABEL() if name is “LABEL” [i.e. name doesn’t contain ‘(‘]
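The default-title rule can be written as a one-line helper. The name `default_title` is hypothetical and the sketch covers only the rule stated above for a single field name.

```python
def default_title(name):
    """'LABEL(UNIT)' -> 'LABEL'; 'LABEL' -> 'LABEL()'."""
    label, paren, _ = name.partition("(")
    # Appending "()" keeps the title distinct from a name without a unit.
    return label if paren else name + "()"


titles = [default_title(n) for n in ("time(s)", "number")]
```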
Returns a view of self with the named fields dropped.
Returns a list of field formats.
Returns the index of the given column, where col is a field name, title or number (counting from first_field).
Returns the label of the given column, where col is a field name, title or number (counting from first_field).
Returns the name of the given column, where col is a field name, title or number (counting from first_field).
Returns the title of the given column, where col is a field name, title or number (counting from first_field).
Returns the unit of the given column, where col is a field name, title or number (counting from first_field). An empty string is returned if the column has no unit.
Returns true if record array rec has titles, otherwise false.
Returns a view of rec with arr inserted before position index. The arguments name, format, title and offset apply to the new field.
If rec has titles, the default title is the initial part of name up to the first left parenthesis if name contains one, otherwise name with “()” appended to it.
Returns a view of rec with only the named fields.
Returns a list of field names.
Returns a list of field offsets.
Sets the name of the given column to value, where col is a field name, title or number (counting from first_field).
NOTE that this function does not change the titles. Hence:
rec_setlabel(rec, 0, 'length(m)')
will not make “length” an alias for the first column.
Sets the name of the given column to value, where col is a field name, title or number (counting from first_field).
Returns a list of column units (the parts of the field names enclosed in parentheses). An empty string is returned for fields without a unit.
Sets column col in record array rec to value, where col is a field name, title or number (counting from first_field). unit is the unit of value. If given, value will first be converted to the unit of the given column.
Sets the given column in filename to value. If index is given, it must be a valid NumPy index, or a string representation of it. **kwargs is passed on to read().
Splits label into a name and unit-part, where unit is optional. If unit is included, it follows name and is enclosed in parentheses.
Examples
>>> split_label('length')
('length', '')
>>> split_label('length(m)')
('length', 'm')
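A regular expression is one way to implement this split. The sketch below uses a hypothetical name, `split_name_unit`, and assumes that a label which does not end in a single parenthesised unit is returned unchanged with an empty unit.

```python
import re

# name without parentheses, optionally followed by "(unit)" at the end
_LABEL_RE = re.compile(r"^(?P<name>[^()]*)(?:\((?P<unit>.*)\))?$")


def split_name_unit(label):
    """Split 'name(unit)' into ('name', 'unit'); the unit part is optional."""
    m = _LABEL_RE.match(label.strip())
    if m is None:
        # Assumption: labels of an unexpected shape are treated as unit-less.
        return label, ""
    return m.group("name").strip(), m.group("unit") or ""
```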
Converts the string value to bool and returns the result. Valid strings are:
'TRUE', 'True', 'true', 'YES', 'Yes', 'yes', '.TRUE.', '.true.'
or:
'FALSE', 'False', 'false', 'NO', 'No', 'no', '.FALSE.', '.false.'
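A lookup in two sets is enough to implement this conversion. In the sketch below the name `to_bool` is hypothetical, and raising ValueError for any other string is an assumption, chosen so that the function can serve as one step in a chain of converters.

```python
TRUE_STRINGS = {"TRUE", "True", "true", "YES", "Yes", "yes", ".TRUE.", ".true."}
FALSE_STRINGS = {"FALSE", "False", "false", "NO", "No", "no", ".FALSE.", ".false."}


def to_bool(value):
    """Convert one of the listed strings to bool (hypothetical sketch)."""
    if value in TRUE_STRINGS:
        return True
    if value in FALSE_STRINGS:
        return False
    # Assumption: unknown strings are rejected rather than guessed.
    raise ValueError("cannot convert %r to bool" % (value,))
```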
Parse a table given in lines and return a record array.
Parameters:
  lines : string | sequence of strings
  lineno : int
  names : None | sequence of strings | comma-separated string
  dtype : dtype instance
  formats : None | sequence of strings
  titles : None | sequence of strings | comma-separated string
  converters : None | sequence | string
  maxfields : None | int
  maxheads : None | int
  comment : string
  section_separator : None | string
  section : int | slice
  full_output : bool
Returns:
  rec : record array
  tags : dict (optional)
  fieldheads : list (optional)
  lineno : int (optional)
Converts array to a numpy record array.
Parameters:
  array : array_like
  dtype : data-type
  names : None | sequence of strings | string
  units : bool | sequence of strings | string
  titles : None | sequence of strings | string
  formats : list of data-types
  aligned : bool
  byteorder : {‘<’, ‘>’, ‘=’}
  copy : bool
Examples
>>> a = [(0., 0), (1., 1), (2., 4)]
>>> rec1 = torecord(a, names='time(s),number')
>>> rec1
rec.array([(0.0, 0), (1.0, 1), (2.0, 4)],
dtype=[('time(s)', '<f8'), ('number', '<i8')])
>>> rec1['time']
Traceback (most recent call last):
...
ValueError: field named time not found...
>>> rec2 = torecord(a, names='time(s),number', units=True)
>>> rec2
rec.array([(0.0, 0), (1.0, 1), (2.0, 4)],
dtype=[(('time', 'time(s)'), '<f8'), (('number()', 'number'), '<i8')])
>>> rec2['time']
array([ 0., 1., 2.])
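The alias behaviour of rec2 relies on NumPy field titles: a field declared as ((title, name), format) can be indexed by either string. The sketch below reproduces this with plain NumPy, without torecord(), using the same dtype as in the output above.

```python
import numpy as np

# Each field is declared as ((title, name), format).
dtype = np.dtype([(("time", "time(s)"), "<f8"), (("number()", "number"), "<i8")])
arr = np.array([(0.0, 0), (1.0, 1), (2.0, 4)], dtype=dtype)

by_name = arr["time(s)"]  # access by field name
by_title = arr["time"]    # access by title, an alias for the same field
```

Only the names appear in `dtype.names`; the titles are additional keys in `dtype.fields`.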
Tries to convert value to another type using the converters in the sequence converters. converters may also be a comma-separated string of converter names (looked up in the local, global or builtin scopes).
The new value after the first successful conversion is returned.
If value is a sequence, a list is returned with all elements converted using the first conversion that works for all of them.
If converters is None, it defaults to (int, float, ast.literal_eval, string2bool, str).
Examples
>>> typeconvert('3.4')
3.4
>>> typeconvert('3.4a')
'3.4a'
>>> typeconvert('yes')
True
>>> typeconvert('yes', converters='int,float,string2bool')
True
>>> typeconvert(['3', '1', '2.2'])
[3.0, 1.0, 2.2]
>>> typeconvert([3, 1, 2.2]) # Note that 2.2 is truncated!
[3, 1, 2]
>>> typeconvert(['a', 1, 2.2])
['a', '1', '2.2']
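The behaviour in these examples follows from trying each converter in order and keeping the first one that succeeds for every element. A minimal sketch of that loop, with a hypothetical name `convert_first` and a shorter default converter list that leaves out string2bool:

```python
import ast


def convert_first(value, converters=(int, float, ast.literal_eval, str)):
    """Return value converted by the first converter that does not fail."""
    for conv in converters:
        try:
            if isinstance(value, (list, tuple)):
                # A converter must work for all elements to be accepted.
                return [conv(v) for v in value]
            return conv(value)
        except (ValueError, TypeError, SyntaxError):
            continue  # try the next converter
    raise ValueError("no converter succeeded for %r" % (value,))
```

Because int is tried first and accepts floats, a list such as [3, 1, 2.2] is converted with int and the last element is truncated, which is the effect noted in the example above.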
Write NumPy record array rec to filename.
Parameters:
  filename : string
  rec : record array
  fmt : None | ‘csv’ | ‘xls’ | ‘txt’
  lang : None | string
  kwargs :