datamatrix.series

What are series?

A SeriesColumn is a column with a depth; that is, each cell contains multiple values. Data of this kind is very common. For example, imagine a psychology experiment in which participants see positive or negative pictures while their brain activity is recorded using electroencephalography (EEG). Here, picture type (positive or negative) is a single value that could be stored in a normal table. But EEG activity is a continuous signal, and could be stored as a SeriesColumn.

function baseline(series, baseline, bl_start=-100, bl_end=None, reduce_fnc=None, method=u'subtractive')

Applies a baseline to a signal.

Example:

import numpy as np
from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series

LENGTH = 5 # Number of rows
DEPTH = 10 # Depth (or length) of SeriesColumns

sinewave = np.sin(np.linspace(0, 2*np.pi, DEPTH))

dm = DataMatrix(length=LENGTH)
# First create five identical rows with a sinewave
dm.y = SeriesColumn(depth=DEPTH)
dm.y.setallrows(sinewave)
# Add a random offset to the Y values
dm.y += np.random.random(LENGTH)
# And also a bit of random jitter
dm.y += .2*np.random.random( (LENGTH, DEPTH) )
# Baseline-correct the traces. This will remove the vertical
# offset
dm.y2 = series.baseline(dm.y, dm.y, bl_start=0, bl_end=10)

plt.clf()
plt.subplot(121)
plt.title('Original')
plt.plot(dm.y.plottable)
plt.subplot(122)
plt.title('Baseline corrected')
plt.plot(dm.y2.plottable)
plt.savefig('content/pages/img/series/baseline.png')

Figure 1.

Arguments:

• series -- The signal to apply a baseline to.
• Type: SeriesColumn
• baseline -- The signal to use as a baseline.
• Type: SeriesColumn

Keywords:

• bl_start -- The start of the window from baseline to use.
• Type: int
• Default: -100
• bl_end -- The end of the window from baseline to use, or None to go to the end.
• Type: int, None
• Default: None
• reduce_fnc -- The function to reduce the baseline epoch to a single value. If None, np.nanmedian() is used.
• Type: FunctionType, None
• Default: None
• method -- Specifies whether divisive or subtractive baseline correction should be used. (Changed in v0.7.0: subtractive is now the default)
• Type: str
• Default: 'subtractive'

Returns:

A baseline-corrected version of the signal.

• Type: SeriesColumn
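
The difference between the two values of method can be sketched in plain Python. This is not datamatrix's implementation: baseline_correct is a hypothetical helper that works on a single row (a list of floats), always reduces the baseline window with the median, and assumes that the window contains no nan values.

```python
from statistics import median

def baseline_correct(signal, baseline, bl_start=0, bl_end=None,
                     method='subtractive'):
    """Hypothetical single-row helper; not part of datamatrix."""
    # Reduce the baseline window to one reference value
    reference = median(baseline[bl_start:bl_end])
    if method == 'subtractive':
        return [value - reference for value in signal]
    if method == 'divisive':
        return [value / reference for value in signal]
    raise ValueError('method should be subtractive or divisive')

trace = [10.0, 12.0, 11.0, 15.0, 20.0]
# The first three samples serve as the baseline window (median: 11.0)
subtracted = baseline_correct(trace, trace, bl_start=0, bl_end=3)
divided = baseline_correct(trace, trace, bl_start=0, bl_end=3,
                           method='divisive')
print(subtracted)
print(divided)
```

Subtractive correction expresses the signal as a difference from the baseline value, whereas divisive correction expresses it as a proportion of the baseline value.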

function blinkreconstruct(series, vt=5, maxdur=500, margin=10, smooth_winlen=21, std_thr=3)

Reconstructs pupil size during blinks. This algorithm has been designed and tested largely with the EyeLink 1000 eye tracker.

Arguments:

• series -- A signal to reconstruct.
• Type: SeriesColumn

Keywords:

• vt -- A pupil velocity threshold. Lower thresholds more easily trigger blinks.
• Type: int, float
• Default: 5
• maxdur -- The maximum duration (in samples) for a blink. Longer blinks are not reconstructed.
• Type: int
• Default: 500
• margin -- The margin to take around missing data.
• Type: int
• Default: 10
• smooth_winlen -- No description
• Default: 21
• std_thr -- No description
• Default: 3

Returns:

A reconstructed signal.

• Type: SeriesColumn

function concatenate(*series)

Concatenates multiple series such that a new series is created with a depth that is equal to the sum of the depths of all input series.

Example:

from datamatrix import DataMatrix, SeriesColumn, series as srs

dm = DataMatrix(length=1)
dm.s1 = SeriesColumn(depth=3)
dm.s1[:] = 1,2,3
dm.s2 = SeriesColumn(depth=3)
dm.s2[:] = 3,2,1
dm.s = srs.concatenate(dm.s1, dm.s2)
print(dm.s)

Output:

col[[1. 2. 3. 3. 2. 1.]]

Argument list:

• *series: A list of series.

Returns:

A new series.

• Type: SeriesColumn

function downsample(series, by, fnc=)

Downsamples a series by a factor, so that it becomes 'by' times shorter. The depth of the downsampled series is the depth of the original series divided by 'by', rounded down. For example, downsampling a series with a depth of 10 by 3 results in a depth of 3.

Example:

import numpy as np
from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series

LENGTH = 1 # Number of rows
DEPTH = 100 # Depth (or length) of SeriesColumns

sinewave = np.sin(np.linspace(0, 2*np.pi, DEPTH))

dm = DataMatrix(length=LENGTH)
dm.y = SeriesColumn(depth=DEPTH)
dm.y.setallrows(sinewave)
dm.y2 = series.downsample(dm.y, by=10)

plt.clf()
plt.subplot(121)
plt.title('Original')
plt.plot(dm.y.plottable, 'o-')
plt.subplot(122)
plt.title('Downsampled')
plt.plot(dm.y2.plottable, 'o-')
plt.savefig('content/pages/img/series/downsample.png')

Figure 2.

Arguments:

• series -- No description
• by -- The downsampling factor.
• Type: int

Keywords:

• fnc -- The function to average the samples that are combined into 1 value. Typically an average or a median.
• Type: callable
• Default:

Returns:

A downsampled series.

• Type: SeriesColumn
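
The depth arithmetic described above (a depth of 10 downsampled by 3 gives a depth of 3) can be sketched in plain Python. The helper downsample_row is hypothetical and works on a single row; it assumes that samples that do not fill a complete group of 'by' at the end are dropped, which is consistent with the documented depth but may differ in detail from the real function.

```python
from statistics import mean

def downsample_row(row, by, fnc=mean):
    """Hypothetical single-row helper; not part of datamatrix."""
    # Leftover samples at the end are dropped (an assumption)
    new_depth = len(row) // by
    # Combine each group of 'by' consecutive samples into one value
    return [fnc(row[i * by:(i + 1) * by]) for i in range(new_depth)]

row = list(range(10))  # A row with a depth of 10
short = downsample_row(row, by=3)
print(len(short), short)
```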

function endlock(series)

Locks a series to the end, so that any nan-values that were at the end are moved to the start.

Example:

import numpy as np
from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series

LENGTH = 5 # Number of rows
DEPTH = 10 # Depth (or length) of SeriesColumns

sinewave = np.sin(np.linspace(0, 2*np.pi, DEPTH))

dm = DataMatrix(length=LENGTH)
# First create five identical rows with a sinewave
dm.y = SeriesColumn(depth=DEPTH)
dm.y.setallrows(sinewave)
# Add a random offset to the Y values
dm.y += np.random.random(LENGTH)
# Set some observations at the end to nan
for i, row in enumerate(dm):
    if i > 0:  # Skip row 0, because -0 would select the entire row
        row.y[-i:] = np.nan
# Lock the degraded traces to the end, so that all nans
# now come at the start of the trace
dm.y2 = series.endlock(dm.y)

plt.clf()
plt.subplot(121)
plt.title('Original (nans at end)')
plt.plot(dm.y.plottable)
plt.subplot(122)
plt.title('Endlocked (nans at start)')
plt.plot(dm.y2.plottable)
plt.savefig('content/pages/img/series/endlock.png')

Figure 3.

Arguments:

• series -- The signal to end-lock.
• Type: SeriesColumn

Returns:

An end-locked signal.

• Type: SeriesColumn
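
For a single row, end-locking amounts to rotating the row so that the trailing nan values end up at the front. A minimal sketch in plain Python, using a hypothetical helper endlock_row that is not part of datamatrix:

```python
import math

def endlock_row(row):
    """Hypothetical single-row helper; not part of datamatrix."""
    # Count the nan values at the end of the row
    n_trailing = 0
    for value in reversed(row):
        if not math.isnan(value):
            break
        n_trailing += 1
    # Move them to the front by rotating the row
    split = len(row) - n_trailing
    return row[split:] + row[:split]

nan = math.nan
locked = endlock_row([1.0, 2.0, 3.0, nan, nan])
print(locked)
```

The valid samples keep their order; only their position along the depth changes, so that the last valid sample of every row lines up at the end.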

function interpolate(series)

Linearly interpolates missing (nan) data.

Example:

import numpy as np
from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series

LENGTH = 1 # Number of rows
DEPTH = 100 # Depth (or length) of SeriesColumns
MISSING = 50 # Nr of missing samples

# Create a sine wave with missing data
sinewave = np.sin(np.linspace(0, 2*np.pi, DEPTH))
sinewave[np.random.choice(np.arange(DEPTH), MISSING)] = np.nan
# And turn this into a DataMatrix
dm = DataMatrix(length=LENGTH)
dm.y = SeriesColumn(depth=DEPTH)
dm.y = sinewave
# Now interpolate the missing data!
dm.i = series.interpolate(dm.y)

# And plot the original data as circles and the interpolated data as dotted
# lines
plt.clf()
plt.plot(dm.i.plottable, ':')
plt.plot(dm.y.plottable, 'o')
plt.savefig('content/pages/img/series/interpolate.png')

Figure 4.

Arguments:

• series -- A signal to interpolate.
• Type: SeriesColumn

Returns:

The interpolated signal.

• Type: SeriesColumn
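
Linear interpolation of missing samples can be sketched for a single row in plain Python. The helper interpolate_row is hypothetical and not part of datamatrix. As an assumption of this sketch, nan values before the first or after the last valid sample are left untouched; the real function may handle the edges differently.

```python
import math

def interpolate_row(row):
    """Hypothetical single-row helper; not part of datamatrix."""
    known = [i for i, value in enumerate(row) if not math.isnan(value)]
    result = list(row)
    # Walk over each pair of neighbouring valid samples and fill the gap
    # between them with evenly spaced values
    for left, right in zip(known, known[1:]):
        step = (row[right] - row[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = row[left] + step * (i - left)
    return result

nan = math.nan
filled = interpolate_row([0.0, nan, nan, 3.0, nan, 5.0])
print(filled)
```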

function lock(series, lock)

Shifts each row from a series by a certain number of steps along its depth. This is useful to lock, or align, a series based on a sequence of values.

Example:

import numpy as np
from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series as srs

LENGTH = 5 # Number of rows
DEPTH = 10 # Depth (or length) of SeriesColumns

dm = DataMatrix(length=LENGTH)
# First create five traces with a partial cosinewave. Each row is
# offset slightly on the x and y axes
dm.y = SeriesColumn(depth=DEPTH)
dm.x_offset = -1
dm.y_offset = -1
for row in dm:
    row.x_offset = np.random.randint(0, DEPTH)
    row.y_offset = np.random.random()
    row.y = np.roll(np.cos(np.linspace(0, np.pi, DEPTH)),
        row.x_offset)+row.y_offset
# Now use the x offset to lock the traces to the 0 point of the cosine,
# i.e. to their peaks.
dm.y2, zero_point = srs.lock(dm.y, lock=dm.x_offset)

plt.clf()
plt.subplot(121)
plt.title('Original')
plt.plot(dm.y.plottable)
plt.subplot(122)
plt.title('Locked to peak')
plt.plot(dm.y2.plottable)
plt.axvline(zero_point, color='black', linestyle=':')
plt.savefig('content/pages/img/series/lock.png')

Figure 5.

Arguments:

• series -- The signal to lock.
• Type: SeriesColumn
• lock -- A sequence of lock values with the same length as the Series. This can be a column, a list, a numpy array, etc.

Returns:

A (series, zero_point) tuple, in which series is a SeriesColumn and zero_point is the zero point to which the signal has been locked.
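
One plausible way to realize this kind of locking is sketched below in plain Python. The helper lock_rows is hypothetical and not part of datamatrix. The sketch assumes that the lock values are non-negative sample indices, that the largest lock value serves as the zero point, and that the positions that open up are padded with nan; the depth and padding of the real result may differ.

```python
import math

def lock_rows(rows, lock):
    """Hypothetical helper on a list of rows; not part of datamatrix."""
    zero_point = max(lock)
    depth = len(rows[0]) + zero_point
    locked = []
    for row, offset in zip(rows, lock):
        # Rows with a small lock value are shifted further to the right
        shift = zero_point - offset
        padded = [math.nan] * depth
        padded[shift:shift + len(row)] = row
        locked.append(padded)
    return locked, zero_point

rows = [[0, 9, 0, 0],   # Event of interest at index 1
        [0, 0, 0, 9]]   # Event of interest at index 3
locked, zero_point = lock_rows(rows, lock=[1, 3])
print(zero_point)
print(locked)
```

After locking, the event of interest sits at index zero_point in every row, which is why the example above can mark it with a single vertical line.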

function normalize_time(dataseries, timeseries)

New in v0.7.0

Creates a new series in which a series of timestamps (timeseries) is used as the indices for a series of data points (dataseries). This is useful, for example, if you have a series of measurements and a separate series of timestamps, and you want to combine the two.

The resulting series will generally contain a lot of nan values, which you can interpolate with interpolate().

Example:

from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series as srs, NAN

# Create a DataMatrix with one series column that contains samples
# and one series column that contains timestamps.
dm = DataMatrix(length=2)
dm.samples = SeriesColumn(depth=3)
dm.time = SeriesColumn(depth=3)
dm.samples[0] = 3, 1, 2
dm.time[0]    = 1, 2, 3
dm.samples[1] = 1, 3, 2
dm.time[1]    = 0, 5, 10
# Create a normalized column with samples spread out according to
# the timestamps, and also create an interpolated version of this
# column for smooth plotting.
dm.normalized = srs.normalize_time(
    dataseries=dm.samples,
    timeseries=dm.time
)
dm.interpolated = srs.interpolate(dm.normalized)
# And plot!
plt.clf()
plt.plot(dm.normalized.plottable, 'o')
plt.plot(dm.interpolated.plottable, ':')
plt.xlabel('Time')
plt.ylabel('Data')
plt.savefig('content/pages/img/series/normalize_time.png')

Figure 6.

Arguments:

• dataseries -- A column with datapoints.
• Type: SeriesColumn
• timeseries -- A column with timestamps. This should be an increasing list of the same depth as dataseries. NAN values are allowed, but only at the end.
• Type: SeriesColumn

Returns:

A new series in which the data points are spread according to the timestamps.

• Type: SeriesColumn
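
The idea of using timestamps as indices can be sketched for a single row in plain Python. The helper normalize_time_row is hypothetical and not part of datamatrix. The sketch assumes non-negative timestamps without nan values, and one slot per time unit from 0 up to the last timestamp; the real function may choose the depth differently.

```python
import math

def normalize_time_row(data, time):
    """Hypothetical single-row helper; not part of datamatrix."""
    # One slot per time unit, from 0 up to and including the last timestamp
    depth = int(max(time)) + 1
    normalized = [math.nan] * depth
    # Place each data point at the index given by its timestamp
    for value, timestamp in zip(data, time):
        normalized[int(timestamp)] = value
    return normalized

spread = normalize_time_row(data=[1, 3, 2], time=[0, 5, 10])
print(spread)
```

Three samples spread over eleven slots leave eight nan values, which illustrates why the result is usually interpolated afterwards.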

function reduce_(series, operation=)

Transforms series to single values by applying an operation (typically a mean) to each series.

Example:

import numpy as np
from datamatrix import DataMatrix, SeriesColumn, series

LENGTH = 5 # Number of rows
DEPTH = 10 # Depth (or length) of SeriesColumns

dm = DataMatrix(length=LENGTH)
dm.y = SeriesColumn(depth=DEPTH)
dm.y = np.random.random( (LENGTH, DEPTH) )
dm.mean_y = series.reduce_(dm.y)

print(dm)

Output:

+---+---------------------+---------------------------------------------------+
| # |        mean_y       |                         y                         |
+---+---------------------+---------------------------------------------------+
| 0 |  0.5479575317879923 | [0.45282829 0.81620012 ... 0.887073   0.55792064] |
| 1 |  0.5607641257952233 |  [0.18444932 0.71336793 ... 0.6004839 0.4020966]  |
| 2 |  0.6175386033844104 | [0.94877745 0.69829179 ... 0.73817667 0.99058437] |
| 3 | 0.42798384030403175 | [0.18921016 0.39927687 ... 0.2975318  0.27864995] |
| 4 | 0.49575749438838884 |  [0.7201395 0.1445049 ... 0.11523028 0.65230748]  |
+---+---------------------+---------------------------------------------------+

Arguments:

• series -- The signal to reduce.
• Type: SeriesColumn

Keywords:

• operation -- The operation function to use for the reduction. This function should accept series as first argument, and axis=1 as keyword argument.
• Default:

Returns:

A reduction of the signal.

• Type: FloatColumn

function smooth(series, winlen=11, wintype=u'hanning')

Smooths a signal using a window with requested size.

This method is based on the convolution of a scaled window with the signal. The signal is prepared by adding reflected copies of the signal (with the window size) at both ends, so that transients are minimized at the beginning and end of the output signal.

Example:

import numpy as np
from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series

LENGTH = 5 # Number of rows
DEPTH = 100 # Depth (or length) of SeriesColumns

sinewave = np.sin(np.linspace(0, 2*np.pi, DEPTH))

dm = DataMatrix(length=LENGTH)
# First create five identical rows with a sinewave
dm.y = SeriesColumn(depth=DEPTH)
dm.y.setallrows(sinewave)
# And add a bit of random jitter
dm.y += np.random.random( (LENGTH, DEPTH) )
# Smooth the traces to reduce the jitter
dm.y2 = series.smooth(dm.y)

plt.clf()
plt.subplot(121)
plt.title('Original')
plt.plot(dm.y.plottable)
plt.subplot(122)
plt.title('Smoothed')
plt.plot(dm.y2.plottable)
plt.savefig('content/pages/img/series/smooth.png')

Figure 7.

Arguments:

• series -- A signal to smooth.
• Type: SeriesColumn

Keywords:

• winlen -- The width of the smoothing window. This should be an odd integer.
• Type: int
• Default: 11
• wintype -- The type of window from 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'. A flat window produces a moving average smoothing.
• Type: str
• Default: 'hanning'

Returns:

A smoothed signal.

• Type: SeriesColumn
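
The reflect-and-convolve approach described above can be sketched for the 'flat' window (a moving average) in plain Python. The helper smooth_row is hypothetical and not part of datamatrix. As a simplification, it reflects only half a window at each end, which is the minimum needed for the window to fit, and it assumes an odd winlen that is smaller than the row.

```python
def smooth_row(row, winlen=3):
    """Hypothetical moving-average helper; not part of datamatrix."""
    half = winlen // 2
    # Reflect the signal at both ends so the window never runs off the edge
    padded = row[half:0:-1] + row + row[-2:-half - 2:-1]
    # A flat window, scaled so that its weights sum to 1
    window = [1 / winlen] * winlen
    # Slide the window over the padded signal
    return [
        sum(w * v for w, v in zip(window, padded[i:i + winlen]))
        for i in range(len(row))
    ]

smoothed = smooth_row([0.0, 0.0, 3.0, 0.0, 0.0], winlen=3)
print(smoothed)
```

The single spike is spread over three samples while the output keeps the depth of the input. The other window types differ only in the weights that are used.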

function threshold(series, fnc, min_length=1)

Finds samples that satisfy some threshold criterion for a given period.

Example:

import numpy as np
from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series

LENGTH = 1 # Number of rows
DEPTH = 100 # Depth (or length) of SeriesColumns

sinewave = np.sin(np.linspace(0, 2*np.pi, DEPTH))

dm = DataMatrix(length=LENGTH)
# First create a single row with a sinewave
dm.y = SeriesColumn(depth=DEPTH)
dm.y.setallrows(sinewave)
# And also a bit of random jitter
dm.y += np.random.random( (LENGTH, DEPTH) )
# Threshold the signal by > 0 for at least 10 samples
dm.t = series.threshold(dm.y, fnc=lambda y: y > 0, min_length=10)

plt.clf()
# Mark the thresholded signal
plt.fill_between(np.arange(DEPTH), dm.t[0], color='black', alpha=.25)
plt.plot(dm.y.plottable)
plt.savefig('content/pages/img/series/threshold.png')

print(dm)

Output:

+---+-------------------+-----------------------------------------------------+
| # |         t         |                          y                          |
+---+-------------------+-----------------------------------------------------+
| 0 | [1. 1. ... 0. 0.] | [0.80003818 0.83952468 ... -0.0235213   0.32785217] |
+---+-------------------+-----------------------------------------------------+

Figure 8.

Arguments:

• series -- A signal to threshold.
• Type: SeriesColumn
• fnc -- A function that takes a single value and returns True if this value exceeds a threshold, and False otherwise.
• Type: FunctionType

Keywords:

• min_length -- The minimum number of samples for which fnc must return True.
• Type: int
• Default: 1

Returns:

A series where 0 indicates below threshold, and 1 indicates above threshold.

• Type: SeriesColumn
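
The role of min_length can be sketched for a single row in plain Python. The helper threshold_row is hypothetical and not part of datamatrix. The sketch assumes that samples are marked only when fnc returns True for at least min_length consecutive samples.

```python
def threshold_row(row, fnc, min_length=1):
    """Hypothetical single-row helper; not part of datamatrix."""
    result = [0] * len(row)
    run_start = None
    # The extra None at the end closes a run that reaches the last sample
    for i, value in enumerate(row + [None]):
        hit = value is not None and fnc(value)
        if hit and run_start is None:
            run_start = i  # A new run of hits starts here
        elif not hit and run_start is not None:
            # The run has ended; keep it only if it is long enough
            if i - run_start >= min_length:
                result[run_start:i] = [1] * (i - run_start)
            run_start = None
    return result

row = [1, 1, -1, 1, 1, 1, -1]
marked = threshold_row(row, fnc=lambda y: y > 0, min_length=3)
print(marked)
```

The first two samples exceed the threshold but are not marked, because that run is shorter than min_length.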

function window(series, start=0, end=None)

Extracts a window from a signal.

Example:

import numpy as np
from matplotlib import pyplot as plt
from datamatrix import DataMatrix, SeriesColumn, series

LENGTH = 5 # Number of rows
DEPTH = 10 # Depth (or length) of SeriesColumns

sinewave = np.sin(np.linspace(0, 2*np.pi, DEPTH))

dm = DataMatrix(length=LENGTH)
# First create five identical rows with a sinewave
dm.y = SeriesColumn(depth=DEPTH)
dm.y.setallrows(sinewave)
# Add a random offset to the Y values
dm.y += np.random.random(LENGTH)
# Look only at the middle half of the signal
dm.y2 = series.window(dm.y, start=DEPTH//4, end=-DEPTH//4)

plt.clf()
plt.subplot(121)
plt.title('Original')
plt.plot(dm.y.plottable)
plt.subplot(122)
plt.title('Window (middle half)')
plt.plot(dm.y2.plottable)
plt.savefig('content/pages/img/series/window.png')

Figure 9.

Arguments:

• series -- The signal to get a window from.
• Type: SeriesColumn

Keywords:

• start -- The window start.
• Type: int
• Default: 0
• end -- The window end, or None to go to the signal end.
• Type: int, None
• Default: None

Returns:

A window of the signal.

• Type: SeriesColumn
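
The start and end values used in the example above can be checked with an ordinary Python list. This sketch assumes that start and end behave like the bounds of a Python slice, which the negative end in the example suggests but which is not stated explicitly here.

```python
DEPTH = 10
row = list(range(DEPTH))

# Unary minus binds more tightly than //, so -DEPTH//4 is (-10)//4,
# which rounds down to -3 rather than to -2
start, end = DEPTH // 4, -DEPTH // 4
print(start, end)

# Five of the ten samples remain: the middle half
middle = row[start:end]
print(middle)
```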