Manipulating light exposure data with pyActigraphy

Light manipulation

Photo by Ryunosuke Kikuno on Unsplash

Disclaimer

The development of the pyActigraphy module for analysing light exposure data was led and financially supported by members of the Daylight Academy Project The role of daylight for humans (led by Mirjam Münch, Manuel Spitschan). The module is part of the Human Light Exposure Database. For more information about the project, please see https://daylight.academy/projects/state-of-light-in-humans/.

Introduction

Just like any recording, light exposure data recordings often require some preprocessing before analysis. These steps may include cleaning, resampling, filtering, etc…

The light exposure data analysis module of pyActigraphy allows users to perform many of steps easily.

This tutorial presents how to:

  • truncate or mask spurious light exposure periods;

  • resample, binarize or filter light exposure data.

Imports and input data

As usual, let’s import the necessary packages:

[1]:
import pyActigraphy
[2]:
import pandas as pd
[3]:
import plotly.graph_objects as go
[4]:
import os

Similarly to the introduction tutorial, we will use as input data a sample file recorded by a ActTrust device (Condor Instrument), located in the test directory of the pyActigraphy package itself.

[5]:
fpath = os.path.join(
    os.path.dirname(pyActigraphy.__file__),
    'tests','data/',
    'test_sample_atr.txt'
)
[6]:
raw = pyActigraphy.io.read_raw_atr(fpath)

Truncation and data masking

The period of recording often exceeds the period during which the light recording device was actually worn by the participant; the recording might have started before giving the device to the participant or the device was removed after a certain amount of days by the participant while still recording. Or, you might simply want to analysis a fixed number of days for all your participants. Additionaly, the device might also have been removed temporarily by the participant.

In any case, it is mandatory to truncate and/or mask these periods in the recording prior to analysis.

Let’s first inspect our recording:

[7]:
raw.light.data.head(1)
[7]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-01 09:00:00 2.585957 2.195291 1.719 1.874424 1.590953 1.157154 0.0 0.0
[8]:
raw.light.data.tail(1)
[8]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-05 08:59:00 0.012837 0.004321 0.0 0.004321 0.004321 0.0 0.0 0.0

The recording starts at 9:00 on the fictional date of the 1st of January 1918 and ends exactly 4 days later.

Truncation

To truncate the recording, there are two ways:

  1. Specify a start and stop datetime on the light data:

[9]:
raw.light.start_time = '1918-01-02 09:00:00'
[10]:
raw.light.stop_time = '1918-01-04 09:00:00'
[11]:
raw.light.data.head(1)
[11]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-02 09:00:00 1.286456 0.925828 0.605305 0.70927 0.677607 0.896526 0.0 0.0
[12]:
raw.light.data.tail(1)
[12]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-04 09:00:00 1.281488 0.921166 0.604226 0.61066 0.332438 0.245513 0.0 0.0
  1. Specify a start and stop datetime when reading the input recording:

[13]:
raw_trunc = pyActigraphy.io.read_raw_atr(
    fpath,
    start_time = '1918-01-02 09:00:00',
    period='2D' # restrict input data to 2 days
)
[14]:
raw_trunc.light.data.head(1)
[14]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-02 09:00:00 1.286456 0.925828 0.605305 0.70927 0.677607 0.896526 0.0 0.0
[15]:
raw_trunc.light.data.tail(1)
[15]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-04 09:00:00 1.281488 0.921166 0.604226 0.61066 0.332438 0.245513 0.0 0.0

Both methods are equivalent.

It is possible to reset these start and stop times at any moment:

[16]:
raw.light.reset_times()

The start and stop times have been set to None:

[17]:
raw.light.start_time
[18]:
raw.light.stop_time

Masking

The ‘machinery’ to mask some periods of light data acquisition in the recording is similar to the one used for activity; it consists in dynamically masking the light data upon access. The underlying original data are kept intact and the mask can easily be turned off and on.

For more info, please see this tutorial.

Let’s first visualize the white light channel (‘LIGHT’) as well as the activity:

[19]:
layout = go.Layout(
    xaxis=dict(title="Date time"),
    yaxis=dict(title="Activity counts/period"),
    yaxis2=dict(title='Light intensity',overlaying='y',side='right'),
    showlegend=True
)
[20]:
fig1 = go.Figure([
    go.Scatter(
        x=raw_trunc.data.index.astype(str),
        y=raw_trunc.data,
        name='Activity'),
    go.Scatter(
        x=raw_trunc.light.get_channel('LIGHT').index.astype(str),
        y=raw_trunc.light.get_channel('LIGHT'),
        yaxis='y2', opacity=0.5,
        name='Light')
], layout=layout)
[21]:
fig1.show()

First, create a simple mask (i.e a series of 1, meant for being further edited by the users):

[22]:
raw.light.create_light_mask()

To simply mask a period of light data acquistion on all channels:

[23]:
raw.light.add_light_mask_period(
    start='1918-01-03 06:00:00',
    stop='1918-01-03 10:00:00'
)

However, it is possible apply such mask period on a specific channel:

[24]:
raw.light.get_channel_list()
[24]:
Index(['LIGHT', 'AMB LIGHT', 'RED LIGHT', 'GREEN LIGHT', 'BLUE LIGHT',
       'IR LIGHT', 'UVA LIGHT', 'UVB LIGHT'],
      dtype='object')
[25]:
raw.light.add_light_mask_period(
    start='1918-01-03 05:00:00',
    stop='1918-01-03 11:00:00',
    channel='RED LIGHT'
)

However, as long as the apply_mask boolean is not set to True, the data are not actually masked:

[26]:
raw.light.apply_mask
[26]:
False
[27]:
raw.light.get_channel(
    'LIGHT'
)['1918-01-03 05:58:00':'1918-01-03 10:02:00']
[27]:
DATE/TIME
1918-01-03 05:58:00    0.460898
1918-01-03 05:59:00    0.495544
1918-01-03 06:00:00    0.328380
1918-01-03 06:01:00    0.354108
1918-01-03 06:02:00    0.466868
                         ...
1918-01-03 09:58:00    0.079181
1918-01-03 09:59:00    0.079181
1918-01-03 10:00:00    0.082785
1918-01-03 10:01:00    0.079181
1918-01-03 10:02:00    0.071882
Freq: T, Name: LIGHT, Length: 245, dtype: float64
[28]:
raw.light.get_channel(
    'RED LIGHT'
)['1918-01-03 04:58:00':'1918-01-03 11:02:00']
[28]:
DATE/TIME
1918-01-03 04:58:00    0.075547
1918-01-03 04:59:00    0.110590
1918-01-03 05:00:00    0.152288
1918-01-03 05:01:00    0.133539
1918-01-03 05:02:00    0.143015
                         ...
1918-01-03 10:58:00    0.845718
1918-01-03 10:59:00    0.475671
1918-01-03 11:00:00    0.644439
1918-01-03 11:01:00    0.414973
1918-01-03 11:02:00    0.380211
Freq: T, Name: RED LIGHT, Length: 365, dtype: float64
[29]:
raw.light.apply_mask = True
[30]:
raw.light.get_channel(
    'LIGHT'
)['1918-01-03 05:58:00':'1918-01-03 10:02:00']
[30]:
DATE/TIME
1918-01-03 05:58:00    0.460898
1918-01-03 05:59:00    0.495544
1918-01-03 06:00:00         NaN
1918-01-03 06:01:00         NaN
1918-01-03 06:02:00         NaN
                         ...
1918-01-03 09:58:00         NaN
1918-01-03 09:59:00         NaN
1918-01-03 10:00:00         NaN
1918-01-03 10:01:00    0.079181
1918-01-03 10:02:00    0.071882
Freq: T, Name: LIGHT, Length: 245, dtype: float64
[31]:
raw.light.get_channel(
    'RED LIGHT'
)['1918-01-03 04:58:00':'1918-01-03 11:02:00']
[31]:
DATE/TIME
1918-01-03 04:58:00    0.075547
1918-01-03 04:59:00    0.110590
1918-01-03 05:00:00         NaN
1918-01-03 05:01:00         NaN
1918-01-03 05:02:00         NaN
                         ...
1918-01-03 10:58:00         NaN
1918-01-03 10:59:00         NaN
1918-01-03 11:00:00         NaN
1918-01-03 11:01:00    0.414973
1918-01-03 11:02:00    0.380211
Freq: T, Name: RED LIGHT, Length: 365, dtype: float64

The masked data have been replaced with NaN, on the fly.

To visualize this, let’s superimpose the light exposure data and the mask:

[32]:
layout = go.Layout(
    xaxis=dict(title="Date time"),
    yaxis=dict(title="Activity counts/period"),
    yaxis2=dict(title='Mask',overlaying='y',side='right'),
    showlegend=True
)
[33]:
fig2 = go.Figure([
    go.Scatter(
        x=raw.light.get_channel('LIGHT').index.astype(str),
        y=raw.light.get_channel('LIGHT'),
        name='Light'),
    go.Scatter(
        x=raw.light.mask.index.astype(str),
        y=raw.light.mask.loc[:,'LIGHT'],
        yaxis='y2', opacity=0.5,
        name='Mask')
], layout=layout)
[34]:
fig2.show()

Data resampling & binarization

In order to analyse the light exposure data, it is sometimes more convenient to work with data resampled at a lower frequency than thr acquisition frequency. It could be also useful to binarize the light exposure data; light data are replaced with ‘1’ if they are above a certain threshold and with ‘0’ otherwise.

Both functionalities are readily available within the light expsoure data analysis module of pyActigraphy.

Resampling

To resample the light data to a 5-min period:

[35]:
help(raw.light.resampled_data)
Help on method resampled_data in module pyActigraphy.recording.base:

resampled_data(rsfreq, agg='sum') method of pyActigraphy.light.light.LightRecording instance
    # TODO: @lru_cache(maxsize=6) ???

It is possible to specify the aggregation function used during the resampling. By default, the data are summed over each resampled periods. However, it is possible to use a ‘mean’:

[36]:
raw.light.resampled_data(rsfreq='5min', agg='mean').head(5)
[36]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-01 09:00:00 2.518573 2.128190 1.647580 1.811510 1.522233 1.105414 0.0 0.0
1918-01-01 09:05:00 2.469647 2.079500 1.597877 1.764314 1.476032 1.062723 0.0 0.0
1918-01-01 09:10:00 2.552769 2.162241 1.684724 1.843288 1.559470 1.132460 0.0 0.0
1918-01-01 09:15:00 2.547663 2.157165 1.678734 1.838214 1.555122 1.134545 0.0 0.0
1918-01-01 09:20:00 2.229805 1.841657 1.377844 1.545315 1.232937 0.925625 0.0 0.0

The resulting time series has now an index with an epoch length of 5 min.

Binarization

[37]:
help(raw.light.binarized_data)
Help on method binarized_data in module pyActigraphy.recording.base:

binarized_data(threshold, rsfreq=None, agg='sum') method of pyActigraphy.light.light.LightRecording instance
    # TODO: @lru_cache(maxsize=6) ???

[38]:
raw.light.data.head(5)
[38]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-01 09:00:00 2.585957 2.195291 1.719000 1.874424 1.590953 1.157154 0.0 0.0
1918-01-01 09:01:00 2.512151 2.121790 1.643650 1.803730 1.516932 1.095518 0.0 0.0
1918-01-01 09:02:00 2.506884 2.116541 1.630530 1.802500 1.514415 1.103462 0.0 0.0
1918-01-01 09:03:00 2.495544 2.105272 1.625312 1.790074 1.495544 1.084934 0.0 0.0
1918-01-01 09:04:00 2.492327 2.102056 1.619406 1.786822 1.493319 1.086004 0.0 0.0

To binarize data with a specific threshold:

[39]:
raw.light.binarized_data(threshold=2.5).head(5)
[39]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-01 09:00:00 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1918-01-01 09:01:00 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1918-01-01 09:02:00 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1918-01-01 09:03:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1918-01-01 09:04:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

As expected, light data timepoints with a value below 2.5 are set to ‘0’ and to ‘1’ otherwise.

Resampling and binarization at the same time

With the binarization function, it is also possible to first resample the light exposure data before binarization

To resample to a 5-min period and then binarize the data:

[40]:
raw.light.binarized_data(
    threshold=2.5,
    rsfreq='5min',
    agg='mean'
).head(5)
[40]:
LIGHT AMB LIGHT RED LIGHT GREEN LIGHT BLUE LIGHT IR LIGHT UVA LIGHT UVB LIGHT
DATE/TIME
1918-01-01 09:00:00 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1918-01-01 09:05:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1918-01-01 09:10:00 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1918-01-01 09:15:00 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1918-01-01 09:20:00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Data filtering

Instead of resampling the data, it is possible to retain the original sampling frequency while getting rid of the high frequency fluctuations by simply filtering the data.

The light exposure data analysis module of pyActigraphy provides an easy way to instaciate and apply a Butterworth filter on the light data. This is essentially a wrapper around the Scipy’s scipy.signal.butter function.

[41]:
raw.light.apply_mask = False
[42]:
help(raw.light.filter_butterworth)
Help on method filter_butterworth in module pyActigraphy.light.light_metrics:

filter_butterworth(fc_low, fc_high, N, channels=None) method of pyActigraphy.light.light.LightRecording instance
    Butterworth filtering

    Forward-backward digital filtering using a Nth order Butterworth filter

    Parameters
    ----------
    fc_low: float
        Critical frequency (lower).
    fc_high: float
        Critical fequency (higher).
    N: int
        Order of the filter
    channels: list of str, optional.
        Channel list. If set to None, use all available channels.
        Default is None.

    Returns
    -------
    filt: pd.DataFrame
        Filtered signal, per channel.

    Notes
    -----

    This function is essentially a wrapper to the scipy.signal.butter
    function. For more information, see [1]_.

    References
    ----------

    .. [1] https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.butter.html

To filter the light exposure data with a 4th order low-pass Butterworth filter with a frequency cut a 100th of the original sampling frequency:

[43]:
wlight_filtered = raw.light.filter_butterworth(
    fc_low=None,
    fc_high=(1/30)/100,
    N=4,
    channels=['LIGHT']
)
[44]:
fig3 = go.Figure(
    data=[
        go.Scatter(
            x=raw.light.get_channel('LIGHT').index.astype(str),
            y=raw.light.get_channel('LIGHT'),name='Raw light data'),
        go.Scatter(
            x=wlight_filtered.index.astype(str),
            y=wlight_filtered.loc[:,'LIGHT'],
            name='Filtered data'
        ),
    ]
)
[45]:
fig3.show()

Et voilà! For now…