SLIDING window analysis                                                                   Sanjay Manohar

How does it work?

Description

  • These tools allow you to relate paired measurements to each other. It works a bit like a correlation.
  • Typically, on each trial, you might measure two things, for example, choice and reaction time. But these tools can be used for any paired measurements, X and Y.
  • A standard analysis would be a scatter plot of Y against X. Then perhaps linear or nonlinear regression could be performed to find a relationship.
  • These tools allow you to relate X to Y without assuming a linear model.
  • The analysis divides X into quantile bins, across all trials.
  • The bins overlap, so there is a sliding window, with a fixed number of trials in it.
  • For each bin, the mean value of Y (or some other statistic) is calculated.

Subjects

  • This procedure can be done for each subject or dataset independently.
  • The mean values of Y, with the corresponding bin centres of X, are given for each subject, and can be plotted.
  • Our way of plotting these is to take the across-subjects mean Y for each bin, and plot this against the mean bin centre X, along with standard errors.

Statistics

  • To compare different conditions, where each subject has data from both conditions, we can do a paired permutation test. The resulting per-subject Y values, as a function of X, can thus be compared.
  • To do this, we use a t statistic, and test the null hypothesis that “no bins are different, between the two conditions”, by randomly permuting the conditions and calculating the maximum t-statistic over bins.
  • This allows us to test whether the dependence of Y on X, normalised by quantile of X, differs between conditions. This makes minimal assumptions about the shape of the relationship.

 

 

Example

Let’s say you have 10 subjects each with a data file called “datafile_1” etc., and within each subject’s data there is a variable called “condition” which is the condition of each trial, “RT” with the reaction time for each trial, and “correct” indicating if the response was accurate for each trial. Let’s say each variable is a vector, with one element for each trial. Then we can write:

 

% first build array of X values and Y values

% corresponding to each trial

for subj = 1:10      % for each subject

          % load subject data

          dat = load( [ ‘datafile_’ num2str(subj) ] );

          for cond = 1:2% for each condition

                 sel = dat.condition == cond; % select trials

     % read RTs for this condition

                  X(:,i,j)  = dat.RT( sel );  

                  % accuracy for each trial in this condition

                  Y(:,i,j)  = dat.correct( sel );

          end

end

% calculate conditional accuracy plot

[acc, rt] = conditionalPlot( X,Y );

 

 

Documentation from .m file:

 

% [Xb,Yb, p, t, h] = conditionalPlot(X,Y, nbins,  ...)

%      X{ SUBJECT, CONDITION } ( TRIAL )

% or   X( TRIAL, SUBJECT )            -- for one condition

% or   X( TRIAL, SUBJECT, CONDITION ) -- for multiple conditions

%

% Takes sliding quantile bins of X, and calculates the mean Y for

% each bin. Nbins indicates the width of the window (5 means 20 percentiles)

% This is done for each subject and condition separately.

% The bin values of Xb and Yb are plotted as mean and sem across subjects.

% Different conditions are plotted as separate lines.

%

% returns: Xb ( SUBJECT, CONDITION, QUANTILE ) = median of X for each bin

%          Yb ( SUBJECT, CONDITION, QUANTILE ) = mean   of Y for each bin

%          p ( QUANTILE ) = probability that conditions are significantly

%             different at a quantile, using permutation test.

%             Within-subject comparison between conditions is performed by

%             permutation, and multiple conditions are treated as a

%             single continuous linear predictor (1,2,3 etc).

%          T ( QUANTILE )  = t statistic for linear effect of condition

%          H = handle of lines as returned by errorbarplot

%

% Parameters:

%  'meanfunc' = function handle to use instead of "nanmean".

%  'xtransform', 'ytransform' - set transformations for the plot axes. Can

%     be a @function, or one of 'identity' (default), 'probit', 'recip',

%     'log', 'asin', 'gauss' (inverse gaussian from 0 to 1), or

%     'gauss2' (inverse gaussian from 0.5 to 1). Note that the output values

%     Xb and Yb will not be transformed.

%  'StandardBinning' - if true, then use non-overlapping bins.

%

% dependencies: smoothn, nanmean, nancat, removeIdenticals, permutationOLS

% sanjay g manohar 2013

conditionalPlot.m:

function [Xb,Yb, p,t, h]=conditionalPlot(X,Y, Nbins, varargin)
% [Xb,Yb, p, t, h] = conditionalPlot(X,Y, nbins,  ...)
%      X{ SUBJECT, CONDITION } ( TRIAL )
% or   X( TRIAL, SUBJECT )            -- for one condition
% or   X( TRIAL, SUBJECT, CONDITION ) -- for multiple conditions
%
% Takes sliding quantile bins of X, and calculates the mean Y for
% each bin. Nbins indicates the width of the window (5 means 20 percentiles)
% This is done for each subject and condition separately. 
% The bin values of Xb and Yb are plotted as mean and sem across subjects.
% Different conditions are plotted as separate lines.
%
% returns: Xb ( SUBJECT, CONDITION, QUANTILE ) = median of X for each bin
%          Yb ( SUBJECT, CONDITION, QUANTILE ) = mean   of Y for each bin
%          p ( QUANTILE ) = probability that conditions are significantly
%             different at a quantile, using permutation test.
%             Within-subject comparison between conditions is performed by
%             permutation, and multiple conditions are treated as a
%             single continuous linear predictor (1,2,3 etc).
%          T ( QUANTILE )  = t statistic for linear effect of condition
%          H = handle of lines as returned by errorbarplot
%
% Parameters:
%  'meanfunc' = function handle to use instead of "nanmean". 
%  'xtransform', 'ytransform' - set transformations for the plot axes. Can
%     be a @function, or one of 'identity' (default), 'probit', 'recip',
%     'log', 'asin', 'gauss' (inverse gaussian from 0 to 1), or
%     'gauss2' (inverse gaussian from 0.5 to 1). Note that the output values
%     Xb and Yb will not be transformed.
%  'StandardBinning' - if true, then use non-overlapping bins.

% dependencies: smoothn, nanmean, nancat, removeIdenticals, permutationOLS
% sanjay g manohar 2013


%%%%%%%%%%%%%%%%%%%%%
% Setup Parameters
if(~exist('Nbins','var') || isempty(Nbins)) Nbins=5; end;

if ~iscell(X)
  if ndims(X)==2 % treat as RT( TRIAL, SUBJECT )
    if size(X,1)==1, X=X'; end
    if size(Y,1)==1, Y=Y'; end
    X  = mat2cell(X, size(X,1),   ones(1,size(X,2))   )';  % break into cells
    Y  = mat2cell(Y, size(Y,1),   ones(1,size(Y,2)) )';
  elseif ndims(X)==3 % treat as RT ( TRIAL, SUBJECT, CONDITION ): convert to cell { SUBJECT, CONDITION } (TRIAL)
    X  = sq(mat2cell(X, size(X,1),   ones(1,size(X,2)),   ones(size(X,3),1) )); 
    Y  = sq(mat2cell(Y, size(Y,1),   ones(1,size(Y,2)),   ones(size(Y,3),1) ));
  end
end

% use nonoverlapping bins? 
% advantage is the values are independent, and thus stats are simpler.
% if not, use overlapping bins, which must be compared with permutation test
STANDARD_BINNING = 0; 
Nq       = 100;     % number of quantiles (used when STANDARD_BINNING=0)
binwidth = 1/Nbins; 
SMOOTH   = 15;      % used for area plots
if exist('nanmean','file')  MEANFUNC = @nanmean;  % do we have 'nanmean' ?
else                        MEANFUNC = @mean;     % if not, use mean.
end
doErrorBars        = true; % on the standard binning plot, show error bars for both X and Y?
FILL_AREA          = true; % if false, show dotted lines for standard error, rather than shaded area
REMOVE_IDENTICAL_X = true; % if there aren't enough unique X-values, then interpolate?
% transform axes?
T.identity = @(x)x;
T.probit   = @(x) log(eps+x./(1-x+eps));
T.recip    = @(x) -1./x;     % reciprocal (eg for 'reciprobit'  
T.log      = @(x) log(x+eps);
T.asin     = @(x) asin(sqrt(x)); 
T.gauss    = @(x) erfinv( max(eps-1, min(1-eps, (x*2-1))) ); % gaussian error, range [0-1]
T.gauss2   = @(x) erfinv( max(eps-1, min(1-eps, (x-0.5)*4-1)) ); % gaussian in range [0.5-1]
% these two lines apply transforms to the data before plotting. 
% the function output values are unchanged by the transform.
YTRANSFORM = T.identity;  
XTRANSFORM = T.identity;
i=find(strcmpi(varargin, 'xtransform')); if i, if isstr(varargin{i+1}) XTRANSFORM=T.(varargin{i+1}); 
  else XTRANSFORM=varargin{i+1};  end ; varargin([i i+1])=[];end

i=find(strcmpi(varargin, 'xtransform')); if i, if isstr(varargin{i+1}) YTRANSFORM=T.(varargin{i+1}); 
  else YTRANSFORM=varargin{i+1};  end ; varargin([i i+1])=[];end

i=find(strcmpi(varargin, 'meanfunc')); if i, 
  MEANFUNC=varargin{i+1}; varargin([i,i+1])=[];
end

i=find(strcmpi(varargin, 'StandardBinning')); if i, 
  STANDARD_BINNING = varargin{i+1}; varargin([i i+1])=[]; 
end

i=find(strcmpi(varargin, 'Conditions')); if i, 
  C = varargin{i+1}; varargin([i i+1])=[]; 
  if ~isempty(C)
    if size(X,2)>1, error('cannot specify conditions since RT already contains conditions'); end
    if isnumeric(C) & ndims(C)==2, C=mat2cell(C, size(C,1), ones(1,size(C,2)) )'; end
    uc=unique(nancat(2,C{:})); uc(isnan(uc))=[]; 
    for i=1:size(X,1) % subject
      for j=1:length(uc) % condition
        rt{i,j}  =X{i}(  C{i}==uc(j) );
        corr{i,j}=Y{i}(C{i}==uc(j) );
      end
    end
    X=rt; Y=corr;
  end
end

if STANDARD_BINNING, N=Nbins,   % N is the number of quantiles to sample, i.e.
else N=floor(Nq*(1-binwidth));  % number of points on a CAF curve
end
Xb=nan(size(X,1), size(X,2), N); % create blank output variable (needed for parallel processing)


%%%%%%%%%%%%%%%%%%%%%
% Calculate binning

%%%% change the outer 'for' to 'parfor' for faster execution.
for(i=1:size(X,1))       % for each subject
  x_i=nan(size(X,2),N);  % keep track of bin centres
  for(j=1:size(X,2))     % for each condition
    if STANDARD_BINNING  % DIVIDE INTO BINS
      top=-inf;          % lowest bin goes down to -inf
      for q=1:Nbins      % bin the RTs into Nbins
        bottom=top;      % start new bin at top of old bin
        if(q<Nbins)   top=quantile(X{i,j},q/Nbins); % get top edge of bin
        else          top=inf; % highest bin goes up to +inf
        end
        f=X{i,j}>bottom & X{i,j}<top;       % filter to select this bin
        Yb(i,j,q)=MEANFUNC( Y{i,j}(f) );    % mean of Y in this X-bin
        x_i(j,q)=quantile(X{i,j},(q-0.5)/Nbins); % centre (median) RT of the bin
        % alternative: use mean of top and bottom of bin?
      end
    else % SLIDING WINDOW BINS
      quantiles = quantile(X{i,j}, linspace(0,1,Nq)); % create quantiles
      % go through bins, but there aren't 100 bins because of the width!
      for q=1:floor(Nq*(1-binwidth));   % e.g. 1:80 for 5 bins
        % get start, mid and end of the bin
        range=quantiles( q + [0,floor(Nq*binwidth/2), floor(Nq*binwidth)] );
        f = X{i,j}>range(1) & X{i,j}<range(3); % filter data within range
        % Yb ( SUBJECT, CONDITION, QUANTILE )
        Yb(i,j,q)=MEANFUNC( Y{i,j}(f) );      % mean of Corr
        % Xb values will be the centre (median) RT of the bin. 
        % Change this to range([1 3])/2 to get the midway-point between edges
        x_i(j,q)=range(2); 
      end
      % 8/7/14 added this to prevent nonunique bin centres
      % note that this will fail if X does not have enough unique values to
      % make bins.
      if REMOVE_IDENTICAL_X
        x_i(j,:)=removeIdenticals(flat(sq(x_i(j,:))));
      end
    end
  end
  Xb(i,:,:) = x_i;     % store bins
end


%%%%%%%%%%%%%%%%%%%%%%
% Now do the plotting

oldhold=ishold();             % keep track of whether "hold on"
t_x = XTRANSFORM( Xb );       % transform the data for plotting
t_y = YTRANSFORM( Yb );
h=[];                         % keep graphics / line handles
cols = get(gca,'ColorOrder'); % use axis colours for different lines

if STANDARD_BINNING % for standard binning, we have just a couple of bins
  for j=1:size(X,2) % so for each bin, 
    if doErrorBars  % draw error-bars for both X and Y
      errorbarxy(  YTRANSFORM(  sq(mean(t_x(:,j,:),1))  ), ...
        sq(mean(t_y(:,j,:),1)), ...
        sq(std(t_x(:,j,:),[],1))/sqrt(size(Xb,1)), ...
        sq(std(t_y(:,j,:),[],1))/sqrt(size(Yb,1)) ...
        ,[],[], cols(j,:)  , .2*[1 1 1]);
    else            % or just plot the means
      h2=plot(sq(mean(t_x(:,j,:),1)), ...
        sq(mean(t_y(:,j,:),1)) ...
        , cols(j,:) );
      h=[h h2];
    end
    hold on
  end
else % SLIDING BIN = we have a whole curve to plot
  % Yb ( SUBJECT, CONDITION, TIME )
  for j=1:size(Xb,2)       % for each condition
    x = sq(nanmean(t_x(:,j,:),1)); line = sq(nanmean(t_y(:,j,:),1));
    erro = sq(nanstd(t_y(:,j,:)))/sqrt(size(Yb,1)); % std err at each time
    if numel(erro)==1, erro=erro*ones(size(line)); end
    if SMOOTH, line=smoothn(line, SMOOTH); erro=smoothn(erro, SMOOTH); end
    args=varargin;
    % if there is more than one line, add in the color-order
    colj = cols(mod(j-1,size(cols,1))+1,:);
    if size(Xb,2)>1, args=[args 'Color',colj]; end
    h2=plot(x, line, args{:}, 'Marker','.'); h=[h h2];
    hold on
    if ~FILL_AREA % DOTTED LINES?
      h2=plot(x, line+error, args{:}, 'LineStyle',':'); h=[h h2];
      h2=plot(x, line-error, args{:}, 'LineStyle',':'); h=[h h2];
    else % FILLED AREA?
      alpha = 0.5;
      h2 = fill([x; flipud(x)]', [line+erro; flipud(line-erro)]', colj, 'FaceAlpha', alpha,'linestyle','none');
      h=[h h2];
    end
  end
  %%%%%%% STATISTICS on sliding bin data?
  if nargout>2 % DO STATS?  treats conditions as a continuous variable
    NS = size(Yb,1); NC = size(Yb,2); % run mixed effects permutation test
    if nargout<5, ao={[],[],[],[]};  % did you request a figure handle? if so
    else          ao={[],[],[],[],[]}; end % then request a significance bar from permutationOLS.
    DES=[flat(ones(NS,NC))  zscore(flat(repmat(1:NC,[NS,1]))) ];
    if var(DES(:,2))>0
      [ao{:}]=permutationOLS(reshape(Yb,NS*NC,[]),...
        DES, [0 1], ...
        repmat([1:NS]', [1,NC])  ); 
      p = ao{2}; t = ao{3};
    else p=[]; t=[];
    end
  end
end

if ~oldhold % restore "graphics hold" state
  hold off;
end