nws_tools.get_corr

nws_tools.get_corr(txtpath, corrtype='pearson', sublist=[], **kwargs)

Compute pair-wise statistical dependence of time-series

Parameters:

txtpath : str

Path to the directory holding ROI-averaged time-series stored in txt files. File names must follow the convention sNxy_bla_bla.txt, where N is the group id (1,2,3,...), xy denotes the subject number (01,02,...,99 or 001,002,...,999), and all remaining name components are separated by underscores. The files are read in lexicographic order, i.e., s101_1.txt, s101_2.txt,... or s101_Amygdala.txt, s101_Beemygdala.txt,... See Notes for more details.

corrtype : str

Specifier indicating which measure of statistical dependence to compute between pairs of time-series. Currently supported options are:

pearson: the classical zero-lag Pearson correlation coefficient (see NumPy’s corrcoef for details)

mi: (normalized) mutual information (see the docstring of mutual_info in this module for details)

sublist : list or NumPy 1darray

List of subject codes to process, e.g., sublist = ['s101','s102']. By default, all subjects found in txtpath are processed.

**kwargs : keyword arguments

Additional keyword arguments to be passed on to the function computing the pairwise dependence (currently either NumPy’s corrcoef or mutual_info in this module).
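The naming convention and read order described for txtpath can be illustrated with a small sketch. The helper parse_roi_filename and the pattern NAME_PATTERN are hypothetical and not part of nws_tools; the pattern assumes a single-digit group id followed by a two- or three-digit subject number. The sort at the end shows what lexicographic order means for numbered files.

```python
import re

# Hypothetical pattern for the sNxy_bla_bla.txt convention (not part of nws_tools).
# Assumption: single-digit group id N, then a two- or three-digit subject number.
NAME_PATTERN = re.compile(r"^(s(\d)(\d{2,3}))_(.+)\.txt$")


def parse_roi_filename(filename):
    """Split a file name into subject code, group id, subject number, and label."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError("File name does not follow sNxy_bla_bla.txt: " + filename)
    subject, group, number, label = match.groups()
    return {"subject": subject, "group": int(group), "number": number, "label": label}


info = parse_roi_filename("s101_Amygdala.txt")
# info["subject"] is "s101", the kind of code that sublist expects

# Lexicographic order compares character by character, so "10" sorts before "2".
files = ["s101_2.txt", "s101_10.txt", "s101_1.txt"]
ordered = sorted(files)
# ordered is ["s101_1.txt", "s101_10.txt", "s101_2.txt"]
```

If regions are numbered, zero-padding the numbers (s101_01.txt, s101_02.txt, ..., s101_10.txt) keeps lexicographic order identical to numeric order.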

Returns:

res : dict

Dictionary with fields:

corrs : NumPy 3darray

N-by-N matrices of pair-wise regional statistical dependencies of numsubs subjects. Format is corrs.shape = (N,N,numsubs) such that corrs[:,:,i] is the N x N statistical dependence matrix of the i-th subject.

bigmat : NumPy 3darray

Tensor holding unprocessed time series of all subjects. Format is bigmat.shape = (tlen,N,numsubs) where tlen is the maximum time-series-length across all subjects (if time-series of different lengths were used in the computation, any unfilled entries in bigmat will be NumPy nan's, see Notes for details) and N is the number of regions (=nodes in the networks).

sublist : list of strings

List of processed subjects specified by txtpath, e.g., sublist = ['s101','s103','s110','s111','s112',...]
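To make the array layouts concrete, the following sketch builds a synthetic bigmat and fills a corrs array of the documented shape using NumPy's corrcoef. It only illustrates the shapes and indexing described above under the assumption of equal-length time-series; the data are random, and the loop is not claimed to be the implementation used by get_corr.

```python
import numpy as np

# Synthetic stand-in for bigmat: tlen time points, N regions, numsubs subjects.
rng = np.random.default_rng(0)
tlen, N, numsubs = 140, 4, 3
bigmat = rng.standard_normal((tlen, N, numsubs))

corrs = np.zeros((N, N, numsubs))
for i in range(numsubs):
    # rowvar=False treats columns (regions) as variables and rows as time points.
    corrs[:, :, i] = np.corrcoef(bigmat[:, :, i], rowvar=False)

# corrs[:, :, i] is the symmetric N x N Pearson matrix of the i-th subject.
first_subject = corrs[:, :, 0]
```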

See also

corrcoef
Pearson product-moment correlation coefficients computed in NumPy
mutual_info
Compute (normalized) mutual information coefficients

Notes

Per-subject time-series do not necessarily have to be of the same length across a subject cohort. However, all ROI-time-courses within the same subject must have the same number of entries. For instance, all ROI-time-courses in s101 can have 140 entries, and time-series of s102 might have 130 entries. The remaining 10 values "missing" for s102 are filled with NaN's in bigmat. However, if s101_2.txt contains 140 data-points while only 130 entries are found in s101_3.txt, the code will raise a ValueError.
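The length rules above can be sketched with two small helpers. Both subject_matrix and stack_subjects are hypothetical names introduced for illustration only; they mimic the behavior described in these Notes (a ValueError for unequal lengths within a subject, NaN padding across subjects) and are not the code used inside get_corr.

```python
import numpy as np


def subject_matrix(roi_series):
    """Combine the 1-D ROI time-courses of one subject into a (tlen, N) array."""
    lengths = {len(ts) for ts in roi_series}
    if len(lengths) != 1:
        # Within a subject, all ROI time-courses must have the same length.
        raise ValueError("ROI time-courses of one subject differ in length")
    return np.column_stack(roi_series)


def stack_subjects(subject_arrays):
    """Stack (tlen_i, N) arrays into a NaN-padded (tlen_max, N, numsubs) tensor."""
    tlen = max(arr.shape[0] for arr in subject_arrays)
    N = subject_arrays[0].shape[1]
    bigmat = np.full((tlen, N, len(subject_arrays)), np.nan)
    for i, arr in enumerate(subject_arrays):
        bigmat[: arr.shape[0], :, i] = arr  # rows beyond arr's length stay NaN
    return bigmat


s101 = subject_matrix([np.ones(140), np.ones(140), np.ones(140)])
s102 = subject_matrix([np.ones(130), np.ones(130), np.ones(130)])
bigmat = stack_subjects([s101, s102])

# Number of NaN entries per region for s102: the 10 "missing" values.
n_missing = int(np.isnan(bigmat[:, 0, 1]).sum())
```

When working with the returned bigmat, NaN-aware reductions such as np.nanmean avoid propagating the padding into summary statistics.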