Introduction to “RKColocal”

Shulei Wang1 and Ming Yuan1

1Department of Statistics, Columbia University; Department of Statistics, University of Wisconsin-Madison; Morgridge Institute for Research

11-1-2017

Abstract

RKColocal provides baisc functions for dual-channel image input/output and colocalization analysis. RKColocal includes tools to display joint distribution of pixel intensities for colocalization analysis, evaluate the average degree of colocalization in a given region, qunatify the degree of colocalization at each pixel and identify colocalized region.

Package

RKColocal 0.0.1.0

1 Getting Started

The R package RKColocal can be downloaded from Bioconductor website, installed and loaded by entering:

source("http://bioconductor.org/biocLite.R")
biocLite("RKColocal")

Then, the library can be loaded by

library("RKColocal")

Some parts of the R package need to call external Java class. Therefore, it’s recommended to install R package rJava manually before installing RKColocal since sometimes extra configuration is needed, (some common errors can be found in help1 and help2).

There are three main functional modules in RKColocal:

Dual-Channel Images I/O provides basic functions to read, display and write dual channel images;
there are functions to plot different types of the joint distribution of pixel intensities in Colocalization Analysis Plot;
the functions in Region based Colocalization Analysis assess the average degree of colocalization in a given region of interest via evaluating some statistical quantity.
Spatial Adaptive Colocalization Analysis aims to measure the colocalization level at each pixel and identify colocalized region in dual channel images.

2 Dual-Channel Images I/O

RKColocal supports the basic functions to read, display and write images via adopting the same input/ouput system in EBImage. The dual channel images can be read by function readImage in EBImage and the two corresponding channels can be extracted from it. One typical example of reading dual-channel images is to load two channels as X and Y from sample images distributed with the package.

f = system.file("images", "sample.jpeg", package = "RKColocal")
img = readImage(f)
X = imageData(img)[,,1]
Y = imageData(img)[,,2]

readImage currently supports three forms of images: jpeg, png and tiff and see the details in EBImage. The two channels X and Y can also be seen as two matrixs and their dimensions can be accessed using the dim just like for regular matrix.

dim(X)

[1] 1024 1024

For convenience of visualization, the two channels are usually represented by green and red color. These two channels colored by green and red can be visualized by the function DualImageVisual.

DualImageVisual(X, Y, isSplit = TRUE)

The most straight-forward way to assess the colocalization is to overlay two channels into one image. If we see the yellow color in the composite image, then the colocalization occurs between two channels. The composite image can also be visualized by the function DualImageVisual with the argument isSplit=FALSE.

DualImageVisual(X, Y, isSplit = FALSE)

The colocalization analysis is sometimes restricted into some region of interest (ROI). In RKColocal, the region of interest is represented by a boolean matrix with the same dimensions of two channels, where TRUE means pixel is included in ROI. We load a ROI for the image above from a sample ROI distributed with the package and visualize it via function display in EBImage.

data(ROI)
display(ROI)

Function DualImageVisual supports the dual-channel image is displayed only in ROI with argument mask=ROI.

DualImageVisual(X, Y, isSplit = FALSE, mask = ROI)

Images can be saved to files using the DualImageSave function. The same with DualImageVisual, two channels are saved in two seperate images with isSplit=TRUE and only pixels within ROI are saved when mask=ROI.

DualImageSave(X, Y, name = "sample.png", isSplit = TRUE, mask = ROI)

The dual-channel images can be saved as jpeg, png or tiff and see more details in introduction of function writeImage in EBImage.

3 Colocalization Analysis Plot

Besides overlaying two channels, joint distribution of pixel intensitiy between two channels is another way to inspect colocalization visually. In RKColocal, function colocalplot supports two types of colocalization analysis plot: scatter plot (see, e.g. Bolte and Cordeliéres 2006; Comeau, Costantino, and Wiseman 2006; Dunn, Kamocka, and McDonald 2011) and Li’s plot (see, e.g. Li et al. 2004).

Scatter Plot: Scatter plot is a useful graphical tool to display joint distribution of pixel intensities in dual-channel images, where the pixel intensity of green channel X is plotted against the intensity of red channel Y. The scatter plot for dual-channel images can be displayed via colocalplot with argument method = 'scatter'.

colocalplot(X, Y, method = 'scatter')

The scatter plot can be restricted into pre-defined ROI with argument mask = ROI.

Li’s Plot: Different from scatter plot, Li’s plot provides two seperate plots instead of one. The common \(x\) axis is \((X_i-\bar{X})(Y_i-\bar{Y})\) in both plots and the \(y\) axes in two plots are \(X_i\) and \(Y_i\), respectively. The Li’s plot in the region of interest can be shown via colocalplot with arguments method = 'li' and mask = ROI.

colocalplot(X, Y, mask = ROI, method = 'li')

4 Region based Colocalization Analysis

The most common used colocalization analysis method is to evaluate a statistical quantity for colocalization/correlation in a given region (region of interest or the whole image). In RKColocal, this process is implemented by function colocalroi. In colocalroi, six different colocalization measures are implemented: Pearson correlation coefficient, Manders’ colocalization coefficients, Kendall tau correlation coefficient, Spearman’s rank correlation coefficient, intensity correlation quotient and maximum truncated Kendall tau correlation coefficient. These methods can be chosen via argument method.

4.1 Classical Colocalization Measure

We introduce the first five classical colocalization measures in this section.

Pearson correlation coefficient

Pearson correlation coefficient is a very popular statistics for correlation, measuring the degree of linear relationship between two variables. It is first introduced to colocalization analysis by Manders (see Manders et al. 1992). Mathematically, it can be expressed as \[ r={\sum_{i=1}^n(X_i-\bar{X})(Y_i-\bar{Y}) \over\sqrt{\sum_{i=1}^n(X_i-\bar{X})^2\sum_{i=1}^n(Y_i-\bar{Y})^2}} \] where \(X_i\) and \(Y_i\) are the pixel intensity of green and red channels at pixel \(i\). Here, \(\bar{X}\) and \(\bar{Y}\) are the mean of pixel intensities. \(r\) can be calculated via function colocalroi with argument method = 'pearson'.

colocalroi(X, Y, method = 'pearson')

[1] 0.8443913

The colocalization analysis can also restrict into the region of interest by setting argument mask = ROI.

colocalroi(X, Y, mask = ROI, method = 'pearson')

[1] -0.006204956

Manders colocalization coefficients

Another popular colocalization measure is Manders colocalization coefficients, introduced by Manders, Verbeek, and Aten (1993). Instead of one single measure, there are two components in Manders colocalization coefficients, i.e. \(M_1\) and \(M_2\). They can be written as \[ M_1={\sum_{i=1}^n X_i\mathbf{I}(Y_i>\alpha_Y)\over\sum_{i=1}^n X_i}\quad {\rm and}\quad M_2={\sum_{i=1}^n Y_i\mathbf{I}(X_i>\alpha_X)\over\sum_{i=1}^n Y_i} \] where \(\alpha_X\) and \(\alpha_Y\) are the thresholds for two channels and \(\mathbf{I}(\cdot)\) is an indicator function. These coefficients \(M_1\) and \(M_2\) measure fractions of signal in one channel that overlap with the other. Different from Pearson’s correlation coefficient, Manders’ colocalization coefficients measure the degree of colocalization manifested in a distinct way. It measures co-occurence and is most useful if simple spatial overlap between the two probes is expected. \(M_1\) and \(M_2\) can be calculated by setting method = 'mandersM1' or method = 'mandersM2'.

colocalroi(X, Y, method = 'mandersM1')

[1] 0.9888702

colocalroi(X, Y, method = 'mandersM2')

[1] 0.5149765

Thresholds \(\alpha_X\) and \(\alpha_Y\) can be input mannually via arguments Thresholds or can be set as Otsu’s thresholds when Thresholds=NULL.

Spearman’s rank correlation coefficient

Spearman’s rank correlation coefficient can be seen as a robust version of Pearson correlation coefficient, introduced into colocalization analysis by French et al. (2008) and Adler, Pagakis, and Parmryd (2008). Compared with Pearson correlation coefficient, Spearman’s rank correlation coefficient is able to capture a wider range of assoiations. It is Pearson correlation coefficient between rank \(R_i\) and \(T_i\) after transforming raw data \(X_i\) and \(Y_i\) to corresponding rank \(R_i\) and \(T_i\), i.e. \[ S={\sum_{i=1}^n(R_i-\bar{R})(T_i-\bar{T}) \over\sqrt{\sum_{i=1}^n(R_i-\bar{R})^2\sum_{i=1}^n(T_i-\bar{T})^2}}. \] \(S\) can be calculated by function colocalroi with argument method = 'spearman'.

colocalroi(X, Y, method = 'spearman')

[1] 0.9288932

Kendall tau correlation coefficient

Similar with Spearman’s rank correlation coefficient, Kendall tau correlation coefficient is another popular rank correlation coefficient and is able to capture a wider range of associations beyond linear relationship. It is expressed as \[ \tau={2\over n(n-1)}\sum_{1\le i<j\le n}{\rm sign}(X_i-X_j){\rm sign}(Y_i-Y_j), \] where \({\rm sign}(\cdot)\) is a function returning sign of input. \(\tau\) can be calculated via function colocalroi with argument method = 'kendall'.

colocalroi(X, Y, mask = ROI, method = 'kendall')

[1] 0.02708241

Intensity correlation quotient

With the same idea in Li’s plot, Li et al. (2004) proposes a new statistics called intensity correlation quotient. It is basically ratio of pixels where \((X_i-\bar{X})(Y_i-\bar{Y})>0\). The idea is similar with rank correlation coefficient, since only sign information is extracted. It can be expressed as \[ Q={\sum_{i=1}^n \mathbf{I}((X_i-\bar{X})(Y_i-\bar{Y})>0) \over n}-{1\over 2} \] \(Q\) can be calculated via function colocalroi with argument method = 'icq'.

colocalroi(X, Y, method = 'icq')

[1] 0.3831882

4.2 Robust Colocalization Measure

All colocalization measures above overlook an important fact that a dark background with positive offset may occupy a substantial area of the image. To overcome this, Wang, Arena, Eliceiri, et al. (2017) proposes to evaluate Kendall tau only on the subset of pixels where both channels are sufficiently bright, leading to truncated Kendall tau correlation coefficient \[ \tau(t_X,t_Y)={1\over n_{t_X,t_Y}(n_{t_X,t_Y}-1)}\sum_{i,j\in\mathcal{K}(t_X,t_Y):i\ne j}{\rm sign}(X_i-X_j){\rm sign}(Y_i-Y_j), \] where \[ \mathcal{K}(t_X,t_Y)=\{i:X_i>t_X,Y_i>t_Y\}\quad{\rm and}\quad n_{t_X,t_Y}=|\mathcal{K}(t_X,t_Y)|, \] for two pre-specified thresholds \(t_X\) and \(t_Y\).

In practice, we do not know at which level \(t_X\) and \(t_Y\) colocalization may occur and determining the background is a complex process and very susceptible to misspecification. To over come this problem, Wang, Arena, Eliceiri, et al. (2017) proposes to instead the maximum of normalized \(\tau(t_X,t_Y)\) for all possible \(t_X\)s and \(t_Y\)s. Therefore, maximum truncated Kendall tau correlation coefficient (MTKT) can be expressed as \[ \tau^\ast=\max_{t_X\ge \beta_X,t_Y\ge \beta_Y} \left\{\tau(t_X,t_Y)\cdot\sqrt{9n_{t_X,t_Y}(n_{t_X,t_Y}-1)\over 2(2n_{t_X,t_Y}+5) }\right\}, \] where \(\beta_X\) and \(\beta_Y\) are two lower bounds for possible thresholds.

Obviously, it is time consuming to evaluate \(\tau(t_X,t_Y)\) for all poosible thresholds \(t_X\) and \(t_Y\) in practice. Wang, Arena, Eliceiri, et al. (2017) proposes an approximationed version for \(\tau^\ast\) \[ \tau^\ast_{\rm app}=\max_{t_X=X_{(j)},t_Y=Y_{(k)}:j,k\in\mathcal{R}_n} \left\{\tau(t_X,t_Y)\cdot\sqrt{9n_{t_X,t_Y}(n_{t_X,t_Y}-1)\over 2(2n_{t_X,t_Y}+5) }\right\}, \] where \(\mathcal{R}_n\) is \[ \mathcal{R}_n=\left\{s:s=\left\lfloor n-\left(1+{1\over \log\log n}\right)^j \right\rfloor, j=1,\ldots, X_{(s)}\ge \beta_X\ {\rm or}\ Y_{(s)}\ge \beta_Y\right\}. \] \(\tau^\ast_{\rm app}\) can be calculated via function colocalroi with argument method = 'trunkendall'.

colocalroi(X, Y, method = 'trunkendall')

[1] 42.11716

Lower bound of thresholds \(\beta_X\) and \(\beta_Y\) can be input mannually via arguments Thresholds or can be set as Otsu’s thresholds when Thresholds=NULL.

colocalroi(X, Y, Thresholds=c(0.5,0.5), method = 'trunkendall')

4.3 Statistical Test on Colocalization

To interpret the values of different colocalization measures above, we formulate the region based colocalization analysis as a hypothesis testing problem

\[ H_0:{\rm no\ colocalization}\quad {\rm v.s.}\quad H_1: {\rm colocalization\ happens} \]

To assess statistical significance (\(p\)-value) under hypothesis above, RKColocal adopts permutation test to calculate \(p\)-value via function colocalroitest. With the same arguments in colocalroi, method can be set as one of six colocalization measures above and mask is designed for region of interest. The Thresholds can be manually input when method is mandersM1, mandersM2 or trunkendall. A typical example is to apply permutation test on sample dual-channel images.

Testresult <- colocalroitest(X, Y, mask = ROI, method = 'pearson')
Testresult

$Index_Value
[1] -0.006204956

$P_value
[1] 0.8316832

$Null_Distribution
  [1]  0.0110999188  0.0045105497 -0.0041123287 -0.0109777158 -0.0143283922
  [6] -0.0060002863  0.0046309593 -0.0049618553  0.0010398714  0.0047381864
 [11] -0.0012662583  0.0122381024  0.0040482094 -0.0045144642  0.0068670992
 [16]  0.0088936775  0.0011695224  0.0098925609 -0.0070722853  0.0082834752
 [21] -0.0057702034  0.0088038459 -0.0036112561  0.0022899027 -0.0006722284
 [26]  0.0049561740 -0.0029344616  0.0074898045 -0.0137660276  0.0012473945
 [31]  0.0027497968 -0.0005639141  0.0022737303 -0.0006360784 -0.0105100752
 [36] -0.0004157804 -0.0060907974 -0.0005847072  0.0036550434  0.0064146798
 [41] -0.0114143705  0.0073256343  0.0050071375 -0.0021678355 -0.0071814151
 [46]  0.0125520372  0.0008479771 -0.0007971228  0.0147811101 -0.0003085533
 [51] -0.0007153095 -0.0130549856 -0.0099974510  0.0066576735  0.0005760361
 [56] -0.0030168185  0.0132117079  0.0075994779 -0.0009327535  0.0005137928
 [61] -0.0015318118 -0.0114090703  0.0039313332 -0.0047397906  0.0088919107
 [66]  0.0019655034  0.0110363165 -0.0077985484 -0.0065854825 -0.0031794938
 [71]  0.0036875241  0.0006596162 -0.0039466635 -0.0064240304 -0.0046608312
 [76]  0.0077856643  0.0013120841 -0.0017063106 -0.0092376200 -0.0045968211
 [81] -0.0065474298 -0.0019540609 -0.0158673522 -0.0005319770  0.0091733648
 [86]  0.0049997988 -0.0020338357 -0.0059097753  0.0079945465 -0.0044418923
 [91]  0.0054090012 -0.0000159552  0.0011207334  0.0093990989  0.0017536314
 [96]  0.0015446133 -0.0102896414  0.0057340800  0.0077790051  0.0123551144

attr(,"class")
[1] "colocalroitest"

The output of colocalroitest is an object of class colocalroitest. There are three parts in the class of colocalroitest: Index_Value is the value of colocalization measure without permutation; Null_Distribution is the collection of values of colocalization measure in each permutation; P_value is the approximated \(p\)-value calculated from permutation test. The class colocalroitest can also be displayed as histogram via function plot. The Testresult is plot via plot.

plot(Testresult)

The default times of permutation in colocalroitest is 100 and it can also be reset via arguments times. For exaample, the permutation time can be set as 1000.

colocalroitest(X, Y, mask = ROI, times = 1000, method = 'pearson')

The potential hurdle of permutation test is the computation cost. To overcome this, colocalroitest can conduct permutation test in a parallel computing fashion via R package doParallel. Parallel computing can be active with arguments is.parallel=TRUE and the number of core can be set via numcore. In the following example, the number of core for parallel computing is 2.

colocalroitest(X, Y, mask = ROI, is.parallel = TRUE, numcore = 2, method = 'pearson')

Another practical challenge is the potential dependence among \(X_i\)s and \(Y_i\)s. To overcome this problem, colocalroitest has options to conduct block based permutation test, proposed by Costes et al. (2004). Block based permutation test permutes the block by blcok on pre-defined block instead of pixel by pixel. The block based permutation test is execute only when mask=NULL and block size bsize is set as some integer larger than 1. For instance, we can permute the \(32\times 32\) block in the sample dual-channel image.

colocalroitest(X, Y, bsize = 32, method = 'mandersM2')

5 Spatial Adaptive Colocalization Analysis (SACA)

Although the region based colocalization analysis above can tell the existence of colocalization in any given region, all of them overlook the spatial information of images and fail to identify the colocalized region. Recently, Wang, Arena, Becker, et al. (2017) proposes a new framework for spatial adaptive colocalization analysis. It is implenmented in the function colocalpixel. We apply it on the sample images.

saca <- colocalpixel(X, Y)
str(saca)

List of 5
 $ Zvalue  : num [1:1024, 1:1024] 0 0 0 0 0 0 0 0 0 0 ...
 $ Pvalue  : num [1:1024, 1:1024] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
 $ SigPixel: num [1:1024, 1:1024] 0 0 0 0 0 0 0 0 0 0 ...
 $ Red     : num [1:1024, 1:1024] 0.0353 0.0353 0.0353 0.0353 0.0353 ...
 $ Green   : num [1:1024, 1:1024] 0.051 0.051 0.051 0.051 0.051 ...
 - attr(*, "class")= chr "colocalpixel"

The output of function colocalpixel is an object of class colocalpixel. There are five components in class colocalpixel: Zvalue is a matrix with the same dimension of input image, each entry of which is pixelwised \(z\)-value; Pvalue is a matrix with the same dimension of input image, each entry of which is pixelwised \(p\)-value; SigPixel is a boolean matrix representing significant/colocalized pixel; Red and Green are the two channels of dual-channel image. We can also plot the colocalized region of the class colocalpixel via function plot.

plot(saca)

In the plot above, the colocalized region is labelled by blue color. Besides the colocalized regions, RKColocal can also display the heat map of \(z\)-values via function HeatMapZ.

HeatMapZ(saca)

Through this heatmap of \(z\)-value, the pixelwise colocalization level is dispalyed clearly and most high degree of colocalization concentrates on the edge of cell.

The significant region is selected via multiple comparison and the default mutiple comparison correction is through bonferroni method. Other mutiple comparison correction method can also be used and see the details in function p.adjust.

colocalpixel(X, Y, method = 'BH')

References

Adler, J., S. N. Pagakis, and I. Parmryd. 2008. “Replicate-Based Noise Corrected Correlation for Accurate Measurements of Colocalization.” Journal of Microscopy 230: 121–33.

Bolte, S., and F. P. Cordeliéres. 2006. “A Guided Tour into Subcellular Colocalization Analysis in Light Microscopy.” Journal of Microscopy 224 (3): 213–32.

Comeau, J., S. Costantino, and P. Wiseman. 2006. “A Guide to Accurate Fluorescence Microscopy Colocalization Measurements.” Biophysical Journal 91 (12): 4611–22.

Costes, S., D. Daelemans, E. Cho, Z. Dobbin, G. Pavlakis, and S. Lockett. 2004. “Automatic and Quantitative Measurement of Protein-Protein Colocalization in Live Cells.” Biophysical Journal 86 (6): 3993–4003.

Dunn, K. W., M. M. Kamocka, and J. H. McDonald. 2011. “A Practical Guide to Evaluating Colocalization in Biological Microscopy.” American Journal of Physiology-Cell Physiology 300 (4): 723–42.

French, A. P., S. Mills, R. Swarup, M. J. Bennett, and T. P. Pridmore. 2008. “Colocalization of Fluorescent Markers in Confocal Microscope Images of Plant Cells.” Nature Protocols 3 (4): 619–28.

Li, Q., A. Lau, T.J. Morris, L. Guo, C.B. Fordyce, and E.F. Stanley. 2004. “A Syntaxin 1, Galpha(o), and N-Type Calcium Channel Complex at a Presynaptic Nerve Terminal: Analysis by Quantitative Immunocolocalization.” The Journal of Neuroscience 24: 4070–81.

Manders, E., J. Stap, G. Brakenhoff, R. Van Driel, and J. Aten. 1992. “Dynamics of Three-Dimensional Replication Patterns During the S-Phase, Analysed by Double Labelling of Dna and Confocal Microscopy.” Journal of Cell Science 103 (3): 857–62.

Manders, E., F. Verbeek, and J. Aten. 1993. “Measurement of Co-Localization of Objects in Dual-Colour Confocal Images.” Journal of Microscopy 169 (3): 375–82.

Wang, S., E. T. Arena, J. T. Becker, W. M. Bement, N. M. Sherer, K. W. Eliceiri, and M. Yuan. 2017. “Spatially Adaptive Colocalization Analysis in Dual-Color Fluorescence Microscopy.” ArXiv.

Wang, S., E. T. Arena, K. W. Eliceiri, and M. Yuan. 2017. “Automated and Robust Quantification of Colocalization in Dual-Color Fluorescence Microscopy: A Nonparametric Statistical Approach.” IEEE Transactions on Image Processing to appear.