1 Version Info

R version: 4.4.0 RC (2024-04-16 r86468)
Bioconductor version: 3.19
Package version: 1.31.0

2 Sample Workflow

The following code illustrates a typical R / Bioconductor session. It uses RMA from the affy package to pre-process Affymetrix arrays, and the limma package for assessing differential expression.

## Load packages
library(affy)   # Affymetrix pre-processing
library(limma)  # two-color pre-processing; differential expression

## Import "phenotype" data, describing the experimental design
phenoData <-
    read.AnnotatedDataFrame(system.file("extdata", "pdata.txt",
                                        package="arrays"))

## RMA normalization
celfiles <- system.file("extdata", package="arrays")
eset <- justRMA(phenoData=phenoData,
    celfile.path=celfiles)
## Warning: replacing previous import 'AnnotationDbi::tail' by 'utils::tail' when
## loading 'hgfocuscdf'
## Warning: replacing previous import 'AnnotationDbi::head' by 'utils::head' when
## loading 'hgfocuscdf'

## Differential expression
## Combine the first two phenotype columns into a single grouping factor
combn <- factor(paste(pData(phenoData)[,1],
    pData(phenoData)[,2], sep = "_"))
design <- model.matrix(~combn)  # describe model to be fit

fit <- lmFit(eset, design)  # fit each probeset to model
efit <- eBayes(fit)         # empirical Bayes adjustment
topTable(efit, coef=2)      # table of differentially expressed probesets
##                 logFC   AveExpr         t      P.Value    adj.P.Val        B
## 204582_s_at  3.468416 10.150533  39.03471 1.969915e-14 1.732146e-10 19.86082
## 211548_s_at -2.325670  7.178610 -22.73165 1.541158e-11 6.775701e-08 15.88709
## 216598_s_at  1.936306  7.692822  21.73818 2.658881e-11 7.793180e-08 15.48223
## 211110_s_at  3.157766  7.909391  21.19204 3.625216e-11 7.969130e-08 15.24728
## 206001_at   -1.590732 12.402722 -18.64398 1.715422e-10 3.016740e-07 14.01955
## 202409_at    3.274118  6.704989  17.72512 3.156709e-10 4.626157e-07 13.51659
## 221019_s_at  2.251730  7.104012  16.34552 8.353283e-10 1.049292e-06 12.69145
## 204688_at    1.813001  7.125307  14.75281 2.834343e-09 3.115297e-06 11.61959
## 205489_at    1.240713  7.552260  13.62265 7.264649e-09 7.097562e-06 10.76948
## 209288_s_at -1.226421  7.603917 -13.32681 9.401074e-09 7.784531e-06 10.53327
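
The coef=2 argument above selects the second column of the design matrix. As a minimal sketch of how model.matrix() encodes a single grouping factor, the toy example below uses hypothetical group labels (a_x, a_y, b_x); these are illustrative only and are not the labels stored in pdata.txt. It needs only base R.

```r
## Hypothetical group labels, standing in for the pasted phenotype columns
## (illustrative only; not the contents of pdata.txt)
group <- factor(c("a_x", "a_x", "a_y", "a_y", "b_x", "b_x"))

## Default treatment-contrast coding: the first level is the reference
design <- model.matrix(~group)
colnames(design)
design

## Column 2 is an indicator for the second level, so its coefficient
## estimates the difference between that level and the reference level
second_level <- levels(group)[2]
indicator <- as.numeric(group == second_level)
all(design[, 2] == indicator)
```

Under this coding, topTable(efit, coef=2) ranks probesets by the estimated difference between the second factor level and the reference level; other comparisons would need a different column or explicit contrasts.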

A top table resulting from a more complete analysis, described in Chapter 7 of Bioconductor Case Studies, is shown below. The table lists the Affymetrix probeset identifiers (ID), the log2-fold difference between two experimental groups (logFC), the average expression across all samples (AveExpr), the moderated t-statistic for differential expression (t), the unadjusted and adjusted p-values (P.Value and adj.P.Val; the adjustment here controls the false discovery rate), and the log-odds of differential expression (B). These results can be used in further analysis and annotation.

      ID logFC AveExpr    t  P.Value adj.P.Val     B
636_g_at  1.10    9.20 9.03 4.88e-14  1.23e-10 21.29
39730_at  1.15    9.00 8.59 3.88e-13  4.89e-10 19.34
 1635_at  1.20    7.90 7.34 1.23e-10  1.03e-07 13.91
 1674_at  1.43    5.00 7.05 4.55e-10  2.87e-07 12.67
40504_at  1.18    4.24 6.66 2.57e-09  1.30e-06 11.03
40202_at  1.78    8.62 6.39 8.62e-09  3.63e-06  9.89
37015_at  1.03    4.33 6.24 1.66e-08  6.00e-06  9.27
32434_at  1.68    4.47 5.97 5.38e-08  1.70e-05  8.16
37027_at  1.35    8.44 5.81 1.10e-07  3.08e-05  7.49
37403_at  1.12    5.09 5.48 4.27e-07  1.08e-04  6.21
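
As a rough illustration of how these columns relate to one another, the sketch below works through the first five rows of the first top table above (the hgfocus example) in base R. Two things are assumptions rather than statements from the text: that the adjustment is Benjamini-Hochberg (the default in topTable), and that 8793 probesets were tested, a number inferred from the ratio of adj.P.Val to P.Value in the first row.

```r
## Values copied from the first five rows of the first top table
logFC   <- c(3.468416, -2.325670, 1.936306, 3.157766, -1.590732)
p_value <- c(1.969915e-14, 1.541158e-11, 2.658881e-11,
             3.625216e-11, 1.715422e-10)
adj_reported <- c(1.732146e-10, 6.775701e-08, 7.793180e-08,
                  7.969130e-08, 3.016740e-07)
B <- c(19.86082, 15.88709, 15.48223, 15.24728, 14.01955)

## RMA values are on the log2 scale, so 2^logFC is the fold change
fold_change <- 2^logFC

## Benjamini-Hochberg adjustment: p * n / rank, followed by a cumulative
## minimum taken from the largest p-value downwards.
## n_tests is an assumption inferred from the table, not stated in the text.
n_tests <- 8793
adj_recomputed <- p.adjust(p_value, method = "BH", n = n_tests)

## B is a log-odds on the natural-log scale; plogis() converts it to a
## probability of differential expression
prob_de <- plogis(B)

data.frame(fold_change, adj_reported, adj_recomputed, prob_de)
```

Truncating the table can change the cumulative-minimum step, so recomputed values are only expected to match for rows whose adjusted p-value is not determined by rows further down the ranking. The values returned by topTable itself remain the reference.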


3 Installation and Use

Follow installation instructions to start using these packages. You can install affy and limma as follows:

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install(c("affy", "limma"), dependencies=TRUE)

To install additional packages, such as the annotation package for the Affymetrix Human Genome U95A 2.0 array, use

BiocManager::install("hgu95av2.db", dependencies=TRUE)

Package installation is required only once per R installation. View a full list of available packages.
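
Because installation is needed only once, a script can check what is already installed before calling BiocManager::install(). The helper below is a minimal sketch using only base R; missing_packages is a hypothetical name introduced for illustration and is not part of BiocManager.

```r
## Hypothetical helper: return the names in `pkgs` that are not installed.
## system.file(package = ) returns "" for a package that cannot be found,
## and it does not load the package.
missing_packages <- function(pkgs) {
    found <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                    logical(1), USE.NAMES = FALSE)
    pkgs[!found]
}

to_install <- missing_packages(c("affy", "limma"))
to_install

## Install only what is missing (left commented out so that running the
## sketch does not change the local library)
## if (length(to_install) > 0)
##     BiocManager::install(to_install)
```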

To use the affy and limma packages, evaluate the commands

library("affy")
library("limma")

These commands are required once in each R session.


4 Exploring Package Content

Packages have extensive help pages, and include vignettes highlighting common use cases. The help pages and vignettes are available from within R. After loading a package, use syntax like

help(package="limma")
?topTable

to obtain an overview of the help available for the limma package and the help page for the topTable function, and

browseVignettes(package="limma")

to view vignettes (providing a more comprehensive introduction to package functionality) in the limma package. Use

help.start()

to open a web page containing comprehensive help resources.


5 Pre-Processing Resources

The following sections give a brief overview of packages useful for pre-processing. More comprehensive workflows can be found in the package documentation (available from package descriptions) and in Bioconductor books and monographs.

5.1 Affymetrix 3’-biased Arrays

affy, gcrma, affyPLM

  • Require a cdf package, a probe package and an annotation package
  • All these packages are available from Bioconductor via BiocManager::install()

xps

  • Requires installation of ROOT
  • Uses data files from Affymetrix (.CDF, .PGF, .CLF, .CSV) directly

5.2 Affymetrix Exon ST Arrays

oligo

  • Requires a pdInfoPackage built using pdInfoBuilder
  • This package collates cdf, probe and annotation data
  • These packages are available from Bioconductor via BiocManager::install()
  • Most cases will require a 64-bit computer running Linux and >= 8 GB RAM

exonmap

  • Requires installation of MySQL and Ensembl core database tables
  • Requires specially modified cdf and affy package
  • Requires a 64-bit computer running Linux and >= 8 GB RAM

xps

  • Requires installation of ROOT
  • Uses data files from Affymetrix (.CDF, .PGF, .CLF, .CSV) directly
  • Will run on conventional desktop computers

5.3 Affymetrix Gene ST Arrays

oligo

  • Requires a pdInfoPackage built using pdInfoBuilder
  • This package collates cdf, probe and annotation data
  • These packages are available from Bioconductor via BiocManager::install()

xps

  • Requires installation of ROOT
  • Uses data files from Affymetrix (.CDF, .PGF, .CLF, .CSV) directly

5.4 Affymetrix SNP Arrays

oligo

  • Requires a pdInfoPackage built using pdInfoBuilder
  • This package collates cdf, probe, annotation and HapMap data
  • These packages are available from Bioconductor via BiocManager::install()
  • Not yet capable of processing CNV regions in SNP5.0 and SNP6.0

5.5 Affymetrix Tiling Arrays

oligo

  • Requires a pdInfoPackage built using pdInfoBuilder
  • This package collates data from bpmap and cif files

5.6 Nimblegen Arrays

oligo

5.7 Illumina Expression Microarrays

lumi

  • Requires lumi-specific mapping and annotation packages (e.g., lumiHumanAll.db and lumiHumanIDMapping)

beadarray

  • Requires beadarray-specific mapping and annotation packages (e.g., illuminaHumanv1BeadID.db and illuminaHumanV1.db)


6 Session Info

sessionInfo()
## R version 4.4.0 RC (2024-04-16 r86468)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.20-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] hgfocuscdf_2.18.0   affy_1.81.0         Biobase_2.63.1     
## [4] BiocGenerics_0.49.1 limma_3.59.10       arrays_1.31.0      
## [7] BiocStyle_2.31.0   
## 
## loaded via a namespace (and not attached):
##  [1] bit_4.0.5               preprocessCore_1.65.0   jsonlite_1.8.8         
##  [4] crayon_1.5.2            compiler_4.4.0          BiocManager_1.30.22    
##  [7] blob_1.2.4              Biostrings_2.71.6       jquerylib_0.1.4        
## [10] png_0.1-8               IRanges_2.37.1          yaml_2.3.8             
## [13] fastmap_1.1.1           statmod_1.5.0           XVector_0.43.1         
## [16] R6_2.5.1                GenomeInfoDb_1.39.14    knitr_1.46             
## [19] bookdown_0.39           GenomeInfoDbData_1.2.12 AnnotationDbi_1.65.2   
## [22] DBI_1.2.2               bslib_0.7.0             affyio_1.73.0          
## [25] rlang_1.1.3             KEGGREST_1.43.1         cachem_1.0.8           
## [28] xfun_0.43               sass_0.4.9              bit64_4.0.5            
## [31] RSQLite_2.3.6           memoise_2.0.1           cli_3.6.2              
## [34] zlibbioc_1.49.3         digest_0.6.35           lifecycle_1.0.4        
## [37] S4Vectors_0.41.7        vctrs_0.6.5             evaluate_0.23          
## [40] stats4_4.4.0            httr_1.4.7              rmarkdown_2.26         
## [43] UCSC.utils_0.99.7       tools_4.4.0             htmltools_0.5.8.1
