---
title: "RImmPort: Quick Start Guide"
author: "Ravi Shankar"
date: "`r Sys.Date()`"
output: 
    rmarkdown::pdf_document:
        number_sections: true
vignette: >
  %\VignetteIndexEntry{RImmPort: Quick Start Guide}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

*****
# Introduction
ImmPort study data is available for download in two formats currently: MySQL and TSV (Tab) formats. The RImmPort workflow is as follows: 
1) MySQL formatted study data: 
       User downloads one or more studies in MySQL zip files. 
       Unzips the files. 
       Loads local database instance. 
       Connects to the database. 
       Sets the ImmPort data source to the connection handle. 
       Invokes RImmPort functions.
2) Tab: 
       User downloads one or more studies in Tab format. 
       Passes the folder where the zip files are located to an RImmPort function that builds SQLite database. 
       Connects to the database. 
       Sets the ImmPort data source to the connection handle. 
       Invokes RImmPort functions.

User downloads study data of interest from the ImmPort website ( http://www.immport.org ) **. Depending on the file format MySQL or Tab the data is loaded into a local MySQL and SQLite database respectively. The user installs the RImmPort package, loads the RImmPort library, connects to the ImmPort database, and calls RImmPort methods to load study data from the database into R. Please refer to RImmPort_Article.pdf for a detailed discussion on RImmPort.

** User need to regsiter to the ImmPort website for downloading the datasets.

# Initial Steps

- Download MySQL or Tab formatted data of studies of interest from the ImmPort website
- If working with MySQL-format, load the data in to a local MySQL database
- Install and load RImmPort package, and other required packages.

# Load the RImmPort library

```{r}
library(RImmPort)
library(DBI)
library(sqldf)
library(plyr)

```
# Setup ImmPort data source that all RImmPort functions will use

## Option 1: ImmPort MySQL database

### Download zip files of ImmPort study data in MySQL format. e.g.'SDY139' and 'SDY208'
### Load the data into a local MySQL database

### Connect to the ImmPort MySQL database.
```{r eval=FALSE}

# provide appropriate connection parameters
mysql_conn <- dbConnect(MySQL(), user="username", password="password", 
                   dbname="database",host="host")

```

### Set the data source as the ImmPort MySQL database.

```{r  eval=FALSE}
setImmPortDataSource(mysql_conn)
```

## Option 2: ImmPort SQLite database

### Download zip files of ImmPort data, in Tab format. e.g.'SDY139' and 'SDY208'
```{r  eval=FALSE}
# get the directory where ImmPort sample data is stored in the directory structure of RImmPort package
studies_dir <- system.file("extdata", "ImmPortStudies", package = "RImmPort")

# set tab_dir to the folder where the zip files are located
tab_dir <- file.path(studies_dir, "Tab")
list.files(tab_dir)
```

### Build a local SQLite ImmPort database instance. 
```{r  eval=FALSE}
# set db_dir to the folder where the database file 'ImmPort.sqlite' should be stored
db_dir <- file.path(studies_dir, "Db")
```
```{r  eval=FALSE}
# build a new ImmPort SQLite database with the data in the downloaded zip files
buildNewSqliteDb(tab_dir, db_dir) 
```
```{r  eval=FALSE}
list.files(db_dir)
```

### Connect to the ImmPort SQLite database 
```{r}
# get the directory of a sample SQLite database that has been bundled into the RImmPort package
db_dir <- system.file("extdata", "ImmPortStudies", "Db", package = "RImmPort")

# connect to the private instance of the ImmPort database
sqlite_conn <- dbConnect(SQLite(), dbname=file.path(db_dir, "ImmPort.sqlite"))
```

### Set the data source to the ImmPort SQLite DB
```{r}
setImmPortDataSource(sqlite_conn)
```

# NOTE: In rest of the script, all RImmPort functions will use the SQLite ImmPort database as the data source.

# Get all study ids
```{r}
getListOfStudies()
```

# Get all data of a specific study
The `getStudyFromDatabase` queries the ImmPort database for the entire dataset of a specific study, and instantiates the `Study` reference class with that data. 
```{r}
?Study

# load all the data of study: `SDY139`
study_id <- 'SDY139'
sdy139 <- getStudy(study_id)

# access Demographics data of SDY139
dm_df <- sdy139$special_purpose$dm_l$dm_df
head(dm_df)

# access Concomitant Medications data of SDY139
cm_df <- sdy139$interventions$cm_l$cm_df
head(cm_df)

# get Trial Title from Trial Summary
ts_df <- sdy139$trial_design$ts_l$ts_df
title <- ts_df$TSVAL[ts_df$TSPARMCD== "TITLE"]
title
```

# Get the list of Domain names. 
Note that some RImmPort functions take a domain name as input.

```{r}

# get the list of names of all supported Domains
getListOfDomains()

?"Demographics Domain"

```

# Get list of studies with specifc domain data
The Domain name should be exact to what is found in the list of Domain names.
```{r}
# get list of studies with Cellular Quantification data
domain_name <- "Cellular Quantification"
study_ids_l <- getStudiesWithSpecificDomainData(domain_name)
study_ids_l
```

# Get specifc domain data of one or more studies
The Domain name should be exact to what is found in the list of Domain names. 
```{r}
# get Cellular Quantification data of studies `SDY139` and `SDY208`

# get domain code of Cellular Quantification domain
domain_name <- "Cellular Quantification"
getDomainCode(domain_name)

study_ids <- c("SDY139", "SDY208")
domain_name <- "Cellular Quantification"
zb_l <- getDomainDataOfStudies(domain_name, study_ids)
if (length(zb_l) > 0) 
  names(zb_l)
head(zb_l$zb_df)
```

# Get the list of assay types from ImmPort studies
```{r}
getListOfAssayTypes()
```

# Get specific assay data of one or more Immport studies
The assay type should be exact to what is found in the list of supported assay types. 
```{r}

# get 'ELISPOT' data of study `SDY139`
assay_type <- "ELISPOT"
study_id = "SDY139"
elispot_l <- getAssayDataOfStudies(study_id, assay_type)
if (length(elispot_l) > 0)
  names(elispot_l)
head(elispot_l$zb_df)

```

# Serialize RImmPort-formatted study data as .rds files
```{r eval=FALSE}

# serialize all of the data of studies `SDY139` and `SDY208'
study_ids <- c('SDY139', 'SDY208')

# the folder where the .rds files will be stored
rds_dir <- file.path(studies_dir, "Rds")

serialzeStudyData(study_ids, rds_dir)
list.files(rds_dir)

```

# Load the serialzed data (.rds) files of a specific domain of a study from the directory where the files are located
```{r}
# get the directory where ImmPort sample data is stored in the directory structure of RImmPort package
studies_dir <- system.file("extdata", "ImmPortStudies", package = "RImmPort")

# the folder where the .rds files will be stored
rds_dir <- file.path(studies_dir, "Rds")

# list the studies that have been serialized
list.files(rds_dir)

# load the serialized data of study `SDY208` 
study_id <- 'SDY208'
dm_l <- loadSerializedStudyData(rds_dir, study_id, "Demographics")
head(dm_l[[1]])

```