---
title: "rhdf5client Elements"
author: "Sam Pollack"
date: "9/1/2018"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Objects and methods in rhdf5client}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(rhdf5client)
```

# HSDSSource

An object of class HSDSSource represents an HDF Group server reachable at a 
given endpoint. The constructor takes the endpoint and, optionally, the server 
type. At present, the only valid type is `hsds` (for the HDF Scalable Data 
Service). If the type is not specified, the server is assumed to be `hsds`.

```{r}
src.hsds <- HSDSSource('http://hsdshdflab.hdfgroup.org')
```

The routine `listDomains` lists the domains under a given path in the server
hierarchy, which maps approximately to the directory structure of the server
file system. Its purpose is to help the user locate HDF5 files.

The user needs to know the root domain of the server. The dataset's
maintainer should publish this information along with the server endpoint.

```{r}
listDomains(src.hsds, '/home/jreadey')
listDomains(src.hsds, '/home/jreadey/HDFLabTutorial')
```

# HSDSFile

An object of class HSDSFile represents an HDF5 file. The object is constructed 
by providing a source and a file domain.

```{r}
f0 <- HSDSFile(src.hsds, '/home/spollack/testzero.h5')
f1 <- HSDSFile(src.hsds, '/shared/bioconductor/tenx_full.h5')
```

The function `listDatasets` lists the datasets in a file.

```{r}
listDatasets(f0)
listDatasets(f1)
```

# HSDSDataset

Construct an HSDSDataset object from an HSDSFile and a dataset path.

```{r}
d0 <- HSDSDataset(f0, '/grpA/grpAB/dsetX')
d1 <- HSDSDataset(f1, '/newassay001')
```

## Data Fetch (1)

The low-level data retrieval method is `getData`. Its index argument is a
`character` vector of slices, one per dimension. Valid slices are `:` (all 
indices), `1:10` (indices 1 through 10 inclusive), `:10` (same as `1:10`), 
`5:` (from 5 to the maximum value of the index) and `2:14:4` (from 2 to 14 
inclusive in increments of 4, i.e. indices 2, 6, 10 and 14).

Note that slices are passed in R semantics: 1 signifies the first element, 
and the last element is included in the slice. (Internally, rhdf5client 
converts to Python semantics, in which the first index is 0 and the last 
element is excluded. But here, as everywhere in the package, all Python 
details are meant to be hidden from the user.)

```{r}
apply(getData(d1, c('1:4', '1:27998'), transfermode='JSON'), 1, sum)
apply(getData(d1, c('1:4', '1:27998'), transfermode='binary'), 1, sum)
```
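
The conversion from R to Python slice semantics described above can be 
sketched in a few lines of base R. This is only an illustration: 
`r_slice_to_python` is a hypothetical helper written for this vignette, not a 
function exported by rhdf5client, and the package's internal implementation 
may differ. Only the start index changes (it is shifted down by one); the 
inclusive 1-based stop and the exclusive 0-based stop are the same number.

```{r}
# Hypothetical helper (not part of rhdf5client): convert one R-style slice
# string (1-based, stop included) to Python style (0-based, stop excluded).
r_slice_to_python <- function(slice) {
  stopifnot(is.character(slice), length(slice) == 1L,
            grepl(":", slice, fixed = TRUE))
  # strsplit drops a trailing empty field, so '5:' yields just "5"
  parts <- strsplit(slice, ":", fixed = TRUE)[[1]]
  field <- function(i) if (length(parts) >= i) parts[i] else ""
  first <- field(1)
  last  <- field(2)
  by    <- field(3)
  # shift the start to 0-based; an omitted start stays omitted
  if (nzchar(first)) first <- as.character(as.integer(first) - 1L)
  out <- paste0(first, ":", last)
  if (nzchar(by)) out <- paste0(out, ":", by)
  out
}

vapply(c('1:10', ':10', '5:', '2:14:4', ':'), r_slice_to_python,
       character(1), USE.NAMES = FALSE)
```

For example, `2:14:4` becomes `1:14:4`, which in Python semantics selects 
offsets 1, 5, 9 and 13, the same elements as R indices 2, 6, 10 and 14.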

## Data Fetch (2)

`getData` is generic. Its index argument can also be a list of vectors, one 
vector per dimension. At present, this only works if each of the vectors can 
be expressed as a single slice. Eventually, this functionality will be 
expanded to the general multi-dimensional case of multiple slices. In the 
general case, multiple array blocks will be fetched and bound back together 
into a single array.

```{r}
apply(getData(d1, list(1:4, 1:27998), transfermode='JSON'), 1, sum)
apply(getData(d1, list(1:4, 1:27998), transfermode='binary'), 1, sum)
```
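
To make the single-slice restriction concrete, the following sketch tests 
whether an index vector is an evenly spaced increasing sequence and, if so, 
writes it as a slice string in R semantics. `vector_to_slice` is a 
hypothetical helper for illustration only; it is not exported by rhdf5client 
and does not claim to reproduce the check the package performs.

```{r}
# Hypothetical helper (not part of rhdf5client): express an index vector as
# a single R-style slice string, or fail if that is not possible.
vector_to_slice <- function(idx) {
  stopifnot(is.numeric(idx), length(idx) >= 1L,
            all(idx >= 1), all(idx == round(idx)))
  idx <- as.integer(idx)          # avoids output such as "1e+05"
  n <- length(idx)
  if (n == 1L) return(paste0(idx, ":", idx))
  steps <- diff(idx)
  # a single slice needs one constant, positive increment
  if (any(steps != steps[1]) || steps[1] < 1L)
    stop("index vector cannot be expressed as a single slice")
  if (steps[1] == 1L)
    paste0(idx[1], ":", idx[n])
  else
    paste0(idx[1], ":", idx[n], ":", steps[1])
}

vector_to_slice(1:4)
vector_to_slice(seq(2, 14, by = 4))
```

A vector such as `c(1, 2, 4)` has unequal gaps, so the helper stops with an 
error; this is the kind of input that would need the multiple-slice support 
mentioned above.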

## Data Fetch (3)

The `[` operator is provided for the two most typical cases (one-dimensional and two-dimensional numeric data).

```{r}
apply(d1[1:4, 1:27998], 1, sum)
```