THAIS NISENBAUM

R


Data Analysis and Visualization

Example 1: Assay of serum free light chain for 7874 subjects

Analysis of the flchain dataset from the R survival package
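
A minimal sketch of how such an analysis might start, assuming the survival package is installed. The column names (kappa, lambda, futime, death, sex) come from the dataset's documentation. The summaries chosen here and the `flc_total` column are illustrative and are not necessarily the ones used in the example itself.

```r
library(survival)

# flchain ships with the survival package: one row per subject.
flc <- survival::flchain
dim(flc)

# Summarise the kappa and lambda free light chain measurements.
summary(flc[, c("kappa", "lambda")])

# Illustrative derived column: total free light chain (kappa + lambda).
flc$flc_total <- flc$kappa + flc$lambda

# Compare total FLC between subjects who died (1) and those
# still alive at last contact (0), on a log scale.
boxplot(flc_total ~ death, data = flc, log = "y",
        xlab = "Death (0 = alive, 1 = dead)",
        ylab = "Total FLC (log scale)")

# Kaplan-Meier survival curves by sex; futime is follow-up time in days.
fit <- survfit(Surv(futime, death) ~ sex, data = flc)
plot(fit, col = c("red", "blue"),
     xlab = "Days from enrollment", ylab = "Survival probability")
legend("bottomleft", legend = names(fit$strata),
       col = c("red", "blue"), lty = 1)
```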

Example 2: Ten Most Commonly Mutated Genes in Six Cancer Types

Output used in the Analysis of Genes Associated with Cancer: Variant Association and Drug Interactions, referenced under Python Common Gateway Interface (CGI) Programming

Data Mining

Ensembl biomaRt

Ensembl is a genome browser for sequenced chordate genomes and non-chordate model organisms. Ensembl data is stored in MySQL relational databases and can be retrieved with BioMart, a query tool that lets results be narrowed by several categories of filters. BioMart can be accessed through the web interface or from R through the Bioconductor package biomaRt. Bioconductor is an open-source project that provides R packages to mine, analyze, and visualize biological data.

Ensembl - BioMart
Bioconductor - biomaRt
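
As a rough sketch of how the filter categories can be browsed from R, the helper below connects to the Ensembl gene mart and lists the filters and attributes that mention a keyword. The function name `explore_mart` and its arguments are hypothetical. It assumes the biomaRt package is installed and that the Ensembl servers are reachable.

```r
# Hypothetical helper: list the filters and attributes whose name or
# description mentions a keyword, for the human gene dataset in Ensembl.
explore_mart <- function(keyword, dataset = "hsapiens_gene_ensembl") {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("Install biomaRt first: BiocManager::install('biomaRt')")
  }

  # Connect to the Ensembl gene mart (requires network access).
  mart <- biomaRt::useEnsembl(biomart = "genes", dataset = dataset)

  filters <- biomaRt::listFilters(mart)
  attributes <- biomaRt::listAttributes(mart)

  # Case-insensitive match on either the name or the description column.
  matches <- function(df) {
    hit <- grepl(keyword, df$name, ignore.case = TRUE) |
      grepl(keyword, df$description, ignore.case = TRUE)
    df[hit, , drop = FALSE]
  }

  list(filters = matches(filters), attributes = matches(attributes))
}
```

For instance, `explore_mart("mim")` would be one way to find which filter and attribute names the current Ensembl release uses for MIM identifiers.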

Example 1: Based on information from OMIM.org for Huntington's disease, biomaRt was used to retrieve two tables containing the following attributes for five MIM identifiers.

Table 1: Entrez Gene ID, HGNC symbol, Ensembl Gene ID

Table 2: HGNC symbol, Ensembl Gene ID, Ensembl Transcript ID

Table 1 is gene-specific, with each line representing one gene. Table 2 has more lines because it shows transcripts, and each gene can have more than one transcript.
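
The two queries could be sketched roughly as follows. This is not necessarily the code used in the example: the helper name `fetch_mim_tables` is hypothetical, the MIM identifiers must be supplied by the caller, and the default filter name `mim_morbid_accession` is an assumption that should be checked against the output of `listFilters()` for the Ensembl release in use.

```r
# Hypothetical helper: build the gene-level and transcript-level tables
# for a vector of MIM identifiers supplied by the caller.
fetch_mim_tables <- function(mim_ids, mim_filter = "mim_morbid_accession") {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("Install biomaRt first: BiocManager::install('biomaRt')")
  }

  # Connect to the human gene dataset (requires network access).
  mart <- biomaRt::useEnsembl(biomart = "genes",
                              dataset = "hsapiens_gene_ensembl")

  # Table 1: gene-level attributes.
  table1 <- biomaRt::getBM(
    attributes = c("entrezgene_id", "hgnc_symbol", "ensembl_gene_id"),
    filters = mim_filter,   # assumed filter name, verify with listFilters()
    values = mim_ids,
    mart = mart
  )

  # Table 2: adds the transcript ID, so a gene with several
  # transcripts appears on several rows.
  table2 <- biomaRt::getBM(
    attributes = c("hgnc_symbol", "ensembl_gene_id", "ensembl_transcript_id"),
    filters = mim_filter,
    values = mim_ids,
    mart = mart
  )

  list(genes = table1, transcripts = table2)
}
```

Calling the helper with the identifiers taken from OMIM and comparing `nrow()` of the two results is a quick way to see the difference in row counts described above.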

Sequence Analysis and Manipulation

seqinr

seqinr is an R package used to retrieve and analyze DNA and protein sequences, including sequences read from database files in FASTA format.

More information about seqinr can be found here

Example: Analysis of the Pseudomonas alkylphenolia strain KL28 genome sequence using the seqinr R package
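
A minimal sketch of the kind of FASTA analysis the seqinr package (its CRAN name) supports, assuming the package is installed. The genome file itself is not included here, so the example writes a short toy sequence to a temporary file. The sequence and the header `toy_seq` are illustrative placeholders, not real data.

```r
library(seqinr)

# Toy FASTA record written to a temporary file (illustrative only).
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">toy_seq illustrative placeholder", "ATGGCGTACGCTTGA"),
           fasta_path)

# read.fasta returns a list with one entry per record;
# DNA is converted to lower case by default.
seqs <- read.fasta(file = fasta_path, seqtype = "DNA")
dna <- seqs[[1]]

seq_length <- length(dna)                 # number of bases
gc_fraction <- GC(dna)                    # fraction of G and C bases
base_counts <- count(dna, wordsize = 1)   # counts of a, c, g, t
dinuc_counts <- count(dna, wordsize = 2)  # counts of each dinucleotide
protein <- translate(dna)                 # frame 0, standard genetic code

print(seq_length)
print(gc_fraction)
print(base_counts)
print(paste(protein, collapse = ""))
```

For a real genome, `fasta_path` would point at the downloaded FASTA file, and the same calls apply to each record in the returned list.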