Package 'MetaComp' reference manual

Title:	EDGE Taxonomy Assignments Visualization
Description:	Implements routines for metagenome sample taxonomy assignments collection, aggregation, and visualization. Accepts the EDGE-formatted output from GOTTCHA/GOTTCHA2, BWA, Kraken, MetaPhlAn, DIAMOND, and Pangia. Produces SVG and PDF heatmap-like plots comparing taxa abundances across projects.
Authors:	Pavel Senin [aut, cre]
Maintainer:	Pavel Senin <[email protected]>
License:	GPL-2
Version:	1.1.2
Built:	2025-03-04 05:18:45 UTC
Source:	https://github.com/seninp-bioinfo/metacomp

Efficiently loads an EDGE-produced taxonomic assignment from a file.

Description

Efficiently loads an EDGE-produced taxonomic assignment from a file. Because EDGE tables are generated automatically, they are assumed to be properly formatted, so the code checks only that the file exists and performs no other consistency checks. Note, however, that entries not assigned to any taxon are removed. The implementation relies entirely on the fread function from the data.table package, which is faster than traditional R file-reading techniques.

Usage

load_edge_assignment(filepath, type)

Arguments

`filepath`	the path to an EDGE-generated, tab-delimited taxonomy assignment file.
`type`	the assignment type. The following types are recognized: 'bwa', 'diamond', 'gottcha', 'gottcha2', 'kraken', 'metaphlan', and 'pangia'.

Value

a data frame containing four columns: TAXA, LEVEL, COUNT, and ABUNDANCE, representing taxonomically anchored sequences from the sample.

Examples

pa_fpath <- system.file("extdata", "HMP_even", "allReads-pangia.list.txt", package = "MetaComp")
pangia_assignment <- load_edge_assignment(pa_fpath, type = "pangia")

table(pangia_assignment$LEVEL)

pangia_assignment[pangia_assignment$LEVEL == "phylum",]
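
The loading behaviour described for load_edge_assignment (fail only on a missing file, drop unassigned entries, return TAXA, LEVEL, COUNT, and ABUNDANCE) can be illustrated with a minimal base R sketch. It is not the package's implementation: MetaComp uses data.table's fread, whereas this sketch uses read.delim to stay dependency-free. The toy file layout, the "unassigned" label, and the helper name load_toy_assignment are all assumptions made for illustration.

```r
# Hypothetical stand-in for an EDGE table; real EDGE column layouts are tool-specific.
toy_path <- file.path(tempdir(), "toy-assignment.txt")
toy <- data.frame(
  LEVEL = c("phylum", "phylum", "genus", "genus"),
  TAXA = c("Firmicutes", "unassigned", "Bacillus", "Escherichia"),
  COUNT = c(120L, 15L, 80L, 40L),
  ABUNDANCE = c(60.0, 7.5, 40.0, 20.0),
  stringsAsFactors = FALSE
)
write.table(toy, toy_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Hypothetical helper, not part of MetaComp.
load_toy_assignment <- function(filepath, unassigned_label = "unassigned") {
  # The only validation is the file-existence check.
  if (!file.exists(filepath)) {
    stop("File not found: ", filepath)
  }
  df <- read.delim(filepath, stringsAsFactors = FALSE)
  # Drop rows not assigned to any taxon (the label itself is an assumption).
  df <- df[df$TAXA != unassigned_label, , drop = FALSE]
  rownames(df) <- NULL
  df[, c("TAXA", "LEVEL", "COUNT", "ABUNDANCE")]
}

toy_assignment <- load_toy_assignment(toy_path)
table(toy_assignment$LEVEL)
```

Because no other consistency checks are made, a malformed table would pass through such a loader unchanged, which is the trade-off the description above accepts.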

Efficiently loads BWA (or other EDGE-like) taxonomic assignment tables from a list of files.

Description

Efficiently loads BWA (or other EDGE-like) taxonomic assignment tables from a list of files. Outputs a named list of assignments.

Usage

load_edge_assignments(filepath, type)

Arguments

`filepath`	the path to a tab-delimited, two-column file whose first column is a project_id (used to name the assignment) and whose second column is the assignment filename.
`type`	the type of assignments to be loaded. The following types are recognized: 'bwa', 'diamond', 'gottcha', 'gottcha2', 'kraken', 'metaphlan', and 'pangia'.

Value

a named list of all loaded assignments.

Examples

hmp_even_fp <- system.file("extdata", "HMP_even", package="MetaComp")
hmp_stagger_fp <- system.file("extdata", "HMP_stagger", package="MetaComp")
data_files <- data.frame(V1 = c("HMP_even", "HMP_stagger"),
                         V2 = c(file.path(hmp_even_fp, "allReads-gottcha2-speDB-b.list.txt"),
                                file.path(hmp_stagger_fp, "allReads-gottcha2-speDB-b.list.txt")))
write.table(data_files, file.path(tempdir(), "assignments.txt"),
            row.names = FALSE, col.names = FALSE)
gottcha2_assignments <- load_edge_assignments(file.path(tempdir(), "assignments.txt"),
                                              type = "gottcha2")

names(gottcha2_assignments)
table(gottcha2_assignments[[1]]$LEVEL)
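
The two-column listing file that load_edge_assignments expects (project_id, then assignment filename) and the named list it returns can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the toy tables, the file names, and the helpers write_toy_table and load_toy_assignments are hypothetical.

```r
# Hypothetical toy tables standing in for EDGE output files.
toy_dir <- tempdir()
write_toy_table <- function(file_name, taxa, abundance) {
  path <- file.path(toy_dir, file_name)
  toy <- data.frame(LEVEL = "genus", TAXA = taxa, ABUNDANCE = abundance,
                    stringsAsFactors = FALSE)
  write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)
  path
}
path_a <- write_toy_table("toy-project-a.txt", c("Bacillus", "Escherichia"), c(70, 30))
path_b <- write_toy_table("toy-project-b.txt", c("Bacillus", "Listeria"), c(55, 45))

# The listing file: project_id in column 1, assignment file path in column 2.
manifest_path <- file.path(toy_dir, "toy-manifest.txt")
write.table(data.frame(c("project_a", "project_b"), c(path_a, path_b)),
            manifest_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Hypothetical helper, not part of MetaComp.
load_toy_assignments <- function(manifest) {
  listing <- read.delim(manifest, header = FALSE, stringsAsFactors = FALSE,
                        col.names = c("project_id", "path"))
  tables <- lapply(listing$path, read.delim, stringsAsFactors = FALSE)
  # Name each list element after its project_id.
  setNames(tables, listing$project_id)
}

toy_assignments <- load_toy_assignments(manifest_path)
names(toy_assignments)
```

Naming the list elements by project_id matters downstream, because the merge functions use those names as column names.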

Merges two or more EDGE-like taxonomic assignments by abundance.

Description

Merges two or more EDGE-like taxonomic assignments. The input data frames are assumed to have the columns LEVEL, TAXA, and ABUNDANCE, which are used in the merge procedure; all other columns are ignored.

Usage

merge_edge_assignments(assignments)

Arguments

`assignments`	A named list of assignments (each list element's name is used as a column name in the resulting data frame).

Value

A merged table: a data frame whose rows are taxonomic ids and whose columns are the input assignment ids.

Examples

## Not run: 
hmp_even_fp <- system.file("extdata", "HMP_even", package="MetaComp")
hmp_stagger_fp <- system.file("extdata", "HMP_stagger", package="MetaComp")
data_files <- data.frame(V1 = c("HMP_even", "HMP_stagger"),
                         V2 = c(file.path(hmp_even_fp, "allReads-gottcha2-speDB-b.list.txt"),
                                file.path(hmp_stagger_fp, "allReads-gottcha2-speDB-b.list.txt")))
write.table(data_files, file.path(tempdir(), "assignments.txt"),
            row.names = FALSE, col.names = FALSE)
gottcha2_assignments <- merge_edge_assignments(
  load_edge_assignments(file.path(tempdir(), "assignments.txt"), type = "gottcha2"))

## End(Not run)
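
The shape of the merged table (one row per taxon, one column per input assignment) can be illustrated with a small base R sketch. It is only a sketch: the full outer join on LEVEL and TAXA and the zero fill for taxa missing from a project are assumptions, not confirmed details of merge_edge_assignments, and the data and the helper merge_toy_assignments are hypothetical.

```r
# Hypothetical toy assignments; only LEVEL, TAXA, and ABUNDANCE are used.
toy_assignments <- list(
  project_a = data.frame(LEVEL = "genus", TAXA = c("Bacillus", "Escherichia"),
                         ABUNDANCE = c(70, 30), COUNT = c(7L, 3L),
                         stringsAsFactors = FALSE),
  project_b = data.frame(LEVEL = "genus", TAXA = c("Bacillus", "Listeria"),
                         ABUNDANCE = c(55, 45), COUNT = c(11L, 9L),
                         stringsAsFactors = FALSE)
)

# Hypothetical helper, not part of MetaComp.
merge_toy_assignments <- function(assignments, value_column = "ABUNDANCE") {
  # Keep the key columns plus the value column, renamed to the list element's name.
  prepared <- lapply(names(assignments), function(id) {
    df <- assignments[[id]][, c("LEVEL", "TAXA", value_column)]
    names(df)[3] <- id
    df
  })
  # Full outer join on LEVEL and TAXA across all tables (an assumption).
  merged <- Reduce(function(x, y) merge(x, y, by = c("LEVEL", "TAXA"), all = TRUE),
                   prepared)
  # Assumption: a taxon absent from a project is reported as 0.
  merged[is.na(merged)] <- 0
  merged
}

merged_toy <- merge_toy_assignments(toy_assignments)
merged_toy
```

The COUNT column in the toy inputs is dropped, mirroring the statement that all columns other than LEVEL, TAXA, and ABUNDANCE are ignored.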

Merges two or more EDGE-like taxonomic assignments by count.

Description

Merges two or more EDGE-like taxonomic assignments. The input data frames are assumed to have the columns LEVEL, TAXA, and COUNT, which are used in the merge procedure; all other columns are ignored.

Usage

merge_edge_counts(assignments)

Arguments

`assignments`	A named list of assignments (each list element's name is used as a column name in the resulting data frame).

Value

A merged table: a data frame whose rows are taxonomic ids and whose columns are the input assignment ids.
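
The manual gives no example for merge_edge_counts. The following usage sketch assumes it can be called on the named list returned by load_edge_assignments, in the same way as merge_edge_assignments; that pairing is an assumption based on the Arguments section above. The code runs only when MetaComp is installed and otherwise leaves merged_counts as NULL.

```r
# Runs only when MetaComp and its bundled example data are available.
merged_counts <- NULL
if (requireNamespace("MetaComp", quietly = TRUE)) {
  hmp_even_fp <- system.file("extdata", "HMP_even", package = "MetaComp")
  hmp_stagger_fp <- system.file("extdata", "HMP_stagger", package = "MetaComp")
  data_files <- data.frame(
    V1 = c("HMP_even", "HMP_stagger"),
    V2 = c(file.path(hmp_even_fp, "allReads-gottcha2-speDB-b.list.txt"),
           file.path(hmp_stagger_fp, "allReads-gottcha2-speDB-b.list.txt")))
  manifest <- file.path(tempdir(), "assignments.txt")
  write.table(data_files, manifest, row.names = FALSE, col.names = FALSE)
  # Assumption: merge_edge_counts takes the same input as merge_edge_assignments.
  merged_counts <- MetaComp::merge_edge_counts(
    MetaComp::load_edge_assignments(manifest, type = "gottcha2"))
  head(merged_counts)
}
```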

Generates a single-column ggplot for a taxonomic assignment table and also outputs a PDF.

Description

This implementation is built upon ggplot2's geom_tile.

Usage

plot_edge_assignment(assignment, level, plot_title, column_title, filename)

Arguments

`assignment`	The EDGE-like assignment table.
`level`	The taxonomic level to plot (e.g., family or strain).
`plot_title`	The plot title, e.g., "Project XX, Run YY".
`column_title`	The column title.
`filename`	The PDF file name mask.

Value

the ggplot2 plot.

Examples

pa_fpath <- system.file("extdata", "HMP_even", "allReads-pangia.list.txt", package = "MetaComp")
pangia_assignment <- load_edge_assignment(pa_fpath, type = "pangia")

plot_edge_assignment(pangia_assignment, "phylum", "Pangia", "HMP Even",
                     file.path(tempdir(), "assignment.pdf"))

Generates a single column ggplot for a taxonomic assignment table.

Description

This implementation...

Usage

plot_merged_assignment(assignment, taxonomy_level,
  sorting_order = "abundance", row_limit = 60, min_row_abundance = 0,
  plot_title, filename)

Arguments

`assignment`	The gottcha-like merged assignment table.
`taxonomy_level`	The taxonomic level to plot.
`sorting_order`	the order in which rows are sorted: "abundance" (the default) or "alphabetical".
`row_limit`	the maximum number of rows to plot (default is 60).
`min_row_abundance`	the minimal sum of abundances in a row required for it to be plotted. Rows whose sum is less than this value are dropped even if row_limit is specified. Ignored for "alphabetical" order (default is 0.0).
`plot_title`	The plot title.
`filename`	The output file mask; PDF and SVG files are produced with the Cairo device.

Examples

## Not run: 
hmp_even_fp <- system.file("extdata", "HMP_even", package="MetaComp")
hmp_stagger_fp <- system.file("extdata", "HMP_stagger", package="MetaComp")
data_files <- data.frame(V1 = c("HMP_even", "HMP_stagger"),
                         V2 = c(file.path(hmp_even_fp, "allReads-gottcha2-speDB-b.list.txt"),
                                file.path(hmp_stagger_fp, "allReads-gottcha2-speDB-b.list.txt")))
write.table(data_files, file.path(tempdir(), "assignments.txt"),
            row.names = FALSE, col.names = FALSE)
gottcha2_assignments <- merge_edge_assignments(
  load_edge_assignments(file.path(tempdir(), "assignments.txt"), type = "gottcha2"))
plot_merged_assignment(gottcha2_assignments, "family", "alphabetical", 100, 0,
                       "HMP side-to-side", file.path(tempdir(), "assignment.pdf"))

## End(Not run)
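
How sorting_order, row_limit, and min_row_abundance might interact when plot_merged_assignment chooses rows can be sketched in base R. This is one reading of the argument descriptions, not the package's code: the order of operations (filter by row sum, sort, then truncate) is an assumption, and the toy table and the helper select_rows_to_plot are hypothetical.

```r
# Hypothetical merged table: one row per taxon, one column per project.
toy_merged <- data.frame(
  TAXA = c("Bacillaceae", "Listeriaceae", "Enterobacteriaceae", "Rare family"),
  project_a = c(40, 25, 34.9, 0.1),
  project_b = c(10, 60, 30, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of MetaComp.
select_rows_to_plot <- function(merged, sorting_order = "abundance",
                                row_limit = 60, min_row_abundance = 0) {
  value_columns <- setdiff(names(merged), "TAXA")
  if (sorting_order == "abundance") {
    row_sums <- rowSums(merged[, value_columns, drop = FALSE])
    # Drop rows whose summed abundance falls below the threshold.
    keep <- row_sums >= min_row_abundance
    merged <- merged[keep, , drop = FALSE]
    # Most abundant rows first.
    merged <- merged[order(row_sums[keep], decreasing = TRUE), , drop = FALSE]
  } else {
    # The threshold is ignored for alphabetical order.
    merged <- merged[order(merged$TAXA), , drop = FALSE]
  }
  # Cap the number of rows (assumed to happen after sorting).
  head(merged, row_limit)
}

select_rows_to_plot(toy_merged, "abundance", row_limit = 2, min_row_abundance = 1)
```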

