This package is used for estimating viral transmission bottleneck sizes using different methods.
First step, install the package “ViralBottleneck” and download dataset in test_dataset folder
Second step, the transmission object need to be created before bottleneck size estimation. To create transmission object, the working directory need to meet two requirements: transmission pairs table and sample files used for estimation. This package would extract sample files according to the transmission pairs table the users input.
The example of the transmission pairs table is below (in
test_dataset
folder in package)
donor | recipient |
---|---|
donor_3000 | 50_0_All_r1 |
donor_3000 | 50_3_All_r1 |
donor_3000 | 50_6_All_r1 |
donor_3000 | 50_9_All_r1 |
donor_3000 | 50_12_All_r1 |
Note: Do not put the “-” in name of sample.
After making sure the sample files all exist according to the transmission pairs, start to create transmission object. example code:
Sim_trans = read.table("Example_TansmissionPairs.csv",header = TRUE,sep = ",")
Sim_ob = CreateTransmissionObject(Sim_trans)
2.1 Subset transmission object
The transmission object could be used as list.
After creating transmission object, the Summary_ob
function would provide the information of shared sites (the sites belong
to shared sites should be sequenced both in donor and recipient.) for
users. Example code:
The result:
Donors | Recipients | number.of.shared.sites |
---|---|---|
donor_3000 | 50_0_All_r1 | 13158 |
donor_3000 | 50_3_All_r1 | 13158 |
donor_3000 | 50_6_All_r1 | 13158 |
donor_3000 | 50_9_All_r1 | 13158 |
donor_3000 | 50_12_All_r1 | 13158 |
Finally, start to calculate transmission bottleneck size using transmission object.
4.1 Output of
Bottleneck_size_Calculation
function
Take calculation using Beta-binomial method approximate version as an example:
BB_App_output =
Bottleneck_size_Calculation(
transmission_ob = Sim_ob,
method ="Beta_binomial_Approximate",
error_calling = 0,
variant_calling = 0.03,
Nbmin = 1,
Nbmax = 200,
donor_depth_threshold = 0,
recipient_depth_threshold = 0
)
Output like:
donor | recipient | transmission_bottleneck_size | CI_low | CI_high |
---|---|---|---|---|
donor_3000 | 50_0_All_r1 | 70 | 64 | 70 |
donor_3000 | 50_3_All_r1 | 45 | 30 | 64 |
donor_3000 | 50_6_All_r1 | 28 | 20 | 39 |
donor_3000 | 50_9_All_r1 | 34 | 23 | 47 |
donor_3000 | 50_12_All_r1 | 47 | 31 | 67 |
4.2 Specify transmission pairs during estimation
This package provide a chance that if user need to specify some transmission pairs for estimation
subset_transmission_pairs = read.table("Samplesize50_r1_TansmissionPairs_specify.csv"
,header = TRUE,sep = ",")
4.3 Calculation
Bottleneck_size_Calculation
could create plot of
likelihood curve for each transmission pairs in working directory.
However, this argument just used for the methods using maximum
likelihoods estimation, including KL
method (Emmett et al.,
2015), Presence-Absence
method (Sacristán
et al., 2011), Binomial
method (Leonard et
al., 2017), Beta_binomial_Approximate
method (Leonard et
al., 2017) and Beta_binomial_Exact
method (Leonard et
al., 2017). Using show_table
and plot
options could help to save output and obtain the plots of likelihood
curve for each transmission pairs. (Note: if you want to access the
original publication for each methods, you could click the
Publication link after each methods)
The program would create individual folder for each transmission pair to store the plot. Example code for creating plot:
BB_App_output_plot =
Bottleneck_size_Calculation(
transmission_ob = Sim_ob,
method = "Beta_binomial_Approximate",
error_calling = 0,
variant_calling = 0.03,
Nbmin = 1,
Nbmax = 200,
donor_depth_threshold = 0,
recipient_depth_threshold = 0,
show_table = FALSE,
plot= TRUE
)
The plot of likelihood curve for one transmission pairs (donor_3000-50_3_All_r1) is below:
4.4 Log file
Bottleneck_size_Calculation
could create log file
containing number of variant used in calculation and number of variant
filtered before calculation in working directory.
Example code:
BB_App_output_log =
Bottleneck_size_Calculation(
transmission_ob = Sim_ob,
method = "Beta_binomial_Approximate",
error_calling = 0,
variant_calling = 0.03,
Nbmin = 1,
Nbmax = 200,
donor_depth_threshold = 0,
recipient_depth_threshold = 0,
log= TRUE
)
Output of log
argument:
donor | recipient | donor_used | donor_unused | recipient_used | recipient_unused |
---|---|---|---|---|---|
donor_3000 | 50_0_All_r1 | 193 | 12965 | 193 | 12965 |
donor_3000 | 50_3_All_r1 | 193 | 12965 | 193 | 12965 |
donor_3000 | 50_6_All_r1 | 193 | 12965 | 193 | 12965 |
donor_3000 | 50_9_All_r1 | 193 | 12965 | 193 | 12965 |
donor_3000 | 50_12_All_r1 | 193 | 12965 | 193 | 12965 |
4.5 Methods comparison
Given that one major purpose of the package is to compare calculation of bottleneck sizes across methods on the same data set, it would be nice to illustrate this. For example, compare all methods (except Wright-Fisher, see below) on a single pair, Sim_ob[1]:
all_methods <-
c("KL", "Presence-Absence", "Binomial", "Beta_binomial_Approximate", "Beta_binomial_Exact")
compare_methods <-
t(sapply(all_methods, function(m){
Bottleneck_size_Calculation(Sim_ob[1], method = m)
}))
compare_methods
An example using the realistic H1N1 dataset in the folder
test_dataset
:
library(ViralBottleneck)
# Set working directory and make sure you have
# transmission pairs file and related host files in this directory.
setwd("your working directory")
# Create transmission object.
transmission_pairs = read.csv("H1N1_transmission_pairs.csv", sep = " ")
ob_H1N1 = ViralBottleneck::CreateTransmissionObject(transmission_pairs)
# Applying all methods on one transmission pair.
all_methods <-
c("KL", "Presence-Absence", "Binomial", "Beta_binomial_Approximate", "Beta_binomial_Exact")
compare_methods <-
t(sapply(all_methods, function(m){
Bottleneck_size_Calculation(ob_H1N1[1],
error_calling = 0,
variant_calling = 0.03
Nbmin = 1,Nbmax = 400,
donor_depth_threshold = 0,
recipient_depth_threshold = 0 ,
method = m)
}))
# Save results as csv file.
write.csv(compare_methods,"compare_methods.csv")
result:
method | donor | recipient | transmission_bottleneck_size | CI_low | CI_high |
---|---|---|---|---|---|
KL | 681_1_H1N1_donor | 681_1_H1N1_recipient | 21 | 14 | 30 |
Presence-Absence | 681_1_H1N1_donor | 681_1_H1N1_recipient | 13 | 9 | 19 |
Binomial | 681_1_H1N1_donor | 681_1_H1N1_recipient | 66 | 66 | 67 |
Beta_binomial_Approximate | 681_1_H1N1_donor | 681_1_H1N1_recipient | 50 | 30 | 78 |
Beta_binomial_Exact | 681_1_H1N1_donor | 681_1_H1N1_recipient | 49 | 30 | 78 |