Logistic Regression Splicing Filter
This is an R-based filter, so it will only be available if you have configured
R in the SeqMonk preferences and installed the prerequisite packages.
The logistic regression splicing filter is useful for detecting certain classes
of alternative splicing events in RNA-Seq data. There are some prerequisites
for using this filter:
- You must be analysing RNA-Seq data which has been imported with the
"Import Introns Rather than Exons" option set, so that the data you see
are the introns from splicing events.
- You must have generated probes such that the probes lie directly over
introns. You would normally do this using the Read Position probe generator, but
you could also theoretically use the Feature probe generator to specifically define
introns.
- Your quantitation must be a raw count of how many times that intron was observed
in each sample. This would normally be achieved using the Exact Overlap Count
quantitation method with all of the normalisation options turned off.
- Your datasets must be assembled into replicate sets with at least 3 replicates
per set.
The filter identifies putative splicing events by finding pairs of
probes whose start or end positions are exactly the same. It then uses a logistic
regression test to determine whether the ratio of the counts in these two introns changes
significantly between the two replicate sets you're testing. A change in the ratio
would imply a change in the splicing decisions being made for that transcript.
This filter will therefore find differential splicing events where the location
of the splice donor or splice acceptor changes, but will not find events such as
novel splice introductions, or changes from a splice to reading through an intron.
For these types of change you would need to use a DESeq2 comparison on the splice
junction count data.
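As a rough illustration of the pairing and ratio test described above, the following Python sketch groups introns that share a start or an end position and then applies a likelihood-ratio test from a logistic regression with condition as the only predictor. It is not SeqMonk's implementation (the filter itself runs in R). The function names, the (chromosome, start, end) tuple layout, and the decisions to pool replicates and ignore strand are all assumptions made for the example.

```python
import math
from collections import defaultdict
from itertools import combinations


def find_intron_pairs(introns):
    """Return sorted pairs of introns that share a start or an end position.

    `introns` is a list of (chromosome, start, end) tuples. This layout is
    hypothetical and strand is ignored for brevity.
    """
    by_start = defaultdict(list)
    by_end = defaultdict(list)
    for intron in introns:
        chrom, start, end = intron
        by_start[(chrom, start)].append(intron)
        by_end[(chrom, end)].append(intron)

    pairs = set()
    for group in list(by_start.values()) + list(by_end.values()):
        # Every combination of distinct introns in a group is a candidate event.
        for a, b in combinations(sorted(set(group)), 2):
            pairs.add((a, b))
    return sorted(pairs)


def _binomial_loglik(a, b):
    """Maximised binomial log-likelihood (constant terms dropped)."""
    n = a + b
    loglik = 0.0
    if a > 0:
        loglik += a * math.log(a / n)
    if b > 0:
        loglik += b * math.log(b / n)
    return loglik


def ratio_change_p_value(a_set1, b_set1, a_set2, b_set2):
    """P-value for a change in the intron A : intron B ratio between two sets.

    Each argument is a list of raw counts, one per replicate. Replicates are
    pooled (an assumption of this sketch), so the test compares a logistic
    model with a condition term against an intercept-only model.
    """
    a1, b1, a2, b2 = sum(a_set1), sum(b_set1), sum(a_set2), sum(b_set2)
    full = _binomial_loglik(a1, b1) + _binomial_loglik(a2, b2)
    null = _binomial_loglik(a1 + a2, b1 + b2)
    statistic = max(0.0, 2.0 * (full - null))
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    example = [("1", 100, 200), ("1", 100, 300), ("1", 150, 300)]
    for pair in find_intron_pairs(example):
        print(pair)
    print(ratio_change_p_value([90, 95, 100], [10, 12, 9], [10, 11, 9], [95, 90, 100]))
```

A small p-value suggests that the ratio differs between the two replicate sets. Because replicates are pooled, this sketch ignores replicate-to-replicate variability, which a real analysis would need to take into account.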
Options
- You need to select two Replicate Sets to compare from the list on the left.
- You need to select a p-value cutoff for the filter.
- You can choose to pre-filter your data by the minimum count an intron has across all the samples in a condition.
- You can choose whether to apply multiple testing correction (you will almost always want to leave this turned on).
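To show how these options might interact, here is a minimal Python sketch of a minimum-count pre-filter and a p-value cutoff with optional multiple testing correction. The text above does not say which correction the filter applies, so the Benjamini-Hochberg procedure is assumed here. The minimum-count rule (every sample in the condition must reach the threshold) is likewise only one reading of the option, and all function names are hypothetical.

```python
def passes_min_count(counts, min_count):
    """True if every sample in a condition reaches `min_count`.

    This is one possible interpretation of the pre-filter option, not a
    confirmed description of SeqMonk's behaviour.
    """
    return min(counts) >= min_count


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def passing_indices(p_values, cutoff=0.05, correct=True):
    """Indices of tests whose (optionally corrected) p-value is below `cutoff`."""
    values = benjamini_hochberg(p_values) if correct else list(p_values)
    return [i for i, p in enumerate(values) if p < cutoff]


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.20]
    print(passing_indices(raw, cutoff=0.05, correct=False))
    print(passing_indices(raw, cutoff=0.05, correct=True))
```

Comparing the two calls illustrates why the correction is worth leaving on: when many intron pairs are tested at once, some raw p-values fall below the cutoff by chance alone, and the correction removes the weaker ones.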