API reference
This page provides an auto-generated summary of MAJIQ’s API. For more details and examples, refer to the relevant chapters in the main part of the documentation.
Random number generation
MAJIQ uses a pool of random number generators to handle multithreaded random
number generation, which is separate from numpy or dask random number
generation.
If more than a single thread is needed for a task involving random numbers,
remember to call rng_resize() to size the pool of RNGs to match the number of
threads.
|
Set seed for random number generator pools |
|
Resize rng pools to allow n simultaneous threads |
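The idea of a seeded, resizable pool can be illustrated with a minimal standard-library sketch. It is not MAJIQ's implementation: the RngPool class, its methods, and the per-slot seeding scheme are hypothetical and only show why the pool must be at least as large as the number of threads that draw from it.

```python
import random
from concurrent.futures import ThreadPoolExecutor


class RngPool:
    """Hypothetical pool of independent generators, one per worker slot."""

    def __init__(self, n=1, seed=0):
        self._seed = seed
        self._rngs = []
        self.resize(n)

    def seed(self, seed):
        """Reseed every generator in the pool deterministically."""
        self._seed = seed
        n = len(self._rngs)
        self._rngs = []
        self.resize(n)

    def resize(self, n):
        """Grow or shrink the pool so that n threads each get a generator."""
        del self._rngs[n:]
        for slot in range(len(self._rngs), n):
            # Seed each slot from (pool seed, slot index) so streams differ.
            self._rngs.append(random.Random(f"{self._seed}:{slot}"))

    def __len__(self):
        return len(self._rngs)

    def draw(self, slot):
        """Draw one uniform number from the generator owned by this slot."""
        return self._rngs[slot].random()


pool = RngPool(n=1, seed=42)
n_threads = 4
pool.resize(n_threads)  # size the pool before starting threaded work

with ThreadPoolExecutor(max_workers=n_threads) as executor:
    # Each task uses its own slot, so no generator is shared between threads.
    draws = list(executor.map(pool.draw, range(n_threads)))
```

Because each slot has its own deterministic seed, rerunning with the same pool seed and size reproduces the same draws regardless of thread scheduling.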
Build API
MAJIQ builds a SpliceGraph object from GFF3 and coverage from BAMs.
The splicegraph is later used to define Events for quantification.
Classes
Spliced junction and retained intron coverage for the same experiment. |
|
Per-bin read coverage over junctions |
|
Per-bin read coverage over introns |
|
Representation of all possible splicing changes in each gene. |
|
Collection of contigs/chromosomes on which genes can be defined |
|
Collection of genes on |
|
Collection of exons per gene and their annotated/updated coordinates |
|
Collection of introns per gene and their coordinates, flags, and exons |
|
Collection of junctions per gene and their coordinates, flags, and exons |
|
Map from exons to the introns and junctions that start or end from them |
|
Thresholds on intron/junction coverage for inclusion in splicegraph |
|
Accumulator of |
|
Accumulator of |
|
Accumulator of |
|
Accumulator of |
|
Accumulate |
Create a splicegraph from GFF3
The first step in MAJIQ is to build a splicegraph from transcriptome annotations (GFF3).
|
Create |
Save/load splicegraphs to zarr
These splicegraphs are saved and loaded with the following commands.
|
Save |
|
Load |
Process BAMs for junction/intron coverage
Updating the annotated splicegraph requires information about coverage from
RNA-seq experiments, which is represented by
SJExperiment
objects.
|
Load |
|
Save |
|
Load |
|
|
|
Update SpliceGraph structure, passed flags
SpliceGraphs are generally updated in the following manner:
1. Update GeneJunctions using either coverage or existing GeneJunctions objects.
2. Update Exons to match updated GeneJunctions.
3. Get potential introns between exons. Update them using old introns and/or coverage, then filter for those that passed.
4. Create updated ExonConnections, then SpliceGraph.
Update junctions using coverage
GeneJunctions
can be updated using coverage from SJExperiment
objects.
This is done by:
1. Creating a PassedJunctionsGenerator (GeneJunctions.builder()).
2. Experiments are passed in as “build groups” where evidence for a junction must be found in some minimum number of experiments of at least one build group. Build groups are represented by GroupJunctionsGenerator (created by GeneJunctions.build_group()).
3. Add SJJunctionsBins from experiments in a build group using GroupJunctionsGenerator.add_experiment(), then add build groups using PassedJunctionsGenerator.add_group().
4. The updated GeneJunctions is then returned by PassedJunctionsGenerator.get_passed().
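The build-group rule described above (enough experiments within at least one group) can be sketched in plain Python. This is an illustration of the logic only, not MAJIQ code: the function name, the dictionary representation of an experiment, and the min_reads and min_experiments defaults are assumptions.

```python
def passed_junctions(build_groups, min_reads=3, min_experiments=2):
    """Return junctions with enough evidence in at least one build group.

    build_groups: list of groups; each group is a list of experiments;
    each experiment maps a junction name to its read count.
    """
    passed = set()
    for group in build_groups:
        # Count, per junction, the experiments in this group with evidence.
        n_supporting = {}
        for experiment in group:
            for junction, reads in experiment.items():
                if reads >= min_reads:
                    n_supporting[junction] = n_supporting.get(junction, 0) + 1
        # A junction passes as soon as a single group supports it.
        passed.update(j for j, n in n_supporting.items() if n >= min_experiments)
    return passed


group_a = [{"j1": 10, "j2": 1}, {"j1": 5, "j2": 4}]
group_b = [{"j2": 7, "j3": 2}, {"j2": 3, "j3": 9}]
result = passed_junctions([group_a, group_b])
```

Here j1 passes through group_a, j2 passes through group_b only, and j3 never has two supporting experiments in the same group, so it is not passed.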
Create |
|
|
Create |
Add |
|
|
Update passed junctions with |
Return |
Update junctions from other junctions
Updated GeneJunctions
can also be created by loading junctions from
previous splicegraphs and combining them using GeneJunctions
.
Note that they must all share the same Genes object, which can be done
by setting the genes argument of GeneJunctions.from_zarr().
|
Add |
Return |
Update exons
Update Exons
to match updated
GeneJunctions
.
|
Return updated |
Update introns
Generally, updated introns are obtained as follows:
1. Determine potential introns between exons (Exons.potential_introns()).
2. Update flags of potential introns using old introns (GeneIntrons.update_flags_from()) and/or coverage in build groups (GeneIntrons.build_group()).
3. Filter potential introns to only those that passed build filters (GeneIntrons.filter_passed()).
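The three steps above can be mimicked on plain coordinate tuples. The helpers below are standalone and hypothetical; they are only loosely named after the MAJIQ methods and assume closed (start, end) intervals, non-overlapping exons within a single gene, and a single "passed" flag per intron.

```python
def potential_introns(exons):
    """Return the gaps between consecutive exons of one gene."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > 1:  # at least one base between the exons
            introns.append((prev_end + 1, next_start - 1))
    return introns


def update_flags_from(candidates, old_passed):
    """Mark a candidate as passed if it overlaps a previously passed intron."""
    return {
        (start, end): any(start <= o_end and o_start <= end
                          for o_start, o_end in old_passed)
        for start, end in candidates
    }


def filter_passed(flags):
    """Keep only the introns whose passed flag is set."""
    return [intron for intron, passed in flags.items() if passed]


exons = [(100, 200), (300, 400), (401, 500), (600, 700)]
candidates = potential_introns(exons)
flags = update_flags_from(candidates, old_passed=[(250, 280)])
introns = filter_passed(flags)
```

Adjacent exons such as (300, 400) and (401, 500) leave no gap, so no potential intron is created between them.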
Return empty |
|
|
|
|
Update flags using overlapping donor |
Create |
|
|
Add |
In-place update of original |
|
|
Return |
Update SpliceGraph
The updated splicegraph is made by:
1. Creating updated ExonConnections (create_connecting()).
2. Creating updated SpliceGraph (with_updated_exon_connections()).
|
Construct |
Create |
|
|
Create |
Update simplifier flags
Simplifier flags allow excluding introns and junctions that pass reliability
thresholds (raw readrates/nonzero bins) but have negligible coverage relative
to the events they are a part of (PSI).
These flags are updated in place by creating a SimplifierGroup, which
accumulates SJExperiment objects per group and updates intron/junction
flags for a group using SimplifierGroup.update_connections().
Set all connections to the simplified state |
|
Set all connections to the simplified state |
|
Set all connections to the unsimplified state |
|
Set all connections to the unsimplified state |
|
Create |
|
|
Add |
In-place update of connections passing thresholds in enough experiments |
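As a rough illustration of simplification, the sketch below starts from the fully simplified state and unsimplifies connections whose PSI reaches a threshold in enough experiments. It is an assumption-laden simplification of the idea, not MAJIQ's procedure: the function name, the PSI-only criterion, and the min_psi and min_experiments defaults are hypothetical.

```python
def simplified_flags(psi_by_experiment, min_psi=0.01, min_experiments=1):
    """Return one simplified flag per connection of a single event.

    psi_by_experiment: list of experiments; each is a list with the PSI
    of every connection in the event.
    """
    n_connections = len(psi_by_experiment[0])
    simplified = [True] * n_connections  # start from the simplified state
    for c in range(n_connections):
        n_passing = sum(psi[c] >= min_psi for psi in psi_by_experiment)
        if n_passing >= min_experiments:
            simplified[c] = False  # enough evidence: keep this connection
    return simplified


psi = [
    [0.700, 0.295, 0.005],  # experiment 1
    [0.600, 0.398, 0.002],  # experiment 2
]
flags = simplified_flags(psi)
```

The third connection never reaches 1% of its event's coverage, so it stays simplified and would be excluded downstream.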
Events API
An event is defined by a reference exon and connections (junctions and/or
intron) that all start or end at the reference exon.
The Events class represents a collection of these events as arrays
over events (e_idx) and connections per event (ec_idx).
The mapping from events to event connections is specified by offsets yielding
the start/end indexes of ec_idx for each event.
The events use indexes to refer back to the splicegraph/exon connections that
were used to create them.
UniqueEventsMasks and Events.unique_events_mask() allow
identification of events that are unique or shared between two Events
objects.
This is useful for analyses involving multiple splicegraphs derived from a common
splicegraph (e.g. a common set of controls).
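The notion of unique versus shared events can be sketched with hashable tuples. This is a conceptual illustration only: representing an event as (reference exon, event type, connections) and the helper name unique_and_shared are assumptions, not how MAJIQ stores or compares events.

```python
def unique_and_shared(events_a, events_b):
    """Return (unique, shared) boolean lists over events_a relative to events_b."""
    other = set(events_b)
    shared = [event in other for event in events_a]
    unique = [not is_shared for is_shared in shared]
    return unique, shared


# An event: (reference exon, event type, connections leaving/entering it).
common = ((100, 200), "s", ((200, 300), (200, 400)))
case_only = ((500, 600), "s", ((600, 700), (600, 750), (600, 900)))
control_version = ((500, 600), "s", ((600, 700), (600, 900)))

case_events = [common, case_only]
control_events = [common, control_version]
unique, shared = unique_and_shared(case_events, control_events)
```

The second event has the same reference exon in both collections, but the extra connection in the case splicegraph makes it a different event, so it is flagged as unique.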
Classes
Collections of introns/junctions all starting or ending at the same exon |
|
Masks between two |
Create/save events objects
|
construct |
construct |
|
|
Construct |
|
Construct |
|
Save |
|
Load |
Work with events objects
|
Get |
|
|
|
|
|
|
|
|
|
|
Information on unique events
Index over unique events |
|
Index into self.exons for reference exon of each unique event |
|
Indicator if source ('s') or target ('t') for each unique event |
|
First index into event connections (ec_idx) for each unique event |
|
One-past-end index into event connections (ec_idx) for each unique event |
|
|
Get slice into event connections for event with specified index |
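The start and one-past-end offsets described above behave like a compressed row layout: event e owns the half-open range of ec_idx between its two offsets. A small sketch with made-up values (the variable and function names here are illustrative, not MAJIQ attributes):

```python
# Offsets for three events over seven event connections (made-up values).
ec_idx_start = [0, 2, 5]
ec_idx_end = [2, 5, 7]  # one-past-end, so ranges are half-open
connections = ["j0", "j1", "j2", "i0", "j3", "j4", "j5"]


def connections_slice(e_idx):
    """Return the slice into event connections for the event at e_idx."""
    return slice(ec_idx_start[e_idx], ec_idx_end[e_idx])


second_event = connections[connections_slice(1)]
event_sizes = [end - start for start, end in zip(ec_idx_start, ec_idx_end)]
```

Each event's end offset equals the next event's start offset, so every connection belongs to exactly one event.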
Information on connections per event
Index over event connections |
|
Indicator if an intron or junction for each event connection |
|
Index into self.introns or self.junctions for each event connection |
|
|
Index into self.genes for selected event connections |
|
Start coordinate for each selected event connection |
|
End coordinate for each selected event connection |
|
Indicator if connection was denovo for each selected event connection |
Index into self.exons for reference exon for each event connection |
|
|
Index into self.exons for nonreference exon for each event connection |
PsiCoverage API
PsiCoverage describes coverage over Events in one or more
independent “prefixes”.
PsiCoverage can be created over events for a single experiment using
PsiCoverage.from_sj_lsvs() (the prefix is taken from the prefix of the
original BAM file, which is where the term “prefix” originates).
New PsiCoverage files can subsequently be created by loading existing ones
together or by aggregating coverage over multiple prefixes.
Finally, PsiCoverage provides attributes and functions that enable
lazy computation of PSI posterior quantities using xarray/Dask.
Classes
Summarized raw and bootstrap coverage over LSVs for one or more experiments. |
Create/save PsiCoverage
Create PsiCoverage from SJ coverage
|
Create |
Save PsiCoverage to zarr
|
Save |
|
Initialize zarr store for saving |
|
Save |
Load and update PsiCoverage
|
Load |
|
Create updated |
|
Create aggregated |
|
Return |
Events/prefixes with coverage
Total number of connections over all events |
|
|
Construct |
Number of independent experiments |
|
Names of independent units of analysis |
|
array(prefix, ec_idx) indicating if event passed |
|
Number of prefixes for which an event was passed |
|
Return boolean mask array for events passing min_experiments |
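Counting passed prefixes and thresholding on a minimum number of experiments can be sketched with nested lists. The function below is hypothetical; in particular, treating a min_experiments value below 1 as a fraction of the prefixes is an assumption made for this example.

```python
import math


def passed_min_experiments(event_passed, min_experiments=0.5):
    """Return (num_passed, mask) per event connection.

    event_passed: list over prefixes; each entry is a list of booleans
    over event connections (the array(prefix, ec_idx) layout).
    """
    n_prefixes = len(event_passed)
    if min_experiments < 1:
        # Assumption: values below 1 are a fraction of the prefixes.
        required = max(1, math.ceil(min_experiments * n_prefixes))
    else:
        required = min(int(min_experiments), n_prefixes)
    # Transpose to iterate over connections, then count passing prefixes.
    num_passed = [sum(column) for column in zip(*event_passed)]
    return num_passed, [n >= required for n in num_passed]


event_passed = [
    [True, False, True],   # prefix 1
    [True, False, False],  # prefix 2
    [True, True, False],   # prefix 3
]
num_passed, mask = passed_min_experiments(event_passed, min_experiments=0.5)
```

With three prefixes and a fraction of 0.5, two passing prefixes are required, so only the first connection is kept.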
Raw coverage/posteriors
array(prefix, ec_idx) raw total reads over event |
|
array(prefix, ec_idx) coverage for individual connection (psi * total) |
|
array(prefix, ec_idx) alpha parameter of raw posterior |
|
array(prefix, ec_idx) beta parameter of raw posterior |
|
array(...) means of raw posterior distribution on PSI (alias) |
|
array(...) standard deviations of raw posterior distribution (alias) |
|
array(...) median over prefixes of raw_psi_mean |
|
empirical quantiles over prefixes of raw_psi_mean |
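Given the alpha and beta parameters of a beta posterior, its mean and standard deviation follow from the standard closed forms. The sketch below applies them to made-up parameters for one event connection across three prefixes and then takes the median of the means over prefixes; the variable names are illustrative only.

```python
import math
import statistics


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_std(alpha, beta):
    """Standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return math.sqrt(alpha * beta / (total ** 2 * (total + 1)))


# Made-up posterior parameters for one connection in three prefixes.
alphas = [3.0, 2.0, 1.0]
betas = [1.0, 2.0, 3.0]

psi_means = [beta_mean(a, b) for a, b in zip(alphas, betas)]
psi_stds = [beta_std(a, b) for a, b in zip(alphas, betas)]
population_median = statistics.median(psi_means)
```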
Bootstrap coverage/posteriors
Number of bootstrap replicates used for bootstrapped coverage estimates |
|
array(prefix, ec_idx, bootstrap_replicate) bootstrapped raw_total |
|
array(prefix, ec_idx, bootstrap_replicate) bootstrapped raw_coverage |
|
array(prefix, ec_idx, bootstrap_replicate) alpha parameter of bootstrapped posterior |
|
array(prefix, ec_idx, bootstrap_replicate) beta parameter of bootstrapped posterior |
|
array(...) means of bootstrap posterior distribution on PSI (alias) |
|
array(...) median of means of bootstrapped posteriors |
|
array(...) standard deviations of bootstrap posterior distribution (alias) |
|
array(...) median over prefixes of bootstrap_psi_mean |
|
empirical quantiles over prefixes of bootstrap_psi_mean |
Beta approximation to bootstrap mixture coverage/posteriors
array(prefix, ec_idx) alpha parameter of approximated bootstrap posterior |
|
array(prefix, ec_idx) beta parameter of approximated bootstrap posterior |
|
|
Compute quantiles of approximate/smoothed bootstrapped posterior |
Compute discretized PMF of approximate/smoothed bootstrap posterior |
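One common way to approximate an equally weighted mixture of beta distributions by a single beta is to match the mixture's mean and variance. The sketch below shows that moment-matching step; whether MAJIQ uses exactly this procedure is an assumption here, and the function name is hypothetical.

```python
import math


def approximate_beta(alphas, betas):
    """Moment-match an equally weighted mixture of betas to a single beta."""
    means = [a / (a + b) for a, b in zip(alphas, betas)]
    variances = [
        a * b / ((a + b) ** 2 * (a + b + 1)) for a, b in zip(alphas, betas)
    ]
    n = len(means)
    mean = sum(means) / n
    # Law of total variance: E[var] + var of the component means.
    variance = sum(v + m * m for v, m in zip(variances, means)) / n - mean * mean
    # Solve mean and variance of Beta(alpha, beta) for alpha + beta.
    concentration = mean * (1 - mean) / variance - 1
    return mean * concentration, (1 - mean) * concentration


# Two bootstrap replicates whose posteriors disagree symmetrically.
approx_alpha, approx_beta = approximate_beta([2.0, 3.0], [3.0, 2.0])
```

Disagreement between replicates inflates the mixture variance, so the matched beta is less concentrated than either component (alpha + beta of 4 rather than 5 in this example).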
Quantifier API
DeltaPsi (replicate PsiCoverage)
|
Prior on DeltaPsi as weighted mixture of beta distributions (over [-1, 1]) |
|
Use reliable binary events from psi1,2 to return updated prior |
|
Compute DeltaPsi between two groups of PsiCoverage (replicate assumption) |
|
Reduce to |
|
|
|
Specialization of PMFSummaries for DeltaPsi on [-1, 1] |
expectation of position in bins |
|
standard deviation (sqrt of variance) |
|
Probability that abs(dPSI) > changing_threshold |
|
Probability that abs(dPSI) <= nonchanging_threshold |
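The summaries listed above can be computed from a discretized PMF over equal-width bins on [-1, 1]. The sketch below is a simplified illustration: it places each bin's mass at its midpoint and assigns whole bins to a side of a threshold, and the function name and default thresholds are assumptions.

```python
import math


def dpsi_summaries(pmf, changing_threshold=0.2, nonchanging_threshold=0.05):
    """Summarize a PMF over equal-width bins covering [-1, 1]."""
    n_bins = len(pmf)
    width = 2.0 / n_bins
    midpoints = [-1.0 + (i + 0.5) * width for i in range(n_bins)]
    mean = sum(p * x for p, x in zip(pmf, midpoints))
    variance = sum(p * (x - mean) ** 2 for p, x in zip(pmf, midpoints))
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        # Whole bins are assigned by midpoint (a simplification).
        "prob_changing": sum(
            p for p, x in zip(pmf, midpoints) if abs(x) > changing_threshold
        ),
        "prob_nonchanging": sum(
            p for p, x in zip(pmf, midpoints) if abs(x) <= nonchanging_threshold
        ),
    }


# Ten bins of width 0.2; mass concentrated on slightly positive dPSI.
pmf = [0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.0, 0.0]
summary = dpsi_summaries(pmf, changing_threshold=0.2, nonchanging_threshold=0.15)
```

The two probabilities need not sum to one: mass between the nonchanging and changing thresholds counts toward neither.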
Heterogen (independent PsiCoverage)
|
Compare Psi between two groups of PsiCoverage (independence assumption) |
|
|
|
Statistics on means, samples from raw posteriors |
|
Statistics on means, samples from approximate posteriors |
CLIN (in development)
Controls
|
Summary of PSI posterior means over large group of controls |
|
Save PSI coverage dataset as zarr |
alias for controls_q |
|
Get boolean mask of events that pass enough experiments |
|
For each controls_alpha, range between lower/upper quantiles (scale) |
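The scale of a control group can be pictured as the width of a central quantile interval of PSI posterior means. The sketch below uses linear-interpolation quantiles; mapping an alpha value to the alpha/2 and 1 - alpha/2 quantiles is an assumption for this example, and the function names are hypothetical.

```python
import math


def quantile(sorted_values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def controls_scale(psi_means, alpha=0.1):
    """Return (lower quantile, upper quantile, scale) for one connection."""
    ordered = sorted(psi_means)
    q_lower = quantile(ordered, alpha / 2)
    q_upper = quantile(ordered, 1 - alpha / 2)
    return q_lower, q_upper, q_upper - q_lower


# PSI posterior means for one connection over eleven control experiments.
control_psi = [i / 10 for i in range(11)]
q_lower, q_upper, scale = controls_scale(control_psi, alpha=0.2)
```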
Outliers
|
Outliers in PSI between cases and controls |