Plotting with pyft
[7]:
import pyft
import pandas as pd
import altair as alt
alt.data_transformers.enable("vegafusion")
[7]:
DataTransformerRegistry.enable('vegafusion')
Read in the results of an ft-footprint calculation so they can be plotted with pyft.
[8]:
dfm = pyft.utils.read_and_center_footprint_table(
    "../../../tests/data/ctcf-footprints.bed.gz"
)
dfm.head(2)
[8]:
| | chrom | motif_start | motif_end | strand | footprint_codes | fire_qual | fiber_name | has_spanning_msp | footprinted | start | end | centering_position | centering_strand | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | chr11 | 5204946 | 5204981 | + | 3 | 247 | m64076_211222_124721/148505307/ccs | True | True | 0 | 1 | 5204946 | + | footprinted |
| 1 | chr11 | 5204946 | 5204981 | + | 2 | -1 | m64076_211222_124721/51053256/ccs | False | True | 0 | 1 | 5204946 | + | not-footprinted |
Read in the fiber data for the first two regions in the CTCF BED file, centered on the footprint locations.
[9]:
rgns = pd.read_csv("../../../tests/data/ctcf.bed.gz", sep="\t", header=None, nrows=2)
rgns.columns = ["chrom", "start", "end", "name", "score", "strand", "name2"]
fiberbam = pyft.Fiberbam("../../../tests/data/ctcf.bam")
centers = []
z = None
for idx, rgn in rgns.iterrows():
    region = (rgn["chrom"], rgn["start"], rgn["end"])
    z = pyft.utils.region_to_centered_df(
        fiberbam, region, strand=rgn["strand"], max_flank=250
    )
    centers.append(z)
[2024-11-12T22:29:48Z INFO pyft::fiberdata] 181 records fetched in 0.01s
[2024-11-12T22:29:48Z INFO pyft::fiberdata] Fiberdata made for 181 records in 0.11s
[2024-11-12T22:29:48Z INFO pyft::fiberdata] Fiberdata centered for 181 records in 0.02s
[2024-11-12T22:29:48Z INFO pyft::fiberdata] 172 records fetched in 0.11s
[2024-11-12T22:29:48Z INFO pyft::fiberdata] Fiberdata made for 172 records in 0.10s
[2024-11-12T22:29:48Z INFO pyft::fiberdata] Fiberdata centered for 172 records in 0.07s
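To make "centered" concrete, here is a minimal sketch of how genomic coordinates might be turned into positions relative to a motif. This is not pyft's implementation: the helper `center_interval` is hypothetical, the example coordinates are made up, and mirroring minus-strand intervals is an assumption about how strand is handled.

```python
def center_interval(start, end, center, strand="+"):
    """Return (start, end) relative to `center`, oriented by `strand`.

    Hypothetical helper for illustration only; not part of pyft.
    """
    rel_start, rel_end = start - center, end - center
    if strand == "-":
        # Assumption: minus-strand intervals are mirrored around the
        # center so that upstream positions are always negative.
        rel_start, rel_end = -rel_end, -rel_start
    return rel_start, rel_end


# Made-up interval at 5204721-5204786 with the center at 5204946.
plus = center_interval(5204721, 5204786, 5204946, "+")
minus = center_interval(5204721, 5204786, 5204946, "-")
print(plus, minus)
```

Under these assumptions, an interval upstream of a plus-strand center gets negative relative coordinates, which is the form the `start` and `end` columns take in the centered tables.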
Combine the footprinting results with the fiber data centered around the footprints.
[10]:
both_dfs = pd.concat(centers + [dfm], axis=0).reset_index(drop=True)
both_dfs.head(2)
[10]:
| | chrom | fiber_start | fiber_end | fiber_name | strand | type | start | end | qual | centering_position | centering_strand | motif_start | motif_end | footprint_codes | fire_qual | has_spanning_msp | footprinted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | chr11 | 5184260.0 | 5205600.0 | m64076_211222_124721/148505307/ccs | + | msp | -225 | -160 | 0 | 5204946 | + | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | chr11 | 5184260.0 | 5205600.0 | m64076_211222_124721/148505307/ccs | + | msp | -57 | 135 | 247 | 5204946 | + | NaN | NaN | NaN | NaN | NaN | NaN |
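The `NaN` entries and float-valued columns in the table above follow from how `pd.concat` handles frames with different columns. The sketch below uses two tiny made-up frames (the values and the name `read1` are illustrative, not taken from the test data) to show the same behavior.

```python
import pandas as pd

# Toy stand-ins for the fiber table and the footprint table.
fibers = pd.DataFrame(
    {"fiber_name": ["read1"], "type": ["msp"], "start": [-57], "end": [135], "qual": [247]}
)
footprints = pd.DataFrame(
    {"fiber_name": ["read1"], "type": ["footprinted"], "start": [0], "end": [1], "footprint_codes": [3]}
)

# Row-wise concatenation keeps the union of the columns; a column that is
# missing from one frame is filled with NaN for that frame's rows.
combined = pd.concat([fibers, footprints], axis=0).reset_index(drop=True)
print(combined)
```

Because `NaN` is a float, an integer column that exists in only one of the inputs is upcast to float in the result, which is why values such as `fiber_start` print with a trailing `.0`.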
Show the chart within the notebook.
[11]:
chart = pyft.plot.centered_chart(both_dfs, width=400, height=200)
chart