MDPath Outputs

This page describes the outputs generated by the main function of MDPath, which is invoked with the mdpath command on the command line. These outputs can be visualized directly or processed further with mdpath-tools for more in-depth analysis.

Visualization Outputs

The main data for 3D visualization is stored in the following files. The .json files can be visualized directly using tools such as PyMOL or NGLViewer, or they can be post-processed for additional analysis.

precomputed_clusters_paths.json        All paths are plotted individually, allowing for detailed inspection of each signaling pathway.
quick_precomputed_clusters_paths.json  Every path within a cluster is precomputed, which reduces rendering and allows faster inspection.
first_frame.pdb                        The first frame of the trajectory, which is required for plotting.
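
Before post-processing one of the .json files, it can help to look at how it is organized. The sketch below is a generic inspection helper, not part of MDPath: the function names `describe_json` and `_describe` are hypothetical, and the code makes no assumption about the actual schema of the files, which this page does not describe.

```python
import json
from pathlib import Path


def describe_json(path, max_depth=2):
    """Return a short nested summary of a JSON file's structure."""
    with Path(path).open() as handle:
        data = json.load(handle)
    return _describe(data, max_depth)


def _describe(node, depth):
    # Summarize containers by type and size; recurse into the first element only.
    if isinstance(node, dict):
        summary = {"type": "dict", "size": len(node), "keys": list(node)[:5]}
        if depth > 0 and node:
            first_key = next(iter(node))
            summary["first_value"] = _describe(node[first_key], depth - 1)
        return summary
    if isinstance(node, list):
        summary = {"type": "list", "size": len(node)}
        if depth > 0 and node:
            summary["first_item"] = _describe(node[0], depth - 1)
        return summary
    # Scalars (numbers, strings, booleans, null) are reported by type name.
    return {"type": type(node).__name__}


# Example call, assuming the file is in the current working directory:
# print(describe_json("precomputed_clusters_paths.json"))
```

Only the first element at each level is inspected, so the summary stays small even for files that hold many paths.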

The following images can be useful for debugging purposes or for visualizing smaller protein systems. However, when dealing with larger protein systems, this representation may become too complex to fully comprehend.

clustered_paths.png  A picture of the hierarchical clustering dendrogram, which illustrates how pathways are grouped based on similarity.
graph.png            A picture of the complete graph, which can be useful for fine-tuning the `graphdist` parameter in certain systems.

Analysis Outputs

MDPath outputs files at different points of the workflow that can be used for further analysis or visualization. These files are essential for understanding the results of the analysis and can be used to generate additional outputs.

cluster_pathways_dict.pkl  Contains the final data from the analysis. It is set up as a dictionary with the cluster number as the key and the corresponding pathways as the value, in the form of a list of lists.
top_pathways.pkl           Contains the top (500) pathways. Stored as a list of lists.
residue_coordinates.pkl    Contains the residue coordinates of the protein backbone :math:`\alpha`-C atoms used for visualization.
paths.txt                  Contains all pathways and their corresponding total mutual information values.
nmi_df.csv                 Contains the normalized mutual information (NMI) for each residue pair within the protein. Column 1 contains the residue pair as a tuple and column 2 contains the NMI value.
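
The files above can be read with the Python standard library. The following is a minimal sketch based only on the layouts described in the table: a dictionary of cluster number to list of pathways, a list of pathways, and a CSV whose last two columns hold the residue pair and the NMI value. All function names are hypothetical, and the CSV parsing assumes that each pair is written as a plain tuple literal such as `(12, 45)`; rows that do not match that assumption (for example a header row) are skipped.

```python
import ast
import csv
import pickle
from collections import Counter


def load_pickle(path):
    """Load one of the .pkl outputs (only unpickle files you trust)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def cluster_sizes(cluster_pathways):
    """Map each cluster number to the number of pathways it contains."""
    return {cluster: len(paths) for cluster, paths in cluster_pathways.items()}


def residue_counts(pathways):
    """Count how often each residue appears across a list of pathways."""
    return Counter(residue for path in pathways for residue in path)


def load_nmi_pairs(path):
    """Read the NMI table into a {(res_a, res_b): nmi} dictionary.

    Assumption: the last two columns are the pair and the NMI value.
    """
    pairs = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue
            try:
                pair = ast.literal_eval(row[-2])
                value = float(row[-1])
            except (ValueError, SyntaxError):
                continue  # header row or a row in an unexpected format
            if isinstance(pair, tuple):
                pairs[pair] = value
    return pairs


# Example calls, assuming the files are in the current working directory:
# clusters = load_pickle("cluster_pathways_dict.pkl")
# print(cluster_sizes(clusters))
# print(residue_counts(load_pickle("top_pathways.pkl")).most_common(10))
# print(len(load_nmi_pairs("nmi_df.csv")))
```

Counting residues across the top pathways is one simple way to see which residues recur most often; it is shown here as an illustration rather than as an analysis step prescribed by MDPath.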

Bootstrapping Outputs

When using the bootstrapping flag, an additional set of outputs is generated. These outputs are valuable for assessing the variability and reliability of the analysis.

bootstrap   A folder containing the results of each bootstrapping sample.
output.txt  This file includes the confidence intervals for each path in the analysis.

Keep in mind that the standard error is also directly printed in the output log of the MDPath command when the bootstrapping flag is used.
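
For readers unfamiliar with these quantities, the sketch below illustrates in general terms what a bootstrap standard error and a percentile confidence interval are. It is a textbook-style example on a plain list of numbers, not MDPath's own bootstrapping procedure, and the function name `bootstrap_summary` and its parameters are hypothetical.

```python
import random
import statistics


def bootstrap_summary(values, n_resamples=1000, alpha=0.05, seed=0):
    """Bootstrap the mean of `values`.

    Returns (standard_error, (ci_low, ci_high)), where the interval is the
    percentile interval at confidence level 1 - alpha.
    """
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    # The spread of the resampled means estimates the standard error.
    standard_error = statistics.stdev(means)
    # Percentile interval: cut off alpha/2 of the sorted means at each end.
    low_index = int((alpha / 2) * (n_resamples - 1))
    high_index = int((1 - alpha / 2) * (n_resamples - 1))
    return standard_error, (means[low_index], means[high_index])


# Example call on made-up numbers:
# se, (low, high) = bootstrap_summary([0.8, 1.1, 0.9, 1.3, 1.0])
```

A narrow interval and a small standard error indicate that the estimate changes little between resamples, which is the sense in which these outputs describe reliability.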

MDPath Tools Outputs

MDPath Tools provides a collection of methods specifically designed to enhance the analysis of MDPath output. Below is a list of the various outputs that are exclusive to MDPath Tools. For detailed information on these methods, please refer to the MDPath Tools documentation.

residue_coordinates_dict   Contains residue coordinates used by mdpath_compare.
cluster_pathways_dict.pkl  Holds key data from MDPath for mdpath_compare and mdpath_gpcr_image.
top_pathways.pkl           Contains data required for reclustering multiple runs with mdpath_multitraj.