Python toolkit for modular and traceable LC-MS/MS proteomics analysis based on MuData

Overview

msmu is an open-source Python package for modular and traceable post-DB search preprocessing and statistical analysis of bottom-up proteomics data.

It provides modules for every step of end-to-end processing, from search output parsing through hierarchical summarization, normalization, batch correction, statistical analysis, and visualization, all implemented with commonly used analytical and statistical methods.

Central to msmu is MuData (together with AnnData), a versatile and standardized data container. It serves as a unifying, provenance-aware structure for organizing and storing annotations and representations of multi-dimensional MS data, along with their processing history.

Combining a flexible processing pipeline with MuData supports downstream analysis aligned with the FAIR principles for biomarker discovery and systems biology.

Key Features

  • Flexible data ingestion from Sage, DIA-NN, and other popular DB search tools
  • MuData/AnnData-compatible object structure for organizing multi-level MS data
  • Protein inference: infer protein groups from peptide evidence using the parsimony rule
  • Normalization: median centering, quantile normalization, etc.
  • Batch correction for discrete and continuous variations
  • Built-in QC: identification count, peptide length, charge, missed cleavage, intensity distribution, etc.
  • Statistical analysis: differential expression analysis, dimensionality reduction
  • PTM data support and stoichiometry adjustment with a matched global dataset (if available)
  • Visualization: PCA, UMAP, volcano plots, heatmaps, QC metrics
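
Two of the features listed above, parsimony-based protein inference and median centering, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not msmu's implementation: the function names `infer_protein_groups` and `median_center` are hypothetical, parsimony is approximated here with a greedy set cover, and median centering is shown as subtracting each sample's median from its log-intensities.

```python
from statistics import median


def infer_protein_groups(peptide_to_proteins):
    """Greedy parsimony sketch: a small set of protein groups explaining all peptides."""
    # Invert the mapping: protein -> set of peptides that support it.
    protein_to_peptides = {}
    for pep, prots in peptide_to_proteins.items():
        for prot in prots:
            protein_to_peptides.setdefault(prot, set()).add(pep)

    # Proteins with identical peptide evidence are indistinguishable: group them.
    groups = {}
    for prot, peps in protein_to_peptides.items():
        groups.setdefault(frozenset(peps), []).append(prot)

    unexplained = set(peptide_to_proteins)
    selected = []
    while unexplained:
        # Pick the group covering the most unexplained peptides
        # (ties broken by sorted protein names, for determinism).
        best = min(
            groups,
            key=lambda peps: (-len(peps & unexplained), sorted(groups[peps])),
        )
        if not best & unexplained:
            break  # remaining peptides map to no protein
        selected.append(sorted(groups[best]))
        unexplained -= best
    return selected


def median_center(log_intensities):
    """Subtract each sample's median from its values; None marks a missing value."""
    centered = {}
    for sample, values in log_intensities.items():
        sample_median = median(v for v in values if v is not None)
        centered[sample] = [
            None if v is None else v - sample_median for v in values
        ]
    return centered


# Toy inputs (invented for illustration).
evidence = {
    "pepA": ["P1"],
    "pepB": ["P1", "P2"],
    "pepC": ["P3", "P4"],
    "pepD": ["P3", "P4"],
}
protein_groups = infer_protein_groups(evidence)

centered = median_center({
    "sample_1": [10.0, 12.0, 14.0],
    "sample_2": [11.0, None, 15.0],
})
```

In the toy input, P2 is supported only by a peptide that P1 already explains, so it is left out, while P3 and P4 share identical evidence and are reported together as one group.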

Supported DB Search Tools

Citation

If you use msmu in your research, please cite the following publication (preprint):

msmu: a Python toolkit for modular and traceable LC-MS proteomics data analysis based on MuData

Hyung-Wook Choi, Byeongchan Lee, Un-Beom Kang, Sunghyun Huh

bioRxiv 2026.01.07.698308; doi: 10.64898/2026.01.07.698308

License

BSD 3-Clause License. See LICENSE for details.