Skip to content

dhruvdcoder/xlm-core

Repository files navigation

XLM Logo

A Unified Framework for Non-Autoregressive Language Models

PyPI version Documentation Python 3.11+ License


XLM is a modular, research-friendly framework for developing and comparing non-autoregressive language models. Built on PyTorch and PyTorch Lightning, with Hydra for configuration management, XLM makes it effortless to experiment with cutting-edge NAR architectures.

✨ Key Features

Feature Description
🧩 Modular Design Plug-and-play components—swap models, losses, predictors, and collators independently
Lightning-Powered Distributed training, mixed precision, and logging out of the box
🎛️ Hydra Configs Hierarchical configuration with runtime overrides—no code changes needed
📦 Multiple Architectures 7 NAR model families ready to use
🔬 Research-First Type-safe with jaxtyping, debug modes, and flexible metric injection
🤗 Hub Integration Push trained models directly to Hugging Face Hub

🏗️ Available Models

Model Full Name Description Reference
mlm Masked Language Model Classic BERT-style masked prediction
ilm Insertion Language Model Iterative insertion-based generation arXiv:2505.05755
arlm Autoregressive LM Standard left-to-right baseline
mdlm Masked Diffusion LM Discrete diffusion with masking arXiv:2406.07524
flexmdm Flexible Masked Diffusion Model Variable-length masked diffusion arXiv:2509.01025

🚀 Installation

pip install xlm-core

For model implementations, also install:

pip install xlm-models

📖 Quick Start

XLM uses a simple CLI with three main arguments:

xlm job_type=<JOB> job_name=<NAME> experiment=<CONFIG>
Argument Description
job_type One of prepare_data, train, eval, or generate
job_name A descriptive name for your run
experiment Path to your Hydra experiment config

🎯 Example: ILM on LM1B

A complete workflow demonstrating the Insertion Language Model on the LM1B dataset:

1️⃣ Prepare Data

xlm job_type=prepare_data job_name=lm1b_prepare experiment=lm1b_ilm

2️⃣ Train

# Quick debug run (overfit a single batch)
xlm job_type=train job_name=lm1b_ilm experiment=lm1b_ilm debug=overfit

# Full training
xlm job_type=train job_name=lm1b_ilm experiment=lm1b_ilm

3️⃣ Evaluate

xlm job_type=eval job_name=lm1b_ilm experiment=lm1b_ilm \
    +eval.ckpt_path=<CHECKPOINT_PATH>

4️⃣ Generate

xlm job_type=generate job_name=lm1b_ilm experiment=lm1b_ilm \
    +generation.ckpt_path=<CHECKPOINT_PATH>

Tip: Add debug=[overfit,print_predictions] to print generated samples to the console:

xlm job_type=generate job_name=lm1b_ilm experiment=lm1b_ilm \
    +generation.ckpt_path=<CHECKPOINT_PATH> \
    debug=[overfit,print_predictions]

5️⃣ Push to Hugging Face Hub

xlm job_type=push_to_hub job_name=lm1b_ilm_hub experiment=lm1b_ilm \
    +hub_checkpoint_path=<CHECKPOINT_PATH> \
    +hub.repo_id=<YOUR_REPO_ID>

🗂️ Project Structure

xlm-core/
├── src/xlm/           # Core framework
│   ├── harness.py     # PyTorch Lightning module
│   ├── datamodule.py  # Data loading & collation
│   ├── metrics.py     # Evaluation metrics
│   └── configs/       # Default Hydra configs
│
└── xlm-models/        # Model implementations
    ├── mlm/           # Masked LM
    ├── ilm/           # Infilling LM
    ├── arlm/          # Autoregressive LM
    └── ...            # Other architectures

🔧 Extending XLM

Adding a new model requires implementing four components:

Component Responsibility
Model Neural network architecture
Loss Training objective
Predictor Inference/generation logic
Collator Batch preparation

You can also add new entrypoint scripts to the cli.

See the Contributing Guide for a complete walkthrough.

📚 Documentation

🤝 Contributing

We welcome model contributions! Please check out our Contributing Guide for guidelines on adding new models and features.

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgements

XLM is developed and maintained by IESL students at UMass Amherst.

Primary Developers:

  1. Dhruvesh Patel
  2. Durga Prasad Maram
  3. Sai Sreenivas Chintha
  4. Benjamin Rozonoyer

Model Contributors:

  1. Soumitra Das (EditFlow)
  2. Eric Chen (EditFlow)

📚 Cite

If you found this repository useful, please consider citing:

@article{patel2025xlm,
  title={XLM: A Python package for non-autoregressive language models},
  author={Patel, Dhruvesh and Maram, Durga Prasad and Chintha, Sai Sreenivas and Rozonoyer, Benjamin and McCallum, Andrew},
  journal={arXiv preprint arXiv:2512.17065},
  year={2025}
}

Built with ❤️ for the NLP research community

About

XLM is a modular, research-friendly framework for developing and comparing non-autoregressive language models. Built on PyTorch and PyTorch Lightning, with Hydra for configuration management, XLM makes it effortless to experiment with cutting-edge NAR architectures.

Topics

Resources

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages