Leopard-EM
Welcome to the Location & oriEntatiOn of PARticles found using two-Dimensional tEmplate Matching (Leopard-EM) online documentation! Leopard-EM is a Python implementation of two-dimensional template matching (2DTM), a cryo-EM data processing method for locating and orienting particles using a reference structure. This package currently reflects the functionality described in Lucas et al. (2021) [1], with additional programs to maximize the usefulness of 2DTM and other user-friendly features for integrating into broader data science workflows.
Citing this work
If you use Leopard-EM in your research, please cite (coming soon!).
Installation
Requirements
The general system requirements for Leopard-EM are
- Python version 3.10 or above
- PyTorch 2.4.0 or above
- Linux operating system
The package config contains a complete set of requirements which are automatically downloaded and checked during the installation process. We also recommend using a virtual environment manager (such as conda) to avoid conflicts with other installed software on your system. Please open a bug report on the GitHub page if you experience major issues during the installation process.
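Before installing, it can help to confirm that the active environment meets the minimums listed above. The following is a minimal sketch using only the standard library. The helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical and not part of Leopard-EM, and the version parsing is deliberately simple (it drops suffixes such as `+cu121` or `rc1`).

```python
from __future__ import annotations

import sys
from importlib import metadata


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a string like '2.4.0+cu121' into (2, 4, 0)."""
    public = version.split("+")[0]  # drop a local suffix such as '+cu121'
    parts = []
    for piece in public.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> dict[str, bool]:
    """Report whether Python and PyTorch satisfy the documented minimums."""
    report = {"python>=3.10": sys.version_info >= (3, 10)}
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        report["torch>=2.4.0"] = False  # PyTorch is not installed yet
    else:
        report["torch>=2.4.0"] = meets_minimum(torch_version, "2.4.0")
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print(f"{requirement}: {'OK' if ok else 'missing or too old'}")
```

A missing PyTorch is not necessarily a problem at this stage, since the installer resolves the package requirements automatically; the check is mainly useful for spotting an outdated Python interpreter or a stale PyTorch build.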
MacOS and Windows support
Leopard-EM should theoretically work on macOS and Windows operating systems, but we cannot guarantee compatibility with platforms other than Linux, nor do we distribute pre-built wheels for these platforms.
Tested GPU support
Leopard-EM has only been tested on NVIDIA GPUs but should run on most modern GPU hardware supported by PyTorch. If you experience a compatibility issue, please open an issue on GitHub.
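To see which GPUs PyTorch can actually use on your machine, a quick check along the following lines may help. This is a sketch that relies only on standard `torch.cuda` calls; the function name `list_visible_gpus` is hypothetical and not part of Leopard-EM. It covers CUDA devices only, matching the tested NVIDIA case, and returns an empty list if PyTorch is not installed.

```python
from __future__ import annotations


def list_visible_gpus() -> list[str]:
    """Return the names of the CUDA devices PyTorch can see (empty if none)."""
    try:
        import torch
    except ImportError:
        return []  # PyTorch is not installed in this environment
    if not torch.cuda.is_available():
        return []  # No usable CUDA device or driver
    return [
        torch.cuda.get_device_name(index)
        for index in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    gpus = list_visible_gpus()
    if not gpus:
        print("No CUDA-capable GPU visible to PyTorch.")
    for index, name in enumerate(gpus):
        print(f"GPU {index}: {name}")
```

The printed indices are the CUDA device indices as PyTorch numbers them. We assume, but have not verified here, that these are the same indices expected by options such as `gpu_ids` in the match template example.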
Pre-packaged releases
Pre-packaged versions of Leopard-EM are released on the Python Package Index (PyPI). To install the latest pre-packaged release of Leopard-EM, run the following:
pip install leopard-em
From there, you are ready to start running 2DTM workflows or exploring some of our examples.
Installing from Source
If you want to install Leopard-EM from source, first clone the repository and install the package using pip:
git clone https://github.com/Lucaslab-Berkeley/Leopard-EM.git
cd Leopard-EM
pip install .
For Developers
Developers who are interested in contributing to Leopard-EM should fork the repository into their own GitHub account. Navigate to the Leopard-EM GitHub landing page and click on fork in the top right-hand corner. Then clone your fork and add the Lucaslab-Berkeley remote as an upstream:
git clone https://github.com/YOUR_USERNAME/Leopard-EM.git
cd Leopard-EM
git remote add upstream https://github.com/Lucaslab-Berkeley/Leopard-EM
Check that the remote has been properly added (git remote -v), then run the following to install the package along with the optional development dependencies:
pip install -e '.[dev,test,docs]'
See the Contributing page for detailed guidelines on contributing to the package.
Basic Usage
Built-in programs
Leopard-EM is runnable through a set of pre-built Python scripts and easily modifiable YAML configurations.
There are currently five main programs (located under the programs/ folder), each with its own configuration file.
Detailed documentation for each program can be found on the Program Documentation Overview, but the five main programs are as follows:
- match_template: Runs a whole orientation search for a given reference structure on a single micrograph.
- refine_template: Takes particles identified by match_template and refines their location, orientation, and defocus parameters.
- optimize_template: Optimizes the pixel size of the micrograph and template structure model using a set of identified particles.
- constrained_search: Uses the location and orientation of identified particles to constrain the search parameters of a second particle.
- optimize_b_factor.py: Script to optimize the B-factor added to a model (using 2DTM) for a set of metrics.
Minimal match template example
The following Python script will run a basic match template program using only the built-in Pydantic models. See the programs documentation page for more details on running each program.
from leopard_em.pydantic_models.config import DefocusSearchConfig
from leopard_em.pydantic_models.config import OrientationSearchConfig
from leopard_em.pydantic_models.config import ComputationalConfig
from leopard_em.pydantic_models.config import PreprocessingFilters
from leopard_em.pydantic_models.data_structures import OpticsGroup
from leopard_em.pydantic_models.managers import MatchTemplateManager
from leopard_em.pydantic_models.results import MatchTemplateResult


def setup_match_template_manager():
    """Helper function to set up MatchTemplateManager Pydantic model."""
    # Microscope imaging parameters
    my_optics_group = OpticsGroup(
        label="my_optics_group",
        pixel_size=1.2,  # In Angstroms
        voltage=300,  # In kV
        defocus_u=5100.0,  # In Angstroms
        defocus_v=4900.0,  # In Angstroms
        astigmatism_angle=0.0,  # In degrees
    )

    # Relative defocus planes to search across
    df_search_config = DefocusSearchConfig(
        defocus_min=-1000,  # In Angstroms, relative to defocus_{u,v}
        defocus_max=1000,  # In Angstroms, relative to defocus_{u,v}
        defocus_step=200.0,  # In Angstroms
    )

    # Orientation sampling of SO(3) space, using default
    orientation_search_config = OrientationSearchConfig()

    # Where to save the output results
    mt_result = MatchTemplateResult(
        allow_file_overwrite=True,
        mip_path="some/path/to/output_mip.mrc",
        scaled_mip_path="some/path/to/output_scaled_mip.mrc",
        correlation_average_path="some/path/to/output_correlation_average.mrc",
        correlation_variance_path="some/path/to/output_correlation_variance.mrc",
        orientation_psi_path="some/path/to/output_orientation_psi.mrc",
        orientation_theta_path="some/path/to/output_orientation_theta.mrc",
        orientation_phi_path="some/path/to/output_orientation_phi.mrc",
        relative_defocus_path="some/path/to/output_relative_defocus.mrc",
    )

    # What Fourier pre-processing filters to apply to image/template
    # This uses the default filter parameters which work in most cases
    pre_filters = PreprocessingFilters()

    # Which GPUs to run template matching on (here the first 2 GPUs)
    comp_config = ComputationalConfig(gpu_ids=[0, 1])

    # Bring all the other configs together in the manager
    mt_manager = MatchTemplateManager(
        micrograph_path="some_path/to/image.mrc",
        template_volume_path="some_path/to/volume.mrc",
        optics_group=my_optics_group,
        defocus_search_config=df_search_config,
        orientation_search_config=orientation_search_config,
        match_template_result=mt_result,
        computational_config=comp_config,
        preprocessing_filters=pre_filters,
    )

    return mt_manager


def main():
    """Run the match_template program."""
    mt_manager = setup_match_template_manager()
    mt_manager.run_match_template(
        orientation_batch_size=1,  # Change this based on GPU memory
        do_result_export=True,  # Saves the statistics immediately upon completion
    )

    # Construct and export the dataframe of picked peaks
    df = mt_manager.results_to_dataframe()
    df.to_csv("my_match_template_results.csv", index=True)


if __name__ == "__main__":
    main()
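To make the defocus search settings in the script above more concrete, the sketch below enumerates the relative defocus planes implied by `defocus_min`, `defocus_max`, and `defocus_step`, and converts them to absolute values around `defocus_u` and `defocus_v`. It is plain Python and does not call Leopard-EM. The function names are hypothetical, and it assumes that both endpoints are included and that relative offsets are simply added to the fitted defocus; the package's own enumeration may differ.

```python
from __future__ import annotations

import math


def relative_defocus_planes(
    defocus_min: float, defocus_max: float, defocus_step: float
) -> list[float]:
    """List relative defocus offsets (Angstroms), endpoints assumed inclusive."""
    if defocus_step <= 0:
        raise ValueError("defocus_step must be positive")
    if defocus_max < defocus_min:
        raise ValueError("defocus_max must be >= defocus_min")
    # Small tolerance guards against floating-point round-off in the division
    n_steps = math.floor((defocus_max - defocus_min) / defocus_step + 1e-9)
    return [defocus_min + i * defocus_step for i in range(n_steps + 1)]


def absolute_defocus(base_defocus: float, offsets: list[float]) -> list[float]:
    """Add each relative offset to a fitted defocus value (assumed convention)."""
    return [base_defocus + offset for offset in offsets]


if __name__ == "__main__":
    # Values taken from the example configuration above
    offsets = relative_defocus_planes(-1000, 1000, 200.0)
    print(f"{len(offsets)} defocus planes")
    print("defocus_u planes:", absolute_defocus(5100.0, offsets))
    print("defocus_v planes:", absolute_defocus(4900.0, offsets))
```

Under these assumptions the example configuration searches 11 planes, from 4100 to 6100 Angstroms along `defocus_u` and from 3900 to 5900 Angstroms along `defocus_v`. The number of planes multiplies the cost of the orientation search, so a coarser `defocus_step` is a simple way to shorten a first test run.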
Documentation and Examples
See the left-hand menu for examples of using the Leopard-EM package and other package documentation.
Theory
🚧 Under Construction 🚧
Contributing
We encourage contributions to this package from the broader cryo-EM/ET and structural biology communities. See contributing for guidelines. Leopard-EM is configured with a set of development dependencies to help contributors maintain code quality and consistency. See the Installation -- For Developers section for instructions on how to install these dependencies.
License
The code in this repository is licensed under the BSD 3-Clause License. See the LICENSE file for full details.
References
1. Lucas BA, Himes BA, Xue L, Grant T, Mahamid J, Grigorieff N. Locating macromolecular assemblies in cells by 2D template matching with cisTEM. eLife. 2021 Jun 11;10:e68946. doi: 10.7554/eLife.68946. PMID: 34114559; PMCID: PMC8219381.