About the Concept Peptide Profiler

Beta notice: This application is a beta release. Features and outputs may change, and results may be incomplete or contain errors.

Overview

This application is a peptide analysis tool designed to support early-stage sequence evaluation and synthesis planning. It estimates aggregation propensity and SPPS synthesis difficulty, calculates key physicochemical properties, and provides structure visualization tools.

The aggregation and synthesis analysis scans the sequence to highlight regions that may aggregate on resin and reduce coupling efficiency. The assessment considers local aggregation trends, stronger impact near the C-terminus, peptide length effects, and amino acid composition. Results are intended as comparative guidance rather than a guarantee of experimental outcome.

The Properties module reports commonly used descriptors such as monoisotopic mass, ε280, GRAVY, aliphatic index, predicted pI, estimated molecular dimensions, and amino acid composition. Two-dimensional structures and SMILES are generated using RDKit.

3D conformers are generated using pyPept and displayed interactively. To keep the web app responsive, only limited optimization is performed; therefore, the resulting conformations should be treated as approximate visual models rather than validated structures.

For high-confidence structural modeling, design support, or a more accurate assessment of synthesis feasibility (including timelines and quotations), please contact us. Our team can provide a CADD service using advanced software and experienced modelers, and an experienced peptide chemist can review sequences and propose optimized synthesis strategies, including for complex or aggregation-prone peptides.

This tool is intended for research and development support and provides heuristic, model-based estimates for planning purposes. Outputs—especially synthesis difficulty assessments and 3D conformers—should be treated as indicative and should be reviewed by an experienced scientist/peptide chemist before informing experimental work.

Calculations

Aggregation

The synthesis difficulty and aggregation profile estimate regions of a peptide sequence that may present challenges during solid-phase peptide synthesis (SPPS). Aggregation of resin-bound peptide chains can reduce reagent accessibility, decrease coupling efficiency, and increase the risk of truncations or deletions.

Each amino acid is assigned an aggregation tendency value derived from literature data, and the sequence is analyzed using a sliding-window approach to estimate local aggregation behavior along the chain.

  • Local aggregation tendency: Sequence segments are evaluated to identify regions with increased propensity to aggregate.
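  • C-terminal influence: Aggregation near the C-terminus has greater impact because SPPS proceeds from the C-terminus to the N-terminus (C→N), so an aggregated C-terminal segment can hinder every subsequent coupling.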
  • Consecutive aggregation-prone regions: Long uninterrupted stretches can promote packing and steric hindrance on resin.
  • Peptide length effects: Longer peptides generally present increased synthesis difficulty and reduced prediction reliability.
  • Amino acid composition: Bulky or β-branched residues may slow coupling, while charged residues can partially mitigate aggregation effects.
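The sliding-window scan described above can be sketched as follows. This is a minimal illustration only: the per-residue scores, the window length, the threshold, and the linear C-terminal weighting are hypothetical placeholders, not the literature-derived values or the weighting used by the application.

```python
from typing import Dict, List

# Hypothetical placeholder scores (NOT the literature-derived values):
# 1.0 = hydrophobic or beta-branched, 0.0 = charged, 0.5 = everything else.
PLACEHOLDER_SCORES: Dict[str, float] = {
    **{aa: 1.0 for aa in "AVILFWMT"},
    **{aa: 0.0 for aa in "DEKR"},
}
DEFAULT_SCORE = 0.5


def aggregation_profile(seq: str, window: int = 5,
                        c_term_bonus: float = 0.5) -> List[float]:
    """Return one weighted mean score per sliding window, N- to C-terminus.

    Windows closer to the C-terminus are scaled up linearly, by at most
    (1 + c_term_bonus). This weighting scheme is an assumption.
    """
    values = [PLACEHOLDER_SCORES.get(aa, DEFAULT_SCORE) for aa in seq.upper()]
    n_windows = len(values) - window + 1
    if n_windows < 1:
        return []  # sequence is shorter than a single window
    profile = []
    for start in range(n_windows):
        mean = sum(values[start:start + window]) / window
        # 0.0 for the first (N-terminal) window, 1.0 for the last (C-terminal)
        frac = start / (n_windows - 1) if n_windows > 1 else 1.0
        profile.append(mean * (1.0 + c_term_bonus * frac))
    return profile


def flag_regions(profile: List[float], threshold: float = 0.8) -> List[int]:
    """Return the start indices of windows whose score exceeds the threshold."""
    return [i for i, score in enumerate(profile) if score > threshold]


profile = aggregation_profile("KKKKKVVVVV")
print([round(p, 2) for p in profile])
print(flag_regions(profile))
```

In this toy example the valine-rich C-terminal half scores highest, which mirrors the qualitative behavior described in the list above; absolute numbers from such a sketch carry no meaning.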

Hydrophobicity-related properties, including GRAVY, are calculated using hydropathy parameters described by Kyte and Doolittle.

Predictions are heuristic estimates intended to provide comparative guidance for sequence design and planning. Experimental outcomes depend strongly on synthesis protocol, resin, solvent, and other practical factors and should be reviewed by an experienced peptide chemist.

References

Krchnák, V., Flegelová, Z., Vágner, J. Aggregation of resin-bound peptides during solid-phase peptide synthesis: Prediction of difficult sequences. Int. J. Peptide Protein Res. 42, 450–454 (1993).
Kyte, J., Doolittle, R.F. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157(1), 105–132 (1982).

Charge vs. pH

The Charge vs. pH profile estimates the theoretical net charge of a peptide as a function of pH. The calculation considers all ionisable groups in the sequence, including the N-terminus, C-terminus, and side chains of charged amino acids.

At each pH value, the fractional protonation state of each ionisable group is calculated using the Henderson–Hasselbalch relationship and literature pKa values. The total net charge is obtained by summing the contributions of all ionisable groups across the peptide. The predicted isoelectric point (pI) is estimated as the pH at which the net charge approaches zero.
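A minimal sketch of this calculation is shown below. The pKa values are illustrative assumptions chosen for the example (only the lysine value of 10.53 is taken from this page), and the bisection search for the pI is one possible numerical approach, not necessarily the one used by the application.

```python
from typing import Dict

# Illustrative pKa values (assumed for this sketch; the application's own
# literature values may differ). Lysine uses the 10.53 quoted on this page.
PKA_POSITIVE: Dict[str, float] = {"Nterm": 8.0, "K": 10.53, "R": 12.5, "H": 6.0}
PKA_NEGATIVE: Dict[str, float] = {"Cterm": 3.1, "D": 3.9, "E": 4.3,
                                  "C": 8.3, "Y": 10.1}


def net_charge(seq: str, ph: float) -> float:
    """Sum Henderson-Hasselbalch fractional charges over all ionisable groups."""
    groups = ["Nterm", "Cterm"] + list(seq.upper())
    charge = 0.0
    for group in groups:
        if group in PKA_POSITIVE:
            # Fraction of a basic group that is protonated (charge +1)
            charge += 1.0 / (1.0 + 10.0 ** (ph - PKA_POSITIVE[group]))
        elif group in PKA_NEGATIVE:
            # Fraction of an acidic group that is deprotonated (charge -1)
            charge -= 1.0 / (1.0 + 10.0 ** (PKA_NEGATIVE[group] - ph))
    return charge


def isoelectric_point(seq: str, lo: float = 0.0, hi: float = 14.0,
                      tol: float = 1e-4) -> float:
    """Find the pH of zero net charge by bisection.

    Net charge decreases monotonically with pH, so the sign of the charge
    at the midpoint tells us which half of the interval contains the pI.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


for ph in (2.0, 7.0, 12.0):
    print(ph, round(net_charge("ACDKRH", ph), 2))
print(round(isoelectric_point("ACDKRH"), 2))
```

For a peptide with no ionisable side chains, this model places the pI midway between the two terminal pKa values, which is a quick sanity check on any implementation.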

Non-natural amino acids: Where supported, non-natural residues are handled using approximate ionisation parameters. In this implementation, ornithine uses a reported side-chain pKa of 10.76 (δ-amine), and pyrrolysine (O) is approximated as lysine-like for ionisation (using the lysine ε-amine pKa 10.53). These approximations may be less accurate than predictions for standard amino acids.

These values represent theoretical estimates. Experimental charge and pI may vary depending on buffer composition, ionic strength, temperature, neighboring residues, and structural effects.

References

Stryer, L., Berg, J.M., Tymoczko, J.L. Biochemistry, 5th ed. W.H. Freeman.
Bjellqvist, B. et al. The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences. Electrophoresis 14, 1023–1031 (1993).
Srinivasan, G., James, C.M., Krzycki, J.A. Pyrrolysine: A 22nd Amino Acid in the Genetic Code. Science 296, 1459–1462 (2002).
Sigma-Aldrich Technical Article: L-Ornithine properties (pKa values 1.94, 8.65, 10.76).

Properties

This module reports several calculated descriptors derived directly from the peptide sequence:

  • Exact mass (monoisotopic) is calculated as the sum of residue monoisotopic masses plus the mass of water to account for the peptide termini.
  • Extinction coefficient (ε280) is estimated from the number of tryptophan and tyrosine residues using standard empirical coefficients.
  • GRAVY (Grand Average of Hydropathy) is calculated as the average of residue hydropathy values using the Kyte–Doolittle scale.
  • Aliphatic index is calculated from the relative abundance of alanine, valine, isoleucine, and leucine according to the standard definition used in protein chemistry.
  • Predicted isoelectric point (pI) is obtained numerically from the theoretical net charge calculated across a pH range using literature pKa values and the Henderson–Hasselbalch relationship.
  • Estimated molecular dimensions are approximate values derived from empirical relationships between peptide length or mass and hydrodynamic behavior and should be interpreted as rough indicators only.
  • Amino acid composition is reported as residue counts across the sequence.
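The sequence-derived descriptors listed above can be sketched in a few lines for the 20 standard residues. The numerical constants are assumptions for illustration: standard monoisotopic residue masses, the Kyte–Doolittle hydropathy scale, the commonly used aliphatic index weights (2.9 for valine, 3.9 for isoleucine and leucine), and commonly used ε280 coefficients (5500 per Trp, 1490 per Tyr, in M⁻¹ cm⁻¹). The application may use different constants or additional terms.

```python
from collections import Counter

# Monoisotopic residue masses in Da (residue = amino acid minus water)
MONO_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MONO = 18.010565  # H2O added once for the free N- and C-termini

# Kyte-Doolittle hydropathy values
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def monoisotopic_mass(seq: str) -> float:
    """Sum of residue masses plus one water."""
    return sum(MONO_MASS[aa] for aa in seq) + WATER_MONO


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Weighted mole percent of Ala, Val, Ile and Leu (assumed weights)."""
    counts = Counter(seq)
    pct = {aa: 100.0 * counts[aa] / len(seq) for aa in "AVIL"}
    return pct["A"] + 2.9 * pct["V"] + 3.9 * (pct["I"] + pct["L"])


def epsilon_280(seq: str) -> int:
    """Trp/Tyr-based estimate in M^-1 cm^-1 (assumed coefficients)."""
    counts = Counter(seq)
    return 5500 * counts["W"] + 1490 * counts["Y"]


seq = "ACDEFGHIKLMNPQRSTVWY"
print(round(monoisotopic_mass(seq), 4), round(gravy(seq), 3),
      round(aliphatic_index(seq), 1), epsilon_280(seq), Counter(seq))
```

Non-natural residues are deliberately left out of these tables; looking one up raises a `KeyError`, which makes unsupported input visible instead of silently producing a wrong value.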

Non-natural amino acids: Where supported, non-natural or modified residues are handled using approximate parameters. In particular, GRAVY values for non-natural amino acids are estimated based on similarity to natural amino acids or representative literature values and may therefore be less accurate than predictions for standard residues.

References

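Krchnák, V., Flegelová, Z., Vágner, J. Aggregation of resin-bound peptides during solid-phase peptide synthesis: Prediction of difficult sequences. Int. J. Peptide Protein Res. 42, 450–454 (1993). (Source of the aggregation potential, Pa.)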
Kyte, J., Doolittle, R.F. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157(1), 105–132 (1982).

Structure

This module provides sequence-to-structure utilities for rapid visualization. For accurate, project-grade structural modeling and design support, please contact us — we offer CADD services delivered by experienced computational chemists using cutting-edge software (e.g., Schrödinger Maestro).

  • 2D structure and SMILES. Two-dimensional structures and SMILES representations are generated using RDKit.
  • 3D structure (visual model). Three-dimensional conformers are generated from a BILN representation. Standard sequences can be entered in FASTA format and are converted to BILN. Sequences containing non-natural residues, side-chain modifications, or cyclic peptides should be provided directly as BILN (use the BILN annotation and button to generate or convert the required notation). For details on supported residue definitions, cyclic handling, and notation, please refer to the pyPept publication.

3D coordinates are generated using a fast embedding workflow designed for web responsiveness, relying primarily on RDKit’s ETKDGv3 (Experimental Torsion Knowledge Distance Geometry) method. Only limited optimization is performed; therefore, the resulting conformations should be treated as approximate visual models rather than validated structures.
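As a rough illustration of this kind of workflow, the sketch below builds a molecule from a one-letter sequence with RDKit, writes its SMILES, embeds a single conformer with ETKDGv3, and runs a short MMFF clean-up. It is an assumption-laden stand-in: it uses RDKit's `MolFromFASTA` for standard residues rather than the pyPept/BILN route described above, and the function name `embed_peptide`, the random seed, and the iteration cap are hypothetical choices.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_peptide(fasta_seq: str, seed: int = 42, max_iters: int = 200):
    """Return (SMILES, molecule with one embedded 3D conformer).

    Stand-in for the app's pyPept/BILN pipeline: standard residues only.
    """
    mol = Chem.MolFromFASTA(fasta_seq)
    if mol is None or mol.GetNumAtoms() == 0:
        raise ValueError(f"could not parse sequence: {fasta_seq!r}")
    smiles = Chem.MolToSmiles(mol)

    # Explicit hydrogens are needed for sensible 3D geometry
    mol_h = Chem.AddHs(mol)

    params = AllChem.ETKDGv3()
    params.randomSeed = seed  # fixed seed for reproducible coordinates
    if AllChem.EmbedMolecule(mol_h, params) == -1:
        raise RuntimeError("ETKDGv3 embedding failed")

    # Deliberately short force-field clean-up ("limited optimization");
    # the return code (converged or not) is ignored in this sketch.
    AllChem.MMFFOptimizeMolecule(mol_h, maxIters=max_iters)
    return smiles, mol_h


smiles, mol_3d = embed_peptide("GAY")
print(smiles)
print(Chem.MolToMolBlock(mol_3d).splitlines()[3])  # counts line of the mol block
```

Capping the optimizer iterations keeps response time predictable at the cost of geometry quality, which is why such output is best treated as a visual model.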

Known issues.

  • Highly aromatic sequences (approximately more than 20 aromatic residues) may fail during RDKit sanitization or kekulization. These validation checks are intentionally retained to ensure the chemical consistency and reliability of generated structures, and we aim to address this limitation in future updates.
  • Protonation states and covalent connectivity are assigned using chemical rules. However, because this web tool performs no full energy minimization or solvent refinement, local geometries (especially for flexible side chains, protonated groups such as NH₃⁺/iminium, and hydrogen-bond orientations) may not correspond to an energetically preferred arrangement. For physically refined structures, export the model and run an external minimization/solvent refinement workflow.
  • For cyclic peptides, conformer generation may sometimes yield cis amide geometries due to perceived ring constraints during embedding. Experimentally, trans amide geometries are often preferred in cyclic peptides, so such structures should be interpreted with care and used primarily for visualization.
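One way to spot cis amide bonds in an exported model is to measure the backbone ω dihedral (Cα–C–N–Cα) of each peptide bond. The sketch below does this with RDKit on a small linear test peptide; the SMARTS pattern, the helper names, and the 30° cutoff are assumptions made for illustration and are not part of the application.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

# CA-C(=O)-N-CA'-C'(=O): requiring a carbonyl carbon on CA' keeps the
# pattern on the backbone (it excludes, for example, proline's C-delta).
OMEGA_PATTERN = Chem.MolFromSmarts("[CX4][CX3](=O)[NX3][CX4][CX3]=O")


def omega_angles(mol, conf_id: int = -1):
    """Return the omega dihedral (degrees) of every matched peptide bond."""
    conf = mol.GetConformer(conf_id)
    angles = []
    for match in mol.GetSubstructMatches(OMEGA_PATTERN):
        ca, c, _o, n, ca_next = match[:5]
        angles.append(rdMolTransforms.GetDihedralDeg(conf, ca, c, n, ca_next))
    return angles


def cis_amides(angles, cutoff: float = 30.0):
    """Angles near 0 degrees indicate cis; near +/-180 degrees indicate trans."""
    return [a for a in angles if abs(a) < cutoff]


# Small linear test peptide, embedded the same way as a quick visual model
mol = Chem.AddHs(Chem.MolFromFASTA("GAG"))
params = AllChem.ETKDGv3()
params.randomSeed = 7
AllChem.EmbedMolecule(mol, params)

angles = omega_angles(mol)
print([round(a, 1) for a in angles])
print("cis amides found:", len(cis_amides(angles)))
```

A flagged bond is a prompt for closer inspection or external refinement, not proof that the geometry is wrong.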

References

Ochoa, R., Brown, J.B., Fox, T. pyPept: a python library to generate atomistic 2D and 3D representations of peptides. Journal of Cheminformatics (2023).
pyPept (Boehringer Ingelheim) — GitHub repository: https://github.com/Boehringer-Ingelheim/pyPept