|
Persistent Identifier
|
doi:10.17617/3.CBCNQG |
|
Publication Date
|
2026-06-01 |
|
Title
| Data repository: Complete NMR assignment for 275 of the most common dipeptides in intrinsically disordered proteins |
|
Author
| Rindfleisch, Tobias
Computational Biology Unit, University of Bergen
Department of Chemistry, University of Bergen
Max Planck Institute of Molecular Plant Physiology
Fjeldberg Taule, Emilie
Department of Chemistry, University of Bergen
Miettinen, Markus S.
Computational Biology Unit, University of Bergen
Department of Chemistry, University of Bergen
Underhaug, Jarl
Department of Chemistry, University of Bergen
|
|
Study Type
| experimental |
|
Description
| OVERVIEW: This dataset contains the experimental NMR data for the publication "Complete NMR assignment for 275 of the most common dipeptides in intrinsically disordered proteins". We provide all measured NMR spectra (NMReData, MestReNova) and the corresponding complete assignments (NMReData, NMR-STAR, MestReNova, YAML) for the 275 most common dipeptides in intrinsically disordered proteins.
ABSTRACT: Accurate NMR chemical shift assignments are essential for atomic-resolution characterization of proteins. For intrinsically disordered proteins (IDPs) and regions (IDRs) in particular, however, assignment remains a labor-intensive task due to spectral overlap and conformational heterogeneity. Consequently, complete side-chain assignments are rare. Here, we present a comprehensive reference dataset comprising the complete NMR chemical shift assignments for 275 of the most prevalent dipeptides in the IDPome, covering 93% of it. In addition, we report side-chain protonation-dependent chemical shifts for dipeptides containing aspartic or glutamic acid. The dataset contains all NMR-accessible backbone and side-chain nuclei, in total 11 571 rigorously validated data points, as well as the 1D (1H, 13C) and 2D (15N HSQC, 13C HSQC, TOCSY, NOESY, HMBC) spectra used for the assignment. This makes it a rich resource for training, testing, and benchmarking tools for data-driven protein assignment, peak picking, and synthetic spectrum generation. To facilitate such machine learning applications, all data are delivered in standardized, machine-readable formats.
REMARK: For a better overview, you can switch to "Tree view" in the "Files" section: Files > Change View > Tree
DATA: The following NMR experiments are included:
- 1D 1H (zgesgppe; gradient-enhanced, with water suppression)
- 1D 1H (1D NOESY)
- 1D 1H (q-NMR; quantitative NMR)
- 1D 13C (C13CPD; composite pulse decoupling)
- 2D 1H-15N HSQC (heteronuclear single quantum coherence)
- 2D 1H-13C HSQC (heteronuclear single quantum coherence)
- 2D 1H-1H TOCSY (total correlation spectroscopy)
- 2D 1H-1H NOESY (nuclear Overhauser effect spectroscopy)
- 2D 1H-13C HMBC (heteronuclear multiple-bond correlation)
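EXAMPLE: Because the assignments are provided in machine-readable formats, they can be read programmatically. The following is a minimal sketch, not the authors' workflow. It assumes that an assignment file contains an NMR-STAR loop using the standard `_Atom_chem_shift` tags; the embedded fragment and its shift values are illustrative placeholders and are not taken from this dataset. The helper names `parse_loop` and `shifts_by_nucleus` are hypothetical. Only the Python standard library is used.

```python
from collections import defaultdict

# Hypothetical NMR-STAR fragment for illustration only.
# The shift values are placeholders, NOT data from this repository.
EXAMPLE = """
loop_
   _Atom_chem_shift.ID
   _Atom_chem_shift.Seq_ID
   _Atom_chem_shift.Comp_ID
   _Atom_chem_shift.Atom_ID
   _Atom_chem_shift.Atom_type
   _Atom_chem_shift.Val
   1 1 ALA HA H 4.10
   2 1 ALA CA C 51.5
   3 2 GLY CA C 43.2
stop_
"""


def parse_loop(text):
    """Parse the first simple loop_ block into a list of dicts.

    Simplification: values must be whitespace-separated and unquoted,
    with one row per line. Real files may need a full NMR-STAR parser.
    """
    tags, rows, in_loop = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line == "loop_":
            in_loop = True
            continue
        if line == "stop_":
            break
        if not in_loop:
            continue
        if line.startswith("_"):
            # Keep only the part after the category, e.g. "Atom_ID".
            tags.append(line.split(".", 1)[1])
        else:
            values = line.split()
            if len(values) != len(tags):
                raise ValueError(f"row does not match tag count: {line!r}")
            rows.append(dict(zip(tags, values)))
    return rows


def shifts_by_nucleus(rows):
    """Group chemical shift values (ppm) by nucleus type."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Atom_type"]].append(float(row["Val"]))
    return dict(grouped)


rows = parse_loop(EXAMPLE)
grouped = shifts_by_nucleus(rows)
for nucleus, values in sorted(grouped.items()):
    print(nucleus, values)
```

For the actual files, a dedicated reader such as pynmrstar (listed under Software) is likely the more robust choice, since it handles quoting, multiple save frames, and the full tag set.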
|
|
Subject
| Biology; Medicine; Chemistry |
|
Keyword
| IDP
NMR
chemical shifts
protonation state
dipeptides |
|
Language
| English |
|
Depositor
| Rindfleisch, Tobias |
|
Deposit Date
| 2025-08-12 |
|
Software
| MestReNova, Software Version: 14.02.0-26256
TopSpin, Software Version: 4.1.4
NMR-STAR, Software Version: 3.3
Python, Software Version: 3.9.13
Python: Biopython, Software Version: 1.78
Python: Matplotlib, Software Version: 3.5.2
Python: NumPy, Software Version: 1.21.5
Python: Pandas, Software Version: 1.4.4
Python: pynmrstar, Software Version: 3.3.5
Python: tqdm, Software Version: 4.64.1
Python: PyYAML, Software Version: 6.0
Python: Zipfile, Software Version: 3.9.13 |
|
Related Publication
| Please cite the following paper if you use any of the data/scripts provided here:
Tobias Rindfleisch, Emilie Fjeldberg Taule, Markus S. Miettinen, and Jarl Underhaug. "Complete NMR assignment for 275 of the most common dipeptides in intrinsically disordered proteins". Scientific Data (2026) |