Abstract
α-Amino acids are fundamental to biochemistry as the monomeric building blocks with which cells construct proteins according to genetic instructions. However, the 20 amino acids of the standard genetic code represent a tiny fraction of the number of α-amino acid chemical structures that could plausibly play such a role, both from the perspective of natural processes by which life emerged and evolved, and from the perspective of human-engineered genetically coded proteins. Until now, efforts to describe the structures comprising this broader set, or even estimate their number, have been hampered by the complex combinatorial properties of organic molecules. Here, we use computer software based on graph theory and constructive combinatorics in order to conduct an efficient and exhaustive search of the chemical structures implied by two careful and precise definitions of the α-amino acids relevant to coded biological proteins. Our results include two virtual libraries of α-amino acid structures corresponding to these different approaches, comprising 121 044 and 3 846 structures, respectively, and suggest a simple approach to exploring much larger, as yet uncomputed, libraries of interest.