计算化学公社

Title: Help with batch-generating SMILES strings for peptides

Author: qiuqiuren    Time: 2026-1-12 15:52
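I'd like to ask everyone for help. I have a large number of peptide PDB files along with their amino acid sequences, and I want to obtain the SMILES string for each peptide. Should I convert directly from PDB to SMILES, or generate the SMILES from the sequence? (I previously tried converting from PDB, but RDKit printed warnings.) Some of the sequences contain non-standard amino acids, and some of the peptides are cyclic. Any guidance would be much appreciated.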
Author: UW_0728.    Time: 2026-1-12 16:25
Last edited by UW_0728. on 2026-1-12 16:28

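Just convert the PDB files directly to SMILES with OpenBabel. First install OpenBabel and set up the environment variables, then run a command like the following (assuming all the PDB files are in the current directory):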
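  # For each PDB file, write its name, then its SMILES, then a blank line
  for f in ./*.pdb; do
    n=$(basename "$f" .pdb)
    echo "$n"
    # Keep only the first output line, and only its SMILES column
    obabel -ipdb "$f" -osmi | head -n 1 | awk '{print $1}'
    echo
  done > SMILES.txt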
After it finishes, the SMILES.txt file that appears in the current directory contains the SMILES of each structure.
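If you want to double-check, pick some structures with many or complicated rings, or with somewhat unusual configurations, paste their SMILES into https://pubchem.ncbi.nlm.nih.gov/edit3/index.html, and generate the 2D structure diagrams to verify them.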
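The layout that the loop writes to SMILES.txt (a name line, a SMILES line, then a blank line) can be awkward to load into other tools. As a minimal sketch, assuming exactly that layout, the shell function below reshapes it into a tab-separated table and reports on stderr any structure for which no SMILES was written. The name `smiles_table` is made up for illustration and is not part of OpenBabel.

```shell
# Hypothetical helper: reshape SMILES.txt into "name<TAB>SMILES" lines.
# Assumes blank-line-separated records: a name line followed by a SMILES line.
smiles_table() {
  awk 'BEGIN { RS = ""; FS = "\n" }            # paragraph mode: one record per block
       NF >= 2 { printf "%s\t%s\n", $1, $2; next }
       { print "no SMILES for " $1 > "/dev/stderr" }' "$1"
}
```

It could be called as `smiles_table SMILES.txt > smiles.tsv`; names reported on stderr are the structures worth inspecting by hand.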
Author: qiuqiuren    Time: 2026-1-13 13:44
Hello, I generated the SMILES with obabel, and for some of the structures it prints this warning:

*** Open Babel Warning  in PerceiveBondOrders
  Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is 6rhe_D.pdb)

I then read the generated SMILES.txt with RDKit, and some entries raise errors:
[13:04:14] Explicit valence for atom # 38 N, 4, is greater than permitted
[13:04:14] Explicit valence for atom # 38 N, 4, is greater than permitted
[13:04:14] Explicit valence for atom # 26 N, 4, is greater than permitted
[13:04:14] Explicit valence for atom # 125 O, 2, is greater than permitted
[13:04:14] Explicit valence for atom # 18 N, 4, is greater than permitted
[13:04:15] Explicit valence for atom # 26 N, 4, is greater than permitted
[13:04:15] Explicit valence for atom # 71 O, 2, is greater than permitted
[13:04:15] Explicit valence for atom # 38 N, 4, is greater than permitted
[13:04:15] Explicit valence for atom # 38 N, 4, is greater than permitted
[13:04:15] Explicit valence for atom # 18 N, 4, is greater than permitted
[13:04:15] Explicit valence for atom # 26 N, 4, is greater than permitted
[13:04:15] Explicit valence for atom # 30 N, 4, is greater than permitted,

In this situation, is the only option to process the failing PDB files individually?
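Whichever way the failing files end up being handled, it helps to know first which ones they are. A minimal sketch, assuming obabel is on the PATH and prints its warnings to stderr in the form shown above: the function below lists every PDB file that either triggers an Open Babel warning or yields no SMILES at all. The name `flag_problem_pdbs` is hypothetical.

```shell
# Hypothetical helper: print the PDB files that obabel warns about
# or for which it produces no SMILES.
flag_problem_pdbs() {
  errlog=$(mktemp)
  for f in "$@"; do
    # stdout carries the SMILES; stderr (warnings) goes to the temp file
    smi=$(obabel -ipdb "$f" -osmi 2>"$errlog" | head -n 1 | awk '{print $1}')
    if [ -z "$smi" ] || grep -q 'Open Babel Warning' "$errlog"; then
      printf '%s\n' "$f"
    fi
  done
  rm -f "$errlog"
}
```

Running `flag_problem_pdbs ./*.pdb > problem_pdbs.txt` would collect the flagged files in one list. This only catches problems that Open Babel itself reports; SMILES that RDKit later rejects would still need a separate check.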



