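Chemoinformatics track:
• Familiarity with common machine learning and big-data analysis concepts (regression/classification, large-scale clustering or similarity analyses, interactive data visualization, etc.)
• Familiarity with current deep learning methods applied in medicinal chemistry (GAN/NLP/reinforcement learning for de novo molecular design, GCN and CNN for molecular property or protein binding affinity prediction, etc.)
• Familiarity with deep learning or data analysis tools (PyTorch, TensorFlow, Keras, KNIME, pandas, scikit-learn, etc.)
• Strong programming and algorithm skills, with proficiency in at least one programming language (C/C++, Python, Java, etc.)
• NoSQL/graph databases (MongoDB, Neo4j, HyperGraphDB)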
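Computational chemistry combined with machine learning track:
• Understanding of the fundamental theory of quantum chemistry (e.g. quantum theory of atoms in molecules (QTAIM)), and familiarity with quantum chemistry calculation workflows and their basic principles
• Prior theoretical or experimental research combining medicinal/biological chemistry with computational chemistry, with publications in related fields
• Chemoinformatics and molecular modeling software: Schrödinger suite, chemoinformatics toolkits (e.g. RDKit, OpenEye, CDK), QM and MD simulation packages (Gaussian, GROMACS/AMBER/CHARMM), PyMOL, etc.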
Responsibilities
• Work with the PI and computer science colleagues to develop or apply cutting-edge machine learning approaches for drug discovery (e.g. 3D-based de novo molecular generation or focused compound libraries; large-scale molecular similarity search; better deep representations of molecules or ligand-protein interactions, etc.)
• Work with computer scientist colleagues to build relevant chemoinformatics databases
• Work with computer scientist colleagues to develop novel analytic or interactive visualization tools for biochemical data analyses or machine learning model interpretation
• Publish in high-impact chemistry, chemical biology, and machine learning journals or present at scientific conferences (e.g. ACS, Gordon Research Conferences, NeurIPS, ICML, etc.)
• Collaborate with bench chemists or biologists to design molecular probes or focused libraries for specific target families for proof-of-concept (POC) studies
• Identify and evaluate new modeling and informatics technologies and software applications from open-source projects or through external collaborations