2023-24-project-catalogue

### Open DEL for Machine Learning in Drug Discovery

Project ID: 2228bd1044 (You will need this ID for your application)

Project Summary:

The pharmaceutical industry develops medicines, but where do they start? Usually with a company screening millions of molecules. These screens are expensive and sometimes fail to give a good “hit”. Recently, there has been a lot of interest in alternative ways of finding this molecular needle in a haystack. One of these is the DNA-encoded library (DEL), in which each molecule is attached to a strand of DNA. You screen all the molecules at the same time against your biological target of interest (such as a protein) and pick out the ones that bind. The DNA strand can then be sequenced, telling you what the molecule is. This is very useful because you can screen so many molecules at once. However, you always need to make the molecule in its native form (not attached to DNA) to check that the result is real, and you can still get many false positives and false negatives.

We are interested in a different approach. We make a DEL, screen it against a protein (e.g., a coronavirus protein), identify the “hits” and then use machine learning to work out, by comparing the hits to the misses, which molecules should bind the protein. Crucially, the molecules we predict can be bought rather than needing to be made, so the process is fast.

This idea has recently been demonstrated by a team at Google, but it needs work. What kinds of molecules work best in the DEL? How do we train the algorithm that predicts the molecules? How big does the chemical library need to be? Does this idea work for every protein? There are significant questions to be answered in this research, which will involve chemical synthesis, protein assays and machine learning. The results will be highly impactful for the future development of medicines.
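To make the "compare the hits to the misses" step concrete, here is a deliberately simplified sketch in Python. It is not the method used in this project or in the Google work. It assumes each molecule can be reduced to a set of structural features, and every feature name, catalogue ID and data point below is hypothetical. It scores each feature by how much more often it appears in hits than in misses (a smoothed log-odds, in the style of naive Bayes), then ranks purchasable molecules by the sum of their feature scores.

```python
import math
from collections import Counter


def feature_log_odds(hits, misses, alpha=1.0):
    """Smoothed log-odds of each feature appearing in a hit versus a miss."""
    hit_counts = Counter(f for mol in hits for f in mol)
    miss_counts = Counter(f for mol in misses for f in mol)
    scores = {}
    for f in set(hit_counts) | set(miss_counts):
        # Laplace smoothing (alpha) avoids dividing by zero for features
        # seen only among the hits or only among the misses.
        p_hit = (hit_counts[f] + alpha) / (len(hits) + 2 * alpha)
        p_miss = (miss_counts[f] + alpha) / (len(misses) + 2 * alpha)
        scores[f] = math.log(p_hit / p_miss)
    return scores


def score(molecule, log_odds):
    """Sum the feature scores; features never seen in the screen count as 0."""
    return sum(log_odds.get(f, 0.0) for f in molecule)


def rank_catalogue(catalogue, log_odds):
    """Return catalogue IDs ordered from most to least hit-like."""
    return sorted(catalogue, key=lambda name: score(catalogue[name], log_odds),
                  reverse=True)


# Hypothetical DEL screen results: each molecule is a set of made-up features.
hits = [
    {"indole", "amide"},
    {"indole", "sulfonamide"},
    {"indole", "amide", "pyridine"},
]
misses = [
    {"pyridine", "ester"},
    {"amide", "ester"},
    {"sulfonamide", "pyridine"},
    {"ester"},
]

# Hypothetical purchasable molecules that were never part of the DEL.
catalogue = {
    "cat-001": {"indole", "amide"},
    "cat-002": {"ester", "pyridine"},
    "cat-003": {"sulfonamide", "amide"},
}

log_odds = feature_log_odds(hits, misses)
ranking = rank_catalogue(catalogue, log_odds)
print(ranking)
```

In this toy data the feature shared by every hit ends up with the largest positive score, so catalogue molecules containing it rise to the top of the ranking. A real study would need a much richer description of each molecule and a model that copes with the false positives and false negatives mentioned above, which is part of what the project sets out to investigate.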