Proteins belong to the most important molecules in living organisms. They function as messengers, transporters and catalysts, and provide cells and tissues with structure. The expression profile of proteins is rich in information, which can be used, for example, in diagnosing dis
...
Proteins belong to the most important molecules in living organisms. They function as messengers, transporters and catalysts, and provide cells and tissues with structure. The expression profile of proteins is rich in information, which can be used, for example, in diagnosing diseases. Therefore proteomics, the large scale study of proteins, can give us valuable information on molecular pathways and state of health. As a result, proteomics has the potential to transform personalized medicine.
Recent advances in mass spectrometry have led to a draft of the human proteome. With current mass spectrometry based techniques, these types of large scale studies remain an enormous effort. Therefore, there is a great need for breakthrough technologies to push proteomics from fundamental research into the clinic.
Genomics has benefitted from fast and inexpensive emerging single-molecule techniques. We envision similar effects for single-molecule protein sequencing. In this thesis we present our technology that will allow us to analyze protein expression profiles of samples as small as a single cell with large dynamic range.
Back in 2011, when this project was initiated, there was hardly any literature available on this topic. However, the past years more research groups openly shifted their focus to single-molecule protein sequencing. In Chapter 1, we give an overview of recent efforts to establish single-molecule protein sequencing. The foremost reason for the absence of highly sensitive and high-throughput protein sequencing techniques is the complexity of primary protein structures compared to DNA/RNA molecules. Where DNA and RNA consist of four unique building blocks, proteins are built from 20 distinctive amino acids.
Independent of the read out method of choice, this requires the detection of 20 distinguishable signals. A non-trivial challenge. Fortunately, a limited number of proteins occur compared to the theoretical number that could be created using 20 unique building blocks. While the exact number of protein coding genes in the human genome is still under debate, the number is believed to be roughly 20,000, resulting in a number of protein products that is finite. This, together with protein databases such as UniProt, allows for an alternative way of identifying protein sequences. Rather than detecting every single element, as is essential for DNA sequencing, we choose to focus on detecting the sequence of a subset of elements.@en