We are working with Hartwig Medical Foundation to take cancer research to a new level, using incredible amounts of processing power, storage capacity, and a novel research platform. Through co-creation we are undertaking an unprecedented journey, in a highly motivated mixed team, on a state-of-the-art cloud platform, with an enormous amount of data.
Hartwig Medical Foundation is the first national DNA sequencing center in the Netherlands. It has grown out of the Center for Personalized Cancer Treatment (CPCT), to which hospitals contribute small pieces of tumor tissue (biopsy specimens) and blood samples from cancer patients who are participating in the program. These samples undergo whole genome sequencing before being processed in a genomics pipeline. The sequencers produce huge files of raw DNA data, cut into millions of tiny fragments, which are then passed to the pipeline. The pipeline uses specialized software and a large number of computers to piece these fragments back together in the correct order. This actually happens twice for each patient: once with the blood sample, to identify the patient’s own DNA, and once with the biopsy specimen, to determine the DNA of the tumor. In a next step, the pipeline compares the two in order to identify the abnormalities that cause cancer and give the tumor cells their malignant characteristics. It is the output of this comparison that we want to study in relation to the clinical data.
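The core of that comparison can be pictured as a set difference: variants found in the tumor but not in the patient’s blood are the tumor-specific (somatic) changes. The sketch below is purely illustrative and is not Hartwig Medical Foundation’s actual pipeline code; real pipelines use probabilistic variant callers, and the variant tuples shown are hypothetical examples.

```python
# Illustrative sketch: a variant is written as a
# (chromosome, position, reference base, alternate base) tuple.
# All values below are hypothetical examples.

germline_variants = {            # found in the blood sample (inherited DNA)
    ("chr7", 55249071, "C", "T"),
    ("chr17", 7577120, "G", "A"),
}
tumor_variants = {               # found in the tumor biopsy
    ("chr7", 55249071, "C", "T"),   # also in the blood -> not tumor-specific
    ("chr17", 7577120, "G", "A"),   # also in the blood -> not tumor-specific
    ("chr12", 25398284, "C", "A"),  # only in the tumor -> somatic change
}

# Somatic variants: present in the tumor, absent from the patient's own DNA
somatic = tumor_variants - germline_variants
for chrom, pos, ref, alt in sorted(somatic):
    print(f"{chrom}:{pos} {ref}>{alt}")
```

In practice each sample contributes millions of candidate variants with associated quality scores, but the conceptual step — subtracting the patient’s own genome from the tumor genome — is exactly this comparison.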
EXCEPTIONAL PROCESSING POWER
Processing and comparing the DNA samples requires exceptional processing power, storage and bandwidth. Hartwig Medical Foundation will feed this data into a database where the effectiveness of cancer treatments can be compared with the genetic structures of the tumors. The idea is that in the future treatments can be better matched to individual patients, which could lead to more effective treatments and fewer side effects. While this is an exciting prospect, it is important to keep in mind the sensitivity of the personal data involved; security is of the utmost importance. A researcher doing statistical analysis, for example, should not be able to trace the results back to specific individuals. At the same time, the patients, who are naturally only entered in the database if they have given their explicit permission, often do want their data to be accessible to the doctor who is treating them.
IMPROVING THE PIPELINE
From January to March 2016, Schuberg Philis conducted a pilot project for Hartwig Medical Foundation, developing a platform on which the pipeline could run. Even before this trial, the team had improved the pipeline software. During the pilot project, everything worked satisfactorily, and in April a contract was signed for Schuberg Philis to host the pipeline in the Mission Critical Cloud.
When we began this project, Hartwig Medical Foundation had already found a considerable number of hospitals willing to supply cancer biopsy samples via CPCT studies. With the pipeline in our cloud, they were able to scale this up: by the end of 2016 there were approximately 24 actively participating hospitals, and the procedures for providing samples from another 16 had been evaluated by the Medical Ethics Committee. According to Hans van Snellenberg, Director of the Hartwig Medical Foundation, their goal is to grow to 50 to 60 hospitals, which would provide coverage of most of the hospitals and patients in the Netherlands. The outcome data of the pipeline analyses is fed back to the hospital in a patient report and also stored with the clinical data in the HMF database.
“We have improved the pipeline together. The data now goes in and out more quickly, and the infrastructure is better utilized. Driving down cost per patient is important.”
The outcome data of the pipeline analyses is currently used (for research purposes only) by bioinformaticians and oncologists at the hospitals. The hospitals can access the genetic data of their patients via a secure portal hosted by Schuberg Philis. Hartwig Medical Foundation is also receiving research proposals (for data from the database), each of which must be examined by the data access board for scientific quality, technical feasibility, and legal and ethical criteria. The institute has also received a lot of media interest, which is feeding the hope of better treatments. Early in 2017, Hans van Snellenberg looked back on a year of changes, discussing the role Schuberg Philis has played.
JOINING OUR BEST PEOPLE WITH THE BEST TOOLS
“Schuberg Philis has done four important things for us in the past year. Firstly, they have provided enormous processing capacity. Secondly, we have improved the pipeline together. The data now goes in and out more quickly, and the infrastructure is better utilized. Driving down cost per patient is important to make the current setup acceptable for incorporation in the care system. Last year only two sequencers were running in the pipeline; now all ten are running. In April we began with 40 patients per month, and by the end of 2016 we had 110 per month, and that number is increasing. Thirdly, Schuberg Philis has also invested in storage capacity, ranging from continually accessible object storage through to slower tape storage. Data is now stored in a compressed form, while remaining quickly accessible. It is raw data, stored in such a way that we can trace its origins. So we do not foresee problems from legacy systems in the near future, and can continue to improve the pipeline. Finally, I see all sorts of added-value activities, such as the proof of concept for the portal that is allowing the research centers to see the raw data themselves and conduct analyses. Nevertheless, we do not want to be peddling hope by claiming that we have the final key to curing cancer. With our combined efforts, though, we can make a difference by joining our best people with the best tools.
How big, we will see in the future.” At the end of 2015, the biggest challenge facing Schuberg Philis was to thoroughly understand the pipeline and then to improve it. What was the biggest challenge for Schuberg Philis in 2016? Janot van Wegen, Customer Director at Schuberg Philis: “The processing capacity and storage that we supply to Hartwig Medical Foundation is exponentially larger than that of any of our other customers. It amounts to a few hundred gigabytes per patient, which is 35 to 40 terabytes per month at the current numbers. Archiving that is a big challenge, because we are now dealing with quantities that it was thought only big players such as Amazon Web Services could handle. However, this data cannot be placed in the public cloud, mainly for privacy reasons, so it remains in our Mission Critical Cloud. We have learned a lot about the pressure on the storage environment, and will apply those lessons to increase the stability of the environment in the future. In addition, we have also gained a deeper understanding of how a DNA sequencing pipeline works.”
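The quoted figures are easy to sanity-check: at 110 patients per month and a few hundred gigabytes per patient, the monthly volume lands in the stated 35 to 40 terabyte range. The calculation below assumes 350 GB per patient as an illustrative mid-range value; the exact per-patient figure is not given in the article.

```python
# Back-of-the-envelope check of the storage figures quoted above.
patients_per_month = 110   # stated patient volume at the end of 2016
gb_per_patient = 350       # assumed mid-range for "a few hundred gigabytes"

tb_per_month = patients_per_month * gb_per_patient / 1000
print(f"{tb_per_month:.1f} TB per month")  # falls within the quoted 35-40 TB
```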