Machine learning. We hear the phrase a lot in modern science, but what does it mean? And how might machines that can learn help solve complex problems like cancer?
Brittany Lasseigne, Ph.D., a researcher at the HudsonAlpha Institute for Biotechnology, recently explained how the technique is working in her Huntsville, Ala., laboratory. It was an elaboration of her talk to more than 1,000 people at the institute’s November cancer-fighting fundraiser, “Tie the Ribbons.”
“The way I think about machine learning is it’s a way to take big data sets and complicated problems and get some traction on them,” Lasseigne said. Her example is how a calculator works.
“You have inputs – maybe the numbers 2 and 3 – and you have an algorithm that you give the calculator, maybe the addition operation, and what does the calculator do?” she asked. “It takes that set of numbers and that algorithm and it gives you back an output, in this case 5. Machine learning kind of turns that on its head.”
“What you do is give the computer the inputs, the 2 and the 3, and the output 5, and you ask it to come up with the equation or the algorithm that describes the relationship of those two sets of numbers,” Lasseigne said.
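The calculator analogy can be sketched in a few lines of Python. This is only an illustration of the idea, not anything from Lasseigne's lab: the example pairs, the function names `fit` and `predict`, and the choice of a simple weighted sum trained by gradient descent are all assumptions made for the sketch. The program is never told that the operation is addition; it is given inputs and outputs and has to find weights that relate them.

```python
# Toy examples (invented for illustration): pairs of inputs and the output
# a calculator doing addition would have produced for them.
examples = [((2, 3), 5), ((1, 4), 5), ((6, 2), 8), ((0, 7), 7), ((5, 5), 10)]


def fit(examples, steps=5000, learning_rate=0.01):
    """Learn weights w1, w2 so that w1*a + w2*b matches the given outputs."""
    w1, w2 = 0.0, 0.0
    for _ in range(steps):
        grad1 = grad2 = 0.0
        for (a, b), target in examples:
            # How far off is the current guess for this example?
            error = (w1 * a + w2 * b) - target
            grad1 += error * a
            grad2 += error * b
        # Nudge each weight in the direction that shrinks the average error.
        w1 -= learning_rate * grad1 / len(examples)
        w2 -= learning_rate * grad2 / len(examples)
    return w1, w2


w1, w2 = fit(examples)


def predict(a, b):
    """Apply the learned relationship to inputs the program has not seen."""
    return w1 * a + w2 * b
```

After training, both weights should settle very close to 1, which is the program's way of rediscovering addition, and `predict(10, 7)` should come out very close to 17 even though that pair was never in the examples.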
Lasseigne’s inputs are the “huge bucket of genomic data” about cancer now available thanks to the growing field of genomic research, and her outputs are what scientists are trying to predict, “things like patient survival or response to different drug therapies or their likelihood of having a cancer or having recurrent cancer.”
So, Lasseigne asks the computer to take the data sets and “come up with an equation that describes the inputs – the genomic data – compared to the outputs – whether or not the patient would respond to a drug – and then I can take that equation, that algorithm, and apply it in new settings.”
“This has really started to take off in the last few years,” she said, “because we have the hardware, the cool math and the software, and we’ve got the big data.” Researchers now have huge data sets whose collection has been funded by taxpayers, and researchers like Lasseigne also work with doctors and clinics to monitor patients during clinical trials.
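The step of learning from known cases and then applying the result "in new settings" might look something like the sketch below. Everything in it is hypothetical: the two measurements per patient, the numbers themselves, and the names `train` and `predict` are invented for illustration and are not real genomic data. The technique shown is a nearest-centroid classifier, chosen because it is short, not because it is the method Lasseigne describes.

```python
# Made-up training data: each entry pairs two hypothetical measurements
# (gene_a_level, gene_b_level) with whether that patient responded to a drug.
training = [
    ((0.9, 0.2), True),
    ((0.8, 0.3), True),
    ((0.7, 0.1), True),
    ((0.2, 0.8), False),
    ((0.3, 0.9), False),
    ((0.1, 0.7), False),
]


def centroid(rows):
    """Average each measurement across a group of patients."""
    count = len(rows)
    return tuple(sum(row[i] for row in rows) / count for i in range(len(rows[0])))


def train(training):
    """Summarize responders and non-responders by their average profile."""
    responders = [features for features, responded in training if responded]
    non_responders = [features for features, responded in training if not responded]
    return centroid(responders), centroid(non_responders)


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def predict(model, features):
    """Predict a response if the new patient looks more like the responders."""
    responder_center, non_responder_center = model
    return squared_distance(features, responder_center) < squared_distance(
        features, non_responder_center
    )


model = train(training)
```

With this toy model, a new patient measured at `(0.85, 0.15)` would be predicted to respond and one at `(0.15, 0.85)` would not. Real genomic data involves far more measurements per patient, which is where the large data sets and computing power come in.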
Science needed enough data to seek patterns that could be useful. So, how is it working?
One result is that scientists are now thinking in terms of the molecular patterns in patients’ cancers. They are beginning to see cancer as different in each patient because of differences in that patient’s genes. With enough data and computing power, patterns of similarity can still emerge: a breast cancer patient might have the same underlying gene changes as a patient with ovarian cancer. “It might tell us what the best drugs might be for that patient at that time to get,” Lasseigne said.
Processing this much data – terabytes of it – means big investments in computers. Research institutes like HudsonAlpha make those investments. It also means crossing disciplines. Biologists work today with mathematicians and computer programmers, Lasseigne said.
“My background is actually in engineering,” she said, “so that’s part of what drew me to this – a systems approach to try and get as much information as we can out of a data set.”