Biostatistics from 30,000 feet: An embarrassment of riches

This post is part of a series on Google's Project Baseline and my perspective as an amateur bioinformatician.

The Human Genome Project will probably go down in history as the biggest government project ever to finish so early and under budget.  It pulled the entire genomics industry up by its bootstraps and precipitated a drop in the cost of DNA sequencing far below the Moore's law-type predictions that had been the conventional wisdom in the industry.  Today it takes well under $1000 and a day to sequence a human genome, a task that took the Human Genome Project 13 years and upwards of one billion dollars.

This all sounds like quite a boon for computational biologists, right? Surely now we can sequence everyone's genome and tease out the genetic basis of disease for the betterment of all humankind! Not so fast -- the laws of combinatorics are working against researchers in the field, as you'll soon see.

Google wants your blood, sweat, and tears

This post is part of a series on Google's Project Baseline and my perspective as an amateur bioinformatician.

Today the Washington Post reported on a massive leak of personal information and passwords belonging to over 6 million customers.  I don't know what page of the physical paper this story was printed on, but it definitely wasn't front-page news.  Nowadays, these kinds of leaks have become commonplace.  Over the last several years there have been many high-profile leaks of private information from companies like Amazon, Uber and Venmo, potentially compromising the personal and financial information of tens of millions of people.  And yet we all still use these services.  We do an internal calculus involving the risk of a leak, the sensitivity of the data, and the benefit of using the service.  Most of us decide that we want that sweet, sweet same-day delivery, a car on call, and a painless way to pay back our friends for Thai food more than we want absolute security of our personal data. But no company has more access to our private data than Google.  Chances are that your recovery email for Verizon, Amazon and Uber is a Gmail account, your browser is Chrome, and even if your phone doesn't run Android, you have several Google services installed with a bevy of permissions.

Most people seem to trust Google with their data.  But now Google wants more data, from as many volunteers as it can get. Much more data. And of a far more personal nature.  The company is collaborating with investigators from Stanford and Duke universities on an audacious plan to map human health.  Google wants your blood, your sweat, your tears, and several other bodily secretions that people don't talk about in polite company.  It wants to sequence and enumerate your genome, your proteome, your metabolome, and your microbiome.  It wants to scan you with every medical and wearable device imaginable. Oh, and it wants to do this continuously for the next five years.  It's a big ask, but the payoff for our understanding of human health could be immense.

I said yes.  Please don't be evil, Google.