As healthcare moves toward genetically tailored treatments, one of the biggest hurdles to truly personalized medicine is the lack of fast, low-cost genetic testing.
And few people are more familiar with the problems of today’s genetic diagnostics tools than Kalim Mir, the 52-year-old founder of XGenomes, who has spent his entire professional career studying the human genome.
“Ultimately genomics is going to be the foundation for healthcare,” says Mir. “For that we need to move toward a sequencing of populations.” And population-scale gene sequencing is something that current techniques are unable to achieve.
“If we’re talking about population scale sequencing with millions of people we just don’t have the throughput,” Mir says.
That’s why he started XGenomes, which is presenting as part of the latest batch of Y Combinator companies next week.
A visiting scientist in Harvard Medical School’s Department of Genetics, Mir worked with the famed Harvard professor George Church on a new kind of gene sequencing technology that promised to conduct sequencing at higher speeds and far lower costs than anything that was on the market.
The cost of sequencing a genome has come down significantly in the 19 years since the Human Genome Project completed its work for $1 billion.
These days, gene sequencing can take a couple of days and cost around $1,000, Mir says. But with XGenomes, Mir hopes to drive the cost of testing down even further.
“We developed a way where we’re sequencing directly on the DNA where we’re not manipulating it except for opening up the double helix,” says Mir.
Running a startup focused on conducting gene sequencing at population scales is not where Mir thought he’d be when he was growing up in Yorkshire in Northern England. “When I was in school there, I was not into science or tech. I was interested in literature,” he recalls.
That changed when he read Aldous Huxley’s Brave New World and began thinking about the implications of genetic manipulation that the book presented.
Mir went on to study molecular biology at Queen Mary College and upon graduation worked in a biotech company in the U.S.
After returning to England to complete his doctorate in the mid-90s, Mir worked with the geneticist Edwin Southern on the foundational science that now forms the core of testing technologies from companies like 23andMe, Illumina, and Affymetrix.
XGenomes’ technology works by unzipping strands of DNA and then sequencing the strands concurrently.
“I like to think of the genome as a book. The genome has chapters and the chapters could be the chromosomes,” says Mir. “Current technologies read it letter by letter. [But] we’re recognizing words.”
The company is able to accomplish this feat by using optical imaging technologies. Samples are treated with reagents that are then excited by lasers. XGenomes’ tech then “reads” the bits of DNA that are highlighted and identifies them.
Using this new tech, Mir thinks he can ultimately sequence a full genome in one to two hours and for as little as $100.
That would be a sea change in the way that testing is conducted and could bring about the rapid throughput of sequencing that Mir says is needed to make the vision of truly personalized medicine a reality.
Visual assessment is critical to healthcare, whether that is a doctor peering down your throat as you say “ahhh” or an MRI of your brain. Since the X-ray was invented in 1895, medical imaging has evolved into many modalities that empower clinicians to see into and assess the human body. Recent advances in visual sensors, computer vision and compute power are currently powering a new wave of innovation in legacy visual technologies (like the X-ray and MRI) and sparking entirely new realms of medical practice, such as genomics.
Over the next 10 years, healthcare workflows will become mostly digitized, with wide swaths of personal data captured and computer vision, along with artificial intelligence, automating the analysis of that data for precision care. Much of the digitized data across healthcare will be visual and the technologies that capture and analyze it are visual technologies.
These visual technologies traverse a patient’s journey from diagnosis, to treatment, to continuing care and prevention. They capture, analyze, process, filter and manage any visual data from images, videos, thermal, X-rays, ultrasound, MRI, CT scans, 3D, and more. Computer vision and artificial intelligence are core to the journey.
Three powerful trends (miniaturization of diagnostic imaging devices, next-generation imaging for the earliest stages of disease detection, and virtual medicine) are shaping the ways in which visual technologies are poised to improve healthcare over the next decade.
Miniaturization of Hardware Along with Computer Vision and AI will allow Diagnostic Imaging to be Mobile
Medical imaging is dominated by large incumbents that are slow to innovate. Most imaging devices (e.g. MRI machines) have not changed substantially since the 1980s and still have major limitations:
Complex workflows: large, expensive machines that require expert operators and have limited compatibility in hospitals.
Strict patient requirements: such as lying still or holding their breath (a problem for cases such as pediatrics or elderly patients).
Expensive solutions: limited to large hospitals and imaging facilities.
But thanks to innovations in visual sensors and AI algorithms, “modern medical imaging is in the midst of a paradigm shift, from large carefully calibrated machines to flexible, self-correcting, multi-sensor devices,” says Daniel K. Sodickson, MD, PhD, NYU School of Medicine, Department of Radiology.
Next Generation Sequencing, Phenotyping and Molecular Imaging Will Diagnose Disease Before Symptoms are Presented
Genomics, the sequencing of DNA, has grown at a 200% CAGR since 2015, propelled by Next Generation Sequencing (NGS) which uses optical signals to read DNA, like our LDV portfolio company Geniachip which was acquired by Roche. These techniques are helping genomics become a mainstream tool for practitioners, and will hopefully make carrier screening part of routine patient care by 2028.
Phenomics, the analysis of observable traits (phenotypes) that result from interactions between genes and their environment, will also contribute to earlier disease detection. Phenotypes are expressed physiologically and most will require imaging to be detected and analyzed.
Next Generation Phenotyping (NGP) uses computer vision and deep learning to analyze physiological data, understand particular phenotype patterns, and then correlate those patterns to genes. For example, FDNA’s Face2Gene technology can identify 300-400 disorders with 90%+ accuracy using images of a patient’s face. Additional data (images or videos of hands, feet, ears, eyes) can allow NGP to detect a wide range of disorders, earlier than ever before.
Molecular imaging uses DNA nanotech probes to quantitatively visualize chemicals inside of cells, thus measuring the chemical signature of diseases. This approach may enable early detection of neurodegenerative diseases such as Alzheimer’s, Parkinson’s and dementia.
Telemedicine to Overtake Brick-and-Mortar Doctor Visits
By 2028 it will be more common to visit the doctor via video over your phone or computer than it will be to go to an office.
Telemedicine will make medical practitioners more accessible and easier to communicate with. It will create a fully digitized record of visits for a patient’s profile, and it will reduce the costs of logistics and close regional gaps in specific medical expertise. One example is the telemedicine services rendered for the 1.9M people injured in the war in Syria.
The integration of telemedicine into ambulances has led to stroke patients being treated twice as fast. Doctors will increasingly call in their colleagues and specialists in real time.
Screening technologies will be integrated into telemedicine so it won’t just be about video calling a doctor. Pre-screening your vitals via remote cameras will deliver extensive efficiencies and hopefully health benefits.
“The biggest opportunity in visual technology in telemedicine is in solving specific use cases. Whether it be detecting your pulse, blood pressure or eye problems, visual technology will be key to collecting data,” says Jeff Nadler, Teladoc Health.
Remote patient monitoring (RPM) will be a major factor in the growth of telemedicine and the overall personalization of care. RPM devices, like we are seeing with the Apple Watch, will be a primary source of real-time patient data used to make medical decisions that take into account everyday health and lifestyle factors. This personal data will be collected and owned by patients themselves and provided to doctors.
Visual Tech Will Power the Transformation of Healthcare Over the Next Decade
Visual technologies have deep implications for the future of personalized healthcare and will hopefully improve the health of people worldwide. They represent unique investment opportunities, and we at LDV Capital have reviewed over 100 research papers from BCC Research, CBInsights, Frost & Sullivan, McKinsey, Wired, IEEE Spectrum and many more to compile our 2018 LDV Capital Insights report. This report highlights the sectors with the power to improve healthcare, based on the transformative nature of the technology in the sector, projected growth and business opportunity.
There are tremendous investment opportunities in visual technologies across diagnosis, treatment and continuing care & prevention that will help make people healthier across the globe.
When we last spoke to Sophia Genetics it had around 350 hospitals linked via its SaaS platform, and was then adding around 10 new hospitals per month.
Now it says its Sophia AI platform is being used by more than 850 hospitals across 77 countries, and it claims to have supported the diagnosis of more than 300,000 patients.
The basic idea is to improve diagnoses by enabling closer collaboration and knowledge sharing between hospitals via the Sophia AI platform, with an initial focus on oncology, hereditary cancer, metabolic disorders, pediatrics and cardiology.
Expert (human) insights across the network of hospital users are used to collectively enhance genomic diagnostics, and push towards predictive analysis, by feeding and training AI algorithms intended to enhance the reading and analysis of DNA sequencing data.
Sophia Genetics describes its approach as the “democratization” of DNA sequencing expertise.
Commenting on the Series E in a statement, Lilly Wollman, co-head of Generation’s growth equity team, said: “We believe that leveraging genetic sequencing and advanced digital analysis will enable a more sustainable healthcare system. Sophia Genetics is a leader in the preventive and personalized medicine revolution, enabling the development of targeted therapeutics, thereby vastly improving health outcomes. We admire Sophia Genetics not just for its differentiated analytics capability across genomic and radiomic data, but also for its exceptional team and culture.”
The new funding will be put towards further expanding the number of hospitals using Sophia Genetics’ technology, and also on growing its headcount with a plan to ramp up hiring in the US especially.
The Swiss-founded firm is now co-based in Lausanne and Boston, US.
In another recent development the company added radiomics capabilities to its platform last year, allowing for what it describes as “a prediction of the evolution of a tumour”, which it suggests can help inform a physician’s choice of treatment for the patient.
Google-owned AI specialist DeepMind has claimed a “significant milestone” in being able to demonstrate the usefulness of artificial intelligence to help with the complex task of predicting 3D structures of proteins based solely on their genetic sequence.
Understanding protein structures is important in disease diagnosis and treatment, and could improve scientists’ understanding of the human body — as well as potentially helping to support protein design and bioengineering.
Writing in a blog post about the project to use AI to predict how proteins fold — now two years in — it writes: “The 3D models of proteins that AlphaFold [DeepMind’s AI] generates are far more accurate than any that have come before — making significant progress on one of the core challenges in biology.”
There are various scientific methods for predicting the native 3D state of protein molecules (i.e. how the protein chain folds to arrive at the native state) from the sequence of amino acid residues encoded in DNA.
But modelling the 3D structure is a highly complex task, given how many permutations there can be on account of protein folding being dependent on factors such as interactions between amino acids.
There’s even a crowdsourced game (FoldIt) that tries to leverage human intuition to predict workable protein forms.
DeepMind says its approach rests upon years of prior research in using big data to try to predict protein structures.
Specifically it’s applying deep learning approaches to genomic data.
“We’re proud to be part of what the CASP organisers have called ‘unprecedented progress in the ability of computational methods to predict protein structure,’ placing first in rankings among the teams that entered (our entry is A7D),” it writes.
“Our team focused specifically on the hard problem of modelling target shapes from scratch, without using previously solved proteins as templates. We achieved a high degree of accuracy when predicting the physical properties of a protein structure, and then used two distinct methods to construct predictions of full protein structures,” it adds.
DeepMind says the two methods it used relied on deep neural networks trained to predict a protein’s properties from its genetic sequence.
“The properties our networks predict are: (a) the distances between pairs of amino acids and (b) the angles between chemical bonds that connect those amino acids. The first development is an advance on commonly used techniques that estimate whether pairs of amino acids are near each other,” it explains.
“We trained a neural network to predict a separate distribution of distances between every pair of residues in a protein. These probabilities were then combined into a score that estimates how accurate a proposed protein structure is. We also trained a separate neural network that uses all distances in aggregate to estimate how close the proposed structure is to the right answer.”
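The scoring idea described in that quote can be illustrated with a toy sketch. It assumes, purely for illustration, that each pair of residues comes with a predicted probability for a handful of distance bins, and that a proposed structure is scored by summing the negative log-probability of the bins its actual distances fall into. The bin edges, function names and example coordinates are hypothetical, and this is not DeepMind’s code.

```python
import math

# Hypothetical distance bins in angstroms: [0, 4), [4, 8), [8, 12), [12, inf)
BIN_EDGES = [4.0, 8.0, 12.0]


def bin_index(distance):
    """Return the index of the distance bin that a value falls into."""
    for i, edge in enumerate(BIN_EDGES):
        if distance < edge:
            return i
    return len(BIN_EDGES)


def euclidean(a, b):
    """Straight-line distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def structure_score(coords, predicted):
    """Sum of negative log-probabilities; lower means better agreement.

    coords:    one (x, y, z) tuple per residue (illustrative input)
    predicted: maps a residue pair (i, j) to a list of bin probabilities
    """
    score = 0.0
    for (i, j), probs in predicted.items():
        d = euclidean(coords[i], coords[j])
        p = probs[bin_index(d)]
        score += -math.log(max(p, 1e-9))  # guard against log(0)
    return score


# Made-up predictions for a three-residue chain.
predicted = {
    (0, 1): [0.7, 0.2, 0.05, 0.05],
    (0, 2): [0.1, 0.7, 0.1, 0.1],
    (1, 2): [0.7, 0.2, 0.05, 0.05],
}
compact = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]

print(structure_score(compact, predicted))    # agrees with the predictions
print(structure_score(stretched, predicted))  # disagrees, so scores higher
```

In this toy setup the compact structure scores lower (better) than the stretched one, because its pairwise distances land in the bins the made-up predictions consider most likely.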
It then used new methods to try to construct predictions of protein structures, searching known structures that matched its predictions.
“Our first method built on techniques commonly used in structural biology, and repeatedly replaced pieces of a protein structure with new protein fragments. We trained a generative neural network to invent new fragments, which were used to continually improve the score of the proposed protein structure,” it writes.
“The second method optimised scores through gradient descent — a mathematical technique commonly used in machine learning for making small, incremental improvements — which resulted in highly accurate structures. This technique was applied to entire protein chains rather than to pieces that must be folded separately before being assembled, reducing the complexity of the prediction process.”
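As a rough illustration of what optimising a score through gradient descent over a whole chain can look like, the sketch below moves every point of a tiny 2D “chain” at once so that its pairwise distances approach a set of target distances. The squared-error score, the learning rate, the step count and all names are assumptions made for this example; the article does not describe DeepMind’s actual scoring function or optimiser settings.

```python
import math


def pair_distance(a, b):
    """Distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def score(coords, targets):
    """Sum of squared gaps between actual and target pair distances."""
    return sum((pair_distance(coords[i], coords[j]) - t) ** 2
               for (i, j), t in targets.items())


def gradient(coords, targets):
    """Analytic gradient of the score with respect to every coordinate."""
    grad = [[0.0, 0.0] for _ in coords]
    for (i, j), t in targets.items():
        d = pair_distance(coords[i], coords[j])
        if d == 0.0:
            continue  # direction is undefined; skip this pair for this step
        for k in range(2):
            g = 2.0 * (d - t) * (coords[i][k] - coords[j][k]) / d
            grad[i][k] += g
            grad[j][k] -= g
    return grad


def descend(coords, targets, lr=0.02, steps=500):
    """Take small steps against the gradient, updating all points together."""
    coords = [list(p) for p in coords]
    for _ in range(steps):
        grad = gradient(coords, targets)
        for p, g in zip(coords, grad):
            p[0] -= lr * g[0]
            p[1] -= lr * g[1]
    return coords


# Hypothetical target distances for a three-point chain.
targets = {(0, 1): 3.8, (1, 2): 3.8, (0, 2): 5.0}
start = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)]
final = descend(start, targets)

print(score(start, targets), score(final, targets))
```

Because every coordinate is updated in each step, the whole chain is refined together rather than piece by piece, which is the property the quote highlights.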
DeepMind describes the results achieved thus far as “early signs of progress in protein folding” using computational methods — claiming they demonstrate “the utility of AI for scientific discovery”.
Though it also emphasizes it’s still early days for the deep learning approach having any kind of “quantifiable impact”.
“Even though there’s a lot more work to do before we’re able to have a quantifiable impact on treating diseases, managing the environment, and more, we know the potential is enormous,” it writes. “With a dedicated team focused on delving into how machine learning can advance the world of science, we’re looking forward to seeing the many ways our technology can make a difference.”
Presenting onstage today in the 2018 TC Disrupt Berlin Battlefield is Indian agtech startup Imago AI, which is applying AI to help feed the world’s growing population by increasing crop yields and reducing food waste. As startup missions go, it’s an impressively ambitious one.
The team, which is based out of Gurgaon near New Delhi, is using computer vision and machine learning technology to fully automate the laborious task of measuring crop output and quality — speeding up what can be a very manual and time-consuming process to quantify plant traits, often involving tools like calipers and weighing scales, toward the goal of developing higher-yielding, more disease-resistant crop varieties.
Currently they say it can take seed companies between six and eight years to develop a new seed variety. So anything that increases efficiency stands to be a major boon.
And they claim their technology can reduce the time it takes to measure crop traits by up to 75 percent.
In the case of one pilot, they say a client had previously been taking two days to manually measure the grades of their crops using traditional methods like scales. “Now using this image-based AI system they’re able to do it in just 30 to 40 minutes,” says co-founder Abhishek Goyal.
Using AI-based image processing technology, they can also crucially capture more data points than the human eye can (or easily can), because their algorithms can measure and assess finer-grained phenotypic differences than a person might pick up on or be easily able to quantify just judging by eye alone.
“Some of the phenotypic traits they are not possible to identify manually,” says co-founder Shweta Gupta. “Maybe very tedious or for whatever all these laborious reasons. So now with this AI-enabled [process] we are now able to capture more phenotypic traits.
“So more coverage of phenotypic traits… and with this more coverage we are having more scope to select the next cycle of this seed. So this further improves the seed quality in the longer run.”
The wordy phrase they use to describe what their technology delivers is: “High throughput precision phenotyping.”
Or, put another way, they’re using AI to data-mine the quality parameters of crops.
“These quality parameters are very critical to these seed companies,” says Gupta. “Plant breeding is a very costly and very complex process… in terms of human resource and time these seed companies need to deploy.
“The research [on the kind of rice you are eating now] has been done in the previous seven to eight years. It’s a complete cycle… chain of continuous development to finally come up with a variety which is appropriate to launch in the market.”
But there’s more. The overarching vision is not only that AI will help seed companies make key decisions to select for higher-quality seed that can deliver higher-yielding crops, while also speeding up that (slow) process. Ultimately their hope is that the data generated by applying AI to automate phenotypic measurements of crops will also be able to yield highly valuable predictive insights.
Here, if they can establish a correlation between geotagged phenotypic measurements and the plants’ genotypic data (data which the seed giants they’re targeting would already hold), the AI-enabled data-capture method could also steer farmers toward the best crop variety to use in a particular location and climate condition — purely based on insights triangulated and unlocked from the data they’re capturing.
One current approach in agriculture to selecting the best crop for a particular location/environment can involve using genetic engineering. Though the technology has attracted major controversy when applied to foodstuffs.
Imago AI hopes to arrive at a similar outcome via an entirely different technology route, based on data and seed selection. And, well, AI’s uniform eye informing key agriculture decisions.
“Once we are able to establish this sort of relation this is very helpful for these companies and this can further reduce their total seed production time from six to eight years to very less number of years,” says Goyal. “So this sort of correlation we are trying to establish. But for that initially we need to complete very accurate phenotypic data.”
“Once we have enough data we will establish the correlation between phenotypic data and genotypic data and what will happen after establishing this correlation we’ll be able to predict for these companies that, with your genomics data, and with the environmental conditions, and we’ll predict phenotypic data for you,” adds Gupta.
“That will be highly, highly valuable to them because this will help them in reducing their time resources in terms of this breeding and phenotyping process.”
“Maybe then they won’t really have to actually do a field trial,” suggests Goyal. “For some of the traits they don’t really need to do a field trial and then check what is going to be that particular trait if we are able to predict with a very high accuracy if this is the genomics and this is the environment, then this is going to be the phenotype.”
So — in plainer language — the technology could suggest the best seed variety for a particular place and climate, based on a finer-grained understanding of the underlying traits.
In the case of disease-resistant plant strains it could potentially even help reduce the amount of pesticides farmers use, say, if the selected crops are naturally more resilient to disease.
While, on the seed generation front, Gupta suggests their approach could shrink the production time frame — from up to eight years to “maybe three or four.”
“That’s the amount of time-saving we are talking about,” she adds, emphasizing the really big promise of AI-enabled phenotyping is a higher amount of food production in significantly less time.
As well as measuring crop traits, they’re also using computer vision and machine learning algorithms to identify crop diseases and measure with greater precision how extensively a particular plant has been affected.
This is another key data point if your goal is to help select for phenotypic traits associated with better natural resistance to disease, with the founders noting that around 40 percent of the world’s crop load is lost (and so wasted) as a result of disease.
And, again, measuring how diseased a plant is can be a judgement call for the human eye — resulting in data of varying accuracy. So by automating disease capture using AI-based image analysis the recorded data becomes more uniformly consistent, thereby allowing for better quality benchmarking to feed into seed selection decisions, boosting the entire hybrid production cycle.
Sample image processed by Imago AI showing the proportion of a crop affected by disease
In terms of where they are now, the bootstrapping, nearly year-old startup is working off data from a number of trials with seed companies — including a recurring paying client they can name (DuPont Pioneer); and several paid trials with other seed firms they can’t (because they remain under NDA).
Trials have taken place in India and the U.S. so far, they tell TechCrunch.
“We don’t really need to pilot our tech everywhere. And these are global [seed] companies, present in 30, 40 countries,” adds Goyal, arguing their approach naturally scales. “They test our technology at a single country and then it’s very easy to implement it at other locations.”
Their imaging software does not depend on any proprietary camera hardware. Data can be captured with tablets or smartphones, or even from a camera on a drone or using satellite imagery, depending on the sought for application.
Although for measuring crop traits like length they do need some reference point to be associated with the image.
“That can be achieved by either fixing the distance of object from the camera or by placing a reference object in the image. We use both the methods, as per convenience of the user,” they note on that.
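The reference-object approach comes down to simple proportional scaling, sketched below. This assumes the reference object and the crop sit at roughly the same distance from the camera and in the same plane. The function names and the numbers are hypothetical and do not describe Imago AI’s software.

```python
def mm_per_pixel(reference_length_mm, reference_length_px):
    """Scale factor derived from an object of known size in the image."""
    if reference_length_px <= 0:
        raise ValueError("reference length in pixels must be positive")
    return reference_length_mm / reference_length_px


def measure_mm(object_length_px, scale_mm_per_px):
    """Convert a length measured in pixels into millimetres."""
    return object_length_px * scale_mm_per_px


# Illustrative numbers: a 50 mm reference card spans 200 pixels,
# and the crop trait being measured spans 640 pixels.
scale = mm_per_pixel(50.0, 200.0)
trait_length_mm = measure_mm(640.0, scale)
print(scale, trait_length_mm)
```

Fixing the camera distance instead would amount to calibrating the same scale factor once and reusing it for every image.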
While some current phenotyping methods are very manual, there are also other image-processing applications in the market targeting the agriculture sector.
But Imago AI’s founders argue these rival software products are only partially automated — “so a lot of manual input is required,” whereas they couch their approach as fully automated, with just one initial manual step of selecting the crop to be quantified by their AI’s eye.
Another advantage they flag up versus other players is that their approach is entirely non-destructive. This means crop samples do not need to be plucked and taken away to be photographed in a lab, for example. Rather, pictures of crops can be snapped in situ in the field, with measurements and assessments still — they claim — accurately extracted by algorithms which intelligently filter out background noise.
“In the pilots that we have done with companies, they compared our results with the manual measuring results and we have achieved more than 99 percent accuracy,” is Goyal’s claim.
While, for quantifying disease spread, he points out it’s just not manually possible to make exact measurements. “In manual measurement, an expert is only able to provide a certain percentage range of disease severity for an image (for example, 25-40 percent), but using our software they can accurately pinpoint the exact percentage (e.g. 32.23 percent),” he adds.
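One simple way to arrive at an exact severity figure, once an image has been segmented, is to divide the number of diseased pixels by the number of plant pixels. The sketch below assumes two hypothetical binary masks have already been produced by some upstream model. It illustrates the arithmetic only and is not Imago AI’s method.

```python
def disease_severity_percent(plant_mask, disease_mask):
    """Percentage of plant pixels that are also flagged as diseased.

    Both masks are equally sized grids of 0/1 values (illustrative inputs).
    """
    plant_pixels = 0
    diseased_pixels = 0
    for plant_row, disease_row in zip(plant_mask, disease_mask):
        for is_plant, is_diseased in zip(plant_row, disease_row):
            if is_plant:
                plant_pixels += 1
                if is_diseased:
                    diseased_pixels += 1
    if plant_pixels == 0:
        raise ValueError("no plant pixels found in the mask")
    return 100.0 * diseased_pixels / plant_pixels


# Tiny made-up example: 8 plant pixels, 2 of them diseased.
plant_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
disease_mask = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(disease_severity_percent(plant_mask, disease_mask))
```

The result is only as exact as the masks themselves, so the quality of the segmentation step determines how meaningful the extra decimal places are.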
They are also providing additional support for seed researchers — by offering a range of mathematical tools with their software to support analysis of the phenotypic data, with results that can be easily exported as an Excel file.
“Initially we also didn’t have this much knowledge about phenotyping, so we interviewed around 50 researchers from technical universities, from these seed input companies and interacted with farmers — then we understood what exactly is the pain-point and from there these use cases came up,” they add, noting that they used WhatsApp groups to gather intel from local farmers.
While seed companies are the initial target customers, they see applications for their visual approach for optimizing quality assessment in the food industry too — saying they are looking into using computer vision and hyper-spectral imaging data to do things like identify foreign material or adulteration in production line foodstuffs.
“Because in food companies a lot of food is wasted on their production lines,” explains Gupta. “So that is where we see our technology really helps — reducing that sort of wastage.”
“Basically any visual parameter which needs to be measured that can be done through our technology,” adds Goyal.
They plan to explore potential applications in the food industry over the next 12 months, while focusing on building out their trials and implementations with seed giants. Their target is to have between 40 and 50 companies using their AI system globally within a year’s time, they add.
While the business is revenue-generating now — and “fully self-enabled” as they put it — they are also looking to take in some strategic investment.
“Right now we are in touch with a few investors,” confirms Goyal. “We are looking for strategic investors who have access to agriculture industry or maybe food industry… but at present haven’t raised any amount.”
Earlier this year, news broke that police had devised an unexpected new method to crack cold cases. Rather than use a suspect’s DNA to identify them, data from the DNA was used to search public repositories and identify an alleged killer’s family members. From there, a bit of family tree building led to a limited number of suspects and the eventual identification of the person who was charged with the Golden State killings. In the months that followed, more than a dozen other cases were reported to have been solved in the same manner.
The potential for this sort of analysis had been identified by biologists as early as 2014, but they viewed it as a privacy risk—there was potential for personal information from research subjects to leak out to the public via their DNA sequences. Now, a US-Israeli team of researchers has gone through and quantified the chances of someone being identified through public genealogy data. If you live in the US and are of European descent, odds are 60 percent that you can be identified via information that your relatives have made public.
ID, the family plan
Any two humans share identical versions of the vast majority of their DNA. But there are enough differences commonly scattered across the three billion or so bases of our genomes that it’s now cheap and easy to determine which version of up to 700,000 differences people have. This screen forms the basis of personal DNA testing and genealogy services.
The genetics of Europe are a bit strange. Just within historic times, it’s seen waves of migrations, invasions, and the rise and fall of empires—all of which should have mixed its populations up thoroughly. Yet, if you look at the modern populations, there’s little sign of all this upheaval and some indications that many of the populations have been in place since agriculture spread across the continent.
This was rarely more obvious than during the contraction and collapse of the Roman Empire. Various Germanic tribes from north-eastern Europe poured into Roman territory in the west only to be followed by the force they were fleeing, the Huns. Before it was over, one of the groups ended up founding a kingdom in North Africa that extended throughout much of the Mediterranean, while another ended up controlling much of Italy.
It’s that last group, the Longobards (often shortened to “Lombards”), that’s the focus of a new paper. We know very little of them or any of the other barbarian tribes that roared through Western Europe other than roughly contemporary descriptions of where they came from. But a study of the DNA left behind in the cemeteries of the Longobards provides some indication of their origins and how they interacted with the Europeans they encountered.
Nebula Genomics, the startup that wants to put your whole genome on the blockchain, has announced the raise of $4.3 million in Series A from Khosla Ventures and other leading tech VCs such as Arch Venture Partners, Fenbushi Capital, Mayfield, F-Prime Capital Partners, Great Point Ventures, Windham Venture Partners, Hemi Ventures, Mirae Asset, Hikma Ventures and Heartbeat Labs.
Nebula has also forged a partnership with genome sequencing company Veritas Genetics.
Veritas was one of the first companies to sequence the entire human genome for less than $1,000 in 2015, later adding all that info to the touch of a button on your smartphone. Both Nebula and Veritas were cofounded by MIT professor and “godfather” of the Human Genome Project, George Church.
The partnership between the two companies will allow the Nebula marketplace, where those consenting to share their genetic data can earn Nebula’s cryptocurrency, called “Nebula tokens,” to build upon Veritas’ open-source software platform Arvados, which can process and share large amounts of genetic information and other big data. According to the company, this crossover offers privacy and security for the physical storage and management of various data sets according to local rules and regulations.
“As our own database grows to many petabytes, together with the Nebula team we are taking the lead in our industry to protect the privacy of consumers while enabling them to participate in research and benefit from the blockchain-based marketplace Nebula is building,” Veritas CEO Mirza Cifric said in a statement.
The partnership will work with various academic institutions and industry researchers to provide genomic data from individual consumers looking to cash in by sharing their own data, rather than by freely giving it as they might through another genomics company like 23andMe.
“Compared to centralized databases, Nebula’s decentralized and federated architecture will help address privacy concerns and incentivize data sharing,” added Nebula Genomics co-founder Dennis Grishin. “Our goal is to create a data flow that will accelerate medical research and catalyze a transformation of health care.”