Bengaluru: You read a brochure that promises to tell you what diseases you are likely to get in the next decade. Obviously you want to lead a longer, healthier life. So you send a cheek swab to the clinic and await test results. Thankfully, your reports are largely all-clear except for a minor mutation which would likely predispose you to diabetes in your 40s. Your doctor dutifully prescribes diet control, exercise and a pill.
A few months later, you hear that your cousin has been convicted in a hit-and-run case. Unwittingly, you played a role in the conviction because it was your DNA that led the police to identify him. Your chances of getting diabetes may be lower, but your cousin’s chances of proving himself innocent are nil. Sounds unbelievable?
Humans are now at a stage where we can truly appreciate how little we know of our biology. The realisation of ignorance has also accelerated our efforts to understand the mechanisms that govern our physical and behavioural characteristics.
For example, the protein haemoglobin is synthesised by three genes in the human body. Different mutations in these genes cause different diseases such as thalassemia or sickle cell anaemia. Sequencing of the genome can identify the presence of a mutation and thus indicate the likelihood of disease. Other mutations, such as the ones in BRCA1 linked to breast cancer, do not always cause disease but can increase disease predisposition.
Genome-wide sequencing of diseased individuals can lead to the discovery of underlying mutations that contribute to the disease.
The evolution of genome databases
When the Human Genome Project began in 1990, the effort to sequence the genome of even one human seemed difficult. The project, which focussed on mapping the entire human genome to determine both the physical and functional genetic make-up of humans, took 13 years to complete.
Today, we know that sequencing 1, 10 or even 1,000 genomes is insufficient to discover the genetic origins of our health and disease.
Various countries have embarked on projects to sequence 1 million genomes of their citizens. These genomic databases are expected to reveal key insights into disease prevalence and the underlying origins of disease.
India has also taken a small step in this direction — piloting the IndiGen project to look at 10,000 Indian genomes.
But this is only one of many genomic databases in the country. Many academic and private institutions that research genetic diseases hold human genomic samples and data, albeit at different scales.
Private companies like Strand Life Sciences, Mapmygenome and Medgenome offer genetic testing and, in the process, end up becoming repositories of genomic data. Stem cell clinics and blood banks, which might not analyse genomic data, nonetheless keep cellular samples from which DNA information can be extracted.
In addition to providing health-related information, genomic databases are also used for forensic identification.
Such databases help identify repeat offenders or the remains of deceased people. One such database was famously used to identify the Golden State Killer in the US, a man accused of killing 12 people and raping 45 women between 1976 and 1986.
Why due process is crucial
India is also considering a forensic DNA database, which will be governed through the DNA Technology (Application and Regulation) Bill.
The bill is currently under examination by the parliamentary standing committee on science and technology. As scientific tools improve and we understand more about our DNA, we can expect both the number of genomic databases and the purposes they are used for to grow.
The collection of information and its proper utilisation for the benefit of society is always welcome. But a collection of such vast amounts of personal and sensitive data is vulnerable to exploitation.
For example, earlier this year, a high court in Florida allowed police agencies to access private databases maintained by genomic companies to seek forensic data. Though private companies are opposing this move, the decision does set a dangerous precedent. It allows data gathered under specific limits of consent to be repurposed without explicit consent.
Such cross-talk between databases can lead to people withdrawing data, or not submitting data to civilian databanks at all, for fear of implicating themselves or their family members in a crime. Thus, databanks need to be governed by strong purpose limitation clauses, and the transfer of data outside of those purposes should be criminalised.
Further, due processes of consent need to be put in place to allow effective use of databanks.
Consider, for example, research projects that promise to look into disease prevalence. These projects generate massive amounts of data, which can be used by commercial organisations to research and engineer better diagnostic and therapeutic tools.
However, the databanks need to ensure their donors are aware of data being transferred to commercial partners and of benefit-sharing agreements if commercially viable products are generated. Commercially viable products could include diagnostic panels for diseases or creation of therapies based on better understanding of disease origin through analyses of the data.
How to shape policies
The data protection law — when it comes in — will ensure protection of sensitive personal information of donors. The law will regulate the processing of personal data of individuals by government and private entities. Processing of data will be allowed if the individual gives consent, or in a medical emergency, or by the State for providing benefits.
The bill is yet to be passed by Parliament.
Procedures of consent will be handled through ICMR/DBT guidelines and the data protection law. However, the implementation of rights’ protection in databanks will be ensured through their existing ethics committees.
Currently, ethics committees lack the competence and ability to function properly. They are composed of people familiar with the institution and are essentially not incentivised to scrutinise their host institution. Their powers are limited to assessing proposals at the beginning of projects; they cannot ask for follow-up or scrutinise the host institution’s routine working. Hence it is critical that ethics committees in databanks are empowered to make decisions and hold databanks accountable for any rights’ violations.
But finally, the most important facet that shapes data banking is how the data will be used. Consider DNA data banking for forensic purposes: DNA merely indicates the presence of a person at a crime scene, but cannot imply causality. Yet DNA is increasingly being considered a silver bullet to solve pending cases.
Your cousin might have been trying to help the victim and left his DNA at the scene. Even results from health-related databanks are not absolute. Your predisposition to diabetes is a probability based on many other factors: nutrition, exercise, stress, etc. It is important that your doctor takes these into account before prescribing you an otherwise unnecessary lifestyle change.
Genomics data banking is a necessary first step that can help India solve many existing issues. But our policies need to mature to support databanks and use them effectively. Till then, databanks will remain repositories of personal and sensitive data that can be easily misused.
Shambhavi Naik is a fellow at the Takshashila Institution’s Technology and Policy Programme.