
A new study links India's genetic diversity to language, not geography

WEST LAFAYETTE, Indiana - The popularity of genetic and ancestry services such as Ancestry.com and 23andMe shows that people care about where their ancestors came from. The underlying assumption is that the geography of one's ancestors affects one's genes today.

Historically, scientists have discovered that geography is the main driver of genetic diversity in a population. Now, new research from Purdue University indicates that while that may be true for European countries, it is not true for all other parts of the world, especially places like India, where language and social systems have strongly affected how and where people live. The model the researchers developed to analyze the genetics of Indian populations will allow other researchers to analyze populations where genetics are not as closely tied to geography. Understanding the genetics of human populations helps scientists understand the history of human movement and cultures, and paves the way for understanding human health and susceptibility to disease.

Peristera Paschou, a population geneticist and associate professor of biological sciences at Purdue, studies human genetic variation around the world and led the study with Petros Drineas, associate director of Purdue's Department of Computer Science.

“Our genome bears the signature of our ancestors, and the genetic makeup of modern populations has been shaped by the forces of evolution. What we are looking for is what led different groups of people to come together and what separated them,” Paschou said. “To understand the genetics of human populations, we created a model that allows us to consider together many different factors that may have shaped genetics. Interdisciplinary research combining genetics and informatics was key to our work, as was the analysis of a comprehensive data set that represents the diversity of the Indian subcontinent.”

Many population analyses are based primarily on data sets of individuals of European descent living in Europe or North America; genomic data for populations in other parts of the world are lacking. Data from European samples showed that genetics correlate very closely with geography: if you know someone's genetics, you can guess where they are from, within a few kilometers in some cases, and if you know where someone's ancestors came from, you have a close approximation of their genetic makeup.

Aritra Bose earned his Ph.D. from Purdue in both data science and genetics. Reading studies on how European genomes map to geography, Bose, who was born and raised in Calcutta, thought, “Hey. That wouldn't work in India.” India is home to more than 800 languages, as well as an age-old caste system that regulates who can marry and have children with whom.

“I read these articles and thought, ‘How can I use this concept in a stratified population like India?’” Bose said. “I grew up there, I understand castes and languages, and the complexities of society that can affect genetics.”

Studies of the Indian population had shown that the European model of population genetics and geography failed to explain the population genetics of India, which are also shaped by castes, culture and language.

The model, and the conclusions reached by the team of geneticists and data scientists when using it, have just been published in a study in the journal Molecular Biology and Evolution. Their study revealed that shared language, not geography, is the most powerful force shaping gene flow in India.

Developing the model was not easy. At first, Bose hit a snag with some of his equations and raised the problem with his mentor at IBM Research, where he was a fellow at the time. Working with his doctoral advisers and several IBM Research computer scientists, the team was able to develop a robust and flexible model.

Drineas, one of Bose's Ph.D. advisers, said: “I was intrigued by the interplay between genetics and sociodemographic factors in shaping the population structure of the Indian subcontinent. It was exciting to see that our model detected spoken language as a major force in bringing people together in India, across geographic and social barriers. We were fortunate that Aritra Bose, our former Ph.D. student (co-advised with Professor Paschou), worked on this project, as he has extensive experience in the algorithmic and human genetic side of our research, as well as the expertise to interpret our findings in the context of human genetic diversity within India.”

The resulting model, the first to be able to take into account so many different variables, has been highly successful in analyzing the genetics of the Indian population, giving scientists insight into how people moved into India and how various groups of people mixed. People who speak the same, or even similar, languages tend to be much more closely related, even if they live geographically far apart.

“It sheds light on how genetics works in our society,” Bose said. “This is the first model that can take into account the social, cultural, environmental and linguistic factors that shape the gene flow of populations. It helps us understand what factors contribute to the genetic puzzle that is India. It untangles the puzzle.”

The data help place India in context with the rest of the world genetically. Indians speaking Indo-European and Dravidian languages were more closely related to Europeans, while Indians speaking Tibeto-Burman languages were more closely related to East Asians.

This type of interdisciplinary research, combining data science with population genetics, and this model in particular, will help researchers understand the genetics of the human world, especially non-European countries with a rich history of diversity and migration.

About Purdue University

Purdue University is a leading public research institution that develops practical solutions to today's toughest challenges. Ranked the No. 5 Most Innovative University in the United States by U.S. News & World Report, Purdue offers world-changing research and out-of-this-world discoveries. Committed to hands-on and online, real-world learning, Purdue offers a transformative education for all. Committed to affordability and accessibility, Purdue has frozen tuition and most fees at 2012-13 levels, allowing more students than ever to graduate debt-free. See how Purdue never stops in the persistent pursuit of the next giant leap at https://purdue.edu/.

Writer, Media Contact: Brittany Steff, 765-494-7833, bsteff@purdue.edu

Sources: Aritra Bose, abose@ibm.com

Peristera Paschou, ppaschou@purdue.edu

Petros Drineas, pdrineas@purdue.edu

Journalists visiting the campus: Journalists must follow Protect Purdue protocols and the following guidelines:

  • The campus is open, but the number of people in the spaces may be limited. We will be as accommodating as possible, but you may be asked to step out or report from another location.
  • To allow access, particularly to campus buildings, we recommend that you contact the Purdue News Service media contact listed in the release to inform them of the nature of the visit and where you will be visiting. A representative from the News Service can facilitate safe access and escort you on campus.
  • Wear masks correctly inside any building on campus, and outdoors when a social distance of at least six feet is not possible.

ABSTRACT

Integration of linguistics, social structure and geography to model genetic diversity in India

Aritra Bose, Daniel E. Platt, Laxmi Parida, Petros Drineas and Peristera Paschou

DOI: 10.1093/molbev/msaa321

India represents an intricate tapestry of population substructure shaped by geography, language, culture and social stratification. While geography is closely correlated with genetic structure in other parts of the world, the strict endogamy imposed by the Indian caste system and the large number of languages spoken add further levels of complexity to understanding the structure of the Indian population. To date, no study has attempted to model and evaluate how these factors have interacted to shape patterns of genetic diversity within India. We merged all the publicly available data from the Indian subcontinent into a data set of 891 individuals from 90 well-defined groups. Combining geographic, genetic and demographic factors, we developed COGG (Correlation Optimization of Genetics and Geodemographics) to build a model that explains the observed population genetic substructure. We show that shared language along with social structure have been the most powerful forces in creating paths of gene flow in the subcontinent. Additionally, we discovered the ethnic groups that best capture the diverse genetic substructure using a ridge leverage score statistic. By integrating data from India with a data set of 1,323 additional individuals from 50 Eurasian populations, we found that Indo-European and Dravidian speakers of India show shared genetic drift with Europeans, while Tibeto-Burman speaking tribal groups have maximal shared genetic drift with East Asians.
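The idea of optimizing the correlation between a genetic axis and a weighted mix of geographic and demographic factors can be illustrated with a toy example. The sketch below is not the authors' COGG implementation: the data are synthetic, the names `max_correlation_weights` and `demo` are hypothetical, and it relies on the standard result that ordinary least squares gives the linear combination of covariates with the highest Pearson correlation to a target. It only shows how adding language indicators to geography can sharply raise that correlation when language drives the genetic signal.

```python
import numpy as np

def max_correlation_weights(target, covariates):
    """Weights w that maximize the Pearson correlation of covariates @ w with target."""
    X = covariates - covariates.mean(axis=0)   # center each covariate column
    y = target - target.mean()                 # center the target
    w, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit
    r = float(np.corrcoef(X @ w, y)[0, 1])     # correlation achieved by the fit
    return w, r

def demo(seed=0):
    rng = np.random.default_rng(seed)
    n = 200
    geography = rng.normal(size=(n, 2))           # synthetic stand-in for longitude/latitude
    language = rng.integers(0, 3, size=n)         # synthetic language-group labels
    language_onehot = np.eye(3)[language][:, 1:]  # indicator columns, first group as baseline
    # Synthetic "genetic axis" driven mostly by language, weakly by geography (an assumption).
    genetic_axis = (
        2.0 * language_onehot[:, 0]
        - 1.5 * language_onehot[:, 1]
        + 0.2 * geography[:, 0]
        + rng.normal(scale=0.3, size=n)
    )
    _, r_geo = max_correlation_weights(genetic_axis, geography)
    _, r_all = max_correlation_weights(
        genetic_axis, np.hstack([geography, language_onehot])
    )
    return r_geo, r_all

if __name__ == "__main__":
    r_geo, r_all = demo()
    print(f"geography only: r = {r_geo:.2f}")
    print(f"geography + language: r = {r_all:.2f}")
```

In this toy setup the geography-only correlation stays low while the combined model's is high, which mirrors the kind of comparison the abstract describes without reproducing the published method or data.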
