Phylogeographic Surveillance of CA HIV Transmission Networks

Sudeb Dalai, University of California, Berkeley
Advisor: Art Reingold
Basic Biomedical Sciences
Dissertation Award

HIV epidemics are driven by multiple factors, including sociopolitical events, risk behaviors, and properties of the virus that may affect its transmissibility or disease potential. Recent studies of the HIV epidemics in Africa, Asia, and North America, many using state-of-the-art computational and laboratory approaches, have indicated that each of these epidemics is fueled by different risk factors, transmission patterns, and viral strains. While the African and Asian HIV epidemics are perpetuated by sexual-social factors, intravenous drug use (IDU), and commercial sex work, male-male sex has remained the primary mode of transmission in most US communities. However, in recent years the HIV epidemic in California has shifted from a primarily white MSM (men who have sex with men) population to a diverse range of overlapping risk groups, including immigrants and underrepresented minorities. In San Mateo, a northern CA county with an annual HIV incidence of 4.6%, the epidemic is sustained by multiple ethnic, migratory, and behavioral networks, including MSM, immigrants from Latin America and Asia, and intravenous drug users. In addition, increased migration and population mobility have introduced new HIV strains from other regions into the community, further complicating public health efforts to identify, treat, and prevent new infections. Finally, HIV is constantly evolving through mutations in various viral genes, leading to the emergence of new viral species and strains. Since some of these genes, including the HIV polymerase (pol) gene, are the targets of antiretroviral drugs, mutations in genes like pol can lead to viral drug resistance, potentially causing treatment complications and clinical failure in patients.
We believe that the extraordinary genetic diversity of HIV may explain the divergent epidemic patterns seen in different regions, and that by collecting HIV gene sequences from numerous patients it is possible to gain information on the factors and risk networks underlying epidemics, track drug resistance within individual patients and in communities, and potentially predict future epidemic trends. We propose to collect HIV gene sequences spanning more than 13 years from patients receiving treatment in San Mateo County, CA (~1000 pol gene sequences), link these sequences to patient clinical and demographic information, and analyze them by developing and applying innovative, rigorous computational methods. While this will establish the largest collection of San Mateo HIV sequences analyzed to date, we will also continue to acquire additional Bay Area sequences from regional collaborators and public databases. We expect that 1) investigation of HIV sequences that appear highly similar will reveal common modes of transmission, risk groups, and networks operating in northern CA; 2) monitoring the ways in which viral gene sequences evolve and mutate in response to drug therapy will yield new insights into the nature and consequences of drug resistance; and 3) continued surveillance of HIV genes will provide an expanding resource to track the regional introduction of new viral species and pinpoint the sources that continue to drive the northern CA epidemic. HIV sequence data contain important information that can assist community public health programs and treatment providers in developing effective, integrated strategies to reduce transmission and to identify and treat newly infected individuals.
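As a rough illustration of how highly similar sequences might be grouped into putative transmission clusters, the sketch below links any two aligned pol sequences whose pairwise distance falls under a cutoff and reports the resulting connected groups. It is not the method proposed here: the function names, the simple proportion-of-differences distance, and the 1.5% cutoff are all assumptions chosen for illustration, and a real analysis would likely use an evolutionary distance model and phylogenetic support.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or ambiguous base ('N')
    are skipped. Hypothetical helper; a simplification of model-based distances.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        diffs += x != y
    # With no comparable sites, treat the pair as maximally distant.
    return diffs / compared if compared else 1.0


def cluster(seqs, threshold=0.015):
    """Group sequence IDs into clusters of two or more linked sequences.

    seqs maps a patient/sequence ID to an aligned nucleotide string.
    threshold is an assumed cutoff, not a value taken from this proposal.
    """
    parent = {name: name for name in seqs}

    def find(n):
        # Union-find lookup with path compression.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Link every pair of sequences closer than the threshold.
    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Singletons are unlinked sequences, not clusters.
    return [g for g in groups.values() if len(g) > 1]
```

Cluster membership from such a grouping could then be cross-tabulated against the linked clinical and demographic data (risk group, country of origin, resistance mutations) to characterize each network.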