Epidemiology of Squamous Cell Carcinomas of the Head and Neck

Info: 41404 words (166 pages) Dissertation
Published: 10th Dec 2019

Tagged: Biology

Table of Contents

List of Tables

Table 1: Comparison of SCCHN incidence in Canada and India (Globocan 2012)

Table 2: Selected physical characteristics of cigarettes, cigars, pipes and bidi

Table 3: Characteristics of SNPs involved in tobacco and alcohol metabolism

Table 6: Representation of joint effect ORs for SCCHN by strata of smoking and SNP

List of Figures

Figure 1: Pathways involving tobacco carcinogens, metabolizing enzymes and SCCHN

Figure 2: A depiction of copy number variants in the human genome, adapted from He et al., 2012

Figure 3: Hypothetical DAG 1

Figure 4: Hypothetical DAG 2

Figure 6: Minimal sufficient set of confounders identified from hypothetical DAGs 1 and 2 to estimate total causal effect of an exposure on the outcome

Figure 7: Causal graph representing time-varying confounders affected by prior exposure

Figure 8: Percentage of control participants recruited from participating clinics at Indian and Canadian site

Figure 9: Causal graph representing SEP in three periods of life, oral cancer and associated potential confounders

Figure 10: Mediation model proposed by Baron and Kenny (1986)

Figure 11: Illustration of selection bias within a case-control study

Figure 12: Causal diagram illustrating information bias due to exposure misclassification

1         Introduction to be written

2         Literature review

The following sub-sections present current knowledge regarding the epidemiology of squamous cell carcinomas of the head and neck (SCCHN) with special reference to Canada and India, the role of risk factors such as tobacco, alcohol consumption and specific genetic polymorphisms involved in their metabolism, human papillomaviruses (HPV) and socioeconomic position (SEP), followed by a brief description of life-course epidemiology, case-control study design, counterfactual causal framework and directed acyclic graphs.

2.1       Squamous cell carcinomas of the head and neck (SCCHN) – Definition

Malignant tumours arising from the squamous cells that line the mucosal surface of the oral cavity, pharynx and larynx [C00‐C14, C32 under the International Classification of Diseases (ICD) 10 classification], are commonly referred to as squamous cell carcinomas of the head and neck (1). Histologically, more than 90% of cancers of the oral cavity, pharynx and larynx are of squamous cell origin (2).

2.2       Epidemiology of SCCHN

SCCHN are a heterogeneous group of cancers that differ in distribution, predisposing factors, diagnostic workup and management strategies. According to Globocan 2012 statistics, SCCHN accounted for approximately 599,500 incident cases worldwide, making them the 7th most common cancers in incidence (3.8% of cases) (3). Most of these cancers affect males (70.8%) and are diagnosed above 60 years of age (4). The sub-site with the highest cancer incidence is the oral cavity (300,373), followed by the larynx (156,877) and pharynx (142,387) [Age standardized incidence rates (ASIR) per 100,000 population: oral cavity=4, pharynx=1.9, larynx=2.1]. Globally, these cancers were the 8th most common causes of cancer mortality (3.6% of cases), and were responsible for 300,000 deaths in 2012 (3).

There is wide variation in the geographic distribution of SCCHN incidence across the globe (4, 5). Approximately two-thirds of the burden of incident SCCHN cases is borne by developing countries, with India accounting for 25% of new cases and 35% of deaths occurring worldwide (3). In 2012, approximately 142,000 new SCCHN cases were reported in India, accounting for 30% of all incident cancer cases in this country (6). There has been a rapid increase in the incidence of these cancers, specifically oral cancers, in India. A comparison of Globocan 2008 and 2012 reveals that oral cancer surpassed lung cancer in a span of four years to become the 3rd most common cancer in this country after breast and cervical cancers (3, 7).

In developed countries such as Canada, SCCHN accounts for 3% of incident cancer cases (3). An increase in the incidence of SCCHN from 3,000 new cases in 1990 to an estimated 5,650 new cases in 2016 has been reported, accounting for 1,650 deaths in this country in 2016 (8). According to Canadian Cancer Statistics 2016, a significant decrease in the incidence rate of oral cavity cancers was noted in males between 1992 and 2003, after which the rates became relatively stable (8). Rates among females did not change significantly between 1992 and 2012. In contrast, the incidence rate of pharyngeal cancers has increased significantly in both males and females since the mid-1990s. In males, the incidence of pharyngeal cancers surpassed that of oral cavity cancers in 2001 while in females, the incidence of oral cavity cancers continues to be higher than that of pharyngeal cancers (8).

A comparison of SCCHN incidence between India and Canada (Table 1) based on Globocan 2012 estimates shows that the age-standardised incidence rates (ASIR) for SCCHN overall and for nearly all sub-sites are higher in India than in Canada for both males and females (9).

Table 1: Comparison of SCCHN incidence in Canada and India (Globocan 2012)

Type of Cancer                     Canada               India
                                   Males    Females     Males     Females
SCCHN incidence (total numbers)    3,394    1,347       108,477   32,663
ASIR per 100,000 population
    SCCHN                          11.8     4.2         20.9      6.1
    Oral                           5.5      2.9         10.1      4.3
    Larynx                         3        0.6         4.6       0.5
    Pharynx                        3.2      0.8         6.3       1.3

ASIR: age-standardised incidence rate. Age standardisation was performed using the direct method and the World standard population as proposed by Segi (10) and modified by Doll et al. (11).
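The direct method mentioned in the table note can be illustrated with a minimal sketch. It weights each age-specific rate by the share of that age band in a standard population. The function name, the three age bands, the counts and the weights below are all hypothetical and chosen only for illustration; they are neither Globocan data nor the actual Segi-Doll world standard weights.

```python
def direct_asir(cases, person_years, std_weights, per=100_000):
    """Directly age-standardised incidence rate per `per` population.

    cases, person_years and std_weights each hold one value per age band.
    """
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age band")
    total_weight = sum(std_weights)
    # Weight each age-specific rate by the standard population in that band.
    weighted_sum = sum(
        (c / py) * w for c, py, w in zip(cases, person_years, std_weights)
    )
    return weighted_sum / total_weight * per


# Hypothetical inputs (assumption): three age bands, illustrative weights.
cases = [5, 40, 120]
person_years = [500_000, 400_000, 200_000]
std_weights = [50, 35, 15]

asir = direct_asir(cases, person_years, std_weights)
print(f"ASIR per 100,000: {asir:.1f}")
```

Because both populations are weighted by the same standard, differences in ASIR between two countries are not driven by differences in their age structures.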

SCCHN have a significant impact on the quality of life and psychosocial health of patients and impose a considerable economic burden on their families (12, 13). In the US, the incidence of suicide among patients with SCCHN is more than three times that of the general population (14). Most of these suicides have been reported to occur within the first 5 years after diagnosis and have been attributed to adverse effects on patients’ quality of life and the resulting psychological distress, which may last for decades after successful treatment. The overall 5‐year survival rates for SCCHN are low and vary by cancer sub‐site, from 35% for oral to 65% for laryngeal cancers (6, 15). Multiple primary tumours developing at the cancer site and a high rate of secondary tumours compared to other malignancies contribute to this poor prognosis, which has not changed over the past 30 years (16-18). Although the majority of SCCHN can readily be accessed for visual and tactile examination (e.g., oral cavity cancers), 60% of patients in North America are diagnosed at stage III or IV (19). In India, up to 80% of patients present with advanced disease (6). This situation may be attributed to diagnostic delay (failure of patients and/or professionals to recognise early signs and symptoms of cancer, delay in accessing professional care) and to the lack of diagnostic tools with high sensitivity and specificity for the early detection of clinical disease (19, 20). Severe functional and esthetic sequelae following treatment have been reported, especially for cases diagnosed at late stages. According to a 2007 study, the mean per-patient expense of managing oral cancers in the UK in the first year following diagnosis is USD 3,500 for pre-cancer and USD 25,000 for stage IV cancer patients (21). In North America, SCCHN are responsible for approximately USD 2.8 billion per year in productivity loss (21). For these reasons, SCCHN have been recognised as a major public health problem in both developed and developing countries.

2.3       Risk factors for SCCHN

SCCHN are complex diseases with multi-factorial aetiology. The discrepancy in the geographic distribution of their incidence has been attributed to variations in the risk factors involved in different locations (5). In developed countries, approximately two-thirds of SCCHN cases are attributed to tobacco smoking and alcohol consumption (4, 22-24) and about 17%-56% of cases may be due to high risk HPV infection (4, 25-27). In developing countries, such as India and most parts of South Asia, paan chewing is the strongest risk factor (4, 5, 28). Other risk factors include social (e.g., SEP) and psychosocial variables (e.g., acute life events, work stress, depression) (12, 29-33), familial associations (34-41), diet, sexual behaviour, infection and oral/periodontal health related factors (4, 5, 42-45). The sections below describe in detail the risk factors for SCCHN; special emphasis is given to tobacco smoking, alcohol consumption, genetic variations (polymorphisms and copy number variations) and SEP because they are central to this dissertation.

2.3.1        Tobacco use and alcohol consumption

Tobacco use

Tobacco use is the strongest risk factor for SCCHN. Among the various forms of tobacco consumption (e.g., smoking, chewing and snuffing), smoking (e.g., cigarettes, pipes, cigars, bidi, hookah, chutta, chillam) is the most common (46-48). In its smoked form, tobacco was first used in pipes and cigars, and later as bidis (especially in South Asia), followed by cigarettes in the latter half of the nineteenth century (49). Selected characteristics of cigarettes, cigars, pipes and bidis, including nicotine content, are provided in Table 2 (49-51).

Table 2: Selected physical characteristics of cigarettes, cigars, pipes and bidi


About 50% of men and 9% of women in developing countries, and 35% of men and 22% of women in developed countries smoke tobacco in the form of cigarettes (47). In 2013, the average daily cigarette consumption was 15.2 and 12.5 for male and female smokers respectively in Canada (52). Among the provinces, Quebec reported the highest daily cigarette consumption, at 15.6 overall (males=16.5, females=14.5) (52).

In India, approximately 35% of adults use tobacco in some form (47). Paan/betel quid (a combination of tobacco, areca nut and slaked lime wrapped in a betel leaf) chewing is one of the most commonly used forms of tobacco in India, in both males and females (53-56). The prevalence of tobacco smoking is around 14% in India and is much higher in males than females (24% vs 3%) (47). Bidi is the most commonly used smoking product (prevalent in 9% of adults), followed by cigarettes (6%) (47). One bidi produces more nicotine, carbon dioxide, tar, alkaloids and potential carcinogens than a regular cigarette (57-60).

Tobacco use and risk for SCCHN

The International Agency for Research on Cancer (IARC) first reported the positive association of tobacco use and alcohol consumption with SCCHN risk in 1985 and 1988, respectively (61, 62). Approximately 69 chemicals identified in tobacco smoke contribute to tumourigenesis, including 10 that are identified as Group 1 human carcinogens by the IARC (63). The most important of these carcinogens, which have also been causally linked to SCCHN, are volatile nitrosamines [e.g., NDMA (nitrosodimethylamine), NEMA (nitrosoethylamine)], nitrosodiethanolamine (NDELA), tobacco specific nitrosamines (TSNA) [e.g., 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) and N-nitrosonornicotine (NNN)], polycyclic aromatic hydrocarbons (PAH) (e.g., benzo[a]pyrene, benz[a]anthracene), aromatic amines, benzene and volatile aldehydes (e.g., acetaldehyde, formaldehyde) (28, 63, 64).

The oral cavity, pharynx and larynx are directly exposed to tobacco smoke, compared to other sites such as the lungs (49, 51, 57). In the West, approximately 45% of SCCHN cases in men and 75% of cases in women have been attributed to tobacco smoking (24). It is independently responsible for a quarter of SCCHN cases in non‐alcohol users (24, 65) worldwide, and 60%-90% of deaths from SCCHN in North America (66). The cigarette is the most common form of smoking and thus it is the main route of delivery of tobacco related carcinogens in most countries. An IARC review documented magnitudes of average relative risks ranging between 4 and 10 for SCCHN risk for ever smokers relative to never smokers (49). However, cigar and pipe smoking may deliver equivalent or higher doses of carcinogens compared to cigarette smoking. Indeed, the largest pooled analysis thus far of 19 studies on SCCHN reported that risk estimates for individuals who smoked only cigarettes, only cigars and only pipes were 3.93, 3.49 and 3.71, respectively, compared to non-smokers (67). Individuals who smoked various combinations of these products were also at approximately 2.5 to 3.5 times the risk for these diseases (67).

The majority of SCCHN cases in India and many Asian countries are attributable to paan/betel quid chewing (4, 55, 56, 68-70). The carcinogenic effect of paan chewing is complex, as it results from an interaction between carcinogens in tobacco, arecoline (the main alkaloid in areca nut) and an increased alkalinity of the oral mucosa due to slaked lime (61, 68, 71-73). In Southern India, approximately 50% of cases in men and 90% in women are attributable to frequent and long-term paan chewing (54). A recent meta-analysis reported a 5-7 times higher risk for oral cancers among chewers compared to non-chewers in India (70). Bidi smoking, reported to deliver approximately 1.5 times the carcinogens of commercial cigarettes, also significantly increases the risk of SCCHN (57). Studies including meta-analytical reviews report a 2-7 times increased risk among bidi smokers compared to non-smokers (55, 56, 74, 75). However, evidence on the association between filtered cigarette smoking and SCCHN in India is mixed. Both case-control and longitudinal studies report little or no association between this exposure and outcome (45, 55, 76, 77).

Multiple measures of tobacco use (e.g., frequency, duration, cumulative consumption) have been associated with SCCHN risk, with studies reporting linear or non-linear dose-response relationships with these exposures (24, 55, 56, 69, 75, 78-82). A large pooled analysis in a European male population reported a monotonic increase in risk for SCCHN with increasing frequency of cigarette smoking relative to non-smokers, starting from as few as 2 cigarettes per day (81). A similar dose-response relationship was demonstrated with a cumulative measure of paan chewing in studies from South India (56) and Taiwan (83). For the association between years since cessation of the habit and SCCHN risk, multiple studies report an inverse relationship (78, 80, 84-87).

Alcohol consumption

The World Health Organization (WHO) estimates that there are approximately two billion alcohol consumers worldwide (28). More than half of men (55%) and one third of women (34.4%) consume some form of alcoholic beverage (88), and their drinking patterns vary from occasional to habitual drinking, to alcohol abuse (28). There is wide variation in the type, quality and quantity of alcohol consumed across countries. In Canada, about three-quarters of the population (78%) drinks alcohol in the form of wine (10% ethanol), beer (5% ethanol), hard liquors (50% ethanol) and various combinations of spirits (89). In 2011, Quebec reported the highest rate of consumption (82%) in the country, and a higher percentage of males than females consumed alcohol (83% vs 74.5%) (89).

By comparison, the prevalence of alcohol consumption in India is much lower, with only 21% of men and 2% of women reporting this habit (90). The state of Kerala, located in the Southwest of India, reports the highest rates of alcohol consumption in the country (91). In addition to other forms of alcohol, people in India consume high quantities of “toddy”, a beverage produced locally from the fermented and distilled sap of palm and coconut trees (approximately 8-10% ethanol), and a locally brewed liquor known as “arrack”, traditionally produced from fermented palm sap and fruit, grain, or sugarcane (approximately 40-60% ethanol) (69, 72).

Alcohol consumption and risk for SCCHN

There is a general consensus that alcohol plays the role of a promoter/cocarcinogen in carcinogenesis (36, 62, 92-94). Local exposure to ethanol, the principal type of alcohol found in most alcoholic beverages, is considered to increase the permeability of oral, pharyngeal and laryngeal mucosa, facilitating the penetration of other carcinogens (24, 93, 95). Nutritional deficiencies induced by heavy drinking and a direct toxic effect on the epithelium by alcoholic beverages with high concentrations of ethanol may also contribute to alcohol associated carcinogenesis (94). In addition, certain alcoholic beverages contain low levels of carcinogenic substances (e.g., nitrosamines, urethane, polycyclic hydrocarbons) (62, 86). Furthermore, the primary metabolite of ethanol metabolism in the body, acetaldehyde, is a Group 1 human carcinogen that exerts multiple mutagenic and carcinogenic effects, qualifying alcohol as an initiator of the cancer pathway (24, 96-100). Other mechanisms are detailed in sub-section

Alcohol consumption accounts for approximately 30% of all SCCHN cases worldwide (101). The greater risk of disease in men is attributed to their higher average alcohol consumption relative to women (98, 101). An increase in the risk of SCCHN with different levels of ethanol consumption, duration, frequency and alcohol types has been documented among never tobacco users (24, 79, 101-103). In a large pooled analysis of case-control studies, Hashibe et al. documented that among never users of tobacco, approximately 7% of SCCHN cases were attributable to alcohol drinking alone. A meta-analysis of studies on several cancers reported between 1956 and 2012 documented risk estimates for SCCHN ranging from 1.44 to 1.83 for moderate drinkers and from 2.65 to 5.13 for heavy drinkers among European and North American populations (104). In South India, approximately 26% of the risk for oral cancer is attributable to alcohol consumption, with the risk ranging from 1.2 to 2.8 times higher among moderate to heavy alcohol consumers relative to non-consumers (54, 69, 80).
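Statements such as "X% of cases are attributable to alcohol" rest on an attributable fraction calculation. As a minimal sketch, Levin's formula expresses the population attributable fraction (PAF) in terms of exposure prevalence and relative risk. It assumes an unconfounded relative risk, and the cited studies may have used other, adjusted estimators. The function name and input values are hypothetical.

```python
def levin_paf(prevalence, relative_risk):
    """Population attributable fraction via Levin's formula.

    PAF = p * (RR - 1) / (1 + p * (RR - 1)),
    where p is the exposure prevalence in the population.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs (assumption): 50% of the population exposed, RR = 2.
paf = levin_paf(prevalence=0.5, relative_risk=2.0)
print(f"PAF: {paf:.1%}")
```

The same relative risk yields a larger attributable fraction where the exposure is more common, which is one reason attributable fractions differ between populations.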

Similar to tobacco products, a dose-response relationship, either linear or non-linear, has been documented for alcohol consumption and SCCHN association (101, 105-108). In a prospective study, Freedman et al. reported an increased risk of SCCHN (1.5 times for males, 2.5 times for females) for 3 drinks per day or more (65). However, a recent meta-analysis reported elevated risks at even lower levels, with risk ratios of 1.29, 3.24, 8.61, 13.2 for 10g (12ml), 50g (64ml), 100g (127ml), and 125g (160ml) of ethanol per day, respectively (101). Polesel et al. reported a non-linear dose-response relationship in a pooled European study and documented a threshold effect at 50g of ethanol consumption per day for pharyngeal and laryngeal cancers, and at 150g (191ml) for oral cancers (107). A recent meta-analysis also reported a non-linear dose-response relationship between frequency of ethanol consumed and SCCHN. However, they did not report a threshold effect for any SCCHN site (109).

Combined effect of tobacco and alcohol on risk for SCCHN

The interaction between tobacco and alcohol use in elevating the risk for SCCHN has been well demonstrated. Together they account for approximately 75-80% of SCCHN cases in North America and Europe (22-24) and 50% of oral cancer cases among males in Kerala, India (110). Several studies have considered the nature of the joint effects of smoking and alcohol on SCCHN (79, 101, 104, 105, 108, 111). Positive interactions on both additive and multiplicative scales have been reported between these exposures (98, 101, 105). A non-linear dose-response relationship has been documented for the combined effects of daily alcohol and cigarette consumption (108). For example, a 35-fold increase in risk of SCCHN was observed among those consuming 89g of ethanol and 10 cigarettes daily (108). The risk curve was steeper for increasing daily cigarette consumption among drinkers as compared to increasing alcohol consumption among smokers.
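The distinction between interaction on the additive and multiplicative scales can be made concrete with a minimal sketch. Given odds ratios for tobacco only, alcohol only and both exposures, all relative to the doubly unexposed group, the relative excess risk due to interaction (RERI) measures departure from additivity and the ratio of the joint OR to the product of the single-exposure ORs measures departure from multiplicativity. The function name and OR values are hypothetical and are not taken from the cited studies.

```python
def interaction_measures(or_tobacco_only, or_alcohol_only, or_both):
    """Return (RERI, multiplicative interaction ratio).

    RERI > 0 suggests positive interaction on the additive scale;
    a ratio > 1 suggests positive interaction on the multiplicative scale.
    """
    reri = or_both - or_tobacco_only - or_alcohol_only + 1.0
    ratio = or_both / (or_tobacco_only * or_alcohol_only)
    return reri, ratio


# Hypothetical odds ratios (assumption), reference = neither exposure.
reri, ratio = interaction_measures(
    or_tobacco_only=2.0, or_alcohol_only=1.5, or_both=6.0
)
print(f"RERI = {reri:.2f}, multiplicative ratio = {ratio:.2f}")
```

The two scales can disagree: a joint OR equal to the product of the single-exposure ORs gives a ratio of 1 (no multiplicative interaction) while RERI can still be positive.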

To summarise, it has been consistently demonstrated that tobacco use and alcohol consumption in various forms are strong risk factors for SCCHN, and several correlated measures of these exposures (frequency, duration, cumulative measures and time since cessation) are associated with SCCHN risk.

2.3.2        Genetic polymorphisms and copy number variations

Although tobacco and alcohol are strong risk factors for various cancers (e.g., SCCHN and lung cancer), only a very small proportion of tobacco users and alcohol consumers develop these diseases (33, 112, 113). For example, approximately 10%-15% of smokers develop lung cancer, and an even smaller proportion develop SCCHN (112, 113), suggesting inter-individual variation in host susceptibility towards these diseases (33, 114, 115). Investigations of individual genetic makeup have shown that variations in the expression of carcinogen metabolizing enzymes due to variants of genes encoding these enzymes, structural variations in DNA segments, mutagen sensitivity, chromosomal aberrations, DNA repair and apoptosis contribute alone or in combination to inter-individual variation in susceptibility to cancers including SCCHN (116-120). Single nucleotide polymorphisms (SNPs) are the most common form of variation in the human genome, and SNPs in key genes encoding enzymes involved in the metabolism of specific carcinogens found abundantly in tobacco smoke and alcohol have been the subject of research interest over the past two decades. These SNPs, along with risk behaviours, are the focus of manuscripts II and III of this thesis. Hence, I describe below the enzymatic pathways underlying the metabolism of tobacco and alcohol carcinogens, and the specific genes and related SNPs that could alter these pathways and contribute to individual differences in SCCHN susceptibility.

Enzymatic pathways in carcinogen metabolism

About 90% of chemical carcinogens from a variety of environmental exposures including tobacco smoke enter the human body as non‐carcinogenic pro‐carcinogens (121). They require bio‐activation into reactive molecules for further conjugation, which facilitates their elimination from the body (122, 123). The scenario is similar with constituents of alcoholic beverages. It is hypothesized that part of the susceptibility to tobacco and alcohol related cancers may be determined by inter-individual differences in the bio-activation of pro-carcinogens and detoxification of carcinogens derived from these exposures.

The bio-activation and detoxification processes are catalysed by enzymes generally known as phase I and phase II xenobiotic metabolizing enzymes (XMEs), respectively (115, 124, 125). Expressed mainly in the liver, these enzymes are also found in the mucosal lining of various organs including the upper aero-digestive tract. The phase I XMEs that activate pro-carcinogens from environmental sources (including tobacco smoke) into intermediate reactive, electrophilic metabolites belong mainly to the superfamily of cytochrome P450 (CYP) enzymes. Phase I XMEs belonging to the alcohol dehydrogenase (ADH) family oxidise ethanol to acetaldehyde. These reactive moieties (e.g., diol epoxides, arene oxides, acetaldehyde from ethanol) are genotoxic and can form DNA adducts that may cause mutations in the DNA and result in cell transformation, transcription and translation errors (Figure 1) (122). If DNA repair or cell death does not occur, these molecular changes persist and mark the earliest events in the pathway leading to tobacco and alcohol related cancers such as SCCHN. The detoxification and elimination of these reactive moieties are facilitated by the phase II XMEs glutathione S-transferase (GST) (via conjugation with nucleophilic glutathione) and aldehyde dehydrogenase (ALDH). The conjugation reaction increases the water solubility of the products of phase I biotransformation and ultimately leads to their elimination through urine and sweat (122).

Figure 1: Pathways involving tobacco carcinogens, metabolizing enzymes and SCCHN

Phase I and phase II enzymes, associated genes and SNPs

With respect to tobacco and alcohol related cancers, several enzymes belonging to the families of CYP, GST, ADH and ALDH enzymes have been studied. Some of the most widely studied are CYP1A1, CYP2E1, CYP2A6, CYP2D6, GSTM1, GSTP1, GSTT1, ADH1B, and ALDH2 (126). The specific pro-carcinogenic and carcinogenic substrates of these enzymes are provided in Table 3 (121, 127-129).

The catalytic activity of each of these enzymes is determined by the gene (DNA sequence) encoding it. For example, the CYP1A1 enzyme is encoded by the CYP1A1 gene. Alternative forms of a given gene (or variants of genes) that differ in function, resulting from variations within the nucleotide sequence in DNA at a given gene locus, are termed alleles. DNA sequence variations resulting in alleles that are common in the population (i.e., the least frequent/rare/minor allele occurs in more than 1% of the population due to natural selection or genetic drift) are known as genetic polymorphisms.
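The 1% criterion in this definition can be illustrated with a minimal sketch that estimates the minor allele frequency (MAF) from counts of individuals carrying zero, one or two copies of the variant allele at a locus. The function names and counts are hypothetical and are not data from this work.

```python
def minor_allele_frequency(n_wild_wild, n_wild_variant, n_variant_variant):
    """Minor allele frequency from counts of the three possible genotypes."""
    n_people = n_wild_wild + n_wild_variant + n_variant_variant
    if n_people == 0:
        raise ValueError("at least one genotyped individual is required")
    # Each person carries two alleles at an autosomal locus.
    total_alleles = 2 * n_people
    variant_copies = n_wild_variant + 2 * n_variant_variant
    freq_variant = variant_copies / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(freq_variant, 1.0 - freq_variant)


def is_polymorphism(maf, threshold=0.01):
    """Apply the 'minor allele in more than 1% of the population' criterion."""
    return maf > threshold


# Hypothetical counts (assumption) for 1,000 genotyped individuals.
maf = minor_allele_frequency(810, 180, 10)
print(f"MAF = {maf:.3f}, polymorphism: {is_polymorphism(maf)}")
```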

Table 3: XMEs and their substrates present in tobacco and alcohol

Enzyme    Substrates
CYP1A1    Polycyclic aromatic hydrocarbons (PAH), heterocyclic aromatic amines (HAA)
CYP2A6    NNK, N-nitrosodiethylamine (NDEA), nicotine, cotinine, ether
CYP2D6    Amines, nicotine
CYP2E1    Benzene, acrylonitrile, N-nitrosodiethylamine (NDEA), TSNA (NNK, NNN), ether, ethanol
GSTM1     Arene oxide, diol epoxide
GSTP1     Arene oxide, diol epoxide
ADH1B     Ethanol
ALDH2     Acetaldehyde (from both alcohol and cigarette smoke)

When polymorphic DNA sequences occur due to alterations at a single nucleotide base, they are termed SNPs. Based on the combination of alleles from the maternal and paternal chromosomes, three genotypes can be identified in the population. Genotypes resulting from the presence of the same allele on both chromosomes are referred to as homozygous, whereas those with a wild type allele on one chromosome and a variant allele on the other (maternal or paternal) are termed heterozygous. The homozygous wild type genotype (wild/wild) is usually associated with a functionally normal enzyme, whereas homozygous variant (variant/variant) or heterozygous (wild/variant) genotypes can result in a functionally different enzyme (e.g., a fast, slow or inactive enzyme). In summary, SNPs result in different groups of individuals (e.g., carriers: homozygous variant + heterozygous genotypes; non-carriers: homozygous wild type genotypes) with distinct traits (inter-individual variation) in a given population. These SNPs can lead to functionally different xenobiotic enzymes involved in the biotransformation of tobacco and alcohol pro-carcinogens, which, in turn, can result in differential SCCHN risks among individuals with different genotypes.

Candidate genes and SNPs associated with carcinogen metabolism and risk for SCCHN

The genes encoding phase I and phase II XMEs are highly polymorphic and various SNPs are associated with these genes (130). These SNPs can lead to enzyme products with increased, altered, decreased or no activity (129, 131). SNPs enhancing the activity of phase I CYP enzymes (e.g., CYP1A1*2A, CYP1A1*2C, CYP2E1c2) result in faster conversion of tobacco pro-carcinogens to reactive carcinogenic metabolites (132-135). Similar functional changes to ADH1B enzymes (e.g., ADH1B*2) lead to a higher conversion rate of ethanol to acetaldehyde (36, 136). Certain SNPs related to phase II XMEs (e.g., GSTP1Val) that decrease the activity of the corresponding enzymes may result in decreased detoxification and excretion of these genotoxic metabolites (137). Overall, these functional changes may result in an overload of reactive carcinogens in the human body, which can lead to an increased risk of SCCHN. Other groups of SNPs that decrease the activity of phase I XMEs (e.g., CYP2D6null) or phase II XMEs (e.g., GSTM1null) may result in a decreased production or a decreased rate of detoxification of these metabolites, respectively, resulting in differential risk for SCCHN (138, 139). Furthermore, because tobacco smoke is one of the richest sources of carcinogenic chemicals that are substrates for these enzymes, the association between these SNPs and SCCHN risk can vary depending on the level of tobacco smoking. This gene-environment interaction can result in sub-groups with differential risk for SCCHN within a population. The identification of high-risk groups can ultimately aid in targeting prevention activities. Hence, in this work, we focus on the association between several widely studied SNPs altering the functions of phase I and phase II XMEs and SCCHN risk, alone or in interaction with tobacco smoking. We also consider the ADH1B*2 SNP associated with alcohol metabolism.
A summary of the characteristics of these genetic variants is provided in Table 3 and described in the sub-sections below.

Table 3: Characteristics of SNPs involved in tobacco and alcohol metabolism


Chr: chromosome; rs number: reference SNP cluster ID, an accession number that serves as a stable and unique identifier for a SNP.

CYP1A1*2A, CYP1A1*2C and SCCHN risk

CYP1A1 is a highly active CYP enzyme involved mainly in the activation of pro-carcinogens such as polycyclic aromatic hydrocarbons (e.g., benzo[a]pyrene) and aromatic amines found in tobacco smoke, environmental pollutants and smoked food (133). The enzyme is encoded by the CYP1A1 gene found on chromosome 15. Polycyclic aromatic hydrocarbons induce expression of this gene (140). The SNPs designated as CYP1A1*2A, which was the first variant to be identified for the CYP1A1 gene, and CYP1A1*2C are two of the most widely studied polymorphisms in this gene (141). These SNPs are inherited together, resulting in a non-random association (linkage disequilibrium, LD) between them (142). The frequencies of the minor alleles (C allele for CYP1A1*2A and G allele for CYP1A1*2C) vary across ethnicities, with 5-10% reported for the C allele and 3-5% for the G allele among Caucasians (143, 144). These SNPs, which occur on the restriction sites that control enzyme activity, result in increased enzyme activity (~ 2-fold) (143, 144). Based on the hypothesis that increased enzyme activity leads to enhanced activation of pro-carcinogens to carcinogens, these SNPs are considered to increase the risk of SCCHN (142, 145-151).
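The non-random association between two SNPs is commonly summarised by the LD coefficient D and its scaled forms D' and r². The following is a minimal sketch of the standard definitions. The function name is hypothetical, and the haplotype frequency used in the example is an assumption for illustration, not an estimate for CYP1A1*2A and CYP1A1*2C.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_a, p_b: frequencies of the minor alleles at locus A and locus B.
    p_ab: frequency of the haplotype carrying both minor alleles.
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    r_squared = d * d / denom
    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max
    return d, d_prime, r_squared


# Hypothetical frequencies (assumption): minor alleles at 10% and 5%,
# found together on 4% of haplotypes.
d, d_prime, r_squared = linkage_disequilibrium(p_ab=0.04, p_a=0.10, p_b=0.05)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```

Under independence, the joint haplotype frequency would equal the product of the allele frequencies and all three measures would be zero.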

Multiple meta-analytical reviews have aimed to clarify the association between the two CYP1A1 SNPs and SCCHN risk (132, 133, 146, 152-155). An increased risk association of CYP1A1*2A (113, 133, 155) and CYP1A1*2C (148, 154) with SCCHN has been reported when combining all ethnicities. However, this association is inconsistent among Caucasians (34, 35, 133, 146, 148, 150).

Meta-analytical reviews have also considered the combined effect of these polymorphisms and smoking for the risk of SCCHN (35, 132, 133, 146, 153). In a 2003 review and pooled analysis, Hashibe et al. reported no evidence for interaction between CYP1A1*2C and smoking, and suggested that this result was due to the heterogeneous ethnicity among studies (152). However, Liu et al. (2013) reported that, compared to non-smokers who were non-carriers of the CYP1A1*2C allele (AA genotype), smokers who were carriers (AG/GG genotype) had the highest risk (approximately 2.4-fold), followed by non-carrier smokers (2-fold risk) (146). Similar associations were identified for the joint effects of CYP1A1*2A and smoking on SCCHN. Relative to non-carrier (TT genotype) non-smokers, carrier (TC/CC genotype) smokers had an approximately 3-fold increase in risk, followed by non-carrier smokers (1.78 times the risk). Overall, they reported a positive multiplicative interaction estimate of 1.5 between risky genotype categories of both SNPs and smoking. Qin et al. (2014) reported a positive interaction (1.51 times the risk) between carriers of the CYP1A1*2C variant and smoking (132). For CYP1A1*2A, He et al. (2014) reported 2.37 times the risk for SCCHN among carriers who were smokers (133). However, all these meta-analyses pooled studies on Asian and Caucasian populations. Hence, although extant research suggests that the combined effect of these SNPs and tobacco smoking intensifies the risk for SCCHN, more studies comprehensively reporting interaction results among Caucasian populations are required (156).
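Joint-effect estimates of this kind compare each genotype-by-smoking stratum with a single reference group (non-carrier non-smokers). As a minimal sketch, crude odds ratios with Woolf (log-based) 95% confidence intervals can be computed from case and control counts per stratum. The counts and function name below are hypothetical, and the cited meta-analyses used pooled and adjusted estimates rather than this crude calculation.

```python
import math


def odds_ratio_ci(cases_exp, controls_exp, cases_ref, controls_ref, z=1.96):
    """Crude odds ratio versus the reference stratum with a Woolf 95% CI."""
    odds_ratio = (cases_exp * controls_ref) / (controls_exp * cases_ref)
    se_log_or = math.sqrt(
        1 / cases_exp + 1 / controls_exp + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical (cases, controls) counts per stratum (assumption).
reference = "non-carrier, non-smoker"
strata = {
    "non-carrier, non-smoker": (50, 200),
    "carrier, non-smoker": (12, 40),
    "non-carrier, smoker": (150, 300),
    "carrier, smoker": (45, 60),
}

ref_cases, ref_controls = strata[reference]
joint_ors = {
    label: odds_ratio_ci(cases, controls, ref_cases, ref_controls)
    for label, (cases, controls) in strata.items()
    if label != reference
}

for label, (or_, lower, upper) in joint_ors.items():
    print(f"{label}: OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Using one common reference group keeps the three estimates directly comparable, which is what allows statements such as "carriers who smoke had the highest risk, followed by non-carrier smokers".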

CYP2E1c2 and SCCHN risk

CYP2E1 is involved in the metabolic activation of compounds such as benzene, acrylonitrile, N-dimethyl nitrosamines and ether from tobacco smoke. It is encoded by the CYP2E1 gene, located on chromosome 10. The gene is inducible by low dose nicotine and ethanol (157). A SNP designated as CYP2E1c2 is widely studied with regard to various tobacco related cancers (134, 158-164); its minor allele (c2 or C) has a frequency of less than 10% among Caucasians (134, 143). The allele is associated with increased enzyme activity [i.e., the c2/c2 genotype (CC genotype) has almost 10 times more carcinogen activating capacity than the c1/c1 genotype (GG genotype)] and hence is hypothesised to increase the risk for SCCHN (134, 135, 147, 163, 165-167). The most recent meta-analysis, conducted on 43 studies, suggested that carriers of the c2 allele are at increased risk for SCCHN among Asians and mixed populations, but not among Caucasians (168). Previous meta-analyses reported similar findings (134, 169). However, studies considering the combined effect of CYP2E1c2 and smoking (mainly stratum specific effects) provide conflicting results (163, 170), and there is a lack of studies comprehensively analysing the possibility of an interaction between CYP2E1c2 and various tobacco smoking levels.

GSTP1Val and SCCHN risk

Belonging to a superfamily of multi-functional Phase II XME, the GSTP1 enzyme metabolizes a large variety of substrates and is involved in the detoxification of activated carcinogenic compounds from tobacco (e.g., diol epoxides of polycyclic aromatic hydrocarbons) (165). It is encoded by the GSTP1 gene located on chromosome 11. A SNP in this gene designated as GSTP1 105Val has been studied in relation to multiple cancers including SCCHN (144, 147, 158-160, 163). In Caucasians, the frequency of the minor allele (G) is approximately 10-40% (171-173). Compared to the wild type A allele, the G allele encodes an enzyme that is 2-3 times less stable and hence less efficient in detoxifying phase I metabolites of tobacco procarcinogen metabolism (137, 152, 174). However, three meta-analyses conducted thus far have failed to identify any association between carriers of the G allele and SCCHN risk (137, 152, 175). They also did not identify any conclusive evidence supporting interaction between ever smoking and the G allele. Nevertheless, increased risk estimates for the joint effect of carrying the G allele and smoking have been reported with increasing levels of daily cigarette consumption and pack-years (34). To complicate the picture further, GSTP1Val is known to be very substrate specific and highly efficient in detoxifying the carcinogenic epoxide of benzo(a)pyrene specifically (176, 177) and indeed, a lower risk for SCCHN among Val allele carriers has been documented (178). In summary, more studies are required to comprehensively investigate joint effects and interaction between GSTP1Val and tobacco smoking.

Copy number variants

Copy number variants (CNV) or polymorphisms have been defined as DNA segments present in variable copy numbers (repeats) in comparison with a reference genome (179). These segments are 1 kilobase or larger in size (from one kilobase to several mega-bases) and include deletion, duplication, insertion, inversion or complex recombination (Figure 2) (120). These structural variants are as important as SNPs in their contribution to genome variation. Genetic variants containing 0-13 gene copies have been reported across human populations (120). CNVs in genes involved in tobacco carcinogen activation and detoxification have been identified and are reported to alter SCCHN susceptibility (120). The identification of CNVs is advantageous in estimating the risk associated with various copy numbers of a variant rather than broad categorizations such as carriers vs non-carriers of the variant. In this work, apart from the SNPs already described, we consider CNVs in two genes, one encoding a phase I (CYP2D6) and the other a phase II (GSTM1) enzyme, the null variants of which render these respective enzymes non-functional.

Figure 2: A depiction of copy number variants in the human genome, adapted from He et al., 2012

A recombination event between two genes can result in gene duplication or multiplication (n=2,3…) or gene deletion (n=0). A duplicated gene could carry mutations from the original copy (red column).

CYP2D6 non-functional (null) CNV and SCCHN risk

CYP2D6 (debrisoquine hydroxylase) is the most genetically polymorphic of the metabolic enzymes, with approximately 80 variants identified. It plays a major role in the metabolism of nearly 20-25% of clinically used drugs (180, 181) and of pro-carcinogens from tobacco (e.g., various amines, nicotine) (182). The XME is encoded by the CYP2D6 gene located on chromosome 22. The variants identified comprise SNPs, deletions and insertions, and include normal activity, reduced activity and non-functional alleles (120). There is no detectable activity for this enzyme when encoded by CYP2D6 non-functional alleles (null alleles). Approximately 6-10% of Caucasians harbour these null alleles and are termed poor metabolizers of enzyme substrates (182-185). Due to a lower activation of pro-carcinogens to carcinogens, CYP2D6 null is hypothesised to be associated with a lower risk for tobacco related cancers such as SCCHN compared with highly active functional variants. However, evidence on this association has been inconsistent (138, 186-188). CNVs exist for CYP2D6 null (120), and individuals with fewer copies of the null variant could have an increased risk for SCCHN compared to those with more copies. However, neither this hypothesis nor the interaction between CYP2D6 null CNV and tobacco in the risk for SCCHN has yet been explored.

GSTM1 CNV and SCCHN risk

Similar to GSTP1, the GSTM1 enzyme is involved in the detoxification of a variety of activated compounds from tobacco smoke with carcinogenic potential. The GSTM1 gene on chromosome 1 encodes the GST-mu enzyme (189). Among the three polymorphisms isolated for this gene, the GSTM1 null gene renders the GST-mu enzyme inactive, and individuals with this allele do not detoxify tobacco related carcinogenic compounds efficiently (190). An accumulation of such compounds that can form DNA adducts could increase the risk for cancers such as SCCHN. The null allele has a frequency of 40-60% among Caucasians (189). Multiple meta-analytical reviews support the hypothesis that GSTM1 null is associated with an increased risk of SCCHN in various ethnicities including Caucasians (113, 139, 150, 152, 191, 192). A higher risk has also been identified among smokers, suggesting an interaction between GSTM1 null and tobacco smoking (193, 194). Relative to non-smokers with normally active GSTM1 (non-null), individuals who were smokers and GSTM1 null carriers have up to 5 times greater risk for SCCHN, with the risk increasing with heavier levels of tobacco smoked (6 times for GSTM1 null + more than 20 daily cigarette consumption, 7.4 times for GSTM1 null + more than 40 pack-years of tobacco) (34). CNVs have been identified for GSTM1. Approximately 10% of Caucasians have up to 2 copies of the GSTM1 homozygous deletion (120, 195). Although studies on SCCHN (primary tumors, second primary tumors and recurrent tumors), bladder and prostate cancer documented no risk associated with one copy of GSTM1, the presence of at least 2 copies of GSTM1 was associated with a lower risk for these outcomes compared to GSTM1 homozygous deletion (196-199). The interaction between CNVs for GSTM1 null and tobacco smoking has yet to be reported comprehensively among Caucasians.

SNPs associated with tobacco and alcohol risk behaviours and risk for SCCHN

The genetic variants discussed so far are hypothesised to be associated with SCCHN risk independently or in interaction with smoking. However, there are SNPs that not only have the potential to interact with risk behaviours, but are documented to affect tobacco and alcohol risk behaviours. CYP2A6*2 and ADH1B*2 are two such variants that influence tobacco and alcohol consumption behaviours respectively. These variants are the focus of manuscript III and are described in the sub-sections below.

CYP2A6*2, intensity of smoking and SCCHN risk

CYP2A6*2 and nicotine metabolism

Tobacco smoking is a complex behaviour influenced by social, environmental, psychological and genetic risk factors (200-202). The various phases identified in the continuum of this behaviour include the preparatory stage, initial trying (initiation), repeated irregular/sporadic use (experimentation), regular use, nicotine dependence/ addiction, cessation and relapse (200, 203). Following initiation, this rewarding behaviour is strongly determined by the addictive agent in tobacco called nicotine (50). Within 10-20 seconds of its inhalation, nicotine reaches the brain and starts exerting its psychoactive effects (204). However, nicotine has a short half-life (8 minutes on average) as it is rapidly inactivated and removed from the body, lowering its levels in plasma and tissues (204). Hence, to attain and maintain optimal levels of nicotine in the brain, the individual has to smoke again. Thus, factors affecting the metabolism of nicotine may influence various phases of smoking behaviour.
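
The arithmetic behind this argument can be sketched in a few lines of Python. The sketch takes the 8-minute half-life quoted above at face value and assumes simple first-order (exponential) decay, which is a deliberate simplification of nicotine pharmacokinetics; the function name and default value are illustrative only.

```python
def fraction_remaining(minutes_elapsed: float, half_life_minutes: float = 8.0) -> float:
    """Fraction of an initial nicotine level left after a given time.

    Assumes first-order decay with a single half-life (a simplification).
    """
    if minutes_elapsed < 0 or half_life_minutes <= 0:
        raise ValueError("elapsed time must be >= 0 and half-life must be > 0")
    return 0.5 ** (minutes_elapsed / half_life_minutes)


# Under these assumptions, the level halves every 8 minutes.
for minutes in (0, 8, 16, 32):
    print(minutes, fraction_remaining(minutes))
```

Under these assumptions the level falls to one quarter of its starting value within 16 minutes, which illustrates why a slower rate of inactivation (as hypothesised for slow metabolizers) could lengthen the interval between cigarettes.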

Approximately 70-80% of the nicotine entering the body is metabolized/inactivated into cotinine through a 2-step process (204, 205): nicotine is first converted to nicotine iminium ion, which is later oxidized into cotinine. The first part of the process is the rate limiting step and is catalyzed by the phase I CYP2A6 enzyme, mainly in the liver. Overall, 80-90% of the inactivation of nicotine to cotinine is catalyzed by the CYP2A6 enzyme encoded by the CYP2A6 gene on chromosome 19 (205-207). Although several SNPs have been identified in this gene, only a few have been functionally characterised as capable of altering enzyme activity (207-209). Based on their activity, carriers of the functional SNPs have been grouped as slow nicotine metabolizers (individuals hypothesised to smoke less), intermediary metabolizers (individuals hypothesised to be moderate smokers) and normal metabolizers (individuals hypothesised to smoke heavily) (208). Of these genetic variants, the first to be characterised and one of the most widely studied is CYP2A6*2 (206), which is classified as a slow-metabolizer variant. The homozygous variant (AA) and heterozygosity (AT) of this allele result in complete and partial inactivity of the CYP2A6 enzyme, respectively (202, 210). Consequently, relative to the homozygous wild type (TT genotype), smokers who are carriers of the variant (AA or AT genotypes) exhibit higher plasma nicotine levels for a given amount of nicotine ingested (due to a lower conversion rate of nicotine to cotinine). Based on this mechanism, the CYP2A6*2 allele was hypothesised to have an inverse association with smoking behaviour (e.g., number of cigarettes smoked per day, nicotine dependence).

Association of CYP2A6*2 with cigarettes smoked per day

There is strong evidence for the association between the CYP2A6*2 allele and number of cigarettes smoked per day among Caucasian adult smokers. Inter-ethnic variation has been reported in the frequency distribution of the CYP2A6*2 allele. Although it is rarer (0-0.7%) in Chinese, Korean and Japanese populations, its frequency ranges from 1% to 3% among Canadian, American and European Caucasians (206). Several (211-215) but not all (216-218) studies looking into the association between CYP2A6*2 and smoking behaviour among Caucasian adult smokers reported that the CYP2A6*2 allele (AT/AA genotype) protected smokers against becoming nicotine dependent and that carriers smoked fewer cigarettes per day relative to non-carriers (TT genotype). A meta-analysis including observational studies published between 1998 and 2004 documented no overall association between the CYP2A6 gene (multiple variants) and smoking behaviour (202). However, the majority of the studies included in the review used broad definitions of smoking (e.g., ever/never/current/former smoker), which may have led to misclassification of the outcome and obscured significant differences between the groups. Given the existence of a well demonstrated biological mechanism connecting the gene, nicotine metabolism and smoking behaviour, the authors attributed their results mainly to a lack of methodological rigour in the studies investigated and emphasised the importance of specifically defining the smoking variable (202). Another meta-analysis in the same year provided evidence that smokers who were carriers of at least one CYP2A6*2 allele smoked significantly fewer cigarettes per day, and also had higher chances of quitting smoking (214). The first ever study on CYP2A6 poor metabolizers conducted among Canadian Caucasians in 1998 (211) was reanalysed by Rao et al. (212) using stringent analytical methods.
These included: a) a new genotyping method which, unlike that of the original study, removed the chances of CYP2A6*2 false-positives (219), b) a precise definition of the smoking outcome through multiple indices, and c) proper control for population stratification (confounding by variation in ethnicity) by the restriction of participants to Caucasian smokers who had at least 3 grand-parents of Caucasian ethnicity. This study reported that among smokers, relative to non-carriers of the CYP2A6*2 allele (TT genotype), carriers (AT or AA) smoked fewer cigarettes per day [(13.5 vs 19.5, P<0.03) overall, and at times of heavy smoking (19 vs 29, P<0.001)], had lower breath carbon monoxide levels and lower cotinine levels. A similar study conducted among another North American population reported that among dependent smokers, slow metabolizers (which included carriers of at least one *2 allele) smoked 7 fewer cigarettes per day on average relative to non-carriers (21.3 vs 28.3 cigarettes per day) (213). Also, among Caucasians who smoked at least 10 cigarettes per day, those who were slow metabolizers had significantly lower mean and overall puff volume compared to normal or intermediary metabolizers (220).

A genome-wide meta-analysis conducted in 2010, which analysed 710 SNPs on chromosomes 15, 19 and 8 among adult participants of European ancestry, documented a strong association between the CYP2A6*2 allele and number of cigarettes smoked per day (221). A recent meta-analysis on slow metabolizers of CYP2A6 also reported similar findings (215).

Overall, findings from observational studies and meta-analyses reported thus far indicate that, due to the involvement of the allele in nicotine metabolism, smokers who are homozygous (AA) or heterozygous (AT) for the CYP2A6*2 allele smoke with less intensity (cigarettes per day) relative to homozygous non-carriers (TT).

CYP2A6*2 and SCCHN risk

Based on their involvement in the activation of tobacco pro-carcinogens to carcinogens, CYP2A6 genetic variants have been implicated in the risk for SCCHN. However, the literature on this association is sparse. Three studies have investigated the role of CYP2A6*4 in the risk for tobacco-related cancers (39, 163, 222). This SNP has a lower frequency among Caucasians (0.5-1%) compared to CYP2A6*2 (206). However, similar to CYP2A6*2, CYP2A6*4 renders the enzyme inactive, resulting in decreased bio-activation of substrates such as nicotinate and NNK, NNN and NDEA pro-carcinogens found in tobacco (206). Carriers of the CYP2A6*4 allele have been associated with a significantly lower risk for tobacco related cancers including those of the upper aerodigestive tract. Furthermore, this variant is suggested to affect cancer risk solely in smokers (39, 223). Based on similarities with CYP2A6*4, it can be hypothesised that smokers who are carriers of the CYP2A6*2 allele (AT or AA) are at a lower risk for SCCHN. However, no studies have yet investigated the role of CYP2A6*2 in SCCHN risk nor its interaction with smoking.

ADH1B*2, alcohol consumption and SCCHN risk

Alcohol consumption patterns are influenced by social, environmental, psychological and genetic factors with inter- and intra-ethnic variability (224-227). Much of the inter-individual variability in alcohol use is attributable to factors underlying the metabolism of ethanol (226-228). Ethanol entering the human body is first metabolized into acetaldehyde and later to acetate before being removed from the body. The oxidation of ethanol to acetaldehyde is catalyzed by alcohol dehydrogenase (ADH) and its isoenzymes, mainly in the liver. They are also expressed in the stomach, gut and upper aerodigestive tract in detectable quantities. Similar to nicotine metabolism, the inter-individual variability in ethanol to acetaldehyde metabolism is mostly attributed to genetic polymorphisms in the ADH genes encoding ADH enzymes. Of these, SNPs related to the ADH1B and ADH1C iso-enzymes of ADH, namely ADH1B*2 and ADH1C*1, are two of the most functionally polymorphic and well characterised variants in adults. These SNPs are not only associated with alcohol consumption behaviour, but also with altered risk for SCCHN among alcohol consumers in various ethnicities (229). Although they seem to be in linkage disequilibrium, studies in both Caucasian and Asian populations suggest that ADH1B*2 has a significant effect on the risk of SCCHN after adjustment for ADH1C*1 (230, 231). Also, among multiple ADH SNPs studied, ADH1B*2 has the strongest association with alcohol consumption behaviour and SCCHN (136, 228, 229). Hence, we will be focusing on the role of ADH1B*2 in relation to both SCCHN risk and alcohol consumption behaviour.

Association between ADH1B*2 and alcohol metabolism

The role of ADH1B*2 in alcohol consumption behaviour has been widely investigated (232). The frequency of this allele varies in different ethnicities [Asian: 69% (range: 19%-91%), European: 5.5% (range: 1%-43%), Mexican: 3% (range: 2%-7%)] (232). The homozygous variant (AA genotype) and heterozygosity (AG genotype) of this allele result in an ADH enzyme that rapidly oxidizes ethanol to acetaldehyde (an increase in activity of up to 50-100-fold has been reported) (36, 224, 227, 228). Carriers of this allele (AA or AG genotype) are at decreased risk of alcohol dependence compared to non-carriers (GG genotype). This is hypothesised to be due to the prompt build-up of acetaldehyde (resulting from the rapid oxidation of ethanol), which leads to negative physiological reactions termed alcohol-induced flushing, characterised by cutaneous flushing, increased skin temperature, decreased blood pressure, tachycardia, dizziness, anxiety, nausea, headache and generalised weakness (233). These aversive reactions lead to decreased alcohol consumption.

Association of ADH1B*2 with alcohol consumption behaviour

The association between ADH1B*2 and alcohol consumption behaviour was first investigated among East Asians (234-237), and later among Europeans and other ethnicities (230, 238, 239). Among East Asians, this allele decreases the risk of alcohol dependence by about 80% relative to non-carriers (236, 238). A study on 4,597 Australian twins (3 studies combined) reported that non-carriers of the ADH1B*2 allele (GG genotype) had fewer negative reactions post alcohol consumption (p=8.2×10^-7), consumed a higher number of drinks per day (p=2.7×10^-6) and had a greater overall cumulative alcohol consumption (p=8.9×10^-8) relative to carriers (228). On average, participants with GG, GA and AA genotypes consumed 5.1, 4.1 and 1.9 drinks per day, respectively. A recent meta-analysis (2,298 alcohol-dependent cases and 3,334 non-dependent controls) documented that the ADH1B*2 allele was associated with a significant reduction (by 66%) in alcohol dependence and in the number of drinks per day among European-Americans. A meta-analysis of all studies published between 1990 and 2011 also reported robust associations (232). Overall, the accumulated evidence is consistent with the hypothesis that an elevation in acetaldehyde leads to an increased sensitivity to alcohol among ADH1B*2 carriers, reducing the likelihood of alcohol dependence and the number of drinks per day among Caucasian adults.

ADH1B*2 and SCCHN risk

ADH1B*2 has been strongly implicated in the risk for upper aerodigestive tract cancers among various ethnicities. Acetaldehyde, the initial metabolite of ethanol, has been suggested to exert multiple mutagenic and carcinogenic effects, qualifying alcohol as an initiator of the cancer pathway (96-99). Hence, it was hypothesised that fast metabolizers of ethanol (GA or AA genotype) have a higher exposure to acetaldehyde, increasing the risk for SCCHN (36). However, contrary to this hypothesis, the first reported study investigating this association (among Japanese alcoholics) reported an increased risk for SCCHN among the GG genotype relative to the GA or AA genotype (240). Brennan et al. attributed this to residual confounding by alcohol consumption (36). However, studies since then have consistently shown a decreased risk (up to a 50% reduction) for SCCHN among carriers of the GA or AA genotype (136, 229, 241). No association was identified among never-drinkers (229, 242) and the protective effect was significant at higher levels of alcohol consumption. These reports hypothesize alternative mechanisms of carcinogenesis.

The combined effect of the ADH1B*2 allele and alcohol consumption has also been investigated. A joint effects analysis conducted among the Japanese population reported that when compared to non-drinkers who were AA or GA/AA genotype carriers, GG genotype carriers who were drinkers were at significantly increased risk for the disease. The effect was more pronounced among heavy drinkers (9-26 times higher risk) (231, 243). A Korean study documented a higher risk for the GG genotype compared to the AA genotype within the moderate and heavy drinker strata of alcohol consumption (244). Two recent studies among Caucasians did not document any interaction between ADH1B*2 and alcohol consumption levels (241, 245). However, large European studies, which documented a significantly lower risk among carriers of this allele (GA/AA genotype) within the strata of medium and heavy drinkers, do indicate a possibility of negative interaction on an additive or multiplicative scale within this ethnicity (136, 229). Studies among Asian and Caucasian populations have consistently documented no altered risk among never-drinkers who were either carriers or non-carriers of the ADH1B*2 allele. Overall, studies investigating both main effects and stratum specific effects indicate the possibility of interaction between ADH1B*2 and measures of alcohol consumption.

Hypothesis underlying the association between ADH1B*2 and risk for SCCHN

Multiple potential pathways (not mutually exclusive) underlying the association between ADH1B*2 and SCCHN among alcohol consumers have been proposed. Most of them are based on a direct carcinogenic action of acetaldehyde. Hashibe et al. reasoned that the fast metabolism of ethanol (among GA/AA genotypes) leading to increased acetaldehyde exposure may initiate alternative mechanisms to clear the peak of acetaldehyde. However, such mechanisms may not be activated among GG genotype carriers who have a moderate initial metabolism, leading to acetaldehyde build up, which in turn increases the risk for cancer (136). In addition, compared to ADH enzymes, the expression of acetaldehyde dehydrogenase (ALDH2) enzymes, which are primarily responsible for degrading acetaldehyde to acetate, is extremely weak in the upper aerodigestive tract (246). The resulting inefficient degradation of acetaldehyde may also contribute to additional acetaldehyde exposure among the GG genotype, especially among those consuming moderate to high levels of alcohol (231). Furthermore, apart from ADH enzymes, certain oral microflora can also convert ethanol to acetaldehyde (247-249). Following alcohol consumption, higher levels of acetaldehyde have been found in saliva relative to other parts of the body (especially in individuals with poor oral hygiene) (99, 250, 251). This oral microflora-salivary acetaldehyde pathway can contribute to peak acetaldehyde concentrations among the GG genotype (231, 252). Another hypothesis independent of the acetaldehyde pathway is that the fast metabolism of ethanol may result in lower local exposure (136, 229). Hence, alcohol may not be able to exert its promoter effect (aiding dissolution of other carcinogens), conferring protection against neoplastic changes in the head and neck region among GA/AA genotypes.

To summarise, although most SNPs described above are associated with SCCHN based on their involvement in the bio-activation of tobacco related pro-carcinogens and detoxification of carcinogenic metabolites, a comprehensive characterisation of their interaction with different levels of smoking, incorporating all aspects such as interaction on both multiplicative and additive scales, joint effects and stratum specific risks, has not been reported (156). Furthermore, since CYP2A6*2 and ADH1B*2 affect specific measures of tobacco and alcohol consumption behaviours respectively, these behaviours may not only interact with but also mediate the causal pathways between these SNPs and SCCHN risk. These pathways have not been elucidated yet.
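
To make the quantities named in this summary concrete, the following Python sketch derives interaction on the multiplicative and additive scales, together with stratum specific effects, from a set of joint effect odds ratios. It is a minimal illustration rather than the analysis used in any cited study: the class and function names are hypothetical, the odds ratios are invented for the example, and the additive measure shown (the relative excess risk due to interaction, RERI) treats odds ratios as approximations of risk ratios, which assumes a rare outcome.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class JointEffects:
    """Odds ratios relative to the doubly unexposed group (non-carrier, non-smoker)."""
    or_gene_only: float     # carriers who do not smoke
    or_smoking_only: float  # non-carriers who smoke
    or_both: float          # carriers who smoke


def multiplicative_interaction(j: JointEffects) -> float:
    """Joint OR divided by the product of the single-exposure ORs (1 = none)."""
    return j.or_both / (j.or_gene_only * j.or_smoking_only)


def reri(j: JointEffects) -> float:
    """Relative excess risk due to interaction, additive scale (0 = none)."""
    return j.or_both - j.or_gene_only - j.or_smoking_only + 1.0


def stratum_specific_gene_effects(j: JointEffects) -> Dict[str, float]:
    """OR for carrying the variant within each smoking stratum."""
    return {
        "non_smokers": j.or_gene_only,
        "smokers": j.or_both / j.or_smoking_only,
    }


# Hypothetical odds ratios for illustration only; not taken from any cited study.
example = JointEffects(or_gene_only=1.2, or_smoking_only=2.0, or_both=3.6)
print(round(multiplicative_interaction(example), 2))
print(round(reri(example), 2))
print(stratum_specific_gene_effects(example))
```

With these invented values the joint effect exceeds both the product and the sum of the single-exposure effects, so the interaction is positive on both scales; in practice the two scales can disagree, which is one reason for reporting both.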

2.3.3        Human papillomavirus (HPV)

In the past decade, HPV infection has emerged as a strong risk factor for SCCHN. A trend of decreasing incidence of oral cavity cancers (consistent with a decrease in tobacco use), and an increase in the incidence of oropharyngeal cancers (tonsils, base of tongue) have been documented in many developed countries, especially among men (8, 26, 27, 253, 254). The increased incidence of oropharyngeal cancers has been attributed to HPV infection. This infection has been detected in approximately 25% of SCCHN cases worldwide (255). The majority of HPV-positive SCCHN are oropharyngeal cancers. This virus is transmitted through skin-to-skin and skin-to-mucosa contact. Hence, unprotected sexual behaviours, notably oral sex, have been identified as routes of HPV transmission with respect to anogenital cancers and SCCHN. More than 100 sub-types of HPV have been identified, among which HPV 16, 18, 31, 33 and 35 have been classified as high-risk sub-types in relation to cancer. More than two-thirds of HPV-positive SCCHN have been attributed to HPV-16 infection. Results from a 2006 meta-analysis show that the association between HPV-16 and SCCHN was strongest for tonsillar (15-fold), followed by oropharyngeal (4-fold), and oral and laryngeal cancers (2-fold) (256). A recent prospective cohort study (2016) conducted in the USA reported an up to 7-fold increase in risk associated with HPV-16 for incident SCCHN cases, with a positive association only for oropharyngeal cancers (257). The researchers also reported that HPV-16 infection preceded SCCHN incidence. HPV-positive SCCHN are clinically distinct from HPV-negative cases and are associated with better survival (three-year survival of 84% vs. 57%, respectively) (258). Based on recent trends in the incidence of oral cavity and oropharyngeal cancers, the existence of two distinct SCCHN risk groups (tobacco and alcohol related, and HPV related) has been suggested.
However, a large study from IARC reported that relative to HPV-negative/non-smokers, HPV-positive/smokers had the greatest risk for both oral cavity and oropharyngeal cancers, greater than HPV-positive/non-smokers or HPV-negative/smokers (259). Evidence from other studies also indicates interaction between risk behaviours and HPV status in the risk for SCCHN (260-262).

2.3.4        Socioeconomic position (SEP)

Similar to genetic factors, socioeconomic position (SEP) is a well-documented distal determinant of health outcomes including SCCHN (263-274). In addition, behavioural risk factors such as tobacco use and alcohol consumption are socially patterned (275-281). Hence, whether they are the primary focus or not, it is essential to consider measures of SEP in most epidemiologic studies. In this thesis, different measures of SEP are used, either as the main exposure (manuscript I) or as an important confounder between exposures and the outcome of SCCHN. Therefore, in the following sub-sections I present an overview of the complex construct of SEP, various methods to measure this exposure and their association with SCCHN risk.

Definition of SEP

The term ‘socioeconomic position’ refers to the economic and social well-being of a person assessed through components such as occupation, income, wealth, education and social status. Krieger (1997) defines SEP as an aggregate concept that includes both resource based (income, wealth, education) and prestige based (individuals’ rank or status in the social hierarchy, evaluated with reference to people’s access to and consumption of goods, services and knowledge) measures that are linked to both childhood and adult social class position (282).

Indicators of SEP

Based on theory (indicating social class, status or position), correlations with health outcomes, suitability for particular societies and availability of data across the life course, observational studies use various indicators in an attempt to measure SEP. Commonly used measures of SEP are addressed below.

Asset/wealth index

An asset or wealth index is a measure of the material endowment of an individual or household. It is considered an acceptably reliable proxy for consumption and thus SEP, particularly in low to middle income societies (283, 284). The wealth index is calculated using readily-observable household characteristics such as durable assets and household amenities (e.g., car, refrigerator, television, bicycle, livestock, radio, sewing machine), housing characteristics or conditions (household floor, roof and wall material, toilet facilities, water supply), access to services (e.g., electricity supply, drinking water sources), and housing tenure (status of house, land or farm ownership) (284-287). It has been stated that the asset index was developed based on availability and convenience, especially in more agrarian societies, and not on a plausible direct causal relationship between wealth or asset possession and health (284). There is also an argument that the index is unlikely to capture the broad concept of SEP (288). However, poor housing is associated with a wide range of health conditions (289). Indicators such as overcrowding in houses have been associated with sanitation and the spread of infections. Moreover, health and mortality are sensitive to fine gradations in neo-material conditions such as access to cars, home ownership, presence of a home garden and healthier food (290, 291). Furthermore, housing tenure, conditions, assets and amenities reflect an individual’s educational and occupational status and income (284). The wealth index gained popularity through its use in Demographic and Health Surveys (DHS) data sets to quantify and compare socioeconomic inequalities across approximately 35 countries, most of which were low and middle income countries (283, 292). This measure was utilized because of a lack of reliable data on income and expenditures.
Also, household assets are resistant to change in response to short-term economic shocks, which are a feature of low and middle income settings. Based on its slower response to economic shocks, it is also argued that the wealth index captures long term stable aspects of economic status (288, 293). Unlike other indicators such as education and current income, information on components of the wealth index is available across life, and hence the index is an SEP measure available at multiple periods of life.
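
A minimal Python sketch of how such an index might be computed from household indicators is given below. The indicator names and the default equal weighting are assumptions made purely for illustration; published wealth indices, including those used with DHS data, typically derive item weights statistically (for example by principal component analysis) rather than weighting every item equally.

```python
from typing import Dict, Optional

# Hypothetical indicator names, loosely following the asset types listed above.
INDICATORS = [
    "car", "refrigerator", "television", "bicycle",
    "radio", "electricity", "piped_water", "owns_house",
]


def wealth_index(household: Dict[str, bool],
                 weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted share of indicators present in the household, scaled to 0-1."""
    if weights is None:
        # Assumption: every indicator counts equally.
        weights = {name: 1.0 for name in INDICATORS}
    total = sum(weights.values())
    # Indicators missing from the household record are treated as absent.
    score = sum(w for name, w in weights.items() if household.get(name, False))
    return score / total


# Invented household record for illustration.
household = {"television": True, "bicycle": True, "radio": True, "electricity": True}
print(wealth_index(household))
```

Because the components are simple presence/absence items, the same function can be applied to recalled household circumstances at different periods of life, which is the property emphasised above.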


Education

Education is one of the most widely used individual-level measures of SEP. Education marks the transition from childhood to adolescence or early adulthood and indicates an individual’s independence from parental care (294). An individual’s educational attainment could determine that individual’s health through its influence on decision-making skills, awareness about opportunities, general awareness and interactions with people, access to information and health care, choices of lifestyle behaviours, job and income levels, housing conditions, status in society and stress coping mechanisms (31, 295). Relative to other measures of SEP such as income and occupation, education is easier to measure, can be assessed in people who are not in active labour, is equally available to both sexes especially in developed countries, has a high response rate with the exclusion of only a few members of the population and is less subject to negative adult health selection. Together, these attributes make education a useful and important measure of SEP (295-297). However, education is usually acquired early in life and is stable after early adulthood, and thus represents SEP only during a short window of the life course (285, 295). Commonly used markers of education include number of years of formal education and highest level of education attained in life (285, 294). However, the analysis of these markers can be complicated. The number of years of education does not convey any information regarding the quality of the education and its social and economic value. Furthermore, the meaning of a particular level of education and number of years of education are not the same everywhere, and are related to age and birth cohort, social class position, race/ethnicity and cultural norms (282). For example, significant social and educational reforms took place in the state of Kerala in India in the mid-1900s (298).
Until that time, a feudalistic system existed for land ownership, wealth, access to education and privileges. Education was considered the privilege of people of the higher caste (hierarchy in the Hindu religion based on occupation) and Syrian Christians, whereas people from the backward caste and most females were denied formal education (298). Completing four years of education was a high educational attainment. However, political movements since the Indian independence (1947), especially in the late 1950s, resulted in free and compulsory education until 14 years of age (8 years of education), and education was given a higher importance in the society (299). This educational reform played an important role in lifting people out of poverty by providing the means for upward social mobility. Such features specific to societies and birth cohorts must be considered when using and analysing markers of education as measures of SEP.

Occupation and income

Occupation and income are commonly used measures of SEP. Occupational status is a direct measure of social class in most societies and is the major structural link between education and income (294). Income is a direct indicator of SEP and is the result of an individual’s occupation (300). Occupation plays an important role in positioning an individual within the social structure that directly controls access to resources, interaction with peers, exposure to job related environments and physical exposures, psychological risks and risk behaviours such as tobacco and alcohol consumption (295). Income levels impact health outcomes by influencing the material circumstances of an individual such as quality, type and location of housing, food, clothing, health care, transportation, opportunities for cultural, recreational and physical activities, child care and exposure to various toxins (294). Overall, these features make occupation and income suitable measures of SEP in health research. However, occupation and income can be difficult to measure with precision, especially in low and middle income societies (268, 283, 284, 293). This can be attributed to features such as higher non-response rate, missing information on people who are not part of active labour (e.g., home makers) and fluctuations with short term economic shocks (293, 295). Furthermore, most occupational classifications have been developed and validated on working men (295). These factors pose a challenge when using occupation and income as measures of SEP.

Association of SEP with risk for SCCHN

As demonstrated with health outcomes such as cardiovascular diseases, mortality, allostatic load, multiple cancers and oral health conditions, cumulative disadvantageous SEP over the life course has been associated with increased risk for SCCHN, independent of behavioural risk factors (272, 290, 301-305). A large meta-analytical review by Conway et al (2008) on case-control studies that included 24 and 17 studies from high and low income countries, respectively, examined the association between three measures of SEP (income, occupation and education) and oral cancer risk (267). Participants with low educational attainment, low occupational status and low income had 1.85, 1.84 and 2.41 times the risk, respectively, of developing oral cancer relative to their higher SEP counterparts. In addition, disadvantageous SEP was independently associated with increased oral cancer risk in high and low income countries across the world. Most (269, 306-308) but not all studies (309) conducted subsequently in developed and developing countries have shown that a disadvantageous SEP is independently associated with an increased risk of SCCHN.

2.4       Complex exposures – Need for comprehensive conceptual and analytical frameworks

Genetic exposures such as SNPs are fixed at birth and are well defined. By contrast, exposures such as behavioural risk factors and SEP have a complex dynamic nature. An individual’s SEP may not remain the same from childhood to early to late adulthood stages of their life (276, 285, 310). The situation is similar for behavioural risk factors such as tobacco and alcohol habits, as individuals’ behavioural patterns can vary (e.g., frequency, duration, type of tobacco or beverage) over the course of life (311). Thus, these exposures are time-varying. Capturing the dynamic nature of these exposures within an epidemiologic study and addressing it in the analysis is challenging. The challenge is compounded by the bi-directional associations within these exposures at multiple time periods, and between these variables and the health outcome. For example, SEP is considered to affect risk behaviours. However, such behaviours (e.g., alcohol consumption) have also been considered as determinants of socioeconomic consequences, especially in developing societies (312). In addition, these risk behaviours are highly correlated. Hence, SEP in an earlier period of life, for example childhood, may affect risk behaviours in adolescence and early adult life, which can in turn affect social conditions in subsequent late adult life. In short, this time-varying nature produces a complex feedback loop between these variables acting as multiple confounders and mediators in the causal pathways to the health outcome (313). A further concern is the possibility of reverse causality. Based on the social causation perspective, an individual’s SEP components can influence their health positively or negatively. For example, following a low educational attainment, one could get a job that exposes them to chemicals and physical hazards including carcinogens, physical and psychological stress, noise, heat, cold, unsafe conditions, and dust, among others. 
These exposures lead to an increased risk of disease. The same person could also face unemployment, which increases the risk of depression, anxiety and disability, and may lead to unhealthy coping practices (e.g., cigarette smoking and alcohol consumption). In contrast, based on the selection hypothesis, healthy people may obtain and retain their occupational status. These bidirectional associations make it imperative to collect repeated data on these exposures at multiple time points and to assess their temporal relationship with the health outcome. Addressing these issues requires a comprehensive theoretical study framework, a study design that is appropriate for the health outcome being investigated, and a suitable analytical framework with associated techniques. In this thesis, I used the conceptual framework of life course epidemiology; a case-control study design, which is advantageous for studying rare disease outcomes such as SCCHN; a counterfactual causal inference analytical framework to incorporate repeated measures of exposures and estimate causal effects of exposures on the outcome; and causal diagrams. A brief overview of these elements of my thesis is presented in the subsections below.

2.4.1        Life course epidemiology – Definition and origin

Kuh and Ben-Shlomo define life course epidemiology as “the study of long-term effects on later health or disease risk of physical or social exposures during gestation, childhood, adolescence, young adulthood and later adult life” (314). Research in the 1950s by Sir Richard Doll and colleagues suggested that smoking was a strong risk factor for lung cancer (and concomitantly for laryngeal, oesophageal and bladder cancers). This marked a paradigm shift in risk factor research: the focus of chronic disease investigations shifted to an adult lifestyle approach where multiple adult life exposures were implicated in the risk for later life health outcomes (315). However, Forsdahl (1977) documented a strong correlation between infant mortality rates and mortality in middle age for the same generation in specific counties in Norway (316). Similar results linking early life events to adult health outcomes were documented in ecological studies conducted in the USA and Britain, and historical cohort studies (e.g., British birth cohorts) during the following 15 years (317-321). These observations gave rise to the concept of biological programming based on the fetal origins hypothesis. According to this hypothesis, “environmental exposures such as undernutrition during critical periods of growth and development in utero may have long term effects on adult chronic disease risk by ‘programming’ the structure or function of organs, tissues, or body systems” (319). In combination, the above observations supported the importance of biological, behavioural, and psychosocial processes that may operate throughout an individual’s life course, or across generations, to influence disease risk, rather than just an adult lifestyle approach to chronic diseases (322).
This research became the foundation for the conceptual framework of life-course epidemiology, conceived in the late 1990s, which gives importance to time (duration) and timing of biological, behavioural and social exposures that may act independently, cumulatively or interactively to influence disease risk (314, 323).

2.4.2        Models under the life course epidemiology framework

The main aim of the life course epidemiology framework is to elucidate pathways linking exposures across the life course to later life health outcomes. To achieve this objective, various theoretical models linking exposures to health outcomes have been proposed. They are described below.

Accumulation model

The accumulation model is considered the most fundamental of all life-course models and gives importance to time (duration) of exposures (324). The model proposes that exposures clustered at different periods of life may accumulate longitudinally over the course of life, leading to differential risk for chronic disease outcomes (323). This concept is in line with the notion of allostatic load, which is the wear and tear on biological systems resulting from chronic overactivity or inactivity of normal physiological systems in response to increased exposures (in number and/or duration) from the external environment (323, 325). Indeed, Kuh et al. (1997) describe an individual’s biological resources accumulated over the life course as their ‘health capital’, which describes and influences current and future health (314). Ben-Shlomo and Kuh (2002) propose that risk can accumulate with independent and uncorrelated insults (no interaction between exposures), or with correlated insults (e.g., SEP, smoking, alcohol) that cluster together leading to a health outcome, or with similar insults (disadvantageous SEP at different life stages) that form a chain leading to the outcome (323).

Critical period model

Stemming directly from the concept of biological programming and the fetal origins hypothesis, the critical period model gives importance to the timing of exposures. In its strict sense, the critical period model posits that exposures during specific periods of life can cause irreversible biological damage and have a long-lasting effect on biological systems, irrespective of exposures in prior or later periods of life (323). The sensitive period model is a variation of the critical period model which recognizes that although periods with a higher sensitivity to the effects of an exposure may exist, the effects can be modified or even reversed with prior or later exposure profiles (322).

Mobility or pathways model

The mobility or pathways model is considered to be a variation of the accumulation model and is mostly examined in studies with SEP (326). It focuses on the cumulative effect of exposures along life trajectories and implicates differential exposure throughout the life course in adult disease causation. This model implies the interaction of exposures at multiple periods of life (e.g., SEP in childhood, early and late adulthood). Different hypotheses proposed within the pathways model posit different health effects. For example, under the natural health selection hypothesis, less healthy individuals tend to experience downward mobility (moving from an advantageous to a disadvantageous SEP) and healthier individuals tend to experience upward mobility (moving from a disadvantageous to an advantageous SEP) (327, 328). These mobile groups are separated from the individuals who do not show any mobility across life periods, as both groups are considered to have distinct traits that make them mobile or non-mobile. In contrast, under a gradient/health constraint hypothesis, mobile groups (either upward or downward mobility between different time periods) possess health traits of both the SEP group they leave and the one they join, thus minimizing the health difference between the SEP groups (327-329). The risk associated with mobile groups will be intermediate between those of the two non-mobile groups (greater than that of the group with advantageous SEP in all time periods, and lower than that of the group with disadvantageous SEP at all time points). Interestingly, an elevated risk for a health outcome (e.g., cardiovascular mortality) has been documented among individuals who experience deprivation in early life, followed by later life affluence (316). Forsdahl (1977) hypothesized that this was partly due to risky exposures associated with an affluent lifestyle (e.g., elevation in adult cholesterol levels) (316).

Life course epidemiology allows considerable overlap between the models specified above. Hence, the models are not mutually exclusive and are empirically difficult to disentangle (330). For example, under a social mobility model, a disadvantageous SEP in childhood can interact with an advantageous or disadvantageous SEP in early adulthood to confer a particular risk for SCCHN. However, this is also a chain of risk as described under the accumulation model. Furthermore, the critical period model with effect modification in prior or later periods (322), or sensitive period model, is reflected in the various interactions of the exposure that are possible under the social mobility concept.

2.4.3        Suitability of the life course framework to study social, genetic and behavioural risk factors

The life course framework is particularly well suited for this work exploring genetic, behavioural and social risk factors of SCCHN, as the multiple ways in which exposures can lead to the cancer outcome can be encompassed within this framework. For example, the time-dependent aspect of SEP and associated behavioural risk factors can be effectively captured under this framework and tested under the accumulation, critical period and social mobility models. Familial risk factors such as SNPs are already fixed and exert an effect throughout life, which can be visualized under an accumulation model. For example, SNPs such as CYP1A1*2A and CYP2E1c2 increase the risk of SCCHN among Asians independent of smoking. This could be an example of an independent insult causing the health outcome, as explained under the accumulation model. However, the effect of these SNPs on the risk of SCCHN among Caucasians might be present only in the presence of heavy smoking (interaction). Similarly, SNPs such as ADH1B*2 and CYP2A6*2 can interact with alcohol and smoking. They also affect alcohol consumption and smoking behaviours themselves. Hence, the effect of these SNPs on the risk of SCCHN can be partly through these risk behaviours, which is referred to as mediation. The concept of interacting and mediating causal pathways leading to a health outcome has been defined under the life course framework and is reflected in the accumulation model (322). Thus, the possible causal pathways to SCCHN involving potentially confounding, interacting and mediating factors can be tested under the life-course framework. However, this study framework needs to be complemented by a suitable study design incorporating life course epidemiology to specifically study the relatively rare outcome of SCCHN.

2.5       Study designs for observational epidemiologic studies

Two of the main observational study designs for epidemiologic research are cohort and case-control designs (331). In this thesis, I used a hospital-based case-control design and novel approaches to existing analytical techniques originally developed for cohort data. Hence, the principles of these designs are described below with emphasis on case-control studies.

2.5.1        Cohort studies

In a typical cohort study, a group of individuals, sampled based on exposure to certain conditions, is identified and traced over time for the occurrence of health outcomes (332). A commonly used measure of disease frequency is the incidence rate, which is the number of new cases per population at risk in a given time period. The incidence rate can be calculated in both the exposed and unexposed group, from which both absolute and relative measures of association between exposure and outcome can be derived. The difference between the incidence rate in the exposed group and that in the unexposed group provides the incidence rate difference (on an absolute scale), whereas the ratio of the incidence rate in the exposed group to that in the unexposed group gives the relative risk (RR) (on the relative scale) (333). The calculation of these measures is possible and straightforward in a cohort study, as the probability of outcome in the non-exposed is known (334). This study design is useful because it provides information on multiple exposures and outcomes and their variation over time, and ascertains temporality (cause precedes effect). However, its time-consuming nature makes it a poor choice to study rare outcomes such as cancers (under a rare disease assumption, health outcomes with a prevalence of less than 10% in the population are considered rare), as following the entire population for long periods of time would be impractical, and the sample would not yield sufficient cases to derive reasonably precise measures of association.
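
The absolute and relative measures described above can be illustrated with a short calculation. The following is a minimal sketch in Python; the function name and the cohort counts are hypothetical and were invented purely for illustration, not taken from any study cited in this thesis.

```python
def incidence_rate(new_cases: int, person_time: float) -> float:
    """Return new cases divided by the person-time at risk."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return new_cases / person_time


# Hypothetical cohort counts (illustrative assumption, not real data).
rate_exposed = incidence_rate(new_cases=30, person_time=10_000.0)
rate_unexposed = incidence_rate(new_cases=10, person_time=10_000.0)

# Absolute scale: incidence rate difference.
rate_difference = rate_exposed - rate_unexposed

# Relative scale: ratio of the rate in the exposed to that in the unexposed.
rate_ratio = rate_exposed / rate_unexposed

print(f"Rate difference per person-year: {rate_difference:.4f}")
print(f"Rate ratio: {rate_ratio:.2f}")
```

Both measures require the rate in the unexposed group, which is why they are directly obtainable from cohort data.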

2.5.2        Case-control study design

The case-control design can be advantageous compared to cohort studies, especially when investigating rare disease outcomes, because of its efficient way of sampling individuals from the source population based on the outcome (333, 335). Compared to a cohort study, a case-control design includes a larger fraction of individuals from a source population who develop the outcome (cases) and a lower proportion of those who do not (controls). This design attained significance in the 1920s through studies on rare outcomes such as lip, oral cavity and breast cancers (336). In a case-control study, an adequate number of cases from a source population are first selected and classified as exposed or unexposed. Next, their exposure profile is compared with that of controls, who are sampled from and representative of (with respect to exposure distribution) the same source population from which the cases were recruited (337). The source population, or the underlying “hypothetical” cohort from which participants in a case-control study are sampled, is elusive; that is, its members are neither drawn from a roster nor followed to record outcomes. The controls are selected independent of their exposure status. Cancer case-control studies are usually population or hospital based, depending on the source population from which cases and controls are sampled (332).

Because the numbers of cases and controls are fixed by the investigator in a case-control study, the probability of the outcome among the source population remains unknown (334). Hence, relative risk cannot be estimated directly using this study design unless we use techniques that correct for the sampling strategy that gave rise to the data, that is, that weight by the probability with which cases and controls are sampled into the study from the underlying population (sampling fraction) (334, 335). However, since the counts of participants among cases and controls with and without the exposure are available, the measure of association derived from case-control studies is the odds ratio (OR) (334). The OR is defined as the ratio of the odds of the exposure among cases to that among the controls (exposure OR). The exposure OR and the outcome OR are mathematically equivalent, making the OR a valid measure of association between exposure and outcome (334). For rare outcomes such as cancers (incidence less than 10% in a population), the OR approximates the RR (334, 336).
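
The equivalence of the exposure OR and the outcome OR, and the approximation of the RR by the OR for rare outcomes, can be checked numerically. The sketch below is written in Python under stated assumptions: the function names are hypothetical, and the two 2x2 tables describe invented source populations (one with a rare outcome, one with a common outcome) in which all members are counted, so that the RR is available for comparison.

```python
import math


def exposure_odds_ratio(a: float, b: float, c: float, d: float) -> float:
    """Odds of exposure among cases (a/c) over odds of exposure among non-cases (b/d)."""
    return (a / c) / (b / d)


def outcome_odds_ratio(a: float, b: float, c: float, d: float) -> float:
    """Odds of outcome among exposed (a/b) over odds of outcome among unexposed (c/d)."""
    return (a / b) / (c / d)


def risk_ratio(a: float, b: float, c: float, d: float) -> float:
    """Risk among exposed over risk among unexposed (needs the full population)."""
    return (a / (a + b)) / (c / (c + d))


# Cell layout: a = exposed cases, b = exposed non-cases,
#              c = unexposed cases, d = unexposed non-cases.
# Both tables are hypothetical and chosen only for illustration.
rare = (30, 9_970, 10, 9_990)           # outcome risk well below 10%
common = (3_000, 7_000, 1_000, 9_000)   # outcome risk of 10% to 30%

for label, table in (("rare", rare), ("common", common)):
    e_or = exposure_odds_ratio(*table)
    o_or = outcome_odds_ratio(*table)
    rr = risk_ratio(*table)
    # Both odds ratios reduce to (a * d) / (b * c).
    assert math.isclose(e_or, o_or)
    print(f"{label}: OR = {e_or:.3f}, RR = {rr:.3f}")
```

In the rare-outcome table the OR and RR nearly coincide, whereas in the common-outcome table the OR lies further from the null than the RR. Sampling only a fraction of the non-cases as controls, independently of exposure, leaves the expected OR unchanged because the sampling fraction cancels in the ratio b/d.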

Although case-control studies are suitable for the investigation of cancer outcomes, the design itself poses challenges with respect to certain research questions. First, unlike cohort studies, data on exposures that change over time (e.g., SEP, smoking) are usually not available from case-control studies. This makes it difficult to assess exposures under a time-varying framework. Second, the estimation of association between exposure and outcome is limited to the health outcome on which the study sampling was based. Hence, a researcher might refrain from exploring research questions that require the use of analytical techniques for which a variable other than the main outcome of interest must be used as a dependent variable (e.g., mediation analysis, multi-step modelling such as inverse probability weighted marginal structural models). However, such scenarios are encountered when research questions aim to elucidate causal pathways and mechanisms underlying exposure-outcome relationships. The case-control design should be explicitly taken into account while answering these questions and appropriate study frameworks such as life course epidemiology, control sampling techniques and statistical methods are needed to mitigate these challenges (335).

2.6       Causal inference and causal effect estimation

“One commonly heard argument is that epidemiologic studies are about associations, not causations. According to this proposition, epidemiologists should not worry too much about fishy causal concepts but rather focus their efforts on estimating correct associations. This is certainly a safer strategy but also a dangerous one because it can make much of epidemiology close to irrelevant for both scientists and policy makers”.  – Hernán (2005)

Information on cause and effect relationships between exposures and health outcomes is the fundamental contribution of epidemiology to the improvement of health (338, 339). Causality and causal inference have been a subject of great interest and contentious debate since the 18th century (340). These concepts further evolved during the 19th century through pioneering works on infectious diseases (Henle-Koch postulates), social causation of disease (Rudolf Virchow), and smoking and various cancers (341). The evidence linking smoking and lung cancer in the 1950s [and concomitantly with other health outcomes (larynx, esophagus and bladder cancers)] led to the formulation of the “Bradford Hill criteria” (1965) for causation, which include strength of association, consistency, specificity, temporality, biological gradient, plausibility, coherence, experimental evidence, and analogy. The adaptation of the Bradford Hill criteria led to the Surgeon General’s criteria (1964 and 1982) to assess causality (342-345). Another development occurred in 1976, when Rothman conceived the causal pie model, which posits that the causal mechanism (the relation between cause and effect) results from multiple interacting component causes or exposures (346). Today, causal inference is largely viewed as an exercise in the measurement of the causal effect of an exposure rather than as a process to be evaluated based on criteria or guidelines (339). This exercise mainly involves a) defining a clear causal question, even if one thinks the estimates are unlikely to be interpretable as causal, b) choosing causal diagrams, statistical parameters and analytical techniques that help address the causal question, and c) specifying the assumptions under which the statistical parameters we estimate would correspond to the answer to the causal question.

2.6.1        Causal inference and causal effects under the counterfactual/potential outcomes framework

Apart from substantive knowledge on the outcome and exposures, causal effect estimation requires appropriate causal models/frameworks, causal diagrams depicting assumed relationships between variables, and rigorous analytical techniques based on the study design (347). A statistical association between two random variables X and Y could reflect five possibilities: a) X causes Y, b) Y causes X, c) X and Y have a common cause (confounding), d) random fluctuation, and e) the association was induced by conditioning on a common effect of X and Y (347). Given these possibilities, the statistical association between exposure X and outcome Y can be defined as causal if changing the value of X would make a difference in the value of Y, provided nothing else temporally prior to or simultaneous with X changed (347). The measurement of a causal effect fundamentally requires contrasting the value of Y in the presence of a temporally prior variable X (observed) with the potential value of Y in the absence (i.e., any other value) of X (counter to the fact; unobserved) (332). This understanding, known as the counterfactual concept, originally conceived by Scottish philosopher David Hume in the 18th century, gave rise to the counterfactual/potential outcomes model for causal inference (348). Here, a counterfactual/potential outcome is defined as the outcome Y that one would have had, possibly contrary to fact, under an exposure other than X (348). In an empirical setting, an individual is either exposed or unexposed and one potential outcome is always missing. Hence, although it is not possible to ascertain the causal effect of an exposure on an outcome for an individual, the counterfactual model allows the estimation of the average of individual causal effects in a target population as a parameter in a statistical model using observed data (349).
However, this estimation is only possible if three basic identifiability assumptions are met (349): exchangeability, counterfactual consistency and positivity. Two study groups are exchangeable if the probability of the outcome in one group is the same as that of the second group, had the exposures been reversed (i.e., the potential outcome is independent of the exposure). In a well-designed randomized clinical trial (RCT), the exchangeability assumption is met as participants are randomized into groups, which ensures that the exposure is independent of other covariates and of the potential outcomes. Counterfactual consistency is the rule that allows the potential outcome to be linked to the observed outcome. It states that the potential outcome under the observed exposure is the observed outcome. This assumption is usually considered to be met if the exposure is well-defined and manipulable by intervention (e.g., dose of drugs, dose of a specific measure of a specific tobacco type rather than dose of tobacco smoking in general) and is violated in the case of under-defined, non-manipulable exposures such as social exposures (e.g., SEP). Positivity means that the probability of exposure at every level of all covariates in the model is above 0. Two types of positivity violations are: a) stochastic or chance positivity violation, in which no individuals are observed to be exposed at a certain level of a covariate because of a small sample size (e.g., genetic polymorphisms with low minor allele frequency), and b) deterministic positivity violation, in which the individual has no chance of being exposed (e.g., positive exposure to alcohol among non-alcohol consumers).
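
The average causal effect and the three identifiability assumptions described above can be written compactly. The notation below is an assumption introduced here for illustration and is not taken from the cited sources: X is a binary exposure, Y the observed outcome, Y^{x} the potential outcome under exposure level x, and C a set of measured covariates within levels of which exchangeability is assumed to hold.

```latex
% Average causal effect on the difference scale
\mathrm{ACE} = E\left[Y^{x=1}\right] - E\left[Y^{x=0}\right]

% Exchangeability (conditional on measured covariates C)
Y^{x} \perp X \mid C \quad \text{for } x \in \{0, 1\}

% Counterfactual consistency
X = x \;\Rightarrow\; Y^{x} = Y

% Positivity
P(X = x \mid C = c) > 0 \quad \text{for all } c \text{ with } P(C = c) > 0

% Under the three assumptions, the counterfactual mean is identified from observed data
E\left[Y^{x}\right] = \sum_{c} E\left[Y \mid X = x, C = c\right] \, P(C = c)
```

The last expression shows why all three assumptions are needed: consistency links Y^{x} to the observed Y, exchangeability allows conditioning on X = x within levels of C, and positivity guarantees that each conditional mean is defined.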

2.6.2        Causal inference in observational studies

The counterfactual model of causal inference has largely dominated the scientific discourse on the estimation of causal effects in the health sciences since the last century. This model stimulated the development of the randomized trial study design by Ronald A. Fisher, and associated inferential statistics in the 1920s by Fisher, Jerzy Neyman and Egon Pearson (348). Because this study design achieves a valid substitution for counterfactual experience, and the randomization procedure ensures that exchangeability and positivity assumptions are met, RCTs are the study design of choice to estimate causal effects of well-defined manipulable exposures/interventions on outcomes. However, not all exposures can be manipulated under experimental conditions (e.g., social risk factors) or can be randomized and assigned among humans due to ethical concerns (e.g., smoking). This limitation of RCTs created a need to infer causality utilizing non-experimental observational study designs (e.g., case-control, longitudinal), which have been the mainstay of the majority of epidemiologic studies. However, the greater probability of violating identifiability assumptions in these designs made causal inference from observational studies a challenge. To address this challenge, Rubin (1974) developed the model into a general framework for causal inference that can be applied to non-experimental studies as well, and demonstrated the feasibility of causal inference utilizing these study designs (350).

2.6.3        Causal inference from study settings with complex time-dependent feedback loops

As discussed in previous sub-sections, exposures such as SEP and risk behaviours are dynamic and time-varying. These exposures measured at one time point can affect the exposure measured at subsequent time points. Along the way, they can also affect or be affected by other covariates that may bias the causal association between the exposure and the outcome. In other words, time-varying systems are subject to complex feedback loops that compound the challenge of causal inference. To overcome this problem, James Robins introduced three powerful analytical methods stimulated by the counterfactual framework, under the umbrella term of G-methods: the parametric g-computation formula (1986), G-estimation of structural nested models (1989), and inverse-probability weighted marginal structural models (1998) (351-354). These methods made the estimation of causal effects under time-varying feedback conditions achievable with longitudinal data. However, these techniques have not been implemented in a case-control study including a combination of time-varying exposures and confounders. Recent advancements in inferential statistics through the work of Tyler VanderWeele, Stijn Vansteelandt, Miguel Hernán and colleagues have also made the estimation of direct and indirect effects (mediation), as well as the attribution of effects to pathways underlying the causal association between exposures and outcome (e.g., 4-way decomposition), empirically possible with longitudinal data. However, their demonstration within a case-control study is limited, and software code for easy implementation in commonly used analytical software such as Stata is lacking (335, 355).

2.7       Directed Acyclic Graphs for causal inference

“Epidemiologists are acutely conscious of the danger of over-interpreting associations as causal, and it may be as a consequence of this that they sometimes avoid thinking about the potentially causal nature of associations between exposures of interest and potential confounders. It is all too easy to fall into a purely empirical approach to analysis, where covariates are added to the model one by one and retained if they seem to make a difference. Valid inference would be better served if, perhaps with the aid of causal diagrams, careful consideration was given to whether each factor should be in the model, particularly if the factor may have been caused in part by the exposure under study.”  – Weinberg (1993)

The scientific discourse on causal inference has been supported by a rapid growth in the last two decades in the availability and accessibility of concepts and tools that allow the rigorous and systematic assessment of whether statistical associations are causal. One such important methodological advancement has been the development and increasing adoption of causal diagrams or directed acyclic graphs (DAGs). The DAG is a graphical tool proposed by Judea Pearl and colleagues, which was introduced into the epidemiology literature in 1995 (356, 357). These graphs are diagrams with formal rules that mainly help in: a) designing epidemiologic studies, b) understanding the causal and non-causal relations among variables related to a specific substantive research question, and c) evaluating structural relationships that may pose a threat to study validity (e.g., confounding, selection/collider bias, information bias) (347). A confounder is most commonly defined as a variable that is ‘associated’ with both exposure and outcome and is not an intermediate variable between them. Adjusting for traditionally defined confounders when they are in fact non-confounders, as revealed through DAGs, would induce bias in the estimates (e.g., over adjustment, M-bias) (347). DAGs are extensively used in this thesis to demonstrate underlying causal relations (e.g., time-varying framework, mediation), facilitate various analytical decisions (e.g., identification of confounders for assessment of total, direct and indirect effects) and explain potential biases (e.g., confounding, selection bias). To facilitate their understanding in the Methods sections of this thesis, I describe below the basic terminology, rules and concepts underlying DAGs, structural definitions of confounding and selection bias, steps to follow to estimate the total effect of an exposure on the outcome using DAGs and the special case of time-varying confounding affected by prior exposure.

2.7.1        Basic DAG terminology

A DAG consists of a set of random variables (nodes or vertices), both measured (e.g., X, Y, Z in DAG 1) and unmeasured (typically represented by U as in DAG 1), each variable pair connected by a single arrow (directed edges).

The graph is directed as each arrow has only one arrowhead and points from one variable to one other variable. It is also acyclic as no variable can cause itself, either directly or through other variables. An apparent exception is a time-varying variable, where an arrow from the variable measured at one point in time (e.g., SEP in childhood) can point to the same variable measured at a subsequent point in time (SEP in early or late adulthood). Unlike traditional confounder diagrams, in which the meaning of the arrows is uncertain (i.e., whether an arrow represents association, prediction or causation), each arrow in a DAG depicts causation (as per the definition of cause provided in sub-section 2.5.1). The variable from which an arrow originates (parent) is a direct cause (causative or preventive) of the variable towards which the arrowhead points (child). In DAG 1, X is a direct cause of Y. Similarly, Z is a cause of X, and U of Z and Y. All common causes of any pair of variables must be included in a causal graph. The arrows do not specify the magnitude or direction of causation.

2.7.2        Paths in DAGs

Each path in a DAG goes between the exposure and the outcome without passing through a node more than once. A path can be open or closed. Open paths have an expected causal association flowing along them (e.g., path 1 in DAG 2).

Some paths are open naturally, that is, prior to the intervention of the researcher. Causal paths are naturally open paths in which all arrows point in the same direction from exposure to outcome, either directly (e.g., X→Y in DAG 1) or through one or more intermediates (e.g., X→M→Y in DAG 2). All such open causal paths contribute to the total effect of the exposure on the outcome. Such paths can, however, be closed mistakenly by conditioning on the intermediates/mediators (through restriction or matching in the study design, or stratification or covariate adjustment in statistical models during analysis). For example, conditioning on M closes the open causal path between X and Y in DAG 2, creating a biased estimate of the total effect of X on Y. Certain non-causal paths, in contrast, are also naturally open (e.g., paths 2 and 3 in DAG 2). Such paths structurally define confounding and include variables that are common causes of the exposure and the outcome (e.g., C2 in path 2 of DAG 2). These paths create a bias: the expectation of an association between exposure and outcome that is non-causal. This bias is termed confounding and can be removed by conditioning on any variable along the non-causal naturally open paths (e.g., conditioning on C1, C2, C3 or C4 blocks the non-causal naturally open path 2).

Closed paths are those through which no association flows; they are blocked either naturally or by conditioning on variables along them (e.g., conditioning on a confounder closes an open non-causal path). For example, path 4 (X → M ← C4 → Y) in DAG 2 is blocked at M, on which arrows originating from C4 and X collide. M is termed a collider on path 4. Conditioning on a collider can mistakenly open the blocked non-causal path, allowing a biased association to flow between exposure and outcome; this is considered a selection bias. Note that a collider is path specific, and a variable can have different roles depending on the path. For example, in DAG 2, M is a collider on path 4, but not on paths 1 (X → M → Y) and 3 (X ← C1 ← C2 → C3 → C4 → M → Y). M is a mediator on path 1, but not on paths 3 and 4. M is a confounder on path 3, but not on paths 1 and 4.

DAGs also help us to structurally define selection bias as any bias occurring due to conditioning on the common effect of two variables, one of which is either the exposure or cause of exposure, and the other is the outcome or cause of the outcome (358).

2.7.3        Steps to estimate the total effect of an exposure on the outcome using DAGs

To estimate the total causal effect of an exposure on an outcome, five steps should be followed: 1) draw the best DAG; 2) find all the paths between the exposure and the outcome; 3) separate the causal and non-causal paths; 4) separate the open and closed paths; and 5) find the minimally sufficient set(s) of conditioning variables. A sufficient set is a set of variables, conditioning on which leaves all causal paths open and closes all non-causal paths; a minimally sufficient set is a sufficient set of which no proper subset is sufficient.

In the case of DAG 1, the minimally sufficient set of conditioning variables needed to estimate the total effect of X on Y is {Z}. Although U, an unmeasured variable, is also on the confounding path, conditioning on Z turns the confounder U into a non-confounder. However, this same statistical model, with Y fitted on X and Z, cannot be used to identify the total effect of Z on Y. This is because, according to DAG 1 (Figure 3), X would be a mediator between Z and Y, and having X in the model would block the causal path between Z and Y. In other words, DAGs inform us whether a separate statistical model is needed to estimate the total effect of each exposure. For DAG 2, multiple minimally sufficient sets are possible, e.g., {C1} or {C2} or {C3} or {C4}. Figure 6 depicts conditioning on C3, leaving only the causal path 1 open. The choice among these 4 variables for conditioning depends on whether the variables have missing data, measurement error or specification error. Although M in DAG 2 is on the confounding path 3, conditioning on it will close the only open causal path between X and Y (M is a mediator on path 1) and will open a blocked non-causal path (path 4).
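The five steps can be sketched in a few lines of Python. This is only an illustrative sketch: the figure for DAG 2 is not reproduced here, so the edge list below is an assumption inferred from the four paths named in the text, and the function names are hypothetical helpers rather than part of any package used in this thesis.

```python
from itertools import combinations

# Edges of DAG 2 as inferred from paths 1-4 described in the text (assumption).
EDGES = [("X", "M"), ("M", "Y"), ("C2", "C1"), ("C1", "X"),
         ("C2", "C3"), ("C3", "C4"), ("C4", "M"), ("C4", "Y")]
EDGE_SET = set(EDGES)
NODES = sorted({node for edge in EDGES for node in edge})


def neighbours(node):
    """Nodes joined to `node` by an arrow in either direction."""
    return [b if a == node else a for a, b in EDGES if node in (a, b)]


def descendants(node):
    """All nodes reachable from `node` by following arrows forward."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for a, b in EDGES:
            if a == current and b not in found:
                found.add(b)
                stack.append(b)
    return found


def all_paths(start, end, path=None):
    """Step 2: every path from start to end visiting no node more than once."""
    path = path or [start]
    if path[-1] == end:
        return [path]
    paths = []
    for nxt in neighbours(path[-1]):
        if nxt not in path:
            paths.extend(all_paths(start, end, path + [nxt]))
    return paths


def is_causal(path):
    """Step 3: causal if every arrow points from exposure towards outcome."""
    return all((a, b) in EDGE_SET for a, b in zip(path, path[1:]))


def is_open(path, conditioned):
    """Step 4: apply the blocking rules to one path given a conditioning set."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, node) in EDGE_SET and (nxt, node) in EDGE_SET
        if collider:
            # A collider blocks unless it, or one of its descendants, is conditioned on.
            if not ({node} | descendants(node)) & conditioned:
                return False
        elif node in conditioned:
            # A conditioned non-collider (mediator or common cause) blocks the path.
            return False
    return True


def is_sufficient(conditioned, exposure="X", outcome="Y"):
    """All causal paths stay open and all non-causal paths are closed."""
    return all(is_open(p, conditioned) == is_causal(p)
               for p in all_paths(exposure, outcome))


def minimal_sufficient_sets(exposure="X", outcome="Y"):
    """Step 5: sufficient sets of which no proper subset is sufficient."""
    candidates = [n for n in NODES if n not in (exposure, outcome)]
    sufficient = [set(c) for size in range(len(candidates) + 1)
                  for c in combinations(candidates, size)
                  if is_sufficient(set(c), exposure, outcome)]
    return [s for s in sufficient if not any(t < s for t in sufficient)]
```

Under the assumed edges, `minimal_sufficient_sets()` returns the four single-variable sets {C1}, {C2}, {C3} and {C4}, and `is_open(["X", "M", "C4", "Y"], {"M"})` shows path 4 opening once M is conditioned on, in line with the description above.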

2.7.4        Time-varying confounding affected by prior exposure

Bias due to confounding and selection bias is compounded while attempting to estimate the effect of a time-varying exposure under conditions of time-varying confounding affected by prior exposure (i.e., covariates can act as both confounders and mediators) (358). A hypothetical time-varying situation involving SEP in childhood (CH SEP), early adulthood (EAH SEP), confounders measured during childhood (C1) and early adulthood (C2) periods under a specific temporal relation with respect to outcome (oral cancer) is depicted in Figure 7.

Figure 7: Causal graph representing time-varying confounders affected by prior exposure

Any method that involves conditioning on C2a to estimate the magnitude of the blue lines may induce bias by creating a non-causal association between CH SEP and oral cancer through the path CH SEP → C2a ← C1 → oral cancer (i.e., opening this naturally blocked non-causal path). However, not adjusting for C2a leaves open the non-causal path EAH SEP ← C2a ← C1 → oral cancer and thus results in a confounded association between EAH SEP and oral cancer. This situation arises because the effect of EAH SEP on oral cancer is confounded by C2a, and C2a is affected by CH SEP (prior exposure); in other words, time-varying confounding is affected by prior exposure. Such situations can only be addressed using the g-methods described in sub-section 2.6.3.

3         Rationale and study objectives to be written

4         Methods

This dissertation comprises three manuscripts based on three empirical studies, each addressing one specific objective of this work. These empirical studies utilized data from an international collaborative study. Although studies at each site had a similar overall study design and data collection procedures (as they followed the same study protocol), the distribution and types of risk factors at each study site, variables used in each manuscript and statistical analyses performed to achieve the objectives were different. The overall study design, data and sample collection procedures, as well as specific methodologies for each manuscript are explained in the sub-sections below.

4.1       Overall study design

The Head and Neck Cancer (HeNCe) Life study is an international multi-center hospital-based case‐control study investigating the aetiology of SCCHN, focusing on social, psychosocial, lifestyle, biological and genetic factors within the life-course framework. This collaborative study was conducted in Canada, India and Brazil. Manuscript I uses data from the Indian site, where the incidence of SCCHN, especially oral cancers, is on the rise, and where large social inequalities have been reported (359, 360). Manuscripts II and III rely on data from the Canadian site, where genetic data were available and smoking and alcohol have been the strongest risk factors for SCCHN. Although the study sites followed similar protocols, study instruments were culturally adapted through multiple pilot studies.

4.2       Target populations and samples

The target populations for the studies were male and female adult residents of the Malabar region of Kerala in India, and of the Greater Montreal area in Canada. The eligibility criteria of the study were: (i) to be English, French or Malayalam (Kerala native language) speaking; (ii) to be born in India or Canada; and (iii) to live within a 150 or 50 km radius of the recruiting hospitals in Calicut (Kerala) and Montreal, respectively. In addition, the participants should not have had any: (iv) previous history of any type of cancer or cancer treatment; (v) mental or cognitive disorders; (vi) communication problems (e.g., inability to speak because of lesions); or (vii) diseases related to immuno‐compromise (e.g., HIV/AIDS). Lastly, participants who were too sick or in palliative care were not eligible to participate.

In India, cases (N=350) were recruited from the oral pathology clinic at the Government Dental College and from the cancer outpatient unit of the Government Medical College, Calicut, Kerala, India (both institutions catering to the same catchment area) between 2008 and 2012. Controls (N=371) were recruited from other outpatient clinics in these institutions during the same study period.

In Canada, cases (N=460) were recruited from Ear, Nose and Throat (ENT) and radio‐oncology clinics of four major referral hospitals in Montreal (Jewish General Hospital, Montreal General Hospital, Royal Victoria Hospital, and Notre‐Dame Hospital) between 2005 and 2013. Controls (N=458) were recruited from other clinics in the same hospitals.

4.3       Case definition and selection

Incident cases diagnosed with stage I to IV histologically confirmed squamous cell carcinomas of the head and neck region, which included cancers of the tongue, gum, floor of the mouth and other locations in the mouth, oropharynx, hypo‐pharynx and larynx (C01‐C06, C09, C10, C12‐C14 and C32, under the International Statistical Classification of Diseases, 10th Revision, Version: 2010), were eligible for this study. Lip (C00), salivary gland (C07‐08) and nasopharyngeal (C11) cancers were excluded due to their different aetiologies (361-363). For logistic reasons, only oral cancer cases (C01-C06 and C09 under the International Statistical Classification of Diseases, 10th Revision, Version: 2010) were recruited at the Indian site.

4.4       Control definition and selection

Non-cancer controls were frequency matched to each identified case by 5-year age group and sex. They were randomly selected from several outpatient clinics in the same hospitals from a list of non‐chronic diseases which were not documented to be strongly associated with tobacco and alcohol consumption to mitigate Berkson’s bias (364). The participation of controls from each clinic was restricted to less than 20% to limit overrepresentation of a single diagnostic/disease group (365). The genetic profile of the participants was not known during recruitment. The list of clinics from which control participants were recruited and the distribution of controls at Indian and Canadian site are given in Table 4.

Figure 8: Percentage of control participants recruited from participating clinics at Indian and Canadian site

4.5       Ethics approval and informed consent

4.6       Data collection

The data collection procedures consisted of (i) questionnaire-based interviews and (ii) biological sample collection.

4.6.1        Questionnaire based interviews

Biological sample collection

Following the interviews, biological samples were collected from each participant to perform genetic and HPV analyses (366). SNPs associated with tobacco and alcohol metabolism were the main exposures in manuscripts II and III which used the Canadian data. In addition, exposure to HPV was used as a potential confounder in these manuscripts. Hence, although sample collection was performed at both study sites, this sub-section focuses on genetic analysis at the Canadian site.

Oral epithelial cells, a reliable source of genetic material and HPV DNA, were collected through a validated and reliable protocol using mouthwash and brush biopsies (366-368). The latter were used to collect epithelial cells from the lesion (in cases) as well as from normal mucosa in the oral cavity and oro‐pharyngeal areas (both cases and controls) (details of sample collection are available in Appendix V) (368). Both mouthwash and brush biopsy methods are simple, non‐invasive and inexpensive, and have a high acceptance rate among participants. These methods also provide good yields of both human DNA and HPV DNA after purification (366, 369-372). Following collection, the samples were stored at 4 °C as soon as possible and at −20 °C at the sample analysis site. For the Canadian participants, genetic analyses and HPV detection were performed at laboratories at the Albert Einstein College of Medicine in New York and the CHUM in Montreal, respectively.

4.6.2        Genotyping analysis for DNA polymorphism

HPV detection

HPV DNA detection was performed using a standardized PCR protocol (373, 374). The samples were centrifuged (at 1000 x g for 10 minutes), and the DNA was extracted from the pellet with a small quantity of supernatant by a modified Gentra Puregene protocol (375). The purified DNA underwent PCR amplification. To ascertain the integrity of the DNA and that sufficient sample was available for PCR analysis, beta-globin testing was performed. An absence of beta- and 84 (376-378).

4.7       Data quality control and management

4.8       Measures – Manuscript I

In Manuscript I, we investigated the association between SEP collected at three periods of the participants’ lives and oral cancer risk using the accumulation, critical period and social mobility life course models. The dependent variable (oral cancer), main exposure (SEP) and potential confounders are described below.

4.8.1        Dependent (outcome) variable – Oral cancer status

Asset/wealth index and principal component analysis (PCA)

The asset/wealth index was created from a list of questions on various assets (housing characteristics, durable assets and access to services) available at the participant’s longest place of residence during three time periods: childhood (0-16 years), early adulthood (17-30 years) and late adulthood (above 30 years). As detailed in Appendix VII, Table 1, information on nine assets/items from childhood, eleven from early adulthood and twelve from late adulthood was used.

An issue in using housing indicators (which are all correlated) is that each of them could have a different relationship with SEP and may not be sufficient to differentiate household SEP when used individually (283). Hence, different indicators are aggregated to derive a uni-dimensional measure that can be further categorized to reflect different levels of SEP. Summing up the indicators is a common practice (379). However, this assumes an equal weight for each indicator.  In this study, we overcame these challenges using principal component analysis (PCA), which is an increasingly employed (e.g., by World Bank Demographic and Health Surveys data sets) data reduction method for creating uni-dimensional SEP measures from data on different assets (283, #2632, 284, 292, 293).

Principal component analysis

With PCA, multiple original variables can be summarized with relatively few dimensions that capture the maximum possible information (variation) from the original variables. Mathematically, from an initial set of n correlated original variables, PCA creates uncorrelated components, where each component is a linear weighted combination of the original variables (380). For example, if $X_1, X_2, \dots, X_n$ are the n original indicators, then the first component ($PC_1$) is given by

$$PC_1 = a_{11}X_1 + a_{12}X_2 + \dots + a_{1n}X_n$$

and the mth component is given by

$$PC_m = a_{m1}X_1 + a_{m2}X_2 + \dots + a_{mn}X_n$$

where $a_{mn}$ is the weight for the mth principal component and the nth variable.
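As a numerical illustration of this weighted combination, the sketch below standardizes a few indicators, builds their correlation matrix, and takes the weights of the first component from the leading eigenvector. It is a simplified sketch under stated assumptions: it uses ordinary (Pearson) correlations on made-up numeric indicators rather than the tetrachoric correlations of binary assets used in this study, it finds the leading eigenvector by power iteration, and all function names are hypothetical.

```python
import math
import statistics


def standardize(columns):
    """Rescale each indicator (a list of values) to mean 0 and variance 1."""
    standardized = []
    for col in columns:
        mean, sd = statistics.fmean(col), statistics.stdev(col)
        standardized.append([(value - mean) / sd for value in col])
    return standardized


def correlation_matrix(z_columns):
    """Pearson correlation matrix of already standardized indicators."""
    n_obs = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n_obs - 1)
             for cj in z_columns] for ci in z_columns]


def first_component(corr, iterations=500):
    """Leading eigenvector (weights a_11..a_1n) and its eigenvalue.

    Power iteration is used here only to keep the sketch dependency-free.
    """
    n = len(corr)
    weights = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(corr[i][j] * weights[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        weights = [x / norm for x in product]
    eigenvalue = sum(weights[i] * corr[i][j] * weights[j]
                     for i in range(n) for j in range(n))
    return weights, eigenvalue


def pc1_scores(z_columns, weights):
    """PC1 = a11*X1 + a12*X2 + ... + a1n*Xn for every participant."""
    return [sum(a * x for a, x in zip(weights, row)) for row in zip(*z_columns)]


# Made-up indicators for six hypothetical participants (not study data).
indicators = [[1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5], [1, 3, 2, 5, 4, 6]]
z = standardize(indicators)
corr = correlation_matrix(z)
weights, eigenvalue = first_component(corr)
share_explained = eigenvalue / len(corr)  # eigenvalues of a correlation matrix sum to n
scores = pc1_scores(z, weights)
```

The variance of `scores` equals the eigenvalue of the first component, which is why dividing that eigenvalue by the number of indicators gives the proportion of total variance the component explains.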

Since PCA aims to maximize the variance, it is sensitive to scale differences in the original variables. For example, in our study, responses to some of the questions on housing were nominal (e.g., type of material for the floor, roof, wall) while others were binary (e.g., presence or absence of radio, clock, TV) or categorical. Hence, the original variables must be standardized and converted to a correlation matrix before performing a PCA (381). The weights for each component are given by the eigenvectors of the correlation matrix, and the variance of each component is given by the eigenvalue of the corresponding eigenvector (380). The components are arranged so that the first component explains the largest possible amount of variation in the original data. The second component is uncorrelated with the first and explains a smaller amount of additional variance, unexplained by the first component. Subsequent components are uncorrelated with the first and second components and explain smaller and smaller additional, unexplained proportions of the variation in the original variables (380).

Creating the asset index as a measure of SEP using PCA

To standardise the original asset indicators, responses to all questions on assets were first binary coded into advantageous and disadvantageous SEP based on the type of material used and facilities available, according to the context of Kerala, India. Next, a tetrachoric correlation matrix (381) was created from these binary variables for each life period (Appendix VII, Tables 2, 3, 4). If any variable correlated highly (|0.8|) with other variables, only one variable from the pair of correlated variables was retained for further analysis. In addition, variables were excluded in a stepwise manner until a factorable correlation matrix with a Kaiser-Meyer-Olkin (KMO) value > 0.7 was attained for each period separately (293). Assets with low test-retest reliability (intra-class correlation) were also removed (Appendix VI, Table 1). The final variables retained in the matrix for each period were: childhood: crowding, floor, wall, window, piped water, bath, clock (KMO=0.832); early adulthood: crowding, wall, window, piped water, clock, bicycle (KMO=0.771); late adulthood: crowding, wall, window, piped water, clock, radio, television, phone (KMO=0.801). A PCA was conducted on the final correlation matrices to assess the dimensionality of the assets, and the component that explained the maximum variance in each life period was extracted (the first component explained 65% of the variance in childhood and 64% each in early and late adulthood) (283). Continuous scores were predicted from these components. The continuous score for each life period was then dichotomized using the median of the distribution as the cut-off, generating binary variables representing the SEP exposure (0=advantageous SEP, 1=exposure to disadvantageous SEP) for the childhood, early adulthood and late adulthood periods of life.

SEP exposure measure for critical period models

The binary variables (0=advantageous SEP, 1=disadvantageous SEP) representing SEP in childhood, early adulthood and late adulthood were used as the main exposures in the critical period models representing each of these life periods.

SEP exposure measure for the accumulation model

A summation of the binary variables representing SEP in each life period generated a variable with four categories reflecting an increasing number of periods of exposure to disadvantageous SEP. This variable represented the accumulation model. The variable was coded as: 0 = 0 periods (participants who were in advantageous SEP in all 3 periods of life); 1 = 1 period (participants who were exposed to disadvantageous SEP in any 1 period and non-exposed in the other 2 periods of life); 2 = 2 periods (participants who were exposed to disadvantageous SEP in any 2 periods and non-exposed in the remaining period of life); and 3 = 3 periods (participants who were exposed to disadvantageous SEP in all three periods of life).

SEP exposure measure for social mobility models

Two models were tested for mobility: childhood to early adulthood mobility, and early to late adulthood mobility.

Childhood to early adulthood mobility – The SEP measure representing this model was a 4-category variable. Stable advantageous SEP (0, 0): Participants who maintained a stable advantageous SEP in both childhood and early adulthood were coded as 0. Upward mobility (1, 0): Participants who were exposed to a disadvantageous SEP in childhood but went on to attain an advantageous SEP in early adulthood were coded as 1. Downward mobility (0, 1): Participants who had an advantageous SEP in childhood but disadvantageous SEP in early adulthood were coded as 2. Stable disadvantageous SEP (1, 1): Participants who maintained a stable disadvantageous SEP in both childhood and early adulthood were coded as 3; all categories were assigned irrespective of their SEP in late adulthood.

Early to late adulthood mobility – A similar strategy was adopted to create the 4-category SEP variable representing social mobility between early and late adulthood by considering participants’ SEP in these 2 periods of life.
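The accumulation and social mobility codings described above can be sketched as follows. The function and variable names are hypothetical, and the inputs are assumed to be the binary SEP variables defined earlier (0=advantageous SEP, 1=disadvantageous SEP).

```python
# Mobility codes keyed by (SEP in earlier period, SEP in later period).
MOBILITY_CODES = {
    (0, 0): 0,  # stable advantageous SEP
    (1, 0): 1,  # upward mobility
    (0, 1): 2,  # downward mobility
    (1, 1): 3,  # stable disadvantageous SEP
}


def accumulation_score(childhood, early_adulthood, late_adulthood):
    """Number of life periods (0 to 3) spent in disadvantageous SEP."""
    return childhood + early_adulthood + late_adulthood


def mobility_category(earlier, later):
    """4-category mobility variable for two consecutive life periods."""
    return MOBILITY_CODES[(earlier, later)]


# Hypothetical participant: disadvantaged in childhood only.
sep = {"childhood": 1, "early_adulthood": 0, "late_adulthood": 0}
accumulation = accumulation_score(**sep)
child_to_early = mobility_category(sep["childhood"], sep["early_adulthood"])
early_to_late = mobility_category(sep["early_adulthood"], sep["late_adulthood"])
```

For this hypothetical participant the accumulation score is 1, childhood to early adulthood mobility is coded 1 (upward mobility), and early to late adulthood mobility is coded 0 (stable advantageous SEP).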

4.8.2        Covariates used as potential confounders

One of the main challenges addressed in manuscript I is the nature (both static and dynamic) of potential confounders and their temporal ordering with respect to the time-varying exposure of SEP across three time periods and oral cancer. We identified both time-invariant factors (age, sex, caste [i.e., hierarchy in Hindu religion based on occupation] and education) and time-varying factors (cigarette smoking, bidi smoking, paan chewing and alcohol consumption) as potential confounders.

Baseline confounders (time-invariant)

Age, sex and caste

Age and sex are strong risk factors for oral cancers. They can also determine an individual’s SEP at different periods of life. Hence, to mitigate confounding, controls were frequency matched to cases based on 5-year age group and sex. However, differences might exist within each age group that may result in residual confounding (333). Furthermore, age and sex stand for unknown or unmeasured potential confounders that may determine both the SEP and cancer status of an individual. Hence, these variables were further adjusted for in the statistical analysis. Age was used as a continuous variable and sex was binary coded (0=females, 1=males). Caste is a hierarchy in the Hindu religion based on occupation, and may determine an individual’s SEP as well as the outcome of cancer. In this study, we collected details on forward caste, backward caste, other backward caste, scheduled caste, scheduled tribe and others as classified by government of Kerala1. We adjusted for this variable using a categorical variable (0=higher caste, 1=middle caste comprising backward caste, 2=other backward[1]/scheduled caste/scheduled tribe/others).

Education (time-invariant)

As discussed previously, several indicators are used to measure SEP and they may capture different dimensions of this complex construct (please refer sub-section,). Education may capture a different dimension of SEP than the wealth index. It is also an independent risk factor for oral cancers, and the education an individual attains (education is mostly stable after childhood or adolescence) may determine their asset/wealth index in adulthood. Detailed information regarding education was collected from each participant (please refer to the questionnaire page…. Appendix… ). We used the number of years of formal education in the form of a binary variable (0: high education; 1: low education) as an indicator. However, the measure of education is subject to bias if the differences in birth cohorts of participants from the range of age groups included in a study are unaccounted for (285, 382, 383). With respect to the Kerala study site, considerable educational and sociopolitical reforms took place in the mid-1950s, which changed the landscape of education in this state of India (as noted in sub-section, education). This information was used to mitigate bias in the categorization of education. The participants were first divided into 2 groups (older: those born before 1950; younger: those born after 1950). For the older cohort, 0-3 years of formal education was considered a low level, and 4 years and above a high level, of education. For the younger cohort, 8 years of formal education was used as the cut-off for this binary categorization.

Time-varying confounders

Tobacco smoking

We used

Paan / betel quid chewing

Similar to tobacco

Alcohol consumption

4.8.3        Temporal relationship of confounders in relation to SEP in three periods of life and oral cancer

The temporal ordering of exposures and covariates with respect to the outcome is imperative when testing life-course models (323). Furthermore, to estimate causal effects (or when applying frameworks for causal inference or associated analytical techniques), the precedence of the causal factor in relation to its effect is an absolute necessity. Whereas temporal ordering is easier in cohort studies (refer to sub-section 2.5, observational study designs), it is a challenge in case-control studies. However, our detailed and comprehensive data collection methods and techniques to handle confounders (as described in the preceding sub-sections) in our life-course based study allowed us to achieve an approximate temporal ordering of variables with respect to SEP in several periods of life and oral cancer diagnosis. As shown in the causal diagram in Figure 9, the vector C0 represented the time-invariant covariates such as age, sex and caste that temporally precede every other variable under consideration. The vector C1 represented covariates that were measured for the period between 0-16 years of age. We included education in C1 because it is usually attained during this period and could causally affect the subsequent life events of an individual. Other variables represented in C1 and the subsequent vectors C2a, C2b, C3a and C3b were time-varying risk behaviours (cigarette, bidi, paan and alcohol use). As mentioned previously in the sub-section on confounders, the cumulative measures of these risk behaviours were calculated for 0-16 years, 17-23 years, 24-30 years, 31-50 years, and above 50 years. Risk factors collected for the period between 0-16 years might be an effect rather than a cause of SEP between 0-16 years of age and were included in C1. However, we suspected that the association between early adulthood SEP (17-30 years) and habits captured during 17-30 years was bi-directional, that is, SEP and habits can influence each other causally. Bidirectional arrows cannot occur in causal structures at the same time point (347, 356, 384). To overcome this, we split the habits in this period into vectors C2a (17-23 years) and C2b (24-30 years). This was done assuming that C2a would be affected by C0, C1 and CH SEP, but would influence part of SEP in 17-30 years and other subsequent variables, and that C2b would be affected by C0, C1, C2a, CH SEP and EAH SEP. The choice of cut-point (i.e., 23 years) was arbitrary. A similar strategy was used with risk behaviours recorded for above 30 years of age. Risk behaviours recorded during the period 31-50 years of age were represented by C3a, and those recorded for above 50 years (the eldest participant was 88 years old) were represented by C3b. This approximate temporal ordering identified complex feed-back loops between the variables under study, as any given variable/vector represented in Figure 9 had an arrow pointing from it to any other variable/vector temporally subsequent to it.

Figure 9: Causal graph representing SEP in three periods of life, oral cancer and associated potential confounders

4.9       Measures- Manuscript II

In Manuscript II, we considered the interactive effects of the SNPs investigated in this study and smoking on the risk of SCCHN. Hence, the dependent variable (SCCHN), main exposures (SNPs and smoking) and associated potential confounders are described below.

4.9.1        Dependent (outcome) variable – SCCHN

SCCHN cases were selected as described in sub-section 4.3. Only histologically confirmed squamous cell carcinomas were included in the study. The outcome variable was treated as binary, with the presence of any oral, pharyngeal or laryngeal cancer coded as 1 (cases) and the absence of all coded as 0 (controls).

4.9.2        Independent (main exposure) variable ‐ Genetic variants

The genetic variants associated with CYP450 genes coding phase I XMEs are involved in the bio-activation of a variety of tobacco smoke chemicals into electrophilic reactive moieties with carcinogenic potential. The variants associated with GST genes encoding phase II enzymes are involved in the detoxification of the reactive metabolites of phase I biotransformation. The characteristics of these SNPs and their association with SCCHN have been described in detail in sub-section 2.3.2, and Tables 2 and 3. In general, I will consider all genetic exposures as binary variables, with the category coded 0 considered as the reference. The genotypes were collapsed into two categories mainly because the minor allele frequencies of these SNPs in the Caucasian population (except those related to GST enzymes) were low. Specific details on the categorization of these genetic measures are given below.

Single nucleotide polymorphisms in CYP and GST genes

Dominant models of inheritance were tested for CYP1A1*2A, CYP1A1*2C, CYP2E1c2, CYP2A6*2 and GSTP1105Val. The dominant model assumes that the presence of the variant allele alone, in either the homozygous variant or the heterozygous (variant/wild) genotype, is enough for the effect of the wild allele to be masked. Hence, carriers of these variant alleles, considered as the exposed group (assuming equal risk for the homozygous variant and heterozygous groups), were compared with non-carriers, assumed unexposed. Thus, CT/CC genotypes for CYP1A1*2A, AG/GG genotypes for CYP1A1*2C, GC/CC genotypes for CYP2E1c2, AT/AA genotypes for CYP2A6*2 and AG/GG genotypes for GSTP1105Val were coded 1 (carriers, exposed), and TT genotypes for CYP1A1*2A, AA genotypes for CYP1A1*2C, GG genotypes for CYP2E1c2, TT genotypes for CYP2A6*2 and AA genotypes for GSTP1105Val were coded 0 (non-carriers, unexposed).

Copy number variants in CYP2D6 and GSTM1 genes

In this study, we identified 1 to 9 copies of the CYP2D6 non-functional null allele in our sample. Individuals with a lower number of copies of this null allele are hypothesized to be at relatively higher risk for SCCHN compared to those with a higher number of copies of the allele. Based on the distribution of these CNVs in this study, this genetic exposure was binary coded: 1 to 2 copies were considered as exposed (coded 1) and 3 to 9 copies as unexposed (coded 0). For GSTM1, we identified 0 to 3 copies. To ensure sufficient numbers in the categories, the GSTM1 CNV classification was limited to Null (0 copies, coded 1) and Non-null (1-3 copies, coded 0).
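The binary coding of the genetic exposures can be sketched as a simple lookup. The function names are hypothetical; the genotype and copy-number groupings are those stated above, and genotype strings are assumed to be two-letter codes whose letter order does not matter (so "GC" and "CG" are treated alike).

```python
# Reference (coded 0) and carrier (coded 1) genotypes under the dominant model.
# Genotypes are stored with their letters in alphabetical order.
SNP_CODING = {
    "CYP1A1*2A":   {"reference": {"TT"}, "carrier": {"CT", "CC"}},
    "CYP1A1*2C":   {"reference": {"AA"}, "carrier": {"AG", "GG"}},
    "CYP2E1c2":    {"reference": {"GG"}, "carrier": {"CG", "CC"}},
    "CYP2A6*2":    {"reference": {"TT"}, "carrier": {"AT", "AA"}},
    "GSTP1105Val": {"reference": {"AA"}, "carrier": {"AG", "GG"}},
}


def code_snp(snp, genotype):
    """Return 1 for carriers of the variant allele and 0 for non-carriers."""
    normalized = "".join(sorted(genotype.upper()))
    coding = SNP_CODING[snp]
    if normalized in coding["carrier"]:
        return 1
    if normalized in coding["reference"]:
        return 0
    raise ValueError(f"Unexpected genotype {genotype!r} for {snp}")


def code_cyp2d6_cnv(copies):
    """1 to 2 copies are coded exposed (1); 3 to 9 copies unexposed (0)."""
    if 1 <= copies <= 2:
        return 1
    if 3 <= copies <= 9:
        return 0
    raise ValueError(f"Copy number {copies} is outside the observed range 1-9")


def code_gstm1_cnv(copies):
    """Null (0 copies) is coded 1; Non-null (1 to 3 copies) is coded 0."""
    if copies == 0:
        return 1
    if 1 <= copies <= 3:
        return 0
    raise ValueError(f"Copy number {copies} is outside the observed range 0-3")
```

For example, `code_snp("CYP2E1c2", "GC")` gives 1 (carrier) and `code_gstm1_cnv(2)` gives 0 (Non-null). Raising an error for values outside the observed ranges is a design choice of this sketch, so that unexpected laboratory values are not silently coded.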

4.9.3        Independent (main exposure) variable ‐ Pack-years of cigarette smoking

To incorporate the effect of correlated measures such as frequency and duration of smoking, and to avoid collinearity between these measures during statistical analysis, it is recommended to use cumulative measures of smoking in studies investigating the impact of this risk behaviour on cancers (385-387). Hence, in this study, we used cigarette pack-years to represent tobacco smoking history (388). Pack-years was computed as the product of the average smoking intensity over the lifetime and the total duration of smoking at the time of diagnosis for cases and at the time of interview for controls.

Cigarette pack-years was derived from information on participants’ history of cigarette (filtered, unfiltered or hand-rolled), cigar and pipe smoking along the life-course, using a method similar to that described in subsection, Tobacco smoking. Hand-rolled cigarettes, cigars and pipes were first converted to their commercial cigarette equivalent (20 commercial cigarettes = 4 hand-rolled cigarettes = 4 cigars = 5 pipes = 1 pack of commercial cigarettes) (79). This information was used to derive the total duration of smoking and the average number of packs smoked per day over the lifetime. The product of these two gave a continuous measure of pack-years of cigarettes smoked over the lifetime. Certain participants had a combination of active periods of smoking and periods of abstinence over their life-course. Periods of abstinence were excluded when calculating total duration, as we assumed a very low probability of misclassification (including or excluding such periods gave similar results, e.g., total duration of smoking including periods of abstinence: mean = 32.25 years ± 15.45; excluding such periods: mean = 31.47 years ± 15.46). Furthermore, from information on time since smoking cessation (age at interview minus age at cessation), we identified that participants who stopped smoking ≤ 2 years prior to recruitment had a higher risk for the outcome than actual current smokers (time since cessation = 0) (Manuscript II, Supplemental material 1). Hence, to minimize the probability of protopathic/reverse causality bias, we used a cut-off of 2 years prior to interview to define ex-smokers, and excluded details of any exposure (e.g., frequency, duration) during this period from the pack-year calculations.
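A minimal sketch of this pack-year calculation is given below. The data structure and function names are hypothetical. The sketch assumes that each participant's history is recorded as non-overlapping active smoking periods, so that periods of abstinence are simply absent from the list, and that the 2-year exclusion is applied by truncating exposure at the reference age minus 2 years.

```python
from dataclasses import dataclass

# Commercial-cigarette equivalents per unit, implied by the stated conversion:
# 1 pack = 20 commercial cigarettes = 4 hand-rolled = 4 cigars = 5 pipes.
CIG_EQUIV = {"commercial": 1.0, "hand_rolled": 5.0, "cigar": 5.0, "pipe": 4.0}
CIGS_PER_PACK = 20
LAG_YEARS = 2.0  # exposure in the 2 years before interview/diagnosis is ignored


@dataclass
class SmokingPeriod:
    product: str          # key of CIG_EQUIV
    units_per_day: float  # cigarettes, cigars or pipes smoked per day
    start_age: float
    end_age: float


def pack_years(periods, reference_age, lag_years=LAG_YEARS):
    """Cumulative pack-years over active periods, truncated at the lag cut-off.

    For non-overlapping periods, the sum of (packs/day x years) equals the
    average packs/day multiplied by the total duration of smoking.
    """
    cutoff = reference_age - lag_years
    total = 0.0
    for p in periods:
        years = max(0.0, min(p.end_age, cutoff) - p.start_age)
        packs_per_day = p.units_per_day * CIG_EQUIV[p.product] / CIGS_PER_PACK
        total += packs_per_day * years
    return total
```

For example, a participant interviewed at age 60 who smoked 20 commercial cigarettes a day from age 20 to age 60 would contribute 38 rather than 40 pack-years under this scheme, because the last two years are excluded.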

To estimate the effect of various SNPs at different levels of smoking, we categorised the cigarette pack-year variable into 3 categories. The optimal cut-off point for categorization was informed through multiple rigorous modelling approaches.

The first step was to determine the correct functional form of pack-years using dose-response curves. For this, an outcome model with pack-years entered in linear form was first fitted following guidelines proposed by Leffondre et al. 2002 (386). Subsequently, I fitted multiple logistic regression models, each with pack-years in restricted cubic spline functional form determined by knots at various percentiles of its distribution (5, 50 and 95; 10, 50 and 90; 25, 50 and 75; 5, 25 and 75; as well as the modified knot positions recommended by Harrell) (389). Next, among these spline models, the best-fitting model was chosen by comparing Akaike's information criterion (AIC) values (390). The model with knots at the 5th, 50th and 95th percentiles had the lowest AIC value and was deemed the best fit. Subsequently, using a likelihood ratio test, the fit of this model was compared with that of the linear model, under the assumption that the linear model was nested within the model with spline parameters. The spline model had a superior fit. Using this model with spline parameters, the shape of the dose-response curve between pack-years and the SCCHN outcome was constructed and determined to be non-linear (Manuscript II, Supplemental material 2). The curve indicated that the risk for the outcome increased sharply up to approximately 70 pack-years, beyond which the risk plateaued. This informed us that the risk point (optimal cut-off) would lie between >0 and 70 pack-years.
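The building blocks of this comparison can be illustrated with a short, dependency-free sketch. It is not the code used in the study. The function names are hypothetical, the spline basis follows Harrell's truncated-power formulation, and the likelihood ratio test is written only for the 1-degree-of-freedom case, which applies when a 3-knot spline (one non-linear term) is compared with the linear model. The log-likelihoods would come from whichever logistic regression routine fits the two models.

```python
import math


def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x: [x, s_1, ..., s_(k-2)].

    Truncated-power form, scaled by (last knot - first knot)^2; each term is
    linear beyond the last knot and zero below the first knot.
    """
    t = sorted(knots)
    if len(t) < 3:
        raise ValueError("at least 3 knots are required")

    def pos3(u):
        return max(u, 0.0) ** 3

    norm = (t[-1] - t[0]) ** 2
    denom = t[-1] - t[-2]
    terms = [x]
    for j in range(len(t) - 2):
        s = (pos3(x - t[j])
             - pos3(x - t[-2]) * (t[-1] - t[j]) / denom
             + pos3(x - t[-1]) * (t[-2] - t[j]) / denom)
        terms.append(s / norm)
    return terms


def aic(log_likelihood, n_params):
    """Akaike's information criterion: lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


def lrt_df1(ll_linear, ll_spline):
    """Likelihood ratio test of a linear model nested in a 3-knot spline model.

    Returns (statistic, p_value); for 1 degree of freedom the chi-square
    upper tail probability equals erfc(sqrt(statistic / 2)).
    """
    stat = max(2.0 * (ll_spline - ll_linear), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))
```

In use, each candidate knot set would yield one spline model and one AIC value, and the model with the lowest AIC would then be tested against the linear model with `lrt_df1`.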

In the second step, a parametric outcome-based approach, developed to identify the optimal cut-off for continuous covariates, including those with a non-linear functional form, with respect to a binary outcome, was used to identify the optimal cut-point among smokers (391). This approach (a) maximized the difference in risk between participants in the two outcome groups, and (b) applied a Bonferroni correction at alpha = 5% (to circumvent inflation of the Type 1 error in the identified cut-point due to multiple comparisons of the various cut-points possible over the range of >0 to 70 pack-years). The optimal cut-off was identified to be at 32 pack-years (defined as smoking 32 packs of commercial cigarettes per day for a year, or 16 packs/day for 2 years, or 8 packs/day for 4 years,

4.9.4        Covariates used as potential confounders

It has been recommended that, when assessing interactive effects between two variables, all measured potential confounders of the relation between each exposure variable (i.e., genetic variants and smoking) and the outcome (SCCHN) be present in the full confounder set. Variables considered as confounders when estimating the total effect of genetic variants on health outcomes are usually limited to those that address population stratification (biased association between genetic variant and outcome due to heterogeneous ethnicity/population sub-structure), SNPs in linkage disequilibrium (LD), and sex. However, many enzymes encoded by the genes harbouring the SNPs considered in this study are induced by polycyclic aromatic hydrocarbons (CYP1A1), nicotine (CYP2A6, CYP2E1), and ethanol (CYP2E1) found in pollutants, occupational exposures, diet, tobacco smoke and alcohol, among others. The SNPs under study are in effect noisy proxies for the enzymes they code for. It can therefore be argued that the sources of these exposures may be confounders of the relation between SNPs and SCCHN. Hence, to rule out the possibility of confounding by these exposures, we considered ethnicity, SNPs in LD (e.g., CYP1A1*2A and *2C), age, sex, alcohol (ethanol) and education (an SEP proxy for occupation and diet, as information on them was not available) as potential confounders for the respective SNP-SCCHN associations, as depicted in Appendix VIII, DAGs figures xxxx to XXXX. DAGs were constructed using DAGitty software version 1.1 (392, 393). To mitigate confounding by ethnicity (population stratification), all analyses were restricted to Caucasians.

The covariates included in the final set of confounders for any or all gene-environment interaction models in this study are described below.

Alcohol consumption

The frequency of ethanol consumption (average amount of ethanol in ml consumed per day) was used as the measure of alcohol consumption. This measure was derived from detailed information on wine, beer/cider, hard liquor, aperitif and other alcoholic beverages consumed by the participants, collected using a method similar to that described in subsection, Each beverage was converted to ethanol equivalents (10% ethanol in wine and aperitif, 5% in beer/cider, and 50% in hard liquor) (78). The frequency for each stable period was converted to millilitres of ethanol consumed per day for that period. This information was used to calculate the total duration and the total amount of ethanol (in ml) consumed over the lifetime, from which the average amount of ethanol consumed per day (in ml) was calculated. As with tobacco pack-years, the functional form of this ethanol frequency variable was determined (by comparing the fit of linear and restricted cubic spline models and fitting dose-response curves) to be non-linear (figure XXX). The two spline parameters in continuous form were used to represent the frequency of ethanol consumed per day.
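The conversion to average daily ethanol intake can be sketched as follows. The names are hypothetical, and two assumptions are made that the text does not state explicitly: each stable period is recorded as a mapping of beverage to millilitres drunk per day together with its length in years, and the lifetime average is the duration-weighted mean over drinking periods. The category of other alcoholic beverages is omitted because no ethanol percentage is given for it.

```python
# Ethanol content by volume, as stated in the text.
ETHANOL_FRACTION = {
    "wine": 0.10,
    "aperitif": 0.10,
    "beer_cider": 0.05,
    "hard_liquor": 0.50,
}


def ethanol_ml_per_day(beverages_ml_per_day):
    """Ethanol (ml/day) for one stable period, summed over beverage types."""
    return sum(ml * ETHANOL_FRACTION[name]
               for name, ml in beverages_ml_per_day.items())


def average_ethanol_ml_per_day(periods):
    """Duration-weighted average ethanol (ml/day) over all drinking periods.

    periods: iterable of (beverages_ml_per_day, years) pairs.
    """
    total_ethanol_years = 0.0
    total_years = 0.0
    for beverages, years in periods:
        total_ethanol_years += ethanol_ml_per_day(beverages) * years
        total_years += years
    return total_ethanol_years / total_years if total_years > 0 else 0.0
```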

Socioeconomic position – Education

Socioeconomic position (SEP) is a determinant of tobacco smoking and a distal risk factor for SCCHN. Detailed information regarding education was collected from each participant (please refer to the questionnaire page…. Appendix… ). For this specific analysis, we used the number of years of formal education as the measure of SEP, used as a continuous variable in its linear functional form.

HPV status

As described in sub-section 4.6.4, HPV status was recorded for 35 HPV types. Based on their oncogenic potential, these types were assigned to hierarchical categories: 1) HPV 16: all participants positive for HPV 16, alone or in combination with other types (coded 3); 2) High-risk HPV types: participants positive for any high-risk HPV type other than HPV 16, i.e., HPV 18, 31, 33, 35, 39, 51 (coded 2); 3) Low-risk HPV types: all other participants positive for any remaining low-risk HPV types (coded 1); 4) HPV-negative: participants in whom no HPV type was detected (394, 395).
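The hierarchical assignment can be expressed as a short function. The function name is hypothetical, and the code 0 for HPV-negative participants is an assumption, since the text gives no code for that category.

```python
# High-risk types other than HPV 16, as listed in the text.
HIGH_RISK_NON16 = {18, 31, 33, 35, 39, 51}


def hpv_category(detected_types):
    """Assign the highest applicable category for the set of detected types."""
    types = set(detected_types)
    if 16 in types:
        return 3  # HPV 16, alone or with other types
    if types & HIGH_RISK_NON16:
        return 2  # another high-risk type, no HPV 16
    if types:
        return 1  # only low-risk types detected
    return 0      # HPV-negative (assumed code, not stated in the text)
```

Checking the categories from highest to lowest risk is what makes the scheme hierarchical: a participant positive for both HPV 16 and a low-risk type is counted once, in the HPV 16 group.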


4.10   Measures- Manuscript III

Manuscript III aimed to estimate the effects of CYP2A6*2 and ADH1B*2 on SCCHN through interactive and mediating pathways involving smoking and alcohol intensities, respectively. Hence, the dependent variable (SCCHN), exposures (CYP2A6*2 and ADH1B*2) and associated potential confounders are described below.

4.10.1    Dependent (outcome) variable – SCCHN

The dependent variable was SCCHN, as described in sub-section 4.9.1.

4.10.2    Independent (main exposure) variables – CYP2A6*2 and ADH1B*2

In this study, CYP2A6*2 was genotyped as TT, AA and AT (A = minor allele). Relative to carriers of this allele (AT or AA genotype), non-carriers (TT genotype) are documented to smoke with

4.10.3    Mediators – Intensity of smoking and alcohol consumption

Among the various dimensions of smoking and alcohol consumption behaviour, CYP2A6*2 is strongly associated with the intensity of smoking, and ADH1B*2 with the intensity of alcohol consumption (241, 396). Hence, we used the intensity measures of these behaviours as mediators.

Details of the smoking data collection are described in sub-section 4.9.3. All tobacco types were converted to a commercial cigarette equivalent based on their nicotine content (1/9 cigar = 1/3.5 pipe = 1/2 hand-rolled cigarette = 1 commercial cigarette) (50). From the total duration and frequency of commercial cigarettes used, we calculated the average number of commercial cigarettes smoked per day over the lifetime. Using techniques described in sub-section 4.9.3, the

The data collection on alcohol consumption and the creation of an intensity measure of ethanol consumption are described in sub-section 4.9.4. Using a technique similar to the one employed for the categorization of smoking intensity, the optimal cut-off point for categorizing the average amount of ethanol in millilitres consumed per day over the lifetime was identified to be 25 ml of ethanol. The final intensity measure for alcohol was represented by a binary variable: mild drinkers (coded 0), participants who consumed up to 25 ml of ethanol per day, and heavy drinkers (coded 1), participants who consumed more than 25 ml of ethanol per day considered as

4.10.4    Covariates used as potential confounders

Manuscript III involved analyses of mediation and interaction based on the counterfactual causal framework. For the estimation and causal interpretation of effects in mediation studies using the counterfactual framework, four no-confounding assumptions are required, along with correct model specification (335): there is no unmeasured confounding of the effects of (i) the genetic exposure on SCCHN, (ii) the genetic exposure on the associated mediating risk behaviour, and (iii) the mediating risk behaviour on SCCHN, and (iv) none of the mediating risk behaviour-SCCHN confounders is affected by the associated genetic exposure. We addressed (i) and (ii) by restricting our analysis to Caucasians, thus mitigating confounding due to population stratification (397). For (iii), we adjusted for potential confounders of the relationship between risk behaviours and SCCHN. For the smoking intensity-SCCHN association, we identified duration of smoking and time since cessation of smoking (continuous, mean centred, current and non-smokers recoded to zero), and intensity of alcohol consumption (continuous, modelled with restricted cubic splines) as confounders. For the alcohol intensity-SCCHN association, time since stopping alcohol use (continuous, mean centred, current and non-users recoded to zero) and pack-years of commercial cigarette equivalents (as described previously in sub-section 4.9.3) were identified. Additionally, we adjusted for age (continuous), sex, number of years of education (continuous) and HPV risk types for both associations. These variables are not known to be affected by the associated genetic exposures, which may address the fourth no-confounding assumption. Please refer to sub-section 4.9.4 for details on these confounders.

4.11   Statistical analysis

This section presents the details of general and specific statistical techniques used to analyse the data for each manuscript.

4.11.1    General considerations

Descriptive statistical analysis was performed to explore the distribution of variables used in the study among cases and controls. t-tests were used to compare means of continuous variables between the two groups, while chi-square tests based on cross-tabulations were used to compare categorical data (398). For Manuscripts II and III, which involved genetic variants, deviations from Hardy-Weinberg equilibrium were assessed among the control population using chi-square tests. Minor allele frequencies were estimated among controls.
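As an illustration of the Hardy-Weinberg check and the minor allele frequency estimate, the sketch below applies the usual 1-degree-of-freedom chi-square test to genotype counts for a biallelic SNP among controls. The function name and the example counts are hypothetical, and this is not the code used in the study.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic SNP.

    n_aa, n_ab, n_bb: observed genotype counts (homozygous A, heterozygous,
    homozygous B). Returns (statistic, p_value, minor_allele_frequency).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, min(p, q)
```

With counts of 25, 50 and 25 the observed and expected genotype counts coincide, so the statistic is 0 and there is no evidence of deviation from equilibrium.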

The primary dependent/outcome variable investigated in each manuscript was binary. Furthermore, the exposure models used to create inverse probability weights for the marginal structural models in the 1st manuscript, and the mediator models fitted in the 3rd manuscript, had binary dependent variables. Hence, all manuscripts relied on binary logistic regression models to calculate association or effect estimates.

Binary logistic regression

Binary logistic regression is a type of generalized linear model used to model the probability of a binary response (dependent) variable from any number of independent predictor variables by fitting the data to a logistic curve (390). If P is the probability of a disease occurring and 1-P is the probability of the disease not occurring, then P/(1-P) gives the odds of the disease occurring. A log transformation allows the odds of a disease to be expressed as a linear function of the independent variables as:
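In its standard textbook form, the relationship just described can be written as follows. The symbols are generic and are assumptions here, not the author's own notation: $\beta_0$ is the intercept, $X_1, \dots, X_k$ are the independent variables and $\beta_1, \dots, \beta_k$ are their coefficients on the log-odds scale.

```latex
\operatorname{logit}(P) = \ln\!\left(\frac{P}{1-P}\right)
  = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k
```

Exponentiating a coefficient $\beta_j$ gives the odds ratio associated with a one-unit increase in $X_j$, holding the other variables fixed.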

A01* = downward mobility = unexposed in CH and exposed in EAH, irrespective of exposure in LAH, and A11* = exposed to disadvantageous SEP in both CH and EAH irrespective of exposure status in LAH.

Similarly, the equation for EAH to LAH mobility is:

Logit(Y | A EL_mobility) = β1 A*10 + β2 A*01 + β3 A*11        Eq7

Where A*10 = being exposed in EAH and unexposed in LAH, irrespective of exposure status in CH, A*01 = unexposed in EAH and exposed in LAH, irrespective of exposure in CH, and A*11 = exposed in both EAH and LAH irrespective of exposure status in CH.

The reference category for each mobility pattern is being unexposed at both time periods (no mobility), irrespective of exposure status in the other time period which is not included in a specific mobility testing.
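The indicator coding behind Eq7 can be sketched as below. The function names and the coefficient values used in any call are hypothetical, and the linear predictor is reproduced exactly as printed, that is, without an intercept term.

```python
def mobility_indicators(earlier, later):
    """Indicator variables for mobility between two life periods.

    earlier, later: 1 if exposed to disadvantageous SEP in that period, else 0.
    The reference pattern (unexposed in both periods) has all indicators at 0.
    """
    return {
        "A10": int(earlier == 1 and later == 0),  # exposed, then unexposed
        "A01": int(earlier == 0 and later == 1),  # unexposed, then exposed
        "A11": int(earlier == 1 and later == 1),  # exposed in both periods
    }


def el_mobility_logit(eah, lah, beta1, beta2, beta3):
    """Linear predictor of Eq7 for EAH-to-LAH mobility (CH status ignored)."""
    a = mobility_indicators(eah, lah)
    return beta1 * a["A10"] + beta2 * a["A01"] + beta3 * a["A11"]
```

Each participant falls into exactly one of the four patterns, so at most one indicator equals 1 and each coefficient is interpreted relative to the no-mobility, never-exposed reference group.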



1. WHO. International Statistical Classification of Diseases and Related Health Problems 10th Revision – ICD-10 Version:2010. 2010.

2. Sanderson RJ, Ironside JAD. Squamous cell carcinomas of the head and neck. BMJ : British Medical Journal. 2002;325(7368):822-7.

3. Ferlay J, Soerjomataram I, Ervik M, Dikshit R, Eser S, Mathers C, et al. GLOBOCAN 2012 v1.0, Cancer Incidence and Mortality Worldwide: IARC CancerBase No. 11 [Internet]. Lyon, France: International Agency for Research on Cancer; 2013. Available from: http://globocan.iarc.fr, Accessed on January 05, 2015.

4. Shield KD, Ferlay J, Jemal A, Sankaranarayanan R, Chaturvedi AK, Bray F, et al. The global incidence of lip, oral cavity, and pharyngeal cancers by subsite in 2012. CA: A Cancer Journal for Clinicians. 2016.

5. Warnakulasuriya S. Global epidemiology of oral and oropharyngeal cancer. Oral oncology. 2009;45(4-5):309-16.

6. Kulkarni MR. Head and Neck Cancer Burden in India. 2013;4(April):29-35.

7. Ferlay J, Shin H-R, Bray F, Forman D, Mathers C, Parkin DM. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. International journal of cancer. 2010;127(12):2893-917.

8. Canadian Cancer Statistics 2016. Toronto, ON: Canadian Cancer Society; 2016.

9. http://globocan.iarc.fr/. 2012.

10. Segi M, Tōhoku D, Nippon Taigan K. Cancer mortality for selected sites in 24 countries. 1950:6 v.

11. Doll R, Payne P. Cancer Incidence in Five Continents, Vol. I Union Internationale Contre le Cancer, Geneva. 1966.

12. Buchmann L, Conlee J, Hunt J, Agarwal J, White S. Psychosocial distress is prevalent in head and neck cancer patients. The Laryngoscope. 2013;123(6):1424-9.

13. Oliveira Cd, Bremner K, Reka P, Gunraj N, Chan K, Stuart P, et al. Understanding the costs of cancer care before and after diagnosis for the 21 most common cancers in Ontario : a population-based descriptive study. 2013:1-8.

14. Kam D, Salib A, Gorgy G, et al. Incidence of suicide in patients with head and neck cancer. JAMA Otolaryngology–Head & Neck Surgery. 2015;141(12):1075-81.

15. Canadian Cancer Society, 2011.

16. SEER Cancer Statistics Review; 1973–1998.

17. Day GL, Blot WJ. Second primary tumors in patients with oral cancer. Cancer. 1992;70(1):14-9.

18. Lippman SM, Hong WK. Second malignant tumors in head and neck squamous cell carcinoma: the overshadowing threat for patients with early-stage disease. Int J Radiat Oncol Biol Phys. 1989;17(3):691-4.

19. Lingen MW, Kalmar JR, Karrison T, Speight PM. Critical evaluation of diagnostic aids for the detection of oral cancer. Oral oncology. 2008;44(1):10-22.

20. Carvalho AL, Pintos J, Schlecht NF, et al. Predictive factors for diagnosis of advanced-stage squamous cell carcinoma of the head and neck. Archives of Otolaryngology–Head & Neck Surgery. 2002;128(3):313-8.

21. Menzin J, Lines LM, Manning LN. The economics of squamous cell carcinoma of the head and neck. Current opinion in otolaryngology & head and neck surgery. 2007;15(2):68-73.

22. Blot WJ. Oral and Pharyngeal cancers. Cancer Surv. 1994;19-20:23-42.

23. Marron M, Boffetta P, Zhang Z-F, Zaridze D, Wunsch-Filho V, Winn DM, et al. Cessation of alcohol drinking, tobacco smoking and the reversal of head and neck cancer risk. 1 ed. England: International Agency for Research on Cancer, Lyon, France.; 2010. p. 182-96.

24. Hashibe M, Brennan P, Benhamou S, Castellsague X, Chen C, Curado MP, et al. Alcohol Drinking in Never Users of Tobacco, Cigarette Smoking in Never Drinkers, and the Risk of Head and Neck Cancer: Pooled Analysis in the International Head and Neck Cancer Epidemiology Consortium. Journal of the National Cancer Institute. 2007;99(10):777-89.

25. International Agency for Research on Cancer (IARC). IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Volume 100B. Biological Agents. Lyon, France: International Agency for Research on Cancer; 2009.

26. Forte T, Niu J, Lockwood Ga, Bryant HE. Incidence trends in head and neck cancers and human papillomavirus (HPV)-associated oropharyngeal cancer in Canada, 1992-2009. Cancer causes & control : CCC. 2012;23(8):1343-8.

27. Chaturvedi AK, Anderson WF, Lortet-Tieulent J, Curado MP, Ferlay J, Franceschi S, et al. Worldwide Trends in Incidence Rates for Oral Cavity and Oropharyngeal Cancers. Journal of clinical oncology : official journal of the American Society of Clinical Oncology. 2013;31(36).

28. Petti S. Lifestyle risk factors for oral cancer. Oral oncology. 2009;45(4-5):340-50.

29. The relation of socioeconomic status to oral and pharyngeal cancer. Epidemiology. 1991;2(3):194-200.

30. Conway DI, McKinney PA, McMahon AD, Ahrens W, Schmeisser N, Benhamou S, et al. Socioeconomic factors associated with risk of upper aerodigestive tract cancer in Europe. Eur J Cancer. 2010;46(3):588-98.

31. Conway DI, Petticrew M, Marlborough H, Berthiller J, Hashibe M, Macpherson LM. Socioeconomic inequalities and oral cancer risk: a systematic review and meta-analysis of case-control studies. International journal of cancer. 2008;122(12):2811-9.

32. Pruyn JF, de Jong PC, Bosman LJ, van Poppel JW, van Den Borne HW, Ryckman RM, et al. Psychosocial aspects of head and neck cancer–a review of the literature. Clinical otolaryngology and allied sciences. 1986;11(6):469-74.

33. Knox SS. From ‘omics’ to complex disease: a systems biology approach to gene-environment interactions in cancer. Cancer Cell Int. 2010(1475-2867).

34. Olshan AF, Weissler MC, Watson MA, Bell DA. GSTM1, GSTT1, GSTP1, CYP1A1, and NAT1 Polymorphisms, Tobacco Use, and the Risk of Head and Neck Cancer. 2000:185-91.

35. Hashibe M, Brennan P, Strange RC, Bhisey R, Cascorbi I, Lazarus P, et al. CYP1A1 Genotypes and Risk of Head and Neck Cancer Genotypes and Risk of Head and Neck Cancer. 2003:1509-17.

36. Brennan P, Lewis S, Hashibe M, Bell DA, Boffetta P, Bouchardy C, et al. Pooled analysis of alcohol dehydrogenase genotypes and head and neck cancer: a HuGE review. Am J Epidemiol. 2004;159(1):1-16.

37. Hung RJ, van der Hel O, Tavtigian SV, Brennan P, Boffetta P, Hashibe M. Perspectives on the molecular epidemiology of aerodigestive tract cancers. Mutation research. 2005;592(1-2):102-18.

38. Brunotto M, Zarate AM, Bono A, Barra JL, Berra S. Risk genes in head and neck cancer : A systematic review and meta-analysis of last 5 years. Oral oncology. 2013.

39. Canova C, Richiardi L, Merletti F, Pentenero M, Gervasio C, Tanturri G, et al. Alcohol, tobacco and genetic susceptibility in relation to cancers of the upper aerodigestive tract in northern Italy. Tumori. 2010;96(1):1-10.

40. Sreelekha TT, Ramadas K, Pandey M, Thomas G, Nalinakumari KR, Pillai MR. Genetic polymorphism of CYP1A1, GSTM1 and GSTT1 genes in Indian oral cancer. Oral oncology. 2001;37(7):593-8.

41. Vincent-Chong VK, Ismail SM, Rahman Zaa, Sharifah Na, Anwar a, Pradeep PJ, et al. Genome-wide analysis of oral squamous cell carcinomas revealed over expression of ISG15, Nestin and WNT11. Oral diseases. 2012;18(5):469-76.

42. International Agency for Research on Cancer (IARC). IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Volume 100E. Personal Habits and Indoor Combustions. Lyon, France: International Agency for Research on Cancer. 2009.

43. Heck JE, Berthiller J, Vaccarella S, Winn DM, Smith EM, Shan’gina O, et al. Sexual behaviours and the risk of head and neck cancers: a pooled analysis in the International Head and Neck Cancer Epidemiology (INHANCE) consortium. 1 ed. England: Lifestyle, Environment, and Cancer Group, International Agency for Research on Cancer, Lyon, France.; 2010. p. 166-81.

44. Tezal M, Sullivan Ma, Reid ME, Marshall JR, Hyland A, Loree T, et al. Chronic periodontitis and the risk of tongue cancer. Archives of otolaryngology–head & neck surgery. 2007;133(5):450-4.

45. Laprise C, Shahul HP, Madathil SA, Thekkepurakkal AS, Castonguay G, Varghese I, et al. Periodontal diseases and risk of oral cancer in Southern India: Results from the HeNCe Life study. International journal of cancer. 2016;139(7):1512-9.

46. Giovino GA, Henningfield JE, Tomar SL, Escobedo LG, Slade J. Epidemiology of Tobacco Use and Dependence. Epidemiologic Reviews. 1995;17(1):48-65.

47. Report GI. Global Adult Tobacco Survey; India 2009-2010, Ministry of Health & Family Welfare.

48. Tobacco use in Canada: Patterns and trends, 2013.

49. International Agency for Research on Cancer. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Volume 83: Tobacco Smoke and Involuntary Smoking. Lyon, France: International Agency for Research on Cancer. 2004.

50. Hoffmann D, Hoffmann I. Chemistry and toxicology. In: US Department of Health and Human Services. Cigars: health effects and trends (Smoking and Tobacco Control Monograph 9). DHHS (Publ No. NIH 98-4302), 1998:55–104.

51. Kumar R, Prakash S, Kushwah AS, Vijayan VK. Breath carbon monoxide concentration in cigarette and bidi smokers in India. Indian J Chest Dis Allied Sci. 2010;52(1):19-24.

52. Reid J, Hammond D, Rynard V, Burkhalter R. Tobacco Use in Canada: Patterns and Trends, 2015 Edition. Waterloo, ON: Propel Centre for Population Health Impact, University of Waterloo; 2015.

53. Pindborg JJ, Kiaer J, Gupta PC, Chawla TN. Studies in oral leukoplakias. Prevalence of leukoplakia among 10 000 persons in Lucknow, India, with special reference to use of tobacco and betel nut. Bull World Health Organ. 1967;37(1):109-16.

54. Balaram P, Sridhar H, Rajkumar T, Vaccarella S, Herrero R, Nandakumar A, et al. Oral cancer in southern India: the influence of smoking, drinking, paan-chewing and oral hygiene. International journal of cancer. 2002;98(3):440-5.

55. Jayalekshmi PA, Gangadharan P, Akiba S, Nair RRK, Tsuji M, Rajan B. Tobacco chewing and female oral cavity cancer risk in Karunagappally cohort, India. British Journal of Cancer. 2009;100(5):848-52.

56. Madathil SA, Rousseau MC, Wynant W, Schlecht NF, Netuveli G, Franco EL, et al. Nonlinear association between betel quid chewing and oral cancer: Implications for prevention. Oral oncology. 2016;60:25-31.

57. Hoffmann D, Sanghvi LD, Wynder EL. Comparative chemical analysis of Indian bidi and American cigarette smoke. International journal of cancer. 1974;14:49-55.

58. Watson CH, Polzin GM, Calafat AM, Ashley DL. Determination of tar, nicotine, and carbon monoxide yields in the smoke of bidi cigarettes. Nicotine Tob Res. 2003;5(5):747-53.

59. Pakhale SS, et al. Chemical analysis of smoke of Indian cigarettes, bidis and other indigenous forms of smoking-levels of steam-volatile phenol, hydrogen cyanide and benzo(a)pyrene. The Indian journal of chest diseases and allied sciences. 1990;32(2).

60. Pakhale SS, Sarkar S, Jayant K, Bhide SV. Carcinogenicity of Indian bidi and cigarette smoke condensate in Swiss albino mice. J Cancer Res Clin Oncol. 1988;114(6):647-9.

61. IARC. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans. 1985.

62. WHO. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Alcohol Drinking. IARC, Lyon, France; 1988.

63. IARC. IARC Monographs on evaluation of human carcinogens: Agents classified by IARC Monographs, Volumes 1-109.

64. De Stavola BL, Daniel RM. Marginal structural models: the way forward for life-course epidemiology? Epidemiology (Cambridge, Mass). 2012;23(2):233-7.

65. Freedman ND, Abnet CC, Leitzmann MF, Hollenbeck AR, Schatzkin A. Prospective investigation of the cigarette smoking-head and neck cancer association by sex. Cancer. 2007;110(7):1593-601.

66. Centers for Disease Control and Prevention. Smoking and tobacco use: The health consequences of smoking. 2004.

67. Wyss A, Hashibe M, Chuang SC, Lee YC, Zhang ZF, Yu GP, et al. Cigarette, cigar, and pipe smoking and the risk of head and neck cancers: pooled analysis in the International Head and Neck Cancer Epidemiology Consortium. Am J Epidemiol. 2013;178(5):679-90.

68. Shiu M, Chen TH. Impact of betel quid, tobacco and alcohol on three-stage disease natural history of oral leukoplakia and cancer: implication for prevention of oral cancer. Eur J Cancer Prev. 2004;13(1).

69. Muwonge R, Ramadas K, Sankila R, Thara S, Thomas G, Vinoda J, et al. Role of tobacco smoking, chewing and alcohol drinking in the risk of oral cancer in Trivandrum, India: A nested case-control design using incident cancer cases. Oral oncology. 2008;44(5):446-54.

70. Gupta B, Johnson NW. Systematic Review and Meta-Analysis of Association of Smokeless Tobacco and of Betel Quid without Tobacco with Incidence of Oral Cancer in South Asia and the Pacific. PLoS ONE. 2014;9(11):e113385.

71. Awang MN. Betel quid and oral carcinogenesis. Singapore Med J. 1988;29:589-93.

72. Balaram P, Sridhar H, Rajkumar T, Vaccarella S, Herrero R, Nandakumar A, et al. Oral cancer in southern India: The influence of smoking, drinking, paan-chewing and oral hygiene. International journal of cancer. 2002;98(3):440-5.

73. Chandra P, et al. Prevalence and correlates of areca nut use among psychiatric patients in India. Drug and Alcohol Dependence. 2003;69(3):311-6.

74. Sankaranarayanan R, Duffy SW, Nair MK, Padmakumary G, Day NE. Tobacco and alcohol as risk factors in cancer of the larynx in Kerala, India. International journal of cancer. 1990;45(5):879-82.

75. Rahman M, Sakamoto J, Fukui T. Bidi smoking and oral cancer: a meta-analysis. International journal of cancer Journal international du cancer. 2003;106(4):600-4.

76. Abdoul Hossain MMDDB. Risk for oral cancer associated to smoking, smokeless and oral dip products. Indian J Public Health. 2012;56(1):57-60.

77. Tobacco chewing and female oral cavity cancer risk in Karunagappally cohort, India. Br J Cancer. 2009;100(5):848-52.

78. Schlecht NF, Franco EL, Pintos J, Kowalski LP. Effect of smoking cessation and tobacco type on the risk of cancers of the upper aero-digestive tract in Brazil. Epidemiology (Cambridge, Mass). 1999;10(4):412-8.

79. Schlecht NF, Franco EL, Pintos J, Negassa A, Kowalski LP, Oliveira BV, et al. Interaction between Tobacco and Alcohol Consumption and the Risk of Cancers of the Upper Aero-Digestive Tract in Brazil. American Journal of Epidemiology. 1999;150(11):1129-37.

80. Znaor A, Brennan P, Gajalakshmi V, Mathew A, Shanta V, Varghese C, et al. Independent and combined effects of tobacco smoking, chewing and alcohol drinking on the risk of oral, pharyngeal and esophageal cancers in Indian men. International journal of cancer Journal international du cancer. 2003;105(5):681-6.

81. Polesel J, Talamini R, La Vecchia C, Levi F, Barzan L, Serraino D, et al. Tobacco smoking and the risk of upper aero-digestive tract cancers: A reanalysis of case-control studies using spline models. International journal of cancer Journal international du cancer. 2008;122(10):2398-402.

82. Guha N, Warnakulasuriya S, Vlaanderen J, Straif K. Betel quid chewing and the risk of oral and oropharyngeal cancers: a meta-analysis with implications for cancer control. International journal of cancer. 2014;135(6):1433-43.

83. Lee CH, Lee KW, Fang FM, Wu DC, Tsai SM, Chen PH, et al. The neoplastic impact of tobacco-free betel-quid on the histological type and the anatomical site of aerodigestive tract cancers. International journal of cancer. 2012;131(5):E733-43.

84. Blot WJ, McLaughlin JK, Winn DM, Austin DF, Greenberg RS, Preston-Martin S, et al. Smoking and drinking in relation to oral and pharyngeal cancer. Cancer Res. 1988;48(11):3282-7.

85. Lewin F, Norell SE, Johansson H, Gustavsson P, Wennerberg J, Biorklund A, et al. Smoking tobacco, oral snuff, and alcohol in the etiology of squamous cell carcinoma of the head and neck: a population-based case-referent study in Sweden. Cancer. 1998;82(7):1367-75.

86. Castellsague X, et al. The role of type of tobacco and type of alcoholic beverage in oral carcinogenesis. International journal of cancer. 2004;108(714-749).

87. Bosetti C, Gallus S, Peto R, Negri E, Talamini R, Tavani A, et al. Tobacco smoking, smoking cessation, and cumulative risk of upper aerodigestive tract cancers. Am J Epidemiol. 2008;167(4):468-73.

88. WHO. Global status report on alcohol and health. Switzerland; 2011.

89. Health Canada. Canadian Alcohol and Drug Use Monitoring Survey. Canada; 2011 [updated 2012-08-02]. Available from: http://www.hc-sc.gc.ca/hc-ps/drugs-drogues/stat/_2011/tables-tableaux-eng.php#t10.

90. Prasad R. Alcohol use on the rise in India. The Lancet.373(9657):17-8.

91. Economic review 2012- State planning board, Trivandrum, India, http://spb.kerala.gov.in/~spbuser/images/pdf/er12/Chapter4/chapter04.html. 2012.

92. Weisburger JH, Wynder EL. The role of genotoxic carcinogens and of promoters in carcinogenesis and in human cancer causation. Acta Pharmacol Toxicol (Copenh). 1984;55 Suppl 2:53-68.

93. Pöschl G, Seitz HK. Alcohol and Cancer. Alcohol and Alcoholism. 2004;39(3):155-65.

94. Seitz HK, Stickel F, Homann N. Pathogenetic mechanisms of upper aerodigestive tract cancer in alcoholics. International journal of cancer. 2004;108(4):483-7.

95. Doll R, Forman D, La Vecchia C, Woutersen R. Alcoholic beverages and cancers of the digestive tract and larynx. 1999.

96. Baan R, Straif K, Grosse Y, Secretan B, El Ghissassi F, Bouvard V, et al. Carcinogenicity of alcoholic beverages. The Lancet Oncology. 2007;8(4):292-3.

97. Secretan B, Straif K, Baan R, Grosse Y, El Ghissassi F, Bouvard V, et al. A review of human carcinogens—Part E: tobacco, areca nut, alcohol, coal smoke, and salted fish. The Lancet Oncology. 2009;10(11):1033-4.

98. Boffetta P, Hashibe M. Alcohol and cancer. The Lancet Oncology. 2006;7(2):149-56.

99. Chang JS, Straif K, Guha N. The role of alcohol dehydrogenase genes in head and neck cancers: a systematic review and meta-analysis of ADH1B and ADH1C. Mutagenesis. 2012;27(3):275-86.

100. Connor J. Alcohol consumption as a cause of cancer. Addiction. 2016.

101. Turati F, Garavello W, Tramacere I, Pelucchi C, Galeone C, Bagnardi V, et al. A meta-analysis of alcohol drinking and oral and pharyngeal cancers: results from subgroup analyses. Alcohol and alcoholism (Oxford, Oxfordshire). 2012;48(1):107-18.

102. Talamini R, La Vecchia C, Levi F, Conti E, Favero A, Franceschi S. Cancer of the oral cavity and pharynx in nonsmokers who drink alcohol and in nondrinkers who smoke tobacco. (0027-8874 (Print)).

103. Schlecht NF, Pintos J, Kowalski LP, Franco EL. Effect of type of alcoholic beverage on the risks of upper aerodigestive tract cancers in Brazil. Cancer Causes Control. 2001;12(7):579-87.

104. Bagnardi V, Blangiardo M, La Vecchia C, Corrao G. A meta-analysis of alcohol drinking and cancer risk. (0007-0920 (Print)).

105. Hashibe M, Brennan P, Chuang SC, Boccia S, Castellsague X, Chen C, et al. Interaction between tobacco and alcohol use and the risk of head and neck cancer: pooled analysis in the International Head and Neck Cancer Epidemiology Consortium. Cancer Epidemiol Biomarkers Prev. 2009;18(2):541-50.

106. Maasland DH, van den Brandt PA, Kremer B, Goldbohm RA, Schouten LJ. Alcohol consumption, cigarette smoking and the risk of subtypes of head-neck cancer: results from the Netherlands Cohort Study. (1471-2407 (Electronic)).

107. Polesel J, Dal Maso L, Bagnardi V, Zucchetto A, Zambon A, Levi F, et al. Estimating dose-response relationship between ethanol and risk of cancer using regression spline models. International journal of cancer. 2005;114(5):836-41.

108. Dal Maso L, Torelli N, Biancotto E, Di Maso M, Gini A, Franchin G, et al. Combined effect of tobacco smoking and alcohol drinking in the risk of head and neck cancers: a re-analysis of case-control studies using bi-dimensional spline models. Eur J Epidemiol. 2016;31(4):385-93.

109. Bagnardi V, Rota M, Botteri E, Tramacere I, Islami F, Fedirko V, et al. Alcohol consumption and site-specific cancer risk: a comprehensive dose-response meta-analysis. Br J Cancer. 2015;112(3):580-93.

110. Government of Kerala Vision 2030, Chapter 4- Health sector.

111. Lubin JH, Purdue M, Kelsey K, Zhang ZF, Winn D, Wei Q, et al. Total exposure and exposure rate effects for alcohol and smoking and risk of head and neck cancer: a pooled analysis of case-control studies. Am J Epidemiol. 2009;170(8):937-47.

112. Hsu TC, Spitz MR, Schantz SP. Mutagen sensitivity: a biological marker of cancer susceptibility. Cancer Epidemiology Biomarkers & Prevention. 1991;1(1):83.

113. Ho T, Wei Q, Sturgis EM. Epidemiology of carcinogen metabolism genes and risk of squamous cell carcinoma of the head and neck. Head Neck. 2007;29(7):682-99.

114. Harris CC, Mulvihill JJ, Thorgeirsson SS, Minna JD. Individual differences in cancer susceptibility. Ann Intern Med. 1980;92(6):809-25.

115. Luch A. Nature and nurture – lessons from chemical carcinogenesis. Nat Rev Cancer. 2005;5(2):113-25.

116. Perera FP. Molecular Epidemiology: Insights Into Cancer Susceptibility, Risk Assessment, and Prevention. Journal of the National Cancer Institute. 1996;88(8):496-509.

117. Friedlander PL. Genomic instability in head and neck cancer patients. Head Neck. 2001;23(8):683-91.

118. Sturgis EM, Wei Q. Genetic susceptibility–molecular epidemiology of head and neck cancer. Curr Opin Oncol. 2002;14(3):310-7.

119. Chen YC, Hunter DJ. Molecular epidemiology of cancer. CA Cancer J Clin. 2005;55(1):45-54; quiz 7.

120. He Y, Hoskins JM, McLeod HL. Copy Number Variants in pharmacogenetic genes. Trends in molecular medicine. 2011;17(5):244-51.

121. Hecht SS, Hoffmann D. Tobacco-specific nitrosamines, an important group of carcinogens in tobacco and tobacco smoke. Carcinogenesis. 1988;9(6):875-84.

122. Costa LED. Gene–Environment Interactions: Fundamentals of Ecogenetics. National Institute of Environmental Health Science; 2006. p. A382-A.

123. Hunter DJ. Gene-environment interactions in human diseases. Nature reviews Genetics. 2005;6(4):287-98.

124. Singh MS, Michael M. Role of xenobiotic metabolic enzymes in cancer epidemiology. Methods Mol Biol. 2009;472:243-64.

125. Ingelman-Sundberg M, Oscarson M, McLellan RA. Polymorphic human cytochrome P450 enzymes: an opportunity for individualized drug treatment. Trends in Pharmacological Sciences. 1999;20(8):342-9.

126. Dong LM, Potter JD, White E, Ulrich CM, Cardon LR, Peters U. The Role of Polymorphisms in Candidate Genes. 2013;299(20):2423-36.

127. Ma Q, Lu AYH. CYP1A Induction and Human Risk Assessment: An Evolving Tale of in Vitro and in Vivo Studies. Drug Metabolism and Disposition. 2007;35(7):1009-16.

128. Le Gal A, Dreano Y, Lucas D, Berthou F. Diversity of selective environmental substrates for human cytochrome P450 2A6: alkoxyethers, nicotine, coumarin, N-nitrosodiethylamine, and N-nitrosobenzylmethylamine. Toxicol Lett. 2003;144(1):77-91.

129. Bozina N, Bradamante V, Lovric M. Genetic polymorphism of metabolic enzymes P450 (CYP) as a susceptibility factor for drug response, toxicity, and cancer risk. Arh Hig Rada Toksikol. 2009;60(2):217-42.

130. Sim SC, Ingelman-Sundberg M. The Human Cytochrome P450 (CYP) Allele Nomenclature website: a peer-reviewed database of CYP variants and their associated effects. Human genomics. 2010;4(4):278-81.

131. Costa CG, Eaton DL. Chapter 1. Introduction; Gene-Environment Interactions: Fundamentals of Ecogenetics. 2006. p. 2-6.

132. Qin J, Zhang J-X, Li X-P, Wu B-Q, Chen G-B, He X-F. Association between the CYP1A1 A2455G polymorphism and risk of cancer: evidence from 272 case–control studies. Tumor Biology. 2014;35(4):3363-76.

133. He X-F, Wei W, Liu Z-Z, Shen X-L, Yang X-B, Wang S-L, et al. Association between the CYP1A1 T3801C polymorphism and risk of cancer: Evidence from 268 case–control studies. Gene. 2014;534(2):324-44.

134. Lu D, Yu X, Du Y. Meta-analyses of the effect of cytochrome P450 2E1 gene polymorphism on the risk of head and neck cancer. Mol Biol Rep. 2011;38(4):2409-16.

135. Tang K, Li Y, Zhang Z, Gu Y, Xiong Y, Feng G, et al. The PstI/RsaI and DraI polymorphisms of CYP2E1 and head and neck cancer risk: a meta-analysis based on 21 case-control studies. BMC Cancer. 2010;10:575.

136. Hashibe M, Boffetta P, Zaridze D, Shangina O, Szeszenia-Dabrowska N, Mates D, et al. Evidence for an important role of alcohol- and aldehyde-metabolizing genes in cancers of the upper aerodigestive tract. Cancer Epidemiol Biomarkers Prev. 2006;15(4):696-703.

137. Lang J, Song X, Cheng J, Zhao S, Fan J. Association of GSTP1 Ile105Val Polymorphism and Risk of Head and Neck Cancers: A Meta-Analysis of 28 Case-Control Studies. 2012;7(11).

138. Shukla P, Gupta D, Pant MC, Parmar D. CYP 2D6 polymorphism: a predictor of susceptibility and response to chemoradiotherapy in head and neck cancer. J Cancer Res Ther. 2012;8(1):40-5.

139. Tripathy CB, Roy N. Meta-analysis of glutathione S-transferase M1 genotype and risk toward head and neck cancer. Head & neck. 2006;28(3):217-24.

140. Khlifi R, Messaoud O, Rebai A, Hamza-Chaffai A. Polymorphisms in the Human Cytochrome P450 and Arylamine N-Acetyltransferase: Susceptibility to Head and Neck Cancers. BioMed Research International. 2013;2013:20.

141. San Jose C, Cabanillas A, Benitez J, Carrillo JA, Jimenez M, Gervasini G. CYP1A1 gene polymorphisms increase lung cancer risk in a high-incidence region of Spain: a case control study. BMC Cancer. 2010;10:463.

142. Hashibe M, Brennan P, Strange RC, Bhisey R, Cascorbi I, Lazarus P, et al. Meta- and Pooled Analysis of GSTM1, GSTT1, GSTP1 and CYP1A1 Genotypes and Risk of Head and Neck Cancer. 2003:1509-17.

143. Garte S, Gaspari L, Alexandrie AK. Metabolic gene polymorphism frequencies in control populations. Cancer Epidemiol Biomarkers Prev. 2001;10.

144. Boccia S, Cadoni G, Sayed-Tabatabaei FA, Volante M, Arzani D, De Lauretis A, et al. CYP1A1, CYP2E1, GSTM1, GSTT1, EPHX1 exons 3 and 4, and NAT2 polymorphisms, smoking, consumption of alcohol and fruit and vegetables and risk of head and neck cancer. J Cancer Res Clin Oncol. 2008;134(1):93-100.

145. Rojas M, Cascorbi I, Alexandrov K, Kriek E, Auburtin G, Mayer L, et al. Modulation of benzo[a]pyrene diolepoxide-DNA adduct levels in human white blood cells by CYP1A1, GSTM1 and GSTT1 polymorphism. Carcinogenesis. 2000;21(1):35-41.

146. Liu L, Wu G, Xue F, Li Y, Shi J, Han J. Functional CYP1A1 genetic variants, alone and in combination with smoking, contribute to development of head and neck cancers. European Journal of Cancer. 2013;49(9):2143-51.

147. Olivieri EHR, da Silva SD, Mendonça FF, Urata YN, Vidal DO, Faria MDAM, et al. CYP1A2*1C, CYP2E1*5B, and GSTM1 polymorphisms are predictors of risk and poor outcome in head and neck squamous cell carcinoma patients. Oral oncology. 2009;45(9):e73-9.

148. Qin J, Zhang J-X, Li X-P, Wu B-Q. Association between the CYP1A1 A2455G polymorphism and risk of cancer: evidence from 272 case–control studies. 2013;450.

149. Anantharaman D, Chaubal PM, Kannan S, Bhisey RA, Mahimkar MB. Susceptibility to oral cancer by genetic polymorphisms at CYP1A1, GSTM1 and GSTT1 loci among Indians: tobacco exposure as a risk modulator. Carcinogenesis. 2007;28(7):1455-62.

150. Brunotto M, Zarate AM, Bono A, Barra JL, Berra S. Risk genes in head and neck cancer: a systematic review and meta-analysis of last 5 years. Oral oncology. 2014;50(3):178-88.

151. Chatterjee S, Dhar S, Sengupta B, Ghosh A, De M, Roy S, et al. Polymorphisms of CYP1A1, GSTM1 and GSTT1 Loci as the Genetic Predispositions of Oral Cancers and Other Oral Pathologies: Tobacco and Alcohol as Risk Modifiers. Indian journal of clinical biochemistry : IJCB. 2010;25(3):260-72.

152. Hashibe M, Brennan P, Strange RC, Bhisey R, Cascorbi I, Lazarus P, et al. Meta- and pooled analyses of GSTM1, GSTT1, GSTP1, and CYP1A1 genotypes and risk of head and neck cancer. Cancer Epidemiol Biomarkers Prev. 2003;12(12):1509-17.

153. Varela-Lema L, Taioli E, Ruano-Ravina A, Barros-Dios JM, Benhamou S, Bhisey RA, et al. Meta- and pooled analysis of GSTM1 and CYP1A1 polymorphisms and oropharyngeal cancer: a HuGE-GSEC review. Genetics in medicine : official journal of the American College of Medical Genetics. 2008;10(6):369-84.

154. Wang Y, Yang H, Duan G, Wang H. The association of the CYP1A1 Ile462Val polymorphism with head and neck cancer risk: evidence based on a cumulative meta-analysis. OncoTargets and therapy. 2016;9:2927-34.

155. Xie S, Luo C, Shan X, Zhao S, He J, Cai Z. CYP1A1 MspI polymorphism and the risk of oral squamous cell carcinoma: Evidence from a meta-analysis. Mol Clin Oncol. 2016;4(4):660-6.

156. Knol MJ, VanderWeele TJ. Recommendations for presenting analyses of effect modification and interaction. International journal of epidemiology. 2012;41(2):514-20.

157. Howard LA, Micu AL, Sellers EM, Tyndale RF. Low doses of nicotine and ethanol induce CYP2E1 and chlorzoxazone metabolism in rat liver. J Pharmacol Exp Ther. 2001;299(2):542-50.

158. Yao K, Qin H, Gong L, Zhang R, Li L. CYP2E1 polymorphisms and nasopharyngeal carcinoma risk: a meta-analysis. Eur Arch Otorhinolaryngol. 2016.

159. Fu P, Yang F, Li B, Zhang B, Guan L, Sheng J, et al. Meta-analysis of CYP2E1 polymorphisms in liver carcinogenesis. Digestive and Liver Disease.

160. Zhang MX, Liu K, Wang FG, Wen XW, Song XL. Association between CYP2E1 polymorphisms and risk of gastric cancer: An updated meta-analysis of 32 case-control studies. Mol Clin Oncol. 2016;4(6):1031-8.

161. Boccia S, Cadoni G, Sayed-Tabatabaei FA, Volante M, Arzani D, De Lauretis A, et al. CYP1A1, CYP2E1, GSTM1, GSTT1, EPHX1 exons 3 and 4, and NAT2 polymorphisms, smoking, consumption of alcohol and fruit and vegetables and risk of head and neck cancer. J Cancer Res Clin Oncol. 2008;134.

162. Soya SS, Vinod T, Reddy KS, Gopalakrishnan S, Adithan C. CYP2E1 polymorphisms and gene-environment interactions in the risk of upper aerodigestive tract cancers among Indians. Pharmacogenomics. 2008;9.

163. Ruwali M, Khan AJ, Shah PP, Singh AP, Pant MC, Parmar D. Cytochrome P450 2E1 and head and neck cancer: interaction with genetic and environmental risk factors. Environ Mol Mutagen. 2009;50(6):473-82.

164. Gattá GJF, de Carvalho MB, Siraque MS. Genetic polymorphisms of CYP1A1, CYP2E1, GSTM1, and GSTT1 associated with head and neck cancer. Head & Neck. 2006;28.

165. Yamazaki H, Inui Y, Yun CH, Guengerich FP, Shimada T. Cytochrome P450 2E1 and 2A6 enzymes as major catalysts for metabolic activation of N-nitrosodialkylamines and tobacco-related nitrosamines in human liver microsomes. Carcinogenesis. 1992;13(10):1789-94.

166. Hung HC, Chuang J, Chien YC, Hildesheim A. Genetic polymorphisms of CYP2E1, GSTM1, and GSTT1; environmental factors and risk of oral cancer. 1997:901-5.

167. Maurya SS, Anand G, Dhawan A, Khan AJ, Jain SK, Pant MC, et al. Polymorphisms in drug-metabolizing enzymes and risk to head and neck cancer: evidence for gene-gene and gene-environment interaction. Environ Mol Mutagen. 2014;55(2):134-44.

168. Zhuo X, Song J, Liao J, Zhou W, Ye H, Li Q, et al. Does CYP2E1 RsaI/PstI polymorphism confer head and neck carcinoma susceptibility?: A meta-analysis based on 43 studies. Medicine (Baltimore). 2016;95(43):e5156.

169. Tang K, Li Y, Zhang Z, Gu Y, Xiong Y, Feng G, et al. The PstI/RsaI and DraI polymorphisms of CYP2E1 and head and neck cancer risk: a meta-analysis based on 21 case-control studies. BMC Cancer. 2010;10(1):575.

170. Cury NM, Russo A, Galbiatti AL, Ruiz MT, Raposo LS, Maniglia JV, et al. Polymorphisms of the CYP1A1 and CYP2E1 genes in head and neck squamous cell carcinoma risk. Mol Biol Rep. 2012;39(2):1055-63.

171. Buchard A, Sanchez JJ, Dalhoff K, Morling N. Multiplex PCR detection of GSTM1, GSTT1, and GSTP1 gene variants: simultaneously detecting GSTM1 and GSTT1 gene copy number and the allelic status of the GSTP1 Ile105Val genetic variant. The Journal of molecular diagnostics : JMD. 2007;9(5):612-7.

172. Hezova R, Bienertova-Vasku J, Sachlova M, Brezkova V, Vasku A, Svoboda M, et al. Common polymorphisms in GSTM1, GSTT1, GSTP1, GSTA1 and susceptibility to colorectal cancer in the Central European population. European Journal of Medical Research. 2012;17(1):17.

173. Agúndez JAG, García-Martín E, Martínez C, Benito-León J, Millán-Pascual J, Díaz-Sánchez M, et al. The GSTP1 gene variant rs1695 is not associated with an increased risk of multiple sclerosis. Cellular and Molecular Immunology. 2015;12(6):777-9.

174. Ryberg D, Skaug V, Hewer A, Phillips DH, Harries LW, Wolf CR, et al. Genotypes of glutathione transferase M1 and P1 and their significance for lung DNA adduct levels and cancer risk. Carcinogenesis. 1997;18(7):1285-9.

175. Zhang Z-j, Hao K, Shi R, Zhao G, Jiang G-x, Song Y, et al. Glutathione S-transferase M1 (GSTM1) and glutathione S-transferase T1 (GSTT1) null polymorphisms, smoking, and their interaction in oral cancer: a HuGE review and meta-analysis. American journal of epidemiology. 2011;173(8):847-57.

176. Hu X, Xia H, Srivastava SK, Herzog C, Awasthi YC, Ji X, et al. Activity of four allelic forms of glutathione S-transferase hGSTP1-1 for diol epoxides of polycyclic aromatic hydrocarbons. Biochem Biophys Res Commun. 1997;238(2):397-402.

177. Saarikoski ST, Voho A, Reinikainen M, Anttila S, Karjalainen A, Malaveille C, et al. Combined effect of polymorphic GST genes on individual susceptibility to lung cancer. International journal of cancer. 1998;77(4):516-21.

178. Ruwali M, Singh M, Pant MC, Parmar D. Polymorphism in glutathione S-transferases: susceptibility and treatment outcome for head and neck cancer. Xenobiotica. 2011;41(12):1122-30.

179. Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nature reviews Genetics. 2006;7(2):85-97.

180. CYP2D6 allele nomenclature [Internet]. Available from: http://www.cypalleles.ki.se/cyp2d6.htm.

181. Zhuo W, Wang Y, Zhuo X, Zhu Y, Wang W, Zhu B, et al. CYP1A1 and GSTM1 polymorphisms and oral cancer risk: association studies via evidence-based meta-analyses. Cancer investigation. 2009;27(1):86-95.

182. Neafsey P, Ginsberg G, Hattis D, Sonawane B. Genetic Polymorphism in Cytochrome P450 2D6 (CYP2D6): Population Distribution of CYP2D6 Activity. Journal of Toxicology and Environmental Health, Part B. 2009;12(5-6):334-61.

183. Hoskins JM, Carey LA, McLeod HL. CYP2D6 and tamoxifen: DNA matters in breast cancer. Nat Rev Cancer. 2009;9(8):576-86.

184. Solus JF, Arietta BJ, Harris JR, Sexton DP, Steward JQ, McMunn C, et al. Genetic variation in eleven phase I drug metabolism genes in an ethnically diverse population. Pharmacogenomics. 2004;5(7):895-931.

185. Carol B, Suzanne C. CYP2D6 and smoking behaviour. Pharmacogenetics. 1997;7:411-4.

186. Agundez JA, Gallardo L, Ledesma MC, Lozano L, Rodriguez-Lescure A, Pontes JC, et al. Functionally active duplications of the CYP2D6 gene are more prevalent among larynx and lung cancer patients. Oncology. 2001;61(1):59-63.

187. Yadav SS, Ruwali M, Pant MC, Shukla P, Singh RL, Parmar D. Interaction of drug metabolizing cytochrome P450 2D6 poor metabolizers with cytochrome P450 2C9 and 2C19 genotypes modify the susceptibility to head and neck cancer and treatment response. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 2010;684(1–2):49-55.

188. Gajecka M, Rydzanicz M, Jaskula-Sztul R, Kujawski M, Szyfter W, Szyfter K. CYP1A1, CYP2D6, CYP2E1, NAT2, GSTM1 and GSTT1 polymorphisms or their combinations are associated with the increased risk of the laryngeal squamous cell carcinoma. Mutat Res. 2005;574.

189. Tripathy CB, Roy N. Meta-analysis of glutathione S-transferase M1 genotype and risk toward head and neck cancer. Head Neck. 2006;28(3):217-24.

190. Sundberg K, Dreij K, Seidel A, Jernström B. Glutathione Conjugation and DNA Adduct Formation of Dibenzo[a,l]pyrene and Benzo[a]pyrene Diol Epoxides in V79 Cells Stably Expressing Different Human Glutathione Transferases. Chemical Research in Toxicology. 2002;15(2):170-9.

191. Hiyama T, Yoshihara M, Tanaka S, Chayama K. Genetic polymorphisms and head and neck cancer risk (Review). 2008:945-73.

192. Zhuo W, Wang Y, Zhuo X, Zhu Y, Wang W, Zhu B, et al. CYP1A1 and GSTM1 polymorphisms and oral cancer risk: association studies via evidence-based meta-analyses. Cancer Invest. 2009;27(1):86-95.

193. Suzen HS, Guvenc G, Turanli M, Comert E, Duydu Y, Elhan A. The role of GSTM1 and GSTT1 polymorphisms in head and neck cancer risk. Oncol Res. 2007;16(9):423-9.

194. Choudhury JH, Singh SA, Kundu S, Choudhury B, Talukdar FR, Srivasta S, et al. Tobacco carcinogen-metabolizing genes CYP1A1, GSTM1, and GSTT1 polymorphisms and their interaction with tobacco exposure influence the risk of head and neck cancer in Northeast Indian population. Tumour Biol. 2015;36(8):5773-83.

195. Huang RS, Chen P, Wisel S, Duan S, Zhang W, Cook EH, et al. Population-specific GSTM1 copy number variation. Human molecular genetics. 2009;18(2):366-72.

196. Zhang X, Huang M, Wu X, Kadlubar S, Lin J, Yu X, et al. GSTM1 copy number and promoter haplotype as predictors for risk of recurrence and/or second primary tumor in patients with head and neck cancer. Pharmacogenomics and Personalized Medicine. 2013;6:9-17.

197. Sharma R, Ahuja M, Panda NK, Khullar M. Interactions among genetic variants in tobacco metabolizing genes and smoking are associated with head and neck cancer susceptibility in North Indians. DNA Cell Biol. 2011;30(8):611-6.

198. Zhang X, Lin J, Wu X, Lin Z, Ning B, Kadlubar S, et al. Association between GSTM1 copy number, promoter variants and susceptibility to urinary bladder cancer. International Journal of Molecular Epidemiology and Genetics. 2012;3(3):228-36.

199. Emeville E, Broquere C, Brureau L, Ferdinand S, Blanchet P, Multigner L, et al. Copy number variation of GSTT1 and GSTM1 and the risk of prostate cancer in a Caribbean population of African descent. PLoS One. 2014;9(9):e107275.

200. Flay BR. Youth tobacco use: risk patterns, and control. In: Slade J, Orleans CT, editors. Nicotine Addiction: Principles and Management. New York: Oxford University Press; 1993. p. 653–61.

201. Tyas SL, Pederson LL. Psychosocial factors related to adolescent smoking: a critical review of the literature. Tobacco Control. 1998;7(4):409-20.

202. Carter B, Long T, Cinciripini P. A meta-analytic review of the CYP2A6 genotype and smoking behavior. Nicotine & tobacco research : official journal of the Society for Research on Nicotine and Tobacco. 2004;6(2):221-7.

203. Mayhew KP, Flay BR, Mott JA. Stages in the development of adolescent smoking. Drug and Alcohol Dependence. 2000;59, Supplement 1:61-81.

204. Benowitz NL, Hukkanen J, Jacob P. Nicotine Chemistry, Metabolism, Kinetics and Biomarkers. Handbook of experimental pharmacology. 2009(192):29-60.

205. Messina ES, Tyndale RF, Sellers EM. A Major Role for CYP2A6 in Nicotine C-Oxidation by Human Liver Microsomes. Journal of Pharmacology and Experimental Therapeutics. 1997;282(3):1608.

206. Raunio H, Rautio A, Gullstén H, Pelkonen O. Polymorphisms of CYP2A6 and its practical consequences. British Journal of Clinical Pharmacology. 2001;52(4):357-63.

207. Malaiyandi V, Sellers EM, Tyndale RF. Implications of CYP2A6 Genetic Variation for Smoking Behaviors and Nicotine Dependence. Clinical Pharmacology & Therapeutics. 2005;77(3):145-58.

208. Benowitz NL, Swan GE, Jacob P, Lessov-Schlaggar CN, Tyndale RF. CYP2A6 genotype and the metabolism and disposition kinetics of nicotine. Clinical Pharmacology & Therapeutics. 2006;80(5):457-67.

209. Nakajima M, Fukami T, Yamanaka H, Higashi E, Sakai H, Yoshida R, et al. Comprehensive evaluation of variability in nicotine metabolism and CYP2A6 polymorphic alleles in four ethnic populations. Clinical Pharmacology & Therapeutics. 2006;80(3):282-97.


211. Pianezza ML, Sellers EM, Tyndale RF. Nicotine metabolism defect reduces smoking. Nature. 1998;393(6687):750.

212. Rao Y, Hoffmann E, Zia M, Bodin L, Zeman M, Sellers EM, et al. Duplications and defects in the CYP2A6 gene: identification, genotyping, and in vivo effects on smoking. Mol Pharmacol. 2000;58(4):747-55.

213. Schoedel KA, Hoffmann EB, Rao Y, Sellers EM, Tyndale RF. Ethnic variation in CYP2A6 and association of genetically slow nicotine metabolism and smoking in adult Caucasians. Pharmacogenetics. 2004;14(9):615-26.

214. Munafò MR, Clark TG, Johnstone EC, Murphy MFG, Walton RT. The genetic basis for smoking behavior: A systematic review and meta-analysis. Nicotine & Tobacco Research. 2004;6(4):583-97.

215. Pan L, Yang X, Li S, Jia C. Association of CYP2A6 gene polymorphisms with cigarette consumption: a meta-analysis. Drug Alcohol Depend. 2015;149:268-71.

216. Sabol SZ, Hamer DH. An Improved Assay Shows No Association Between the CYP2A6 Gene and Cigarette Smoking Behavior. Behavior Genetics. 1999;29(4):257-61.

217. Tiihonen J, Pesonen U, Kauhanen J, Koulu M, Hallikainen T, Leskinen L, et al. CYP2A6 genotype and smoking. Molecular psychiatry. 2000;5(4):347-8.

218. Loriot MA, Rebuissou S, Oscarson M, Cenee S, Miyamoto M, Ariyoshi N, et al. Genetic polymorphisms of cytochrome P450 2A6 in a case-control study on lung cancer in a French population. Pharmacogenetics. 2001;11(1):39-44.

219. Oscarson M, McLellan RA, Gullsten H, Yue QY, Lang MA, Bernal ML, et al. Characterisation and PCR-based detection of a CYP2A6 gene deletion found at a high frequency in a Chinese population. FEBS Lett. 1999;448(1):105-10.

220. Strasser AA, Malaiyandi V, Hoffmann E, Tyndale RF, Lerman C. An association of CYP2A6 genotype and smoking topography. Nicotine Tob Res. 2007;9(4):511-8.

221. Thorgeirsson TE, Gudbjartsson DF, Surakka I, Vink JM, Amin N, Geller F, et al. Sequence variants at CHRNB3-CHRNA6 and CYP2A6 affect smoking behavior. Nat Genet. 2010;42(5):448-53.

222. Canova C, Hashibe M, Simonato L, Nelis M, Metspalu A, Lagiou P, et al. Genetic Associations of 115 Polymorphisms with Cancers of the Upper Aerodigestive Tract across 10 European Countries: The ARCAGE Project. Cancer Research. 2009;69(7):2956.

223. Kamataki T, Fujieda M, Kiyotani K, Iwano S, Kunitoh H. Genetic polymorphism of CYP2A6 as one of the potential determinants of tobacco-related cancer risk. Biochemical and Biophysical Research Communications. 2005;338(1):306-10.

224. Bosron WF, Li T-K. Genetic polymorphism of human liver alcohol and aldehyde dehydrogenases, and their relationship to alcohol metabolism and alcoholism. Hepatology. 1986;6(3):502-10.

225. Meyers JL, Dick DM. Genetic and Environmental Risk Factors for Adolescent-Onset Substance Use Disorders. Child and adolescent psychiatric clinics of North America. 2010;19(3):465-77.

226. Bierut LJ. Genetic Vulnerability and Susceptibility to Substance Dependence. Neuron. 2011;69(4):618-27.

227. Bierut LJ, Goate AM, Breslau N, Johnson EO, Bertelsen S, Fox L, et al. ADH1B is associated with alcohol dependence and alcohol consumption in populations of European and African ancestry. Molecular psychiatry. 2012;17(4):445-50.

228. Macgregor S, Lind PA, Bucholz KK, Hansell NK, Madden PA, Richter MM, et al. Associations of ADH and ALDH2 gene variation with self report alcohol reactions, consumption and dependence: an integrated analysis. Human molecular genetics. 2009;18(3):580-93.

229. Hashibe M, McKay JD, Curado MP, Oliveira JC, Koifman S, Koifman R, et al. Multiple ADH genes are associated with upper aerodigestive cancers. Nat Genet. 2008;40(6):707-9.

230. Borras E, Coutelle C, Rosell A, Fernandez-Muixi F, Broch M, Crosas B, et al. Genetic polymorphism of alcohol dehydrogenase in Europeans: the ADH2*2 allele decreases the risk for alcoholism and is associated with ADH3*1. Hepatology. 2000;31(4):984-9.

231. Asakage T, Yokoyama A, Haneda T, Yamazaki M, Muto M, Yokoyama T, et al. Genetic polymorphisms of alcohol and aldehyde dehydrogenases, and drinking, smoking and diet in Japanese men with oral and pharyngeal squamous cell carcinoma. Carcinogenesis. 2006;28(4):865-74.

232. Li D, Zhao H, Gelernter J. Strong association of the alcohol dehydrogenase 1B gene (ADH1B) with alcohol dependence and alcohol-induced medical diseases. Biol Psychiatry. 2011;70(6):504-12.

233. Wall TL. Genetic associations of alcohol and aldehyde dehydrogenase with alcohol dependence and their mechanisms of action. Ther Drug Monit. 2005;27(6):700-3.

234. Chen WJ, Loh EW, Hsu YP, Chen CC, Yu JM, Cheng AT. Alcohol-metabolising genes and alcoholism among Taiwanese Han men: independent effect of ADH2, ADH3 and ALDH2. Br J Psychiatry. 1996;168(6):762-7.

235. Osier M, Pakstis AJ, Kidd JR, Lee JF, Yin SJ, Ko HC, et al. Linkage disequilibrium at the ADH2 and ADH3 loci and risk of alcoholism. Am J Hum Genet. 1999;64(4):1147-57.

236. Chen CC, Lu RB, Chen YC, Wang MF, Chang YC, Li TK, et al. Interaction between the functional polymorphisms of the alcohol-metabolism genes in protection against alcoholism. American Journal of Human Genetics. 1999;65(3):795-807.

237. Thomasson HR, Edenberg HJ, Crabb DW, Mai XL, Jerome RE, Li TK, et al. Alcohol and aldehyde dehydrogenase genotypes and alcoholism in Chinese men. American Journal of Human Genetics. 1991;48(4):677-81.

238. Whitfield JB. Alcohol Dehydrogenase and Alcohol Dependence: Variation in Genotype-Associated Risk between Populations. American Journal of Human Genetics. 2002;71(5):1247-50.

239. Edenberg HJ, Xuei X, Chen HJ, Tian H, Wetherill LF, Dick DM, et al. Association of alcohol dehydrogenase genes with alcohol dependence: a comprehensive analysis. Human molecular genetics. 2006;15(9):1539-49.

240. Yokoyama A, Muramatsu T, Omori T, Yokoyama T, Matsushita S, Higuchi S, et al. Alcohol and aldehyde dehydrogenase gene polymorphisms and oropharyngolaryngeal, esophageal and stomach cancers in Japanese alcoholics. Carcinogenesis. 2001;22(3):433-9.

241. Hakenewerth AM, Millikan RC, Rusyn I, Herring AH, North KE, Barnholtz-Sloan JS, et al. Joint effects of alcohol consumption and polymorphisms in alcohol and oxidative stress metabolism genes on risk of head and neck cancer. Cancer Epidemiol Biomarkers Prev. 2011;20(11):2438-49.

242. Guo H, Zhang G, Mai R. Alcohol Dehydrogenase-1B Arg47His Polymorphism and Upper Aerodigestive Tract Cancer Risk: A Meta-Analysis Including 24,252 Subjects. Alcoholism: Clinical and Experimental Research. 2012;36(2):272-8.

243. Hiraki A, Matsuo K, Wakai K, Suzuki T, Hasegawa Y, Tajima K. Gene–gene and gene–environment interactions between alcohol drinking habit and polymorphisms in alcohol-metabolizing enzyme genes and the risk of head and neck cancer in Japan. Cancer Science. 2007;98(7):1087-91.

244. Ji YB, Lee SH, Kim KR, Park CW, Song CM, Park BL, et al. Association between ADH1B and ADH1C polymorphisms and the risk of head and neck squamous cell carcinoma. Tumor Biology. 2015;36(6):4387-96.

245. Garcia SM, Curioni OA, de Carvalho MB, Gattas GJ. Polymorphisms in alcohol metabolizing genes and the risk of head and neck cancer in a Brazilian population. Alcohol Alcohol. 2010;45(1):6-12.

246. Dong Y-J, Peng T-K, Yin S-J. Expression and activities of class IV alcohol dehydrogenase and class III aldehyde dehydrogenase in human mouth. Alcohol. 1996;13(3):257-62.

247. Muto M, Hitomi Y, Ohtsu A, Shimada H, Kashiwase Y, Sasaki H, et al. Acetaldehyde production by non-pathogenic Neisseria in human oral microflora: implications for carcinogenesis in upper aerodigestive tract. International journal of cancer. 2000;88(3):342-50.

248. Salaspuro M. Interrelationship between alcohol, smoking, acetaldehyde and cancer. Novartis Found Symp. 2007;285:80-9; discussion 9-96, 198-9.

249. Tillonen J, Homann N, Rautio M, Jousimies-Somer H, Salaspuro M. Role of yeasts in the salivary acetaldehyde production from ethanol among risk groups for ethanol-associated oral cavity cancer. Alcohol Clin Exp Res. 1999;23(8):1409-15.

250. Homann N, Jousimies-Somer H, Jokelainen K, Heine R, Salaspuro M. High acetaldehyde levels in saliva after ethanol consumption: methodological aspects and pathogenetic implications. Carcinogenesis. 1997;18(9):1739-43.

251. Homann N, Tillonen J, Meurman JH, Rintamaki H, Lindqvist C, Rautio M, et al. Increased salivary acetaldehyde levels in heavy drinkers and smokers: a microbiological approach to oral cavity cancer. Carcinogenesis. 2000;21(4):663-8.

252. Tsai ST, Wong TY, Ou CY, Fang SY, Chen KC, Hsiao JR, et al. The interplay between alcohol consumption, oral hygiene, ALDH2 and ADH1B in the risk of head and neck cancer. International journal of cancer. 2014.

253. Nasman A, Attner P, Hammarstedt L, Du J, Eriksson M, Giraud G, et al. Incidence of human papillomavirus (HPV) positive tonsillar carcinoma in Stockholm, Sweden: an epidemic of viral-induced carcinoma? International journal of cancer. 2009;125(2):362-6.

254. Hwang T-Z, Hsiao J-R, Tsai C-R, Chang JS. Incidence trends of human papillomavirus-related head and neck cancer in Taiwan, 1995–2009. International journal of cancer. 2015;137(2):395-408.

255. Kreimer AR, Clifford GM, Boyle P, Franceschi S. Human Papillomavirus Types in Head and Neck Squamous Cell Carcinomas Worldwide: A Systematic Review. Cancer Epidemiology Biomarkers & Prevention. 2005;14(2):467-75.

256. Hobbs CG, Sterne JA, Bailey M, Heyderman RS, Birchall MA, Thomas SJ. Human papillomavirus and head and neck cancer: a systematic review and meta-analysis. Clin Otolaryngol. 2006;31(4):259-66.

257. Agalliu I, Gapstur S, Chen Z, et al. Associations of oral α-, β-, and γ-human papillomavirus types with risk of incident head and neck cancer. JAMA Oncology. 2016;2(5):599-606.

258. Ang KK, Harris J, Wheeler R, Weber R, Rosenthal DI, Nguyen-Tân PF, et al. Human Papillomavirus and Survival of Patients with Oropharyngeal Cancer. New England Journal of Medicine. 2010;363(1):24-35.

259. Herrero R, Castellsagué X, Pawlita M, Lissowska J, Kee F, Balaram P, et al. Human Papillomavirus and Oral Cancer: The International Agency for Research on Cancer Multicenter Study. JNCI: Journal of the National Cancer Institute. 2003;95(23):1772-83.

260. Smith EM, Ritchie JM, Summersgill KF, Hoffman HT, Wang DH, Haugen TH, et al. Human Papillomavirus in Oral Exfoliated Cells and Risk of Head and Neck Cancer. JNCI Journal of the National Cancer Institute. 2004;96(6):449-55.

261. Smith EM, Rubenstein LM, Haugen TH, Pawlita M, Turek LP. Complex Etiology Underlies Risk and Survival in Head and Neck Cancer Human Papillomavirus, Tobacco, and Alcohol: A Case for Multifactor Disease. Journal of Oncology. 2012;2012:9.

262. Sinha P, Logan HL, Mendenhall WM. Human papillomavirus, smoking, and head and neck cancer. American journal of otolaryngology. 2012;33(1):130-6.

263. Marmot M. Social determinants of health inequalities. The Lancet. 2005;365(9464):1099-104.

264. Adler NE, Ostrove JM. Socioeconomic status and health: what we know and what we don’t. Annals of the New York Academy of Sciences. 1999;896:3-15.

265. Bartley M, Blane D, Montgomery S. Health and the life course: why safety nets matter. BMJ (Clinical research ed). 1997;314(7088):1194-6.

266. Blane D. Social determinants of health–socioeconomic status, social class, and ethnicity. American Journal of Public Health. 1995;85(7):903-5.

267. Conway DI, Petticrew M, Marlborough H, Berthiller J, Hashibe M, Macpherson LMD. Socioeconomic inequalities and oral cancer risk: A systematic review and meta-analysis of case-control studies. International journal of cancer. 2008;122(12):2811-9.

268. Hajat A, Kaufman JS, Rose KM, Siddiqi A, Thomas JC. Do the wealthy have a health advantage? Cardiovascular disease risk factors and wealth. Soc Sci Med. 2010;71(11):1935-42.

269. Hwang E, Johnson-Obaseki S, McDonald JT, Connell C, Corsten M. Incidence of head and neck cancer and socioeconomic status in Canada from 1992 to 2007. Oral oncology. 2013;49(11):1072-6.

270. Link BG, Phelan J. Social conditions as fundamental causes of disease. Journal of Health and Social Behavior. 1995:80-94.

271. Nicolau B, Netuveli G, Kim JW, Sheiham A, Marcenes W. A life-course approach to assess psychosocial factors and periodontal disease. J Clin Periodontol. 2007;34(10):844-50.

272. Pollitt R, Rose K, Kaufman J. Evaluating the evidence for models of life course socioeconomic factors and cardiovascular outcomes: a systematic review. BMC Public Health. 2005;5(1):7.

273. Szanton SL, Candidate CD. Allostatic Load: A Mechanism of Socioeconomic Health. 2010;7(1):7-15.

274. Warnakulasuriya S. Significant oral cancer risk associated with low socioeconomic status. Evid Based Dent. 2009;10(1):4-5.

275. Bhan N, Srivastava S, Agrawal S, Subramanyam M, Millett C, Selvaraj S, et al. Are socioeconomic disparities in tobacco consumption increasing in India? A repeated cross-sectional multilevel analysis. BMJ Open. 2012;2(5).

276. Kaufman JS, Cooper RS. Seeking Causal Explanations in Social Epidemiology. American Journal of Epidemiology. 1999;150(2):113-20.

277. Corsi DJ, Boyle MH, Lear SA, Chow CK, Teo KK, Subramanian SV. Trends in smoking in Canada from 1950 to 2011: progression of the tobacco epidemic according to socioeconomic status and geography. Cancer Causes & Control. 2013.

278. Droomers M, Schrijvers CTM, Stronks K, van de Mheen D, Mackenbach JP. Educational Differences in Excessive Alcohol Consumption: The Role of Psychosocial and Material Stressors. Preventive Medicine. 1999;29(1):1-10.

279. Fone DL, Farewell DM, White J, Lyons RA, Dunstan FD. Socioeconomic patterning of excess alcohol consumption and binge drinking: a cross-sectional study of multilevel associations with neighbourhood deprivation. BMJ Open. 2013;3(4).

280. Hiscock R, Bauld L, Amos A, Fidler JA, Munafò M. Socioeconomic status and smoking: a review. Annals of the New York Academy of Sciences. 2012;1248(1):107-23.

281. Thankappan KR, Thresia CU. Tobacco use & social status in Kerala. Indian J Med Res. 2007;126(4):300-8.

282. Krieger N, Williams DR, Moss NE. Measuring social class in US public health research: concepts, methodologies, and guidelines. Annual review of public health. 1997;18(16):341-78.

283. Filmer D, Pritchett LH. Estimating Wealth Effects without Expenditure Data-or Tears: An Application to Educational Enrollments in States of India. Demography. 2001;38(1):115-32.

284. Gwatkin DR, Rutstein S, Johnson K, Suliman E, Wagstaff A, Amouzou A. Socio-economic differences in health, nutrition, and population within developing countries: an overview. Niger J Clin Pract. 2007;10(4):272-82.

285. Galobardes B, Shaw M, Lawlor DA, Lynch JW, Davey Smith G. Indicators of socioeconomic position (part 1). Journal of Epidemiology and Community Health. 2006;60(1):7-12.

286. Shaw M. Housing and public health. Annu Rev Public Health. 2004;25:397-418.

287. McKenzie DJ. Measuring inequality with asset indicators. Journal of Population Economics. 2005;18(2):229-60.

288. Howe LD, Hargreaves JR, Gabrysch S, Huttly SRA. Is the wealth index a proxy for consumption expenditure? A systematic review. Journal of Epidemiology and Community Health. 2009;63(11):871-7.

289. Krieger J, Higgins DL. Housing and Health: Time Again for Public Health Action. American Journal of Public Health. 2002;92(5):758-68.

290. Smith GD, Hart C, Blane D, Gillis C, Hawthorne V. Lifetime socioeconomic position and mortality: prospective observational study. BMJ (Clinical research ed). 1997;314(7080):547-52.

291. Berkman LF, Macintyre S. The measurement of social class in health studies: old measures and new formulations. IARC Sci Publ. 1997(138):51-64.

292. The DHS Program: Demographic and Health Surveys [Internet]. Available from: http://www.dhsprogram.com/topics/wealth-index/Index.cfm.

293. Balen J, McManus DP, Li Y-S, Zhao Z-Y, Yuan L-P, Utzinger J, et al. Comparison of two approaches for measuring household wealth via an asset-based index in rural and peri-urban settings of Hunan province, China. Emerging Themes in Epidemiology. 2010;7:7.

294. Lynch J, Kaplan G. Socioeconomic position: Oxford University Press; 2000.

295. Shavers VL. Measurement of socioeconomic status in health disparities research. J Natl Med Assoc. 2007;99(9):1013-23.

296. Liberatos P, Link BG, Kelsey JL. The measurement of social class in epidemiology. Epidemiol Rev. 1988;10:87-121.

297. Hauser RM. Measuring Socioeconomic status in childhood development: Blackwell Publishing; 1994. 1541-5 p.

298. Nair PRG. Education and Socio-Economic Change in Kerala, 1793-1947. Social Scientist. 1976;4(8):28-43.

299. AKG Center for Research and Studies, Communist Party of India (Marxist), State Committee, Kerala. Education Bill. Kerala, India: AKG Center for Research and Studies; 2009 [updated 2012]. Available from: http://www.cpimkerala.org/eng/education-23.php?n=1.

300. Galobardes B, Lynch J, Smith GD. Measuring socioeconomic position in health research. British Medical Bulletin. 2007;81-82(1):21-37.

301. Lynge E. Unemployment and cancer: a literature review. IARC Sci Publ. 1997(138):343-51.

302. Robertson T, Popham F, Benzeval M. Socioeconomic position across the lifecourse & allostatic load: data from the West of Scotland Twenty-07 cohort study. BMC Public Health. 2014;14(1):184.

303. Galobardes B, Lynch JW, Davey Smith G. Childhood Socioeconomic Circumstances and Cause-specific Mortality in Adulthood: Systematic Review and Interpretation. Epidemiologic Reviews. 2004;26(1):7-21.

304. Bernabe E, Suominen AL, Nordblad A, Vehkalahti MM, Hausen H, Knuuttila M, et al. Education level and oral health in Finnish adults: evidence from different lifecourse models. J Clin Periodontol. 2011;38(1):25-32.

305. Brennan DS, Spencer AJ. Income-based life-course models of caries in 30-year-old Australian adults. Community Dent Oral Epidemiol. 2015;43(3):262-71.

306. Johnson S, McDonald JT, Corsten M, Rourke R. Socio-economic status and head and neck cancer incidence in Canada: A case-control study. Oral oncology. 2010;46(3):200-3.

307. Madani AH, Dikshit M, Bhaduri D, Jahromi AS. Relationship between Selected Socio-Demographic Factors and Cancer of Oral Cavity – A Case Control Study. Cancer Inform. 2010;9:163-8.

308. Krishna Rao S, Mejia GC, Roberts-Thomson K, Logan RM, Kamath V, Kulkarni M, et al. Estimating the effect of childhood socioeconomic disadvantage on oral cancer in India using marginal structural models. Epidemiology (Cambridge, Mass). 2015;26(4):509-17.

309. Conway DI, McMahon AD, Smith K, Black R, Robertson G, et al. Components of socioeconomic risk associated with head and neck cancer: a population-based case-control study in Scotland. Br J Oral Maxillofacial Surg. 2010;48(1):11-7.

310. Kaufman J. Progress and pitfalls in the social epidemiology of cancer. Cancer Causes & Control. 1999;10(6):489-94.

311. Stringhini S, Sabia S, Shipley M, et al. Association of socioeconomic position with health behaviors and mortality. JAMA. 2010;303(12):1159-66.

312. Global status report on alcohol and health, World Health Organisation. WHO: Management of substance abuse, 2014. Available from: http://www.who.int/substance_abuse/publications/global_alcohol_report/en/.

313. VanderWeele TJ, Jackson JW, Li S. Causal inference and longitudinal data: a case study of religion and mental health. Social Psychiatry and Psychiatric Epidemiology. 2016:1-10.

314. Kuh D, Ben-Shlomo Y. A life course approach to chronic disease epidemiology. Oxford; New York: Oxford University Press; 1997.

315. Kuh D, Ben-Shlomo Y, Lynch J, Hallqvist J, Power C. Life course epidemiology. Journal of Epidemiology and Community Health. 2003;57(10):778-83.

316. Forsdahl A. Are poor living conditions in childhood and adolescence an important risk factor for arteriosclerotic heart disease? British Journal of Preventive & Social Medicine. 1977;31(2):91.

317. Wadsworth ME, Cripps HA, Midwinter RE, Colley JR. Blood pressure in a national birth cohort at the age of 36 related to social and familial factors, smoking, and body mass. Br Med J (Clin Res Ed). 1985;291(6508):1534-8.

318. Wadsworth M. The imprint of time: childhood, history and adult life. Oxford: Clarendon Press; 1991.

319. Barker DJ. The fetal and infant origins of adult disease. BMJ: British Medical Journal. 1990;301(6761):1111.

320. Barker DJ, Osmond C. Infant mortality, childhood nutrition, and ischaemic heart disease in England and Wales. Lancet. 1986;1(8489):1077-81.

321. Osmond C, Barker DJ, Winter PD, Fall CH, Simmonds SJ. Early growth and death from cardiovascular disease in women. BMJ : British Medical Journal. 1993;307(6918):1519-24.

322. Kuh D, Ben-Shlomo Y, Lynch J, Hallqvist J, Power C. Life course epidemiology. Journal of Epidemiology and Community Health. 2003;57(10):778-83.

323. Ben-Shlomo Y, Kuh D. A life course approach to chronic disease epidemiology: conceptual models, empirical challenges and interdisciplinary perspectives. International Journal of Epidemiology. 2002;31(2):285-93.

324. Blane D, Netuveli G, Stone J. The development of life course epidemiology. Revue d’epidemiologie et de sante publique. 2007;55(1):31-8.

325. McEwen BS. Stress, adaptation, and disease. Allostasis and allostatic load. Annals of the New York Academy of Sciences. 1998;840:33-44.

326. Power C, Hertzman C. Social and biological pathways linking early life and adult disease. British medical bulletin. 1997;53(1):210-21.

327. Hart CL, Davey Smith G, Blane D. Social mobility and 21 year mortality in a cohort of Scottish men. Social Science & Medicine. 1998;47(8):1121-30.

328. Blane D, Harding S, Rosato M. Does social mobility affect the size of the socioeconomic mortality differential?: evidence from the Office for National Statistics Longitudinal Study. Journal of the Royal Statistical Society Series A, (Statistics in Society). 1999;162(Pt. 1):59-70.

329. Bartley M, Plewis I. Increasing social mobility: an effective policy to reduce health inequalities. Journal of the Royal Statistical Society: Series A (Statistics in Society). 2007;170(2):469-81.

330. Hallqvist J, Lynch J, Bartley M, Lang T, Blane D. Can we disentangle life course processes of accumulation, critical period and social mobility? An analysis of disadvantaged socio-economic positions and myocardial infarction in the Stockholm Heart Epidemiology Program. Soc Sci Med. 2004;58(8):1555-62.

331. Mayo NE, Goldberg MS. When is a case-control study not a case-control study? J Rehabil Med. 2009;41(4):209-16.

332. Rothman KJ. Epidemiology: an introduction. Oxford; New York: Oxford University Press; 2002.

333. Szklo M, Nieto FJ. Epidemiology: beyond the basics. Burlington, Mass.: Jones & Bartlett Learning; 2014.

334. Sistrom CL, Garvan CW. Proportions, odds, and risk. Radiology. 2004;230(1):12-9.

335. VanderWeele TJ. Explanation in causal inference: methods for mediation and interaction. USA: Oxford University Press; 2015. 706 p.

336. Breslow NE. Statistics in epidemiology: the case-control study. J Am Stat Assoc. 1996;91(433):14-28.

337. Schlesselman JJ, editor. Case-control studies: design, conduct, analysis. New York: Oxford University Press; 1982.

338. Rothman KJ. Modern epidemiology. Boston: Little, Brown; 1986.

339. Savitz DA, Wellenius GA. Interpreting epidemiologic evidence. 2nd ed. New York: Oxford University Press; 2016. 226 p.

340. Bunge M. Causality in Modern Science. 3rd ed. New York: Dover; 1979. 434 p.

341. Kaufman JS, Poole C. Looking back on “causal thinking in the health sciences”. Annu Rev Public Health. 2000;21:101-19.

342. Doll R, Hill AB. The Mortality of Doctors in Relation to Their Smoking Habits. British Medical Journal. 1954;1(4877):1451-5.

343. Doll R, Hill AB. Lung Cancer and Other Causes of Death in Relation to Smoking. British Medical Journal. 1956;2(5001):1071-81.

344. Hill AB. The environment and disease: association or causation? Proceedings of the Royal Society of Medicine. 1965;58.

345. Parascandola M, Weed DL, Dasgupta A. Two Surgeon General’s reports on smoking and cancer: a historical investigation of the practice of causal inference. Emerging Themes in Epidemiology. 2006;3(1):1.

346. Rothman KJ. CAUSES. American Journal of Epidemiology. 1976;104(6):587-92.

347. Glymour MM. Using causal diagrams to understand common problems in social epidemiology. In: Oakes MJ, Kaufman JS, editors. Methods in social epidemiology: Jossey-Bass; 2006. p. 387-418.

348. Hofler M. Causal inference based on counterfactuals. BMC Med Res Methodol. 2005;5:28.

349. Hernan M, Robins JM. Causal inference. 2016. Available from: https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/.

350. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology. 1974;66(5):688-701.

351. Robins J. A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Mathematical Modelling. 1986;7(9):1393-512.

352. Robins J. Marginal structural models. In: 1997 Proceedings of the Section on Bayesian Statistical Science. Alexandria, VA: American Statistical Association; 1998. p. 1-10.

353. Robins JM. Marginal Structural Models versus Structural nested Models as Tools for Causal inference. In: Halloran ME, Berry D, editors. Statistical Models in Epidemiology, the Environment, and Clinical Trials. New York, NY: Springer New York; 2000. p. 95-133.

354. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology (Cambridge, Mass). 2000;11(5):550-60.

355. Vanderweele TJ, Vansteelandt S. Odds ratios for mediation analysis for a dichotomous outcome. American journal of epidemiology. 2010;172(12):1339-48.

356. Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82(4):669-88.

357. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology (Cambridge, Mass). 1999;10(1):37-48.

358. Hernan MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology (Cambridge, Mass). 2004;15(5):615-25.

359. Gupta B, Ariyawardana A, Johnson NW. Oral cancer in India continues in epidemic proportions: evidence base and policy initiatives. International dental journal. 2013;63(1):12-25.

360. Bhan N, Rao KD, Kachwaha S. Health inequalities research in India: a review of trends and themes in the literature since the 1990s. International Journal for Equity in Health. 2016;15:166.

361. Moore N, Pierce A, Wilson DS, Johnson. The epidemiology of lip cancer: a review of global incidence and aetiology. Oral Dis. 1999;5(3):185-95.

362. Zarbo RJ. Salivary gland neoplasia: a review for the practicing pathologist. Mod Pathol. 2002;15(3):298-323.

363. Chang ET, Adami HO. The enigmatic epidemiology of nasopharyngeal carcinoma. Cancer Epidemiol Biomarkers Prev. 2006;15(10):1765-77.

364. Westreich D. Berkson’s bias, selection bias, and missing data. Epidemiology (Cambridge, Mass). 2012;23(1):159-64.

365. Nishimoto, Pintos J, Schlecht NF, Torloni H, Carvalho AL, Kowalski LP, et al. Assessment of control selection bias in a hospital-based case-control study of upper aero-digestive tract cancers. J Cancer Epidemiol Prev. 2002;7(3):131-41.

366. Heath EM, Morken NW, Campbell KA, Tkach D, Boyd EA, Strom DA. Use of buccal cells collected in mouthwash as a source of DNA for clinical testing. Archives of Pathology & Laboratory Medicine. 2001;125(1):127-33.

367. D’Souza G, Sugar E, Ruby W, Gravitt P, Gillison M. Analysis of the effect of DNA purification on detection of human papillomavirus in oral rinse samples by PCR. J Clin Microbiol. 2005;43(11):5526-35.

368. James JS. IMPROVING DETECTION OF PRECANCEROUS Computer-Assisted Analysis of the Oral Brush Biopsy DETECTING ORAL. Journal of american dental association. 2001;130(October 1999).

369. Egan KM, Abruzzo J, Newcomb PA, Titus-Ernstoff L, Franklin T, et al. Collection of Genomic DNA from Adults in Epidemiological Studies by Buccal Cytobrush and Mouthwash. 2001:687-96.

370. Lawton G, Thomas, Schonrock, Monsour, Frazer. Human papillomaviruses in normal oral mucosa: a comparison of methods for sample collection. J Oral Pathol Med. 1992;21(6):265-9.

371. Walling DM, Flaitz CM, Adler-Storthz K, Nichols CM. A non-invasive technique for studying oral epithelial Epstein-Barr virus infection and disease. Oral Oncol. 2003;39(5):436-44.

372. Muñoz N, Bosch FX. Biomarkers for biological agents. IARC Sci Publ. 1997;142:127-42.

373. Coutlée F, Mayrand MH, Provencher D, Franco E. The future of HPV testing in clinical laboratories and applied virology research. Clinical and diagnostic virology. 1997;8(2):123-41.

374. Kornegay JR, Roger M, Davies PO, Shepard P, Guerrero NA, Lloveras B, et al. International Proficiency Study of a Consensus L1 PCR Assay for the Detection and Typing of Human Papillomavirus DNA: Evaluation of Accuracy and Intralaboratory and Interlaboratory Agreement. 2003.

375. London SJ, Xia J, Lehman TA, Yang J-h, Granada E, Chunhong L. Collection of Buccal Cell DNA in Seventh-Grade Children Using Water and a Toothbrush. 2001:1227-30.

376. de Villiers EM, Fauquet C, Broker TR, Bernard HU, zur Hausen H. Classification of papillomaviruses. Virology. 2004;324(1):17-27.

377. Bernard H-U. The clinical importance of the nomenclature, evolution and taxonomy of human papillomaviruses. Journal of clinical virology : the official publication of the Pan American Society for Clinical Virology. 2005;32 Suppl 1:S1-6.

378. Jiang G QWRHBRD. Elimination of false-positive signals in enhanced chemiluminescence (ECL) detection of amplified HPV DNA from clinical samples. Biotechniques. 1995;19(4):566-8.

379. Montgomery MR, Gragnolati M, Burke KA, Paredes E. Measuring living standards with proxy variables. Demography. 2000;37(2):155-74.

380. Jolliffe IT. Principal component analysis. New York: Springer; 2002.

381. Debelak R, Tran US. Principal Component Analysis of Smoothed Tetrachoric Correlation Matrices as a Measure of Dimensionality. Educational and Psychological Measurement. 2013;73(1):63-77.

382. Beebe-Dimmer J, Lynch JW, Turrell G, Lustgarten S, Raghunathan T, Kaplan GA. Childhood and Adult Socioeconomic Conditions and 31-Year Mortality Risk in Women. American Journal of Epidemiology. 2004;159(5):481-90.

383. Hadden WC. Annotation: the use of educational attainment as an indicator of socioeconomic position. Am J Public Health. 1996;86(11):1525-6.

384. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology (Cambridge, Mass). 1999;10(1):37-48.

385. Dietrich T, Hoffmann K. A comprehensive index for the modeling of smoking history in periodontal research. J Dent Res. 2004;83(11):859-63.

386. Leffondre K. Modeling Smoking History: A Comparison of Different Approaches. American Journal of Epidemiology. 2002;156(9):813-23.

387. Leffondre K, Abrahamowicz M, Xiao Y. Modelling smoking history using a comprehensive smoking index: Application to lung cancer. 2006(September):4132-46.

388. National Cancer Institute. Available from: http://www.cancer.gov/dictionary?cdrid=306510.

389. Harrell FE. Regression Modeling Strategies : With Applications to Linear Models, Logistic Regression, and Survival Analysis. 2001.

390. Hosmer DW, Lemeshow S. Applied logistic regression. New York: Wiley; 1989.

391. Williams BA, Madrekar JN, Madrekar SJ, Cha SS, Furth AF. Finding Optimal Cutpoints for Continuous Covariates with Binary and Time-to-Event Outcomes. Rochester, Minnesota: Mayo Clinic, Division of Biostatistics DoHSR; June 2006. Contract No.: 10027230.

392. Textor J. Drawing and Analyzing Causal DAGs with DAGitty: User Manual for Version 2.0. 2013:1-12.

393. Textor J, Hardt J, Knuppel S. DAGitty: A graphical tool for analyzing causal diagrams. Epidemiology (Cambridge, Mass). 2011;22(5):745.

394. Ndiaye C, Mena M, Alemany L, Arbyn M, Castellsagué X, Laporte L, et al. HPV DNA, E6/E7 mRNA, and p16INK4a detection in head and neck cancers: a systematic review and meta-analysis. The Lancet Oncology. 2014;15(12):1319-31.

395. Steinau M, Saraiya M, Goodman MT, Peters ES, Watson M, Cleveland JL, et al. Human Papillomavirus Prevalence in Oropharyngeal Cancer before Vaccine Introduction, United States. Emerging Infectious Diseases. 2014;20(5):822.

396. Tyndale RF, Sellers EM. Variable CYP2A6-mediated nicotine metabolism alters smoking behavior and risk. Drug metabolism and disposition: the biological fate of chemicals. 2001;29(4 Pt 2):548-52.

397. Wacholder S, Rothman N, Caporaso N. Population stratification in epidemiologic studies of common genetic variants and cancer: quantification of bias. Journal of the National Cancer Institute. 2000;92(14):1151-8.

398. Rosner B. Fundamentals of Biostatistics. 7th ed. 2010. 888 p.

399. VanderWeele TJ. An introduction to interaction analysis. Explanation in causal inference: methods for mediation and interaction. United States of America: Oxford University Press; 2015. p. 258.

400. Hernan MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology (Cambridge, Mass). 2000;11(5):561-70.

401. Hogue CJR, Parker CB, Willinger M, Temple JR, Bann CM, Silver RM, et al. The Association of Stillbirth with Depressive Symptoms 6–36 Months Post-Delivery. Paediatric and Perinatal Epidemiology. 2015;29(2):131-43.

402. Menvielle G, Franck J-E, Radoï L, Sanchez M, Févotte J, Guizard A-V, et al. Quantifying the mediating effects of smoking and occupational exposures in the relation between education and lung cancer: the ICARE study. European Journal of Epidemiology. 2016;31(12):1213-21.

403. Xu X, Ritz B, Cockburn M, Lombardi C, Heck JE. Maternal Preeclampsia and Odds of Childhood Cancers in Offspring – A California Statewide Case-Control Study. Paediatr Perinat Epidemiol. 2017.

404. Hernan MA. A definition of causal effect for epidemiological research. J Epidemiol Community Health. 2004;58(4):265-71.

405. Mishra G, Nitsch D, Black S, De Stavola B, Kuh D, Hardy R. A structured approach to modelling the effects of binary exposure variables over the life course. International Journal of Epidemiology. 2009;38(2):528-37.

406. Cole SR, Hernán MA. Constructing Inverse Probability Weights for Marginal Structural Models. American Journal of Epidemiology. 2008;168(6):656-64.

407. Nandi A, Glymour MM, Kawachi I, VanderWeele TJ. Using marginal structural models to estimate the direct effect of adverse childhood social conditions on onset of heart disease, diabetes, and stroke. Epidemiology (Cambridge, Mass). 2012;23(2):223-32.

408. Murray ET, Mishra GD, Kuh D, Guralnik J, Black S, Hardy R. Life course models of socioeconomic position and cardiovascular risk factors: 1946 birth cohort. Ann Epidemiol. 2011;21(8):589-97.

409. Leffondre K, Wynant W, Cao Z, Abrahamowicz M, Heinze G, Siemiatycki J. A weighted Cox model for modelling time-dependent exposures in the analysis of case-control studies. Stat Med. 2010;29(7-8):839-50.

410. Platt RW, Brookhart MA, Cole SR, Westreich D, Schisterman EF. An information criterion for marginal structural models. Statistics in Medicine. 2013;32(8):1383-93.

411. Knol MJ, Egger M, Scott P, Geerlings MI, Vandenbroucke JP. When one depends on the other: reporting of interaction in case-control and cohort studies. Epidemiology (Cambridge, Mass). 2009;20(2):161-6.

412. Judd CM, Kenny DA. Process Analysis. Evaluation Review. 1981;5(5):602-19.

413. Baron RM, Kenny DA. The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol. 1986;51(6):1173-82.

414. MacKinnon DP, Fairchild AJ. Current Directions in Mediation Analysis. Current directions in psychological science. 2009;18(1):16-20.

415. MacKinnon DP, Krull JL, Lockwood CM. Equivalence of the Mediation, Confounding and Suppression Effect. Prevention Science. 2000;1(4):173-81.

416. Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychological methods. 2013;18(2):137-50.

417. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology (Cambridge, Mass). 1992;3(2):143-55.

418. Pearl J. Direct and indirect effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence; Seattle, Washington. Morgan Kaufmann Publishers Inc.; 2001. p. 411-20.

419. Hayes AF. Beyond Baron and Kenny: Statistical Mediation Analysis in the New Millennium. 2009(July 2014):37-41.

420. VanderWeele TJ. A unification of mediation and interaction: a 4-way decomposition. Epidemiology (Cambridge, Mass). 2014;25(5):749-61.

421. Erratum: A Unification of Mediation and Interaction: A 4-Way Decomposition. Epidemiology (Cambridge, Mass). 2016;27(5):e36.

422. VanderWeele TJ. A unification of mediation and interaction. Explanation in causal inference: methods for mediation and interaction. USA: Oxford University Press; 2015. p. 371-96.

423. DiCiccio TJ, Efron B. Bootstrap confidence intervals. 1996:189-228.

424. http://www.stata.com/manuals13/rbootstrap.pdf [Internet].

425. Carlson LE, Speca M, Patel KD, Goodey E. Mindfulness-Based Stress Reduction in Relation to Quality of Life, Mood, Symptoms of Stress, and Immune Parameters in Breast and Prostate Cancer Outpatients. Psychosomatic Medicine. 2003;65(4):571-81.

426. Sapolsky RM. The Influence of Social Hierarchy on Primate Health. Science. 2005;308(5722):648-52.

427. Adler NE, Stewart J. Preface to The Biology of Disadvantage: Socioeconomic Status and Health. Annals of the New York Academy of Sciences. 2010;1186(1):1-4.

428. Kelly-Irving M, Mabile L, Grosclaude P, Lang T, Delpierre C. The embodiment of adverse childhood experiences and cancer development: potential biological mechanisms and pathways across the life course. Int J Public Health. 2013;58(1):3-11.

429. Buckley R, Cartwright K, Struyk R, Szymanoski E. Integrating housing wealth into the social safety net for the Moscow elderly: an empirical essay. Journal of Housing Economics. 2003;12(3):202-23.

430. Kawachi I, Kennedy BP. Health and social cohesion: why care about income inequality? BMJ (Clinical research ed). 1997;314(7086):1037-40.

431. Berton HK, Cassel JC, Gore S. Social Support and Health. Medical Care. 1977;15(5):47-58.

432. Berkman LF, Glass T, Brissette I, Seeman TE. From social integration to health: Durkheim in the new millennium. Social Science & Medicine. 2000;51(6):843-57.

433. McEwen BS. Allostasis and Allostatic Load: Implications for Neuropsychopharmacology. Neuropsychopharmacology. 2000;22(2):108-24.

434. McEwen BS. Interacting Mediators of Allostasis and Allostatic Load: Towards an Understanding of Resilience in Aging. 2003;52(10):10-6.

435. Epel ES, Blackburn EH, Lin J, Dhabhar FS, Adler NE, Morrow JD, et al. Accelerated telomere shortening in response to life stress. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(49):17312-5.

436. Willeit P, Willeit J, Mayr A, Weger S, Oberhollenzer F, Brandstätter A, et al. Telomere Length and Risk of Incident Cancer and Cancer Mortality. JAMA: The Journal of the American Medical Association. 2010;304(1):69-75.

437. Patel MM, Parekh LJ, Jha FP, Sainger RN, Patel JB, et al. Clinical usefulness of telomerase activation and telomere length in head and neck cancer. Head & Neck. 2002;24(12):1060-7.

438. Sainger. Clinical significance of telomere length and associated proteins in oral cancer. Biomark In. 2007;14(2):9-19.

439. Sebastian S, Grammatica L, Paradiso A. Telomeres, telomerase and oral cancer (Review). Int J Oncol. 2005;27(6):1583-96.

440. Bakhtiar SM, Ali A, Barh D. Epigenetics in Head and Neck Cancer. In: Verma M, editor. Cancer Epigenetics: Risk Assessment, Diagnosis, Treatment, and Prognosis. New York, NY: Springer New York; 2015. p. 751-69.

441. Jithesh PV, Risk JM, Schache AG, Dhanda J, Lane B, Liloglou T, et al. The epigenetic landscape of oral squamous cell carcinoma. British Journal of Cancer. 2013;108(2):370-9.

442. Shaw R. The epigenetics of oral cancer. International Journal of Oral and Maxillofacial Surgery. 2006;35(2):101-8.

443. Shaw R. The epigenetics of oral cancer. International Journal of Oral and Maxillofacial Surgery. 2006;35(2):101-8.

444. McGuinness D, McGlynn LM, Johnson PCD, MacIntyre A, Batty GD, Burns H, et al. Socio-economic status is associated with epigenetic differences in the pSoBid cohort. International Journal of Epidemiology. 2012;41(1):151-60.

445. Subramanyam MA, Diez-Roux AV, Pilsner JR, Villamor E, Donohue KM, Liu Y, et al. Social Factors and Leukocyte DNA Methylation of Repetitive Sequences: The Multi-Ethnic Study of Atherosclerosis. PLOS ONE. 2013;8(1):e54018.

446. Stringhini S, Polidoro S, Sacerdote C, Kelly RS, van Veldhoven K, Agnoli C, et al. Life-course socioeconomic status and DNA methylation of genes regulating inflammation. International Journal of Epidemiology. 2015;44(4):1320-30.

447. Chen E, Hanson MD, Paterson LQ, Griffin MJ, Walker HA, Miller GE. Socioeconomic status and inflammatory processes in childhood asthma: The role of psychological stress. Journal of Allergy and Clinical Immunology. 2006;117(5):1014-20.

448. Pretscher D, Distel LV, Grabenbauer GG, Wittlinger M, Buettner M, Niedobitek G. Distribution of immune cells in head and neck cancer: CD8+ T-cells and CD20+B-cells in metastatic lymph nodes are associated with favourable outcome in patients with oro- and hypopharyngeal carcinoma. BMC Cancer. 2009;9(1):292.

449. Borghol N, Suderman M, McArdle W, Racine A, Hallett M, Pembrey M, et al. Associations with early-life socio-economic position in adult DNA methylation. Int J Epidemiol. 2012;41(1):62-74.

450. Fagundes CP, Way B. Early-Life Stress and Adult Inflammation. Current Directions in Psychological Science. 2014;23(4):277-83.

451. Ruwali M, Pant MC, Shah PP, Mishra BN, Parmar D. Polymorphism in cytochrome P450 2A6 and glutathione S-transferase P1 modifies head and neck cancer risk and treatment outcome. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 2009;669(1–2):36-41.

452. Rothman KJ, Greenland S, Lash TL. Modern epidemiology. Philadelphia: Wolters Kluwer Health/Lippincott Williams & Wilkins; 2008.

453. Wacholder S, McLaughlin JK, Silverman DT, Mandel JS. Selection of controls in case-control studies. I. Principles. Am J Epidemiol. 1992;135(9):1019-28.

454. Olson SH. Reported Participation in Case-Control Studies: Changes over Time. American Journal of Epidemiology. 2001;154(6):574-81.

455. Galea S, Tracy M. Participation Rates in Epidemiologic Studies. Annals of Epidemiology. 2007;17(9):643-53.

456. Conway DI. Socioeconomic factors influence selection and participation in a population-based case-control study of head and neck cancer in Scotland. Journal of Clinical Epidemiology. 2008;61(11):1187-93.

457. Government of India. Census of India Website. 2011.

458. Hardgrave RL. Caste in Kerala: A preface to the Elections. Kerala; November 21, 1964.

459. Hernán MA, Cole SR. Invited Commentary: Causal Diagrams and Measurement Bias. American Journal of Epidemiology. 2009;170(8):959-62.

460. Lash TL, Fox MP, Fink AK. Applying quantitative bias analysis to epidemiologic data. 2009.

461. Krall EA, Valadian I, Dwyer JT, Gardner J. Accuracy of recalled smoking data. Am J Public Health. 1989;79(2):200-2.

462. Berney L, Blane DB. Collecting retrospective data: accuracy of recall after 50 years judged against historical records. Social Science & Medicine. 1997;45(10):1519-25.

463. Prince Nelson SL, Ramakrishnan V, Nietert PJ, Kamen DL, Ramos PS, Wolf BJ. An evaluation of common methods for dichotomization of continuous variables to discriminate disease status. Communications in Statistics – Theory and Methods. 2016.

464. Altman DG. Categorising continuous variables. British Journal of Cancer. 1991;64(5):975.

465. Greenland S. Dose-Response and Trend Analysis in Epidemiology: Alternatives to Categorical Analysis. Epidemiology (Cambridge, Mass). 1995;6(4):356-65.

466. Garcia-Closas M, Rothman N, Lubin J. Misclassification in Case-Control Studies of Gene-Environment Interactions: Assessment of Bias and Sample Size. Cancer Epidemiology Biomarkers & Prevention. 1999;8(12):1043.

467. Brennan P. Gene-environment interaction and aetiology of cancer: what does it mean and how can we measure it? Carcinogenesis. 2002;23(3):381-7.

468. VanderWeele TJ. Bias formulas for sensitivity analysis for direct and indirect effects. Epidemiology (Cambridge, Mass). 2010;21(4):540-51.

469. VanderWeele TJ, Asomaning K, Tchetgen Tchetgen EJ, Han Y, Spitz MR, Shete S, et al. Genetic variants on 15q25.1, smoking, and lung cancer: an assessment of mediation and interaction. American Journal of Epidemiology. 2012;175(10):1013-20.

470. Richardson DB, Rzehak P, Klenk J, Weiland SK. Analyses of case-control data for additional outcomes. Epidemiology (Cambridge, Mass). 2007;18(4):441-5.

471. Kujan O, Glenny AM, Duxbury J, Thakker N, Sloan P. Evaluation of screening strategies for improving oral cancer mortality: a Cochrane systematic review. J Dent Educ. 2005;69(2):255-65.

472. VanderWeele TJ. Policy-relevant proportions for direct effects. Epidemiology (Cambridge, Mass). 2013;24(1):175-6.

473. Howe LD, Smith AD, Macdonald-Wallis C, Anderson EL, Galobardes B, Lawlor DA, et al. Relationship between mediation analysis and the structured life course approach. International Journal of Epidemiology. 2016;45(4):1280-94.

[1] This includes the castes in the Hindu religion and sections of other religions that have been classified as backward by the state governments of India (here, Kerala) because of the discrimination they have faced historically.

[2] Monotonicity assumption: The effect of exposure on the mediator or outcome, and the effect of the mediator on the outcome, all have the same sign, i.e., either all not preventive (positive monotonicity assumption) or all not causative (negative monotonicity assumption).
