Updates on the Molecular Genetics of Colorectal Cancer

Colorectal cancer (CRC) is a leading cause of cancer mortality worldwide. Although definitive therapies for advanced disease are still lacking, rapid progress has been made in the last decade in understanding the molecular mechanisms underlying CRC tumorigenesis and progression. In this review, we summarize the most recent findings in the molecular genetics of CRC with a focus on gene mutations and epigenetic changes that were identified in CRC patients.


Introduction
Colorectal cancer (CRC) is one of the most prevalent malignancies in the USA. With an estimated 134,490 new cases and 49,190 deaths in 2016, it is the third most common cancer and the third leading cause of cancer mortality in both men and women, with a five-year survival rate of less than 15% [1,2]. It is a complex, multifactorial disease resulting from lifestyle, dietary, and genetic factors [3]. The majority of cases, about 75%, are sporadic [1], meaning that a specific etiology has not been identified. The other 25% of cases are familial, due either to somatic changes or to hereditary genetic mutations, which account for 5% to 6% of all CRC cases [1].
Genetic mutations in oncogenes, tumor suppressor genes, and mismatch repair genes lead to the formation of malignant cells [4][5][6]. Chromosomal instability (CIN) and microsatellite instability (MSI) are two molecular pathways that have been described in the progression to colorectal cancer [7,8]. Between 65% and 70% of sporadic cancers arise from the CIN pathway, which is defined by insertion, deletion, or rearrangement of entire chromosomal regions [7]. Cancers that form via the CIN pathway often have mutations in APC or KRAS [7]. MSI is characterized by a genetic or epigenetic defect in DNA repair, leading to loss of function of a mismatch repair (MMR) protein and to mismatch mutations [9,10]. It accounts for 15% of CRC cases (12% sporadic, 3% due to Lynch syndrome) and is also part of the serrated pathway [11].
This article provides the most recent updates in the molecular genetics of colorectal cancer. The following genetic changes have been identified in CRC patients. Table 1 summarizes the findings.

MCM9 mutation
Minichromosome maintenance (MCM) proteins play a role in initiating DNA replication [14]. MCM9 encodes a DNA helicase that facilitates cellular mismatch repair [14]. Additionally, the MCM8-MCM9 complex is crucial for maintenance of the replication fork as well as double-strand break repair. Goldberg et al. [14] performed a genome mapping study in two consanguineous Ashkenazi sisters who developed mixed polyposis and metastatic CRC at an early age. A homozygous c.672_673delGGinsC mutation in MCM9 (chr6:119243200) was present in both sisters, while other healthy family members were heterozygous. This mutation caused a frameshift leading to premature truncation, which was associated with polyposis and early-onset CRC [14].
... CRC (< 55 years) who also had a first-degree relative with CRC [15]. Sixteen percent of these patients were noted to have rare germline mutations in known CRC genes compared to healthy controls. However, POT1, POLE2, and MRE11 were newly identified in this study. POT1 works in maintaining telomere length, and three disruptive mutations were noted in the CRC group [15]. POLE2 is a component of the polymerase epsilon enzyme complex. A mutation in POLE2 leads to polymerase proofreading-associated polyposis [15]. MRE11 is a nuclease involved in double-strand break repair, and it is active in mismatch repair-deficient cancer [15]. The findings in this study support DNA replication and repair defects as a basis for inherited CRC.
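To make the frameshift in the MCM9 variant c.672_673delGGinsC concrete, the following minimal Python sketch checks whether a deletion-insertion changes the coding sequence length by a number of bases that is not a multiple of three. The function name `is_frameshift` is hypothetical and introduced only for illustration; it is not taken from Goldberg et al. [14], and it ignores biological context such as splice sites or the position of the resulting stop codon.

```python
def is_frameshift(deleted: str, inserted: str) -> bool:
    """Return True when the net length change is not a multiple of three.

    Hypothetical helper for illustration only: it compares the number of
    deleted and inserted bases and nothing else.
    """
    return (len(inserted) - len(deleted)) % 3 != 0


# c.672_673delGGinsC: two bases (GG) are removed and one base (C) is inserted.
net_change = len("C") - len("GG")
print(net_change, is_frameshift("GG", "C"))  # prints: -1 True
```

A net loss of one base shifts every downstream codon, which is consistent with the premature truncation described for this variant. A substitution of three bases for three bases would, by the same check, leave the reading frame intact.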

ATF3 mutation
Activating transcription factor 3 (ATF3), a member of the ATF/CREB family of basic region leucine zipper (bZip) proteins, regulates transcription by binding DNA [16]. This gene is up-regulated in response to cellular stress such as hypoxia, anoxia, carcinogens, and DNA damage [17]. A study by Jiang et al. revealed that ATF3 is overexpressed in human colorectal cancer cells [18]. In CRC cells, ATF3 down-regulates the expression of B-cell lymphoma-2 (Bcl-2), a gene that inhibits apoptosis [18]. On the other hand, ATF3 up-regulates the activity of Bak, a protein of the Bcl-2 family that induces apoptosis [18].
ATF3 may also play a role in metastasis, as increased expression in colorectal cancer cells is linked with invasion. Over-expression of ATF3 increases the expression of cluster of differentiation 44 (CD44) and decreases retinoblastoma (Rb) expression, both of which are CRC stem cell markers [18]. ATF3 also decreased the expression of epithelial mesenchymal transition (EMT)-inducing transcription factors, which are essential for cancer metastasis [18]. These findings suggest multiple roles for ATF3 in CRC progression, likely through regulating apoptosis and invasion.

TET2 mutation
The ten-eleven translocation (TET) proteins are a family of enzymes that catalyze the oxidation of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) [19]. TET1 suppresses tumor progression by inhibiting the WNT pathway [20]. In CRC, a decrease in TET1 messenger RNA (mRNA) has previously been observed, leading to decreased or absent levels of 5hmC [19]. A recent study by Huang et al. additionally revealed a loss of nuclear expression of TET2 in CRC cells [19]. Loss of TET2 was linked to a more aggressive phenotype with metastasis [19]. The role of TET2 in CRC is unclear, but it has been proposed to act as a tumor suppressor through post-translational modification [19]. TET2 regulates gene expression by binding DNA in the nucleus [20]. Thus, TET1 and TET2 suppress the progression of CRC through separate mechanisms.

CpG island methylator phenotype (CIMP)
CpG island methylator phenotype (CIMP) is due to hypermethylation of multiple CpG islands that encompass the promoter regions of multiple genes [21]. This may prevent binding of transcription factors to tumor-suppressor genes [22]. Consequently, the suppressor genes are silenced, which leads to the progression of CRC [23]. A systematic review of thirty-six studies by Jia et al. revealed a poorer prognosis in CRC patients with high (CIMP-H) or positive CIMP than in those with low (CIMP-L) or negative CIMP [24].

N-acetyltransferase 2 (NAT2) genotype
Prior studies have established a correlation between red and processed meat consumption and CRC, possibly due to exposure to heterocyclic amines (HAAs) [25][26][27]. HAAs are chemicals that are produced when meats are cooked at high temperatures [27]. N-acetyltransferase 2 (NAT2) encodes an enzyme that activates HAAs via O-acetylation, which forms a reactive N-acetoxy species that binds DNA [28]. There are multiple genetic polymorphisms of NAT2 that affect its ability to acetylate HAAs. Populations with a higher frequency of the rapid NAT2 phenotype (Native Alaskans and Japanese Americans) have been noted to have a higher incidence of CRC. In contrast, populations with a lower frequency of this phenotype (North Africans) have a lower incidence of CRC [29].
A meta-analysis by Wang et al. examined the correlation between CRC and frequency of rapid NAT2 phenotype in Japanese and African Americans [29]. The Japanese American population had a high frequency of rapid NAT2 phenotype, whereas the African American population had an intermediate frequency. In both populations, there was a direct correlation between NAT2 activity and processed and red meat intake. There was also a stronger correlation between the rapid NAT2 phenotype and CRC than the intermediate phenotype. The slow phenotype had no correlation with CRC [29].

CDX2 gene
Caudal-related homeobox transcription factor 2 (CDX2) is a gene expressed in intestinal epithelial cells that plays an important role in intestinal development and differentiation by regulating intestine-specific transcription factors [30,31]. A study by Olsen et al. evaluated 191 tumor biopsies from colon cancer patients to investigate CDX2 expression in colon cancer [31]. They discovered that CDX2 mRNA was downregulated in tumors with mismatch repair (MMR) deficiency, right-sided tumors, and poorly differentiated tumors [31].

Thoc1 gene
THO complex 1 (Thoc1) is a gene that encodes a nuclear matrix protein which binds the retinoblastoma protein and facilitates transcription elongation and mRNA export [32][33][34]. A study by Liu et al. evaluated Thoc1 activity in 185 CRC samples [35]. They discovered that Thoc1 mRNA and protein expression was significantly higher in CRC tissue than in adjacent samples of normal colonic tissue [35]. A higher level of Thoc1 was associated with a more aggressive phenotype and with decreased survival. Their results suggest that Thoc1 may be used as a prognostic marker for CRC.

CD147 and MMP-11
Matrix metalloproteinases (MMPs) are a group of enzymes that play an essential role in tumor invasion and metastasis via degradation of the extracellular matrix (ECM) during tissue growth and turnover. MMP expression is usually down-regulated in healthy tissue, but levels are increased in pathological states including inflammation and tumorigenesis [36].
Unlike other MMPs, MMP-11 (also called stromelysin-3) does not play a direct role in ECM remodeling. Instead, it catalyzes the degradation of α1-antitrypsin and insulin-like growth factor binding protein-1 (IGF-BP-1) [37,38]. CD147, also known as extracellular matrix metalloproteinase inducer (EMMPRIN), plays a role in tissue remodeling. It stimulates the secretion of MMPs from tumor cells and regulates their expression and activity. Increased CD147 activity is linked to tumor growth and metastasis. CD147 has been linked to MMP-9 and MMP-2, but its correlation with MMP-11 has been unclear [39].
A study by Tian et al. evaluated the expression of CD147 and MMP-11 in 56 tissue biopsies with known CRC [40]. Compared with normal mucosa, both CD147 and MMP-11 were over-expressed at both the mRNA and protein levels in CRC tissue samples [40]. They also found a direct correlation between CD147 and MMP-11 protein expression in CRC tissues. CD147 and MMP-11 expression also influenced lymph node metastasis, distant metastasis, and TNM stage. Increased expression of both genes also correlated with a shorter survival time; therefore, they independently have prognostic value in CRC.

miR-320b and miR-206
MicroRNAs (miRNAs) are a class of small noncoding RNAs that are essential in post-transcriptional regulation. They degrade mRNA and inhibit translation, which leads to decreased gene expression [41][42][43]. MiRNA deregulation plays an important role in tumor formation and metastasis [44].
A study by Wang et al. [19] evaluated the role of miR-320b in CRC tissues of 48 patients. They found that miR-320b activity was decreased in the CRC tissues and cells. Conversely, up-regulation of miR-320b correlated with a decrease in CRC cell growth. Up-regulation of miR-320b also down-regulated the expression of c-Myc, thus identifying it as a target gene. Moreover, c-Myc overexpression diminished the effects of miR-320b [45].
Sun et al. examined the activity of miR-206 in CRC tissues and compared it to adjacent healthy tissues in 80 patients [43]. They found that miR-206 activity was significantly down-regulated in the CRC samples. Furthermore, there was a correlation between low miR-206 expression and poorer survival. MiR-206 may function as a tumor suppressor, as its up-regulation inhibits migration and invasion of CRC cells [43].

MAL, PRIMA1, PTGDR and SFRP1
DNA hypermethylation is an epigenetic event that occurs at gene regulatory sites, often in tumor suppressor genes. It causes decreased or absent gene expression, leading to the development of CRC [46][47][48]. MiRNAs (as previously described in the Wang et al. [19] and Sun et al. [43] studies) are epigenetic regulators that act on the post-transcriptional and translational processing of mRNAs. Altered expression of certain miRNAs leads to CRC progression [49].
Kalmar et al. performed whole genome expression profiling to evaluate the role of epigenetic processes on the progression of CRC. They evaluated 49 colorectal adenoma and 49 CRC samples and compared their gene expression to 49 healthy samples [50]. They identified 18 genes (including MAL, SFRP1, SULT1A1, PRIMA1, and PTGDR) which were down-regulated in both the adenoma and carcinoma samples compared to healthy tissues. Three of these genes (COL1A2, SFRP2, SOCS3) exhibited hypermethylation and THBS2 showed hypomethylation. BCL2, PRIMA1 and PTGDR were also hypermethylated but only in the CRC samples. In adenoma samples, miR-21 was overexpressed [50]. Epigenetic alterations of the identified genes and miRNAs may play an important role in CRC formation.

MUC5AC
MUC2 and MUC5AC are intestinal and gastric mucin genes, respectively, that have been linked to the serrated pathway in CRC formation [12]. One study demonstrated that MUC5AC hypomethylation occurs early in the serrated pathway and is specific for serrated polyps with MSI, CIMP-H, or the BRAF mutation. These findings indicate that MUC5AC hypomethylation may be a marker specific for malignant precursors of the serrated pathway [51].
The same study also demonstrated that MUC5AC hypomethylation was found only in microvesicular hyperplastic polyps (MVHPs) and sessile serrated adenomas (SSAs), but not in goblet cell-rich hyperplastic polyps (GCHPs). Therefore, it may also be used as a marker to distinguish GCHPs from MVHPs and SSAs. There was also a gradual increase in MUC5AC hypomethylation frequency from MVHP to SSA, indicating that MVHP may give rise to SSA and sessile serrated adenomas with dysplasia (SSA/D) [51].

TCF21
Transcription factor 21 (TCF21) plays an important role in organ development by promoting mesenchymal-to-epithelial transition (MET). Down-regulation of TCF21 via promoter hypermethylation has been associated with multiple cancers, including lung, kidney, and heart [52].
A study by Dai et al. [...] in CRC tissues. There was also a direct correlation between TCF21 methylation and lymph node invasion. Furthermore, TCF21 activity was restored with demethylation of CRC cell lines, inhibiting cell growth and invasion and promoting apoptosis. These results indicate that the TCF21 gene may serve as a tumor suppressor [53].

Transient Receptor Potential (TRP) gene family
Transient Receptor Potential (TRP) is a family of ion channels that consists of multiple subsets of genes, including TRPM (melastatin), TRPV (vanilloid), and TRPC (canonical) [54]. Up-regulation of certain TRP channels has previously been described in cancer cells, and it has been suggested that they are involved in multiple processes including cell proliferation, tumor invasion, cell differentiation, angiogenesis, and apoptosis [54,55]. Sozucan et al. investigated the expression of TRPM, TRPV, and TRPC in CRC tissue samples from 93 patients [56]. They discovered a significant decrease in the expression of the TRPV3, TRPV4, TRPV5, TRPM4, and TRPC6 genes in CRC tissues compared to healthy tissue, possibly due to epigenetic factors.

ANGPTL4
Angiopoietin-like protein 4 (ANGPTL4) is a secreted protein that plays many essential roles in tumorigenesis, including migration, invasion, and apoptosis [57,58]. ANGPTL4 is up-regulated in CRC tissues, and a study by Li et al. examined the molecular mechanism of this gene in 54 colon cancer tissue samples [57]. They determined that there was increased activity of ANGPTL4 in CRC tissue compared to normal tissue, and an even higher level correlated with liver metastasis. Molecular analysis revealed that over-expression of ANGPTL4 up-regulated bone morphogenetic protein 7 (BMP7). Furthermore, silencing of BMP7 expression reversed ANGPTL4 activity in HCT116 cells. They concluded that ANGPTL4 may be a promoter of CRC metastasis via up-regulation of BMP7 [57].

Single nucleotide polymorphisms
Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation [59], and they play an important role in the development and progression of many diseases including cancer.
Phipps et al. [60] performed a genome-wide analysis of 3494 patients with metastatic CRC across six prospective cohorts. They discovered multiple SNPs at 6p12.1 (nearest the ELOVL5 gene) that correlated with a poorer prognosis. These SNPs have been associated with a two-fold shorter survival and include the minor alleles at rs17544464, rs209489, and rs1442089 [60]. The ELOVL5 gene encodes a fatty acid elongase, and knockout of this gene in mouse models has been associated with hepatic steatosis [61], which is an independent risk factor for distant metastatic CRC [62].
Lemire et al. [63] performed a re-evaluation of a genome-wide association study (GWAS) of 16,517 patients. It revealed that an SNP at 14q23.1 (rs17094983) correlated with an increased risk of CRC. This locus is associated with the Reticulon 1 (RTN1) gene, which regulates cancer cell proliferation and survival. This gene is up-regulated in healthy colon tissue but down-regulated in cancer cells [63].
Colorectal Cancer: Open Access ISSN 2471-9943
Yao et al. conducted a meta-analysis of 34 articles, which included 90 studies, to evaluate the correlation between SNPs of the CASC8 and SMAD7 genes and CRC [3]. They discovered that the rs7837328 and rs6983267 SNPs of the CASC8 gene were linked with an increased risk of CRC in both Caucasian and Asian patients. The rs7014346 and rs10505477 SNPs also correlated with an increased risk of CRC, but only in Caucasians [3].
The rs12953717 SNP of the SMAD7 gene correlated with an increased risk of CRC in both Caucasians and Asians. On the other hand, the rs4939827 and rs4464148 SNPs correlated with an increased risk of CRC only in Caucasians [3].

Future Directions
The identification of new genes and SNPs associated with CRC in the aforementioned studies provides new insights into the pathophysiology and progression of CRC. These genes may serve as CRC prognostic markers as well as molecular targets for novel CRC therapies. Additional studies focusing on potential interactions between these genes and known CRC molecular pathways such as APC-Wnt signaling will further enhance our understanding of the molecular genetics of CRC.