Morphological characteristics of walnut floral transition

Morphological differentiation of female flower buds in walnut was continuously observed by paraffin section. We divided walnut female flower bud differentiation into stages as follows: F_1 (predifferentiation period), F_2 (initial differentiation period) and F_3 (flower primordium differentiation period). In the predifferentiation period (F_1), the growth points of the flower bud appeared flat, and the bud scales appeared light green. In the initial differentiation period (F_2), the flower stalk primordium was raised, and the bud scales were yellow-green. In the flower primordium differentiation period (F_3), the primordium of the flower stalk continued to extend, the pistil primordium began to appear, and the bud scales appeared yellow-green (Fig. 1). The leaf buds (JRL) were collected at the same stage as F_2.

Figure 1

Figure 1

Morphological characteristics of walnut female flower buds. (AC) show the female flower buds in the predifferentiation period identified in the undifferentiated stage (F_1); (DF) show the female flower buds in the initial differentiation period (F_2); (GI) show the female flower buds identified at the flower primordium differentiation period (F_3); (JL) show the leaf buds (JRL) collected at the same stage of F_2. (A,D,G,J) are images of paraffin sections with a 100 μm scale bar; (B,E,H,K) are images of paraffin sections with a 50 μm scale bar; (C,F,I,L) show the external morphology of the buds at different stages with a 1 mm scale bar.

General analysis of walnut transcriptome data

Sequencing, splicing, and annotation of transcriptome data

A total of 194,926,208 raw reads were obtained through the sequencing of the transcript libraries (Table 1). By eliminating chimeric and low-quality reads, 189,041,550 clean reads were finally screened, and 132,154 unigenes were obtained with a mean length of 956 bp; the lengths of the unigenes ranged from 201 bp to 16831 bp (Fig. 2A).

Figure 2

Figure 2

Analysis of the transcriptome data. (A) length distribution of transcripts; (B) transcriptome data annotated in seven databases; (C) the species classification of the walnut transcriptome data annotated by the NR database.

All unigenes were compared with 7 databases (NR, NT, KEGG, Swiss-Prot, PFAM, GO, and KOG/COG) with an E value cutoff of e-5 for homology. Then, the unigenes were annotated according to the results: 39.46% of unigenes were annotated by the GO database, 20.51% of unigenes were annotated by the KO database, 13.77% of unigenes were annotated by the KOG database, 53.8% of unigenes were annotated by the NR database, 41.74% of unigenes were annotated by the NT database, 38.81% of unigenes were annotated by the PFAM database, and 39.6% of unigenes were annotated by the Swiss-Prot database. In total, 78,718 unigenes of the 132,154 total unigenes were annotated to at least one database, accounting for 59.56% of all unigenes, and 11,165 unigenes were annotated by all seven major databases, accounting for 8.44% of all unigenes (Fig. 2B).

The annotation results of the NR database show that 11372 unigenes (18.1% of the total annotated unigenes) were compared to Vitis vinifera, 6522 unigenes (10.4% of the total annotated unigenes) were compared to Prunus mume, 6141 unigenes (9.7% of the total annotated unigenes) were compared to Prunus persica, 4834 unigenes (7.7% of the total annotated unigenes) were compared to Jatropha curcas, 3457 unigenes (5.5% of the total annotated unigenes) were compared to Citrus sinensis, and the remaining 48.7% unigenes were compared to other species (Fig. 2C).

GO enrichment

The gene functional classifications were identified in a GO enrichment analysis based on the 52,157 unigenes annotated in the GO database. In the biological process category, cellular process, metabolic process, and single organism process were highly represented. In the cellular component category, significantly enriched genes were associated with cell, cell part, and organelle. In the molecular function category, GO terms related to binding, catalytic activity and transporter activity were significantly enriched (Fig. 3). KEGG pathway enrichment. KEGG pathway enrichment analysis showed that transport and catabolism, signal transduction, transition, carbohydrate metabolism and environmental adaptation had the most unigenes in classifications of Metabolism, Genetic Information Processing, Environmental Information Processing, Cellular Processes and Organismal Systems, respectively (Fig. 4). In particular, Environmental Information Processing and Organismal Systems, which included many subterms associated with flowering, such as the pathways of plant hormone signal transduction (ko04075) and Circadian rhythm-plant (ko04712), were enriched (Table S1).

Figure 4

Figure 4

KEGG pathway analysis of the transcriptome data. The abscissa is the ratio of the single pathway gene number to the total annotated gene number, and the ordinate is the name of the KEGG metabolic pathway. We divided the involved genes into 5 branches: (A) Cellular Processes; (B) Environmental Information Processing; (C) Genetic Information Processing; (D) Metabolism; (E) Organismal Systems.

Transcription factor (TF) enrichment analysis

TFs function alone or with other proteins in complexes by promoting or blocking the recruitment of RNA polymerase to specific genes19,20,21. Some TFs have been shown to play key roles in plant flowering22,23,24. Among the 132,154 unigenes, 4,835 of the unigenes (approximately 3.7%) were annotated as TFs, and they fell into diverse categories covering nearly all TF families, with MYB, NAC, and AP2-EREBP being the most highly represented (Fig. 5).

Figure 5

Figure 5

Transcription factor enrichment analysis of the walnut transcriptome data.

DEGs analysis of walnut transcriptome data

Identification of DEGs

The four databases from F_1, F_2, F_3, and JRL were used for paired comparisons, and four combinations (F_1vsF_2, F_1vsF_3, F_2vsF_3, F_2vsJRL) were chosen and analyzed.

The three combinations of female flower bud developmental stages (F_1vsF_2, F_1vsF_3, F_2vsF_3) had 597 DEGs, 784 DEGs and 532 DEGs, respectively. In addition, 16 DEGs were shared by all three combinations, which means they were expressed differently in the three periods of female flower buds (Fig. 6). The expression levels of DEGs between F_1, F_2 and F_3 were plotted in a heatmap (Fig. 7A).

Figure 6

Figure 6

Numbers of DEGs in F_1, F_2, F_3 and JRL. (AC,E) volcano plot of the DEGs in F_1vsF_2, F_1vsF_3, F_2vsF_3 and F_2vsJRL. The abscissa represents the multiple expression of genes in different samples. The ordinate represents the statistical significance of the change in gene expression. The scattered points in the map represent the genes, and genes without significant differences are presented as blue dots. Upregulated genes and downregulated genes with significant differences are presented as red and green dots, respectively; (D) Venn diagram of F_1vsF_2, F_1vsF_3 and F_2vsF_3.

Figure 7

Figure 7

(A) Heatmap of 16 DEGs expressed differently in three stages of female flower buds (F_1, F_2, and F_3); (B) Heatmap of DEGs between female flower buds (F_2) and leaf buds (JRL).

In addition, the female flower buds at the beginning of differentiation (F_2) and the leaf buds of the same stage (JRL) were used for comparison (F_2vsJRL). The results showed that 374 DEGs were expressed differently in female buds (F_2) and leaf buds (JRL), of which 90 unigenes in the female flower buds (F_2) showed a higher expression level than that in the leaf buds (JRL), and the other 284 unigenes showed the opposite trend (Fig. 6E). The expression levels of 374 DEGs between female buds (F_2) and leaf buds (JRL) were plotted in a heatmap (Fig. 7B).

GO functional analysis of DEGs

We conducted GO functional analysis with the upregulated and downregulated DEGs of F_1vsF_2, F_1vsF_3, and F_2vsF_3, and the results are shown in Fig. 8. The results indicated that the DEGs of the three stages of female flower bud differentiation had functions enriched in the regulation of RNA metabolic process (GO:0051252, BP); regulation of nucleobase-containing compound metabolic process (GO:0019219, BP); apoplast (GO:0048046, CC); nucleic acid binding transcription factor activity (GO:0001071, MF); transcription factor activity, sequence-specific DNA binding (GO:0003700, MF); and sequence-specific DNA binding (GO:0043565, MF). In addition, the GO functions of the DEGs between female flower buds (F_2) and leaf buds (JRL) were enriched in photosynthesis (GO:0015979, BP), photosystem I reaction center (GO:0009538, CC), photosystem I (GO:0009522, CC), photosystem (GO:0009521, CC), and thylakoid part (GO:0044436, CC).

Figure 8

Figure 8

(AD) represent the enriched GO terms of the DEGs in F_1vsF_2, F_1vsF_3, F_2vsF_3 and F_2vsJRL, respectively. The upregulated GO terms are shown with red bars, and the downregulated GO terms are shown with blue bars.

KEGG pathway analysis of DEGs: KEGG pathway analysis of the DEGs of F_1VSF_2, F_1VSF_3, F_2VSF_3 showed that most pathways were associated with flowering, and three pathways (Photosynthesis-antenna proteins, Plant hormone signal transduction, Porphyrin and chlorophyll metabolism) were shared by the three combinations. In addition, the pathway enrichment analysis of female flower buds (F_2) and leaf buds (JRL) also included many flowering-related pathways (Fig. 9).

Figure 9

Figure 9

(AD) represent scatterplots of the top 20 enriched KEGG pathways of the DEGs in F_1vsF_2, F_1vsF_3, F_2vsF_3 and F_2vsJRL, respectively. The vertical axis represents the pathway names, the horizontal axis represents the rich factors corresponding to the pathways, the size of the q value is represented by the color of the dots, and the number of differentially expressed genes in each pathway is expressed by the size of the dots.

Analysis of DEGs related to flowering time: The flowering time genes that induce plant flowering are usually attributed to the photoperiod, vernalization, autonomous, GA, and age-regulated pathways, as well as the integrated factors of the intersection of these pathways3,6,13,25.

Based on previous studies, we summarized the important flowering time genes in the model plant Arabidopsis thaliana, and these genes were used as query genes (Table 2). Finally, we screened 31 DEGs associated with flowering time in walnut, and eighteen of the flowering time DEGs were chosen for qRT-PCR.

Table 2 Important genes involved in flowering time pathways. The number of walnut DEGs is shown after each gene name.

DEGs related to the walnut floral transition

A total of 31 unigenes were involved in the photoperiod (circadian rhythm), vernalization pathway, GA, and age-related pathways. To investigate the complex molecular mechanisms underlying floral transition in walnut, we set the 31 flowering time-related DEGs in the flowering time regulation network (Fig. 10).

Figure 10

Figure 10

Heatmap of flowering-time-related DEGs. The DEG expression levels are indicated by color code according to the scale. The squares from left to right indicate the samples in F_1, F_2, F_3, and JRL. Genes within the circle were not differentially expressed in our study; circles filled with azure blue indicate normal genes, and circles filled with yellow indicate integral genes. Arrows indicate promotion, and arrows with a bar indicate inhibition. Red lines between genes indicate that these genes can combine to form a complex. The heatmap was drawn with HEMI according to the DEG expression, and the flow chart was drawn with Visio.

Most of the DEGs were annotated in the photoperiod pathway: In total, twenty-four DEGs were annotated in this pathway, fifteen DEGs were highly expressed in the predifferentiation period (F_1), including CRY2 (1), FKF1 (1), GI (2), NF-YA (2), PIF3 (1), PPR5 (2), PRR7 (3), PRR9 (1), PRR73 (1), and PRR95 (1). CDF2 (1), CDF3 (1), and LHY (2) were highly expressed in the initial differentiation period (F_2). NF-YA (1) and TEM1 (2) were highly expressed in the flower primordium differentiation period (F_3). Two PHR2 genes had higher expression levels in leaf buds (JRL) than in flower buds (F_2), while PRR9 exhibited the opposite trend.

In Arabidopsis thaliana, CYCLING DOF FACTOR 2 (CDF2) and CYCLING DOF FACTOR 3 (CDF3) act redundantly to reduce CONSTANS (CO) expression and are degraded by the complex of GIGANTEA (GI) and FLAVIN-BINDING KELCH REPEAT F-BOX 1 (FKF1), thus releasing repression of CO and FT transcription26,27,28. Blue-light photoreceptors such as PHOTOLYASE-RELATED 2 (PHR2) are essential light detectors for the early development of plants and mediate phototropism and the expression of specific genes. Loss of a blue-light photoreceptor in the hy4 mutants of Arabidopsis thaliana substantially delayed flowering29,30. CRYPTOCHROME 2 (CRY2) can promote the expression of FT and is negatively regulated by FLOWERING LOCUS C (FLC)31. The NUCLEAR FACTOR Y (NF-Y) transcription factors (heterotrimeric complexes composed of NF-YA and the dimer of NF-YB/NF-YC) can initiate photoperiod-dependent flowering by cooperatively interacting with CO to drive the expression of FT32.

Phytochrome Interacting Factor 3 (PIF3) can bind the promoters of LATE ELONGATED HYPOCOTYL (LHY) and CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) to form an in vitro ternary complex. Additionally, the complex of CCA1 and LHY can bind with TIMING OF CAB 1 (TOC1) to form a feedback loop that is necessary for the circadian clock in Arabidopsis thaliana33. In addition, the CCA1 and LHY complexes were repressed by the Pseudo Response Regulators (PRRs) encoded by TOC16,34. The activator CO and the repressor TEMPRANILLO (TEM) have a quantitative balance that determines the FT transcription level, as it shifts the CO/TEM balance in favor of CO activity, allowing FT transcription to reach the threshold level required to trigger flowering35.

In the age-regulated pathway, three DEGs were annotated as TREHALOSE-6-PHOSPHATE SYNTHASE 1 (TPS1), which is essential for normal vegetative growth and transition to flowering36,37. They were differentially expressed between the different flower bud formation stages, and the flower buds showed no difference from the leaf buds in their expression level.

In the vernalization pathway, vernalization can cause epigenetic changes in FLC and indirectly promote flower formation by histone modification, and MADS AFFECTING FLOWERING 5 (MAF5) is an FLC-related gene38,39,40,41,42,43. VERNALIZATION INSENSITIVE 3-LIKE 2 (VIL2), which can repress MAF5 and permit more rapid flowering during noninductive photoperiod conditions in Arabidopsis thaliana44. In this study, VIL2 was highly expressed in the predifferentiation period (F_1), decreased in the initial differentiation period (F_2), and decreased to the lowest point at the flower primordium differentiation period (F_3). In the leaf buds (JRL), its expression level was higher than that in the flower buds during the same period (F_2).

In the GA pathway, GA regulates LEAFY (LFY) via DELLAs, miR159 and MYBs45,46,47. The combination of gibberellin and GIBBERELLIN-INSENSITIVE DWARF1 (GID1) can combine with the DELLA protein to form a GA-GID1-DELLA trimer, and then the SKP1-CUL1-F-box (SCF) polymer can tag the trimer to induce the ubiquitin 26 S proteasome to degrade the DELLA protein, relieving the inhibitory effect of the DELLA protein on plant growth and producing gibberellin effects48,49. In this study, GID1 was highly expressed in the predifferentiation period (F_1), and its expression was downregulated in the initial differentiation period (F_2). In addition, the expression of GID1 was higher in flower buds (F_2) than in leaf buds (JRL).

In addition to the DEGs mentioned above, there are two DEGs annotated as APETALA1 (AP1). In Arabidopsis thaliana, AP1 is required for the floral transition50. In this study, the two AP1 genes were nearly undetectable in the predifferentiation period (F_1), and their expression showed a significant upregulation in the initial differentiation period (F_2).

Coexpression networks

Weighted gene coexpression network analysis (WGCNA) is a biology method for interaction analysis and resolving correlation networks51. To search for the genes involved in flowering time regulation in walnut, thirty-one flowering time-related DEGs were used to construct a coexpression network using the WGCNA method, and the results are presented in Fig. 11. In the coexpression network, many of the hub genes that participate in flowering time regulation were identified, such as JrCDF-2, JrCDF-3, JrPRR7, JrPRR7-1, JrPRR7-2, JrTPS-1-1, and the hub genes with the highest edge numbers were JrCRY2 and JrNF-YA-2.

Figure 11

Figure 11

Coexpression networks of 31 DEGs related to flowering time. In the drawn weight network graph, the weight between genes is divided into four parts, which are represented by point lines, short dotted lines, long dotted lines and solid lines from light to heavy weights. Larger nodes and redder colors indicate greater connectivity of the genes in the network graph. The coexpression networks were drawn on the OmicShare website (http://www.omicshare.com/tools/Home/) based on the correlation coefficients between the genes.

Verification of DEG expression by qRT-PCR

To further verify the gene expression levels shown by RNA-Seq, we chose eighteen flowering time-related DEGs to perform qRT-PCR. Except for JrTPS1-2, the expression trends of the DEGs showed high similarity between the qPCR data and the RNA-Seq data (Fig. 12).

Figure 12

Figure 12

Verification of 18 flowering time-related DEGs by qRT-PCR.

Source