Sampling site and regimes
Field work was carried out in the agricultural catchment of Grytelandsbekken (a creek of approx. 2.5 km in length) also known as Skuterud catchment, located in the municipality of Ås (20,000 people), 30 km southeast of Oslo (Fig. 1). The catchment area is of approx. 4.5 km2 and largely consists of farmlands (60%) and forest/marshlands (31%). Grytelandsbekken has previously been studied for spatial variations in the microbial communities of various lotic freshwater ecosystems in different regions of Norway21. Based on that study, it was characterised as a rural creek with the highest diversity and abundance of microbial communities. Thus, in this follow-up study, Grytelandsbekken, specifically, was selected for assessing the seasonal dynamics of lotic freshwater bacterial communities.
All field measurements and samplings were carried out over a 2-year period at the same site of the creek, i.e. at about 0.25 km. In total, there were 16 sampling events, split equally between cold and warm seasonal regimes. The cold season refers to December–March, while the warm season includes June–September. In detail, there were four events during the cold season in the first year, Cold_1 (Cold_1a/Dec.2014, Cold_1b/Jan.2015, Cold_1c/Feb.2015, and Cold_1d/Mar.2015) and four events during the cold season in the second year, Cold_2 (Cold_2a/Dec.2016, Cold_2b/Jan.2017, Cold_2c/Feb.2017, and Cold_2d/Mar.2017). A similar sample assembly was applied to the warm seasonal regimes, i.e. four in the first year, Warm_1 (Warm_1a/Jun.2015, Warm_1b/Jul.2015, Warm_1c/Aug.2015, and Warm_1d/Sep.2015) and four in the second year, Warm_2 (Warm_2a/Jun.2016, Warm_2b/Jul.2016, Warm_2c/Aug.2016, and Warm_2d/Sep.2016). The entire study duration and sampling sets were conceived based on similar settings reported in aquatic research worldwide, profiling microbial communities through seasonal and spatial variations20,35,36.
Primary physico-chemical parameters that are routinely measured in standard catchment water quality control37,38 were selected for abiotic characteristics of the lotic environment. These included organic matter content (expressed as CODCr), TSS, nutrients (Ptot and Ntot), EC, pH, and Temp. The latter was measured in situ at one of the weather stations, administrated by the Norwegian Institute of Bioeconomy Research (NIBIO) and located at the field measuring/sampling site of Grytelandsbekken. The station is equipped with a number of automatic sensors, providing hourly measurements and registering various climatic data online, which are all open and available at the AgroMetBase hosted by NIBIO (https://lmt.nibio.no/agrometbase/getweatherdata.php). The former parameters were analysed post-sampling in an accredited laboratory of the ALS Laboratory Group Norway AS. These analyses were performed in accordance with ISO and national standards for the respective parameters: CODCr (ISO 15705), TSS (CSN EN 872, NS 4733), Ptot (ISO 6878, ISO 15681-1), Ntot (EN 12260), EC (EN 27 888, SM 2520B, EN 16192), and pH (ISO 10523, EPA 150.1, EN 16192).
Genomic DNA purification: 16S rRNA amplicon library preparation and MiSeq sequencing
The seasonal water samples underwent DNA extraction using the QIAGEN DNeasy PowerWater Kit (QIAGEN GmbH, Hilden, Germany). In practice, 0.5 L of water was subjected to ultrafiltration to obtain a solid mass on a membrane filter (0.45 µm). DNA was then extracted from the collected filters, following the manufacturer’s instruction. The concentration and quality of the purified DNA was analysed on the NanoDrop Spectrophotometer (Thermo Fisher Scientific, Wilmington, USA). Five nanograms of purified DNA was used in PCR to prepare an amplicon library using the NEXTflex 16S V4 Amplicon-Seq Kit 2.0 (Bioo Scientific Corporation, Austin, TX, USA), following the provided protocol, which has previously been described in detail21. The applied specific 16S V4 forward primer (5′-GACGCTCTTCCGATCTTATGGTAATTGTGTGCCAGCMGCCGCGGTAA-3′) and reverse primer (5′-TGTGCTCTTCCGATCTAGTCAGTCAGCCGGACTACHVGGGTWTCTAAT-3′) were provided by the manufacturer and included in the sequencing kit. The library was prepared in triplicate for each sample. The final concentration of each library was measured on a Qubit Fluorometer (Life Technologies, Eugene, OR, USA) using the Quant-IT dsDNA HS Assay Kit (Invitrogen, Carlsbad, CA, USA). Pooling of all prepared libraries was achieved after concentration normalization using the SequalPrep Normalization Plate Kit (Thermo Fisher Scientific, Wilmington, USA). The pooled library was analysed on the Agilent 2100 Bioanalyzer system using the Agilent High Sensitivity DNA Kit (Agilent, CA, USA). It indicated a single band/unique product at 451 bp. The multiplexed library was sequenced on the Illumina MiSeq system using the MiSeq Reagent Kit V3, 600 Cycles (Illumina Inc., San Diego, CA, USA), following the default standard procedures.
Sequence data processing
The output sequence datasets were analysed using Microbial Genomics Module 2.0, added onto the CLC Genomic Workbench 10.1.1 (CLC Bio, QIAGEN Company, Aarhus, Denmark, https://www.qiagenbioinformatics.com/products/clc-genomics-workbench). The processing workflow consisted of four key components: quality filtration, OTU clustering, and alpha and beta diversity measures. Adapter and primer sequences were trimmed. Unqualified reads were trashed when the quality score was less than 20 or a higher number of ambiguous nucleotides (more than two) were detected. The average length after trimming was between 220–230 bp. Chimeric sequences and singletons were detected and discarded. The remaining qualified reads were used to characterize OTUs based on a reference database (Greengenes v_13_5)39 at a 97% identity level. The bacterial alpha diversity of each sample was estimated in rarefaction analysis with a depth cutoff at 50,000 reads. The bacterial beta diversity applied the Euclidean distance criterion (EDC) to estimate the community similarities between the examined samples. All sequence data are available at NCBI Sequence Read Archive, under accession number SRR10835654-669, as part of BioProject PRJNA599104.
Alpha diversity differences were tested using a two-tailed Student’s t-test at 0.05 significance. This was performed in the XLSTAT-ECOLOGY statistical software package version 2019.1.1 (Addinsoft 2020, Boston, USA, https://www.xlstat.com). A hierarchical clustering heat map was created to elucidate the bacterial community similarities/relatedness (pairwise) among all examined samples. It was conducted on a subset of top 700 OTUs, based on the EDC using the trimmed mean of M-values and Z-score (standard deviation numbers from the population mean) normalizations. This was further supported by the PERMANOVA analysis, performed to ascertain statistical significance of the clusters (p-value = 0.00001). These tests were executed using a package of functional features included in the Microbial Genomics Module (CLC Bio, QIAGEN Company, Aarhus, Denmark, https://www.qiagenbioinformatics.com/products/clc-genomics-workbench). Furthermore, the LEfSe tool40 was applied to identify the responsible bacterial members, accounting for community discrepancy between cold and warm seasons. The formatted abundance table of bacterial classes was uploaded to the Galaxy/Hutlab application web-based platform (Biostatistics Department, Harvard School of Public Health, Boston, MA, USA, https://huttenhower.sph.harvard.edu/galaxy) for pairwise comparison. Statistical significance of the comparison was determined by the Wilcoxon rank-sum test at the alpha value of 0.05. The identified features characterising the microbial differences among samples were processed using LDA, with a threshold score set at 2.0. Beyond that, RDA was carried out to determine the key abiotic environmental variables driving seasonal changes of the bacterial community based on the Pearson correlation test, with a statistical significance level higher than 95% (p < 0.05). It was performed using the most abundant bacterial phyla identified and determined physico-chemical parameters (abiotic environmental factors). Based on the computed scores of explanatory variables, only those with the uppermost weights and significant relations between the biotic and abiotic factors were included in the RDA plot. The RDA was performed in the XLSTAT-ECOLOGY statistical software package version 2019.1.1 (Addinsoft 2020, Boston, USA, https://www.xlstat.com).
The figure presenting the study site location (Fig. 1) is based on a satellite image and topographical map of Norway provided by the Norwegian Institute of Bioeconomy Research (NIBIO). The Institute’s geographical data is documented in the NIBIO’s primary map service the Source/Kilden (https://kilden.nibio.no). All maps and images collected in the Source/Kilden are open, and hence, they can freely be saved or printed (https://www.nibio.no/en/subjects/soil/national-land-resource-map?locationfilter=true).
Figures containing results and statistics (Figs. 2, 3, 4, 5) were generated using tools in the CLC Microbial Genomics Module version 2.5.1 (CLC Bio, QIAGEN Company, Aarhus, Denmark, https://www.qiagenbioinformatics.com/clc-microbial-genomics-module-latest-improvements), Galaxy version 1.0 (the Huttenhower Lab, Harvard School of Public Health, Boston, MA, USA, https://huttenhower.sph.harvard.edu/galaxy), and XLSTAT-ECOLOGY version 2019.1.1 (Addinsoft 2020, Boston, USA, https://www.xlstat.com).