Comparative Genomics of NBS Disease Resistance Genes Across Plant Species: Evolution, Mechanisms, and Biomedical Applications

Lucy Sanders, Nov 26, 2025



Abstract

This article provides a comprehensive synthesis of recent advances in the comparative analysis of Nucleotide-Binding Site (NBS) genes, the largest class of plant disease resistance genes. Aimed at researchers, scientists, and drug development professionals, it explores the remarkable diversity and evolution of NBS genes from bryophytes to angiosperms, detailing methodologies for genome-wide identification and classification. The content covers the functional validation of NBS genes in plant immunity, the challenges in studying these highly variable genes, and comparative genomic insights from key horticultural crops and monocot-dicot systems. By integrating findings from large-scale genomic studies and functional analyses, this review highlights the potential of NBS genes as a genetic resource for improving disease resistance in crops and informs strategies for managing genetic disease resistance in a biomedical context.

Unraveling the Diversity and Evolutionary History of the NBS Gene Superfamily

Nucleotide-binding site (NBS) genes encode a critical class of plant resistance (R) proteins that serve as intracellular immune receptors, forming the core of the plant immune system known as effector-triggered immunity (ETI). These proteins, predominantly characterized by their nucleotide-binding site and leucine-rich repeat (NBS-LRR) domains, enable plants to detect specific pathogen effector molecules and initiate robust defense responses [1] [2]. This comparative guide examines the diversification, recognition mechanisms, and experimental approaches for studying NBS genes across plant species, providing researchers with essential methodologies and resources for advancing disease resistance research. Through systematic analysis of recent findings, we highlight the sophisticated strategies plants employ to combat evolving pathogens and the experimental tools available for dissecting these mechanisms.

NBS Gene Architecture and Classification

NBS-LRR genes form the largest class of plant resistance genes, and the proteins they encode act as specialized immune sensors that detect pathogen invasion. These proteins typically have a modular architecture with three components: a variable N-terminal domain, a central NB-ARC domain (the nucleotide-binding adaptor shared by APAF-1, R proteins, and CED-4), and a C-terminal leucine-rich repeat (LRR) region [3] [2]. The N-terminal domain determines the subfamily: TIR-NBS-LRR (TNL) proteins carry a Toll/Interleukin-1 receptor domain, CC-NBS-LRR (CNL) proteins carry a coiled-coil domain, and a third, minor subclass (RNL) carries an RPW8 domain [3].

The central NBS domain is responsible for nucleotide binding and ADP-ATP exchange, which serves as a molecular switch for activation of defense signaling [1]. The C-terminal LRR domain often participates in pathogen recognition through protein-protein interactions and regulates protein activation [2]. Genomic studies have revealed remarkable diversity in NBS domain architectures, with researchers identifying 12,820 NBS-domain-containing genes across 34 plant species ranging from mosses to monocots and dicots, classified into 168 distinct structural categories [3]. This expansion is particularly pronounced in flowering plants, with surveyed angiosperm genomes containing over 90,000 NLR genes according to the Angiosperm NLR Atlas [3].

Table 1: Major Classes of Plant NBS-LRR Proteins

Class	N-Terminal Domain	Key Features	Signaling Adaptors	Representative Examples
TNL	TIR (Toll/Interleukin-1 Receptor)	Recognizes pathogen effectors directly or indirectly; often requires EDS1 for signaling	EDS1	RPS4, RRS1-R (Arabidopsis)
CNL	CC (Coiled-Coil)	Major class involved in effector perception; shows significant expansion in angiosperms	NDR1	RPS2, RPS5 (Arabidopsis)
RNL	RPW8 (Resistance to Powdery Mildew 8)	Functions in signal transduction within the NLR network	-	ADR1 (Arabidopsis)
Atypical	Variable (e.g., WRKY)	Unique domain combinations; often involved in specific recognition	Variable	RRS1-R (contains C-terminal WRKY domain)
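
The domain-based classification in Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not a published pipeline: the function name `classify_nbs`, the domain labels (`TIR`, `CC`, `RPW8`, `NB-ARC`, `LRR`), and the extra labels for truncated architectures (`TN`, `CN`, `NL`, `N`) are assumptions made for the example.

```python
def classify_nbs(domains):
    """Assign an NBS class from domain names ordered N- to C-terminus.

    The domain labels are hypothetical; real annotation tools report
    their own identifiers (for example Pfam accessions).
    """
    doms = [d.upper() for d in domains]
    if "NB-ARC" not in doms:
        return "non-NBS"
    nbs_index = doms.index("NB-ARC")
    n_terminal = doms[:nbs_index]
    has_lrr = "LRR" in doms[nbs_index + 1:]
    # RPW8 is checked before CC because RPW8 domains are coiled-coil-like.
    if "TIR" in n_terminal:
        prefix = "T"
    elif "RPW8" in n_terminal:
        prefix = "R"
    elif "CC" in n_terminal:
        prefix = "C"
    else:
        prefix = ""
    return prefix + "N" + ("L" if has_lrr else "")


if __name__ == "__main__":
    for arch in (["TIR", "NB-ARC", "LRR"], ["CC", "NB-ARC", "LRR"],
                 ["RPW8", "NB-ARC", "LRR"], ["NB-ARC"]):
        print(arch, "->", classify_nbs(arch))
```

Under these assumptions, `classify_nbs(["TIR", "NB-ARC", "LRR", "WRKY"])` returns `TNL`; an atypical C-terminal domain such as the WRKY domain of RRS1-R would need a separate flag in a fuller implementation.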

Molecular Mechanisms of Effector Recognition and Signaling Activation

NBS-LRR proteins detect pathogen effectors through two distinct mechanisms: direct and indirect recognition. Direct recognition involves physical binding between the NBS-LRR protein and the pathogen effector, as demonstrated by the interaction of the rice Pi-ta protein with the fungal effector Avr-Pita and by the recognition of the bacterial effector PopP2 by Arabidopsis RRS1-R [1] [2]. These interactions typically occur between the LRR domain of the R protein and the pathogen effector, leading to conformational changes that activate defense signaling [2].

In contrast, indirect recognition operates through the "guard model," where NBS-LRR proteins monitor host cellular components ("guardees") that are targeted by pathogen effectors. When effectors modify these guardees, the NBS-LRR proteins detect the alteration and activate immunity [2]. Prominent examples include the Arabidopsis RPM1 and RPS2 proteins, which guard the RIN4 protein. RPM1 detects RIN4 phosphorylation by AvrRpm1 or AvrB, while RPS2 recognizes RIN4 cleavage by AvrRpt2 [2]. Similarly, RPS5 monitors the cleavage of PBS1 kinase by AvrPphB [2].

Upon effector recognition, NBS-LRR proteins undergo conformational changes that promote ADP-ATP exchange in the NBS domain, switching the protein from an inactive to an active state [1]. This activation initiates downstream signaling cascades leading to defense responses that include a rapid oxidative burst, accumulation of salicylic acid, transcriptional reprogramming, and frequently a hypersensitive response (HR), a localized programmed cell death that restricts pathogen spread [1].

Figure 1: Plant Immune Signaling Pathways. This diagram illustrates the zig-zag model of plant immunity, showing the progression from PAMP-triggered immunity to effector-triggered immunity mediated by NBS-LRR proteins.

Comparative Genomic Analysis of NBS Genes Across Species

Recent comparative genomic studies have revealed extensive diversification of NBS gene families across land plants, reflecting their adaptive evolution in response to pathogen pressures. A comprehensive analysis of 34 plant species identified 12,820 NBS-domain-containing genes, classified into 168 distinct architectural classes with both conserved and species-specific structural patterns [3]. The expansion is particularly pronounced in flowering plants: angiosperm genomes can contain far more NBS genes (e.g., 2,012 NBS-encoding genes in wheat) than non-flowering plants such as the moss Physcomitrella patens (approximately 25 NLRs) [3].

Orthogroup analysis has identified 603 conserved orthogroups (OGs), with some core orthogroups (OG0, OG1, OG2) being widely distributed across species, while others (OG80, OG82) appear to be species-specific [3]. Tandem duplications have been identified as a major driver of NBS gene expansion, contributing to the rapid evolution of new recognition specificities. Expression profiling of these orthogroups in cotton has demonstrated that OG2, OG6, and OG15 are particularly responsive to biotic and abiotic stresses, suggesting their importance in plant immunity [3].
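
The distinction between widely distributed and species-specific orthogroups can be illustrated with a short sketch. It assumes orthogroup membership is already available as (species, gene) pairs, for instance parsed from an orthology tool's output. The function name `orthogroup_breadth`, the 90% cutoff for calling an orthogroup "core", and the toy identifiers are assumptions for illustration and are not taken from the cited study.

```python
def orthogroup_breadth(orthogroups, n_species, core_fraction=0.9):
    """Label each orthogroup by how many species it spans.

    orthogroups: dict mapping an orthogroup ID to a list of
                 (species, gene_id) tuples.
    core_fraction: hypothetical cutoff; an orthogroup present in at
                   least this fraction of species is labelled "core".
    """
    labels = {}
    for og, members in orthogroups.items():
        species = {sp for sp, _gene in members}
        if len(species) == 1:
            labels[og] = "species-specific"
        elif len(species) >= core_fraction * n_species:
            labels[og] = "core"
        else:
            labels[og] = "shared"
    return labels


if __name__ == "__main__":
    # Toy data: four species (A-D) and made-up gene IDs.
    toy = {
        "OG_a": [("A", "a1"), ("B", "b1"), ("C", "c1"), ("D", "d1")],
        "OG_b": [("A", "a2"), ("A", "a3")],
        "OG_c": [("A", "a4"), ("B", "b2")],
    }
    print(orthogroup_breadth(toy, n_species=4))
```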

Table 2: NBS Gene Repertoire Diversity Across Plant Species

Plant Species	Family/Group	Ploidy	Total NBS Genes	Notable Features	Reference
Arabidopsis thaliana	Brassicaceae	Diploid	~200	Model for TNL and CNL signaling; RRS1 with WRKY domain	[1]
Solanum tuberosum (Potato)	Solanaceae	Tetraploid	587-755 NBS domains	High clustering; copy number variation between cultivars	[4]
Oryza sativa (Rice)	Poaceae	Diploid	~500	Xa27 induced by AvrXa27; Pi-ta direct recognition	[1] [2]
Gossypium hirsutum (Cotton)	Malvaceae	Tetraploid	Extensive repertoire	OG2, OG6, OG15 upregulated in stress responses	[3]
Salvia miltiorrhiza	Lamiaceae	Diploid	196 (62 complete)	Reduced TNL and RNL members; link to secondary metabolism	[5]
Nicotiana benthamiana	Solanaceae	Diploid	345 candidates	Model for functional assays; hairpin library available	[6]
Physcomitrella patens	Moss	Haploid	~25 NLRs	Small repertoire representing ancestral state	[3]

Species-specific variations in NBS gene families are particularly evident in specialized medicinal plants like Salvia miltiorrhiza, where 196 NBS-LRR genes were identified, with only 62 possessing complete N-terminal and LRR domains [5]. Comparative analysis revealed a marked reduction in TNL and RNL subfamily members in Salvia compared to other model plants, suggesting lineage-specific evolution of immune receptors [5]. Expression analysis further indicated a connection between SmNBS-LRRs and secondary metabolism, highlighting the potential intersection between defense responses and production of medicinal compounds [5].

Key Experimental Methods for NBS Gene Identification and Functional Analysis

Genome-Wide Identification and Domain Analysis

Standard protocols for identifying NBS gene families begin with screening predicted proteomes for NBS domains using hidden Markov models (HMMs) built from established domain profiles (e.g., PF00931 from the Pfam database) [3] [6]. This initial identification is typically followed by validation with batch BLASTP searches and domain architecture analysis with tools like HMMscan to confirm the presence of characteristic NBS, LRR, and TIR domains [6]. Researchers increasingly apply stringent filtering criteria (E-value < 1e-160, identity > 70%, minimum sequence length of 200 residues) to eliminate false positives from related domains such as ABC transporters and other P-loop-containing proteins [6].
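
The filtering step can be sketched as follows, assuming the domain search was run with HMMER's `hmmsearch` and its per-target table (`--tblout`) was saved. The sample table, gene names, accession version, and sequence lengths are invented for illustration, and the default E-value cutoff in `filter_hits` is a placeholder rather than a threshold from the cited studies; only the 200-residue minimum length is taken from the text.

```python
# Made-up hmmsearch --tblout content: whitespace-separated columns, with
# the full-sequence E-value in column 5 and the bit score in column 6.
SAMPLE_TBLOUT = """\
# target name accession query name accession E-value score bias E-value score bias exp reg clu ov env dom rep inc description of target
geneA - NB-ARC PF00931.25 1.2e-60 210.3 0.1 1.5e-60 210.0 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein
geneB - NB-ARC PF00931.25 3.0e-04 20.1 0.0 4.1e-04 19.7 0.0 1.2 1 0 0 1 1 1 0 ABC transporter-like
geneC - NB-ARC PF00931.25 8.8e-35 125.6 0.3 9.9e-35 125.4 0.3 1.1 1 0 0 1 1 1 1 short fragment
"""

# Hypothetical protein lengths (residues) for the three targets.
LENGTHS = {"geneA": 910, "geneB": 1300, "geneC": 150}


def parse_tblout(text):
    """Parse target name, full-sequence E-value, and score from tblout text."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(maxsplit=18)  # field 19 is the free-text description
        hits.append({
            "target": fields[0],
            "evalue": float(fields[4]),
            "score": float(fields[5]),
        })
    return hits


def filter_hits(hits, lengths, max_evalue=1e-20, min_length=200):
    """Keep targets passing both the E-value and the length filter."""
    return [
        h["target"] for h in hits
        if h["evalue"] <= max_evalue and lengths.get(h["target"], 0) >= min_length
    ]


if __name__ == "__main__":
    print(filter_hits(parse_tblout(SAMPLE_TBLOUT), LENGTHS))
```

In this toy input, `geneB` fails the E-value filter and `geneC` fails the length filter, so only `geneA` is retained. An identity filter would additionally require alignment output, which is omitted here.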

NBS Profiling and Sequence Capture

NBS profiling represents an efficient method for capturing sequence diversity in NBS domains across multiple genotypes. This approach utilizes PCR amplification with degenerate primers targeting highly conserved motifs within the NBS domain (P-loop, Kinase-2, and GLPL) to generate "NBS tags" - 200-480 bp fragments that encompass both conserved and variable regions [4]. As demonstrated in potato research, just 16 carefully designed primers can capture nearly all NBS domains from 91 genomes, providing comprehensive coverage of R gene diversity [4]. These NBS tags can then be mapped to reference genomes, with studies detecting an average of 26 nucleotide polymorphisms per NBS locus across potato cultivars, enabling haplotype analysis and marker development [4].
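
To make the idea of a degenerate primer concrete, the sketch below expands IUPAC ambiguity codes and counts how many distinct oligonucleotides one degenerate primer represents. The IUPAC code table is standard; the primer sequence is a made-up example and is not one of the 16 primers used in the potato study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    primer = "GGNATGGGNGGNRT"  # hypothetical primer, for illustration only
    print(primer, "encodes", degeneracy(primer), "sequences")
```

Degeneracy grows multiplicatively with each ambiguous position, which is one reason a small set of well-placed degenerate primers can cover many divergent NBS sequences.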

Functional Validation through Silencing Approaches

Virus-induced gene silencing (VIGS) has emerged as a powerful technique for functional characterization of NBS genes. Recent innovations include the development of comprehensive hairpin RNAi libraries targeting all predicted NBS genes in a species. For Nicotiana benthamiana, researchers have constructed a library covering 345 R gene candidates, enabling systematic functional screening [6]. This approach successfully validated known R genes including Prf, NRC2a/b, and NRC3 required for Pto/avrPto-mediated hypersensitive response, and NRG1 essential for Tobacco Mosaic Virus recognition [6]. Similarly, silencing of GaNBS (OG2) in resistant cotton demonstrated its crucial role in limiting virus titers during cotton leaf curl disease infection [3].

Figure 2: Experimental Workflow for NBS Gene Analysis. This diagram outlines the key methodological steps for comprehensive identification and functional characterization of NBS resistance genes.

Table 3: Essential Research Reagents for NBS Gene Studies

Reagent/Resource	Category	Specification/Function	Application Examples
Degenerate Primers	Molecular Biology	Target conserved NBS motifs: P-loop, Kinase-2, GLPL; designed with strategic degeneracy	NBS profiling; amplification of NBS tags from multiple genotypes [4]
Hairpin RNAi Library	Functional Genomics	Comprehensive library targeting all predicted NBS genes in a species	High-throughput functional screening; identification of R genes required for specific HR [6]
HMM Profiles	Bioinformatics	Curated domain models (e.g., Pfam PF00931); species-specific NBS HMMs	Genome-wide identification of NBS-containing genes; domain architecture analysis [3] [6]
Reference Genomes	Genomic Resources	Annotated genomes from diverse species; multiple cultivar sequences	Mapping NBS tags; identifying polymorphisms; comparative genomics [3] [4]
VIGS Vectors	Functional Validation	Virus-induced gene silencing vectors (e.g., TRV-based)	Rapid functional characterization of individual NBS genes [3] [6]
Effector Clones	Pathogen Factors	Cloned pathogen effector genes with appropriate expression systems	Testing specific R gene-effector interactions; HR assays [2] [6]

Regulation and Evolution of NBS Genes

Plant NBS genes are subject to sophisticated regulatory mechanisms that prevent inappropriate activation and balance the metabolic costs of immunity. RNA silencing plays a crucial role in negatively regulating R gene expression through both transcriptional gene silencing (DNA methylation) and post-transcriptional gene silencing (mRNA cleavage mediated by small RNAs) [1]. Specific microRNAs, including miR482 and miR472, have been shown to target nucleotide sequences encoding conserved NBS motifs, providing a layer of post-transcriptional control that may help plants maintain extensive NLR repertoires while limiting fitness costs [1] [3].

Protein stability represents another key regulatory layer, with chaperone complexes containing HSP90, SGT1, and RAR1 contributing to proper folding and stability of NBS-LRR proteins [1]. Additionally, F-box proteins like CPR1/CPR30 target specific NBS-LRR proteins for degradation through the SKP1-Cullin1-F-box (SCF) E3 ubiquitin ligase complex, preventing autoimmunity [1]. Light regulation has also been documented, with blue light receptors CRY2 and PHOT2 stabilizing R protein HRT against Turnip Crinkle Virus by suppressing COP1 E3 ubiquitin ligase-mediated degradation [1].

Evolutionarily, NBS genes exhibit remarkable dynamism, with gene duplication and loss events serving as major drivers of gene family evolution [3]. Both whole-genome duplications and small-scale duplications (tandem, segmental, and transposon-mediated) contribute to NBS gene expansion, creating genetic raw material for evolving new recognition specificities [3]. This evolutionary flexibility enables plants to rapidly adapt to changing pathogen populations, though it also creates challenges for breeding durable resistance.

NBS genes represent a central component of the plant immune system, exhibiting remarkable structural diversity and sophisticated recognition mechanisms that enable specific pathogen detection. Their distribution across plant genomes reflects an evolutionary arms race with pathogens, resulting in complex gene families that display both conserved and species-specific characteristics. The experimental methodologies reviewed here - from genome-wide bioinformatic identification to functional validation using silencing approaches - provide researchers with powerful tools to characterize these important genes across plant species.

Future research directions will likely focus on understanding the precise molecular mechanisms of NBS-LRR activation, elucidating the complete signaling networks downstream of NBS protein activation, and exploiting this knowledge for engineering broad-spectrum disease resistance in crop plants. The increasing availability of high-quality plant genomes and advanced gene editing technologies presents unprecedented opportunities to dissect NBS gene function and apply these insights to agricultural challenges. As our understanding of NBS gene evolution, regulation, and function continues to deepen, so too will our ability to harness these natural defense systems for sustainable crop protection.

Nucleotide-binding site (NBS) genes constitute the largest class of plant disease resistance (R) genes, encoding proteins that function as intracellular immune receptors in plant defense systems. These molecular sentries recognize pathogen-specific effector molecules and initiate robust immune responses, culminating in effector-triggered immunity (ETI) [7] [8]. The NBS gene family is characterized by a conserved modular architecture featuring a central NB-ARC domain (the nucleotide-binding adaptor shared by APAF-1, R proteins, and CED-4), which binds nucleotides and acts as a molecular switch through ATP/GTP hydrolysis [9] [3]. This domain is typically followed by C-terminal leucine-rich repeats (LRRs) that mediate pathogen recognition through protein-protein interactions, while the variable N-terminal domains define the major NBS classes and their distinct signaling mechanisms [3] [8].

The classification of NBS genes primarily depends on their N-terminal domain architecture, which has given rise to three principal classes: Coiled-Coil NBS-LRR (CNL), Toll/Interleukin-1 Receptor NBS-LRR (TNL), and Resistance to Powdery Mildew 8 NBS-LRR (RNL) [9] [3]. This architectural diversity is not merely structural but reflects functional specialization within the plant immune system. CNL and TNL proteins primarily function as pathogen detectors, either directly interacting with pathogen effectors or monitoring changes in host proteins targeted by these effectors [7]. In contrast, RNL proteins typically serve as "helper" NLRs involved in transducing defense signals downstream of both TNL and CNL activation [7]. Understanding the distinct features, distribution patterns, and evolutionary dynamics of these three major classes provides crucial insights for developing disease-resistant crop varieties through molecular breeding strategies.

Comparative Analysis of NBS Gene Classes

Architectural Features and Molecular Signatures

Table 1: Comparative architecture of major NBS gene classes

Class	N-Terminal Domain	Central Domain	C-Terminal Domain	Key Conserved Motifs	Structural Role
CNL	Coiled-Coil (CC) or Leucine Zipper (LZ)	NB-ARC (NBS)	Leucine-Rich Repeat (LRR)	P-loop, Kinase-2, RNBS A, GLPL, MHDL [9]	Pathogen detector; direct effector recognition [7]
TNL	Toll/Interleukin-1 Receptor (TIR)	NB-ARC (NBS)	Leucine-Rich Repeat (LRR)	P-loop, Kinase-2, RNBS A, GLPL, MHDL [9]	Pathogen detector; senses effector-induced host changes [7]
RNL	Resistance to Powdery Mildew 8 (RPW8)	NB-ARC (NBS)	Leucine-Rich Repeat (LRR)	P-loop, Kinase-2, RNBS A, GLPL, MHDL [9]	"Helper" NLR; downstream signal transduction [7]

The CNL class is characterized by an N-terminal coiled-coil (CC) domain that facilitates protein oligomerization and signaling. The CC domain is typically 150-200 amino acids in length and forms alpha-helical structures that enable homotypic interactions [8]. The TNL class features a Toll/interleukin-1 receptor (TIR) domain at the N-terminus, which exhibits homology to animal immune receptors and functions in signal transduction through NADase activity [3]. The RNL class contains an N-terminal RPW8 domain, named after the Resistance to Powdery Mildew 8 protein from Arabidopsis, which is involved in signal transduction and cell death execution [9] [7].

All three classes share the conserved NB-ARC domain, which acts as a molecular switch by cycling between ADP-bound (inactive) and ATP-bound (active) states [3]. This domain typically contains several conserved motifs including the phosphate-binding loop (P-loop), kinase-2 motif, RNBS-A, GLPL, and MHDL motifs, which are essential for nucleotide binding and hydrolysis [9]. The C-terminal LRR domain across all classes consists of multiple tandem repeats of a 20-30 amino acid motif rich in leucine residues, creating a curved solenoid structure that provides a versatile framework for specific protein-protein interactions and pathogen recognition [8].
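
As a simple illustration of motif scanning, the sketch below searches a protein sequence for the general Walker A (P-loop) consensus G-x(4)-G-K-[ST]. This consensus is common to P-loop NTPases in general, so a match alone does not identify an NBS domain; it is not the curated motif model used by tools such as MEME. The function name and the test sequence are made up for the example.

```python
import re

# General Walker A / P-loop consensus: G, any four residues, G, K, then S or T.
P_LOOP = re.compile(r"G.{4}GK[ST]")


def find_p_loop(protein):
    """Return (0-based start, matched text) for each non-overlapping P-loop match."""
    return [(m.start(), m.group()) for m in P_LOOP.finditer(protein.upper())]


if __name__ == "__main__":
    seq = "MAEAAAGMGGVGKTTLAQKVFNDE"  # hypothetical fragment
    print(find_p_loop(seq))
```

Because other P-loop-containing proteins (for example ABC transporters) also match this pattern, motif hits are best combined with domain-level evidence and the expected order of the remaining NBS motifs.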

Distribution Across Plant Lineages

Table 2: Distribution of NBS gene classes across plant species

Plant Species	Family	CNL	TNL	RNL	Total NBS	Special Patterns
Arabidopsis thaliana [1]	Brassicaceae	-	-	-	~200	All three classes present
Sunflower (Helianthus annuus) [9]	Asteraceae	100 (CNL) + 64 (RX_CC-like)	77	13	352	One-third clusters on chromosome 13
Sweet potato (Ipomoea batatas) [7]	Convolvulaceae	CN-type: Most common	-	-	889	83.13% in clusters
Nicotiana tabacum [10]	Solanaceae	CC-NBS: 23.3%	TIR-NBS: 2.5%	-	603	45.5% NBS-only
Dendrobium officinale [11]	Orchidaceae	10	0	-	74	TNL absence in monocots
Salvia miltiorrhiza [5]	Lamiaceae	Majority	Marked reduction	Marked reduction	196	Reduced TNL/RNL

The distribution of NBS gene classes follows distinct evolutionary patterns across plant lineages. CNL genes are found across angiosperms and are the most widespread and numerous class in most plant genomes [3]. TNL genes have a more restricted distribution: they are present in dicots but absent from monocots such as rice, maize, and orchids [11]. This absence in monocots is potentially linked to the lack of downstream signaling components such as NRG1/SAG101 in the TNL signaling pathway [11]. RNL genes represent the smallest class across all surveyed plant species, consistent with their specialized role as helper NLRs rather than primary pathogen sensors [9] [7].

Comparative genomic analyses reveal dramatic variation in NBS gene numbers across species, reflecting diverse evolutionary trajectories. Some plant families like Fabaceae (legumes) exhibit consistent expansion of NBS genes, while others like Cucurbitaceae display contraction patterns due to frequent gene loss and limited duplication [7]. The Solanaceae family shows remarkable diversity even among closely related species, with potato demonstrating "continuous expansion," tomato showing "expansion followed by contraction," and pepper exhibiting "contraction" patterns [7]. These distinct evolutionary dynamics highlight the rapid birth-and-death evolution characteristic of the NBS gene family, driven by continuous host-pathogen coevolution.

Experimental Protocols for NBS Gene Identification and Validation

Genome-Wide Identification Pipeline

The identification and classification of NBS genes across plant genomes follows established computational pipelines that leverage conserved domain architectures. The standard protocol begins with sequence retrieval from genomic databases such as Phytozome, NCBI, or specialized genome portals (e.g., Sunflower Genome Database, Ipomoea Genome Hub) [9] [7]. Researchers then perform HMMER searches using hidden Markov models of the NB-ARC domain (PF00931 from Pfam) as queries, with either a permissive E-value cutoff of 1.0 or a more stringent threshold (1.1e-50 in some studies) [9] [3]. This initial identification is typically followed by domain validation using multiple tools, including InterProScan, SMART, and NCBI's Conserved Domain Database (CDD), to confirm the presence of characteristic domains (CC, TIR, RPW8, LRR) [12] [10].

Advanced pipelines like RGAugury incorporate additional validation steps including motif analysis using MEME suite to identify conserved order of motifs like P-loop, kinase-2, RNBS-A, GLPL, and MHDL [9]. Gene classification is then performed based on domain architecture, followed by chromosomal mapping and cluster analysis to identify tandemly duplicated genes [7] [12]. More sophisticated approaches integrate comparative genomics through synteny analysis using tools like MCScanX to identify orthologous genes across related species [7] [10]. The entire workflow typically employs custom scripts to integrate these various bioinformatics tools into a cohesive pipeline for comprehensive NBS gene identification and classification.
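
The chromosomal cluster analysis step can be sketched as a single pass over genes sorted by position. The 200 kb maximum gap, the minimum cluster size of two, and the gene coordinates below are assumptions chosen for illustration; published studies use their own cluster definitions.

```python
from collections import defaultdict


def find_clusters(genes, max_gap=200_000, min_size=2):
    """Group neighbouring NBS genes on the same chromosome into clusters.

    genes: iterable of (gene_id, chrom, start, end) tuples.
    max_gap: hypothetical maximum distance (bp) between consecutive genes.
    Returns a list of (chrom, [gene_ids]) tuples.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start, end in genes:
        by_chrom[chrom].append((start, end, gene_id))

    clusters = []
    for chrom in sorted(by_chrom):
        current, prev_end = [], 0
        for start, end, gene_id in sorted(by_chrom[chrom]):
            # Close the running cluster when the gap to the next gene is too large.
            if current and start - prev_end > max_gap:
                if len(current) >= min_size:
                    clusters.append((chrom, current))
                current, prev_end = [], 0
            current.append(gene_id)
            prev_end = max(prev_end, end)
        if len(current) >= min_size:
            clusters.append((chrom, current))
    return clusters


if __name__ == "__main__":
    toy_genes = [
        ("g1", "chr1", 1_000, 5_000),
        ("g2", "chr1", 20_000, 24_000),
        ("g3", "chr1", 900_000, 905_000),
        ("g4", "chr2", 100, 4_000),
    ]
    print(find_clusters(toy_genes))
```

Physical proximity alone does not prove tandem duplication; in practice it is combined with sequence similarity or synteny evidence from tools such as MCScanX.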

NBS Gene Identification Workflow: Computational pipeline for genome-wide identification and classification of NBS genes.

Functional Validation Approaches

Functional characterization of NBS genes employs both expression analysis and genetic approaches. Transcriptome profiling using RNA-seq data from various tissues and stress conditions identifies differentially expressed NBS genes [7] [11]. Studies typically analyze expression patterns across different tissues (leaf, stem, root, flower) and under biotic stress conditions (pathogen infection) and abiotic stresses (drought, salinity, hormone treatments) [3] [7]. The differential expression analysis pipeline involves quality control of raw reads (Trimmomatic), alignment to reference genomes (HISAT2), transcript quantification (Cufflinks/Cuffdiff with FPKM normalization), and identification of significantly differentially expressed genes [10].
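
The FPKM normalization mentioned above can be written out explicitly. The sketch below uses the standard definition (fragments per kilobase of transcript per million mapped fragments) and adds a log2 fold change with a pseudocount of 1; the pseudocount and the example numbers are assumptions for illustration, and dedicated tools apply more careful statistical models.

```python
import math


def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two expression values; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


if __name__ == "__main__":
    # Hypothetical gene: 500 fragments, 2,000 bp long, 25 million mapped fragments.
    print(fpkm(500, 2_000, 25_000_000))
    print(log2_fold_change(15.0, 3.0))
```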

For experimental validation, virus-induced gene silencing (VIGS) has proven effective for functional characterization, as demonstrated in cotton where silencing of GaNBS (OG2) increased susceptibility to viral infection, confirming its role in disease resistance [3]. Quantitative reverse-transcription PCR (qRT-PCR) provides precise measurement of expression changes for selected candidate genes, typically using resistant and susceptible cultivars under pathogen challenge [7]. For example, sweet potato studies selected six differentially expressed NBS genes for qRT-PCR validation in resistant and susceptible lines infected with stem nematodes and Ceratocystis fimbriata, confirming RNA-seq results [7]. Additional functional insights come from promoter analysis identifying cis-elements related to hormone response (salicylic acid, jasmonic acid, ethylene) and stress responses, and protein interaction studies examining interactions with pathogen effectors and signaling components [3] [5].
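
Relative expression from qRT-PCR is commonly computed with the 2^-ΔΔCt method. The source does not state which quantification method the cited studies used, so the sketch below is an assumption; it also assumes roughly 100% amplification efficiency for both target and reference genes. All Ct values are invented.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^-ddCt method (assumes ~100% primer efficiency)."""
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values for an NBS gene and a reference gene.
    fold = relative_expression(22.0, 18.0, 25.0, 18.0)
    print(f"fold change in infected vs. mock plants: {fold:.1f}")
```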

Functional Validation Approaches: Experimental methods for characterizing NBS gene function.

NBS-Mediated Signaling Pathways in Plant Immunity

The NBS gene classes operate within sophisticated signaling networks that constitute the plant immune system. The CNL and TNL classes function as pathogen recognition receptors that initiate effector-triggered immunity (ETI), while RNL proteins act as signaling helpers that amplify and transduce defense signals [7]. The activation mechanism involves direct or indirect recognition of pathogen effectors, typically through the LRR domain, which induces conformational changes in the NBS domain that promote nucleotide exchange (ADP to ATP) and activate the protein [8].

Upon activation, CNL and TNL proteins trigger downstream signaling cascades that lead to defense activation, typically including a hypersensitive response (HR) characterized by programmed cell death at the infection site to restrict pathogen spread [8]. TNL proteins specifically require EDS1 (Enhanced Disease Susceptibility 1) as a central signaling component, which forms complexes with the related protein SAG101 and the helper NLR NRG1 to transduce signals [11]. CNL proteins utilize NDR1 (Nonrace-specific Disease Resistance 1) as a key signaling adapter [8]. RNL proteins, such as NRG1 and ADR1, function downstream of both TNL and CNL activation and are essential for HR cell death and defense gene amplification [7].

NBS-Mediated Immune Signaling: Simplified representation of plant immune pathways showing CNL, TNL, and RNL interactions.

Recent studies have revealed the complex interplay between these signaling components. In Arabidopsis, the TNL gene RPS4 confers specific resistance to bacterial pathogens in an EDS1-dependent manner [12]. Similarly, the cotton CNL gene GbCNL130 provides resistance to verticillium wilt across different hosts [12]. The emerging paradigm suggests that while CNL and TNL proteins specialize in pathogen recognition through diverse LRR domains, RNL proteins provide conserved signaling functions that are shared across multiple resistance pathways, explaining their smaller numbers but essential roles in plant immunity [7].

Table 3: Essential research reagents and computational tools for NBS gene analysis

Category	Tool/Reagent	Specific Application	Key Features
Bioinformatics Tools	HMMER v3.1b2 [10]	Domain identification using HMM profiles	PF00931 (NB-ARC) domain search
	InterProScan [8]	Multi-domain architecture analysis	Integrates multiple domain databases
	MEME Suite [12]	Conserved motif discovery	Identifies P-loop, kinase-2, other motifs
	MCScanX [10]	Synteny and duplication analysis	Identifies orthologous gene pairs
	OrthoFinder [3]	Orthogroup inference	Determines evolutionary relationships
Databases	PRGdb [9]	Curated R gene repository	153 cloned R genes, 177,072 annotated PRGs
	Pfam [10]	Protein family database	HMM profiles for NB-ARC, TIR, LRR domains
	NCBI CDD [10]	Conserved domain analysis	Domain architecture validation
	Plaza [3]	Comparative genomics	Evolutionary analyses across species
Experimental Methods	VIGS [3]	Functional gene validation	Rapid gene silencing in plants
	RNA-seq [7]	Expression profiling	Genome-wide expression analysis
	qRT-PCR [7]	Targeted expression validation	Precise quantification of candidate genes

The experimental toolkit for NBS gene research continues to evolve with technological advancements. Next-generation sequencing platforms enable high-quality genome assemblies that are crucial for accurate NBS gene annotation, as incomplete genomes often lead to underestimation of NBS gene numbers [7]. Specialized databases like ANNA (Angiosperm NLR Atlas) provide comprehensive collections with over 90,000 NLR genes from 304 angiosperm genomes, including 18,707 TNL, 70,737 CNL, and 1,847 RNL genes [3]. For functional studies, virus-induced gene silencing (VIGS) has emerged as a powerful technique for rapid functional characterization, particularly in species with challenging genetics or long generation times [3].
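
The class counts quoted for ANNA can be checked with simple arithmetic. The snippet below only uses the three numbers given in the text and computes their total and percentage shares; the variable names are arbitrary.

```python
# Class counts as quoted in the text for the Angiosperm NLR Atlas (ANNA).
anna_counts = {"TNL": 18_707, "CNL": 70_737, "RNL": 1_847}

total = sum(anna_counts.values())
shares = {cls: round(100 * n / total, 1) for cls, n in anna_counts.items()}

if __name__ == "__main__":
    print("total NLR genes:", total)
    for cls, pct in shares.items():
        print(f"{cls}: {pct}%")
```

The three classes sum to 91,291 genes, consistent with the "over 90,000" figure, and CNLs account for roughly three quarters of the total.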

Machine learning approaches are increasingly complementing traditional domain-based methods for R gene prediction. Tools like DRAGO2/3, RGAugury, RRGPredictor, NLR-Annotator, and NLRtracker incorporate advanced algorithms to improve the accuracy of NBS gene identification and classification [8]. These computational advances are particularly valuable for handling the high sequence diversity and complex evolutionary patterns characteristic of NBS gene families. The integration of these bioinformatics tools with experimental validation creates a powerful framework for elucidating the roles of specific NBS genes in plant immunity and their potential applications in crop improvement.

Plant immunity relies on a sophisticated innate system where Nucleotide-binding Leucine-rich Repeat receptors (NLRs) serve as critical intracellular sentinels. These proteins recognize pathogen-specific effectors, initiating a robust defense response known as Effector-Triggered Immunity (ETI), often characterized by a hypersensitive response and programmed cell death to restrict pathogen spread [13] [14]. The NLR gene family exhibits extraordinary diversity in sequence and size across the plant kingdom, making it a focal point for understanding plant-pathogen co-evolution. This guide provides a comparative analysis of NLR diversity, from the modest repertoires in early land plants like bryophytes to the expansive families in flowering plants, synthesizing current genomic findings to aid researchers in selecting appropriate model systems and interpreting experimental data across species.

Genome-wide studies reveal dramatic variation in the number of NLR genes across different plant lineages. The following table summarizes this quantitative diversity, highlighting the contrast between ancient and modern plant groups.

Table 1: NLR Gene Repertoire Size Across Plant Species

Plant Species/Group	Classification	NLR Count	Key Characteristics and Subfamily Distribution
Bryophytes (e.g., Physcomitrella patens)	Non-vascular plants	~25 [3]	Minimal NLR repertoire; foundational ETI components
Lycophytes (e.g., Selaginella moellendorffii)	Early vascular plants	~2 [3]	Highly reduced NLR family
Salvia miltiorrhiza (Danshen)	Medicinal dicot (Angiosperm)	196 total, 62 typical [13]	61 CNL, 1 RNL, 0 TNL; marked TNL/RNL degeneration
Asparagus officinalis (Garden asparagus)	Horticultural crop (Angiosperm)	27 [15]	NLR contraction from wild relatives (63 in A. setaceus)
Capsicum annuum (Pepper)	Solanaceous crop (Angiosperm)	288 [14]	Tandem duplication-driven expansion, telomeric clustering
Triticum aestivum (Bread wheat)	Cereal crop (Angiosperm)	>2000 [3]	Massive lineage-specific expansion

The diversity is not merely numerical. NLR proteins are classified into subfamilies based on their N-terminal domains: CNLs (Coiled-Coil), TNLs (Toll/Interleukin-1 Receptor), and RNLs (RPW8). The distribution of these subfamilies also varies significantly. For instance, while the model plant Arabidopsis thaliana possesses all three types, monocots like rice (Oryza sativa) have completely lost the TNL subfamily, and some dicots like Salvia miltiorrhiza show a marked reduction in TNLs and RNLs [13] [15].

Table 2: NLR Subfamily Distribution and Evolutionary Trends

Plant Group	CNL Subfamily	TNL Subfamily	RNL Subfamily	Primary Evolutionary Driver
Bryophytes	Present	Present	Present	Foundational repertoire
Monocots (e.g., Rice, Wheat)	Highly expanded	Lost	Present	Tandem duplications
Eudicots (General)	Highly expanded	Variable (often expanded)	Present (small)	Tandem/segmental duplications
Salvia Species	Dominant (e.g., 61/62 in S. miltiorrhiza)	Lost or highly degenerate	Minimal (e.g., 1/62 in S. miltiorrhiza)	Lineage-specific degeneration

Evolutionary Mechanisms Driving NLR Diversity

The staggering disparity in NLR family sizes is primarily driven by several evolutionary mechanisms that operate at different scales across plant lineages.

Gene Duplication and Genome Dynamics

Tandem duplication is a major force for NLR expansion, particularly in angiosperms. This process creates clusters of NLR genes, often near telomeric regions, which act as hotspots for generating new resistance specificities through recombination and diversifying selection [14]. In pepper (Capsicum annuum), for example, 18.4% (53/288) of NLRs arose from tandem duplications, with Chr08 and Chr09 being primary sites [14]. Whole-genome duplications (WGDs), in contrast, supply raw material for expansion across the entire genome, as observed in mosses (e.g., Bryidae) since the early Cretaceous [16].

Pathogen-Driven Selection and Domestication Bottlenecks

Plants engage in a continuous "arms race" with pathogens, where the evolution of a new pathogen effector selects for novel NLR recognition capabilities. This results in positive selection, particularly on the LRR domain responsible for effector recognition [14]. Conversely, domestication can lead to NLR contraction. A striking example is garden asparagus (Asparagus officinalis), which has only 27 NLRs, compared to 63 and 47 in its wild relatives A. setaceus and A. kiusianus, respectively. This loss, potentially due to selection for yield and quality, correlates with increased disease susceptibility [15].

Lineage-Specific Gene Gain and Loss

Deep evolutionary trajectories shape NLR repertoires. Bryophytes, despite their simple body plans, are now known to occupy a larger gene family space than vascular plants [17] [16]. However, their NLR repertoire remains small, suggesting alternative defense strategies or that the major expansion of NLRs is a hallmark of vascular plants [3]. Subsequent lineages experienced independent gains and losses, such as the complete loss of TNLs in monocots and their reduction in certain dicot lineages like Salvia [13].

Methodologies for NLR Identification and Analysis

A standardized workflow is employed for comprehensive genome-wide NLR identification and characterization. The following diagram outlines the core bioinformatics and functional validation pipeline.

Diagram 1: NLR Identification and Analysis Workflow.

Core Bioinformatics Identification Pipeline

The foundational step involves scanning proteomes or genomes to identify all potential NLR genes.

Hidden Markov Model (HMM) Searches: This primary method uses the conserved NB-ARC domain (PF00931) as a query to screen the entire proteome. Typical parameters use an E-value cutoff of 1e-5 to 1e-10 [13] [15] [14].
BLASTp Searches: Complementary homology-based searches are performed using reference NLR protein sequences from well-annotated species like Arabidopsis thaliana [15] [14].
Domain Validation and Classification: Candidate sequences are rigorously validated using domain databases (NCBI CDD, Pfam, InterProScan). Genes are then classified into CNL, TNL, RNL, or atypical categories based on the presence and completeness of N-terminal and LRR domains [13] [14].
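The classification step can be sketched as a small rule set over domain labels. This is a minimal illustration, assuming each candidate has already been reduced to a set of domain names; the labels ("NB-ARC", "CC", "TIR", "RPW8", "LRR"), the gene identifiers, and the rule that anything without exactly one N-terminal domain plus an LRR counts as atypical are assumptions made for the example, not the criteria of any cited study.

```python
def classify_nlr(domains):
    """Assign a coarse NLR class from a set of domain labels.

    `domains` is a set of strings such as {"CC", "NB-ARC", "LRR"}.
    The labels are illustrative and not tied to a specific tool's output.
    """
    if "NB-ARC" not in domains:
        return "not NLR"
    n_terminal = [d for d in ("TIR", "CC", "RPW8") if d in domains]
    # Require exactly one recognised N-terminal domain and an LRR region;
    # everything else (truncated or mixed architectures) is called atypical.
    if len(n_terminal) != 1 or "LRR" not in domains:
        return "atypical"
    return {"TIR": "TNL", "CC": "CNL", "RPW8": "RNL"}[n_terminal[0]]


# Hypothetical candidates with made-up identifiers.
candidates = {
    "gene01": {"CC", "NB-ARC", "LRR"},
    "gene02": {"TIR", "NB-ARC", "LRR"},
    "gene03": {"RPW8", "NB-ARC", "LRR"},
    "gene04": {"NB-ARC"},
    "gene05": {"LRR"},
}
classes = {gene: classify_nlr(doms) for gene, doms in candidates.items()}
```

Published pipelines usually split the atypical group further (for example into NBS-only, CC-NBS, or TIR-NBS classes), so the single "atypical" bucket here is a simplification.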

Evolutionary and Expression Analysis

Following identification, NLRs are characterized to understand their evolution and potential function.

Phylogenetic and Gene Structure Analysis: Multiple sequence alignment of NB-ARC domains or full-length proteins is used to construct phylogenetic trees (e.g., via Maximum Likelihood in MEGA or IQ-TREE) [13] [14]. Gene structure (exon-intron) and conserved motifs are analyzed using tools like MEME and GSDS [15].
Duplication Analysis: Tools like MCScanX are used to identify tandem and segmental duplication events, key to understanding family expansion [14].
Cis-Regulatory Element Analysis: Promoter regions (e.g., 2 kb upstream) are analyzed with PlantCARE to identify defense-related elements like salicylic acid (SA) and jasmonic acid (JA) response motifs [15] [14].
Expression Profiling: RNA-seq data from pathogen-infected and healthy tissues identifies differentially expressed NLRs. Protein-protein interaction networks can be predicted using tools like STRING to pinpoint hub genes [14].
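For the cis-regulatory step, the promoter window has to be taken upstream of the gene in a strand-aware way before it is submitted to a motif database. The sketch below shows one way to compute such a window; it assumes 1-based inclusive coordinates and a hypothetical helper name, and it only returns coordinates (sequence extraction and reverse-complementing for minus-strand genes are left out).

```python
def promoter_window(start, end, strand, chrom_length, upstream=2000):
    """Return 1-based inclusive (start, end) of the upstream promoter region.

    `start` and `end` are 1-based gene coordinates with start <= end.
    Returns None when the gene sits at the chromosome edge.
    """
    if strand == "+":
        # Upstream of a plus-strand gene lies at smaller coordinates.
        p_start = max(1, start - upstream)
        p_end = start - 1
    elif strand == "-":
        # Upstream of a minus-strand gene lies at larger coordinates.
        p_start = end + 1
        p_end = min(chrom_length, end + upstream)
    else:
        raise ValueError("strand must be '+' or '-'")
    if p_start > p_end:
        return None
    return p_start, p_end


# Hypothetical gene at 5,001-8,000 on a 100 kb chromosome.
plus_window = promoter_window(5001, 8000, "+", 100_000)
minus_window = promoter_window(5001, 8000, "-", 100_000)
```

Windows are clipped at chromosome ends, so genes near the start or end of a scaffold yield promoters shorter than 2 kb.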

Functional Validation

Candidate NLR genes require experimental validation to confirm their role in immunity.

Virus-Induced Gene Silencing (VIGS): A powerful reverse genetics tool to knock down candidate gene expression in planta and assess the impact on disease resistance. For example, silencing GaNBS in resistant cotton demonstrated its role in defense against cotton leaf curl disease [3].
Heterologous Expression and Assays: Autoactive gain-of-function mutations (e.g., in a Medicago truncatula CNGC15 channel) can be used to validate the role of NLR-related signaling components and even transfer enhanced symbiotic potential to crops like wheat [18].

Table 3: Essential Research Reagents and Resources for NLR Studies

Reagent/Resource	Function/Application	Example Use Case
HMM Profile PF00931	Identifies the conserved NB-ARC domain in candidate NLRs.	Initial genome-wide screening in Salvia miltiorrhiza and pepper [13] [14].
Reference NLR Sequences	Serves as a query for BLASTp searches and phylogenetic analysis.	Arabidopsis NLRs from TAIR used to identify homologs in other species [14].
PlantCARE Database	Identifies hormone and stress-related cis-elements in promoter regions.	Revealed abundance of SA/JA motifs in pepper NLR promoters [14].
OrthoFinder Software	Clusters genes into orthogroups to infer evolutionary relationships.	Used to identify core and species-specific NLR orthogroups across 34 plant species [3].
VIGS Vectors	Enables transient knock-down of gene expression for functional validation.	Validated the role of GaNBS (OG2) in cotton virus resistance [3].

Research Implications and Future Directions

The comparative analysis of NLR diversity provides a roadmap for engineering disease resistance in crops. Understanding the evolutionary paths of different plant lineages helps identify key NLRs preserved over millions of years, which may represent core components of the plant immune system. The discovery that wild relatives of crops like asparagus harbor larger and more responsive NLR repertoires highlights their value as reservoirs for resistance gene mining [15]. Furthermore, the successful transfer of a gain-of-function mutation from Medicago to wheat, enhancing symbiosis with beneficial fungi, showcases the potential of leveraging NLR pathways for sustainable agriculture beyond pathogen resistance [18].

Future research will be fueled by the expanding genomic resources, such as the 123 newly sequenced bryophyte genomes [16] and the Marchantia polymorpha pangenome [19], enabling deeper evolutionary insights. Combining pangenome analyses with advanced genome editing techniques will allow scientists to not only understand the natural diversity of NLRs but also to synthesize new resistance genes, accelerating the development of durable disease-resistant crops.

The plant immune system relies heavily on nucleotide-binding site leucine-rich repeat (NBS-LRR) receptors, which play a crucial role in effector-triggered immunity [20]. Among these, Toll/Interleukin-1 receptor-NBS-LRR (TNL) proteins constitute a major subclass that functions as pathogen sensors [21]. However, comparative genomic analyses have revealed a striking evolutionary divergence: TNL genes are consistently absent in monocots, including grasses and orchids, while remaining prevalent in dicot species [21] [20] [22]. This fundamental difference in the immune receptor repertoire represents a significant evolutionary split between the two major angiosperm lineages.

This guide provides a comparative analysis of NBS-LRR genes between monocot and dicot species, focusing on the phylogenetic distribution, structural characteristics, and evolutionary mechanisms underlying TNL gene loss. We present consolidated genomic data and experimental methodologies to facilitate research in plant immunity and support efforts in disease resistance breeding.

Comparative Analysis of NBS-LRR Gene Distribution

Table 1: NBS-LRR Gene Distribution in Monocot and Dicot Species

Species	Family/Type	Total NBS-LRR	TNL	CNL	RNL	Genome Size	Reference
Oryza sativa (rice)	Poaceae (monocot)	498	0	495	3	~430 Mb	[21]
Zea mays (maize)	Poaceae (monocot)	~140	0	~138	~2	~2.3 Gb	[21]
Phalaenopsis equestris (orchid)	Orchidaceae (monocot)	52	0	51	1	~1.2 Gb	[21]
Dendrobium catenatum (orchid)	Orchidaceae (monocot)	115	0	113	2	~1.0 Gb	[21]
Gastrodia elata (orchid)	Orchidaceae (monocot)	5	0	4	1	~0.9 Gb	[21]
Arabidopsis thaliana	Brassicaceae (dicot)	~200	~90	~100	~10	~135 Mb	[20]
Nicotiana tabacum	Solanaceae (dicot)	603	9	224	Not specified	~3.5 Gb	[23]
Capsicum annuum (pepper)	Solanaceae (dicot)	252	4	248*	Not specified	~3.3 Gb	[24]
Ipomoea batatas (sweet potato)	Convolvulaceae (dicot)	889	Present	Present	Present	~1.6 Gb	[25]
Akebia trifoliata	Lardizabalaceae (dicot)	73	19	50	4	~682 Mb	[26]
Vernicia montana (tung tree)	Euphorbiaceae (dicot)	149	12	137*	Not specified	~1.2 Gb	[27]

* Includes other nTNL (non-TNL) genes beyond typical CNLs. "Present" indicates that specific counts were not provided in the source, but presence was confirmed.

Key Distribution Patterns

The genomic data reveal several fundamental patterns in NBS-LRR distribution:

Consistent TNL absence in monocots: No TNL genes have been identified in any sequenced monocot genome, including grasses (rice, maize) and orchids, indicating this loss occurred in the common ancestor of all monocots [21] [20].
Variable NBS-LRR counts: The total number of NBS-LRR genes varies substantially within both monocot and dicot lineages, with orchids exhibiting particularly low numbers (as few as 5 in Gastrodia elata) compared to rice (498 genes) [21].
RNL conservation with lineage-specific differences: RNL genes are maintained in both monocots and dicots, but all orchid RNL genes belong only to the ADR1 lineage, with complete absence of the NRG1 lineage [21].

Evolutionary Mechanisms of TNL Gene Loss

Genomic and Signaling Pathway Coevolution

Table 2: Evolutionary Patterns and Compensatory Mechanisms in Monocots

Evolutionary Aspect	Monocots	Dicots	Functional Significance
TNL presence	Consistently absent	Generally present	Fundamental immune receptor difference
RNL lineages	ADR1 only in orchids	Both ADR1 and NRG1	NRG1 loss may relate to TNL absence
Downstream signaling	EDS1/PAD4 absent in some lineages	EDS1/PAD4 generally present	Co-evolution with TNL loss [20]
Evolutionary pattern in orchids	"Early shrinking to recent expanding" or "consistently shrinking"	Various patterns including expansion	Contributes to low R gene numbers [21]
Synteny evidence	Non-TNLs in syntenic regions with extinct TNLs	TNLs in syntenic regions	Supports TNL extinction model [22]

Research indicates that the loss of TNL genes in monocots coincided with the loss of key downstream signaling components. Some monocot lineages in the Alismatales order, along with certain eudicots in Lentibulariaceae, have lost both TNL genes and the EDS1/PAD4 signaling pathway, suggesting coordinated evolution of immune components [20]. Recent synteny-informed classification of NLR genes into CNL_A, CNL_B, CNL_C, TNL, and RNL categories provides a model that explains TNL extinction in monocots and is supported by microsynteny evidence [22].

Structural and Functional Implications

The structural divergence in NBS-LRR genes between monocots and dicots extends beyond domain composition:

Conserved NBS motifs: Both monocots and dicots maintain conserved NBS domain motifs (P-loop, RNBS-A, kinase-2, RNBS-B, RNBS-C, and GLPL), though with subclass-specific variations [24].
Helper NLR relationships: The absence of TNLs in monocots coincides with the loss of the RNL NRG1 lineage, supporting the proposed functional association between TNLs and NRG1 proteins [21].
Chromosomal distribution patterns: NBS-LRR genes typically display clustered distribution on chromosomes in both monocots and dicots, with tandem duplications driving expansion in disease resistance gene clusters [24] [25].

Research Methodologies for NBS-LRR Gene Analysis

Standard Identification and Classification Pipeline

Diagram: NBS-LRR Identification Workflow.

Experimental Protocols for Functional Validation

Protocol 1: Genome-Wide Identification of NBS-LRR Genes

Data Acquisition: Download genome assemblies and annotated protein sequences from public databases (NCBI, Phytozome, Plaza) [3].
HMMER Search: Perform HMMER searches (v3.1b2+) using the NB-ARC domain model (PF00931) from the Pfam database with a stringent E-value cutoff of 1.1e-50 [3] [23].
Domain Validation: Confirm identified sequences using NCBI Conserved Domain Database (CDD) for TIR (PF01582), LRR (PF00560, PF07723, PF07725, PF12779, PF13306, PF13516, PF13855, PF14580), and Coiled-coil domains [23].
Classification: Categorize genes into structural classes (CNL, TNL, RNL, and variants) based on domain architecture [23] [26].
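The filtering part of Protocol 1 can be sketched as a small parser for HMMER's per-sequence tabular output. The example assumes the whitespace-delimited layout written by `hmmsearch --tblout`, in which the first column is the target name and the fifth is the full-sequence E-value; the function name and the three hit rows are synthetic and serve only to illustrate the cutoff.

```python
def parse_tblout(lines, max_evalue=1.1e-50):
    """Collect target IDs whose full-sequence E-value passes the cutoff.

    Assumes the column order of `hmmsearch --tblout`:
    target name, target accession, query name, query accession,
    full-sequence E-value, followed by further score columns.
    """
    hits = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            # Keep the best (lowest) E-value seen for each target.
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits


# Synthetic rows for illustration only.
example_tblout = """\
# target name  accession  query name  accession  E-value  score  bias
geneA  -  NB-ARC  PF00931  3.2e-80  270.1  0.1
geneB  -  NB-ARC  PF00931  4.0e-12  50.3  0.0
geneC  -  NB-ARC  PF00931  1.0e-51  180.0  0.2
""".splitlines()

strict_hits = parse_tblout(example_tblout)
relaxed_hits = parse_tblout(example_tblout, max_evalue=1e-5)
```

Comparing the strict and relaxed sets is a quick way to see how many divergent or truncated candidates a stringent cutoff excludes before domain validation.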

Protocol 2: Expression and Functional Analysis

Transcriptome Profiling: Analyze RNA-seq data from tissues under biotic/abiotic stresses, calculating FPKM values for expression quantification [3] [23].
Virus-Induced Gene Silencing (VIGS): For functional validation, clone candidate NBS-LRR genes into TRV-based vectors and infiltrate plants to assess disease resistance phenotypes [3] [27].
Differential Expression Analysis: Identify significantly differentially expressed NBS-LRR genes using tools like Cuffdiff with appropriate multiple testing correction [23].
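The FPKM values mentioned in Protocol 2 follow a simple normalisation: fragments per kilobase of transcript per million mapped fragments. The sketch below applies that standard definition to made-up numbers; it is not the internal calculation of Cufflinks, which additionally models transcript assembly and effective lengths.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # count / (length in kb) / (library size in millions)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical NLR transcript: 500 fragments, 2,000 bp long,
# in a library with 25 million mapped fragments.
example_fpkm = fpkm(500, 2000, 25_000_000)
```

Because FPKM depends on library size, values are comparable within a sample but should be interpreted with care across samples of very different composition.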

The Scientist's Toolkit: Essential Research Reagents

Table 3: Key Reagents and Resources for NBS-LRR Research

Reagent/Resource	Function	Example Sources/Tools	Application Context
HMMER Suite	Hidden Markov Model search for domain identification	http://hmmer.org/	Initial identification of NBS domains [23]
PFAM Database	Curated collection of protein domain families	http://pfam.xfam.org/	Domain architecture analysis [3]
NCBI CDD	Conserved Domain Database for domain verification	https://www.ncbi.nlm.nih.gov/cdd	Validation of TIR, LRR, and other domains [23]
OrthoFinder	Orthogroup inference and comparative genomics	https://github.com/davidemms/OrthoFinder	Evolutionary analysis across species [3]
MCScanX	Detection of collinear regions and duplication events	http://chibba.pgml.uga.edu/mcscan2/	Synteny and duplication analysis [23]
TRV VIGS Vectors	Virus-Induced Gene Silencing for functional validation	Available from plant molecular biology repositories	Loss-of-function studies [3] [27]
Plant DNA C-values Database	Genome size reference data	https://cvalues.science.kew.org/	Comparative genomics [28]

The absence of TNL genes in monocots, including grasses and orchids, represents a fundamental divergence in plant immune system architecture between the two major angiosperm lineages. This comparative analysis demonstrates that this gene loss is complemented by distinct evolutionary patterns in remaining NBS-LRR classes and their associated signaling components. The conserved methodologies for identifying and characterizing these genes across species provide researchers with standardized approaches for further investigation into plant immunity mechanisms. Understanding these lineage-specific differences in immune gene repertoire enhances our capacity for developing disease-resistant crops through both traditional breeding and biotechnological approaches.

The nucleotide-binding site-leucine-rich repeat (NBS-LRR) gene family represents one of the most critical lines of defense in plant immune systems, enabling plants to recognize diverse pathogens and initiate effector-triggered immunity. The expansion and contraction of this gene family across plant species have long intrigued evolutionary biologists. Two primary mechanisms—whole-genome duplication (WGD) and tandem duplication (TD)—have been identified as major drivers of NBS-LRR gene family evolution, yet they exhibit distinct patterns in their contributions to genomic architecture, functional specialization, and evolutionary trajectories. Understanding the differential roles of these duplication mechanisms is essential for deciphering plant adaptation strategies and harnessing resistance genes for crop improvement. This review synthesizes recent comparative genomic analyses to elucidate how WGD and TD have collectively shaped the NBS-LRR gene repertoire across the plant kingdom, with implications for disease resistance breeding and evolutionary biology.

Fundamental Distinctions Between WGD and TD

Whole-genome duplication (WGD) events involve the duplication of an organism's entire genome, creating massive genetic redundancy that persists as numerous syntenic paralogous regions. In contrast, tandem duplication (TD) occurs when a single gene or chromosomal segment is duplicated in a head-to-tail fashion, typically through unequal crossing over during meiosis, resulting in gene clusters localized to specific genomic regions [29] [30].

Empirical evidence from diverse plant species reveals systematic differences in the gene characteristics associated with each duplication mode. In Populus trichocarpa, WGD-derived genes are approximately 700 bp longer and are expressed in 20% more tissues than tandem duplicates [29]. Furthermore, certain functional categories are differentially enriched: disease resistance genes and receptor-like kinases commonly occur in tandem arrays but are significantly under-retained following WGD events. Conversely, WGD duplicate pairs are enriched for members of signal transduction cascades and transcription factors [29].

The evolutionary forces acting on these duplication types also differ substantially. WGD genes typically evolve under stronger purifying selection, preserving ancestral functions, while TD genes often experience more rapid functional divergence [30]. This distinction aligns with the gene balance hypothesis, which predicts that dosage-sensitive genes encoding proteins with numerous interaction partners (such as transcription factors) are preferentially retained following WGD to maintain stoichiometric balance, while dosage-insensitive genes can freely expand through TD without disrupting cellular equilibrium [29].

Table 1: Comparative Features of WGD and TD Genes

Feature	Whole-Genome Duplication (WGD)	Tandem Duplication (TD)
Genomic Organization	Syntenic blocks distributed across genome	Clustered arrays in localized regions
Gene Length	Significantly longer (e.g., +700 bp in Populus) [29]	Significantly shorter [29]
Expression Breadth	Expressed in ~20% more tissues [29]	More tissue-specific expression [29]
Typical Gene Functions	Signal transduction, transcription factors [29]	Disease resistance, receptor-like kinases [29]
Evolutionary Pressure	Strong purifying selection [30]	Rapid functional divergence [30]
Retention Bias	Dosage-sensitive genes with many protein-protein interactions [29]	Dosage-insensitive genes [29]

Evolutionary Patterns of NBS-LRR Genes Across Plant Lineages

Dynamic Evolutionary Trajectories in Rosaceae

Comparative genomic analyses of 12 Rosaceae species have revealed remarkable diversity in NBS-LRR evolutionary patterns, driven by species-specific combinations of WGD and TD events. Researchers identified 2,188 NBS-LRR genes across these species, with counts varying markedly among lineages [31]. Phylogenetic reconstruction traced these back to 102 ancestral genes (7 RNLs, 26 TNLs, and 69 CNLs) that subsequently underwent independent duplication and loss events during Rosaceae divergence [31].

The evolutionary patterns observed include:

"First expansion and then contraction" in Rubus occidentalis, Potentilla micrantha, Fragaria iinumae, and Gillenia trifoliata
"Continuous expansion" in Rosa chinensis
"Expansion followed by contraction, then further expansion" in F. vesca
"Early sharp expansion to abrupt shrinking" in three Prunus species and three Maleae species [31]

Notably, species-specific duplications have been the primary driver of recent NBS-LRR expansion in Rosaceae. A study of five Rosaceae species found that 61.81% of strawberry, 66.04% of apple, 48.61% of pear, 37.01% of peach, and 40.05% of mei NBS-LRR genes derived from species-specific duplication events [32]. The four woody perennial species (apple, pear, peach, and mei) showed higher proportions of multi-copy NBS-LRR genes than the herbaceous strawberry, suggesting perennial life history may influence duplication retention [32].

Genomic Convergence in Rooted Plants

Recent evidence suggests that tandem duplication of NBS-LRR genes represents a form of genomic convergence across different lineages of rooted plants adapting to soil microbial pressures. A comprehensive study of 205 Archaeplastida genomes revealed that TD-derived genes are notably prevalent in trees with developed root systems embedded in soil and are enriched for enzymatic catalysis and biotic stress responses [33].

Correlation analyses identified environmental factors related to soil microbes as significantly associated with TD frequency. Conversely, plants that transitioned to aquatic, parasitic, halophytic, or carnivorous lifestyles, thereby reducing their interaction with soil microbes, consistently exhibited decreased TD frequency [33]. This pattern was further corroborated in mangroves that independently adapted to hypersaline intertidal soils with diminished microbial activity [33]. These findings position TD-driven genomic convergence as a widespread adaptation to soil microbial pressures among terrestrial rooted plants.

Methodological Framework for Analyzing Duplication Mechanisms

Identification and Classification of NBS-LRR Genes

The standard workflow for NBS-LRR gene identification involves a multi-step process combining homology searches and domain validation [23] [3] [31]:

Initial Screening: Perform BLAST and HMMER searches against the target proteome using the NB-ARC domain (PF00931) as a query, with threshold expectation values typically set at 1.0 for BLAST and default parameters for HMMER [31].
Domain Validation: Validate candidate genes through Pfam and NCBI Conserved Domain Database (CDD) analysis to confirm the presence of characteristic N-terminal domains (CC/TIR/RPW8) and NBS domains using an E-value cutoff of 10⁻⁴ [31].
Classification: Categorize validated NBS-LRR genes into subclasses (TNL, CNL, RNL) based on their N-terminal domain composition [31].
Duplication Mode Assignment: Identify duplication modes using MCScanX with all-vs-all BLASTP results (E-value < 1e-5) and genome annotation files as input [34]. The classifier follows a priority order: WGD/segmental > tandem > proximal > dispersed [34].
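The priority order in the last step can be illustrated with a short sketch. It assumes that the evidence for each gene has already been collected as a set of mode labels; how that evidence is derived (collinear blocks, gene rank distances) is the job of MCScanX and is not reproduced here, and the "singleton" fallback and gene names are assumptions made for the example.

```python
# Priority order as stated in the text: highest priority first.
PRIORITY = ("WGD/segmental", "tandem", "proximal", "dispersed")


def assign_mode(evidence):
    """Pick the highest-priority duplication mode supported for a gene.

    `evidence` is a set of mode labels for which some support exists.
    Genes without any duplicate evidence are labelled "singleton".
    """
    for mode in PRIORITY:
        if mode in evidence:
            return mode
    return "singleton"


# Hypothetical evidence sets for four NBS-LRR genes.
evidence_by_gene = {
    "nlr1": {"tandem", "WGD/segmental"},
    "nlr2": {"proximal", "dispersed"},
    "nlr3": {"dispersed"},
    "nlr4": set(),
}
modes = {gene: assign_mode(ev) for gene, ev in evidence_by_gene.items()}
```

A consequence of this ordering is that a gene sitting in a tandem array inside a collinear block is counted as WGD/segmental, which is worth keeping in mind when comparing tandem-duplicate percentages between studies.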

Figure 1: Experimental workflow for identifying and analyzing NBS-LRR genes and their duplication mechanisms.

Evolutionary Analysis and Expression Profiling

Following identification, researchers typically employ several bioinformatic approaches to understand the evolutionary history and functional implications of NBS-LRR duplicates:

Evolutionary Analysis:

Calculate non-synonymous (Ka) and synonymous (Ks) substitution rates using KaKs_Calculator 2.0 with appropriate evolutionary models (e.g., Nei-Gojobori) [23]
Construct phylogenetic trees using maximum likelihood methods with bootstrap validation [31]
Reconcile gene trees with species trees to infer duplication and loss events [31]
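Once Ka and Ks have been estimated for a duplicate pair, interpretation usually rests on their ratio. The sketch below is a minimal illustration of that reading (ratio below 1 suggesting purifying selection, near 1 neutrality, above 1 positive selection); it does not estimate Ka or Ks itself, and the tolerance band around 1 is an arbitrary assumption for the example rather than a published threshold.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero or negative (ratio undefined)."""
    if ks <= 0:
        return None
    return ka / ks


def selection_regime(ka, ks, tolerance=0.05):
    """Give a coarse reading of the Ka/Ks ratio for one gene pair."""
    omega = ka_ks_ratio(ka, ks)
    if omega is None:
        return "undefined"
    if omega < 1 - tolerance:
        return "purifying"
    if omega > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical (Ka, Ks) estimates for three duplicate pairs.
pairs = {"pairA": (0.02, 0.10), "pairB": (0.30, 0.10), "pairC": (0.10, 0.0)}
regimes = {name: selection_regime(ka, ks) for name, (ka, ks) in pairs.items()}
```

Gene-wide ratios average over all sites, so a pair under overall purifying selection can still contain individual codons, for instance in the LRR region, that evolve under positive selection.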

Expression Analysis:

Process RNA-seq data through standardized pipelines (e.g., Hisat2 for alignment, Cufflinks/Cuffdiff for quantification and differential expression) [23]
Analyze expression patterns across tissues and stress conditions
Validate multi-stress responsive genes using machine learning approaches (e.g., Random Forest) [35]

Experimental Evidence from Key Studies

Nicotiana Species Analysis

A comprehensive analysis of three Nicotiana genomes (N. tabacum, N. sylvestris, and N. tomentosiformis) identified 1,226 NBS genes. The allotetraploid N. tabacum carries 603 of them, close to the combined total of its two diploid parental species (344 + 279 = 623) [23]. Notably, 76.62% of NBS members in N. tabacum could be traced back to their parental genomes, demonstrating the impact of WGD on NBS family expansion [23].

Table 2: NBS Gene Distribution in Three Nicotiana Species

Species	Ploidy	Total NBS Genes	NBS	TIR-NBS	CC-NBS	TIR-NBS-LRR	CC-NBS-LRR
N. tomentosiformis	Diploid	279	127	7	65	33	47
N. sylvestris	Diploid	344	172	5	82	37	48
N. tabacum	Allotetraploid	603	306	9	150	64	74

Domain architecture analysis revealed that approximately 45.5% of Nicotiana NBS genes contained only the NBS domain, followed by CC-NBS (23.3%), while TIR-NBS members were the least common [23]. This distribution reflects both the ancestral genetic repertoire and the lineage-specific expansions through different duplication mechanisms.

Aurantioideae Subfamily Research

A systematic study of 26 Aurantioideae species revealed tandem duplication as the predominant duplication type, confirming both a shared ancient WGD event (γWGD) and extensive recent TD activity [30]. Ka/Ks analysis indicated that all duplication types are under purifying selection pressure, with TD and proximal duplication undergoing the most rapid functional divergence [30].

Gene expression differentiation analysis between outer and inner pericarps of Citrus maxima 'Huazhouyou' found that the proportion of gene expression differentiation in the exocarp was generally higher, suggesting tissue-specific functional roles for duplicated genes in the peel [30]. This finding highlights how duplication mechanisms can contribute to specialized adaptations in particular plant tissues.

Research Reagent Solutions for NBS-LRR Studies

Table 3: Essential Research Tools for NBS-LRR Gene Analysis

Reagent/Resource	Primary Function	Application Examples
HMMER v3.1b2	Hidden Markov Model searches	Identification of NB-ARC domains (PF00931) [23]
MCScanX	Detection of gene duplication modes	Identifying WGD, tandem, proximal duplicates [23] [34]
KaKs_Calculator 2.0	Calculation of Ka/Ks ratios	Measuring selection pressure on duplicated genes [23]
Pfam/NCBI CDD	Protein domain identification	Validating TIR, CC, LRR, NBS domains [23] [31]
OrthoFinder	Orthogroup inference	Determining evolutionary relationships across species [3]
Cufflinks/Cuffdiff	RNA-seq analysis	Differential expression of NBS-LRR genes [23]
MEME Suite	Motif discovery	Identifying conserved protein motifs [31]

Whole-genome and tandem duplication mechanisms have distinct yet complementary roles in shaping the evolution and expansion of NBS-LRR gene families across plant species. WGD events provide the evolutionary substrate for preserving dosage-sensitive regulatory genes with broad expression patterns, while TD enables rapid, localized expansion of pathogen recognition genes tailored to specific environmental pressures. The interplay between these mechanisms has generated the remarkable diversity of NBS-LRR repertoires observed in modern plants, with lineage-specific duplications driving adaptations to distinct pathogenic challenges. Understanding these evolutionary dynamics provides crucial insights for harnessing NBS-LRR genes in crop improvement programs and predicting plant responses to emerging pathogens in changing environments. Future research integrating pan-genomic analyses with functional studies will further elucidate how duplication mechanisms collectively contribute to plant immune system evolution.

Plant immunity relies significantly on a diverse arsenal of disease resistance (R) genes, with the nucleotide-binding site-leucine-rich repeat (NBS-LRR) family representing the largest and most critical class. These genes encode proteins that detect pathogenic invaders and initiate robust defense responses [26] [27]. The central NBS domain facilitates nucleotide binding (ATP/GTP), providing energy for downstream signaling, while the LRR domain is involved in pathogen recognition and protein-protein interactions [27]. Based on their N-terminal domains, NBS-LRR genes are classified into three principal subfamilies: TNLs (TIR-NBS-LRR), CNLs (CC-NBS-LRR), and RNLs (RPW8-NBS-LRR) [15] [36]. The composition and size of this gene family vary dramatically across plant species, ranging from dozens to thousands of members, reflecting complex evolutionary histories shaped by pathogen pressures [15] [27].

Orthogroup analysis has emerged as a fundamental comparative genomics approach for classifying gene families across multiple species. An orthogroup comprises all genes descended from a single gene in the last common ancestor of the species being compared, including both orthologs (genes separated by speciation events) and paralogs (genes separated by duplication events) [37]. This methodology provides a powerful framework for identifying core sets of conserved resistance genes maintained across evolutionary lineages, as well as species-specific innovations that may underlie unique resistance capabilities. For plant resistance gene research, this approach helps researchers identify key candidates from the vast NBS-LRR repertoire for functional characterization and breeding applications [3].
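The distinction between conserved and lineage-restricted orthogroups can be made concrete with a short sketch. It assumes orthogroups are available as a mapping from orthogroup ID to per-species gene lists; the labels "core", "shared", and "species-specific", together with the species and gene names, are illustrative choices rather than terminology fixed by any of the cited tools.

```python
def classify_orthogroups(orthogroups, all_species):
    """Label each orthogroup by which species contribute genes to it."""
    species_set = set(all_species)
    labels = {}
    for og_id, members in orthogroups.items():
        present = {sp for sp, genes in members.items() if genes}
        if not present:
            labels[og_id] = "empty"
        elif present == species_set:
            labels[og_id] = "core"              # found in every species
        elif len(present) == 1:
            labels[og_id] = "species-specific"  # found in one species only
        else:
            labels[og_id] = "shared"            # found in a subset
    return labels


# Hypothetical orthogroups across three species.
species = ["rice", "pepper", "arabidopsis"]
orthogroups = {
    "OG0": {"rice": ["r1"], "pepper": ["p1", "p2"], "arabidopsis": ["a1"]},
    "OG1": {"pepper": ["p3", "p4"]},
    "OG2": {"rice": ["r2"], "arabidopsis": ["a2"]},
}
og_labels = classify_orthogroups(orthogroups, species)
```

With many species, a strict "present in all" definition of core becomes sensitive to annotation gaps, so a relaxed threshold (for example presence in most species) is sometimes preferred.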

Methodological Framework for Orthogroup Analysis

Core Workflow and Algorithm Selection

Orthogroup inference follows a systematic workflow beginning with genome assembly and annotation, followed by sequence similarity searches, clustering, and phylogenetic validation. The standard methodology involves identifying all genes containing the conserved NB-ARC domain (Pfam: PF00931) using tools like HMMER and BLASTP, followed by domain architecture analysis to classify genes into subfamilies (TNL, CNL, RNL) [26] [38]. The core orthology inference then clusters these sequences into orthogroups using specialized algorithms.

Multiple orthology inference algorithms are available, each with distinct strengths. OrthoFinder implements a phylogenetically informed tree-based approach, inferring gene trees for all orthogroups and analyzing them to identify orthologs, gene duplication events, and even the rooted species tree [39]. SonicParanoid offers a graph-based inference method modified from the InParanoid algorithm, providing rapid analysis without incorporating phylogenetic information [37]. Broccoli also uses a tree-based approach but employs network analyses to determine orthology relationships, while OrthNet incorporates synteny information to enhance orthology predictions [37]. A comparative study on Brassicaceae genomes revealed that while these algorithms generally produce similar results, OrthoFinder consistently demonstrates high ortholog inference accuracy on benchmark tests [37]. The table below compares the key algorithms used in orthogroup analysis.

Table 1: Comparison of Orthology Inference Algorithms for NBS Gene Analysis

Algorithm	Underlying Method	Key Features	Strengths for NBS Analysis	Considerations
OrthoFinder	Phylogenetic tree-based	Infers gene trees, rooted species tree, gene duplication events; Uses DIAMOND for fast sequence searches	High accuracy on benchmarks; Comprehensive phylogenetic analysis	Computationally intensive for very large datasets
SonicParanoid	Graph-based (MCL clustering)	Modified from InParanoid; Fast execution speed	Useful for initial orthology predictions	Does not incorporate phylogenetic information
Broccoli	Tree-based with network analysis	Uses network analyses to determine orthology relationships	Considers complex evolutionary relationships	Relatively new method with growing adoption
OrthNet	Synteny-aware MCL clustering	Incorporates gene colinearity information	Provides detailed colinearity information	Results can be outliers compared to other methods

Figure 1: Orthogroup Analysis Workflow for Plant NBS Genes - This diagram illustrates the standard pipeline for identifying and classifying resistance genes across multiple plant species, from initial identification through functional validation.

Experimental Protocols for Orthogroup Analysis

A typical orthogroup analysis begins with comprehensive data collection from publicly available genome databases such as NCBI, Phytozome, and Plaza [3]. For the identification of NBS-domain-containing genes, researchers commonly use the PfamScan.pl HMM search script with the NB-ARC domain (PF00931) as query, applying a stringent E-value cutoff (e.g., 1.1e-50) to ensure specificity [3]. Additional associated domains are identified through architecture analysis, classifying genes with similar domain patterns into the same classes [3].
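The E-value filtering step can be sketched in a few lines. This is a minimal illustration, not the cited study's script: it assumes the search was run with HMMER's `hmmsearch` and saved with `--tblout` (rather than via PfamScan.pl), and the function name, gene identifiers, and scores are hypothetical.

```python
from typing import Iterable, Set


def nbs_hits_from_tblout(lines: Iterable[str], max_evalue: float = 1.1e-50) -> Set[str]:
    """Return target names whose full-sequence E-value is <= max_evalue.

    Assumes HMMER3 `--tblout` layout, where whitespace-separated column 0 is
    the target name and column 4 is the full-sequence E-value.
    """
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits.add(target)
    return hits


# Hypothetical records for illustration only (not real search results).
example = [
    "# target name  accession  query name  accession  E-value  score  bias ...",
    "geneA  -  NB-ARC  PF00931.25  3.2e-80  270.1  0.0  4.1e-80  269.8  0.0  1.0  1  0  0  1  1  1  1  hypothetical protein",
    "geneB  -  NB-ARC  PF00931.25  2.0e-12  45.3  0.1  2.5e-12  45.0  0.1  1.0  1  0  0  1  1  1  1  hypothetical protein",
]

print(sorted(nbs_hits_from_tblout(example)))        # stringent cutoff
print(sorted(nbs_hits_from_tblout(example, 1e-3)))  # permissive cutoff
```

With the stringent cutoff only the strong hit is retained, while a permissive cutoff keeps both, which is why the chosen threshold should be reported alongside gene counts.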

For the orthology inference itself, a standard protocol utilizes OrthoFinder v2.5.1 (or newer), which employs DIAMOND for rapid sequence similarity searches and the MCL clustering algorithm for grouping sequences into orthogroups [3]. The orthologs and orthogroups are further refined with DendroBLAST [3]. Multiple sequence alignment is performed using MAFFT 7.0, followed by phylogenetic tree construction with the approximate maximum-likelihood algorithm implemented in FastTreeMP, using resampling-based support values (e.g., 1000 replicates) to assess node support [3].

Validation of orthogroup predictions often involves additional analyses including:

Chromosomal distribution mapping to identify gene clusters
Synteny analysis using tools like MCScanX to detect conserved genomic blocks
Gene structure analysis examining exon-intron organizations
Motif analysis using MEME suite to identify conserved protein motifs
Expression profiling using RNA-seq data from various tissues and stress conditions [26] [15]

Table 2: Essential Research Reagents and Tools for Orthogroup Analysis

Category	Tool/Resource	Specific Function	Application in NBS Analysis
Sequence Search	HMMER	Hidden Markov Model-based sequence search	Identifying NB-ARC domains with Pfam models
	DIAMOND	Accelerated BLAST-compatible sequence search	Fast all-vs-all sequence comparisons for large datasets
Orthology Inference	OrthoFinder	Phylogenetic orthogroup inference	Primary tool for orthogroup identification from protein sequences
	SonicParanoid	Graph-based orthology inference	Rapid initial orthogroup predictions
Domain Analysis	Pfam Database	Protein family and domain database	Validating NBS, TIR, CC, LRR, RPW8 domains
	SMART	Simple Modular Architecture Research Tool	Additional domain architecture verification
Phylogenetic Analysis	MAFFT	Multiple sequence alignment	Creating alignments for orthogroup sequences
	FastTreeMP	Maximum likelihood tree inference	Constructing phylogenetic trees for evolutionary analysis
Downstream Analysis	TBtools	Integrative toolkit for biological data	Visualizing chromosomal distributions, gene structures, etc.
	MEME Suite	Motif discovery and analysis	Identifying conserved motifs in NBS domains

Key Findings from Comparative Studies of Plant NBS Genes

Landscape of NBS Gene Repertoires Across Species

Comprehensive comparative analyses have revealed remarkable diversity in NBS gene repertoires across plant species. A landmark study examining 34 species from mosses to monocots and dicots identified 12,820 NBS-domain-containing genes, classifying them into 168 distinct classes based on domain architecture patterns [3]. These encompassed both classical structures (NBS, NBS-LRR, TIR-NBS, TIR-NBS-LRR) and species-specific structural patterns (TIR-NBS-TIR-Cupin1-Cupin1, TIR-NBS-Prenyltransf, Sugar_tr-NBS) [3]. The research identified 603 orthogroups (OGs), with some representing core orthogroups (OG0, OG1, OG2, etc.) conserved across multiple species, and others representing unique orthogroups (OG80, OG82, etc.) highly specific to particular species [3].
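The distinction between core and unique orthogroups can be made concrete with a short sketch. It assumes an OrthoFinder-style `Orthogroups.tsv` layout (first column the orthogroup ID, one column per species, genes separated by commas). The thresholds used here ("core" means present in every species, "unique" means present in exactly one) are illustrative assumptions, since the cited study does not state a numeric definition, and all identifiers are hypothetical.

```python
import csv
import io
from typing import Dict, List, Tuple


def species_per_orthogroup(tsv_text: str) -> Tuple[List[str], Dict[str, List[str]]]:
    """Parse an Orthogroups.tsv-style table into {orthogroup: [species present]}."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    species = next(reader)[1:]  # header row: "Orthogroup", then species names
    presence = {}
    for row in reader:
        if not row:
            continue
        # A species is "present" if its cell lists at least one gene.
        presence[row[0]] = [sp for sp, cell in zip(species, row[1:]) if cell.strip()]
    return species, presence


def classify_orthogroups(presence: Dict[str, List[str]], n_species: int,
                         core_fraction: float = 1.0) -> Dict[str, str]:
    """Label orthogroups as core, unique, or shared (assumed definitions)."""
    labels = {}
    for og, present in presence.items():
        if len(present) >= core_fraction * n_species:
            labels[og] = "core"
        elif len(present) == 1:
            labels[og] = "unique"
        else:
            labels[og] = "shared"
    return labels


# Hypothetical three-species table for illustration.
example = "\n".join([
    "Orthogroup\tsp_A\tsp_B\tsp_C",
    "OG0000000\ta1, a2\tb1\tc1",
    "OG0000001\ta3\t\tc2",
    "OG0000002\t\tb2, b3\t",
])

species, presence = species_per_orthogroup(example)
labels = classify_orthogroups(presence, len(species))
print(labels)
```

Lowering `core_fraction` (for example to 0.8) relaxes the definition so that an orthogroup missing from a few incompletely annotated genomes can still count as core.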

The size of NBS gene families exhibits tremendous variation across species. For example, the genome of Akebia trifoliata contains only 73 NBS genes [26], while garden asparagus (Asparagus officinalis) has 27 NLR genes, in contrast to its wild relatives A. setaceus (63 NLRs) and A. kiusianus (47 NLRs) [15] [36]. This contraction in the domesticated species suggests potential loss of resistance genes during artificial selection. Eggplant (Solanum melongena) possesses 269 SmNBS genes [38], while Vernicia fordii and Vernicia montana have 90 and 149 NBS-LRR genes, respectively [27]. These differences reflect varying evolutionary paths and selection pressures across plant lineages.

Evolutionary Dynamics Driving NBS Gene Diversity

The expansion and diversification of NBS gene families are primarily driven by various duplication mechanisms. Tandem duplications represent a major force for recent NBS gene increases, creating clusters of similar genes on chromosomes that facilitate rapid evolution of new specificities [38]. Dispersed duplications also contribute significantly to NBS expansion, as evidenced in Akebia trifoliata where tandem and dispersed duplications produced 33 and 29 genes, respectively [26]. Whole-genome duplications (WGD) provide another important mechanism, particularly in polyploid species, though gene families evolving through WGDs seldom undergo small-scale duplication events [3].
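As a rough illustration of how clustered NBS genes might be flagged from annotation coordinates, the sketch below groups neighbouring genes on the same chromosome whose start positions lie within a fixed distance. The 200 kb window, the function name, and the gene records are assumptions made for illustration. Physical proximity alone does not prove tandem duplication, which also requires sequence similarity between neighbours (as assessed by tools such as MCScanX).

```python
from collections import defaultdict
from typing import List, Tuple

Gene = Tuple[str, str, int]  # (gene_id, chromosome, start position in bp)


def find_clusters(genes: List[Gene], max_gap: int = 200_000) -> List[List[str]]:
    """Group genes on the same chromosome whose consecutive starts are <= max_gap apart."""
    by_chrom = defaultdict(list)
    for gene_id, chrom, start in genes:
        by_chrom[chrom].append((start, gene_id))

    clusters = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom])  # sort by start coordinate
        current = [ordered[0]]
        for prev, nxt in zip(ordered, ordered[1:]):
            if nxt[0] - prev[0] <= max_gap:
                current.append(nxt)
            else:
                if len(current) > 1:  # keep only groups of two or more genes
                    clusters.append([g for _, g in current])
                current = [nxt]
        if len(current) > 1:
            clusters.append([g for _, g in current])
    return clusters


# Hypothetical coordinates for illustration.
genes = [
    ("g1", "chr10", 100_000), ("g2", "chr10", 150_000), ("g3", "chr10", 900_000),
    ("g4", "chr11", 50_000), ("g5", "chr11", 240_000), ("g6", "chr11", 260_000),
]
print(find_clusters(genes))
```

Here `g3` is left out because it sits 750 kb from its nearest neighbour, while the three chr11 genes chain together into a single cluster.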

The evolutionary analysis of orthogroups reveals distinct patterns of conservation and divergence. Expression profiling of specific orthogroups in cotton demonstrated that OG2, OG6, and OG15 showed upregulated expression in various tissues under biotic and abiotic stresses in both susceptible and tolerant plants facing cotton leaf curl disease [3]. Furthermore, genetic variation analysis between susceptible (Coker 312) and tolerant (Mac7) Gossypium hirsutum accessions identified substantially more unique variants in NBS genes of the tolerant genotype (6583 variants) compared to the susceptible one (5173 variants), highlighting the potential functional significance of these variations [3].

Figure 2: Evolutionary Relationships Forming Core and Species-Specific Orthogroups - This diagram illustrates how speciation and duplication events create different types of orthogroups, with core orthogroups maintained across species and species-specific orthogroups arising through duplication and diversification.

Case Studies: Orthogroup Analysis in Crop Species

Disease Resistance in Cotton and Eggplant

Orthogroup analysis has proven particularly valuable for identifying candidate resistance genes in economically important crops. In cotton, researchers investigated resistance to cotton leaf curl disease (CLCuD), caused by begomoviruses transmitted by whiteflies [3]. The study compared tolerant (Mac7) and susceptible (Coker 312) Gossypium hirsutum accessions, identifying not only differential expression of specific orthogroups (OG2, OG6, OG15) but also sequence variations potentially underlying resistance differences [3]. Protein-ligand and protein-protein interaction analyses demonstrated strong interactions between putative NBS proteins and ADP/ATP as well as core proteins of the cotton leaf curl disease virus [3]. Most significantly, functional validation through virus-induced gene silencing (VIGS) of GaNBS (OG2) in resistant cotton demonstrated its putative role in virus titering, confirming the practical utility of orthogroup-guided candidate gene identification [3].

In eggplant, genome-wide analysis identified 269 SmNBS genes, classified into 231 CNLs, 36 TNLs, and 2 RNLs [38]. Chromosomal mapping revealed an uneven distribution with clustering on certain chromosomes, particularly chromosomes 10, 11, and 12 [38]. Evolutionary analysis indicated that tandem duplication events primarily contributed to SmNBS expansion [38]. Expression analysis under Ralstonia solanacearum stress (bacterial wilt) identified nine SmNBS genes with differential expression patterns, with EGP05874.1 emerging as a promising candidate for involvement in resistance responses [38]. This systematic orthogroup analysis provides a foundation for marker-assisted breeding for bacterial wilt resistance in eggplant.

Functional Validation of Orthogroup Predictions

The ultimate test of orthogroup analysis lies in functional validation of predicted resistance genes. A compelling example comes from comparative analysis of two tung tree species: Fusarium wilt-susceptible Vernicia fordii and resistant Vernicia montana [27]. The study identified 90 NBS-LRR genes in V. fordii and 149 in V. montana, with notable differences in domain architectures—V. fordii completely lacked TIR domains, while V. montana possessed 12 VmNBS-LRRs with TIR domains [27]. Orthologous gene pair analysis identified Vf11G0978-Vm019719 as showing distinct expression patterns: downregulation in susceptible V. fordii but upregulation in resistant V. montana following Fusarium infection [27].

Functional investigation revealed that Vm019719 in V. montana, activated by the transcription factor VmWRKY64, conferred resistance to Fusarium wilt [27]. In the susceptible V. fordii, the orthologous counterpart Vf11G0978 exhibited an ineffective defense response due to a deletion in the promoter's W-box element, preventing proper WRKY regulation [27]. This case demonstrates how orthogroup analysis can pinpoint critical genetic differences underlying disease susceptibility and resistance, providing specific targets for breeding programs.

Table 3: NBS Gene Family Characteristics Across Selected Plant Species

Plant Species	Total NBS Genes	CNL	TNL	RNL	Genome Distribution	Key Evolutionary Mechanism
Akebia trifoliata	73	50	19	4	Uneven, clustered on chromosome ends	Tandem and dispersed duplications
Garden Asparagus (A. officinalis)	27	Not specified	Not specified	Not specified	Clustered patterns	Contraction from wild relatives
Wild Asparagus (A. setaceus)	63	Not specified	Not specified	Not specified	Clustered patterns	Expansion relative to cultivated species
Eggplant (S. melongena)	269	231	36	2	Clustered on chr10, 11, 12	Tandem duplication events
Vernicia fordii	90	49 CC-containing	0	Not specified	Non-random, clustered	LRR domain loss events
Vernicia montana	149	98 CC-containing	12 TIR-containing	Not specified	Non-random, clustered	Tandem duplications of linked families

Implications for Disease Resistance Breeding

Orthogroup analysis provides a powerful strategic framework for modern crop improvement programs. By identifying core orthogroups conserved across resistant varieties and species, breeders can prioritize these candidates for marker development and introgression into elite lines. The case of garden asparagus illustrates how domestication can lead to NLR gene repertoire contraction, with cultivated A. officinalis possessing only 27 NLR genes compared to 63 and 47 in its wild relatives A. setaceus and A. kiusianus, respectively [15] [36]. This reduction correlates with increased disease susceptibility in the domesticated species [36]. Orthologous gene analysis identified 16 conserved NLR gene pairs between A. setaceus and A. officinalis, representing the NLR genes preserved during domestication [36]. Notably, most preserved NLR genes in susceptible A. officinalis showed unchanged or downregulated expression after fungal challenge, suggesting functional impairment in disease resistance mechanisms [36].

These findings highlight how orthogroup analysis can guide precise breeding strategies—in this case, potentially focusing on re-introducing lost NLR genes from wild relatives or manipulating expression of conserved but poorly responding orthologs. Similarly, in tung trees, the identification of a specific NBS-LRR gene responsible for Fusarium wilt resistance provides a direct target for marker-assisted selection [27]. The ability to distinguish functional resistance alleles from their non-functional counterparts, as demonstrated by the promoter analysis in Vernicia species, enables much more precise breeding compared to traditional phenotypic selection alone [27].

Orthogroup analysis has revolutionized our approach to understanding and utilizing plant resistance genes by providing a systematic framework for comparative genomics. The methodology enables researchers to distinguish evolutionarily conserved, core resistance mechanisms from species-specific innovations, guiding efficient candidate gene selection for functional characterization. As genomic resources continue to expand across crop species and their wild relatives, orthogroup analysis will play an increasingly critical role in unlocking the genetic basis of disease resistance. The integration of this approach with modern breeding technologies promises to accelerate the development of durable disease resistance in agricultural crops, potentially reducing reliance on chemical pesticides and enhancing global food security.

Genome-Wide Identification and Expression Profiling of NBS-LRR Genes

Nucleotide-binding site (NBS) genes represent one of the largest and most critical gene families in plant immune systems, encoding proteins that function as intracellular receptors in effector-triggered immunity (ETI) [3] [40]. These genes are characterized by a conserved NBS domain that facilitates ATP/GTP binding and hydrolysis, often accompanied by C-terminal leucine-rich repeats (LRRs) and various N-terminal domains such as TIR (Toll/Interleukin-1 Receptor), CC (Coiled-Coil), or RPW8 (Resistance to Powdery Mildew 8) [23] [40]. The functional characterization of NBS genes across plant species reveals their crucial role in recognizing pathogen effectors and initiating robust immune responses, often culminating in hypersensitive response and programmed cell death to limit pathogen spread [41] [40].

The comparative analysis of NBS genes across plant species provides invaluable insights into plant adaptation mechanisms, evolutionary history, and diversification patterns [3] [41]. Recent studies have identified substantial diversity in NBS gene copy numbers and architectural patterns across land plants, from bryophytes to higher plants, with several species-specific structural patterns observed [3]. For instance, research has identified 12,820 NBS-domain-containing genes across 34 plant species, classified into 168 distinct classes with both classical and novel domain architectures [3]. This remarkable diversity stems primarily from gene duplication events and recombination, driving the expansion and functional diversification of this critical gene family [41].

Within this context, bioinformatic pipelines for accurate NBS gene identification become paramount for evolutionary studies and disease resistance breeding programs. This guide provides a comparative analysis of two fundamental approaches: the specialized, domain-focused HMMER pipeline and the comprehensive, evolutionary-driven OrthoFinder method, framing this comparison within broader studies of NBS genes across plant species.

HMMER: A Domain-Centric Pipeline for NBS Identification

The HMMER pipeline employs a domain-based identification strategy using Hidden Markov Models (HMMs) to detect the conserved NBS domain within protein sequences. This method has been extensively applied in recent large-scale comparative studies of NBS genes across multiple plant species [3] [23] [41]. The typical workflow begins with constructing or obtaining a curated HMM profile for the NB-ARC domain (PF00931 from the Pfam database), followed by scanning proteome sequences using tools like HMMER3 or PfamScan.pl [3] [23]. Significant hits passing a chosen E-value threshold (reported values range from a stringent 1.1e-50 to a permissive 10⁻³) are retained as putative NBS-containing genes [3] [41]. Subsequent domain architecture analysis then classifies these genes into subfamilies (TNL, CNL, RNL, NL) based on the presence of additional domains identified through complementary tools like the NCBI Conserved Domain Database (CDD) [23] [40].
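The subfamily assignment described above amounts to a simple rule over the set of domains detected in each protein. The sketch below is one possible encoding and not the cited studies' code. The domain labels, the labels for genes lacking an LRR (TN, CN, RN, N), and the precedence given to RPW8 over TIR and CC when several N-terminal domains are reported are all assumptions made for illustration.

```python
from typing import Iterable


def classify_nbs(domains: Iterable[str]) -> str:
    """Map a protein's detected domains to an NBS subfamily label.

    Assumed domain names: "NB-ARC", "TIR", "CC", "RPW8", "LRR".
    """
    found = {d.upper() for d in domains}
    if "NB-ARC" not in found:
        return "non-NBS"
    # Assumed precedence: RPW8 first, because RPW8-type proteins may also
    # be reported with a coiled-coil prediction.
    if "RPW8" in found:
        prefix = "R"
    elif "TIR" in found:
        prefix = "T"
    elif "CC" in found:
        prefix = "C"
    else:
        prefix = ""
    suffix = "L" if "LRR" in found else ""
    return f"{prefix}N{suffix}"


for doms in (["TIR", "NB-ARC", "LRR"], ["CC", "NB-ARC", "LRR"],
             ["RPW8", "CC", "NB-ARC", "LRR"], ["NB-ARC", "LRR"], ["NB-ARC"]):
    print(doms, "->", classify_nbs(doms))
```

Unusual architectures such as TIR-NBS-TIR-Cupin1-Cupin1 would need domain order as well as domain presence, which this set-based rule deliberately ignores.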

This pipeline's strength lies in its precision and specialization for cataloging NBS gene repertoire. A recent study of NBS genes in three Nicotiana species exemplifies this approach, where researchers identified 1,226 NBS genes through HMMER-based domain searches followed by CDD validation [23]. Similarly, investigations in Citrus species (identifying 1,585 NLR genes across 10 genomes) and Salvia miltiorrhiza (identifying 196 NBS-LRR genes) employed this HMM-centric strategy [41] [40]. The method provides researchers with comprehensive inventories of NBS genes, including their structural classifications and distributions across genomes.

OrthoFinder: An Evolutionary and Orthology-Based Framework

OrthoFinder implements a phylogenetic orthology inference approach designed to identify orthogroups—sets of genes descended from a single gene in the last common ancestor of the species being analyzed [39]. The methodology begins with an all-vs-all sequence similarity search (using DIAMOND or BLAST) of input proteomes, followed by clustering of related sequences into orthogroups using graph-based algorithms [42] [39]. The updated OrthoFinder algorithm further infers gene trees for each orthogroup, reconstructs the rooted species tree, maps gene duplication events, and identifies orthologs and paralogs through sophisticated phylogenetic analysis [39].

For NBS gene studies, OrthoFinder enables evolutionary contextualization by clustering NBS sequences into orthogroups (OGs) that reflect shared evolutionary history [3]. This approach was effectively applied in a comprehensive study of plant NBS genes, which identified 603 orthogroups with some "core" (commonly conserved) and "unique" (species-specific) OGs showing evidence of tandem duplications [3]. This evolutionary perspective helps researchers identify conserved NBS lineages across plant taxa and species-specific expansions, providing insights into patterns of gene family evolution and diversification.

Table 1: Key Characteristics of HMMER and OrthoFinder Approaches for NBS Gene Analysis

Feature	HMMER Pipeline	OrthoFinder Framework
Primary Focus	Domain identification and classification	Evolutionary relationships and orthology inference
Methodological Core	Hidden Markov Model profiling	Sequence similarity clustering and phylogenetic analysis
Typical Input	Protein sequences from one or multiple species	Protein sequences from multiple species
Key Output	NBS gene catalog with domain architecture	Orthogroups, gene trees, orthologs/paralogs
Strengths	High precision for domain detection; Comprehensive gene inventory	Evolutionary context; Differentiation of orthologs/paralogs
Limitations	Limited evolutionary context; May miss divergent sequences	Computationally intensive for large datasets
Application in NBS Studies	Gene family characterization in single species	Comparative genomics across multiple species

Experimental Protocols for Benchmarking

To objectively evaluate the performance of bioinformatic tools, researchers employ standardized benchmarking protocols. The Quest for Orthologs (QfO) consortium maintains a benchmarking suite that assesses orthology inference methods on reference proteome sets, evaluating how well they recapitulate curated orthologous groups [42]. Additionally, benchmark studies often employ metrics such as precision, recall, and F-score calculated against manually curated gold-standard datasets, such as SwissTree and TreeFam-A [39]. Precision measures the proportion of correctly identified orthologs among all predicted orthologs (TP/[TP+FP]), while recall measures the proportion of true orthologs successfully identified (TP/[TP+FN]) [43] [39].
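The precision and recall definitions above translate directly into code. The following sketch compares a set of predicted ortholog pairs against a reference set. It assumes the F-score is the balanced F1 (harmonic mean of precision and recall), and the gene pairs are hypothetical.

```python
from typing import Dict, FrozenSet, Set

Pair = FrozenSet[str]  # an unordered ortholog pair, e.g. frozenset({"A1", "B1"})


def ortholog_metrics(predicted: Set[Pair], truth: Set[Pair]) -> Dict[str, float]:
    """Compute precision, recall, and F1 for predicted ortholog pairs."""
    tp = len(predicted & truth)   # correctly predicted pairs
    fp = len(predicted - truth)   # predicted pairs absent from the reference
    fn = len(truth - predicted)   # reference pairs that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


def pair(a: str, b: str) -> Pair:
    return frozenset((a, b))


# Hypothetical reference and prediction sets.
truth = {pair("A1", "B1"), pair("A2", "B2"), pair("A3", "B3"), pair("A4", "B4")}
predicted = {pair("A1", "B1"), pair("B2", "A2"), pair("A3", "B9")}

print(ortholog_metrics(predicted, truth))
```

In this toy case two of the three predictions are correct (precision 2/3) and two of the four reference pairs are recovered (recall 1/2), giving an F1 of 4/7.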

For NBS-specific assessments, evaluation might include the ability to identify known NBS gene families or recapitulate established evolutionary patterns, such as the presence of certain NBS orthogroups across multiple plant lineages [3]. Independent benchmarking studies have revealed that different orthology methods, while showing similar large-scale performance, can produce substantially different orthologous groups, highlighting the importance of method selection for specific research questions [42].

Performance Comparison and Experimental Data

Accuracy and Efficiency Benchmarks

Independent evaluations provide critical insights into the performance characteristics of orthology inference tools. In comprehensive benchmarking studies, OrthoFinder has demonstrated superior ortholog inference accuracy, outperforming other methods by 3-24% on the SwissTree benchmark and 2-30% on the TreeFam-A benchmark according to tests conducted through the Quest for Orthologs initiative [39]. The method achieves this while maintaining computational efficiency comparable to the fastest score-based heuristic methods, particularly when using DIAMOND for sequence similarity searches [39].

The HMMER-based approach, as implemented in tools like OrthoFisher, also shows impressive accuracy for identifying sequences with high similarity to query profiles. In performance assessments comparing OrthoFisher with BUSCO (another HMM-based tool), researchers observed near-perfect precision (0.98) and recall (1.0) values for identifying single-copy orthologous genes [43]. The HMMER strategy particularly excels at identifying domain-containing genes with high sensitivity, especially when using carefully curated model cutoffs [3].

Table 2: Performance Comparison of Orthology Inference Methods Based on Published Benchmarks

Method	Precision	Recall	F-score	Computational Efficiency	Primary Use Case
OrthoFinder (Default)	High (Top performer on QfO benchmarks)	High (Top performer on QfO benchmarks)	High (Top performer on QfO benchmarks)	Fast with DIAMOND; Scalable to hundreds of species	Genome-wide orthology inference across multiple species
HMMER/OrthoFisher	High (0.98 precision reported)	High (1.0 recall reported)	High	Fast for targeted searches; Efficient for specific domain identification	Identification of genes with specific domain architectures
BUSCO	High	High	High	Moderate	Assessment of genome completeness and ortholog identification
SonicParanoid	Moderate-High	Moderate-High	Moderate-High	Fast with MMseqs2	Rapid orthology inference for large datasets

Complementary Applications in NBS Gene Research

The HMMER and OrthoFinder approaches display complementary strengths in NBS gene research, addressing different but related biological questions. The HMMER pipeline excels at providing comprehensive inventories of NBS genes within individual genomes, as demonstrated in studies of Nicotiana species (1,226 NBS genes identified), citrus (1,585 NLR genes across 10 species), and Salvia miltiorrhiza (196 NBS-LRR genes) [23] [41] [40]. This approach enables detailed structural classification and reveals species-specific domain architectures, such as the unusual TIR-NBS-TIR-Cupin1-Cupin1 pattern discovered in some plants [3].

OrthoFinder, conversely, provides the evolutionary framework for understanding NBS gene relationships across species. In the comprehensive analysis of 12,820 NBS genes across 34 plant species, OrthoFinder identified 603 orthogroups, revealing both core conserved groups and species-specific expansions [3]. This orthogroup analysis facilitated the identification of tandem duplication events and provided insights into patterns of gene family evolution across the plant kingdom. Expression profiling of these orthogroups further revealed putative upregulation of specific OGs (OG2, OG6, OG15) under various biotic and abiotic stresses in cotton accessions with differing susceptibility to cotton leaf curl disease [3].

Integrated Workflow for Comprehensive NBS Gene Analysis

A Combined Pipeline for Maximum Insight

Leading research in the field demonstrates that the most comprehensive understanding of NBS gene evolution and function emerges from integrating both domain-centric and orthology-based approaches [3] [41]. A robust integrated pipeline begins with HMMER-based identification of NBS genes across species of interest, followed by OrthoFinder analysis to cluster these sequences into orthogroups and infer evolutionary relationships. This combined approach was successfully implemented in a recent study of NBS genes in land plants, where researchers first identified NBS-domain-containing genes using HMM searches and subsequently performed orthogroup analysis using OrthoFinder [3].

The integrated workflow allows researchers to not only catalog the NBS gene repertoire but also understand evolutionary patterns, including gene duplication events, loss patterns, and conserved lineages. This strategy proved particularly insightful in citrus NLR gene research, where it helped unravel the mechanisms underlying NLR gene diversity and evolution, revealing that gene duplication and recombination served as primary drivers of diversification [41].

Diagram 1: Integrated bioinformatic workflow for comprehensive NBS gene analysis combining HMMER and OrthoFinder approaches

Experimental Validation and Functional Characterization

Bioinformatic predictions require experimental validation to confirm functional roles of identified NBS genes. Functional studies often employ virus-induced gene silencing (VIGS) to knock down candidate NBS genes followed by pathogen challenge assays [3]. For instance, silencing of GaNBS (OG2) in resistant cotton demonstrated its putative role in virus titering against cotton leaf curl disease [3]. Similarly, transcriptomic analyses under various stress conditions and promoter analyses for cis-regulatory elements provide supporting evidence for NBS gene involvement in immune responses [3] [40].

Genetic variation analysis between resistant and susceptible accessions further strengthens functional associations, as demonstrated in comparisons between tolerant (Mac7) and susceptible (Coker 312) cotton varieties, which identified numerous unique variants in NBS genes [3]. Protein-ligand and protein-protein interaction studies can subsequently validate physical interactions between NBS proteins and pathogen effectors, completing the pipeline from identification to functional characterization [3].

Essential Research Reagents and Computational Tools

Table 3: Key Research Reagent Solutions for NBS Gene Identification and Analysis

Resource Category	Specific Tools/Databases	Function in NBS Gene Research	Application Example
Domain Databases	PFAM (PF00931), NCBI CDD	Identification of NBS and associated domains	HMMER-based domain detection [23] [40]
Orthology Tools	OrthoFinder, OrthoFisher, BUSCO	Orthogroup inference and evolutionary analysis	Cross-species orthology of NBS genes [43] [3] [39]
Sequence Search	DIAMOND, BLAST, HMMER3	Sequence similarity searches and domain detection	All-vs-all proteome comparisons [43] [39]
Phylogenetic Analysis	FastTree, IQ-TREE, MAFFT	Multiple sequence alignment and tree inference	Evolutionary relationships of NBS orthogroups [3] [41]
Genomic Databases	Phytozome, NCBI, Plaza	Source of annotated proteomes for multiple species	Retrieval of plant genome data [3] [41]
Expression Analysis	Cufflinks, HTSeq, DESeq2	Transcript quantification and differential expression	Expression profiling under stress [3] [23]

The comparative analysis of HMMER and OrthoFinder pipelines reveals complementary strengths that researchers can leverage for comprehensive NBS gene studies. The HMMER pipeline provides superior domain detection and classification capabilities, enabling precise identification and structural characterization of NBS genes within individual genomes. Meanwhile, OrthoFinder offers robust evolutionary context through orthology inference, facilitating comparative analyses across multiple species and revealing patterns of gene family evolution.

For researchers investigating NBS genes across plant species, an integrated approach that combines both methodologies delivers the most comprehensive insights. This combined strategy enables both detailed structural annotation and evolutionary analysis, supporting advanced studies into the diversification, selection pressures, and functional specialization of this critical gene family. As genomic resources continue to expand across plant species, these bioinformatic pipelines will play increasingly important roles in elucidating the evolutionary dynamics of plant immune systems and supporting disease resistance breeding programs.

Diagram 2: Complementary strengths of HMMER and OrthoFinder in NBS gene research

The domain architecture of nucleotide-binding site (NBS) and leucine-rich repeat (LRR) proteins represents a fundamental paradigm in innate immune recognition across plant species. These intracellular immune receptors, characterized by their modular NBS, LRR, TIR, and CC domains, provide a sophisticated system for pathogen detection and defense activation. The comparative analysis of these domains across species reveals remarkable structural conservation coupled with functional diversification driven by evolutionary pressures. This guide systematically evaluates the performance of various methodological approaches for validating these critical immune domains, providing researchers with experimental frameworks for structural and functional characterization. As the plant immune system relies heavily on these molecular sentinels, understanding their architectural principles has become essential for both basic research and applied crop improvement strategies.

Domain Validation Methodologies

Computational Structure Prediction and Limitations

The emergence of deep learning platforms like AlphaFold (AF2, AF3) and RoseTTAFold (RFAA) has revolutionized protein structure prediction, yet significant challenges remain for multistate multidomain proteins like NBS-LRR receptors. Performance evaluation using the solved structures of Arabidopsis ZAR1-CNL receptor reveals a complex picture of domain-specific prediction accuracy [44].

Table 1: AI Prediction Performance for CNL Domains (Cα RMSD vs Experimental Structures)

Domain	Prediction Platform	RMSD vs Active State (Å)	RMSD vs Inactive State (Å)	Key Limitations
CC Domain	AF2 (default)	>12.0	>12.0	Mixed active/inactive segment modeling
CC Domain	AF2 (template-controlled)	<3.0	<3.0	Requires curated templates
NBD Domain	All platforms	<2.0	<2.0	High accuracy across methods
LRR Domain	All platforms	<2.5	<2.5	Good ventral surface prediction
LRR Domain	All platforms	N/A	N/A	Dorsal helical bias vs experimental

The prediction performance varies dramatically by domain, with NBD and LRR domains generally showing high accuracy (RMSD <2.5 Å), while CC domains present significant challenges, often exceeding 12 Å RMSD without specialized template curation [44]. This domain-specific performance highlights the necessity of experimental validation, particularly for conformationally dynamic regions.
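For readers unfamiliar with the metric, Cα RMSD is the root-mean-square distance between matched Cα atoms of two structures. The sketch below shows only the final calculation. It assumes the predicted and experimental coordinates have already been superposed and residue-matched by a structural alignment tool, and the coordinates are invented for illustration.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]  # (x, y, z) in angstroms


def ca_rmsd(pred: Sequence[Coord], ref: Sequence[Coord]) -> float:
    """RMSD between two pre-superposed, residue-matched Cα coordinate lists."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(pred, ref)
    )
    return math.sqrt(squared / len(pred))


# Invented coordinates: the second atom is displaced by 2 Å along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(round(ca_rmsd(predicted, experimental), 3))
```

Because the value depends on the superposition, per-domain RMSDs such as those in the table are typically computed after aligning each domain separately rather than the full-length receptor.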

Experimental Validation Approaches

Experimental validation of NBS-LRR domain architecture employs multiple complementary techniques that provide orthogonal data for structure-function analysis. These methodologies span biophysical, biochemical, and genetic approaches.

X-ray Crystallography has been successfully applied to determine TIR domain structures, as demonstrated with the flax L6 resistance protein, where residues 59-228 formed a structure consisting of "a five-stranded parallel β sheet (βA–βE) surrounded by five α-helical regions (αA–αE)" [45]. This approach redefined the boundaries of plant TIR domains and revealed self-association interfaces critical for signaling.

Functional Complementation Assays enable validation of interdomain interactions through trans-complementation experiments. Research on the potato Rx protein demonstrated that "co-expression of the CC–NBS and LRR regions of Rx as separate molecules resulted in a CP-dependent HR" [46]. Similarly, "co-expression of Rx CC with NBS–LRR led to a CP-dependent HR," confirming functional domain interactions and their pathogen-induced disruption [46].

Site-Directed Mutagenesis of critical conserved residues validates functional motifs. In the L6 TIR domain, mutations at highly conserved positions (R73A, S129A, Y156A, P160Y) abolished autoactive cell death induction, while D159A caused partial reduction, delineating signaling-essential residues [45].

Table 2: Experimental Protocols for Domain Validation

Method	Key Protocol Steps	Domain Applications	Critical Controls
X-ray Crystallography	Protein expression & purification, crystallization, data collection, structure solution	TIR domain boundaries, self-association interfaces	SeMet derivatives for phasing
Functional Complementation	Transient expression of separate domains, HR assessment, co-immunoprecipitation	CC-NBS-LRR interactions, effector disruption	Empty vector, full-length positive control
Site-Directed Mutagenesis	Conserved residue identification, Ala-scanning, functional assays	P-loop, EDVID, TIR signaling motifs	Wild-type protein, expression validation
Virus-Induced Gene Silencing (VIGS)	Target sequence cloning, agrobacterium delivery, phenotype assessment	NBS gene function in plant immunity	Scrambled sequence control
Yeast Two-Hybrid	Domain bait/prey construction, interaction testing, specificity controls	Effector recognition, intra-molecular interactions	Empty vector autoactivation tests

Comparative Analysis Across Plant Species

Genomic Distribution and Architectural Diversity

Large-scale comparative genomics reveals remarkable diversity in NBS-LRR gene composition across plant species. A comprehensive analysis identified "12,820 NBS-domain-containing genes across 34 species covering from mosses to monocots and dicots," classified into "168 classes with several novel domain architecture patterns" [3]. This diversity encompasses both classical and species-specific structural patterns.

In pepper (Capsicum annuum), analysis of "252 NBS-LRR resistance genes" demonstrated uneven chromosomal distribution, with "54% forming 47 gene clusters" driven by "tandem duplications and genomic rearrangements" [24]. Phylogenetic analysis showed "the dominance of the nTNL subfamily over the TNL subfamily," reflecting "lineage-specific adaptations and evolutionary pressures" [24].

Table 3: NBS-LRR Domain Architecture Diversity in Plant Species

Species	Total NBS Genes	TNL Count	CNL Count	Other Architectures	Notable Features
Capsicum annuum	252	4	48 (2 typical CNL)	200 lacking CC/TIR	47 gene clusters, nTNL dominance
Arabidopsis thaliana	~100-150	~50-70	~50-80	Integrated domains	Balanced distribution
Gossypium hirsutum	Extensive repertoire	Variable	Variable	Novel combinations	Response to CLCuD
Physcomitrella patens	~25	Limited	Limited	Minimal expansion	Ancestral repertoire
Triticum aestivum	2,012	Limited	Extensive	Monocot adaptation	CNL predominance

The pepper genome study revealed exceptional structural diversity, with nTNL genes classified into six subclasses based on domain structure: "N (only NB-ARC), NL (NB-ARC+LRR8), NLL (NB-ARC+2LRR8), NN (2NB-ARC), NLN (NB-LRR+NB-ARC), and NLNLN (NB-LRR+NB-LRR+NB-ARC)" [24]. This diversity highlights the modular flexibility of NBS domain architecture and its evolutionary expansion.
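As a rough illustration of how such subclass labels can be derived, the sketch below builds an architecture string from domain hits ordered along the protein, using one letter per hit. The tuple layout, the domain names used as dictionary keys, and the function names are assumptions made for this example and are not taken from the pepper study.

```python
# One-letter codes for the two domains named in the nTNL subclasses.
DOMAIN_CODES = {"NB-ARC": "N", "LRR8": "L"}

# Subclass labels as listed in the text.
NTNL_SUBCLASSES = {"N", "NL", "NLL", "NN", "NLN", "NLNLN"}

def architecture(hits):
    """Build a string such as 'NLN' from (domain name, start position) hits."""
    ordered = sorted(hits, key=lambda hit: hit[1])
    return "".join(DOMAIN_CODES[name] for name, _ in ordered if name in DOMAIN_CODES)

def ntnl_subclass(hits):
    """Return the matching subclass label, or 'unclassified' if none applies."""
    arch = architecture(hits)
    return arch if arch in NTNL_SUBCLASSES else "unclassified"

# Hypothetical protein with two NB-ARC hits flanking one LRR8 hit.
example_hits = [("LRR8", 400), ("NB-ARC", 50), ("NB-ARC", 700)]
print(ntnl_subclass(example_hits))
```

Sorting by start position before concatenating matters because NLN and NNL would otherwise be indistinguishable.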

Functional Specialization and Signaling Mechanisms

TIR Domain Signaling employs self-association as a conserved mechanism. The L6 TIR domain structure demonstrated that "self-association is a requirement for immune signaling," with "distinct surface regions involved in self-association, signaling, and autoregulation" [45]. TIR domains also act as NAD+-cleaving enzymes (NADases), and this enzymatic activity represents an evolutionarily ancient mechanism, with enzymatic TIR proteins functioning as "evolutionarily ancient immune regulators with functions in host defense across all life forms" [47].

CC Domain Functional Classes exhibit remarkable diversity. CC domains have been grouped into several classes: "CCEDVID, CCR, CC (CCCAN), I2-like and SD-CC classes," with the CCEDVID class characterized by "the highly conserved EDVID motif that is suggested to be involved in intramolecular interactions with the NB domain" [48]. This functional specialization enables optimized response to specific pathogen types.

The Scientist's Toolkit

Research Reagent Solutions

Table 4: Essential Research Reagents for Domain Architecture Studies

Reagent/Category	Specific Examples	Function/Application	Experimental Notes
Structural Prediction Platforms	AlphaFold2, AlphaFold3, RoseTTAFold	3D structure prediction from sequence	Template curation critical for CC domains
Domain Expression Constructs	CC-NBS, LRR, TIR fragments	Functional complementation assays	Epitope tagging for detection
Mutagenesis Kits	Site-directed mutagenesis systems	Functional motif validation	P-loop, EDVID, TIR catalytic residues
Crystallization Screens	Commercial sparse matrix screens	TIR domain crystallization	L6 TIR (residues 29-229) successful
VIGS Vectors	TRV-based silencing systems	NBS gene functional validation	Target conserved NBS motifs
Antibody Reagents	Anti-HA, Anti-GFP, domain-specific	Protein detection, localization	Critical for co-immunoprecipitation

Platform Selection Criteria: For accurate CC domain prediction, template-controlled AF2 implementation outperforms default protocols. RFAA shows advantages for complex multidomain proteins when experimental constraints guide modeling [44].

Functional Assay Optimization: Transient expression in N. benthamiana provides a robust system for HR-based domain function validation. Co-expression of separate domains (e.g., CC-NBS + LRR) tests functional complementation [46].

Evolutionary Analysis Tools: OrthoFinder v2.5.1 with DIAMOND for sequence similarity and MCL clustering enables phylogenetic reconstruction of NBS gene evolution across species [3].

The validation of NBS, LRR, TIR, and CC domain architecture requires an integrated approach combining computational prediction with experimental verification. While AI platforms offer remarkable accuracy for certain domains (NBD, LRR ventral surface), their limitations for dynamic regions (CC domain, LRR dorsal surface) necessitate experimental validation through crystallography, functional assays, and mutagenesis. The extensive diversification of NBS gene architectures across plant species, from minimal bryophyte repertoires to expanded angiosperm collections, reflects continuous evolutionary innovation in plant immunity. As structural prediction methodologies advance, their integration with experimental validation will continue to decipher the molecular principles of plant immune receptor function, enabling strategic engineering of disease resistance in crop species.

Chromosomal Mapping and Cluster Analysis of NBS Gene Families

Nucleotide-binding site (NBS) leucine-rich repeat (LRR) genes constitute the largest and most critical family of plant disease resistance (R) genes, serving as fundamental components of the plant immune system [49] [3]. These genes enable plants to recognize diverse pathogens and initiate robust defense responses, including the hypersensitive response and systemic acquired resistance [49]. The genomic organization of NBS-LRR genes, particularly their physical arrangement on chromosomes and tendency to form clusters, provides critical insights into their evolution, diversification, and functional mechanisms [50] [51]. This guide presents a comprehensive comparative analysis of NBS gene families across economically important plant species, synthesizing chromosomal mapping data and cluster patterns to elucidate evolutionary trends and functional implications. Through systematic comparison of quantitative data and experimental methodologies, this review serves as a reference for researchers investigating plant-pathogen co-evolution and developing disease-resistant crop varieties.

Comparative Genomic Distribution of NBS Genes

Chromosomal Organization and Density Patterns

NBS-LRR genes demonstrate non-random, uneven distribution across plant genomes, with pronounced clustering in specific chromosomal regions [50] [25]. Comparative analysis reveals that NBS genes frequently concentrate at chromosomal termini, particularly in telomeric and subtelomeric regions. In pepper (Capsicum annuum), for example, chromosome 3 harbors the most NBS genes (38), while chromosomes 2 and 6 contain the fewest (5 each) [50]. This distribution pattern suggests accelerated evolution in these genomic regions, potentially facilitating rapid adaptation to evolving pathogen populations [49].

In sunflower (Helianthus annuus), NBS genes are distributed across all 17 chromosomes, with one-third of identified clusters located on chromosome 13 [9]. The hexaploid sweet potato (Ipomoea batatas) exhibits an exceptionally high number of NBS-encoding genes (889), with 83.13% organized in clusters across its chromosomes [25]. Similarly, the Eucalyptus grandis genome contains 1,215 putative NBS-LRR coding sequences, with 76% organized in clusters of three or more genes [51]. These clustering patterns underscore the evolutionary significance of tandem duplications and genomic rearrangements in expanding the plant immune repertoire.

Table 1: Genomic Distribution of NBS-LRR Genes Across Plant Species

Plant Species	Total NBS Genes	Genes in Clusters (%)	Number of Clusters	Chromosome with Highest Density	Key Clustering Features
Perilla citriodora 'Jeju17'	535	Information missing	Information missing	Chromosomes 2, 4, 10	Single RPW8-type gene on chromosome 7 [49]
Capsicum annuum (Pepper)	252	54% (136 genes)	47	Chromosome 3 (38 genes)	Largest cluster (8 genes) on chromosome 3 [50]
Akebia trifoliata	73	64% (41 genes)	Information missing	Information missing	64 mapped genes unevenly distributed on 14 chromosomes [26]
Salvia miltiorrhiza	196	Information missing	Information missing	Information missing	62 genes with complete N-terminal and LRR domains [5]
Ipomoea batatas (Sweet potato)	889	83.13%	Information missing	Information missing	Higher segmentally duplicated genes [25]
Helianthus annuus (Sunflower)	352	Information missing	75	Chromosome 13	One-third of clusters on chromosome 13 [9]
Eucalyptus grandis	1,215	76%	Information missing	Information missing	Higher ratio of TIR to CC class genes [51]
Vernicia montana	149	Information missing	Information missing	Chromosomes 2, 7, 11	Non-random distribution across all chromosomes [27]
Asparagus officinalis	27	Information missing	Information missing	Information missing	Contracted repertoire compared to wild relatives [15]

Evolutionary Classification and Structural Diversity

NBS-LRR genes are classified into distinct subfamilies based on their N-terminal domains: TIR-NBS-LRR (TNL), CC-NBS-LRR (CNL), and RPW8-NBS-LRR (RNL) [3] [26]. Most plant genomes display asymmetrical distribution among these subfamilies, reflecting lineage-specific evolutionary paths and adaptation pressures. Pepper demonstrates striking dominance of the nTNL subfamily (248 genes) over TNLs (only 4 genes) [50], while Akebia trifoliata maintains a more balanced ratio with 50 CNL, 19 TNL, and 4 RNL genes [26].

Structural analysis of NBS domains reveals conserved motifs critical for function, including P-loop, RNBS-A, kinase-2, RNBS-B, RNBS-C, and GLPL motifs involved in nucleotide binding and hydrolysis [50]. These motifs exhibit high conservation across species despite significant variation in LRR domains, which determine pathogen recognition specificity [50] [27]. The number of exons differs substantially between CNLs and TNLs, with CNLs typically containing fewer exons, contributing to structural and functional diversification [26].

Table 2: NBS-LRR Gene Classification Across Plant Species

Plant Species	CNL	TNL	RNL	Other/Partial Domains	Notable Structural Features
Perilla citriodora 'Jeju17'	104 (CC-NB-ARC + CC-NB-ARC-LRR)	Information missing	1 (RPW8-NB-ARC)	430 (NB-ARC + NB-ARC-LRR)	Five structural classes identified [49]
Capsicum annuum (Pepper)	48 (with CC domains)	4	Information missing	200 (lack both CC and TIR domains)	Six nTNL subclasses based on domain combinations [50]
Akebia trifoliata	50	19	4	0	CNLs have fewer exons than TNLs [26]
Vernicia montana	98 (65.8%)	12 (8.1%)	Information missing	39 (other combinations)	2 genes contain both CC and TIR domains [27]
Vernicia fordii	49 (54.4%)	0	Information missing	41 (other combinations)	Complete absence of TIR domains [27]
Helianthus annuus (Sunflower)	100 (CNL)	77 (TNL)	13 (RNL)	162 (NL)	RNLs nested within CNL-A clade phylogenetically [9]
Eucalyptus grandis	Information missing	Information missing	Information missing	Information missing	Higher ratio of TIR to CC class compared to other woody plants [51]

Experimental Protocols for NBS Gene Identification and Analysis

Genome-Wide Identification Pipeline

The standard workflow for genome-wide identification and characterization of NBS-LRR genes integrates complementary bioinformatic approaches to ensure comprehensive detection [49] [3] [25]. The protocol begins with sequence retrieval from genomic databases such as Phytozome, NCBI, or specialized genome portals [15] [9].

The core identification process employs dual search strategies: Hidden Markov Model (HMM) profiling using the conserved NB-ARC domain (Pfam: PF00931) as query, and BLASTp searches against reference NBS-LRR protein sequences from model plants like Arabidopsis thaliana, Oryza sativa, or closely related species [15] [51]. For HMMER analysis, typical e-value cutoffs of 1e-5 to 1e-10 are applied to balance sensitivity and specificity [3] [15]. BLASTp searches typically use e-value thresholds of 1e-10 with alignment length filters (>500 bp) to eliminate spurious matches [51].
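The E-value filtering step can be sketched as follows. This example assumes the whitespace-delimited table written by HMMER3's hmmsearch with the --tblout option, in which the first column is the target name and the fifth column is the full-sequence E-value; the gene identifiers and scores are made up for illustration.

```python
def filter_tblout(lines, max_evalue=1e-5):
    """Return {target name: full-sequence E-value} for hits passing the cutoff."""
    hits = {}
    for line in lines:
        # Skip blank lines and the '#' comment/header lines of the table.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            # Keep the lowest E-value if a target is listed more than once.
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits

# Hypothetical rows: geneA is a strong NB-ARC hit, geneB is a marginal one.
example_rows = [
    "# target name  accession  query name  accession  E-value  score  bias",
    "geneA - NB-ARC PF00931.25 1.2e-40 140.1 0.1 2.0e-40 139.4 0.1 1.3 1 0 0 1 1 1 1 -",
    "geneB - NB-ARC PF00931.25 3.0e-04 18.2 0.0 4.1e-04 17.8 0.0 1.1 1 0 0 1 1 1 1 -",
]
strict = filter_tblout(example_rows, max_evalue=1e-5)
relaxed = filter_tblout(example_rows, max_evalue=1e-3)
print(sorted(strict), sorted(relaxed))
```

Running the same table through a strict and a relaxed cutoff is a quick way to see how many candidates depend on the threshold choice before committing to one.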

Candidate sequences undergo domain validation using PfamScan, InterProScan, or NCBI's Conserved Domain Database to verify the presence of characteristic NBS domains and additional motifs (TIR, CC, LRR, RPW8) [26] [15]. Coiled-coil domains, which are often undetectable by Pfam, require prediction using tools like Coiledcoil with a threshold value of 0.5 [26]. The final non-redundant gene set is classified into structural categories based on domain architecture [3].

Chromosomal Mapping and Cluster Analysis

Chromosomal locations of validated NBS genes are determined using annotation files (GFF/GTF) and visualized with mapping tools such as RIdeogram in R or TBtools [49] [15]. Gene clusters are typically defined as genomic regions containing two or more NBS-LRR genes within a specified physical distance, commonly 100-200 kb, with no more than 8-10 intervening non-NBS genes [50] [15].
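One way to apply such a cluster definition is sketched below. It chains neighbouring NBS genes on the same chromosome when the gap between them and the number of intervening non-NBS genes both stay within the chosen limits. The pairwise chaining rule, the tuple layout, and the default thresholds (200 kb, 8 genes) are assumptions for this example; individual studies apply their own variants.

```python
from collections import defaultdict

def find_clusters(genes, max_gap=200_000, max_intervening=8):
    """Group NBS genes into clusters of two or more members.

    `genes` is an iterable of (chrom, start, end, gene_id, is_nbs) tuples.
    Neighbouring NBS genes are chained when the gap between them is at most
    `max_gap` bp and at most `max_intervening` non-NBS genes lie between them.
    """
    by_chrom = defaultdict(list)
    for gene in genes:
        by_chrom[gene[0]].append(gene)

    clusters = []
    for chrom in sorted(by_chrom):
        current, prev_end, between = [], None, 0
        for _, start, end, gene_id, is_nbs in sorted(by_chrom[chrom], key=lambda g: g[1]):
            if not is_nbs:
                between += 1  # count non-NBS genes since the last NBS gene
                continue
            linked = (
                prev_end is not None
                and start - prev_end <= max_gap
                and between <= max_intervening
            )
            if not linked:
                if len(current) >= 2:
                    clusters.append((chrom, current))
                current = []
            current.append(gene_id)
            prev_end, between = end, 0
        if len(current) >= 2:
            clusters.append((chrom, current))
    return clusters

# Hypothetical annotation: two nearby NBS genes, one distant, one on another chromosome.
toy_genes = [
    ("chr1", 1_000, 2_000, "nbs1", True),
    ("chr1", 50_000, 52_000, "other1", False),
    ("chr1", 100_000, 103_000, "nbs2", True),
    ("chr1", 900_000, 903_000, "nbs3", True),
    ("chr2", 10, 500, "nbs4", True),
]
print(find_clusters(toy_genes))
```

Because cluster counts depend directly on both thresholds, reporting them alongside the results makes comparisons between species more meaningful.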

For synteny and duplication analysis, MCScanX algorithms identify collinear blocks and differentiate between tandem and segmental duplication events [49] [25]. Sweet potato exhibits higher proportions of segmentally duplicated NBS genes, while its diploid relatives (Ipomoea trifida, Ipomoea triloba) show more tandem duplications [25]. Evolutionary rates are calculated using Ka/Ks analysis, with Ka/Ks >1 indicating positive selection, which is frequently observed in LRR domains involved in pathogen recognition [25].
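The interpretation of Ka/Ks values can be made explicit with a small helper. This sketch takes Ka and Ks as already estimated by whichever tool a study uses; the function name and labels are illustrative, and the conventional reading (ratio above 1 for positive selection, below 1 for purifying selection, equal to 1 for neutrality) is applied without any significance testing.

```python
def classify_selection(ka, ks):
    """Return (Ka/Ks ratio, label); the ratio is None when Ks is not positive."""
    if ks <= 0:
        # The ratio is undefined without synonymous substitutions.
        return None, "undefined (Ks = 0)"
    ratio = ka / ks
    if ratio > 1:
        label = "positive selection"
    elif ratio < 1:
        label = "purifying selection"
    else:
        label = "neutral"
    return ratio, label

# Hypothetical duplicated gene pairs.
for ka, ks in [(0.5, 0.25), (0.05, 0.5), (0.1, 0.0)]:
    print(ka, ks, classify_selection(ka, ks))
```

In practice, ratios are often computed per domain or per site, since a gene-wide average below 1 can hide positively selected residues in the LRR region.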

Signaling Pathways and Functional Mechanisms

NBS-LRR proteins function as intracellular immune receptors that directly or indirectly recognize pathogen effectors, triggering defense signaling cascades [49] [27]. Two recognition mechanisms predominate: the direct recognition model, where NBS-LRR proteins bind pathogen effectors, and the guard hypothesis, where NBS-LRR proteins monitor host proteins that are modified by pathogen effectors [51]. Upon activation, conformational changes in the NBS domain facilitate nucleotide exchange (ADP to ATP), enabling interaction with downstream signaling components [50].

CNLs and TNLs primarily function as pathogen sensors, while RNLs act as "helper" proteins involved in downstream signal transduction [3] [26]. TNL signaling typically depends on EDS1, which works with PAD4 and ADR1-type helper RNLs or with SAG101 and NRG1-type helper RNLs, whereas many CNLs signal independently of EDS1 [26]. Recent evidence indicates that some NBS-LRR proteins employ decoy domains that mimic pathogen targets, expanding recognition specificity without direct effector binding [51].

Functional validation through virus-induced gene silencing (VIGS) has demonstrated that silencing specific NBS genes (e.g., GaNBS in cotton) increases susceptibility to pathogens, confirming their essential role in immunity [3]. In tung trees, orthologous gene pairs between resistant (Vernicia montana) and susceptible (Vernicia fordii) species show divergent expression patterns, with a specific NBS gene (Vm019719) conferring Fusarium wilt resistance [27].

Evolutionary Insights from Comparative Analysis

Gene Family Expansion and Contraction

NBS-LRR gene families demonstrate remarkable dynamism across plant lineages, with significant expansion in some species and contraction in others. The hexaploid sweet potato contains 889 NBS-encoding genes, while its diploid relatives Ipomoea trifida and Ipomoea triloba maintain 554 and 571 genes respectively, illustrating how polyploidization contributes to repertoire expansion [25]. Conversely, domesticated asparagus (Asparagus officinalis) shows dramatic contraction to only 27 NLR genes compared to its wild relatives Asparagus setaceus (63 genes) and Asparagus kiusianus (47 genes), suggesting artificial selection for yield and quality traits may have compromised disease resistance [15].

Phylogenetic analysis of NBS genes across 34 plant species reveals 168 distinct domain architecture classes, with both conserved patterns and species-specific innovations [3]. TIR domains are consistently absent from monocot NBS-LRR genes and have been independently lost in several eudicot lineages, including Vernicia fordii and Sesamum indicum [27] [9]. These lineage-specific changes reflect contrasting evolutionary paths in plant immunity mechanisms.

Expression Patterns and Functional Diversification

NBS-LRR genes typically exhibit low basal expression with specific induction upon pathogen challenge [26] [5]. Tissue-specific expression patterns reveal specialized functions, with certain NBS genes showing preferential expression in roots, leaves, or reproductive tissues [49] [5]. Comparative transcriptomics of resistant and susceptible genotypes identifies candidate R-genes with potential breeding applications [3] [27].

In Salvia miltiorrhiza, NBS-LRR gene expression correlates with secondary metabolism, suggesting crosstalk between defense signaling and medicinal compound production [5]. Pepper NBS-LRR genes contain abundant cis-regulatory elements responsive to defense hormones (jasmonic acid, salicylic acid) and abiotic stresses, enabling integrated response coordination [50]. These expression patterns underscore the functional diversification within expanded NBS-LRR gene families.

Table 3: Essential Research Reagents and Computational Tools for NBS Gene Analysis

Category	Specific Tool/Reagent	Application	Key Features
Genomic Databases	Phytozome, NCBI Genome, PLAZA, PlantGARDEN	Genome sequence retrieval	Curated plant genomes with annotation [3] [15]
Domain Identification	HMMER v3, PfamScan, InterProScan	NBS domain detection	Hidden Markov Model profiling with Pfam databases [49] [51]
Motif Analysis	MEME Suite, SMART, CDD	Conserved motif prediction	Identifies P-loop, kinase-2, GLPL, MHD motifs [49] [26]
Chromosomal Mapping	RIdeogram (R), TBtools, Circos	Visualization of gene distribution	Gene density plots, synteny maps [49] [15]
Cluster Analysis	MCScanX, BEDTools, OrthoFinder	Gene cluster identification	Defines physical clusters, analyzes duplication patterns [49] [25]
Expression Analysis	DESeq2, featureCounts, CottonFGD	Differential expression	RNA-seq data processing, tissue-specific expression [49] [3]
Functional Validation	VIGS (Virus-Induced Gene Silencing), qRT-PCR	Gene function confirmation	Knockdown assays, expression verification [3] [27]
Primer Design	Primer-BLAST, degenerate primers	Amplification of NBS sequences	Targets conserved motifs for resistance gene analog isolation [50] [9]

Chromosomal mapping and cluster analysis of NBS gene families reveal fundamental principles of plant immunity evolution. The non-random distribution and clustering patterns observed across species highlight the importance of tandem duplications and genomic rearrangements in generating diversity for pathogen recognition. Comparative studies illuminate both conserved features and lineage-specific adaptations in NBS-LRR gene organization, with significant implications for crop improvement. The experimental frameworks and resources outlined provide researchers with standardized methodologies for future investigations. As genome sequencing technologies advance, characterization of NBS gene families in non-model plants will further elucidate the evolutionary arms race between plants and their pathogens, enabling development of durable disease resistance in agricultural systems.

Leveraging RNA-seq Data for NBS Gene Expression Profiling Under Biotic and Abiotic Stress

The Nucleotide-Binding Site-Leucine-Rich Repeat (NBS-LRR) gene family constitutes the largest and most critical class of plant resistance (R) genes, encoding intracellular immune receptors that trigger effector-triggered immunity upon pathogen recognition [13]. Approximately 80% of cloned plant R genes belong to this family, making them fundamental targets for plant immunity research and breeding programs [13] [52]. With the advancement of high-throughput sequencing technologies, RNA-seq has emerged as a powerful tool for investigating the expression profiles of these genes under various stress conditions, providing unprecedented insights into plant defense mechanisms.

The complexity of NBS gene families varies dramatically across plant species, ranging from 25 NLRs in the bryophyte Physcomitrella patens to over 2,000 in bread wheat (Triticum aestivum) [3] [10]. This genomic diversity, combined with the multifaceted nature of plant stress responses, necessitates sophisticated transcriptional profiling approaches to decipher the role of specific NBS genes in biotic and abiotic stress adaptation. This review synthesizes current methodologies, findings, and applications of RNA-seq data in profiling NBS gene expression across diverse plant species, providing a comprehensive framework for researchers in the field.

NBS Gene Family: Classification and Genomic Distribution

Structural Classification and Functional Domains

NBS-LRR proteins are characterized by a conserved tripartite domain architecture that forms the structural basis for their immune function. The central nucleotide-binding site (NBS) domain binds and hydrolyzes ATP/GTP, facilitating conformational changes essential for activation [13] [53]. The C-terminal leucine-rich repeat (LRR) domain is responsible for specific pathogen recognition through direct or indirect effector binding [52] [53]. The N-terminal domain determines the classification into major subfamilies and dictates signaling specificity.

Based on N-terminal domain architecture, NBS-LRR genes are primarily classified into:

TNLs: Contain a Toll/Interleukin-1 Receptor (TIR) domain
CNLs: Feature a Coiled-Coil (CC) domain
RNLs: Possess a Resistance to Powdery Mildew 8 (RPW8) domain [3] [13]

Additionally, numerous atypical or truncated forms exist, including TN (TIR-NBS), CN (CC-NBS), NL (NBS-LRR), and N (NBS-only) proteins, which may retain specialized functions despite domain losses [13] [10].
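The naming scheme above can be expressed as a short rule. The sketch assumes that domain detection has already produced a set of generic domain labels ("TIR", "CC", "RPW8", "NBS", "LRR") for each protein; those labels, the function name, and the precedence applied when more than one N-terminal domain is present are assumptions made for illustration.

```python
def classify_nbs(domains):
    """Assign a class label (TNL, CN, NL, N, ...) from the detected domains."""
    found = set(domains)
    if "NBS" not in found:
        return "non-NBS"
    # Assumed precedence when several N-terminal domains co-occur: TIR, CC, RPW8.
    if "TIR" in found:
        prefix = "T"
    elif "CC" in found:
        prefix = "C"
    elif "RPW8" in found:
        prefix = "R"
    else:
        prefix = ""
    suffix = "L" if "LRR" in found else ""
    return prefix + "N" + suffix

# Hypothetical domain sets for four proteins.
examples = [{"TIR", "NBS", "LRR"}, {"CC", "NBS"}, {"NBS", "LRR"}, {"NBS"}]
print([classify_nbs(d) for d in examples])
```

Proteins carrying both CC and TIR domains do occur, so the precedence rule is a simplification that a real pipeline would want to flag rather than silently resolve.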

Table 1: NBS-LRR Gene Family Distribution Across Plant Species

Plant Species	Total NBS Genes	TNL	CNL	RNL	Atypical	Reference
Arabidopsis thaliana	207	101	-	-	-	[13]
Nicotiana tabacum	603	~2.5%	~23.3%	-	~45.5% NBS-only	[10]
Salvia miltiorrhiza	196	2	75	1	118	[13]
Brassica oleracea (cabbage)	138	105	33	-	-	[52]
Citrus sinensis (sweet orange)	111	15	31	3	62	[53]
Asparagus officinalis (garden asparagus)	27	-	-	-	-	[15]

Evolutionary Dynamics and Genomic Organization

NBS-LRR genes exhibit remarkable evolutionary plasticity, with gene numbers fluctuating dramatically due to species-specific expansion and contraction events. Whole-genome duplication (WGD) and tandem duplications serve as primary drivers of NBS gene family expansion, enabling rapid adaptation to evolving pathogen pressures [3] [10]. Comparative genomic analyses reveal frequent subfamily loss in certain lineages; for instance, monocots like rice (Oryza sativa) have completely lost TNL genes, while dicots like Salvia miltiorrhiza show marked reduction in both TNL and RNL subfamilies [13].

NBS-LRR genes typically display clustered genomic arrangements, often localizing in recombination-rich regions that facilitate the generation of novel recognition specificities. Studies in Asparagus species revealed significant gene family contraction during domestication, with wild relative A. setaceus harboring 63 NLRs compared to only 27 in cultivated A. officinalis, potentially explaining enhanced disease susceptibility in the domesticated species [15].

Experimental Design for RNA-seq Profiling of NBS Genes

Comprehensive Stress Treatment Protocols

Effective transcriptional profiling of NBS genes requires carefully designed stress imposition regimes that capture both temporal dynamics and stress-specific responses. The following treatments have proven effective across multiple studies:

Biotic Stress Challenges:

Fungal pathogens: Fusarium oxysporum in cabbage (root dipping method, sampling at 0, 6, 12, 24, 48, and 72 hours post-inoculation) [52]
Bacterial pathogens: Xanthomonas campestris pv. vesicatoria in pepper (injection with 10⁸ CFU/mL, sampling at 0, 3, 6, 12, 24, and 48 hpi) [54]
Oomycetes: Phytophthora capsici in pepper (5×10⁴ zoospores/mL, sampling at 0, 1, 2, 4, 6, 12, and 24 hpi) [54]
Viruses: Tobacco mosaic virus P2 strain (TMV-P2) in pepper (mechanical inoculation, sampling at 0, 0.5, 4, 24, 48, and 72 hpi) [54]

Abiotic Stress Applications:

Drought stress: Withholding water or osmotic agents (e.g., PEG) with sampling during progressive stress stages [55]
Salt stress: Application of NaCl solutions (e.g., 150-200 mM) with sampling at early (0.25 h) and later time points [56]
Temperature stress: Exposure to low (4°C) or high (38-42°C) temperatures with time-course sampling [56]
Oxidative stress: Hydrogen peroxide application with sampling at 0.25 h post-treatment [56]

Tissue Selection and Sampling Strategies

NBS gene expression demonstrates significant tissue specificity, necessitating strategic tissue selection for comprehensive profiling. In cabbage, 37.1% of TNL genes show preferential or specific expression in root tissues, highlighting the importance of including below-ground organs in studies of soil-borne pathogens [52]. Multiple studies incorporate time-course sampling to capture both early and late response genes, as demonstrated in tomato studies where differential expression peaked at varying time points (1-48 hours) depending on the pathogen [56].

Figure 1: Experimental workflow for RNA-seq analysis of NBS genes under stress conditions

RNA-seq Data Analysis Pipelines for NBS Gene Profiling

Computational Workflows and Quality Control

Robust bioinformatic pipelines are essential for accurate assessment of NBS gene expression from RNA-seq data. The following workflow has been validated across multiple plant species:

Data Preprocessing:

Quality Control: FastQC (v0.11.9) for quality assessment and MultiQC for aggregation of results [54]
Adapter Trimming: Cutadapt (v1.15), or Trimmomatic (v0.38) with the settings LEADING:3, TRAILING:3, SLIDINGWINDOW:4:20, and MINLEN:36 [54]
Read Mapping: HISAT2 (v2.1.0) or STAR aligner with default parameters against respective reference genomes [54]

Transcript Quantification:

Assembly: StringTie (v1.3.5) for transcript assembly and merge function for integrated transcript model construction [54]
Normalization: FPKM (Fragments Per Kilobase per Million) for paired-end data or RPKM (Reads Per Kilobase per Million) for single-end data [3] [54]
Differential Expression: Cuffdiff (v2.2.1) or DESeq2 for statistical assessment of differential expression with FDR correction [10]
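The normalization step can be illustrated with the standard FPKM formula, in which the fragment count for a gene is scaled by gene length in kilobases and by the total number of mapped fragments in millions. The function name and the example numbers are illustrative only; tools such as StringTie report these values directly.

```python
def fpkm(fragments, length_bp, total_mapped):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped <= 0:
        raise ValueError("length_bp and total_mapped must be positive")
    # fragments / (length_bp / 1e3) / (total_mapped / 1e6)
    return fragments * 1e9 / (length_bp * total_mapped)

# Hypothetical NBS gene: 500 fragments, 2 kb transcript, 20 million mapped fragments.
value = fpkm(500, 2_000, 20_000_000)
print(value)
```

RPKM uses the same arithmetic with read counts in place of fragment counts, which is why the two coincide for single-end libraries.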

Specialized NBS Gene Analysis:

Orthogroup Classification: OrthoFinder (v2.5.1) with DIAMOND for sequence similarity searches and MCL for clustering [3]
Domain Verification: Integration with Pfam (NB-ARC domain PF00931) and CDD databases to confirm NBS identity [10] [53]

Integration with Alternative Splicing Analysis

Recent evidence indicates that alternative splicing (AS) significantly expands the functional diversity of NBS-LRR genes. Large-scale analyses in pepper identified 1,642,007 AS events across 425 RNA-seq datasets, with biotic stressors generating the most AS events (689,238), followed by abiotic stressors (433,339) [54]. Tools such as rMATS (v4.0.2) effectively classify AS events into five types: exon skipping (SE), intron retention (RI), mutually exclusive exons (MXE), alternative 3' splice sites (A3SS), and alternative 5' splice sites (A5SS) [54].

Table 2: Key Bioinformatics Tools for NBS Gene Expression Analysis

Tool Category	Software/Resource	Key Function	Application in NBS Studies
Read Processing	Trimmomatic, Cutadapt	Adapter trimming, quality filtering	Pre-processing of raw RNA-seq reads [10] [54]
Read Alignment	HISAT2, STAR	Splice-aware alignment to reference	Mapping reads to plant genomes [10] [54]
Transcript Assembly	StringTie	Transcript reconstruction and quantification	Generating expression values for NBS genes [54]
Differential Expression	Cuffdiff, DESeq2	Statistical identification of DEGs	Finding stress-responsive NBS genes [56] [10]
Orthogroup Analysis	OrthoFinder	Clustering of orthologous genes	Identifying conserved NBS orthogroups [3] [15]
Domain Identification	HMMER, Pfam Scan	Protein domain prediction	Verifying NBS domain architecture [3] [10]

Case Studies: NBS Gene Expression Across Plant Species

Comparative Expression Profiling in Crop Species

Multi-species analyses reveal both conserved and species-specific patterns of NBS gene regulation under stress conditions:

Cotton (Gossypium hirsutum):

Orthogroup-based profiling identified putative upregulation of OG2, OG6, and OG15 orthogroups in different tissues under various biotic and abiotic stresses in cotton accessions with contrasting responses to cotton leaf curl disease (CLCuD) [3]
Virus-induced gene silencing (VIGS) of GaNBS (OG2) in resistant cotton demonstrated its critical role in viral titer reduction, validating functional importance [3]
Genetic variation analysis between susceptible (Coker 312) and tolerant (Mac7) accessions identified 6,583 unique NBS gene variants in the tolerant line versus 5,173 in the susceptible line [3]

Tomato (Solanum lycopersicum):

Meta-analysis of 12 transcriptomic studies identified 1,474 DEGs common between biotic and abiotic stress responses, including RLKs, MAPKs, and key transcription factors (MYBs, bZIPs, WRKYs, ERFs) that potentially coregulate NBS genes [56]
Pathogen-specific responses varied significantly, with P. infestans, P. syringae, and S. sclerotiorum inducing >10,000 DEGs, while TSWV infection resulted in only 1,490 DEGs at peak response [56]

Sweet Orange (Citrus sinensis):

Expression profiling under Penicillium digitatum infection and abiotic stresses revealed differential regulation of specific NBS-LRR genes, providing candidate genes for disease resistance breeding [53]
Promoter analysis identified numerous cis-elements responsive to defense signals and phytohormones in NBS gene promoters, suggesting complex regulatory networks [53]

Temporal Dynamics of NBS Gene Expression

Time-course analyses reveal sophisticated temporal regulation of NBS genes following stress imposition:

In cotton, expression of specific NBS orthogroups showed distinct temporal patterns in susceptible versus tolerant accessions following CLCuD infection, with early induction correlating with resistance [3]
Tomato response to Ralstonia solanacearum showed progressively increasing DEG numbers from 1,178 at 1 dpi to 4,381 at 2 dpi, illustrating dynamic transcriptome reprogramming [56]
Pepper responses to bacterial pathogens (Xanthomonas species) demonstrated rapid induction of specific NBS genes within 3-6 hours post-inoculation, highlighting early defense activation [54]

Figure 2: Temporal dynamics of NBS-mediated defense signaling following stress perception

Table 3: Key Research Reagent Solutions for NBS Gene Expression Studies

Reagent/Resource	Specifications	Application	Example Use
RNA Extraction Kits	TRIzol reagent, column-based kits	High-quality RNA isolation from plant tissues	Pepper leaf samples for stress-responsive AS analysis [54]
Library Prep Kits	Strand-specific libraries, insert size 150-200 bp	RNA-seq library construction	132 cDNA libraries for pepper transcriptome analysis [54]
Sequencing Platforms	Illumina HiSeq 2500/X Ten, 101-151 bp reads	High-throughput transcriptome sequencing	Various platforms used for pepper stress studies [54]
Reference Genomes	Species-specific genome assemblies/annotations	Read mapping and transcript quantification	C. annuum v1.6, C. sinensis v3.0 genomes [54] [53]
Domain Databases	Pfam (PF00931), CDD, InterPro	NBS domain identification and verification	HMMER searches with PF00931 model [3] [10]
VIGS Vectors	Tobacco rattle virus (TRV)-based systems	Functional validation through gene silencing	GaNBS silencing in cotton for functional analysis [3]

RNA-seq technologies have revolutionized our understanding of NBS gene expression dynamics under biotic and abiotic stresses, revealing complex regulatory networks and species-specific adaptation strategies. The integration of large-scale transcriptomic datasets with orthogroup classification and functional validation approaches has enabled researchers to identify key NBS genes governing stress responses across diverse plant species.

Future research directions should focus on:

Multi-omics integration combining transcriptomic, proteomic, and epigenomic data to comprehensively understand NBS gene regulation
Single-cell RNA-seq applications to resolve cell-type-specific NBS expression patterns during stress responses
Machine learning approaches to predict functional NBS genes based on expression signatures and protein features
Cross-species conserved orthogroup analysis to identify universal stress-responsive NBS genes with potential for translational applications

The continuing refinement of RNA-seq methodologies and analytical frameworks should accelerate the discovery and use of NBS genes in crop improvement programs, contributing to agricultural sustainability and food security in the face of mounting environmental challenges.

Cotton leaf curl disease (CLCuD) is a major threat to global cotton production, with yield losses estimated at 80-87% in severe epidemics [57]. The disease is caused by a complex of single-stranded DNA begomoviruses (family Geminiviridae) transmitted by the whitefly Bemisia tabaci [58] [59]. Characteristic symptoms include leaf curling, vein thickening, enation formation, and plant stunting [57] [58]. While the widely cultivated Gossypium hirsutum varieties are highly susceptible, sources of tolerance exist within adapted germplasm lines such as Mac7, and strong resistance is found in the diploid species Gossypium arboreum [60] [3]. This case study compares the expression dynamics of Nucleotide-Binding Site (NBS) disease resistance genes between CLCuD-susceptible and tolerant cotton varieties, in the context of broader research on NBS genes across plant species.

Experimental Protocols

Plant Materials and Disease Screening

Research typically employs comparative designs using genetically distinct cotton genotypes. Standard protocols utilize:

Resistant/Tolerant Genotypes: G. arboreum accessions (e.g., 'Ravi'), G. hirsutum Mac7 derivatives, and other breeding lines with confirmed resistance [60] [57] [3].
Susceptible Controls: G. hirsutum varieties such as 'Coker 312' and 'Karishma' [3] [58].
Screening Methods:
- Field Evaluation: Using single plant progeny rows (SPPRs) under natural whitefly infestation with disease rating scales (0-6), where 0 = no symptoms and 6 = severe leaf curling and enations with significant yield loss [57].
- Glasshouse Inoculation: Graft-inoculation or controlled whitefly-mediated transmission of CLCuV, with symptoms assessed 90 days post-inoculation [57] [58].

Molecular and Biochemical Profiling

Resistance Gene Analogue (RGA) Identification

Degenerate Primer Design: Conserved regions of NBS-LRR resistance gene sequences from databases are aligned to design degenerate primers, typically 24-mer with maximum four degeneracies per primer and no degeneracy at the 3' end [60].
PCR Amplification and Cloning: Genomic DNA PCR products from resistant and susceptible genotypes are cloned into TA vectors, transformed into E. coli, and sequenced [60].
Sequence Analysis: BLAST searches against genomic databases (e.g., CottonGen) identify homologous sequences and chromosomal locations [60].
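
The primer constraints described above (24-mer, at most four degenerate positions, none at the 3' end) can be checked programmatically. The following is a minimal sketch, not the procedure used in [60]; the function names and the example primer are hypothetical and serve only to illustrate the rules.

```python
# IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degenerate_positions(primer):
    """Return 0-based indices of positions that encode more than one base."""
    return [i for i, base in enumerate(primer.upper()) if len(IUPAC[base]) > 1]


def fold_degeneracy(primer):
    """Number of distinct sequences represented by the degenerate primer."""
    fold = 1
    for base in primer.upper():
        fold *= len(IUPAC[base])
    return fold


def passes_design_rules(primer, length=24, max_degenerate=4):
    """Check length, number of degenerate positions, and a fixed 3' terminal base."""
    positions = degenerate_positions(primer)
    return (
        len(primer) == length
        and len(positions) <= max_degenerate
        and (len(primer) - 1) not in positions
    )


# Hypothetical primers for illustration only (not taken from any study).
candidate = "GGAGGNATGGGNAARACNACTCTG"   # four degenerate positions, fixed 3' end
rejected = "GGAGGNATGGGNAARACNACTCTN"    # five degenerate positions, degenerate 3' end

print(passes_design_rules(candidate), fold_degeneracy(candidate))
print(passes_design_rules(rejected))
```

Reporting the fold degeneracy alongside the pass/fail result is a design choice: two primers can both satisfy the four-position limit while differing widely in how many sequences they actually represent.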

Transcriptomic Profiling

RNA Sequencing: Total RNA is isolated from leaf tissues of control and CLCuD-infected plants. Strand-specific cDNA libraries are prepared and sequenced using Illumina platforms (e.g., HiSeq 2500) [58].
Differential Expression Analysis: HISAT2 aligns reads to reference genomes, followed by differential gene expression analysis using Cufflinks/Cuffdiff with FPKM normalization and a significance cutoff of q-value < 0.05 [58].
Validation: RT-qPCR validates expression patterns of selected genes in independent samples [58].
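
To make the normalization and cutoff in the differential expression step concrete, the sketch below computes FPKM from raw fragment counts and applies a q-value threshold of 0.05. It is an illustration under stated assumptions: the Benjamini-Hochberg adjustment is used here as a stand-in for the q-values a tool such as Cuffdiff reports, and all counts and p-values are invented.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical example: 500 fragments on a 2 kb gene, 20 million mapped fragments.
example_fpkm = fpkm(500, 2000, 20_000_000)

# Hypothetical raw p-values for four genes.
p_values = [0.001, 0.02, 0.04, 0.5]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]

print(example_fpkm)
print(q_values)
print(significant)
```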

Biochemical Profiling

Antioxidant Assays: Spectrophotometric measurements of peroxidase (POD), ascorbate peroxidase (APX), catalase (CAT), and superoxide dismutase (SOD) activities [57].
Metabolite Quantification: Total phenolic content (TPC), tannins, total oxidant status (TOS), total soluble proteins (TSP), and malondialdehyde (MDA) levels are measured to assess oxidative stress and defense responses [57].

Genetic Mapping and QTL Analysis

Population Development: F₂ populations are developed from crosses between resistant and susceptible parents [61] [62].
Genotyping: High-density genotyping using platforms like CottonSNP63K array [61].
QTL Mapping: Composite interval mapping identifies genomic regions associated with CLCuD resistance, with LOD score thresholds typically set at 2.0-3.0 [61] [62].
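
As a rough illustration of what a LOD score measures, the sketch below computes a single-marker regression LOD, that is, the log10 likelihood ratio between a model with a genotype effect and one without. This is a deliberate simplification: composite interval mapping as used in [61] [62] additionally conditions on background markers and scans positions between markers. The phenotype values are invented disease ratings.

```python
import math


def lod_single_marker(phenotypes, genotypes):
    """LOD = (n / 2) * log10(RSS_null / RSS_marker) for one marker.

    Assumes normally distributed residuals; genotypes are treated as
    unordered classes (for example AA, AB, BB in an F2 population).
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    groups = {}
    for y, g in zip(phenotypes, genotypes):
        groups.setdefault(g, []).append(y)

    rss_marker = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        rss_marker += sum((y - group_mean) ** 2 for y in values)

    if rss_marker == 0:
        return math.inf  # genotype explains the phenotype perfectly
    return (n / 2) * math.log10(rss_null / rss_marker)


# Hypothetical disease ratings (0-6 scale) for twelve F2 plants.
ratings = [1, 2, 1, 2, 3, 3, 4, 2, 5, 4, 5, 6]
marker = ["AA"] * 4 + ["AB"] * 4 + ["BB"] * 4

lod = lod_single_marker(ratings, marker)
print(round(lod, 2), lod > 3.0)
```

In practice the significance threshold is usually set by permutation testing rather than fixed in advance; the 2.0-3.0 range quoted above is the conventional shortcut.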

The following workflow diagram integrates these key experimental approaches for studying CLCuD resistance:

Comparative Expression Dynamics of NBS Genes

Genomic Architecture of Cotton NBS Genes

Comprehensive genomic analyses reveal significant diversity in NBS-encoding genes among cotton species. A systematic study identified 12,820 NBS-domain-containing genes across 34 plant species, classified into 168 distinct domain architecture classes [3]. In the diploid cotton G. raimondii, 355 NBS-encoding resistance genes were identified, characterized by high proportions of non-regular NBS genes and diverse N-terminal domains [63]. Orthogroup analysis revealed 603 orthogroups (OGs), with certain core OGs (OG0, OG1, OG2) demonstrating conservation across species, while others (OG80, OG82) showed species-specific patterns [3].

The structural diversity of NBS genes includes classical architectures (NBS, NBS-LRR, TIR-NBS, TIR-NBS-LRR) and species-specific patterns (TIR-NBS-TIR-Cupin1-Cupin1, TIR-NBS-Prenyltransf) [3]. Phylogenetic comparisons indicate that TIR-NBS-LRR genes in cotton follow distinct evolutionary patterns compared to non-TIR NBS genes and exhibit species-specific characteristics that differ from TIR genes in other plants [63].

Expression Profiling in Susceptible and Tolerant Varieties

Transcriptomic analyses reveal distinct expression dynamics of NBS genes between CLCuD-responsive cotton varieties:

Resistant/Tolerant Varieties: Show coordinated upregulation of specific NBS gene orthogroups. OG2, OG6, and OG15 demonstrate significant induction in resistant accessions like Mac7 following CLCuV infection [3]. In G. arboreum, five identified resistance gene analogues (RM1, RM6, RM8, RM12, and RM31) showed homology with known R genes, with one fragment exhibiting 94% homology with G. raimondii toll/interleukin receptor-like protein [60].
Susceptible Varieties: Exhibit different expression patterns, with general suppression of defense-related genes. Transcriptome analysis of susceptible G. hirsutum 'Karishma' identified 468 differentially expressed genes (DEGs) upon whitefly-mediated CLCuD infection, with 248 downregulated genes enriched in cellular processes [58]. This systematic under-expression potentially facilitates viral establishment and disease progression.

Table 1: NBS Gene Expression Profiles in Cotton Varieties with Differential CLCuD Response

Gene Orthogroup	Expression in Resistant	Expression in Susceptible	Putative Function
OG2	Strong upregulation	No significant change	TIR-NBS-LRR class
OG6	Moderate upregulation	Downregulation	CC-NBS-LRR class
OG15	Moderate upregulation	Slight downregulation	NBS-LRR class
RGA 395	No change	No change	Constitutive expression
RM1	Upregulated	Not detected	TIR-like domain

Genetic Variation in NBS Genes

Comparative genomic analysis between susceptible (Coker 312) and tolerant (Mac7) G. hirsutum accessions reveals substantial sequence variation in NBS genes:

Mac7: Contains 6,583 unique variants in NBS genes [3]
Coker 312: Contains 5,173 unique variants in NBS genes [3]

Protein-ligand and protein-protein interaction studies demonstrate strong binding of specific NBS proteins from resistant varieties with ADP/ATP and different core proteins of the cotton leaf curl disease virus, suggesting direct interaction mechanisms [3]. Functional validation through virus-induced gene silencing (VIGS) of GaNBS (OG2) in resistant cotton demonstrated increased viral titers, confirming its role in virus resistance [3].

Biochemical and Physiological Response Profiles

Comparative biochemical profiling reveals distinct defense responses between CLCuD-resistant and susceptible cotton varieties:

Table 2: Biochemical Profiles of CLCuD-Resistant and Susceptible Cotton Varieties Under Field and Glasshouse Conditions

Parameter	Resistant Varieties	Susceptible Varieties
Antioxidant Enzymes
- Peroxidase (POD)	Higher by 3% (field) and ~62% (glasshouse)	Lower activity
- Ascorbate Peroxidase (APX)	Higher by 8% (field) and ~6% (glasshouse)	Lower activity
- Catalase (CAT)	Higher by 32% (field) and 15% (glasshouse)	Lower activity
- Superoxide Dismutase (SOD)	Lower by 25% (field); higher by 3% (glasshouse)	Variable activity
Metabolites
- Total Phenolic Content	Moderate increase	Significantly elevated
- Tannins	Moderate increase	Significantly elevated
- Malondialdehyde (MDA)	Lower levels	Elevated levels
- Total Soluble Proteins	Stable	Elevated
Photosynthetic Pigments
- Chlorophyll a	Higher levels maintained	Reduced levels
- Chlorophyll b	Higher levels maintained	Reduced levels
- Lycopene	Elevated in resistant varieties	Reduced levels

Under field conditions, resistant varieties show higher CAT, POD, and APX activities than susceptible lines (by 32%, 3%, and 8%, respectively), while SOD activity is 25% lower [57]. Under controlled glasshouse conditions the pattern shifts: POD and APX activities are approximately 62% and 6% higher in resistant genotypes, respectively, while CAT and SOD are 15% and 3% higher [57].

Principal component analysis (PCA) of the field data indicated that five key factors accounted for 80.26% of the variation observed among genotypes, while the corresponding analysis of the glasshouse data explained 74.24% of the cumulative variability [57]. These biochemical markers help differentiate resistance mechanisms and provide measurable indicators for breeding programs.

Genetic Mapping of CLCuD Resistance Loci

Quantitative Trait Loci (QTL) Associated with CLCuD Resistance

Genetic mapping studies have identified multiple QTLs associated with CLCuD resistance across different cotton populations:

Table 3: Identified QTLs and Genomic Regions Associated with CLCuD Resistance

QTL Name	Chromosome	Population	LOD Score	Phenotypic Variance	Reference
qCLCVa1	Chr09	(G. barbadense × G. anomalum) × G. hirsutum F₂	3.36	18%	[62]
qCLCVa2	Chr09	(G. barbadense × G. anomalum) × G. hirsutum F₂	3.26	18%	[62]
MQTLchr7-1	Chr07	Meta-analysis of 50 studies	-	-	[64]
MQTLchr14-1	Chr14	Meta-analysis of 50 studies	-	-	[64]
MQTLchr24-1	Chr24	Meta-analysis of 50 studies	-	-	[64]
Unnamed	A01 (Chr15)	Synthetic tetraploid × G. hirsutum	-	-	[61]
Unnamed	D07 (Chr16)	Synthetic tetraploid × G. hirsutum	-	-	[61]

A recent meta-analysis integrating 2,864 QTLs from 50 independent studies identified 75 meta-QTLs (MQTLs) with reduced confidence intervals, including 14 MQTLs reported for the first time [64]. Stable MQTL clusters such as MQTLchr7-1, MQTLchr14-1, and MQTLchr24-1 harbor loci for key fiber quality and stress tolerance traits [64].

Candidate Resistance Genes

Candidate gene analysis within MQTL regions identified 75 genes, 38 of which carried significant gene ontology terms related to lignin catabolism, flavin binding, and stress responses [64]. Notable candidates include:

GhLAC-4: Involved in lignin catabolism
GhCTL2: Chitinase-like protein
UDP-glycosyltransferase 92A1: Potential role in fiber development and abiotic stress tolerance

Kompetitive allele specific PCR (KASP) markers have been developed and validated for a subset of QTLs, enabling marker-assisted selection for CLCuD resistance [61].

The Scientist's Toolkit: Essential Research Reagents

Table 4: Key Research Reagent Solutions for CLCuD Resistance Studies

Reagent Category	Specific Examples	Application/Function
Molecular Markers	CottonSNP63K array [61]	High-density genotyping for QTL mapping
	SSR markers [61]	Genetic mapping and diversity analysis
	KASP markers [61]	Marker-assisted selection for resistance breeding
Sequencing Tools	Illumina HiSeq 2500 [58]	RNA-Seq for transcriptome analysis
	Degenerate primers for RGAs [60]	Amplification of resistance gene analogues
Cloning Systems	TA vector (pTZ57R/T) [60]	Cloning of PCR fragments for sequencing
	Escherichia coli DH5α competent cells [60]	Transformation and plasmid propagation
Biochemical Assays	Antioxidant enzyme activity kits [57]	Quantification of POD, APX, CAT, SOD activities
	Total phenolic content assay [57]	Measurement of defense-related metabolites
	Malondialdehyde (MDA) assay [57]	Lipid peroxidation and oxidative stress assessment
Functional Validation	VIGS vectors [3]	Virus-induced gene silencing for functional studies
	Grafting materials [57]	Controlled virus transmission studies

Integrated Model of CLCuD Resistance Mechanisms

The following diagram illustrates the integrated signaling pathways and molecular interactions underlying CLCuD resistance in cotton:

This integrated model highlights the multi-layered defense strategy in CLCuD-resistant cotton varieties, emphasizing the crucial role of NBS genes in orchestrating both immediate and systemic responses to viral infection.

This case study demonstrates distinct expression dynamics of NBS genes between CLCuD-susceptible and tolerant cotton varieties, contextualized within broader research on NBS genes across plant species. Resistant genotypes exhibit coordinated upregulation of specific NBS orthogroups (OG2, OG6, OG15), enhanced antioxidant systems, and activation of multi-layered defense responses. The identification of stable QTLs and candidate genes, coupled with advanced research reagents and methodologies, provides valuable resources for marker-assisted breeding and functional genomics. These findings advance our understanding of plant-virus interactions and contribute to the development of durable CLCuD resistance in cotton, supporting global efforts to sustain cotton fiber security.

In the genomic landscape of plant disease resistance, promoter regions serve as critical control centers where the complex interplay between pathogen invasion and defense activation is coordinated. Nucleotide-binding site leucine-rich repeat (NBS-LRR) genes, which constitute the largest family of plant resistance (R) genes, rely heavily on these regulatory regions to mount timely and effective immune responses [65] [3]. The identification and characterization of cis-regulatory elements within these promoters have emerged as fundamental to understanding how plants fine-tune their defense mechanisms against rapidly evolving pathogens.

Promoter analysis reveals short, non-coding DNA sequences known as cis-elements that function as molecular switches, controlling when, where, and to what extent defense genes are activated in response to both external threats and internal signaling molecules [66]. These elements achieve this by serving as binding platforms for transcription factors, proteins that in turn regulate the expression of downstream genes. Within the context of plant immunity, the strategic placement and combination of these elements enable precise orchestration of defense programs, ensuring that energetically costly immune responses are deployed only when necessary and with appropriate intensity [67].

Comparative analysis of NBS genes across plant species reveals remarkable conservation of certain regulatory architectures while also highlighting species-specific adaptations. This article provides a guide to the experimental approaches, findings, and resources central to dissecting these regulatory codes, with particular emphasis on their role in mediating defense and hormone-responsive pathways.

Core Concepts: Cis-Element Architecture in NBS Gene Regulation

Definition and Functional Classes

Cis-regulatory elements are typically 5-15 base pairs in length and are often located within 1.5 kilobases upstream of the transcription start site [66]. In NBS-LRR genes, these elements can be categorized into two primary functional classes based on their response triggers:

Defense-Responsive Elements: These elements respond to pathogen invasion and include W-boxes (bound by WRKY transcription factors), MYB-binding sites, and other elements that mediate transcriptional responses to conserved molecular patterns from pathogens or danger signals from damaged host tissues.
Hormone-Responsive Elements: These elements mediate responses to defense-related phytohormones such as salicylic acid (SA), jasmonic acid (JA), and ethylene (ET). Key examples include TGACG-motifs (JA-responsive), as-1 elements (SA-responsive), and GCC-boxes (ET-responsive) [67].

The functional significance of these elements lies in their combinatorial arrangement. Rather than functioning in isolation, they form integrated cis-regulatory modules (CRMs) that process multiple signaling inputs simultaneously [66]. This modular architecture allows plants to tailor specific defense responses to different pathogen types and to integrate defense signaling with other physiological processes.
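
To illustrate how short cis-elements can be located in a promoter, the sketch below scans both strands of a sequence for three simplified consensus motifs: the W-box core TTGAC(C/T), the GCC-box core GCCGCC, and the TGACG-motif. This is an assumption-laden toy, not how PlantCARE works internally; curated databases use larger motif collections and their own matching rules, and the promoter sequence here is invented.

```python
import re

# Simplified consensus patterns (assumed for illustration).
MOTIFS = {
    "W-box": "TTGAC[CT]",
    "GCC-box": "GCCGCC",
    "TGACG-motif": "TGACG",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_promoter(seq):
    """Return {motif: [(forward_start, strand), ...]} with 0-based starts."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        regex = re.compile("(?=(%s))" % pattern)
        found = [(m.start(), "+") for m in regex.finditer(seq)]
        for m in regex.finditer(rc):
            # Map the minus-strand match back to forward-strand coordinates.
            start = len(seq) - (m.start() + len(m.group(1)))
            found.append((start, "-"))
        hits[name] = sorted(found)
    return hits


# Hypothetical 25 bp promoter fragment.
promoter = "AATTGACCAAGCCGCCTTCGTCAAT"
hits = scan_promoter(promoter)
print(hits)
```

Because motifs this short occur frequently by chance, hits of this kind are best treated as candidates for the combinatorial and experimental analyses described in this section rather than as evidence of function.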

Analytical Workflow for Cis-Element Identification

The standard pipeline for identifying and characterizing cis-elements in NBS gene promoters involves a series of bioinformatic and experimental steps, as visualized below:

Figure 1: Experimental workflow for comprehensive promoter analysis, integrating bioinformatic predictions with experimental validation.

Comparative Analysis of Cis-Elements Across Plant Species

Case Studies in Diverse Species

Nicotiana benthamiana: A recent genome-wide analysis of 156 NBS-LRR genes identified 29 shared types of cis-regulatory elements, with four kinds unique to irregular-type NBS-LRR genes, suggesting that these promoter elements are potentially important upstream regulatory factors. In the same study, subcellular localization predictions placed 121 NBS-LRRs in the cytoplasm, 33 in the plasma membrane, and 12 in the nucleus [65].

Rosa chinensis: In rose, promoter analysis of 96 TNL genes identified cis-elements responsive to gibberellin, jasmonic acid, and salicylic acid. Specific RcTNL genes, particularly RcTNL23, responded significantly to all three hormones and to three pathogens (Botrytis cinerea, Podosphaera pannosa, and Marssonina rosae) [67].

Oryza sativa (Rice): Research on broad-spectrum defense response (BS-DR) genes in rice revealed that specific cis-regulatory modules (CRMs) are enriched in the promoters of co-expressed defense genes. Polymorphisms in these CRMs between resistant and susceptible haplotypes provide evidence that these regulatory architectures predict the effectiveness of the defense response [66].

Dendrobium officinale: Analysis of NBS-LRR genes in this medicinal orchid identified promoter cis-elements involved in the ETI system, plant hormone signal transduction, and Ras signaling pathways. Transcriptome analysis following salicylic acid treatment identified 1,677 differentially expressed genes, including six NBS-LRR genes that were significantly upregulated [11].

Quantitative Comparison of Cis-Element Distribution

Table 1: Distribution of Key Cis-Element Classes Across Plant Species

Plant Species	Total NBS Genes Analyzed	Hormone-Responsive Elements	Defense-Responsive Elements	Unique/Specialized Elements	Primary Analysis Tool
Nicotiana benthamiana	156	JA, SA, GA-responsive	W-boxes, MYB-binding sites	4 types unique to irregular-type NBS-LRR	PlantCARE
Rosa chinensis	96 (TNL only)	GA, JA, SA-responsive (RcTNL23)	Fungal pathogen-responsive	Elements specific to black spot response	PlantCARE
Oryza sativa	BS-DR gene cluster (385)	JA, SA, BTH-responsive	MAMP-responsive modules	CRMs predictive of resistance	Custom CRM identification
Dendrobium officinale	22 (NBS-LRR)	SA-responsive (6 genes)	ETI system elements	Ras signaling pathway elements	PlantCARE

Table 2: Experimental Validation Methods for Cis-Element Function

Validation Method	Technical Approach	Information Gained	Case Study Example
Expression Profiling	RNA-seq, qRT-PCR under hormone/pathogen treatment	Expression dynamics and co-expression patterns	D. officinale response to SA treatment [11]
Promoter Mutagenesis	Targeted mutation of specific cis-elements	Functional necessity of specific elements	Rice BS-DR gene haplotype analysis [66]
Protein-DNA Interaction	EMSA, ChIP-seq	Direct transcription factor binding	No example in the studies reviewed here
Genetic Mapping	QTL analysis with promoter polymorphism screening	Association between natural variation and resistance	Rice blast resistance QTLs [66]

Experimental Protocols and Methodologies

Standardized Workflow for Promoter Analysis

Protocol 1: Genome-Wide Identification of Cis-Elements in NBS Genes

Promoter Sequence Extraction: Obtain 1500-2000 bp genomic sequences upstream of the translation start site (ATG) of identified NBS-LRR genes from genome annotation files [65] [67].
In Silico Screening: Utilize the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) or PLACE database to identify putative cis-elements within the promoter sequences. Use default parameters with significance thresholds based on position-specific scoring matrices [65] [67].
Element Classification: Categorize identified elements into functional groups: hormone-responsive (JA, SA, ABA, GA, ET), defense-responsive (MYB, WRKY, MYC binding sites), stress-responsive (DRE, LTRE), and development-related elements.
Visualization: Employ visualization tools such as TBtools to generate schematic diagrams of cis-element distribution across different promoter types [65].
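
The promoter extraction step of Protocol 1 must account for gene orientation: for a minus-strand gene, the upstream region lies at higher genomic coordinates and has to be reverse-complemented. The sketch below shows one way to do this on an in-memory sequence; the function name, the 1-based inclusive coordinate convention, and the toy chromosome are assumptions for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, cds_start, cds_end, strand, length=2000):
    """Return up to `length` bp upstream of the start codon, 5'->3' on the gene strand.

    cds_start and cds_end are 1-based inclusive forward-strand coordinates,
    as in GFF3 annotations. The region is truncated at chromosome ends.
    """
    if strand == "+":
        begin = max(0, cds_start - 1 - length)
        return chrom_seq[begin:cds_start - 1]
    if strand == "-":
        end = min(len(chrom_seq), cds_end + length)
        return reverse_complement(chrom_seq[cds_end:end])
    raise ValueError("strand must be '+' or '-'")


# Toy chromosome: 7 bp flank, 9 bp CDS (positions 8-16), 6 bp flank.
chrom = "GATTACA" + "ATGAAATAG" + "CCGGTT"

plus_promoter = upstream_region(chrom, 8, 16, "+", length=5)
minus_promoter = upstream_region(chrom, 8, 16, "-", length=5)
truncated = upstream_region(chrom, 8, 16, "+", length=100)
print(plus_promoter, minus_promoter, truncated)
```

Truncating at chromosome ends rather than raising an error is a design choice that keeps genes near scaffold boundaries in the analysis; a stricter pipeline might instead also trim the region where it runs into a neighboring gene.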

Protocol 2: Expression Correlation Analysis

Treatment Design: Subject plant materials to hormone treatments (SA, JA, GA) or pathogen inoculations using standardized concentration and time course experiments [67] [11].
RNA Sequencing: Extract total RNA from treated tissues, construct cDNA libraries, and perform RNA-seq analysis with appropriate biological replicates.
Differential Expression: Identify differentially expressed genes (DEGs) using standard pipelines (e.g., |log2FoldChange| > 1, FDR < 0.05) and correlate expression patterns with cis-element profiles [11].
Co-expression Analysis: Perform weighted gene co-expression network analysis (WGCNA) to identify modules of co-expressed genes and their association with specific cis-regulatory motifs [11].
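
One simple way to relate the DEG thresholds in Protocol 2 to cis-element profiles is an enrichment test: are genes carrying a given motif over-represented among the DEGs? The sketch below applies the |log2FoldChange| > 1 and FDR < 0.05 cutoffs and then computes a one-sided hypergeometric p-value. The gene names, values, and the choice of a hypergeometric test are assumptions for illustration; the cited studies may have used other approaches.

```python
import math


def is_deg(log2_fold_change, fdr, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Apply the fold-change and FDR thresholds described in the protocol."""
    return abs(log2_fold_change) > lfc_cutoff and fdr < fdr_cutoff


def hypergeom_upper_tail(k, n_draws, n_success, n_total):
    """P(X >= k) when drawing n_draws items from n_total containing n_success."""
    upper = min(n_draws, n_success)
    total = math.comb(n_total, n_draws)
    favorable = sum(
        math.comb(n_success, i) * math.comb(n_total - n_success, n_draws - i)
        for i in range(k, upper + 1)
    )
    return favorable / total


# Hypothetical genes: (log2 fold change, FDR, promoter carries the motif?)
genes = {
    "NBS01": (2.3, 0.001, True),
    "NBS02": (1.8, 0.01, True),
    "NBS03": (-1.5, 0.03, True),
    "NBS04": (3.1, 0.0004, True),
    "NBS05": (0.4, 0.6, True),
    "NBS06": (0.9, 0.01, False),   # fails the fold-change cutoff
    "NBS07": (1.6, 0.2, False),    # fails the FDR cutoff
    "NBS08": (-0.2, 0.8, False),
    "NBS09": (0.1, 0.9, False),
    "NBS10": (-0.7, 0.4, False),
}

degs = [g for g, (lfc, fdr, _) in genes.items() if is_deg(lfc, fdr)]
with_motif = [g for g, (_, _, motif) in genes.items() if motif]
degs_with_motif = [g for g in degs if genes[g][2]]

p_enrichment = hypergeom_upper_tail(
    len(degs_with_motif), len(degs), len(with_motif), len(genes)
)
print(degs, round(p_enrichment, 4))
```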

Signaling Pathways in Plant Immunity Regulation

The cis-elements identified in NBS gene promoters function as integration points for multiple defense signaling pathways, with salicylic acid playing a particularly crucial role in mediating TNL-type gene expression, as illustrated below:

Figure 2: Defense signaling pathways converging on promoter cis-elements to activate NBS-LRR gene expression and plant immunity.

Table 3: Key Research Reagent Solutions for Promoter Analysis

Resource Category	Specific Tool/Database	Function/Application	Access Information
Promoter Databases	PlantCARE	Cis-element identification and annotation	http://bioinformatics.psb.ugent.be/webtools/plantcare/html/
Genomic Resources	PlantGDB	Plant genome database with analytical tools	http://www.plantgdb.org/ [68]
Resistance Gene Databases	PRGdb 4.0	Curated database of plant resistance genes	https://www.prgdb.org/ [69]
Sequence Analysis	TBtools	Bioinformatics software for genomic analysis	Available from GitHub
Domain Identification	Pfam Database	Protein domain identification (NB-ARC: PF00931)	http://pfam.sanger.ac.uk/ [65]
Motif Analysis	MEME Suite	Discovery of conserved protein motifs	https://meme-suite.org/ [65]

The analysis of defense and hormone-responsive cis-elements in NBS gene promoters has moved beyond basic research to become a practical tool in modern crop improvement programs. The comparative approach across species reveals both conserved regulatory principles and species-specific innovations that can be exploited for engineering broad-spectrum disease resistance. The experimental data and protocols compiled in this guide provide researchers with a standardized framework for dissecting these regulatory codes in both model and crop species.

The emerging paradigm shift from analyzing single genes to understanding entire cis-regulatory modules represents the future of promoter analysis in plant immunity research [66]. As genome sequencing technologies continue to advance, enabling more chromosome-scale assemblies across diverse plant taxa, our ability to identify predictive CRM signatures that correlate with effective defense responses will become increasingly sophisticated. This knowledge, in turn, will empower more precise breeding strategies aimed at pyramiding not only functional R genes but also their optimal regulatory architectures, ultimately leading to more durable and broad-spectrum disease resistance in agricultural systems.

Navigating Challenges in NBS Gene Analysis and Functional Characterization

Overcoming High Sequence Diversity and Gene Copy Number Variation

Nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes represent one of the largest and most critical plant gene families, encoding intracellular immune receptors that confer resistance to diverse pathogens. These genes exhibit remarkable sequence diversity and extensive copy number variation (CNV) across plant species, presenting significant challenges for their characterization and functional analysis. This comparative guide examines the experimental strategies and bioinformatic tools that enable researchers to overcome these obstacles, facilitating robust cross-species analyses of NBS gene evolution and function. Understanding these dynamic genetic elements is essential for advancing crop improvement programs and enhancing sustainable agricultural production.

Table 1: Documented Patterns of NBS-LRR Gene Copy Number Variation Across Plant Species

Plant Species/Family	Number of NBS-LRR Genes	Evolutionary Pattern	Key Features	Citation
Rosaceae (12 species)	2,188 total (highly variable)	Distinct patterns: "continuous expansion" (R. chinensis), "first expansion then contraction" (R. occidentalis)	Independent gene duplication/loss events after speciation	[12]
Ipomoea batatas (sweet potato)	889	Higher segmental duplications	83% of genes occur in clusters	[7]
Ipomoea trifida	554	Higher tandem duplications	77% of genes occur in clusters	[7]
Brassica napus (canola)	563 RGAs with CNVs	CNVs more frequent in clustered RGAs	25 of 112 blackleg resistance QTLs affected by CNV	[70]
Orchidaceae	Varies by species (20-fold difference)	"Early contraction to recent expansion" (P. equestris) or "contraction" (G. elata)	Extreme variation in gene number between species	[3] [12]
Solanaceae	Varies by species	"Continuous expansion" (potato), "expansion then contraction" (tomato), "shrinking" (pepper)	Lineage-specific evolutionary patterns	[12]

Understanding Sequence Diversity and CNV in NBS Genes

Fundamental Characteristics of NBS Gene Diversity

NBS-LRR genes encode intracellular immune receptors that recognize pathogen effectors and activate plant defense responses. These genes are characterized by three fundamental domains: a variable N-terminal domain (TIR, CC, or RPW8), a central nucleotide-binding site (NBS), and C-terminal leucine-rich repeats (LRRs) [71] [3]. The NBS domain contains several conserved motifs (P-loop, RNBS-A, etc.) that facilitate phylogenetic analysis and classification, while the LRR domain evolves rapidly under diversifying selection to recognize diverse pathogens [72].

This gene family exhibits extraordinary structural diversity, with 168 distinct domain architecture patterns identified across 34 plant species [3]. These include not only classical configurations (NBS, NBS-LRR, TIR-NBS-LRR) but also species-specific structural patterns (TIR-NBS-TIR-Cupin_1, TIR-NBS-Prenyltransf) that reflect lineage-specific evolutionary innovations [3].

Evolutionary Drivers of Diversity

The extensive sequence diversity and CNV in NBS genes result primarily from evolutionary arms races with rapidly evolving pathogens. Several molecular mechanisms drive this diversification:

Birth-and-death evolution: New genes are created by duplication, followed by divergent evolution or pseudogenization [72]
Tandem duplications: Frequent local duplications create gene clusters [7] [12]
Segmental duplications: Large-scale genomic rearrangements duplicate NBS gene arrays [7]
Diversifying selection: Particularly on solvent-exposed residues in LRR domains that interact with pathogens [71] [72]

This dynamic evolution results in substantial variation in NBS gene numbers across species, ranging from fewer than 100 to over 1,000 copies per genome [71]. Even within the Rosaceae family, NBS-LRR genes display distinct evolutionary patterns including "continuous expansion," "first expansion then contraction," and "expansion followed by contraction, then further expansion" [12].
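
The proportion of NBS genes residing in clusters, as reported for the Ipomoea species in Table 1, depends on how a cluster is defined. The sketch below uses a simple distance rule: consecutive NBS genes on the same chromosome separated by no more than 200 kb belong to one cluster. The 200 kb value and the gene coordinates are assumptions for illustration; the cited studies may apply different criteria, such as a limit on intervening non-NBS genes.

```python
def cluster_nbs_genes(genes, max_gap=200_000):
    """Group genes into clusters of two or more members.

    genes: list of (gene_id, chrom, start, end) tuples.
    Returns a list of clusters, each a list of gene IDs in genomic order.
    """
    by_chrom = {}
    for gene in genes:
        by_chrom.setdefault(gene[1], []).append(gene)

    clusters = []
    for chrom_genes in by_chrom.values():
        chrom_genes.sort(key=lambda g: g[2])
        current = [chrom_genes[0]]
        for gene in chrom_genes[1:]:
            # Gap between the previous gene's end and this gene's start.
            if gene[2] - current[-1][3] <= max_gap:
                current.append(gene)
            else:
                if len(current) > 1:
                    clusters.append([g[0] for g in current])
                current = [gene]
        if len(current) > 1:
            clusters.append([g[0] for g in current])
    return clusters


# Hypothetical NBS gene coordinates.
genes = [
    ("g1", "chr1", 1_000, 5_000),
    ("g2", "chr1", 50_000, 54_000),
    ("g3", "chr1", 900_000, 905_000),
    ("g4", "chr2", 10_000, 12_000),
    ("g5", "chr2", 150_000, 155_000),
]

clusters = cluster_nbs_genes(genes)
clustered_fraction = sum(len(c) for c in clusters) / len(genes)
print(clusters, clustered_fraction)
```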

Comparative Genomic Methodologies

Genome-Wide Identification and Classification

Identification Pipeline

The standard workflow for comprehensive NBS gene identification combines multiple complementary approaches:

BLAST-based searches using known NBS domains as queries (E-value threshold typically 1.0) [12]
HMMER searches with the NB-ARC domain (PF00931) hidden Markov model [12]
Domain validation through Pfam and NCBI-CDD databases (E-value cutoff 10⁻⁴) [12]
Classification into TNL, CNL, and RNL subclasses based on N-terminal domains (TIR, CC, or RPW8) [3] [12]

This integrated approach overcomes limitations of individual methods, particularly given the high sequence diversity of NBS genes. The PfamScan.pl HMM search script with stringent E-value thresholds (1.1e-50) effectively identifies NB-ARC domains while minimizing false positives [3].
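
As a minimal sketch of how the candidate sets from the steps above might be combined, the Python below takes the union of BLAST and HMMER candidates and keeps only those whose NB-ARC domain passes a validation cutoff. The gene IDs, E-values, and function name are hypothetical, and the 1e-4 default simply mirrors the domain-validation cutoff quoted above; it is not the cited authors' code.

```python
def merge_and_validate(blast_hits, hmmer_hits, domain_evalues, cutoff=1e-4):
    """Union BLAST and HMMER candidates, keep those with a validated NB-ARC domain.

    blast_hits, hmmer_hits: iterables of gene IDs (hypothetical).
    domain_evalues: dict mapping gene ID -> best NB-ARC domain E-value from a
                    Pfam/CDD check; genes without a domain hit are absent.
    """
    candidates = set(blast_hits) | set(hmmer_hits)
    validated = {
        gene for gene in candidates
        if gene in domain_evalues and domain_evalues[gene] <= cutoff
    }
    return sorted(validated)


# Illustrative, made-up inputs
blast = ["g1", "g2", "g3"]
hmmer = ["g2", "g4"]
evalues = {"g1": 1e-30, "g2": 2e-8, "g3": 0.5, "g4": 1e-5}

validated_genes = merge_and_validate(blast, hmmer, evalues)
print(validated_genes)
```

Taking the union before validation keeps divergent genes that only one search method recovers, while the domain check removes spurious hits from the permissive first pass.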

Orthology Assessment

Comparative analyses require careful orthology assignment to distinguish true orthologs from paralogs:

OrthoFinder v2.5.1 with DIAMOND for rapid sequence similarity searches [3]
MCL clustering algorithm for orthogroup detection [3]
DendroBLAST for ortholog identification and phylogenetic analysis [3]

This pipeline identified 603 orthogroups across 34 plant species, revealing both core (widely conserved) and unique (lineage-specific) orthogroups [3].
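
The distinction between core and unique orthogroups can be illustrated with a short Python sketch that works on a table of gene counts per orthogroup and species. The 90% presence threshold for "core", the "intermediate" label, and the toy data are assumptions made for illustration; the cited study may define these categories differently.

```python
def classify_orthogroups(orthogroups, n_species, core_fraction=0.9):
    """Label each orthogroup as 'core', 'unique', or 'intermediate'.

    orthogroups: dict mapping orthogroup ID -> dict of species -> gene count.
    core_fraction: assumed presence threshold, not taken from the cited study.
    """
    labels = {}
    for og, counts in orthogroups.items():
        # Number of species contributing at least one gene to this orthogroup
        present = sum(1 for n in counts.values() if n > 0)
        if present == 1:
            labels[og] = "unique"
        elif present >= core_fraction * n_species:
            labels[og] = "core"
        else:
            labels[og] = "intermediate"
    return labels


# Toy gene-count table for four hypothetical species
toy = {
    "OG0": {"sp1": 3, "sp2": 1, "sp3": 2, "sp4": 5},
    "OG1": {"sp1": 0, "sp2": 4, "sp3": 0, "sp4": 0},
    "OG2": {"sp1": 1, "sp2": 1, "sp3": 0, "sp4": 0},
}
labels = classify_orthogroups(toy, n_species=4)
print(labels)
```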

CNV Detection and Analysis

CNV Detection Methods

Advanced sequencing technologies have enabled comprehensive CNV profiling in plant genomes:

Read depth-based approaches using tools like CNVnator v0.3.3 with multiple bin sizes [70]
Segmental duplication detection with mrCaNaVaR algorithm analyzing coverage depth in sliding windows (5 kb) [73]
Stringent filtering to remove likely false positives (calls with E-value ≥ 0.05 or q0 value ≥ 0.5 are discarded) [70]
Overlap analysis using BEDTools to associate CNVs with genomic features (>50% overlap threshold) [70]

These methods have revealed that CNVs are particularly enriched in NBS-LRR gene clusters compared to singleton genes [70]. In Brassica napus, approximately 7,000-20,000 genes show CNVs between any two accessions, with defense response genes significantly overrepresented [70] [73].
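
The overlap step can be sketched in a few lines of Python. This is an illustration of the >50% overlap rule, not a substitute for BEDTools; it assumes the threshold applies to the fraction of the gene covered by a single CNV, and the gene names and coordinates are invented.

```python
def overlap_fraction(feature, cnv):
    """Fraction of a feature (start, end) covered by a CNV (start, end).

    Coordinates are half-open [start, end), as in BED files.
    """
    f_start, f_end = feature
    c_start, c_end = cnv
    overlap = max(0, min(f_end, c_end) - max(f_start, c_start))
    return overlap / (f_end - f_start)


def genes_in_cnvs(genes, cnvs, min_fraction=0.5):
    """Return IDs of genes with more than min_fraction of their length in one CNV."""
    hits = []
    for gene_id, chrom, start, end in genes:
        for c_chrom, c_start, c_end in cnvs:
            if chrom == c_chrom and overlap_fraction((start, end), (c_start, c_end)) > min_fraction:
                hits.append(gene_id)
                break  # one qualifying CNV is enough
    return hits


# Hypothetical genes and CNV calls
genes = [("nbs1", "chr1", 1000, 2000), ("nbs2", "chr1", 5000, 6000), ("nbs3", "chr2", 1000, 2000)]
cnvs = [("chr1", 1400, 3000), ("chr1", 5800, 7000)]
affected = genes_in_cnvs(genes, cnvs)
print(affected)
```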

Table 2: Experimental Solutions for NBS Gene Analysis Challenges

Research Challenge	Experimental Solution	Key Features/Benefits	Citation
Overcoming sequence diversity in PCR	Degenerate primers targeting conserved NBS motifs	Amplifies diverse NBS sequences; enables phylogenetic analysis	[72]
Identifying recently diverged sequences	MEME motif analysis & WebLogo	Reveals subfamily-specific conserved motifs; identifies evolutionary relationships	[12]
Resolving complex evolutionary patterns	OrthoFinder + MCL clustering	Distinguishes orthologs from paralogs; identifies lineage-specific expansions	[3]
CNV detection in complex genomes	Read-depth analysis (CNVnator/mrCaNaVaR)	Identifies large segmental duplications/deletions; works with short-read data	[70] [73]
Functional validation of candidate genes	Virus-Induced Gene Silencing (VIGS)	Rapid functional assessment; avoids stable transformation	[3]

Analysis of Sequence Diversity

Phylogenetic and Motif Analysis

Phylogenetic reconstruction using the NBS domain provides insights into evolutionary relationships despite high sequence diversity:

Multiple sequence alignment with MAFFT 7.0 [3]
Phylogenetic reconstruction using maximum likelihood algorithms in FastTreeMP with 1000 bootstrap replicates [3]
Motif identification with MEME suite analyzing up to 10 conserved motifs [12]

These analyses consistently reveal three major monophyletic clades corresponding to CNL, TNL, and RNL subclasses, distinguished by characteristic amino acid motifs [7]. The NBS domain contains several strictly ordered motifs (P-loop, RNBS-A, RNBS-B, etc.) that facilitate phylogenetic analysis despite sequence variation in flanking regions [72].

Expression and Functional Analysis

Transcriptomic approaches help prioritize candidate NBS genes for functional studies:

Differential expression analysis using RNA-seq data from infected and healthy tissues [3] [7]
FPKM-based expression profiling across different tissues, biotic and abiotic stresses [3]
qRT-PCR validation of selected candidates to verify expression patterns [7]

In sweet potato, this approach identified 11 NBS genes differentially expressed in response to stem nematodes and 19 responsive to Ceratocystis fimbriata infection [7]. Similarly, analysis of cotton NBS genes revealed specific orthogroups (OG2, OG6, OG15) upregulated in response to cotton leaf curl disease [3].

Functional validation through virus-induced gene silencing (VIGS) demonstrated that silencing GaNBS (OG2) in resistant cotton increased viral titers, confirming its role in disease resistance [3].
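
For readers unfamiliar with the FPKM values used in the expression profiling above, the following Python sketch shows the standard FPKM normalization and a simple log2 fold change between infected and healthy samples. The read counts, gene length, and pseudocount are illustrative assumptions; published analyses generally rely on dedicated statistical packages rather than a bare ratio.

```python
import math


def fpkm(read_count, gene_length_bp, total_mapped_reads):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two expression values; the pseudocount (assumed) avoids log(0)."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Made-up counts for one 3 kb NBS gene in two libraries of 20 million reads
infected = fpkm(read_count=600, gene_length_bp=3000, total_mapped_reads=20_000_000)
healthy = fpkm(read_count=100, gene_length_bp=3000, total_mapped_reads=20_000_000)
lfc = log2_fold_change(infected, healthy)
print(infected, healthy, lfc)
```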

Managing Copy Number Variation in Research

CNV Detection and Analysis Protocols

Accurate CNV detection requires specialized bioinformatic approaches:

Reference genome alignment using BWA-MEM or mrsFAST with limited mismatch rates (5% of read length) [70] [73]
CNV calling with CNVnator v0.3.3 using multiple bin sizes, chosen so that the ratio of the mean read-depth signal to its standard deviation falls between 4 and 5 [70]
Segmental duplication identification through mrCaNaVaR algorithm detecting excess depth of coverage [73]
Quality control including mapping rates (>98% for high-quality data) and filtering of gap regions [70]

These methods enable researchers to detect CNVs affecting NBS-LRR genes, which frequently occur in genomic clusters and show substantial variation between accessions [70] [7].

Comparative CNV Analysis Across Populations

Population-level CNV analysis reveals evolutionary patterns and selective pressures:

CNV-based population structure analysis differentiating wild and domesticated accessions [73]
Selective sweep detection identifying regions with significant CN differentiation (VST > 0.28) [73]
Functional enrichment analysis of genes showing species-specific CNV patterns [73]

In apple, CNV analysis of 116 Malus accessions revealed that domesticated apples (M. domestica) show distinct CNV profiles compared to wild progenitors (M. sieversii and M. sylvestris), with specific enrichments in defense response genes in wild species and fruit quality genes in domesticated varieties [73].
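
The VST statistic used for copy-number differentiation can be written as V_ST = (V_T - V_S) / V_T, where V_T is the variance in copy number across all samples and V_S is the size-weighted mean of the within-population variances. The Python sketch below computes it for a single locus. The use of population variance and the copy-number values are assumptions for illustration; the 0.28 cutoff is the one quoted above.

```python
from statistics import pvariance


def vst(pop_a, pop_b):
    """V_ST = (V_T - V_S) / V_T for copy-number estimates at one locus.

    V_T: variance across all samples.
    V_S: mean of the within-population variances, weighted by population size.
    Population variance is used here as an assumption.
    """
    combined = list(pop_a) + list(pop_b)
    v_total = pvariance(combined)
    if v_total == 0:
        return 0.0  # no copy-number variation at this locus
    v_within = (len(pop_a) * pvariance(pop_a) + len(pop_b) * pvariance(pop_b)) / len(combined)
    return (v_total - v_within) / v_total


# Hypothetical copy numbers of one NBS-LRR locus in two groups of accessions
wild = [4, 4, 5, 4]
domesticated = [2, 2, 1, 2]
score = vst(wild, domesticated)
print(round(score, 3), score > 0.28)
```

Values near 1 indicate that almost all copy-number variance lies between the two groups, while values near 0 indicate that the groups share the same copy-number distribution.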

Table 3: Essential Research Reagents and Computational Tools for NBS Gene Analysis

Tool/Resource Category	Specific Tools	Application in NBS Gene Research
Bioinformatic Pipelines	RGAugury, OrthoFinder v2.5.1	Automated RGA prediction; orthogroup analysis across species
Sequence Analysis Tools	HMMER, PfamScan, MEME, WebLogo	Domain identification; motif discovery and visualization
CNV Detection Software	CNVnator v0.3.3, mrCaNaVaR	Read depth-based CNV calling; segmental duplication detection
Genomic Databases	Genome Database for Rosaceae, Phytozome, NCBI	Access to annotated genome sequences; comparative genomics
Experimental Validation	VIGS vectors, qRT-PCR assays	Functional characterization; expression validation

The challenges posed by high sequence diversity and copy number variation in NBS genes can be effectively addressed through integrated experimental and computational approaches. Genome-wide identification pipelines, sophisticated CNV detection algorithms, comparative phylogenetic methods, and functional validation techniques together provide a powerful framework for unraveling the complex evolution and function of these critical plant immune genes. As genomic technologies continue to advance, particularly in long-read sequencing and pangenome analyses, our ability to resolve the full extent of NBS gene diversity will further improve, accelerating the discovery and deployment of disease resistance traits in crop improvement programs.

Managing Genetic Redundancy and Functional Overlap in Dense Gene Clusters

In plant genomes, disease resistance is often mediated by dense clusters of Nucleotide-Binding Site Leucine-Rich Repeat (NBS-LRR) genes, which constitute one of the largest and most variable families of plant resistance (R) proteins [3] [74]. These genes play a crucial role in the plant immune system, encoding intracellular receptors that recognize pathogen effectors and initiate effector-triggered immunity (ETI) [25]. The genomic organization of these genes presents a fascinating paradox: while genetic redundancy within clusters seems wasteful, it provides a critical evolutionary advantage in arms races against rapidly evolving pathogens [75] [76].

Managing this redundancy requires understanding how plants maintain, expand, and regulate these complex gene families. This comparative guide analyzes experimental approaches and findings across key plant species, providing researchers with methodological insights and conceptual frameworks for studying dense gene clusters. We examine how different plant lineages have evolved distinct strategies to balance the benefits of genetic redundancy against its potential costs, offering lessons for both basic plant biology and applied crop improvement.

Comparative Genomics of NBS Gene Clusters Across Species

Genomic Distribution and Organizational Patterns

The distribution of NBS-encoding genes across plant genomes follows distinctive patterns that reveal important insights into evolutionary strategies for managing genetic redundancy. Across species, these genes demonstrate non-random distribution, frequently forming dense clusters primarily in subtelomeric regions where higher recombination rates facilitate rapid evolution [75] [74] [25].

Table 1: Comparative Analysis of NBS-Encoding Genes Across Plant Species

Plant Species	Genome Type	Total NBS Genes	Clustered Genes	Major NBS Types	Key Duplication Mechanism
Gossypium hirsutum (cotton)	Allotetraploid	588	~83%	CNL, TNL	Segmental duplication [74]
Gossypium barbadense (cotton)	Allotetraploid	682	~86%	TNL, CNL	Segmental duplication [74]
Ipomoea batatas (sweet potato)	Hexaploid	889	83.13%	CN, N	Segmental duplication [25]
Ipomoea trifida	Diploid	554	76.71%	CN, N	Tandem duplication [25]
Ipomoea triloba	Diploid	571	90.37%	CN, N	Tandem duplication [25]
Ipomoea nil	Diploid	757	86.39%	CN, N	Tandem duplication [25]
Brassica oleracea	Diploid	157	Not specified	TNL, CNL	Tandem duplication [76]
Brassica rapa	Diploid	206	Not specified	TNL, CNL	Tandem duplication [76]
Arabidopsis thaliana	Diploid	167	Not specified	TNL, CNL	Tandem duplication [76]

The table reveals several key patterns. Allopolyploid species like cotton maintain approximately double the number of NBS genes compared to their diploid progenitors, suggesting selective retention of redundant gene copies [74]. Sweet potato, a hexaploid species, shows the highest absolute number of NBS genes, indicating potential dosage advantages in complex genomes [25]. Among the species for which clustering was reported (Gossypium and Ipomoea), clustering percentages remain consistently high (76.71-90.37%), underscoring the fundamental importance of this organizational principle.
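
To make the "clustered genes" percentages concrete, the sketch below shows one simplified way such a figure could be computed from gene coordinates in Python. The clustering criterion (neighbouring NBS genes on the same chromosome starting within 200 kb of each other) and the coordinates are assumptions; the cited studies may apply different definitions, for example limits on intervening non-NBS genes.

```python
from collections import defaultdict


def clustered_fraction(genes, max_gap=200_000):
    """Fraction of genes that sit in a cluster of two or more.

    genes: list of (gene_id, chromosome, start_bp).
    A gene counts as clustered if a neighbouring NBS gene on the same
    chromosome starts within max_gap bp (an assumed, simplified criterion).
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start in genes:
        by_chrom[chrom].append((start, gene_id))

    clustered = set()
    for positions in by_chrom.values():
        positions.sort()
        # Compare each gene with its nearest downstream neighbour
        for (s1, g1), (s2, g2) in zip(positions, positions[1:]):
            if s2 - s1 <= max_gap:
                clustered.update((g1, g2))
    return len(clustered) / len(genes) if genes else 0.0


# Hypothetical NBS gene positions
toy_genes = [
    ("a", "chr1", 100_000), ("b", "chr1", 150_000), ("c", "chr1", 260_000),
    ("d", "chr1", 5_000_000), ("e", "chr2", 1_000_000),
]
fraction = clustered_fraction(toy_genes)
print(fraction)
```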

Evolutionary Dynamics and Selection Pressures

The evolution of NBS gene clusters is driven by diverse mechanisms that create and maintain genetic redundancy while allowing functional diversification. Whole genome duplication (WGD) events provide raw genetic material, while tandem duplication enables rapid local expansion of successful resistance specificities [3] [76]. In Brassica species, comparative genomics reveals that after divergence from Arabidopsis thaliana, NBS-encoding genes experienced species-specific amplification through tandem duplication, with triplicated homologous gene pairs from whole genome triplication being rapidly deleted or lost [76].

Evolutionary analysis of orthologous gene pairs in Brassica species indicates that CNL-type genes have undergone stronger negative selection in B. rapa compared to B. oleracea, while TNL-type genes show no significant differences between species [76]. This suggests differential evolutionary constraints acting on distinct NBS subfamilies. In Gossypium species, asymmetric evolution of NBS-encoding genes helps explain differential disease resistance, with G. raimondii and G. barbadense inheriting more TNL-type genes that may confer resistance to Verticillium wilt [74].

Experimental Approaches for Functional Analysis

Identification and Classification Methodologies

Standardized protocols for identifying and classifying NBS-encoding genes enable meaningful cross-species comparisons. The foundational methodology involves:

HMMER Search: Using PfamScan with default e-value (1.1e-50) and Pfam-A_hmm model to identify genes containing NB-ARC domains [3] [76]. The NB-ARC domain contains five strictly ordered motifs: P-loop, kinase-2, kinase-3a, GLPL, and MHDL, which facilitate ATP/GTP binding and hydrolysis [74].
Domain Architecture Analysis: Classifying NBS genes based on N-terminal domains (TIR, CC, or RPW8) and C-terminal LRR domains, using notation where "T," "C," or "R" indicates N-terminal domains, "N" represents NB-ARC, and "L" indicates LRR domains [74] [25].
Orthology Grouping: Using OrthoFinder with DIAMOND for sequence similarity searches and MCL clustering algorithm to identify orthogroups across species [3].

These methods have revealed significant diversity in NBS gene architectures, with 168 distinct classes identified across 34 plant species, including both classical (NBS, NBS-LRR, TIR-NBS, TIR-NBS-LRR) and species-specific structural patterns [3].
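
The single-letter notation described above can be sketched as a small Python helper that converts an ordered list of annotated domains into an architecture code. The domain labels used as dictionary keys and the rule of collapsing consecutive repeats are illustrative assumptions.

```python
# Single-letter codes follow the notation described in the text; the
# domain labels used as keys are illustrative assumptions.
DOMAIN_CODES = {"TIR": "T", "CC": "C", "RPW8": "R", "NB-ARC": "N", "LRR": "L"}


def architecture_code(domains):
    """Collapse an ordered list of domain names (N- to C-terminus) into a code.

    Consecutive repeats (e.g. several LRR hits) collapse to one letter;
    domains outside DOMAIN_CODES are ignored in this simplified sketch.
    """
    code = []
    for name in domains:
        letter = DOMAIN_CODES.get(name)
        if letter and (not code or code[-1] != letter):
            code.append(letter)
    return "".join(code)


print(architecture_code(["TIR", "NB-ARC", "LRR", "LRR"]))  # TNL
print(architecture_code(["CC", "NB-ARC"]))                 # CN
print(architecture_code(["NB-ARC"]))                       # N
```

Ignoring unlisted domains means an unusual architecture such as TIR-NBS-TIR-Cupin_1 would be reduced to "TNT" here; a fuller implementation would need codes for the integrated domains as well.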

Expression Analysis and Functional Validation

Understanding the regulation and function of clustered NBS genes requires multifaceted experimental approaches:

Transcriptomic Profiling: Analyzing RNA-seq data from various tissues and stress conditions to identify differentially expressed NBS genes. Studies typically extract FPKM values from specialized databases and categorize expression patterns into tissue-specific, abiotic stress-specific, and biotic stress-specific profiles [3] [25].
Virus-Induced Gene Silencing (VIGS): Transient knockdown of candidate NBS genes to assess their role in disease resistance. For example, silencing of GaNBS (OG2) in resistant cotton demonstrated its putative role in controlling virus titers [3].
Transgenic Complementation: Testing gene function through heterologous expression, as demonstrated in wheat where a transgenic array of 995 NLRs from diverse grass species identified 31 new resistance genes against rust pathogens [77].
Protein Interaction Studies: Conducting protein-ligand and protein-protein interaction assays to validate interactions between NBS proteins and pathogen effectors or signaling components [3].

Recent evidence challenges the long-held belief that NLRs require strict transcriptional repression, with studies showing that functional NLRs actually exhibit high steady-state expression levels in uninfected plants across both monocot and dicot species [77].

Diagram 1: Experimental workflow for analyzing NBS-LRR gene clusters, showing the progression from gene identification to functional validation and practical applications.

The Scientist's Toolkit: Essential Research Reagents and Solutions

Table 2: Key Research Reagents for NBS Gene Cluster Analysis

Reagent/Resource	Primary Function	Application Examples	Considerations
HMMER/Pfam Databases	Identification of NB-ARC domains	Domain annotation in genomic sequences	Use Pfam-A_hmm model with e-value 1.1e-50 [3]
OrthoFinder	Orthogroup inference across species	Evolutionary analysis of NBS gene families	Integrates DIAMOND for sequence similarity [3]
RNA-seq Datasets	Expression profiling under various conditions	Identification of differentially expressed NBS genes	Categorize into tissue/stress-specific expression [3] [25]
VIGS Vectors	Transient gene silencing	Functional validation of candidate NBS genes	Requires optimized protocols for each species [3]
Transgenic Arrays	High-throughput functional screening	Identification of new resistance specificities	Wheat transformation array tested 995 NLRs [77]
PfamScan Script	Domain architecture analysis	Classification of NBS gene types	Distinguishes TNL, CNL, RNL subtypes [3]

This toolkit enables researchers to navigate the complexity of NBS gene clusters from identification to functional characterization. The integration of bioinformatic tools with experimental validation creates a powerful pipeline for dissecting genetic redundancy in these dense clusters.

Regulatory and Evolutionary Insights

Expression Regulation of Clustered NBS Genes

The traditional view that NLR expression must be tightly repressed due to fitness costs has been challenged by recent evidence. Studies now show that functional NLRs are often highly expressed in uninfected plants, with known resistance genes appearing among the most highly expressed NLR transcripts [77]. In Arabidopsis thaliana, NLRs in the top 15% of expressed transcripts are significantly enriched for known functional genes, and the most highly expressed NLRs exceed median expression levels for all genes [77].

This high expression may be necessary for effective pathogen recognition, as demonstrated by the barley NLR Mla7, which requires multiple copies for full resistance function. Transgenic studies showed that single insertions of Mla7 were insufficient to confer resistance, while higher-order copies (2-4 copies) provided strong resistance to Blumeria hordei and Puccinia striiformis pathogens [77]. This copy-number dependency suggests that expression threshold effects are crucial for NLR function.
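
One reading of the top-15% observation above is that NLRs are ranked by steady-state expression and the share of known functional genes is compared between the top tier and the rest. The Python sketch below follows that reading with invented gene names and expression values; it reports raw shares only, whereas a real analysis would add a statistical test of enrichment.

```python
def split_by_expression(nlr_expression, fraction=0.15):
    """Rank NLRs by expression and split into (top tier, remainder)."""
    ranked = sorted(nlr_expression, key=nlr_expression.get, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    return ranked[:n_top], ranked[n_top:]


def functional_share(genes, known_functional):
    """Fraction of the given genes that are in the known-functional set."""
    return sum(g in known_functional for g in genes) / len(genes) if genes else 0.0


# Hypothetical data: 20 NLRs with expression 1..20 (nlr20 is the highest)
nlr_expr = {f"nlr{i}": float(i) for i in range(1, 21)}
known = {"nlr20", "nlr19", "nlr4"}

top, rest = split_by_expression(nlr_expr)
top_share = functional_share(top, known)
rest_share = functional_share(rest, known)
print(top, top_share, rest_share)
```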

Evolutionary Innovation in Gene Clusters

Dense NBS gene clusters function as evolutionary innovation centers where new resistance specificities can emerge through several mechanisms:

Birth-and-Death Evolution: New genes are created by duplication, while others are deleted or become pseudogenes, creating dynamic gene clusters [75].
Associations with Duplication-Inducing Elements: In barley, arms-race genes show statistical association with duplication-prone genomic regions, particularly kb-scale tandem repeats. This association creates a cooperative relationship where duplication-inducing elements generate diversity for pathogen defense genes [75].
Helper-Sensor Systems: In Solanaceae species, NBS genes have evolved specialized functions, with "sensor" NLRs detecting pathogen effectors and "helper" NLRs facilitating immune signaling. Both types show high expression patterns, with some helpers exhibiting tissue specificity [77].

Diagram 2: Evolutionary mechanisms driving innovation in NBS gene clusters, showing how different processes contribute to enhanced diversity for pathogen resistance.

The management of genetic redundancy in dense NBS gene clusters represents a sophisticated evolutionary solution to the challenge of plant-pathogen arms races. Rather than eliminating redundancy, plants have evolved mechanisms to harness it through structured genomic organization, dynamic evolutionary processes, and regulated expression systems.

Key insights emerge from cross-species comparisons: First, different plant lineages have developed distinct strategies for maintaining NBS gene clusters, with some favoring expansion and others contraction. Second, the traditional view of NLRs as tightly repressed genes requires revision, as evidence demonstrates that functional NLRs are often highly expressed. Third, the association of defense genes with duplication-inducing elements creates a cooperative system that enhances evolutionary potential.

For researchers and drug development professionals, these findings suggest new approaches for engineering disease resistance in crops. Harnessing natural duplication mechanisms, rather than focusing solely on single genes, may provide more durable resistance solutions. The experimental frameworks and reagents described here offer pathways for systematically exploring these complex genomic regions across diverse plant species.

Plant immunity relies on a sophisticated surveillance system where intracellular nucleotide-binding site leucine-rich repeat (NBS-LRR or NLR) proteins serve as critical immune receptors that recognize pathogen effectors and activate effector-triggered immunity (ETI) [78] [79]. However, this powerful defense system carries inherent risks—unregulated expression or activation of NBS-LRR genes can inhibit plant growth and lead to autoimmunity, characterized by stunted growth, leaf chlorosis and necrosis, runaway cell death, and reduced reproductive fitness [80] [78]. Plants therefore face a fundamental trade-off: maintaining a diverse NBS-LRR repertoire for pathogen recognition while avoiding the fitness costs of autoimmunity. Recent research has revealed that microRNAs (miRNAs) serve as essential regulatory components in this balancing act, providing precise post-transcriptional control of NBS-LRR genes to maintain immune homeostasis in the absence of pathogen challenge [80] [71] [81]. This review synthesizes current understanding of miRNA-mediated regulation of plant immune genes, comparing regulatory mechanisms across diverse species and providing experimental approaches for investigating these critical immune regulatory networks.

Molecular Mechanisms of miRNA-Mediated NBS-LRR Regulation

Core Regulatory Circuitry: miRNA Families and Their NBS-LRR Targets

Several conserved miRNA families have evolved to target NBS-LRR genes across diverse plant species. The miR482/2118 superfamily represents the most ancient and widespread among these regulators, first emerging in gymnosperms and undergoing extensive radiation in seed plants [80] [71]. These miRNAs typically target the conserved P-loop motif encoded in NBS-LRR genes, allowing a single miRNA to regulate multiple NBS-LRR lineages [71]. In legume species, miR1507 and miR2109 (also known as miR5213 in Medicago truncatula) perform similar regulatory functions, targeting different conserved domains within NBS-LRR genes [80]. These miRNAs function through a dual regulatory mechanism: they directly cleave target NBS-LRR mRNAs and, in the case of 22-nucleotide miRNAs, trigger the production of phased secondary siRNAs (phasiRNAs) that amplify silencing efficiency [80] [81].

Table 1: Major miRNA Families Regulating NBS-LRR Genes in Plants

miRNA Family	Evolutionary Origin	Target Site	phasiRNA Production	Plant Lineages
miR482/2118	Gymnosperms	P-loop motif	Yes (22-nt variants)	Gymnosperms to dicots
miR1507	Fabaceae	Conserved NBS-LRR domains	Yes	Prevalent in legumes
miR2109/miR5213	Fabaceae	Conserved NBS-LRR domains	Yes	Medicago truncatula and other legumes
miR472	Eudicots	NBS-LRR transcripts	Yes	Arabidopsis, poplar

The regulatory interaction between miRNAs and NBS-LRR genes represents a co-evolutionary arms race. As NBS-LRR genes diversify to recognize new pathogen effectors, new miRNAs periodically emerge to regulate them, often targeting the same conserved protein motifs [71]. This co-evolution maintains a balance between immune recognition capability and autoimmunity suppression. Nucleotide diversity in the wobble position of codons within target sites drives further diversification of miRNAs, creating complex regulatory networks [71].

The phasiRNA Amplification Loop

A crucial amplification mechanism in miRNA-mediated NBS-LRR regulation involves phased siRNAs (phasiRNAs). When 22-nucleotide miRNAs guide cleavage of NBS-LRR transcripts, the cleavage products are converted into double-stranded RNA by RNA-DEPENDENT RNA POLYMERASE 6 (RDR6) and SUPPRESSOR OF GENE SILENCING 3 (SGS3) [81]. This double-stranded RNA is subsequently processed by DICER-LIKE 4 (DCL4) and DOUBLE-STRANDED-RNA-BINDING PROTEIN 4 (DRB4) to generate a cluster of 21-nucleotide phased siRNAs [81]. These secondary siRNAs can target additional NBS-LRR mRNAs, creating an amplified silencing cascade that enables robust suppression of NBS-LRR gene expression in the absence of pathogen challenge [80] [81].

The following diagram illustrates this coordinated regulatory pathway:

Figure 1: miRNA-phasiRNA regulatory pathway for NBS-LRR gene regulation. Primary miRNAs are processed through sequential steps to form mature miRNAs that guide RNA-induced silencing complex (RISC) to target NBS-LRR mRNAs. For 22-nt miRNAs, this triggers production of phased secondary siRNAs (phasiRNAs) that amplify silencing of additional NBS-LRR targets.
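
The 21-nucleotide phasing described above means that secondary siRNAs fall into a register set by the miRNA cleavage site. The Python sketch below lists the expected start positions downstream of a cut and checks whether an observed small RNA is in phase. The coordinates are hypothetical, and the model is deliberately simplified: it covers the sense strand only and ignores the 2-nt offset of antisense-strand siRNAs.

```python
def phased_positions(cleavage_site, transcript_length, phase=21):
    """Start coordinates of successive phased siRNAs downstream of a cleavage site.

    cleavage_site: 0-based position of the first nucleotide 3' of the cut.
    Only full-length windows that fit inside the transcript are returned.
    """
    return list(range(cleavage_site, transcript_length - phase + 1, phase))


def in_phase(position, cleavage_site, phase=21):
    """True if a small-RNA start position falls in the register set by the cut."""
    return (position - cleavage_site) % phase == 0


# Hypothetical transcript of 200 nt cleaved at position 100
starts = phased_positions(cleavage_site=100, transcript_length=200)
print(starts)
print(in_phase(142, 100), in_phase(150, 100))
```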

Comparative Analysis of miRNA-NBS-LRR Networks Across Species

Evolutionary Patterns and Species-Specific Adaptations

The relationship between miRNAs and NBS-LRR genes exhibits both conserved features and species-specific adaptations across plant lineages. A comprehensive analysis of 70 land plants revealed a tight association between NBS-LRR diversity and miRNA regulation, with miRNAs typically targeting highly duplicated NBS-LRRs [71]. In contrast, families of heterogeneous NBS-LRRs are rarely targeted by miRNAs in Poaceae and Brassicaceae genomes, suggesting alternative regulatory mechanisms in these lineages [71].

In legumes such as Medicago truncatula, which possesses approximately 540 NBS-LRR genes, more than 60% are potentially targeted by NB-LRR-regulating miRNAs (miR1507, miR2109, and miR2118) or by phasiRNAs produced from at least 114 phasiRNA-generating loci [80]. This extensive regulatory network proves essential for establishing symbiotic relationships—during nodulation in Medicago, upregulation of these miRNAs suppresses NBS-LRR genes, creating a suitable niche for bacterial colonization without triggering immune responses [80].

Table 2: Comparative Analysis of miRNA-NBS-LRR Networks Across Plant Species

Plant Species	NBS-LRR Count	Key Regulatory miRNAs	Regulatory Features	Biological Context
Medicago truncatula	~540 genes	miR1507, miR2109, miR2118	>60% NBS-LRRs targeted by miRNAs/phasiRNAs	Nodulation, symbiosis
Gossypium hirsutum (cotton)	Not specified	miR482	Reduced miR482 and increased NBS-LRRs in resistance to Verticillium dahliae [81]	Fungal resistance
Solanum tuberosum (potato)	Not specified	miR482, miR160	miR482 regulates NBS-LRRs; miR160 targets ARF10/16 for immunity [81]	Fungal/oomycete resistance
Populus species (poplar)	Not specified	miR472a	Targets NBS-LRRs for defense against fungal pathogens [81]	Fungal resistance
Arabidopsis thaliana	~200 genes	miR472	Triggers phasiRNAs from NBS-LRRs in bacterial immunity [81]	Bacterial resistance
Hordeum vulgare (barley)	Not specified	miR9863	22-nt variant triggers phasiRNAs from Mla transcripts [81]	Powdery mildew resistance

Recent studies in Ipomoea species (sweet potato and relatives) further illustrate the dynamic evolution of NBS-LRR genes. Comprehensive analysis revealed 889 NBS-encoding genes in sweet potato (Ipomoea batatas), with 554, 571, and 757 in its relatives I. trifida, I. triloba, and I. nil respectively [7]. These genes show non-random chromosomal distribution, with roughly 77-90% occurring in clusters across these species (76.71% in I. trifida to 90.37% in I. triloba), suggesting frequent tandem duplications [7]. Expression profiling identified specific NBS-encoding genes differentially expressed in resistant versus susceptible cultivars during infection by stem nematodes and Ceratocystis fimbriata, highlighting their functional importance in disease resistance [7].

Genomic Architecture and Expression Dynamics

The genomic organization of NBS-LRR genes significantly influences their regulation. Studies across multiple plant species reveal that NBS-LRR genes are frequently arranged in clusters, which may facilitate the evolution of diverse recognition specificities but also creates regulatory challenges [71] [7]. In sugarcane, whole-genome duplication, gene expansion, and allele loss significantly impact NBS-LRR gene numbers, with whole-genome duplication likely being the primary driver of NBS-LRR gene abundance [82]. Transcriptome analyses further revealed that more differentially expressed NBS-LRR genes in modern sugarcane cultivars derive from wild Saccharum spontaneum than from domesticated S. officinarum, highlighting the contribution of wild relatives to disease resistance [82].

Research in sorghum demonstrates how global mRNA and miRNA expression dynamics change during pathogen attack. In resistant and susceptible sorghum lines infected with Colletotrichum sublineolum, the resistant genotype showed significant transcriptional reprogramming at 24 hours post-inoculation, followed by a decrease at 48 hours, while the susceptible line exhibited continued changes in gene expression concordant with increasing fungal growth [83]. Small RNA sequencing identified 75 differentially expressed miRNAs, including 36 novel miRNAs, with inverse correlation between miRNA expression and their target genes [83].

Experimental Approaches for Investigating miRNA-NBS-LRR Networks

Methodologies for Profiling miRNA and NBS-LRR Expression

Cutting-edge research in this field employs integrated transcriptomic approaches to simultaneously profile miRNA and mRNA expression dynamics during immune responses. A standard experimental workflow involves:

Plant Material and Growth Conditions: For studies in model legumes like Medicago truncatula, seeds are chemically scarified, sterilized, and germinated on agar plates followed by cold treatment. Seedlings are typically grown in nitrogen-free medium under controlled photoperiod conditions before pathogen inoculation or symbiotic interaction [80].
Pathogen/Virus Inoculation: For viral studies, cotyledons of 5-day-old seedlings may be infected with virus sap (e.g., alfalfa mosaic virus) [80]. For fungal pathogen studies, standardized inoculation protocols ensure consistent infection pressure across biological replicates [83].
RNA Sequencing Library Preparation: For mRNA sequencing, total RNA is extracted from treated and control tissues, and libraries are prepared for Illumina sequencing. Typically, 18 mRNA libraries (3 biological replicates × 2 genotypes × 3 time points) provide sufficient statistical power for differential expression analysis [83]. For miRNA sequencing, size-fractionated small RNA libraries are prepared to enrich for 18-30 nucleotide RNAs.
Bioinformatic Analysis: Sequencing reads are processed through quality control, adapter trimming, and mapping to reference genomes. For mRNA analysis, reads are typically mapped to the host genome and pathogen genome to quantify host gene expression and pathogen biomass [83]. For miRNA analysis, specialized tools like ShortStack or miRDeep2 are used to identify known and novel miRNAs and quantify their expression.
Integration of miRNA-mRNA Data: Target prediction algorithms (e.g., TargetFinder) identify putative miRNA targets, followed by correlation analysis to identify inverse miRNA-target relationships. Functional validation typically requires additional experimental approaches.
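
The integration step can be sketched in Python as a Pearson correlation between each miRNA and its predicted targets across time points, keeping only strongly negative pairs. The expression values, gene names, and the -0.8 threshold are assumptions for illustration; with only a few time points, such correlations are a screening aid rather than evidence of regulation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def inverse_pairs(mirna_expr, target_expr, predicted_pairs, threshold=-0.8):
    """Keep predicted miRNA-target pairs whose profiles are negatively correlated."""
    kept = []
    for mirna, target in predicted_pairs:
        r = pearson(mirna_expr[mirna], target_expr[target])
        if r <= threshold:
            kept.append((mirna, target, round(r, 3)))
    return kept


# Hypothetical expression at three time points
mirna_expr = {"miR482": [10.0, 6.0, 2.0]}
target_expr = {"NBS_A": [1.0, 3.0, 5.0], "NBS_B": [5.0, 3.0, 1.0]}
pairs = [("miR482", "NBS_A"), ("miR482", "NBS_B")]

kept = inverse_pairs(mirna_expr, target_expr, pairs)
print(kept)
```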

Functional Validation Techniques

Several well-established methods enable functional validation of miRNA-NBS-LRR regulatory relationships:

Virus-Induced Gene Silencing (VIGS): VIGS has been successfully employed to silence specific NBS genes in cotton, demonstrating their role in virus resistance [3]. This approach involves engineering viral vectors to carry fragments of target genes, which trigger silencing of the endogenous gene once the vector is introduced into the plant.
Transgenic Approaches: Modulation of miRNA expression through overexpression or artificial target mimics (e.g., STTM, MIM) allows researchers to investigate the consequences of disrupted miRNA regulation. In Medicago truncatula, modification of NB-LRR-regulating miRNA expression (either upregulation or downregulation) significantly changed the number of symbiotic nodules, demonstrating their functional importance [80].
qRT-PCR Validation: Candidate regulatory relationships identified through transcriptomics require validation using quantitative reverse-transcription PCR. This approach confirmed consistent expression patterns for six differentially expressed NBS genes in sweet potato under pathogen challenge [7].
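
Relative expression from qRT-PCR data is commonly calculated with the 2^(-ΔΔCt) method, sketched below in Python. The source does not state which quantification method the cited studies used, so this is an assumption, as are the Ct values and the premise of roughly 100% primer efficiency.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^(-ddCt) method, assuming ~100% primer efficiency."""
    # Normalize the target gene to the reference gene within each sample
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare treated with control, then convert cycles to fold change
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: target reaches threshold 3 cycles earlier after infection
fold = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(fold)
```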

The following diagram outlines a comprehensive experimental workflow for investigating miRNA-NBS-LRR networks:

Figure 2: Integrated experimental workflow for investigating miRNA-NBS-LRR regulatory networks. The approach combines parallel transcriptomic profiling with bioinformatic integration and functional validation to comprehensively characterize miRNA-mediated regulation of plant immune genes.

Table 3: Key Research Reagents and Resources for Investigating miRNA-NBS-LRR Networks

Reagent/Resource	Specific Example	Application/Function	Experimental Context
Plant Growth Media	Nitrogen-free Gibson medium	Supports nodulation studies in legumes	Medicago truncatula-rhizobia interactions [80]
Pathogen Isolates	Colletotrichum sublineolum	Anthracnose pathogen for infection assays	Sorghum anthracnose resistance [83]
Viral Vectors	Alfalfa mosaic virus (AMV)	Virus-induced gene silencing (VIGS)	Functional validation of NBS genes [80] [3]
Immunity Elicitors	flg22 peptide	Pattern-triggered immunity induction	miRNA expression response to immune signaling [80]
Reference Genomes	Medicago truncatula Jemalong genome	Read mapping and expression quantification	Comparative genomics and transcriptomics [80]
Bioinformatic Tools	OrthoFinder, MEME, PhyloSuite	Evolutionary analysis, motif detection, phylogenetics	Comparative analysis of NBS-LRR genes [3] [82]
Expression Databases	Plant NBS-LRR Gene Database	Repository for NBS-LRR gene information	Comparative expression analysis [82]
qRT-PCR Primers	Gene-specific primers for candidate NBS-LRRs	Validation of RNA-seq results	Confirmation of differential expression [7]

The intricate regulatory networks through which miRNAs fine-tune NBS-LRR expression represent a sophisticated evolutionary solution to the fundamental challenge of maintaining effective immunity while avoiding autoimmune pathology. The comparative analysis across species reveals both conserved principles and lineage-specific adaptations in these regulatory circuits. From an applied perspective, understanding miRNA-NBS-LRR networks opens exciting avenues for crop improvement. Engineering miRNA regulatory elements or developing miRNA-resistant NBS-LRR variants could enhance disease resistance while avoiding fitness costs. The research methodologies and resources summarized here provide a foundation for further exploration of these critical immune regulatory mechanisms. As genomic technologies advance, particularly in single-cell sequencing and spatial transcriptomics, future research will likely reveal additional layers of complexity in how plants achieve the delicate balance between immunity and growth.

Addressing Gene Contraction and Loss of Function During Domestication

Plant domestication represents a profound evolutionary transition, during which human selection for desirable agronomic traits can inadvertently alter a plant's innate defense mechanisms. A critical component of this defense system is the nucleotide-binding site leucine-rich repeat (NBS-LRR) gene family, which constitutes the largest and most prevalent class of disease resistance (R) genes in plants [84] [85]. These genes encode proteins that function as intracellular immune receptors, playing a crucial role in detecting pathogen-derived molecules and initiating defense responses [86]. The NBS domain facilitates nucleotide binding and hydrolytic reactions that provide energy for downstream signaling, while the highly variable LRR domain is responsible for pathogen recognition specificity [86]. Throughout domestication, the genetic architecture of crop plants has undergone significant restructuring, with evidence suggesting that immune receptor gene repertoires have contracted in a number of crops [87]. This review provides a comparative analysis of NBS gene contraction across domesticated plant species, examines the methodologies for its investigation, and explores the evolutionary mechanisms driving this phenomenon, with implications for future crop breeding strategies.

Quantitative Comparison of NBS Gene Contraction Across Domesticated Species

Patterns of NBS Gene Family Contraction in Crop Genomes

Comparative genomic analyses across multiple plant families have revealed that domestication has frequently been associated with a reduction in the diversity of immune receptor genes. A comprehensive study analyzing 15 domesticated crop species and their wild relatives found that five crops—grapes, mandarins, rice, barley, and yellow sarson—exhibited significantly reduced immune receptor gene repertoires compared to their wild counterparts [87]. Interestingly, the overall rate of immune receptor gene loss generally reflected the background rate of gene loss in these species, suggesting a subtle, cumulative pressure rather than a single dramatic bottleneck event. Furthermore, a positive association was observed between domestication duration and immune receptor gene loss, supporting the hypothesis that the reduction in resistance gene diversity accumulates over the extended period of human cultivation and selection [87].

Table 1: Comparative Analysis of NBS-LRR Gene Family Size Across Plant Species

Plant Species	Status	NBS-LRR Gene Count	Key Findings	Citation
Vernicia montana	Wild	149	Contains TIR-NBS-LRR genes; resistant to Fusarium wilt	[86]
Vernicia fordii	Domesticated	90	Complete absence of TIR-NBS-LRR genes; susceptible to Fusarium wilt	[86]
Potato (S. tuberosum)	Domesticated	447	Shows "consistent expansion" pattern	[84]
Tomato (S. lycopersicum)	Domesticated	255	Exhibits "first expansion and then contraction" pattern	[84]
Pepper (C. annuum)	Domesticated	306	Presents a "shrinking" pattern	[84]
Sorghum (cultivated)	Domesticated	346	Significant reduction in NBS diversity in improved inbreds	[85]

Evolutionary Dynamics of NBS Genes in Cereal Crops

In sorghum, a critical food staple for over 500 million people, NBS-encoding genes show significantly higher diversity than non-NBS-encoding genes and are significantly enriched in genomic regions under both purifying and balancing selection [85]. Selection signals were detected through both domestication and improvement, marked by elevated differentiation between wild, landrace, and improved groups, along with low nucleotide diversity and negatively skewed allele frequency spectra at the affected loci. Approximately 20% of all NBS-encoding genes in sorghum showed patterns of molecular variation consistent with the action of selection, with just over half of these (38 genes) displaying signatures of purifying selection, that is, the selective removal of deleterious alleles through both natural and human-mediated selection [85]. Eleven NBS-encoding genes were completely invariant in both cultivated and wild groups, with a further 10 genes fixed only in cultivated groups, and the majority of these invariant genes (86%) occurred in gene clusters [85].

Table 2: Evolutionary Patterns of NBS Genes in Cereal Crops

Crop Species	Evolutionary Pattern	Key Observations	Functional Consequences
Sorghum	Purifying and balancing selection	20% of NBS genes under selection; 11 genes invariant in all groups	Diversity reduction in improved inbreds; enrichment in disease resistance QTL regions
Foxtail Millet	Directional selection for domestication traits	Les1 gene disrupted by transposon insertion in domesticated allele	Loss of seed shattering; maintained disease resistance
Rice	Duration-dependent gene loss	Significant reduction in immune receptor repertoire compared to wild relatives	Association between domestication duration and immune gene loss
Setaria species	Diverse evolutionary trajectories	Independent gene loss and duplication events after speciation	Species-specific NBS gene numbers and disease resistance profiles

Experimental Protocols for Investigating NBS Gene Contraction

Genomic Identification and Classification of NBS-Encoding Genes

The standard methodology for comprehensive identification of NBS-encoding genes involves a multi-step computational pipeline. First, BLAST and hidden Markov model (HMM) searches using the NB-ARC domain (Pfam accession number: PF00931) as the query sequence are simultaneously performed to scan candidate NBS-encoding genes in plant genomes [84]. For BLAST analysis, the threshold expectation value is typically set to 1.0, while default parameter settings are used for HMM searches. All obtained sequence hits from both methods are then merged, and redundant hits are removed. The remaining sequences are subjected to online Pfam analysis to further confirm the presence of the NBS domain using an E-value cutoff of 10⁻⁴ [84]. Subsequently, all identified NBS-encoding genes are analyzed using the Pfam database, SMART protein motif analyses, and Multiple Expectation Maximization for Motif Elicitation (MEME) to determine if they encode TIR, RPW8, or LRR motifs. The CC motifs are detected by the COILS program with a threshold of 0.9 followed by visual inspection [84].

NBS Gene Identification Workflow: This diagram illustrates the bioinformatic pipeline for identifying and classifying NBS-encoding genes from plant genomic sequences, integrating multiple complementary approaches.
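The merge-and-confirm logic of this pipeline can be sketched as follows. This is a simplified illustration under stated assumptions: the gene IDs and E-values are hypothetical, parsing of real BLAST and HMMER output is omitted, and treating the 10⁻⁴ cutoff as inclusive is an assumption.

```python
PFAM_EVALUE_CUTOFF = 1e-4  # cutoff for confirming the NB-ARC (PF00931) domain

def merge_candidates(blast_hits, hmm_hits):
    """Union of gene IDs from both searches; using a set removes redundant hits."""
    return set(blast_hits) | set(hmm_hits)

def confirm_nbs(candidates, pfam_evalues, cutoff=PFAM_EVALUE_CUTOFF):
    """Keep candidates whose Pfam NB-ARC E-value passes the cutoff.

    Candidates with no Pfam NB-ARC hit at all are treated as failing.
    """
    return sorted(
        gene for gene in candidates
        if pfam_evalues.get(gene, float("inf")) <= cutoff
    )

# Hypothetical search results (gene IDs only)
blast_hits = ["g1", "g2", "g3", "g2"]   # permissive BLAST search, with a duplicate
hmm_hits = ["g2", "g4"]                 # HMM search with the NB-ARC profile
pfam_evalues = {"g1": 1e-30, "g2": 3e-12, "g3": 0.5, "g4": 1e-5}

confirmed = confirm_nbs(merge_candidates(blast_hits, hmm_hits), pfam_evalues)
print(confirmed)  # ['g1', 'g2', 'g4']
```

The permissive first pass followed by a stricter domain check is what lets the pipeline recover divergent family members while still discarding spurious hits such as g3.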

Population Genetic Analysis of NBS Gene Diversity

To investigate the evolutionary forces acting on NBS genes during domestication, researchers compare patterns of sequence variation within and between populations. High-coverage whole-genome sequencing data from diverse genotypes (spanning wild, weedy, landrace, and improved varieties) enables the calculation of multiple diversity statistics [85] [88]. Key metrics include nucleotide diversity (θπ) and Watterson's estimator (θw), Tajima's D as a summary of the allele frequency spectrum, and between-group differentiation (FST). The mlHKA test is particularly valuable for validating whether NBS-encoding domestication candidates show patterns of genetic variation consistent with positive selection [85]. Genome-wide association studies (GWAS) can identify loci underlying important domestication traits, while pan-genome analyses capture gene presence-absence variation across diverse accessions, revealing the full complement of NBS genes within a species [88].

Population Genomics Pipeline: This workflow outlines the process for analyzing genetic diversity and selection signatures in NBS genes across diverse plant accessions, from sequencing to identification of domestication-related selection.
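To make the diversity statistics concrete, the sketch below computes the number of segregating sites, θπ, θw, and Tajima's D for a single locus from a small gap-free alignment. It is a toy example with hypothetical sequences; values are per locus rather than per site, and real analyses would work from variant calls with dedicated population genetics software.

```python
from itertools import combinations
from math import sqrt

def diversity_stats(seqs):
    """Per-locus S, theta_pi, theta_w and Tajima's D for aligned, gap-free sequences."""
    n = len(seqs)
    # Segregating sites: alignment columns with more than one allele
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    # theta_pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    theta_pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = seg_sites / a1  # Watterson's estimator
    if seg_sites == 0:
        return {"S": 0, "theta_pi": theta_pi, "theta_w": 0.0, "tajima_d": None}
    # Variance terms from Tajima (1989)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    tajima_d = (theta_pi - theta_w) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))
    return {"S": seg_sites, "theta_pi": theta_pi, "theta_w": theta_w, "tajima_d": tajima_d}

# Hypothetical alignment of one NBS gene fragment from four accessions
alignment = ["AAAA", "AAAT", "AATT", "AATT"]
stats = diversity_stats(alignment)
print(stats)
```

In this toy alignment θπ exceeds θw, so Tajima's D is positive, the direction expected when alleles are held at intermediate frequencies. A negative D reflects an excess of rare variants, as expected after a selective sweep or under purifying selection.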

Evolutionary Mechanisms Driving NBS Gene Contraction

Contrasting Selection Pressures on NBS Genes During Domestication

The evolutionary trajectory of NBS genes during domestication is shaped by multiple, often contrasting, selection pressures. Analysis of NBS-encoding genes in sorghum revealed they are significantly enriched in regions of the genome under both purifying selection (characterized by selective removal of deleterious alleles) and balancing selection (which maintains multiple alleles at intermediate frequencies) [85]. This suggests that different NBS genes experience distinct evolutionary pressures based on their specific functions and interactions with pathogens. Purifying selection appears to constrain variation in certain NBS genes, while balancing selection maintains diversity in others, possibly to recognize evolving pathogen populations. The overall pattern of NBS gene contraction during domestication appears more consistent with relaxed selection than with a strong cost-of-resistance effect: in agricultural environments with reduced pathogen pressure, loss-of-function mutations in NBS genes can accumulate without immediate fitness consequences, rather than being actively favored because large repertoires are metabolically costly [87].

Species-Specific Evolutionary Patterns in Solanaceae

Comparative analysis of Solanaceae species reveals distinct evolutionary patterns for NBS-encoding genes. Potato demonstrates a "consistent expansion" pattern, tomato exhibits "first expansion and then contraction," while pepper presents a "shrinking" pattern [84]. These differences suggest that despite shared ancestry, NBS gene families have undergone independent evolutionary trajectories following speciation. Phylogenetic analyses indicate that the NBS-encoding genes in present-day potato, tomato, and pepper were derived from approximately 150 CNL, 22 TNL, and 4 RNL ancestral genes, with species-specific tandem duplications contributing most significantly to gene expansions [84]. The earlier expansion of CNLs in the common ancestor led to the dominance of this subclass in gene numbers, while RNLs remained at low copy numbers, potentially due to their specialized functions [84].

Research Toolkit for Investigating Domestication-Associated Gene Contraction

Table 3: Research Reagent Solutions for Studying NBS Gene Evolution

Resource Type	Specific Tool/Resource	Function and Application	Representative Examples
Genomic Databases	Species-specific genome portals	Provide reference sequences and annotations for gene identification	RAP-DB, RGAP for rice [89]; Setaria genome resource [88]
Pan-genome Platforms	Pan-genome browsers	Enable analysis of gene presence-absence variation across populations	RPAN for rice [89]; Setaria pan-genome [88]
Diversity Resources	SNP databases and variation browsers	Facilitate population genetic analyses and selection scans	RiceVarMap v2.0 [89]; SNP-Seek [89]
Expression Repositories	Transcriptomic databases	Provide gene expression patterns across tissues and conditions	RiceXPro [89] [90]; RiceFREND [90]
Mutant Collections	Insertion mutant databases	Offer resources for functional validation of candidate genes	Rice Tos17 Insertion Mutant Database [89]; Oryzabase mutants [91]
Comparative Genomics Platforms	Multi-species comparative databases	Enable cross-species evolutionary analyses	Gramene [89] [90]; RGI [89]

Model Systems for Functional Validation

Functional characterization of NBS gene contraction requires model systems that combine experimental tractability with relevant evolutionary insights. The Setaria system—comprising the wild green foxtail (Setaria viridis) and its domesticated relative foxtail millet (Setaria italica)—provides an excellent platform for such investigations [92] [88]. S. viridis offers numerous advantages as a model system: small stature, diploid genetics, short life cycle (seed to seed in 8-10 weeks), small genome (~500 Mb), efficient transformation, and CRISPR-Cas9 mutagenesis capability [88]. Similarly, Brachypodium distachyon serves as a model for temperate cereals and forage grasses, with a small genome, simple chromosomes (2n = 10), rapid life cycle (under 4 months), high capacity for plant regeneration, and efficient transformation systems [93]. These model systems enable researchers to move beyond correlation to causation by functionally validating the role of candidate genes in domestication-related traits.

The systematic contraction of NBS gene repertoires during domestication represents a significant trade-off in the evolution of crop plants. While human selection has successfully enhanced yield, harvestability, and other agronomic traits, it has often done so at the expense of genetic diversity for pathogen recognition. The cumulative nature of this process, with longer domestication history correlating with greater immune gene loss, suggests that recently domesticated crops or wild relatives may harbor valuable resistance genes that have been lost from major crops [87]. Future crop improvement strategies should leverage pan-genome approaches to identify such lost resistance genes and utilize advanced breeding technologies to reincorporate functional diversity while maintaining favorable agronomic traits. Furthermore, understanding the specific evolutionary forces—purifying selection, balancing selection, or relaxed selection—acting on different NBS gene classes will enable more precise engineering of durable disease resistance in crop plants.

Nucleotide-binding leucine-rich repeat (NLR) genes constitute the largest family of plant disease resistance (R) genes and play a crucial role in the plant immune system by recognizing pathogen effectors and initiating defense responses [36] [94]. Domestication processes in crop species have often been associated with reduced genetic diversity and potential loss of defensive traits, possibly due to artificial selection for yield and quality characteristics alongside human management practices that reduce pathogen exposure [95]. Comparative genomics approaches now enable systematic investigation of how domestication has shaped NLR repertoires across crop species, providing insights for future disease resistance breeding [36] [95].

This case study examines the specific example of garden asparagus (Asparagus officinalis), a valuable horticultural crop known as the "king of vegetables" in international markets [36] [96]. We present a comprehensive analysis of NLR gene contraction during asparagus domestication and its functional consequences for disease susceptibility, serving as a model for understanding how human selection has impacted plant immunity mechanisms.

Comparative Analysis of NLR Repertoires in Asparagus Species

Genome-Wide Identification and Classification of NLR Genes

A comprehensive genome-wide analysis was conducted to identify NLR genes across domesticated garden asparagus (A. officinalis) and two wild relatives (A. setaceus and A. kiusianus) [36] [96]. Researchers employed a dual identification approach using Hidden Markov Model (HMM) searches with the conserved NB-ARC domain (Pfam: PF00931) as query, complemented by local BLASTp analyses against reference NLR protein sequences from Arabidopsis thaliana, Oryza sativa, and Allium sativum with a stringent E-value cutoff of 1e-10 [36]. Candidate sequences identified through both methods were subsequently validated through domain architecture analysis using InterProScan and NCBI's Batch CD-Search [36].

Table 1: NLR Gene Distribution in Asparagus Species

Species	Domestication Status	Total NLR Genes	Chromosomal Distribution	Clustered Loci
A. setaceus	Wild relative	63	Uneven, clustered patterns	~68% in clusters
A. kiusianus	Wild relative	47	Uneven, clustered patterns	~68% in clusters
A. officinalis	Domesticated	27	Uneven, clustered patterns	~68% in clusters

The phylogenetic classification of identified NLR genes based on N-terminal domains revealed three principal subfamilies: CNLs (containing CC domains), TNLs (with TIR domains), and RNLs (featuring RPW8 domains) [36]. This classification system follows the established framework for NLR gene categorization across plant species [97] [94]. The subcellular localization predictions indicated that nuclear-localized NLRs predominated (56% in A. setaceus, 49% in A. kiusianus), exceeding cytoplasmic localization in both wild species [98].

Structural and Regulatory Features of Asparagus NLR Genes

Analysis of the domain architecture using MEME suite revealed ten conserved motifs within the NLR proteins, with their order and amino acid sequences exhibiting high conservation across the three Asparagus species [36] [98]. The promoter analysis identified numerous cis-elements responsive to defense signals and phytohormones, with MeJA-responsive elements being the most abundant across all three NLR classes [36]. Among the total disease resistance-related cis-acting elements, plant hormone elements accounted for 80.8%, while stress response elements accounted for 5.5% [98].

Table 2: NLR Gene Structural Characteristics in Asparagus Species

Structural Feature	A. setaceus	A. kiusianus	A. officinalis
Conserved motifs in NBS domains	10 identified	10 identified	10 identified
Exon-intron structure	Conserved patterns	Conserved patterns	Conserved patterns
Cis-regulatory elements	Abundant defense-related	Abundant defense-related	Abundant defense-related
Chromosomal clustering	~68% in clusters	~68% in clusters	~68% in clusters

The chromosomal distribution analysis revealed that NLR genes in all three species displayed clustering patterns, with approximately 68% located in clusters across the genomes [36] [98]. This distribution pattern aligns with observations in other plant species where NLR genes often reside in complex clusters that facilitate rapid evolution of pathogen recognition capabilities [85] [99].
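A clustering percentage of this kind depends on how a cluster is defined, which the text does not specify. The sketch below assumes a simple rule (a gene is clustered if another NLR gene starts within 200 kb on the same chromosome); the gene coordinates are hypothetical and the 200 kb window is an assumption, not the criterion used in the cited study.

```python
from collections import defaultdict

def clustered_fraction(genes, max_gap=200_000):
    """Fraction of genes with at least one neighbor within max_gap bp.

    genes: list of (gene_id, chromosome, start_position) tuples.
    max_gap is an assumed clustering window.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start in genes:
        by_chrom[chrom].append((start, gene_id))
    clustered = set()
    for positions in by_chrom.values():
        positions.sort()
        # Compare each gene with its next neighbor along the chromosome
        for (start_a, gene_a), (start_b, gene_b) in zip(positions, positions[1:]):
            if start_b - start_a <= max_gap:
                clustered.update((gene_a, gene_b))
    return len(clustered) / len(genes)

# Hypothetical NLR gene coordinates
nlr_genes = [
    ("nlr1", "chr1", 100_000),
    ("nlr2", "chr1", 150_000),   # 50 kb from nlr1, so both are clustered
    ("nlr3", "chr1", 900_000),   # isolated
    ("nlr4", "chr2", 50_000),    # only NLR on chr2
]
fraction = clustered_fraction(nlr_genes)
print(f"{fraction:.0%} of NLR genes are in clusters")  # 50%
```

Because the result shifts with the window size, any cross-species comparison of clustering percentages should use one definition throughout.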

Experimental Analysis of Disease Resistance Mechanisms

Pathogen Inoculation and Phenotypic Responses

Experimental validation of disease resistance was conducted through pathogen inoculation assays using Phomopsis asparagi, a significant fungal pathogen affecting asparagus cultivation [36] [96]. The wild species A. setaceus remained completely asymptomatic following fungal challenge, demonstrating strong resistance [36]. In contrast, the domesticated A. officinalis showed clear susceptibility to the pathogen, developing characteristic disease symptoms [36]. This phenotypic divergence provided direct evidence for the functional consequences of NLR repertoire differences between wild and cultivated asparagus genotypes.

The transcriptomic analysis following pathogen challenge revealed that the majority of preserved NLR genes in A. officinalis demonstrated either unchanged or downregulated expression, indicating a potential functional impairment in disease resistance mechanisms [36]. This differential expression pattern suggests that domestication may have affected not only the number of NLR genes but also their regulatory mechanisms and responsiveness to pathogen attack [36] [98].

Orthologous Gene Analysis and Evolutionary Dynamics

Orthologous gene analysis identified 16 conserved NLR gene pairs between A. setaceus and A. officinalis, representing the NLR genes preserved during the domestication process of A. officinalis [36]. The evolutionary analysis indicated that both tandem and dispersed duplications contributed to NLR expansion in the Asparagus genus, with recent duplications dominating this process [99]. Contraction events were particularly evident during the domestication transition, with A. officinalis losing approximately 57% of NLR genes present in A. setaceus [36].
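The contraction figure can be checked against the gene counts reported for the three species. The short calculation below uses those counts (63, 47, and 27) and treats each wild relative's repertoire as a proxy for the ancestral state, which is a simplifying assumption.

```python
def percent_lost(wild_count, domesticated_count):
    """Percentage reduction in gene count relative to the wild repertoire."""
    return 100 * (wild_count - domesticated_count) / wild_count

# NLR gene counts reported for the three Asparagus species
nlr_counts = {"A. setaceus": 63, "A. kiusianus": 47, "A. officinalis": 27}

vs_setaceus = percent_lost(nlr_counts["A. setaceus"], nlr_counts["A. officinalis"])
vs_kiusianus = percent_lost(nlr_counts["A. kiusianus"], nlr_counts["A. officinalis"])
print(round(vs_setaceus, 1))   # 57.1
print(round(vs_kiusianus, 1))  # 42.6
```

The comparison with A. setaceus reproduces the approximately 57% figure, and the same calculation against A. kiusianus gives a smaller but still substantial reduction.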

Table 3: Expression Patterns of Preserved NLR Genes in A. officinalis After Fungal Challenge

Expression Pattern	Percentage of NLR Genes	Functional Implication
Unchanged expression	Majority	Lack of responsiveness to pathogen
Downregulated expression	Significant portion	Potential suppression of defense
Upregulated expression	Minority	Limited effective resistance

The collinearity analysis between species revealed that the contraction of NLR genes in A. officinalis occurred through both loss of entire gene clusters and reduction within clusters [36] [98]. This pattern suggests that the domestication process selectively maintained only a core subset of the ancestral NLR repertoire, potentially focusing on genes essential for basic cellular functions while losing specialized pathogen recognition elements [36] [95].

Broader Context of NLR Evolution During Domestication

Comparative Patterns Across Plant Families

The contraction pattern observed in asparagus aligns with findings in other domesticated species. A comprehensive analysis of 15 domesticated crop species and their wild relatives across nine plant families revealed that five crops—grapes, mandarins, rice, barley, and yellow sarson—exhibited significantly reduced immune receptor gene repertoires compared to their wild counterparts [95]. This broader pattern suggests that NLR reduction may be a recurrent phenomenon in domestication processes across diverse plant lineages [95].

In the Apiaceae family, comparative genomic analysis of four species (Angelica sinensis, Coriandrum sativum, Apium graveolens, and Daucus carota) revealed dynamic NLR gene loss and gain events during speciation [97]. The number of NLR genes in these species ranged from 95 in A. sinensis to 183 in C. sativum, indicating substantial variation in NLR repertoire size even within the same plant family [97]. This variation highlights the evolutionary plasticity of NLR genes in response to different selective pressures [97] [3].

Evolutionary Mechanisms Driving NLR Repertoire Dynamics

The evolutionary analysis indicates that NLR genes are under strong selection pressure in plants, experiencing both purifying and balancing selection [85]. In sorghum, NBS-encoding genes were significantly enriched in regions of the genome under selection and exhibited higher diversity compared to non-NBS-encoding genes [85]. This pattern of contrasting evolutionary processes has impacted ancestral genes more than species-specific genes [85].

Several mechanisms may explain the recurrent NLR loss during domestication. The "cost of resistance" hypothesis suggests that maintaining NLR genes is metabolically expensive, and in agricultural environments with reduced pathogen pressure, selection may favor individuals with smaller NLR repertoires that allocate more resources to growth and yield [95]. Alternatively, relaxed selection in managed environments may permit the accumulation of loss-of-function mutations in NLR genes without immediate fitness consequences [95]. The positive association between domestication duration and immune receptor gene loss supports the relaxed selection hypothesis [95].

Experimental Protocols and Methodologies

Genome-Wide NLR Identification and Characterization

Diagram 1: Workflow for Genome-Wide NLR Identification. The pipeline illustrates the dual approach for comprehensive NLR gene identification combining HMM searches and BLAST analyses with subsequent validation and classification steps.

The methodological framework for NLR identification and characterization follows established protocols in comparative genomics [36] [3] [94]. The key steps include:

Genome Data Acquisition: Obtain genomic and annotation data for target species from public databases or original sequencing efforts. For asparagus studies, data for A. kiusianus was sourced from Plant GARDEN and for A. setaceus from Dryad Digital Repository [36].
NLR Identification: Perform iterative searches using both HMM profiles of the NB-ARC domain and BLASTp analyses against reference NLR sequences with stringent E-value cutoffs (1e-10) [36] [94].
Domain Validation: Confirm NBS domains in candidate sequences using InterProScan and NCBI's Batch CD-Search with E-value ≤ 1e-5 [36].
Classification and Characterization: Categorize validated NLR genes into subfamilies based on N-terminal domains (CNL, TNL, RNL) and analyze chromosomal distribution, motif composition, and gene structures [36] [99].
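The final classification step can be sketched as a small rule-based function. The domain labels and the order in which N-terminal domains are checked are assumptions for illustration; in practice the labels would come from InterProScan or CD-Search output.

```python
from collections import Counter

def classify_nlr(domains):
    """Assign an NLR subfamily from the set of domains found in one protein.

    Returns None when no NB-ARC domain is present (not an NLR candidate).
    The TIR > RPW8 > CC priority is an assumed tie-breaking order.
    """
    if "NB-ARC" not in domains:
        return None
    if "TIR" in domains:
        return "TNL"
    if "RPW8" in domains:
        return "RNL"
    if "CC" in domains:
        return "CNL"
    return "unclassified"

# Hypothetical domain annotations per protein
annotations = {
    "p1": {"CC", "NB-ARC", "LRR"},
    "p2": {"TIR", "NB-ARC", "LRR"},
    "p3": {"RPW8", "NB-ARC"},
    "p4": {"NB-ARC"},
    "p5": {"LRR"},
}
classes = {protein: classify_nlr(domains) for protein, domains in annotations.items()}
summary = Counter(label for label in classes.values() if label is not None)
print(classes)
print(summary)
```

Proteins with an NB-ARC domain but no recognizable N-terminal domain are kept as "unclassified" rather than dropped, since truncated NLRs are common and still count toward repertoire size.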

Expression Analysis and Functional Validation

Diagram 2: Experimental Workflow for Disease Response Assessment. The diagram outlines the integrated approach combining phenotypic evaluation with transcriptomic analysis to link NLR gene expression to functional disease resistance outcomes.

The functional validation of NLR genes involves integrated approaches combining phenotypic assessments with molecular analyses:

Pathogen Inoculation Assays: Conduct controlled infections using relevant pathogens (e.g., Phomopsis asparagi for asparagus) with appropriate experimental controls and replication [36].
Phenotypic Scoring: Evaluate disease symptoms using standardized rating scales at multiple time points post-inoculation to quantify resistance levels [36].
Transcriptomic Profiling: Extract RNA from infected tissues at critical time points and perform RNA sequencing to quantify NLR gene expression patterns in response to pathogen challenge [36] [98].
Orthologous Gene Tracking: Identify conserved NLR pairs between wild and domesticated species through orthogroup analysis to trace evolutionary fate during domestication [36].
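The expression categories used to summarize the transcriptomic results (unchanged, downregulated, upregulated) can be derived from a log2 fold change. The sketch below is a simplified illustration: the expression values, the pseudocount, and the twofold cutoff are assumptions, and a real analysis would use replicate-aware statistics from a differential expression package.

```python
from collections import Counter
from math import log2

def classify_expression(control, infected, lfc_cutoff=1.0, pseudocount=1.0):
    """Label a gene by its log2 fold change after pathogen challenge.

    lfc_cutoff=1.0 corresponds to a twofold change; both defaults are assumptions.
    """
    lfc = log2((infected + pseudocount) / (control + pseudocount))
    if lfc >= lfc_cutoff:
        return "upregulated"
    if lfc <= -lfc_cutoff:
        return "downregulated"
    return "unchanged"

# Hypothetical normalized expression (control, infected) for preserved NLR genes
expression = {
    "nlr_a": (10.0, 12.0),
    "nlr_b": (40.0, 9.0),
    "nlr_c": (5.0, 30.0),
    "nlr_d": (20.0, 22.0),
}
labels = {gene: classify_expression(c, i) for gene, (c, i) in expression.items()}
pattern_counts = Counter(labels.values())
print(labels)
print(pattern_counts)
```

Tallying the labels for the orthologous NLR pairs gives the kind of summary shown in the expression pattern table, with the proportions depending on the chosen cutoff.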

Table 4: Essential Research Reagents and Tools for NLR Gene Analysis

Tool/Resource	Specific Application	Function in Research
TBtools v2.136	Genomic data analysis	Chromosomal mapping, gene distribution visualization, collinearity analysis
InterProScan	Protein domain characterization	Identification and validation of NBS domains and other structural motifs
MEME Suite	Conserved motif analysis	Prediction of conserved protein motifs within NBS domains
OrthoFinder	Orthologous gene analysis	Clustering of orthologous NLR genes across species
PlantCARE	Cis-element prediction	Identification of regulatory elements in promoter regions
WoLF PSORT	Subcellular localization	Prediction of protein localization within cellular compartments
MEGA Software	Phylogenetic analysis	Construction of evolutionary trees using maximum likelihood methods

The bioinformatics toolkit for NLR research encompasses specialized software for domain identification, phylogenetic reconstruction, and genomic visualization [36] [3] [94]. These tools enable researchers to move from raw genomic data to functional insights about NLR gene evolution and function.

For experimental validation, key resources include:

Reference genomes with high-quality annotations for both domesticated and wild relatives
Pathogen isolates for controlled inoculation studies
RNA sequencing platforms for transcriptomic profiling
Virus-induced gene silencing (VIGS) systems for functional characterization of candidate NLR genes [3]

This case study demonstrates that domestication-associated NLR contraction in garden asparagus represents both quantitative reduction in gene number and functional alterations in retained genes. The combination of comparative genomics, expression profiling, and phenotypic validation provides compelling evidence that artificial selection during domestication has compromised the immune repertoire of cultivated asparagus, rendering it more susceptible to fungal pathogens like Phomopsis asparagi.

The findings from asparagus align with broader patterns observed across diverse crop species, where domestication frequently results in contracted NLR repertoires and reduced disease resistance [95]. This recurring phenomenon highlights the importance of conserving wild relatives as valuable genetic resources for disease resistance breeding. Future efforts to enhance asparagus disease resistance could focus on introgressing NLR genes from wild species like A. setaceus and A. kiusianus into cultivated backgrounds, potentially restoring lost defensive capabilities while maintaining desirable agronomic traits.

The methodological framework presented here provides a roadmap for similar investigations in other crop species, enabling researchers to systematically evaluate how domestication and breeding have shaped plant immune systems. As genomic technologies continue to advance, such comparative approaches will increasingly inform precision breeding strategies aimed at developing more durable disease resistance in agricultural crops.

In the field of plant science, functional validation of candidate genes is a critical step in understanding molecular mechanisms underlying disease resistance. Within the context of comparative analysis of Nucleotide-Binding Site (NBS) genes across plant species, researchers require efficient and reliable techniques to confirm gene function. Virus-Induced Gene Silencing (VIGS) has emerged as a powerful reverse genetics tool that enables rapid characterization of plant resistance genes without the need for stable transformation. This method is particularly valuable for studying the expansive NBS-LRR gene family, which represents the largest class of plant disease resistance genes and displays remarkable diversity across plant species. This guide objectively compares VIGS with alternative functional validation methods and provides supporting experimental data from recent studies.

VIGS Methodology: Principles and Workflow

VIGS functions by harnessing a plant's natural RNA-based antiviral defense mechanism. The process begins with the engineering of a viral vector to carry a fragment of the target plant gene. When this recombinant virus infects the plant, double-stranded RNA replication intermediates trigger the plant's RNA interference (RNAi) machinery, leading to sequence-specific degradation of endogenous mRNA transcripts corresponding to the inserted fragment. This targeted degradation results in reduced expression of the gene of interest, allowing researchers to observe the phenotypic consequences of gene silencing.

The diagram below illustrates the workflow for functional validation of NBS-LRR genes using VIGS:

Standard VIGS Experimental Protocol:

Vector Selection: Choose appropriate VIGS vector (e.g., Tobacco Rattle Virus (TRV), Barley Stripe Mosaic Virus (BSMV)) based on plant species compatibility
Insert Design: Amplify 300-500 bp gene-specific fragment from target NBS-LRR gene using PCR with gene-specific primers containing appropriate restriction sites
Vector Construction: Clone fragment into VIGS vector using restriction digestion and ligation or Gateway recombination
Plant Inoculation: Inoculate plants with in vitro transcripts (RNA-based vectors) or by Agrobacterium-mediated delivery (DNA-based vectors)
Silencing Period: Allow 2-4 weeks for systemic silencing establishment
Validation: Confirm silencing efficiency via qRT-PCR and document phenotypic changes
Pathogen Challenge: Inoculate silenced plants with target pathogen and evaluate disease symptoms
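
The Insert Design step can be sketched in code. The example below is a minimal illustration, not a published design tool: it slides a fixed-length window along a target coding sequence and picks the window that shares the fewest 21-nt words with other family members, as a rough proxy for off-target silencing risk in a large, similar gene family such as NBS-LRR. The 21-nt word size, the function names, and the scoring rule are assumptions made for illustration; reverse-complement matches and siRNA efficacy are not modeled.

```python
from typing import Iterable, Set, Tuple

SIRNA_LEN = 21  # assumed siRNA length used as the off-target proxy


def kmers(seq: str, k: int = SIRNA_LEN) -> Set[str]:
    """Return the set of all k-mers in an upper-cased sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def pick_fragment(target: str, off_targets: Iterable[str],
                  length: int = 300, k: int = SIRNA_LEN) -> Tuple[int, int]:
    """Return (start, shared_kmers) for the window of `length` bp in
    `target` that shares the fewest k-mers with the off-target sequences.
    Ties are resolved in favor of the left-most window."""
    target = target.upper()
    if len(target) < length:
        raise ValueError("target is shorter than the requested fragment")
    forbidden: Set[str] = set()
    for seq in off_targets:
        forbidden |= kmers(seq, k)
    best_start, best_hits = 0, None
    for start in range(len(target) - length + 1):
        window = target[start:start + length]
        # Count k-mer positions in this window that also occur in an off-target
        hits = sum(1 for i in range(length - k + 1)
                   if window[i:i + k] in forbidden)
        if best_hits is None or hits < best_hits:
            best_start, best_hits = start, hits
    return best_start, best_hits
```

In practice, a candidate fragment chosen this way would still need to be checked against the full transcriptome of the host species before cloning.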

Comparative Analysis of Functional Validation Methods

The table below provides a systematic comparison of VIGS against alternative approaches for validating NBS-LRR gene function:

Table 1: Comparison of Functional Validation Methods for Plant Disease Resistance Genes

Method	Key Features	Time Required	Technical Requirements	Key Advantages	Major Limitations
VIGS	Transient silencing; no stable transformation	3-6 weeks	Vector construction, plant inoculation	Rapid assessment, applicable to non-transformable species, suitable for high-throughput screening	Variable silencing efficiency, potential off-target effects, transient nature
Stable Transformation	Permanent gene integration; overexpression or knockout	6-12 months	Tissue culture, transformation facilities	Stable and heritable, precise genetic modification	Time-consuming, species-dependent efficiency, somaclonal variation
CRISPR-Cas9	Targeted gene editing; precise mutations	9-15 months	Vector design, tissue culture, transformation	Precise genome editing, stable knockouts, heritable modifications	Technically demanding, time-consuming, off-target potential
TILLING	Screening of mutagenized populations for induced point mutations (EcoTILLING for natural variants); non-transgenic	4-6 months	Mutation screening platform	Non-GMO approach, available for diverse germplasm	Limited to mutations present in the screened population, laborious screening

VIGS Applications in NBS-LRR Gene Validation: Case Studies

Recent research has demonstrated the effectiveness of VIGS for functional characterization of NBS-LRR genes across multiple plant species. The following examples highlight its application in different experimental contexts:

1. Validation of Fusarium Wilt Resistance in Tung Trees: In a 2024 study, researchers employed VIGS to validate the role of Vm019719, an NBS-LRR gene conferring resistance to Fusarium wilt in the resistant tung tree species Vernicia montana. Silencing of Vm019719 led to compromised resistance, confirming its essential role in disease defense. The orthologous gene in the susceptible species V. fordii (Vf11G0978) contained a promoter deletion that rendered it non-functional [27].

2. Functional Analysis of Pasmo Resistance in Flax: A 2025 study utilized VIGS to investigate the role of LuWRKY39 in flax resistance to Septoria linicola, the causal agent of pasmo disease. Silenced plants exhibited enhanced susceptibility to fungal infection, with corresponding disease index statistics confirming the crucial role of this gene in flax disease resistance [100]. Although LuWRKY39 encodes a WRKY transcription factor rather than an NBS-LRR receptor, the study illustrates the same VIGS validation workflow applied to a defense-related gene.

3. Characterization of Cotton NBS Genes in Virus Resistance: In a comprehensive analysis of NBS domain genes, researchers used VIGS to silence GaNBS (OG2) in resistant cotton. The results indicated a role for this gene in limiting virus titers during cotton leaf curl disease infection [3].
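
Disease index statistics of the kind mentioned in the flax case study are commonly computed from graded symptom scores. The sketch below uses a widely used weighted formula (sum of grade times plant count, divided by the maximum grade times the total number of plants, expressed as a percentage). Whether the cited study used exactly this formula is an assumption, and the grade scale and counts in the usage note are hypothetical.

```python
from typing import Mapping


def disease_index(counts: Mapping[int, int], max_grade: int) -> float:
    """Weighted disease index in percent.

    counts maps a symptom grade (0 = healthy ... max_grade = most severe)
    to the number of plants scored at that grade.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no plants were scored")
    if any(grade < 0 or grade > max_grade for grade in counts):
        raise ValueError("grade outside the 0..max_grade scale")
    weighted = sum(grade * n for grade, n in counts.items())
    return 100.0 * weighted / (max_grade * total)
```

For example, with a hypothetical 0 to 4 scale, `disease_index({0: 5, 1: 3, 2: 2}, 4)` describes a mostly healthy control group, while `disease_index({2: 2, 3: 4, 4: 4}, 4)` describes a heavily diseased silenced group; comparing the two values gives a single summary of the change in susceptibility.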

Research Reagent Solutions for VIGS Experiments

The table below outlines essential reagents and materials required for implementing VIGS in functional studies of plant disease resistance genes:

Table 2: Essential Research Reagents for VIGS-Based Functional Studies

Reagent/Material	Function/Purpose	Examples/Specifications
VIGS Vectors	Delivery of target gene fragments into plant cells	TRV (Tobacco Rattle Virus), BSMV (Barley Stripe Mosaic Virus), ALSV (Apple Latent Spherical Virus)
Agrobacterium Strains	Delivery of DNA-based VIGS vectors into plant cells	Agrobacterium tumefaciens GV3101, AGL1
Enzymes for Cloning	Vector construction and insert preparation	Restriction enzymes, ligases, polymerases for PCR
Plant Growth Facilities	Controlled environment for plant maintenance	Growth chambers with temperature, light, and humidity control
Pathogen Isolates	Challenge experiments after gene silencing	Verified isolates of target pathogens (e.g., Fusarium oxysporum, Septoria linicola)
Molecular Analysis Kits	Validation of silencing efficiency	RNA extraction kits, cDNA synthesis kits, qPCR reagents

Signaling Pathways in NBS-LRR Mediated Resistance

NBS-LRR proteins function as intracellular immune receptors that recognize pathogen effectors and initiate defense signaling cascades. The diagram below illustrates key signaling pathways in NBS-LRR mediated disease resistance:

The NBS domain serves as a molecular switch for activation of NBS-LRR proteins. In the resting state, the domain binds ADP. Upon pathogen recognition, conformational changes promote ADP-ATP exchange, triggering the protein's active state and initiation of defense signaling [101]. This activation leads to downstream immune responses including the hypersensitive response, which confines pathogens to infection sites through localized programmed cell death [102].

Advantages and Limitations of VIGS in Comparative NBS Gene Studies

VIGS offers distinct advantages for comparative functional analysis of NBS genes across multiple plant species. Its transient nature enables rapid assessment of gene function without the need for stable transformation, which is particularly valuable for species with recalcitrant transformation systems or long life cycles. This approach facilitates medium-throughput screening of multiple candidate genes identified through comparative genomic analyses, allowing researchers to prioritize the most promising targets for in-depth characterization.

However, VIGS does present certain limitations that must be considered in experimental design. Silencing efficiency can be variable across different plant species, tissues, and individual genes. The technique typically generates partial rather than complete loss-of-function phenotypes, potentially missing subtle gene functions. Additionally, careful controls are essential to account for potential off-target effects and physiological impacts of viral infection itself.

VIGS represents a powerful methodology for functional validation within comparative studies of NBS genes across plant species. Its rapid implementation, applicability to diverse species, and compatibility with medium-throughput approaches make it particularly valuable for initial functional screening of candidate genes identified through genomic analyses. While stable transformation and gene editing methods provide more permanent genetic modifications for conclusive validation, VIGS serves as an efficient frontline tool for prioritizing candidate resistance genes. The integration of VIGS into comparative functional genomics workflows accelerates the identification of key NBS-LRR genes underlying disease resistance, ultimately supporting the development of improved crop varieties with enhanced pathogen resistance.

Functional Validation and Cross-Species Comparative Genomics of NBS Genes

The nucleotide-binding site (NBS) domain represents a fundamental architectural component of plant resistance (R) proteins, which constitute the frontline defense against diverse pathogens. These genes, particularly those belonging to the NBS-LRR (NLR) superfamily, function as specialized immune receptors that recognize pathogen-secreted effectors and activate robust defense responses through effector-triggered immunity (ETI) [3]. The NBS-encoding gene family exhibits remarkable structural diversity across plant species, with classifications primarily falling into TNL (TIR-NBS-LRR), CNL (CC-NBS-LRR), and RNL (RPW8-NBS-LRR) subfamilies based on their N-terminal domains [25] [15]. Among these, CNL-type genes frequently serve as detectors of pathogen effectors, either through direct interaction or by monitoring host protein modifications [3].
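
The subfamily classification described above can be expressed as a small rule set. The following is a minimal sketch that assumes domain annotations are already available as an ordered list of labels (for example from a domain search); the label strings and the function name are illustrative assumptions, and real pipelines usually handle many more atypical architectures.

```python
from typing import Iterable


def classify_nbs(domains: Iterable[str]) -> str:
    """Classify a protein from its domain labels in N- to C-terminal order.

    Returns TNL, CNL or RNL for full-length receptors, truncated forms such
    as CN, TN, RN, NL or N, and "non-NBS" if no NB-ARC domain is present.
    """
    doms = [d.upper() for d in domains]
    if "NB-ARC" not in doms:
        return "non-NBS"
    # Only domains located before the NB-ARC domain define the subfamily
    n_terminal = doms[:doms.index("NB-ARC")]
    has_lrr = "LRR" in doms
    prefix = ""
    for label, code in (("TIR", "T"), ("CC", "C"), ("RPW8", "R")):
        if label in n_terminal:
            prefix = code
            break
    return prefix + "N" + ("L" if has_lrr else "")
```

A call such as `classify_nbs(["CC", "NB-ARC", "LRR"])` yields a CNL label, while `classify_nbs(["CC", "NB-ARC"])` yields the truncated CN type.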

In the context of plant-virus interactions, NBS genes play a pivotal role in conferring resistance against viral pathogens. Recent comparative genomic analyses have identified 12,820 NBS-domain-containing genes across 34 plant species, revealing both classical and species-specific structural architectures [3]. These findings establish NBS genes as crucial components in the evolutionary arms race between plants and their viral pathogens, with implications for breeding resistant crop varieties.

Comparative Genomic Landscape of NBS Genes Across Plant Species

Genomic Distribution and Evolutionary Patterns

NBS genes display non-random chromosomal distribution patterns, frequently organizing in clusters that potentially facilitate rapid evolution through gene duplication and diversification events. Comparative analyses across multiple plant families reveal significant variation in NBS gene abundance and architecture, reflecting distinct evolutionary trajectories in different lineages.

Table 1: Comparative Analysis of NBS Gene Family Across Plant Species

Plant Species	Family	Total NBS Genes	Predominant Types	Clustering Pattern	Key Evolutionary Feature
Gossypium hirsutum (Cotton)	Malvaceae	Not specified	CNL, TNL	Tandem clusters	OG2 (GaNBS) associated with CLCuD resistance
Ipomoea batatas (Sweet potato)	Convolvulaceae	889	CN-type, N-type	83.13% in clusters	Higher segmental duplications
Ipomoea trifida	Convolvulaceae	554	CN-type, N-type	76.71% in clusters	More tandem duplications
Ipomoea nil	Convolvulaceae	757	CN-type, N-type	86.39% in clusters	Species-specific expansions
Salvia miltiorrhiza	Lamiaceae	196	CNL, TNL	Not specified	Reduction in TNL/RNL members
Asparagus officinalis	Asparagaceae	27	CNL, TNL	Chromosomal clusters	Domesticated contraction from wild relatives

This comparative analysis reveals that frequent gene duplication events, both tandem and segmental, have driven the expansion and diversification of NBS genes across plant genomes. The observed clustering patterns suggest evolutionary mechanisms for generating diversity in pathogen recognition capabilities. Notably, domesticated species like Asparagus officinalis exhibit significant gene family contraction (63 to 27 NLR genes) compared to their wild relatives, potentially explaining increased disease susceptibility in cultivated varieties [15].
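
The percentages in the table can be turned back into approximate gene counts, and the asparagus contraction can be expressed as a percentage. The short calculation below uses only the figures reported above; the helper names are illustrative.

```python
# (total NBS genes, percent located in clusters) as reported in the table above
IPOMOEA = {
    "Ipomoea batatas": (889, 83.13),
    "Ipomoea trifida": (554, 76.71),
    "Ipomoea nil": (757, 86.39),
}


def clustered_count(total: int, percent: float) -> int:
    """Approximate number of clustered genes implied by a percentage."""
    return round(total * percent / 100)


def contraction_percent(before: int, after: int) -> float:
    """Relative reduction in gene number, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for species, (total, pct) in IPOMOEA.items():
        print(species, clustered_count(total, pct))
    print("Asparagus NLR contraction (%):", round(contraction_percent(63, 27), 1))
```

The reported percentages correspond to roughly 739, 425, and 654 clustered genes in the three Ipomoea species, and the change from 63 to 27 NLR genes is a reduction of about 57%.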

Orthogroup Conservation and Functional Specialization

Orthogroup analysis has identified 201 conserved NBS-encoding orthologous genes forming syntenic gene pairs across four Ipomoea species, indicating common ancestry and potential functional conservation [25]. Among these, several core orthogroups (OG0, OG1, OG2, etc.) demonstrate conserved functions across species, while unique orthogroups (OG80, OG82) exhibit species-specific specialization [3]. This evolutionary framework provides critical context for understanding the position of GaNBS (OG2) within the broader landscape of plant NBS genes, suggesting it belongs to a conserved functional class with specialized roles in viral defense.
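
The distinction between core and species-specific orthogroups can be illustrated with a presence/absence check. This is a minimal sketch assuming orthogroup membership has already been computed by an orthology tool; the category names, the species labels in the usage note, and the rule that "core" means present in every species examined are assumptions made for illustration.

```python
from typing import Dict, Set


def classify_orthogroups(members: Dict[str, Set[str]],
                         all_species: Set[str]) -> Dict[str, str]:
    """Label each orthogroup by how many of the examined species it covers.

    "core": present in every species; "species-specific": present in exactly
    one species; "shared": present in some but not all species.
    """
    labels: Dict[str, str] = {}
    for orthogroup, species in members.items():
        present = species & all_species
        if present == all_species:
            labels[orthogroup] = "core"
        elif len(present) == 1:
            labels[orthogroup] = "species-specific"
        else:
            labels[orthogroup] = "shared"
    return labels
```

With three hypothetical species A, B, and C, an orthogroup found in all three would be labeled core, and one found only in A would be labeled species-specific.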

Experimental Validation: VIGS Silencing of GaNBS (OG2) in Cotton

Virus-Induced Gene Silencing (VIGS) Methodology

Virus-Induced Gene Silencing represents a powerful reverse genetics technology that exploits the plant's innate RNA-mediated antiviral defense mechanism to silence endogenous genes [103] [104]. The fundamental principle involves engineering viral vectors to carry host gene fragments, which trigger sequence-specific mRNA degradation through the post-transcriptional gene silencing (PTGS) pathway [104] [105].

The technical workflow for VIGS-mediated validation of GaNBS function encompasses several critical stages:

Vector Construction: A recombinant viral vector (pV190 derivative) is generated containing a 300-500bp fragment of the target GaNBS gene sequence [105].
Agrobacterium-Mediated Delivery: The recombinant vector is introduced into Agrobacterium tumefaciens strain GV3101, cultured to OD₆₀₀ 0.6-0.8, and resuspended in infiltration buffer (10 mM MgCl₂, 10 mM MES, 200 μM acetosyringone (AS)) [105].
Plant Inoculation: Cotton seedlings at the 2-true-leaf stage are inoculated via agroinfiltration, with needleless syringes used to deliver the bacterial suspension through minor wounds on cotyledons or true leaves [105].
Silencing Induction: Inoculated plants are maintained under high humidity for 24-48 hours, then transferred to standard growth conditions (28°C/24°C, 16h light/8h dark) for phenotypic development [3].
Validation Assessment: Silencing efficiency is quantified through RT-qPCR analysis of target gene expression, with parallel monitoring of viral titers and disease symptom development [3].
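
Silencing efficiency from RT-qPCR data is often summarized with the comparative Ct (2^-ΔΔCt) method. The sketch below assumes that method, a single reference gene, and roughly 100% amplification efficiency; the source does not state which quantification model was used, and the Ct values in the usage note are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative target expression in a sample versus a control (2^-ddCt)."""
    delta_sample = ct_target_sample - ct_ref_sample      # normalize silenced plant
    delta_control = ct_target_control - ct_ref_control   # normalize control plant
    return 2.0 ** -(delta_sample - delta_control)


def silencing_efficiency(rel_expression: float) -> float:
    """Percent reduction of target transcript relative to the control."""
    return 100.0 * (1.0 - rel_expression)
```

For instance, a silenced plant with target Ct 26 and reference Ct 18, compared with a control at target Ct 24 and reference Ct 18, gives a relative expression of 0.25, which corresponds to 75% silencing.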

Functional Evidence: GaNBS Silencing and Viral Susceptibility

The critical experiment validating GaNBS function in viral defense involved comparative analysis between a resistant cotton accession (Mac7) and a susceptible variety (Coker 312) following GaNBS silencing. Genetic variation analysis revealed 6,583 unique NBS gene variants in resistant Mac7 compared to 5,173 in susceptible Coker 312, suggesting that structural differences in NBS repertoires contribute to resistance mechanisms [3].

Table 2: Experimental Results of GaNBS Silencing in Cotton-Virus Interaction

Experimental Parameter	Control (Non-Silenced)	GaNBS-Silenced Plants	Measurement Technique
GaNBS Expression Level	100% (Baseline)	Significantly reduced	RT-qPCR
Viral Titer Accumulation	Low	Markedly increased	qPCR against viral genomes
Disease Symptom Severity	Mild	Severe	Phenotypic scoring
Plant Defense Activation	Strong	Compromised	Marker gene expression
Orthogroup Expression	OG2, OG6, OG15 upregulated in resistant lines	OG2 specifically downregulated	RNA-seq and expression profiling

Experimental data demonstrated that silencing of GaNBS (OG2) in naturally resistant cotton resulted in significantly increased viral titers of Cotton Leaf Curl Disease (CLCuD) pathogens, which are begomoviruses from the Geminiviridae family [3]. This finding provides direct evidence that GaNBS plays a pivotal role in conferring resistance against this economically significant viral disease. Protein interaction studies further supported these findings, revealing strong binding between putative NBS proteins and core proteins of the cotton leaf curl disease virus, suggesting a potential recognition mechanism [3].

Molecular Mechanisms of NBS-Mediated Viral Defense

Signaling Pathways in NBS-Mediated Immunity

NBS proteins function as central components in plant immune signaling networks, triggering coordinated defense responses upon pathogen recognition. The molecular mechanism of GaNBS-mediated defense likely follows established NLR signaling paradigms with possible virus-specific adaptations.

The NBS-mediated defense signaling involves conformational changes in the NB-ARC domain upon pathogen perception, leading to activation of downstream defense cascades. These include hypersensitive response (HR) initiation, systemic acquired resistance (SAR) activation, and direct execution of antiviral restriction mechanisms [3] [25]. The specific involvement of GaNBS in restricting cotton leaf curl virus accumulation suggests it may directly or indirectly interfere with viral replication or movement within plant tissues.

Transcriptional Regulation of NBS Genes

Expression profiling of NBS orthogroups across different cotton accessions revealed that OG2, OG6, and OG15 show distinct upregulation patterns in resistant genotypes under biotic stress conditions [3]. This coordinated expression suggests potential functional specialization within the NBS network, with different orthogroups potentially targeting distinct pathogen components or activating specific defense pathways. The promoter regions of NBS genes are enriched with cis-elements responsive to defense signals and phytohormones, providing mechanistic insight into their pathogen-inducible expression patterns [5] [15].

Table 3: Essential Research Reagents for NBS Gene Functional Studies

Reagent/Resource	Function/Application	Specific Examples	Experimental Role
VIGS Vectors	Delivery of target gene fragments for silencing	TRV, CGMMV-pV190, ALSV	Induce transient gene silencing
Agrobacterium Strains	Plant transformation vehicle	GV3101, LBA4404	Deliver viral vectors into plant cells
Gene-Specific Primers	Amplification of target sequences	LaPDS, LaTEN, GaNBS fragments	Clone specific regions into VIGS vectors
Infiltration Buffers	Facilitating bacterial entry	10 mM MgCl₂, 10 mM MES, 200 μM acetosyringone (AS)	Maintain bacterial viability during inoculation
Antibiotic Selection	Maintaining vector integrity	Kanamycin, Rifampicin	Select for transformed Agrobacterium
RNAi Machinery Components	Endogenous silencing apparatus	DCL, AGO, RDR proteins	Execute sequence-specific mRNA degradation

This research toolkit highlights the essential materials required for implementing VIGS-based functional validation of NBS genes. The CGMMV-based pV190 vector system has demonstrated particular utility in cucurbit species including Luffa acutangula, where it effectively silenced the tendril development gene TEN and the marker gene PDS [105]. Similarly, TRV-based vectors remain widely applicable across Solanaceae species including pepper and tobacco [106]. The selection of appropriate vector systems must consider host range specificity and silencing efficiency in the target species.

The functional validation of GaNBS (OG2) through VIGS silencing provides compelling evidence for its essential role in antiviral defense against cotton leaf curl disease. This finding significantly advances our understanding of NBS gene function within the broader context of plant immunity, demonstrating that specific orthogroups have specialized functions against particular pathogen types. The conservation of NBS orthogroups across multiple plant species suggests that knowledge gained from cotton may be transferable to other crops facing similar viral challenges.

From a practical perspective, these results have substantial implications for molecular breeding programs aimed at enhancing viral resistance in cotton and potentially other crops. The identification of GaNBS as a key resistance component enables the development of marker-assisted selection strategies and provides potential targets for precision genome editing approaches. Furthermore, the successful application of VIGS technology for rapid gene function validation establishes a methodology that can be extended to characterize other NBS family members across diverse crop species, accelerating the discovery of valuable resistance genes for agricultural improvement.

Plant nucleotide-binding site (NBS) leucine-rich repeat (LRR) proteins are encoded by one of the largest gene families in plants and serve as critical intracellular immune receptors that mediate effector-triggered immunity [2] [107]. These proteins function as molecular switches in disease signaling pathways, with specific ATP/ADP binding and hydrolysis providing the energy for conformational changes that activate downstream defense responses [107]. The NBS domain, also referred to as the NB-ARC (Nucleotide Binding Adaptor shared by APAF-1, R proteins, and CED-4) domain, contains several conserved motifs characteristic of the "signal transduction ATPases with numerous domains" (STAND) family of ATPases [107]. Plant NBS-LRR proteins are broadly categorized into two major subfamilies based on their N-terminal domains: those with Toll/interleukin-1 receptor (TIR) domains (TNLs) and those with coiled-coil (CC) domains (CNLs) [108] [107]. This comparative analysis examines the molecular mechanisms of NBS protein interactions with nucleotides and viral pathogen effectors, providing researchers with experimental frameworks and mechanistic insights relevant to plant immunity and disease resistance breeding.

Molecular Switch Mechanism: NBS Domain Interactions with ADP/ATP

The NBS domain functions as a nucleotide-dependent molecular switch that regulates the transition between inactive and active signaling states. Specific binding and hydrolysis of ATP have been experimentally demonstrated for the NBS domains of the tomato CNL proteins I2 and Mi [107]. Conformational alterations induced by effector recognition are thought to promote the exchange of ADP for ATP by the NBS domain, which activates downstream signaling through an as yet unknown mechanism that ultimately leads to pathogen resistance [2].
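
The switch behavior can be written down as a two-state model. This is a deliberately simplified sketch for orientation only: the state and event names are invented for illustration, and the assumption that ATP hydrolysis returns the protein to the resting ADP-bound state reflects a common working model rather than a result reported in this section.

```python
from enum import Enum


class State(Enum):
    ADP_BOUND = "resting (ADP-bound)"
    ATP_BOUND = "active (ATP-bound)"


class Event(Enum):
    EFFECTOR_RECOGNIZED = "effector recognized, ADP exchanged for ATP"
    ATP_HYDROLYZED = "ATP hydrolyzed to ADP"


# Allowed transitions of the simplified switch; every other
# (state, event) pair leaves the state unchanged.
TRANSITIONS = {
    (State.ADP_BOUND, Event.EFFECTOR_RECOGNIZED): State.ATP_BOUND,
    (State.ATP_BOUND, Event.ATP_HYDROLYZED): State.ADP_BOUND,
}


def step(state: State, event: Event) -> State:
    """Return the next state of the switch after an event."""
    return TRANSITIONS.get((state, event), state)


def signaling_on(state: State) -> bool:
    """Downstream defense signaling is modeled as active only when ATP-bound."""
    return state is State.ATP_BOUND
```

Starting from `State.ADP_BOUND`, an `EFFECTOR_RECOGNIZED` event switches signaling on, and a subsequent `ATP_HYDROLYZED` event switches it off again in this model.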

Table 1: Conserved Motifs in Plant NBS Domains and Their Proposed Functions

Motif Name	Conserved Sequence	Function in Nucleotide Binding
P-loop (Walker A)	GxGGLGKT	Phosphate binding of ATP/ADP [107]
Walker B	hhhhDD	Coordination of Mg²⁺ ion and ATP hydrolysis [107]
RNBS-A	FDLxLxKF	Nucleotide binding specificity [107]
RNBS-C	GxPLLR	Domain stability and nucleotide sensing [107]
RNBS-D	CFGCYxL	Redox regulation and signaling [107]
MHD	MxHxDxS	Nucleotide exchange regulation [107]
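
The consensus strings in the table can be converted into regular expressions to scan a protein sequence for candidate motif positions. The sketch below takes the table's patterns as given, reads "x" as any residue and "h" as a hydrophobic residue, and uses an assumed hydrophobic residue set; actual motif sequences vary between NBS subfamilies, so profile-based searches are normally preferred over exact pattern matching.

```python
import re
from typing import List, Tuple

HYDROPHOBIC = "AVLIMFWC"  # assumed residue set for the 'h' placeholder

# Consensus patterns copied from the table above
MOTIFS = {
    "P-loop": "GxGGLGKT",
    "Walker B": "hhhhDD",
    "RNBS-A": "FDLxLxKF",
    "RNBS-C": "GxPLLR",
    "RNBS-D": "CFGCYxL",
    "MHD": "MxHxDxS",
}


def to_regex(pattern: str) -> str:
    """Translate consensus notation into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "x":            # any residue
            parts.append(".")
        elif ch == "h":          # hydrophobic residue (lower-case h only)
            parts.append("[" + HYDROPHOBIC + "]")
        else:                    # literal amino acid
            parts.append(re.escape(ch))
    return "".join(parts)


def scan(protein: str) -> List[Tuple[str, int]]:
    """Return (motif name, 0-based start) hits sorted by position."""
    protein = protein.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in re.finditer(to_regex(pattern), protein):
            hits.append((name, match.start()))
    return sorted(hits, key=lambda hit: hit[1])
```

The order of hits along the sequence can then be compared with the expected N- to C-terminal motif order as a quick sanity check on a candidate NBS domain.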

In a comprehensive analysis of NBS-domain-containing genes across 34 plant species, protein-ligand interaction experiments demonstrated a strong interaction of putative NBS proteins with ADP/ATP, consistent with conserved nucleotide binding across diverse plant species [3]. Threading plant NBS domains onto the crystal structure of human APAF-1 has provided significant insights into the spatial arrangement and function of the conserved motifs in plant NBS domains, revealing a three-layered α/β architecture that facilitates nucleotide-dependent conformational changes [107].

Table 2: Experimental Evidence for NBS-ATP/ADP Interactions

Experimental Method	Key Findings	Representative NBS Proteins Studied
Homology modeling	NBS domains share structural similarity with APAF-1 nucleotide-binding domain	Arabidopsis RPS5, Tomato I2 and Mi [107]
Protein-ligand interaction	Strong binding affinity to ADP/ATP	Cotton GaNBS (OG2) [3]
ATP hydrolysis assays	Demonstrated ATP binding and hydrolysis activity	Tomato I2 and Mi CNL proteins [107]
Mutational analysis	Walker A mutations disrupt nucleotide binding	Rx NB domain fragment [108]

Pathogen Recognition: Direct and Indirect Interaction Mechanisms

Plant NBS-LRR proteins have evolved two primary mechanistic strategies for pathogen detection: direct and indirect recognition. Direct recognition involves physical binding between the NBS-LRR protein and pathogen effector molecules, while indirect recognition occurs through monitoring host proteins that are modified by pathogen effectors [2].

Direct Recognition of Pathogen Effectors

Substantial evidence supports the direct interaction model, particularly through binding of pathogen effectors to the LRR domain of NBS proteins. Key examples include:

The rice R protein Pi-ta directly binds the effector AVR-Pita from the rice blast fungus Magnaporthe grisea through its LRR domain, as demonstrated by yeast two-hybrid experiments [2].
The flax L5, L6, and L7 proteins directly interact with specific variants of the flax rust AvrL567 effector in yeast two-hybrid systems, recapitulating in vivo specificity [2].
The wheat Ym1 CC-NBS-LRR protein specifically interacts with wheat yellow mosaic virus (WYMV) coat protein, with this interaction leading to nucleocytoplasmic redistribution and activation of defense responses [109].
The Arabidopsis RRS1 protein binds the bacterial wilt pathogen protein PopP2 in split-ubiquitin yeast two-hybrid experiments [2].

The LRR domain forms barrel-like structures with parallel β-sheets lining the inner concave surface, creating a versatile binding interface capable of recognizing diverse pathogen molecules [2]. Diversifying selection has maintained variation in the solvent-exposed residues of the β-sheets of the LRR domain, enhancing recognition capacity [107].

Indirect Recognition Through Guarded Host Proteins

The guard hypothesis proposes that NBS proteins monitor the status of host "guardee" proteins that are targeted by pathogen effectors [2]. Well-characterized examples include:

The Arabidopsis RPM1 protein detects the bacterial effectors AvrRpm1 and AvrB through their modification of the host protein RIN4. Both effectors induce phosphorylation of RIN4, which is detected by RPM1 [2].
The Arabidopsis RPS2 protein recognizes cleavage of RIN4 by the bacterial effector AvrRpt2 [2].
The Arabidopsis RPS5 protein detects proteolytic cleavage of the kinase PBS1 by the bacterial effector AvrPphB, forming a ternary complex that activates defense signaling [2].
The tomato Prf protein indirectly detects the effectors AvrPto and AvrPtoB through their interaction with the host Pto kinase [2].

Figure 1: Direct and Indirect Pathogen Recognition Pathways by NBS-LRR Proteins

Experimental Approaches for Studying NBS Protein Interactions

Protein-Protein Interaction Assays

Multiple experimental systems have been successfully employed to characterize NBS protein interactions with pathogen effectors and host proteins:

Yeast Two-Hybrid Systems: This approach has been particularly valuable for detecting direct protein interactions. The methodology involves cloning the NBS-LRR gene (often as the "bait") and pathogen effector gene (as the "prey") into specialized vectors, co-transforming into yeast strains, and selecting for interactions on appropriate dropout media [2]. For the rice Pi-ta and AVR-Pita interaction, yeast two-hybrid experiments demonstrated binding between the LRR domain and the fungal effector, representing the first direct evidence of an AVR-R protein interaction [2]. Similarly, split-ubiquitin yeast two-hybrid experiments confirmed the interaction between Arabidopsis RRS1 and the bacterial PopP2 protein [2].

Virus-Induced Gene Silencing (VIGS): This powerful functional tool allows rapid assessment of NBS gene function in plant resistance. The protocol involves cloning a fragment of the target NBS gene into a viral vector, infecting plants with the modified virus, and assessing changes in disease susceptibility after pathogen challenge [3] [27]. For example, silencing of GaNBS (OG2) in resistant cotton indicated a role for this gene in limiting virus titers during cotton leaf curl disease [3]. Similarly, VIGS experiments with Vm019719 in Vernicia montana confirmed its role in Fusarium wilt resistance [27].

Protein-Ligand Interaction Studies

Molecular Dynamics Simulations: Computational approaches including molecular dynamics simulations spanning microsecond timescales have been employed to investigate nucleotide binding and allosteric communication in NBS domains [110]. These simulations analyze conformational changes, nucleotide binding stability, and the impact of lipid environments on NBS protein function.

Protein-Ligand Binding Assays: Experimental characterization of NBS protein interactions with ADP/ATP has been demonstrated through in vitro binding assays using recombinant NBS domains. Isothermal titration calorimetry and surface plasmon resonance have been applied to quantify binding affinities and thermodynamic parameters [3] [107].

Table 3: Research Reagent Solutions for NBS Protein Interaction Studies

Reagent/Resource	Specific Application	Function and Utility
Yeast Two-Hybrid Systems	Direct protein interaction detection	Identifies physical binding between NBS proteins and effectors [2]
Split-Ubiquitin System	Membrane protein interactions	Detects interactions with membrane-associated proteins [2]
VIGS Vectors	Functional validation in plants	Assesses NBS gene function through targeted silencing [3] [27]
Recombinant NBS Domains	Nucleotide binding studies	Enables in vitro characterization of ATP/ADP interactions [107]
HMMER Software	NBS gene identification	Identifies NBS-encoding genes in genome sequences [3] [111]
Phytozome/BRAD Databases	Comparative genomics	Provides genomic data for cross-species comparisons [111]

Comparative Analysis of NBS Genes Across Plant Species

Genome-wide comparative analyses have revealed significant diversity in NBS gene composition, organization, and evolution across plant species. A recent study identified 12,820 NBS-domain-containing genes across 34 species ranging from mosses to monocots and dicots, classifying them into 168 distinct classes with diverse domain architectures [3]. Key comparative insights include:

Species-Specific Variations: The number of NBS genes varies substantially between species, with approximately 150 in Arabidopsis thaliana, over 400 in Oryza sativa, 90 in Vernicia fordii, and 149 in its resistant counterpart Vernicia montana [107] [27]. This expansion and contraction of NBS gene families reflects species-specific evolutionary paths and adaptation to distinct pathogen pressures.

TNL and CNL Distribution: TNL-type genes are completely absent from cereal genomes, suggesting loss in the monocot lineage after divergence from dicots [107] [111]. Analysis of Vernicia species revealed an absence of TNL genes in susceptible V. fordii, while resistant V. montana possesses 12 TNL-type genes, indicating potential correlation with disease resistance capacity [27].

Genomic Organization: NBS-encoding genes are frequently clustered in plant genomes as a result of both segmental and tandem duplications [107] [111]. Comparative analysis between Brassica species and Arabidopsis revealed that after whole genome triplication of the Brassica ancestor, NBS-encoding homologous gene pairs were rapidly deleted or lost, but species-specific gene amplification occurred through tandem duplication after species divergence [111].

Case Study: Wheat Ym1 Recognition of Viral Coat Protein

The recently cloned wheat Ym1 gene provides a compelling case study of direct NBS-effector recognition in viral immunity. Ym1 encodes a typical CC-NBS-LRR protein that is specifically expressed in roots and induced upon WYMV infection [109]. Key mechanistic insights include:

Ym1-mediated resistance blocks viral transmission from the root cortex into steles, preventing systemic movement to aerial tissues.
The Ym1 CC domain is essential for triggering cell death signaling.
Ym1 specifically interacts with WYMV coat protein, with this interaction leading to nucleocytoplasmic redistribution.
The Ym1-CP interaction facilitates transition from an auto-inhibited to an activated state, subsequently eliciting hypersensitive responses and establishing WYMV resistance.

This case exemplifies the molecular details of direct recognition, where a single NBS-LRR protein specifically binds a viral component and initiates defense signaling cascades that limit pathogen spread and establish resistance [109].

This comparative analysis demonstrates that NBS proteins function as central signaling hubs in plant immunity through their nucleotide-regulated conformational states and diverse pathogen recognition mechanisms. The experimental data summarized herein provide researchers with validated approaches for investigating NBS protein interactions with both nucleotides and pathogen effectors. Future research directions should focus on structural characterization of full-length NBS proteins in different nucleotide states, elucidation of downstream signaling components, and application of this knowledge to engineer broad-spectrum disease resistance in crop species. The continuing integration of genomic, protein interaction, and functional data will further illuminate the sophisticated mechanisms through which NBS proteins mediate plant immunity and offer new strategies for crop improvement.

Nucleotide-binding leucine-rich repeat receptors (NLRs) constitute the most prominent class of intracellular immune receptors in plants, responsible for recognizing pathogen effectors and activating robust defense responses, a mechanism known as effector-triggered immunity (ETI) [13] [97]. The composition and evolution of the NLR gene family are dynamic processes shaped by constant pathogen pressure, leading to significant variation across plant species [3]. This case study delves into a comparative genomic analysis of the NLR gene family in the horticulturally important crop garden asparagus (Asparagus officinalis) and its wild relatives, Asparagus setaceus and Asparagus kiusianus [112] [15]. Cultivated asparagus faces significant disease challenges, whereas its wild relatives often exhibit superior resistance, making them valuable genetic resources [15]. This research provides a framework for understanding how domestication and selection have impacted the NLR repertoire and, consequently, the immune capacity of a major vegetable crop.

Experimental Protocols and Methodologies

The comparative analysis of NLR genes in asparagus species relied on a multi-faceted bioinformatics and experimental pipeline. The following workflow outlines the key procedural stages.

Genome-Wide Identification and Classification of NLR Genes

Genome Data Acquisition: Genomic and annotation data for A. officinalis, A. kiusianus, and A. setaceus were obtained from respective databases (e.g., Plant GARDEN, Dryad Digital Repository) [15].
NLR Gene Identification: A dual approach was employed for comprehensive identification:
- HMM Searches: Hidden Markov Model (HMM) searches were performed using the profile for the conserved NB-ARC domain (Pfam: PF00931) [112] [15].
- BLASTp Analysis: Local BLASTp searches were conducted against reference NLR proteins from Arabidopsis thaliana, Oryza sativa, and Allium sativum with a stringent E-value cutoff of 1e-10 [15].
Domain Validation and Classification: Candidate sequences were validated using InterProScan and NCBI's Batch CD-Search to confirm the presence of the NB-ARC domain. Genes were classified into CNL, TNL, and RNL subfamilies based on their N-terminal domain (CC, TIR, or RPW8, respectively) and overall domain architecture using Pfam and PRGdb 4.0 databases [15].
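The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the cited study: the domain labels ("CC", "TIR", "RPW8", "NB-ARC", "LRR") and the function name classify_nlr are assumptions chosen for readability, whereas real InterProScan or CD-Search output reports database accessions that would first need to be mapped to such labels.

```python
# Hypothetical mapping from N-terminal domain label to NLR subfamily.
N_TERMINAL_TO_SUBFAMILY = {"CC": "CNL", "TIR": "TNL", "RPW8": "RNL"}


def classify_nlr(domains):
    """Classify a protein from its ordered list of domain labels.

    Returns None when no NB-ARC domain is present (not an NLR candidate),
    a subfamily label when a known N-terminal domain precedes NB-ARC,
    and "other" when NB-ARC is present without a recognised N-terminal domain.
    """
    if "NB-ARC" not in domains:
        return None
    nb_index = domains.index("NB-ARC")
    # Only domains located before NB-ARC count as N-terminal.
    for label in domains[:nb_index]:
        if label in N_TERMINAL_TO_SUBFAMILY:
            return N_TERMINAL_TO_SUBFAMILY[label]
    return "other"


if __name__ == "__main__":
    examples = {
        "protein_1": ["CC", "NB-ARC", "LRR"],
        "protein_2": ["TIR", "NB-ARC", "LRR"],
        "protein_3": ["RPW8", "NB-ARC"],
        "protein_4": ["NB-ARC", "LRR"],
        "protein_5": ["LRR"],
    }
    for name, doms in examples.items():
        print(name, classify_nlr(doms))
```

Truncated architectures (for example, NB-ARC without LRR) are deliberately kept rather than discarded here, because such genes may still be of interest; a stricter pipeline could filter on the presence of an LRR domain as well.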

Phylogenetic and Evolutionary Analysis

Multiple Sequence Alignment and Tree Construction: Protein sequences of candidate NLRs were aligned using Clustal Omega. A phylogenetic tree was constructed using the maximum likelihood method based on the JTT matrix-based model in MEGA software, with branch support assessed via 1000 bootstrap replicates [15].
Orthogroup Analysis: Orthologous gene pairs between species (e.g., A. setaceus and A. officinalis) were identified and clustered using OrthoFinder v2.2.7, based on sequence similarity [15].
Cluster Analysis: NLR genes located within 250 kb of each other were considered clustered. The significance of clustering patterns was evaluated against random expectations using χ² tests [15] [97].
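The 250 kb clustering criterion can be illustrated with a short Python sketch. It assumes that distance is measured from the end of one gene to the start of the next on the same chromosome, which is one plausible reading of "within 250 kb of each other"; the gene identifiers and coordinates below are invented for illustration.

```python
from collections import defaultdict


def cluster_genes(genes, max_gap=250_000):
    """Group genes into physical clusters.

    genes: iterable of (gene_id, chromosome, start, end) tuples.
    Two neighbouring genes join the same cluster when the gap between them
    is at most max_gap base pairs. Only clusters with two or more genes
    are returned.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start, end in genes:
        by_chrom[chrom].append((start, end, gene_id))

    clusters = []
    for chrom in sorted(by_chrom):
        current, prev_end = [], None
        for start, end, gene_id in sorted(by_chrom[chrom]):
            if prev_end is not None and start - prev_end > max_gap:
                # Gap too large: close the running cluster and start a new one.
                if len(current) > 1:
                    clusters.append(current)
                current = []
            current.append(gene_id)
            prev_end = end if prev_end is None else max(prev_end, end)
        if len(current) > 1:
            clusters.append(current)
    return clusters


if __name__ == "__main__":
    toy_genes = [
        ("g1", "chr1", 100_000, 105_000),
        ("g2", "chr1", 300_000, 304_000),
        ("g3", "chr1", 900_000, 903_000),
        ("g4", "chr2", 50_000, 52_000),
    ]
    print(cluster_genes(toy_genes))
```

The observed number of clustered genes could then be compared with a random expectation, for example with a χ² test as in the cited work; that statistical step is not shown here.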

Cis-Element and Motif Analysis

Promoter Analysis: A 2000 bp region upstream of the start codon (ATG) for each NLR gene was analyzed for cis-acting regulatory elements using the PlantCARE database [15].
Motif Discovery: Conserved motifs within the NBS domains were identified using the MEME suite, with the number of motifs set to 10 [15].
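Extracting the 2000 bp promoter window requires care with strand orientation, since "upstream" lies at lower coordinates for plus-strand genes and at higher coordinates for minus-strand genes. The following sketch shows one way to compute the interval; the function name and the 1-based inclusive coordinate convention are assumptions made for this example.

```python
def promoter_interval(atg_pos, strand, chrom_length, upstream=2000):
    """Return the (start, end) of the region upstream of a start codon.

    atg_pos is the 1-based genomic coordinate of the A of the ATG.
    Coordinates are 1-based and inclusive, and the window is clipped at
    the chromosome boundaries. An empty window is signalled by start > end.
    """
    if strand == "+":
        start = max(1, atg_pos - upstream)
        end = atg_pos - 1
    elif strand == "-":
        # On the minus strand, upstream sequence has higher coordinates.
        start = atg_pos + 1
        end = min(chrom_length, atg_pos + upstream)
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return start, end


if __name__ == "__main__":
    print(promoter_interval(5000, "+", 100_000))  # full 2000 bp window
    print(promoter_interval(1500, "+", 100_000))  # clipped at chromosome start
    print(promoter_interval(5000, "-", 6_000))    # clipped at chromosome end
```

For minus-strand genes the extracted sequence would additionally need to be reverse-complemented before submission to a tool such as PlantCARE.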

Expression Analysis via Pathogen Inoculation

Plant Material and Pathogen Challenge: A. officinalis and A. setaceus plants were inoculated with the fungal pathogen Phomopsis asparagi. A. setaceus remained asymptomatic, while A. officinalis was susceptible [112] [15].
Expression Profiling: Expression studies of the conserved NLR genes in A. officinalis were conducted after fungal challenge to assess their transcriptional response [112].

Key Findings and Comparative Data

Dramatic Contraction of the NLR Repertoire in Domesticated Asparagus

A central finding of this study was the significant reduction in the number of NLR genes from wild asparagus species to the cultivated garden asparagus.

Table 1: NLR Gene Count in Asparagus Species

Species	Status	Total NLR Genes	CNL Subfamily	TNL Subfamily	RNL Subfamily
Asparagus setaceus	Wild	63	Data Not Specified	Data Not Specified	Data Not Specified
Asparagus kiusianus	Wild	47	Data Not Specified	Data Not Specified	Data Not Specified
Asparagus officinalis	Domesticated	27	Data Not Specified	Data Not Specified	Data Not Specified

These data, derived from the cited research [112] [15], show a clear contraction in gene count during domestication, with A. officinalis possessing fewer than half the NLR genes found in its wild relative A. setaceus.
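The scale of the contraction can be made explicit with a short calculation based on the counts in Table 1. The sketch below simply expresses the A. officinalis count as a percentage of each wild species' count; it treats the wild repertoires as a proxy for the ancestral state, which is a simplifying assumption.

```python
# NLR gene counts taken from Table 1 of this case study.
nlr_counts = {"A. setaceus": 63, "A. kiusianus": 47, "A. officinalis": 27}


def percent_retained(cultivated, wild):
    """Cultivated count as a percentage of the wild count, to one decimal."""
    return round(100 * cultivated / wild, 1)


if __name__ == "__main__":
    cultivated = nlr_counts["A. officinalis"]
    for wild in ("A. setaceus", "A. kiusianus"):
        pct = percent_retained(cultivated, nlr_counts[wild])
        print(f"A. officinalis vs {wild}: {pct}% of the wild NLR count")
```

Relative to A. setaceus the cultivated species carries about 42.9% as many NLR genes, and relative to A. kiusianus about 57.4%.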

Evolutionary Conservation and Expression Dynamics

Orthologous NLR Pairs: The analysis identified 16 conserved NLR gene pairs between the resistant wild species A. setaceus and the susceptible cultivated A. officinalis, representing the core NLR repertoire preserved during domestication [112] [15].
Dysfunctional Expression in Domesticated Asparagus: Following fungal challenge, the majority of the conserved NLR orthologs in A. officinalis showed either no change or downregulation in their expression. This indicates a potential functional impairment in the immune response machinery of the cultivated species, alongside the numerical loss of genes [112] [15].

Table 2: Essential Reagents and Databases for Comparative NLR Genomics

Reagent / Resource	Function in the Study
Pfam NB-ARC HMM (PF00931)	Core Hidden Markov Model profile for identifying the conserved nucleotide-binding domain in candidate NLR proteins.
InterProScan / NCBI CD-Search	Tools for validating the presence of protein domains and defining the domain architecture of identified NLRs.
PlantCARE Database	Repository for identifying cis-acting regulatory elements in promoter sequences of NLR genes.
MEME Suite	Software for discovering conserved protein motifs within the NLR domains.
OrthoFinder	Algorithm for clustering protein sequences into orthologous groups across species.
Phomopsis asparagi	Fungal pathogen used for inoculation assays to study phenotypic resistance and NLR gene expression.

Contextualizing Asparagus NLR Evolution

The patterns observed in asparagus are not isolated. A similar NLR repertoire reduction was noted in the medicinal plant Salvia miltiorrhiza, which shows a marked contraction in TNL and RNL subfamily members compared to other angiosperms [13]. Furthermore, studies across the Apiaceae family revealed dynamic NLR gene loss and gain events during speciation, highlighting that rapid gene content variation is a common feature shaping NLR evolution in plants [97]. The convergent contraction of NLRs in unrelated species suggests a potential link between domestication, certain agricultural traits, and a relaxation of pathogen defense constraints.

This comparative genomic case study indicates that the heightened disease susceptibility of domesticated garden asparagus is associated with two concurrent changes: a marked contraction of the NLR gene repertoire and an apparent functional impairment of the retained NLR genes, as suggested by their lack of induction upon pathogen attack [112] [15]. This pattern likely reflects a trade-off resulting from artificial selection for agricultural traits such as yield and quality, potentially at the expense of robust immune system maintenance. The findings underscore the value of wild germplasm as a reservoir of NLR diversity for future disease-resistance breeding programs in asparagus.

Visualizing the Experimental Workflow and Immune Signaling

Diagram: Experimental workflow used in this case study to identify and characterize NLR genes.

Diagram: NLR-mediated immune signaling. General signaling pathway triggered by NLR proteins [13] [97] [94].

The genus Dendrobium represents one of the largest and most economically important groups in the Orchidaceae family, with significant value in horticulture and traditional medicine [11]. Dendrobium officinale Kimura et Migo, in particular, is a prized Traditional Chinese Medicine rich in polysaccharides, flavonoids, and alkaloids [11] [113]. The industrial cultivation of Dendrobium species faces substantial challenges from various pathogens, including viruses and fungi, leading to significant production losses [11].

Plant immunity relies on a sophisticated two-layered defense system consisting of pathogen-associated molecular pattern-triggered immunity (PTI) and effector-triggered immunity (ETI) [13]. Nucleotide-binding site (NBS) genes constitute the largest class of plant disease resistance (R) genes, with approximately 80% of characterized R genes belonging to the NBS superfamily [11] [13]. These genes encode proteins characterized by a conserved NBS domain and C-terminal leucine-rich repeats (LRRs), which are responsible for pathogen recognition and immune signal activation [11] [3].

This case study examines the unique evolutionary patterns of NBS genes within Dendrobium species, focusing on the phenomena of gene degeneration and the influence of salicylic acid (SA) on NBS-LRR gene expression. By comparing these patterns across multiple orchid species and investigating SA-induced expression changes in D. officinale, we aim to provide insights into the evolution of disease resistance mechanisms in this economically significant genus.

Comparative Analysis of NBS Genes Across Plant Species

Diversity of NBS Genes in Land Plants

NBS domain genes represent one of the largest resistance gene families in plants, exhibiting remarkable structural diversity across species. A comprehensive analysis of 12,820 NBS-domain-containing genes across 34 plant species revealed 168 distinct classes of domain architecture, encompassing both classical and species-specific structural patterns [3]. The proportional distribution of NBS gene subfamilies varies significantly among plant lineages, reflecting diverse evolutionary paths in immune system adaptation [13].

Table 1: NBS Gene Distribution Across Plant Species

Plant Species	Total NBS Genes	NBS-LRR Genes	CNL-type	TNL-type	RNL-type	Reference
Arabidopsis thaliana	210	Not specified	40	Not specified	Not specified	[11]
Dendrobium officinale	74	22	10	0	Not specified	[11] [113]
Dendrobium nobile	169	Not specified	18	0	Not specified	[11]
Dendrobium chrysotoxum	118	Not specified	14	0	Not specified	[11]
Salvia miltiorrhiza	196	62	61	0	1	[13]
Asparagus officinalis	27	Not specified	Not specified	Not specified	Not specified	[15]
Asparagus setaceus	63	Not specified	Not specified	Not specified	Not specified	[15]
Oryza sativa	505	Not specified	Not specified	0	Not specified	[13]
Triticum aestivum	>2000	Not specified	Not specified	0	Not specified	[3] [15]
Pinus taeda	311	Not specified	Not specified	89.3% (of typical NLRs)	Not specified	[13]

Lineage-Specific Patterns of NBS Gene Evolution

The composition of NBS gene repertoires shows remarkable lineage-specific patterns. Monocot species, including orchids and cereals, consistently lack TNL-type genes, with this loss potentially driven by NRG1/SAG101 pathway deficiency [11] [3]. In Dendrobium species, phylogenetic analysis of CNL-type proteins revealed significant degeneration in specific branches, with type changing and NB-ARC domain degeneration identified as two prominent characteristics of Dendrobium NBS gene evolution [11] [113].

Similar patterns of subfamily-specific reduction are observed in other medicinal plants. In Salvia miltiorrhiza, from 196 identified NBS genes, only 62 possess complete N-terminal and LRR domains, with a notable reduction in TNL and RNL subfamily members [13]. This trend extends to asparagus species, where a marked contraction of NLR genes occurs from wild species (A. setaceus: 63 genes) to domesticated garden asparagus (A. officinalis: 27 genes) [15].

NBS Gene Degeneration in Dendrobium Species

Genomic Evidence for NBS Gene Degeneration

Comprehensive analysis of NBS genes across multiple Dendrobium species has revealed extensive degeneration patterns. A study examining six orchids and A. thaliana identified 655 putative NBS genes, with 74 found in D. officinale, 169 in D. nobile, and 118 in D. chrysotoxum [11] [113]. The NBS genes were classified into two main subclasses: NBS-LRR genes containing both NB-ARC and LRR domains, and non-NBS-LRR genes that have lost the LRR domain [11].

Notably, no TNL-type genes were identified in any of the six examined orchid species, consistent with the loss of TIR-domain NLRs being a common feature of monocots [11]. This pattern aligns with observations in other monocot species such as rice, wheat, and maize, which also lack the TNL subfamily entirely [13]. CNL-type genes were the most abundant NBS-LRR genes in all examined Dendrobium species [11].

Phylogenetic Analysis of NBS Gene Evolution

Phylogenetic reconstruction using CNL-type protein sequences from multiple orchid species and A. thaliana revealed that orchid NBS-LRR genes have significantly degenerated on two primary branches (branches a and b) [11]. The phylogenetic relationships of CNL genes in each branch were generally consistent with the established orchid species tree, supporting the role of species divergence in shaping NBS gene evolution [11].

Homology analysis of Dendrobium NBS genes identified two prominent characteristics: type changing and NB-ARC domain degeneration [11] [113]. These evolutionary patterns contribute significantly to the diversity of NBS genes within the genus and may reflect adaptive responses to pathogen pressures.

Diagram: NBS gene evolution in Dendrobium.

Salicylic Acid Signaling in Plant Immunity

SA Biosynthesis and Signaling Pathways

Salicylic acid serves as a crucial defense hormone in plants, playing pivotal roles in both local and systemic immune responses [114]. SA biosynthesis in plants primarily proceeds through two pathways: the isochorismate synthase (ICS) pathway and the phenylalanine ammonia-lyase (PAL) pathway [114]. In Arabidopsis, nearly 90% of defense-related SA is produced via the ICS pathway, starting with chorismate in plastids [114].

The NONEXPRESSOR OF PATHOGENESIS-RELATED GENES (NPR) proteins function as SA receptors, with NPR1 serving as the master regulator of SA-mediated signaling [114]. NPR1 interacts with TGACG-BINDING (TGA) transcription factors to up-regulate defense-related genes, such as PATHOGENESIS-RELATED 1 (PR1) [114]. NPR3 and NPR4, despite structural homology with NPR1, operate redundantly as transcriptional corepressors in SA signaling, creating a sophisticated homeostatic network that orchestrates appropriate immune responses [114].

Temperature Influence on SA Signaling

Temperature significantly influences SA-mediated immunity, with high temperatures suppressing SA biosynthesis and signaling, while low temperatures enhance these pathways [114]. In Arabidopsis, moderately elevated temperatures (24 hours at 28°C) inhibit the expression of key regulators in the ICS pathway, including SYSTEMIC ACQUIRED RESISTANCE DEFICIENT 1 (SARD1) and CALMODULIN-BINDING PROTEIN 60G [114]. These transcription factors normally activate SA biosynthesis by directly inducing the expression of ICS1, EDS5, PBS3, EDS1, and PAD4 [114].

This temperature-sensitive regulation of SA pathways has significant implications for plant immunity under changing climate conditions. Similar temperature-dependent regulation has been observed in tobacco, where maintaining virus-infected plants at 32°C suppressed SA accumulation and inhibited PR gene expression, effects that were reversible upon returning plants to 22°C [114].

Diagram: SA signaling pathway and regulation.

SA-Induced NBS-LRR Expression in D. officinale

Experimental Design and Transcriptomic Analysis

To investigate the response of NBS-LRR genes to SA treatment in D. officinale, researchers conducted transcriptomic analysis using SA-treated samples [11] [113]. From the SA treatment transcriptome data, 1,677 differentially expressed genes (DEGs) were identified, among which six NBS-LRR genes (Dof013264, Dof020566, Dof019188, Dof019191, Dof020138, and Dof020707) showed significant up-regulation [11] [113].

Weighted gene co-expression network analysis (WGCNA) revealed that only one of these six NBS-LRR genes, Dof020138, was closely associated with multiple defense-related pathways, including pathogen identification pathways, MAPK signaling pathways, plant hormone signal transduction pathways, biosynthetic pathways, and energy metabolism pathways [11] [113]. This suggests that Dof020138 may play a central role in SA-mediated immune responses in D. officinale.

Table 2: SA-Responsive NBS-LRR Genes in D. officinale

Gene ID	Fold Change	Domain Architecture	Putative Function	Pathway Association
Dof020138	Significant up-regulation	NBS-LRR	Central immune regulator	Multiple pathways (Pathogen identification, MAPK, Hormone signaling)
Dof013264	Significant up-regulation	NBS-LRR	Immune receptor	Not specified
Dof020566	Significant up-regulation	NBS-LRR	Immune receptor	Not specified
Dof019188	Significant up-regulation	NBS-LRR	Immune receptor	Not specified
Dof019191	Significant up-regulation	NBS-LRR	Immune receptor	Not specified
Dof020707	Significant up-regulation	NBS-LRR	Immune receptor	Not specified

Functional Annotation and Pathway Analysis

Analysis of the 22 identified D. officinale NBS-LRR genes revealed their involvement in the ETI system, plant hormone signal transduction pathway, and Ras signaling pathway [11] [113]. Gene structure analysis showed diverse exon-intron arrangements among these genes, while conserved motif analysis identified characteristic patterns across the NBS-LRR family [11].

Cis-element analysis of promoter regions identified numerous elements related to defense and hormone responsiveness, providing molecular evidence for the involvement of these genes in stress response pathways [11]. The integration of structural, phylogenetic, and expression data supports the conclusion that NBS-LRR genes generally participate in D. officinale ETI system and signal transduction pathways, with specific genes like Dof020138 potentially having important breeding value due to their responsiveness to SA signaling [11] [113].

Research Reagent Solutions and Methodologies

Genomic and Transcriptomic Analysis Tools

Table 3: Essential Research Reagents and Tools for NBS Gene Analysis

Category	Specific Tools/Reagents	Function	Application in Dendrobium Studies
Genome Assembly	PacBio/Nanopore sequencing, Chromosome-level assembly	High-quality genome sequencing	D. officinale genome (1.23 Gb, contig N50: 1.44 Mb) [11]
Gene Identification	HMM profiles (Pfam), InterProScan, SMART, NCBI CD-Search	NBS domain identification	Identified 655 NBS genes from 7 species [11] [3]
Phylogenetic Analysis	MAFFT, FastTreeMP, OrthoFinder, MCL algorithm	Evolutionary relationship reconstruction	CNL gene phylogenetic trees [11] [3]
Expression Analysis	RNA-seq, WGCNA, Differential expression analysis	Gene expression profiling	Identified 1,677 DEGs and 6 upregulated NBS-LRRs after SA treatment [11] [113]
Functional Validation	Virus-Induced Gene Silencing (VIGS)	Functional characterization of NBS genes	Validated role of GaNBS (OG2) in virus resistance in cotton [3]

Experimental Protocols for Key Analyses

Protocol 1: Genome-Wide Identification of NBS Genes

Data Collection: Obtain genome assemblies and annotation files for target species from public databases (NCBI, Phytozome, Plaza) [3].
Domain Identification: Use HMMER with the Pfam HMM for the NB-ARC domain (PF00931) and a stringent E-value cutoff (1.1e-50, as reported in the cited work; this is far stricter than HMMER's built-in reporting threshold) to identify candidate NBS genes [3] [15].
Architecture Classification: Analyze domain architecture using InterProScan and NCBI's Batch CD-Search to classify genes into specific subfamilies (CNL, TNL, RNL, etc.) based on complete domain composition [3] [13].
Validation: Manually verify domain organization and remove redundant or incomplete sequences [11] [15].
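As an illustration of the domain-identification step, the sketch below filters hits from a HMMER tabular report (the file written by hmmsearch with the --tblout option) by full-sequence E-value. It assumes the standard whitespace-delimited layout in which the target name is the first column and the full-sequence E-value is the fifth; the example rows and protein identifiers are invented.

```python
def parse_tblout(lines, max_evalue=1.1e-50):
    """Return {target_name: best_evalue} for hits passing the cutoff.

    lines: iterable of text lines from an hmmsearch --tblout report.
    Comment lines (starting with '#') and blank lines are skipped.
    """
    hits = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target = fields[0]
        evalue = float(fields[4])  # full-sequence E-value column
        if evalue <= max_evalue:
            # Keep the best (smallest) E-value if a target appears twice.
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits


if __name__ == "__main__":
    # Invented report rows in the tblout column order.
    report = [
        "# target name  accession  query name  accession  E-value  score  bias ...",
        "prot_A - NB-ARC PF00931.25 3.2e-80 270.1 0.1 4.0e-80 269.8 0.1 "
        "1.0 1 0 0 1 1 1 1 hypothetical protein",
        "prot_B - NB-ARC PF00931.25 1.0e-20 75.3 0.0 1.5e-20 74.8 0.0 "
        "1.0 1 0 0 1 1 1 1 hypothetical protein",
    ]
    print(parse_tblout(report))
```

A looser cutoff can be passed through max_evalue when the aim is sensitivity rather than stringency, with false positives removed later during domain validation.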

Protocol 2: SA Treatment and Transcriptome Analysis

Plant Material: Use uniform D. officinale plants at similar developmental stages [11].
SA Application: Apply salicylic acid at an appropriate concentration (the concentration used is not specified in the cited sources) [11] [113].
RNA Extraction: Isolate high-quality RNA from treated and control tissues at multiple time points.
Library Preparation and Sequencing: Prepare RNA-seq libraries and sequence using Illumina platform [11].
Differential Expression: Identify differentially expressed genes using standard pipelines (e.g., DESeq2, edgeR) with adjusted p-value < 0.05 and |log2FC| > 1 [11] [113].
Co-expression Analysis: Perform WGCNA to identify gene modules associated with SA response and construct co-expression networks [11].
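The differential-expression thresholds in this protocol (adjusted p-value < 0.05 and |log2FC| > 1) can be applied to a results table as sketched below. The record layout (keys "gene", "padj", "log2fc") and the example values are hypothetical; a real DESeq2 or edgeR export would need to be mapped onto this structure first, and missing adjusted p-values (reported as NA by some tools) are treated here as not significant.

```python
def is_deg(padj, log2fc, alpha=0.05, min_abs_log2fc=1.0):
    """True when a gene passes both the significance and effect-size cutoffs."""
    if padj is None:  # missing adjusted p-value: treat as not significant
        return False
    return padj < alpha and abs(log2fc) > min_abs_log2fc


def upregulated_degs(results, alpha=0.05, min_abs_log2fc=1.0):
    """Return gene IDs that are DEGs with a positive log2 fold change."""
    return [
        row["gene"]
        for row in results
        if is_deg(row["padj"], row["log2fc"], alpha, min_abs_log2fc)
        and row["log2fc"] > 0
    ]


if __name__ == "__main__":
    # Invented values for placeholder gene IDs.
    table = [
        {"gene": "gene_a", "padj": 0.001, "log2fc": 2.3},
        {"gene": "gene_b", "padj": 0.20, "log2fc": 3.0},
        {"gene": "gene_c", "padj": 0.01, "log2fc": -1.8},
        {"gene": "gene_d", "padj": None, "log2fc": 5.0},
        {"gene": "gene_e", "padj": 0.03, "log2fc": 0.6},
    ]
    print(upregulated_degs(table))
```

Intersecting the returned list with the set of annotated NBS-LRR gene IDs would then give the SA-responsive NBS-LRR candidates.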

Protocol 3: Phylogenetic and Evolutionary Analysis

Sequence Alignment: Perform multiple sequence alignment using MAFFT or Clustal Omega with default parameters [3].
Tree Construction: Build phylogenetic trees using maximum likelihood method implemented in FastTreeMP or MEGA with 1000 bootstrap replicates [11] [3].
Orthogroup Analysis: Identify orthogroups using OrthoFinder v2.5+ with DIAMOND for sequence similarity searches and MCL for clustering [3].
Evolutionary Dynamics: Analyze gene expansion/contraction using CAFE or similar tools, and identify tandem duplicates as genes separated by ≤8 intervening genes [3] [15].
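The tandem-duplicate criterion (family members separated by no more than eight intervening genes) can be sketched as follows. The function operates on the ordered gene list of a single chromosome; the identifiers are placeholders, and a real analysis would also require that the grouped genes are homologous, which this sketch takes for granted by accepting a pre-defined family set.

```python
def tandem_groups(gene_order, family, max_intervening=8):
    """Group family members lying close together along one chromosome.

    gene_order: list of gene IDs in chromosomal order.
    family: set of gene IDs belonging to the gene family of interest.
    Consecutive family members are grouped when at most max_intervening
    other genes lie between them. Groups of size one are dropped.
    """
    positions = [i for i, gene in enumerate(gene_order) if gene in family]
    groups, current = [], []
    for pos in positions:
        if current and pos - current[-1] - 1 > max_intervening:
            if len(current) > 1:
                groups.append([gene_order[i] for i in current])
            current = []
        current.append(pos)
    if len(current) > 1:
        groups.append([gene_order[i] for i in current])
    return groups


if __name__ == "__main__":
    order = [f"g{i}" for i in range(30)]
    nbs_family = {"g2", "g5", "g20", "g29"}
    print(tandem_groups(order, nbs_family))
```

In the toy example, g2 and g5 are separated by two genes and g20 and g29 by exactly eight, so both pairs qualify, while the fourteen genes between g5 and g20 keep the two groups apart.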

This case study demonstrates the unique evolutionary patterns of NBS genes in Dendrobium species, characterized by significant gene degeneration, particularly in TNL-type genes, and diversification through domain degeneration and type changing. The responsiveness of specific NBS-LRR genes, notably Dof020138, to SA treatment highlights the integration of these resistance genes into hormone-mediated defense signaling pathways.

The comparative analysis across plant species reveals both conserved and lineage-specific features of NBS gene evolution, with monocots consistently showing TNL loss and CNL predominance. The experimental evidence from D. officinale provides insights into how SA signaling activates specific NBS-LRR genes, potentially offering targets for future disease resistance breeding in this valuable medicinal orchid.

These findings contribute to our understanding of plant immune system evolution and have practical implications for developing disease-resistant Dendrobium varieties through marker-assisted selection or genetic engineering approaches targeting key SA-responsive NBS-LRR genes.

Structural variations (SVs), defined as genomic alterations larger than 30 base pairs, represent a significant source of genetic diversity in plant genomes. These variations include deletions, duplications, inversions, translocations, and presence/absence variations [115]. In the context of plant immunity, SVs play a crucial role in the evolution of nucleotide-binding site and leucine-rich repeat (NBS-LRR) genes, which constitute the largest and most important class of disease resistance (R) genes in plants [26] [3]. The impact of SVs on pathogen recognition and signaling specificity extends beyond simple gene presence or absence, influencing gene clustering, domain architecture, and functional diversification across plant species. This review provides a comprehensive comparison of how structural variations shape the NBS-LRR gene family across diverse plant species, with implications for pathogen recognition specificity and immune signaling mechanisms.

NBS-LRR Gene Family: Classification and Functional Significance

Domain Architecture and Subclassification

NBS-LRR genes encode proteins characterized by a central nucleotide-binding site (NBS) domain and C-terminal leucine-rich repeats (LRRs). Based on their N-terminal domains, they are classified into three principal subfamilies: TIR-NBS-LRR (TNL) with Toll/interleukin-1 receptor domains, CC-NBS-LRR (CNL) with coiled-coil domains, and RPW8-NBS-LRR (RNL) with resistance to powdery mildew 8 domains [26] [9]. The NBS domain contains several conserved motifs (P-loop, RNBS-A, kinase-2, RNBS-B, RNBS-C, and GLPL) that facilitate nucleotide binding and are crucial for defense signaling activation [50]. The LRR domain is responsible for pathogen recognition through protein-protein interactions and exhibits high sequence variability, enabling recognition of diverse pathogen effectors [2].

Table 1: NBS-LRR Gene Subfamilies and Their Characteristics

Subfamily	N-terminal Domain	Signaling Pathway	Pathogen Recognition Role	Species Distribution
TNL	TIR (Toll/Interleukin-1 Receptor)	EDS1-dependent	Direct and indirect pathogen recognition	Dicots and gymnosperms; absent from monocots
CNL	CC (Coiled-Coil)	EDS1-independent/NDR1-dependent	Direct and indirect pathogen recognition	Monocots and Dicots
RNL	RPW8 (Resistance to Powdery Mildew 8)	Signal transduction	Helper proteins for TNL/CNL signaling	Monocots and Dicots

Mechanisms of Pathogen Recognition

NBS-LRR proteins function as intracellular immune receptors that detect pathogen effector molecules through two primary mechanisms: direct and indirect recognition. Direct recognition involves physical binding between the NBS-LRR protein and pathogen effector, as demonstrated by the rice Pi-ta protein binding to the Magnaporthe grisea effector AVR-Pita [2]. Indirect recognition, described by the "guard hypothesis," occurs when NBS-LRR proteins monitor host cellular components (guardees) that are modified by pathogen effectors. For example, the Arabidopsis RPM1 and RPS2 proteins detect modifications to the host protein RIN4 by bacterial effectors AvrRpm1/AvrB and AvrRpt2, respectively [2]. The LRR domain is primarily responsible for recognition specificity, while the NBS domain functions as a molecular switch activated by nucleotide exchange (ADP to ATP), triggering downstream defense signaling [2].

Structural Variations in NBS-LRR Genes: Comparative Genomic Analysis

Variation in NBS-LRR Gene Copy Number Across Species

Comparative genomic analyses reveal extensive variation in NBS-LRR gene numbers across plant species, reflecting distinct evolutionary paths and adaptation to pathogen pressures. A comprehensive study analyzing 34 plant species identified 12,820 NBS-domain-containing genes, highlighting the dramatic expansion and contraction of this gene family throughout plant evolution [3]. The number of NBS-LRR genes does not directly correlate with genome size but rather with the specific evolutionary history and selective pressures experienced by each species.

Table 2: NBS-LRR Gene Distribution Across Plant Species

Plant Species	Total NBS Genes	CNL	TNL	RNL	Notable Features
Akebia trifoliata	73	50	19	4	First characterization in this species
Helianthus annuus (Sunflower)	352	100	77	13	One-third of clusters on chromosome 13
Dioscorea rotundata (Yam)	167	166	0	1	Lacks TNL genes, typical of monocots
Capsicum annuum (Pepper)	252	248	4	-	54% of genes form 47 clusters
Dendrobium officinale	74	10	0	-	NBS-LRR gene degeneration observed
Arabidopsis thaliana	210	40	-	-	Reference species for comparative studies

Genomic Distribution and Cluster Formation

NBS-LRR genes are frequently distributed non-randomly across plant chromosomes, with a strong tendency to form gene clusters. These clusters often reside in chromosomal regions with high recombination rates, particularly near chromosome ends [26]. In sunflower, 75 NBS-LRR gene clusters were identified, with one-third located specifically on chromosome 13 [9]. Similarly, in pepper, 54% of NBS-LRR genes (136 genes) form 47 physical clusters across the genome, with chromosome 3 containing the highest number (10 clusters) [50]. This clustered organization facilitates the generation of diversity through unequal crossing-over and gene conversion, enabling plants to rapidly evolve new recognition specificities [72].

Impact of Structural Variations on Pathogen Recognition

Gene Duplication and Functional Diversification

Tandem and dispersed duplications represent major mechanisms for NBS-LRR gene expansion and diversification. In Akebia trifoliata, tandem and dispersed duplications produced 33 and 29 NBS genes, respectively [26]. These duplication events create genetic raw material for functional innovation through several processes:

Neofunctionalization: Duplicated genes acquire new recognition specificities through diversifying selection, particularly in the LRR domain [72].
Subfunctionalization: Duplicates partition ancestral functions, potentially leading to specialization in recognizing different pathogen variants.
Gene conversion: Sequence exchange between paralogs generates novel combinations of recognition specificities [72].

The birth-and-death evolution model, characterized by continuous gene duplication and loss, drives the rapid turnover of NBS-LRR genes, allowing plants to adapt to changing pathogen populations [72].

Domain Architecture Variations and Integrated Decoys

Structural variations affecting protein domain architecture significantly impact recognition capabilities. Beyond canonical NBS-LRR proteins, many variants exist with atypical domain combinations. In Dioscorea rotundata, NBS-LRR genes were classified into six distinct architectural groups, with 16 different integrated domains detected in 15 genes [116]. These integrated domains often function as "decoys" that mimic the structure of authentic pathogen targets, enabling the receptor to detect effectors indirectly [116]. This evolutionary strategy allows plants to expand their recognition repertoire without developing completely new binding interfaces.

Structural Variations and Signaling Specificity

Subfamily-Specific Signaling Pathways

Structural variations in NBS-LRR genes have profound implications for signaling specificity, particularly through the differential utilization of subfamily-specific signaling components. TNL proteins generally require ENHANCED DISEASE SUSCEPTIBILITY 1 (EDS1) for signal transduction, while CNL proteins typically signal through NON-RACE-SPECIFIC DISEASE RESISTANCE 1 (NDR1) [12]. RNL proteins, represented by the NRG1 and ADR1 lineages, function as signaling helpers that operate downstream of TNL and CNL activation [3]. Species-specific variations in subfamily composition therefore directly influence signaling pathway utilization and immune response outcomes.

Diagram 1: NBS-LRR mediated pathogen recognition and signaling activation pathways. NBS-LRR proteins can be activated through either direct effector binding or indirect detection via guardee modification, leading to conformational changes and nucleotide exchange that trigger defense responses.

Evolutionary Patterns Across Plant Families

Comparative analyses reveal distinct evolutionary patterns of NBS-LRR genes across plant families, reflecting different pathogen pressures and evolutionary strategies. In Rosaceae species, independent gene duplication and loss events have resulted in diverse evolutionary patterns: "first expansion and then contraction" in Rubus occidentalis and Fragaria iinumae, "continuous expansion" in Rosa chinensis, and "early sharp expanding to abrupt shrinking" in Prunus and Maleae species [12]. Similarly, in Solanaceae species, potato exhibits "consistent expansion," tomato shows "expansion followed by contraction," while pepper displays a "shrinking" pattern [12]. These distinct evolutionary trajectories highlight how structural variations drive species-specific adaptations to local pathogen environments.

Methodologies for Structural Variation Analysis

Genome-Wide Identification of NBS-LRR Genes

Standardized pipelines for NBS-LRR gene identification typically combine multiple complementary approaches to ensure comprehensive detection:

Hidden Markov Model (HMM) Profiling: Using the NB-ARC domain (PF00931) as a query to scan protein sequences with tools like HMMER [26] [9].
BLAST Searches: Employing reference NBS-LRR sequences from model species like Arabidopsis thaliana to identify homologs [9].
Domain Validation: Confirming identified candidates through Pfam and NCBI-CDD searches for characteristic domains (TIR, CC, RPW8, LRR) [12].
Manual Curation: Removing redundant hits and verifying domain architecture through multiple databases.

This integrated approach maximizes sensitivity and specificity in NBS-LRR gene annotation, facilitating cross-species comparisons.
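The combination of HMM and BLAST evidence followed by domain validation can be sketched with simple set operations. The function below is illustrative only: the inputs are assumed to be collections of protein IDs from each search plus a dictionary of validated domain labels per protein, and none of the names correspond to a published tool.

```python
def curate_candidates(hmm_hits, blast_hits, domains_by_gene, required="NB-ARC"):
    """Merge two candidate lists and keep those with the required domain.

    hmm_hits, blast_hits: iterables of protein IDs from each search.
    domains_by_gene: dict mapping protein ID to its validated domain labels.
    Returns (confirmed, rejected) as sorted lists; duplicates are removed
    by taking the union of the two hit sets.
    """
    candidates = set(hmm_hits) | set(blast_hits)
    confirmed = sorted(
        gene for gene in candidates if required in domains_by_gene.get(gene, ())
    )
    rejected = sorted(candidates - set(confirmed))
    return confirmed, rejected


if __name__ == "__main__":
    hmm = ["p1", "p2", "p3"]
    blast = ["p2", "p3", "p4"]
    validated = {
        "p1": {"CC", "NB-ARC", "LRR"},
        "p2": {"NB-ARC"},
        "p3": {"LRR"},          # BLAST/HMM hit without a confirmed NB-ARC
        # p4 has no validated domains at all
    }
    print(curate_candidates(hmm, blast, validated))
```

Taking the union favours sensitivity, and the validation step restores specificity; rejected candidates are returned rather than silently dropped so that they can be inspected during manual curation.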

Structural Variation Detection Methods

Advanced sequencing technologies have revolutionized our ability to detect structural variations in plant genomes:

Short-read sequencing: Enables paired-end mapping, split-read mapping, and read-depth analysis for SV detection [115].
Long-read sequencing: Permits comprehensive characterization of large chromosomal rearrangements and complex variations [115].
Array-based methods: SNP arrays and comparative genomic hybridization provide complementary approaches for copy number variation detection [115].
Pan-genome construction: Reveals presence/absence variations and species-specific genes by comparing multiple individuals [115].

The combination of these approaches has enabled researchers to move beyond single reference genomes to develop pan-genome resources that capture the full spectrum of structural variations within species [115].
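As a rough illustration of how presence/absence variation is summarized once a pan-genome has been built, the sketch below splits gene families into those shared by every accession and those missing from at least one. It assumes each accession has already been reduced to a set of gene family identifiers; the accession names, family names, and the `classify_pav` helper are hypothetical.

```python
def classify_pav(gene_sets):
    """Split gene families into core (in every accession) and variable (absent somewhere)."""
    accessions = list(gene_sets)
    all_families = set().union(*gene_sets.values())
    core = set.intersection(*(set(s) for s in gene_sets.values()))
    variable = all_families - core
    # Presence/absence matrix: one row per family, one 0/1 column per accession.
    matrix = {
        family: [int(family in gene_sets[acc]) for acc in accessions]
        for family in sorted(all_families)
    }
    return core, variable, matrix

# Hypothetical NLR families detected in three accessions.
pan = {
    "acc_A": {"NLR1", "NLR2", "NLR3"},
    "acc_B": {"NLR1", "NLR3"},
    "acc_C": {"NLR1", "NLR3", "NLR4"},
}

core, variable, matrix = classify_pav(pan)
print(sorted(core), sorted(variable))
```

Real pan-genome analyses derive these sets from assemblies or read mapping, so the reliability of each absence call depends on the underlying sequencing depth and assembly quality.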

Diagram 2: Experimental workflow for structural variation analysis in NBS-LRR genes. The integrated approach combines genome sequencing, NBS-LRR annotation, and structural variation detection to enable comparative genomic studies.

Table 3: Essential Research Reagents and Databases for NBS-LRR Gene Analysis

Resource Type	Specific Tool/Reagent	Function/Application	Example Use Case
Bioinformatics Databases	Pfam Database	Protein family and domain annotation	Verify NBS domain presence (PF00931)
	NCBI-CDD	Conserved domain identification	Classify TIR, CC, RPW8 domains
	PRGdb	Pathogen Recognition Gene database	Reference for characterized R genes
	Plaza Genome Database	Comparative genomics platform	Cross-species NBS-LRR comparisons
Experimental Reagents	Degenerate PCR primers	Amplify NBS domain fragments	Isolate R-gene analogs from unsequenced species [72]
	RNA-seq libraries	Transcript expression profiling	Identify responsive NBS-LRR genes under pathogen challenge
	Virus-Induced Gene Silencing (VIGS) constructs	Functional validation	Knockdown candidate NBS-LRR genes [3]
Analysis Tools	MEME Suite	Conserved motif identification	Characterize NBS domain motifs [26]
	OrthoFinder	Orthogroup inference	Determine evolutionary relationships among NBS-LRR genes [3]
	MCL algorithm	Gene clustering analysis	Identify tandemly duplicated NBS-LRR genes [3]

Structural variations in NBS-LRR genes represent a powerful evolutionary force driving plant adaptation to diverse pathogen challenges. The comparative analysis across plant species reveals that SVs influence not only gene copy number and genomic distribution but also functional specialization in pathogen recognition and signaling specificity. The dynamic evolutionary patterns observed—from expansion and contraction to subfamily-specific diversification—highlight the complex interplay between structural variations and immune system adaptation. Future research leveraging pan-genome approaches and long-read sequencing technologies will further illuminate how structural variations shape the evolutionary arms race between plants and their pathogens, providing insights for developing durable disease resistance in crop species.

Nucleotide-binding leucine-rich repeat receptors (NLRs) constitute a pivotal class of intracellular immune receptors that enable plants to recognize pathogen-derived effectors and initiate robust defense responses [117] [118]. Understanding the evolutionary dynamics of these genes requires sophisticated comparative genomics approaches, with synteny and orthology analyses emerging as fundamental methodologies. Synteny, the conservation of gene order on chromosomes across evolutionary time, provides crucial insights into the evolutionary interconnections of genes within and across species [119]. When combined with orthology analysis—the identification of genes descended from a common ancestor—these approaches powerfully illuminate the genomic evolutionary trajectories of NLR genes, revealing patterns of expansion, contraction, and diversification that have shaped plant immunity systems across angiosperms [119] [120].

The rapid duplication and loss characteristic of NLR genes have historically complicated evolutionary inferences, as exemplified by the long-unexplained loss of TNL family genes in monocots [119]. However, recent advances in synteny-informed classification systems and large-scale comparative genomics are now resolving these complexities, showing how the malleability of NLR loci has shaped NLR functionality and diversity across plant lineages [119] [121]. This guide systematically compares the experimental approaches and data types employed in synteny and orthology analyses of NLR loci, providing researchers with practical methodologies for tracing the evolutionary conservation of these critical immune receptors.

Comparative Framework for NLR Evolutionary Analysis

Classification and Diversity of NLR Genes

Plant NLR proteins typically consist of three fundamental domains: an N-terminal domain, a central nucleotide-binding site (NBS) domain, and a C-terminal leucine-rich repeat (LRR) region [119]. Based on the N-terminal domain, NLRs are categorized into several subclasses: TIR-NBS-LRR (TNL), CC-NBS-LRR (CNL), and RPW8-NBS-LRR (RNL) [119] [3]. Recent synteny-informed analyses have refined this classification, further subdividing CNLs into three distinct subclasses (CNL_A, CNL_B, and CNL_C) while maintaining TNL and RNL as separate categories [119].
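The domain-based part of this classification can be expressed as a short rule set. The sketch below is illustrative only: it assumes domain names have already been assigned by a tool such as InterProScan, and the precedence used when several N-terminal domains are annotated (TIR, then RPW8, then CC) is an assumption rather than a published convention. The CNL subclasses cannot be assigned this way, because they depend on synteny and phylogenetic evidence.

```python
def classify_nlr(domains):
    """Assign a coarse NLR class (e.g. TNL, CNL, RNL) from annotated domain names."""
    domains = set(domains)
    if "NB-ARC" not in domains:
        return "non-NLR"            # the NB-ARC domain is required
    if "TIR" in domains:
        prefix = "T"
    elif "RPW8" in domains:
        prefix = "R"
    elif "CC" in domains:
        prefix = "C"
    else:
        prefix = ""                 # no recognizable N-terminal domain
    suffix = "L" if "LRR" in domains else ""
    return f"{prefix}N{suffix}"

print(classify_nlr(["TIR", "NB-ARC", "LRR"]))
print(classify_nlr(["CC", "NB-ARC"]))
```

Truncated architectures fall out naturally: a protein with CC and NB-ARC but no LRR is labeled `CN`, and one with only NB-ARC is labeled `N`.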

This refined classification system has proven particularly valuable for resolving long-standing evolutionary puzzles. For instance, compelling microsynteny evidence indicates a clear synteny correspondence between non-TNLs in monocots and the extinct TNL subclass, providing a model to explain the disappearance of TNL genes in monocot lineages [119]. Such insights demonstrate the power of synteny-based approaches for elucidating NLR evolutionary history.

Table 1: NLR Subclassification Based on Synteny and Phylogenetic Analysis

NLR Class	N-terminal Domain	Key Characteristics	Evolutionary Notes
CNL_A	Coiled-coil (CC)	Includes GmRps1k, OsXa1, OsR3, SlI2	Expanded subfamily with specific syntenic relationships
CNL_B	Coiled-coil (CC)	Includes AtZAR1, AtLOV1, TaSr35, OsPi9	Forms resistosomes as cation channels for Ca2+ influx
CNL_C	Coiled-coil (CC)	Includes AtSUMM2, AtRPS5, AtRPS2	Sister group to CNL_A and CNL_B
TNL	TIR	NADase activity producing signaling molecules	Lost in monocots; shows synteny to non-TNLs in monocots
RNL	RPW8	Helper NLRs (ADR1 and NRG1 subclasses)	Sister group to all CNLs; limited expansion

Genomic Distribution and Dynamic Evolution

NLR genes display remarkable variation in copy number across plant species, ranging from just a few dozen in some plants to over two thousand in others like Triticum aestivum (bread wheat) [119] [15]. This variation reflects dynamic evolutionary processes including whole-genome duplication (WGD), tandem duplications, and gene loss events [3] [120].

Comparative analyses across diverse angiosperm families reveal distinct evolutionary patterns. In the Apiaceae family, significant variation in NLR gene counts has been observed, with Coriandrum sativum (coriander) possessing 183 NLR genes compared to 95 in Angelica sinensis [120]. Phylogenetic analysis suggests these genes were derived from 183 ancestral NLR lineages that experienced different levels of gene loss and gain events during speciation [120].

Similarly, studies in the Oleaceae family reveal how different ecological pressures have shaped NLR evolution. While Fraxinus (ash) species predominantly employ a strategy of gene conservation, Olea (olive) species have undergone extensive gene expansion driven by recent duplications and the birth of novel NLR gene families [122]. These differences likely reflect distinct immunological strategies: Olea's expanded NLR repertoire enhances pathogen recognition capabilities, while Fraxinus maintains specialized immune responses through conserved genes [122].

Table 2: NLR Gene Distribution Across Plant Lineages

Plant Species	Family	Total NLRs	CNLs	TNLs	RNLs	Research Findings
Arabidopsis thaliana	Brassicaceae	151	55	94	2	Reference for comparative studies
Oropetium thomaeum	Poaceae	Dozens	Majority	0	Few	Minimal NLR expansion
Triticum aestivum	Poaceae	>2,000	Majority	0	Few	Extensive NLR expansion
Asparagus officinalis	Asparagaceae	27	-	-	-	Domesticated contraction
Asparagus setaceus	Asparagaceae	63	-	-	-	Wild relative with expanded NLR
Glycine max (annual)	Fabaceae	Expanded	Various	Various	Few	Recent duplication (0.1-0.5 MYA)
Glycine spp. (perennial)	Fabaceae	Contracted	Various	Various	Few	Post-polyploidy contraction

Experimental Frameworks for Synteny and Orthology Analysis

Genomic Identification and Annotation of NLR Genes

Step 1: Sequence Identification

The initial step involves comprehensive identification of NLR genes from genomic sequences. Two primary approaches are employed:

Hidden Markov Model (HMM) searches using the conserved NB-ARC domain (Pfam: PF00931) as query with stringent E-value cutoffs (e.g., 1e-10) [120] [15]
BLAST-based searches using reference NLR protein sequences from model species like Arabidopsis thaliana, Oryza sativa, and Allium sativum [15]
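To make the E-value filtering concrete, the sketch below parses the per-domain table that `hmmsearch --domtblout` writes and keeps proteins whose NB-ARC hit passes a 1e-10 cutoff. The sample rows are invented for illustration, including the Pfam version suffix, and filtering on the full-sequence E-value (rather than the per-domain independent E-value) is an assumption. The `nbarc_hits` helper is hypothetical.

```python
import io

# Illustrative rows in hmmsearch --domtblout layout (all values are made up).
SAMPLE_DOMTBLOUT = """\
# target accession tlen query accession qlen E-value score bias ...
geneA - 900 NB-ARC PF00931.25 252 1.2e-60 201.3 0.1 1 1 3.0e-64 1.2e-60 201.3 0.1 1 250 180 430 178 432 0.95 -
geneB - 300 NB-ARC PF00931.25 252 4.0e-06 20.1 0.0 1 1 1.0e-09 4.0e-06 20.1 0.0 40 120 10 95 8 100 0.80 -
"""

def nbarc_hits(handle, max_evalue=1e-10, pfam_id="PF00931"):
    """Return {protein_id: best full-sequence E-value} for hits passing the cutoff."""
    hits = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue                      # skip comments and blank lines
        cols = line.split(None, 22)       # 22 fixed columns plus free-text description
        target, query_acc = cols[0], cols[4]
        evalue = float(cols[6])           # full-sequence E-value
        if query_acc.split(".")[0] == pfam_id and evalue <= max_evalue:
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits

hits = nbarc_hits(io.StringIO(SAMPLE_DOMTBLOUT))
print(hits)
```

For a real run, the `io.StringIO` object would be replaced by an open file handle on the HMMER output.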

Specialized tools have been developed to enhance annotation accuracy:

NLRtracker: A sensitive annotation tool that utilizes protein sequence files as input [117] [122]
NLR-Annotator: Suitable for users working with nucleotide sequence files [117]

Step 2: Domain Architecture Validation

Candidate sequences identified through initial searches must be validated using domain analysis tools:

InterProScan: Characterizes protein domains and functions [117] [15]
NCBI's Batch CD-Search: Verifies presence of NB-ARC domain (E-value ≤ 1e-5) [15]

Only sequences containing the definitive NB-ARC domain are retained as bona fide NLR genes, with subsequent classification based on complete domain architecture [15].

Synteny Network Construction and Orthology Assignment

Microsynteny Network Analysis

Large-scale synteny analysis involves:

Identifying syntenic blocks across multiple genomes using tools like MCScanX [120]
Constructing synteny networks with nodes representing NLR genes and edges representing syntenic relationships [119]
Applying k-core filtering (e.g., k=3) to visualize the main network structure [119]
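The k-core filtering step can be illustrated without any graph library. The sketch below assumes syntenic gene pairs have already been extracted (for example from MCScanX collinearity output) into a list of edges; the gene names are hypothetical. It repeatedly removes nodes with fewer than k syntenic partners until every remaining node has at least k.

```python
from collections import defaultdict

def k_core(edges, k=3):
    """Return the k-core of an undirected graph as {node: set_of_neighbours}."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:                        # ignore self-loops
            adjacency[a].add(b)
            adjacency[b].add(a)
    graph = dict(adjacency)
    # Peel off weakly connected nodes until all survivors have >= k neighbours.
    while True:
        weak = [n for n, nbrs in graph.items() if len(nbrs) < k]
        if not weak:
            return graph
        for node in weak:
            for nbr in graph.pop(node):
                graph[nbr].discard(node)

# Hypothetical syntenic pairs: four mutually syntenic genes plus loose attachments.
edges = [
    ("Ath_NLR1", "Sly_NLR1"), ("Ath_NLR1", "Vvi_NLR1"), ("Ath_NLR1", "Ptr_NLR1"),
    ("Sly_NLR1", "Vvi_NLR1"), ("Sly_NLR1", "Ptr_NLR1"), ("Vvi_NLR1", "Ptr_NLR1"),
    ("Ath_NLR1", "Osa_NLR9"),             # pendant node, removed at k=3
    ("Gma_NLR5", "Mtr_NLR5"),             # isolated pair, removed at k=3
]

core = k_core(edges, k=3)
print(sorted(core))
```

Raising k keeps only the most densely connected syntenic clusters, which simplifies visualization at the cost of hiding lineage-specific or weakly supported relationships.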

Orthology Assignment

Orthologous groups are determined using:

OrthoFinder: Utilizes DIAMOND for sequence similarity searches and MCL for clustering [3] [15]
Notung software: Compares phylogenetic trees of NLR genes with species trees to determine gene loss/duplication events [120]

This integrated approach allows researchers to distinguish between orthologs (genes separated by speciation events) and paralogs (genes separated by duplication events), crucial for understanding NLR evolutionary trajectories.
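The ortholog/paralog distinction can be made concrete with a small gene tree. The sketch below uses the species-overlap heuristic: an internal node is treated as a duplication if the same species appears on both sides, and as a speciation otherwise. This is a simplified stand-in chosen for illustration; it is not the reconciliation algorithm implemented in Notung or the inference used by OrthoFinder. The tree, the `species|gene` labels, and the helper names are hypothetical.

```python
def leaves(node):
    """Collect leaf labels from a binary tree written as nested 2-tuples."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)

def species_of(leaf):
    return leaf.split("|")[0]

def classify_pairs(tree):
    """Label every gene pair as 'ortholog' or 'paralog' via the species-overlap rule."""
    pairs = {}

    def walk(node):
        if isinstance(node, str):
            return
        left, right = node
        l_leaves, r_leaves = leaves(left), leaves(right)
        shared = {species_of(x) for x in l_leaves} & {species_of(x) for x in r_leaves}
        kind = "paralog" if shared else "ortholog"   # shared species implies duplication
        for a in l_leaves:
            for b in r_leaves:
                pairs[frozenset((a, b))] = kind
        walk(left)
        walk(right)

    walk(tree)
    return pairs

# Hypothetical gene tree: a duplication predating the Ath/Sly split, with Osa as outgroup.
tree = ((("Ath|NLR1", "Sly|NLR1"), ("Ath|NLR2", "Sly|NLR2")), "Osa|NLR1")
pairs = classify_pairs(tree)
print(pairs[frozenset(("Ath|NLR1", "Sly|NLR1"))])
```

Because NLR families undergo frequent lineage-specific loss, a heuristic like this can mislabel nodes where a duplicate copy has been lost, which is one reason full reconciliation against a species tree is used in the cited studies.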

Phylogenetic Reconstruction and Motif Analysis

Multiple Sequence Alignment and Tree Building

Sequence Alignment: MAFFT or ClustalW/Clustal Omega for aligning NLR protein sequences [117] [120] [15]
Phylogenetic Inference: IQ-TREE or RAxML for maximum likelihood-based tree construction with robust branch support values (e.g., 1,000 bootstrap replicates) [117] [120]

Conserved Motif Prediction

MEME Suite: Identifies conserved sequence motifs within NLR subfamilies [117] [120] [15]
Alternative approach: HMMER for searching sequence homologs and building sequence alignments [117]

This workflow enables researchers to identify functionally important motifs, such as the MADA and EDVID motifs in CC-NLRs, that have remained conserved over evolutionary time [117].

Diagram Title: NLR Evolutionary Analysis Workflow

Essential Research Reagents and Computational Tools

Table 3: Essential Research Reagents and Computational Tools for NLR Evolutionary Studies

Tool/Resource	Type	Primary Function	Application in NLR Studies
NLRtracker	Software	NLR annotation from proteomes	Identifies NLR genes from protein sequences with high sensitivity [117] [122]
OrthoFinder	Software	Orthogroup inference	Clusters NLR genes into orthologous groups across species [3] [15]
MCScanX	Software	Synteny analysis	Identifies syntenic blocks and gene collinearity [120]
MAFFT	Software	Multiple sequence alignment	Aligns NLR protein sequences for phylogenetic analysis [117] [120]
MEME Suite	Software	Motif discovery	Identifies conserved sequence motifs in NLR subfamilies [117] [15]
ANNA Database	Database	Angiosperm NLR atlas	Contains >90,000 NLR genes from 304 angiosperm genomes [3] [121]
Pfam NB-ARC	HMM Profile	Domain identification	PF00931 for identifying NBS domains [3] [120]

Key Insights from Comparative Synteny and Orthology Studies

Evolutionary Patterns Across Plant Lineages

Synteny and orthology analyses have revealed several fundamental patterns in NLR gene evolution:

Convergent NLR Reduction

Comparative genomic analyses have identified convergent NLR reduction associated with adaptations to aquatic, parasitic, and carnivorous lifestyles [121]. This contraction pattern resembles the lack of NLR expansion observed in green algae before terrestrial colonization, suggesting ecological constraints powerfully shape NLR repertoire size [121].

Life History Strategy Influences

In the genus Glycine, striking differences exist between annual and perennial species. Annual species (G. max and G. soja) exhibit expanded NLRomes compared to perennial relatives, with recent accelerated gene duplication events occurring between 0.1-0.5 million years ago [123]. Perennials experienced significant contraction following the Glycine-specific whole-genome duplication event (~10 million years ago) but developed a unique, highly diversified NLR repertoire with limited interspecies synteny [123].

Domestication-Associated Contraction

Comparative analysis of garden asparagus (Asparagus officinalis) and its wild relatives reveals marked NLR contraction during domestication, with gene counts decreasing from 63 in A. setaceus to just 27 in domesticated A. officinalis [15]. This contraction, coupled with reduced expression of retained NLR genes, likely contributes to increased disease susceptibility in cultivated varieties [15].

Co-evolution with Immune Signaling Components

Synteny-informed studies have revealed crucial co-evolutionary patterns between NLR subclasses and components of plant immune signaling pathways. Notably, immune pathway deficiencies appear to drive TNL loss in certain lineages [121]. Furthermore, researchers have identified a conserved TNL lineage that may function independently of the canonical EDS1–SAG101–NRG1 module, suggesting previously unrecognized diversity in NLR signaling mechanisms [121].

Diagram Title: Evolutionary Forces Shaping NLR Repertoires

Synteny and orthology analyses have transformed our understanding of NLR gene evolution, revealing dynamic patterns of expansion, contraction, and diversification across plant lineages. The methodological framework presented in this guide, which integrates genomic identification, synteny network construction, orthology assignment, and phylogenetic analysis, gives researchers a practical route for tracing the evolutionary conservation of NLR loci. As genomic resources continue to expand, these comparative methods are likely to yield further insights into the co-evolutionary arms race between plants and their pathogens and may inform crop improvement strategies aimed at enhancing disease resistance.

Conclusion

The comparative analysis of NBS genes across plant species reveals a dynamic evolutionary landscape shaped by duplication, diversification, and selection pressure from pathogens. Key takeaways include the extensive diversity of NBS domain architectures, the significant expansion of these genes in flowering plants, and the critical role of regulatory mechanisms like miRNAs in maintaining this vast repertoire. The functional validation of specific NBS genes, such as GaNBS in cotton, underscores their direct role in pathogen defense. Future directions for research should leverage pan-genomic approaches to capture the full spectrum of NBS diversity, particularly in non-model and wild species that harbor valuable resistance alleles. For biomedical and clinical research, the principles of plant NBS gene evolution—including the mechanisms for recognizing diverse pathogen effectors and the regulatory networks that control immune receptor activity—offer valuable conceptual parallels for understanding innate immunity in humans and developing novel strategies for managing genetic diseases. The integration of NBS gene data into breeding programs via marker-assisted selection holds immense promise for developing durable disease-resistant crops, enhancing global food security.