Stylized visualization of a classroom social graph used by a GCN to predict student performance
Shenzhen, China, September 1, 2025
A lightweight two-layer Graph Convolutional Network (GCN) can predict four levels of classroom performance with strong accuracy by combining student attributes and social interaction data. Tested on a cleaned dataset of 732 students and a social graph of 5,184 edges, the model uses a 16-feature input matrix and achieves AUC scores near 0.91–0.92 and an F1 around 87%. The approach outperforms GAT and GraphSAGE, and ablation shows social ties are critical. The study highlights interpretability via GNNExplainer, notes limits in scale and multimodality, and recommends ethical adaptation before wider deployment.
A lightweight Graph Convolutional Network (GCN) has been developed to predict four-class classroom performance by fusing students’ individual attributes with social interaction data. In a study published in Scientific Reports, researchers report an area under the curve (AUC) of approximately 0.91–0.92, indicating strong discrimination among performance categories. The approach treats each student as a graph node and represents interactions such as cooperation, discussion, peer evaluation, and online engagement as weighted connections between students. The overarching aim is to make classroom grade assessments more objective and reliable by combining multiple sources of data in a single predictive framework.
The article, titled "Application of artificial intelligence graph convolutional network in classroom grade evaluation," appears in Scientific Reports, volume 15, Article number 32044 (2025), with DOI 10.1038/s41598-025-17903-4. The journal received the manuscript on 12 June 2025, accepted it on 28 August 2025, and published it on 01 September 2025. Shuying Wu is the corresponding author, and the dataset is available on reasonable request by email at wushuying1234@126.com. Wu is credited with conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing (original draft and review/editing), visualization, supervision, project administration, and funding acquisition.
The study draws on multi-source data from 12 classes across 4 grade levels in two Shenzhen schools. The data span classroom management systems, classroom observation records, and online learning platforms. The online platforms named include a Smart Education Platform for Primary and Secondary Schools of Shenzhen and the xueleyun Teaching Platform. All data collection was conducted with authorization from the schools and educational authorities, with ethics approval from the Liyuan Foreign Language Primary School in Futian District (Approval Number: 2023.39498000). Participants provided written informed consent, and the study followed relevant guidelines and regulations. The article is open access under the Creative Commons Attribution‑NonCommercial‑NoDerivatives 4.0 International (CC BY‑NC‑ND 4.0) license.
The data cover 12 classes across four grade levels in Shenzhen and were collected over two academic semesters. The initial sample contained 802 student records; 732 were retained after cleaning, which excluded records with more than 30% missing data. The resulting graph has 732 nodes and 5,184 edges, for an average degree of about 14.16, meaning each student connects to roughly 14 peers on average. The feature matrix is a 732 × 16 array, so each node carries a 16‑dimensional feature vector.
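The average degree follows directly from the node and edge counts. The short check below is a sketch that assumes the graph is undirected, so each edge adds to the degree of two students; the density figure is an extra illustration that the article does not report.

```python
# Figures reported in the article.
num_nodes = 732
num_edges = 5184

# Assumption: the graph is undirected, so every edge is counted at both endpoints.
average_degree = 2 * num_edges / num_nodes

# Density: share of all possible student pairs that are actually connected.
possible_pairs = num_nodes * (num_nodes - 1) // 2
density = num_edges / possible_pairs

print(f"average degree = {average_degree:.2f}")  # average degree = 14.16
print(f"density = {density:.4f}")
```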
The 16 input features fall into three categories: individual attributes, classroom behavior, and online behavior. They include age (normalized), gender (one‑hot), class, historical achievements, attendance rate, self‑rating, classroom speech frequency, group cooperation activity, teacher rating, peer rating, video learning duration, homework timeliness rate, forum posts, platform access frequency, online questioning frequency, and click path length. Numerical features were standardized (mean 0, standard deviation 1), categorical features were one‑hot encoded, and missing values were filled via multiple interpolation. Classroom speech frequency was standardized by class size to ensure comparability across classes.
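The preprocessing steps described above can be sketched with NumPy. The values below are hypothetical toy data, not the study's records, and dividing speech counts by class size before z-scoring is only one possible reading of "standardized by class size". Missing-value handling is left out because the article does not describe its procedure in detail.

```python
import numpy as np

def standardize(column):
    """Z-score a numeric column to mean 0 and standard deviation 1."""
    column = np.asarray(column, dtype=float)
    std = column.std()
    if std == 0:
        return np.zeros_like(column)
    return (column - column.mean()) / std

def one_hot(labels):
    """One-hot encode a categorical column; categories are sorted for a stable column order."""
    categories = sorted(set(labels))
    index = {category: i for i, category in enumerate(categories)}
    encoded = np.zeros((len(labels), len(categories)))
    for row, label in enumerate(labels):
        encoded[row, index[label]] = 1.0
    return encoded, categories

# Hypothetical toy values for four students.
attendance_rate = [0.95, 0.80, 0.99, 0.70]
gender = ["F", "M", "F", "M"]
speech_count = np.array([12, 6, 20, 10], dtype=float)
class_size = np.array([40, 30, 40, 30], dtype=float)

attendance_z = standardize(attendance_rate)
gender_oh, gender_categories = one_hot(gender)
# Assumption: convert to a per-capita rate first, then z-score.
speech_z = standardize(speech_count / class_size)

features = np.column_stack([attendance_z, gender_oh, speech_z])
print(features.shape)  # (4, 4)
```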
The core idea is to treat students as graph nodes and to connect them through weighted edges that encode social interactions and collaboration. The primary method defines a combined edge weight, w_ij^comb, from three indicators: frequency of cooperation in class discussions, interaction frequency on online platforms, and peer ratings of learning communication. The weights are set to λ1 = 0.4, λ2 = 0.3, and λ3 = 0.3, and every indicator is normalized to the range [0, 1] before being combined into the adjacency matrix A. An auxiliary approach computes edge weights from the cosine similarity between high‑dimensional behavior vectors (w_ij^cos). Other graph variants tested for comparison include peer‑evaluation graphs, Pearson‑correlation graphs, and fully connected graphs.
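A minimal sketch of the two edge-weight constructions follows. It assumes that λ1, λ2, and λ3 apply to the indicators in the order listed, that normalization is min-max scaling, and that the adjacency matrix is symmetric with no self-loops; the article does not confirm these details. Function names and the toy matrices are hypothetical.

```python
import numpy as np

LAMBDAS = (0.4, 0.3, 0.3)  # weights reported in the article

def min_max(matrix):
    """Rescale a matrix to [0, 1]; a constant matrix maps to all zeros."""
    matrix = np.asarray(matrix, dtype=float)
    span = matrix.max() - matrix.min()
    if span == 0:
        return np.zeros_like(matrix)
    return (matrix - matrix.min()) / span

def combined_adjacency(cooperation, online, peer_rating, lambdas=LAMBDAS):
    """Weighted sum of the three normalized indicators (w_ij^comb)."""
    parts = [min_max(m) for m in (cooperation, online, peer_rating)]
    adjacency = sum(lam * part for lam, part in zip(lambdas, parts))
    adjacency = (adjacency + adjacency.T) / 2  # assumption: undirected graph
    np.fill_diagonal(adjacency, 0.0)           # assumption: no self-loops here
    return adjacency

def cosine_adjacency(behavior):
    """Cosine similarity between students' behavior vectors (w_ij^cos)."""
    behavior = np.asarray(behavior, dtype=float)
    norms = np.linalg.norm(behavior, axis=1, keepdims=True)
    unit = behavior / np.where(norms == 0, 1.0, norms)
    similarity = unit @ unit.T
    np.fill_diagonal(similarity, 0.0)
    return similarity

# Hypothetical pairwise counts and ratings for three students.
cooperation = [[0, 4, 1], [4, 0, 2], [1, 2, 0]]
online = [[0, 10, 0], [10, 0, 5], [0, 5, 0]]
peer_rating = [[0, 5, 3], [5, 0, 4], [3, 4, 0]]

A = combined_adjacency(cooperation, online, peer_rating)
print(np.round(A, 2))
```

With these toy inputs, students 0 and 1 top every indicator, so their edge weight is 0.4 + 0.3 + 0.3 = 1.0, while the weaker pairs receive proportionally smaller weights.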
The researchers used a two‑layer lightweight GCN, with hidden layer sizes of 128 and 64 neurons, respectively. ReLU activation was employed, and dropout probability was set to 0.5 after each layer. The loss function combined cross‑entropy for multi‑class classification with L2 regularization (weight decay of 0.0005). The optimizer was Adam with an initial learning rate of 0.01 and a learning rate decay schedule. Training followed standard gradient descent procedures.
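The architecture described above can be illustrated with a forward pass in NumPy. This is a sketch under several assumptions rather than the authors' implementation: it uses the widely used symmetric normalization with self-loops, reads "128 and 64" as two graph-convolution layers (16 → 128 → 64) followed by a linear softmax readout to four classes, and uses Glorot-style random weights. The scaling of the L2 penalty is also assumed, and the Adam training loop is omitted.

```python
import numpy as np

def normalize_adjacency(adjacency):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adjacency + np.eye(adjacency.shape[0])
    inv_sqrt_degree = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * inv_sqrt_degree[:, None] * inv_sqrt_degree[None, :]

def relu(x):
    return np.maximum(x, 0.0)

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def init_params(rng, sizes=(16, 128, 64, 4)):
    """One weight matrix per consecutive pair of layer sizes (assumed layout)."""
    return [rng.normal(0.0, np.sqrt(2.0 / (m + n)), size=(m, n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(a_norm, features, params, rng=None, dropout=0.5):
    """Two graph convolutions with ReLU and dropout, then a softmax readout."""
    w1, w2, w_out = params
    hidden = features
    for weight in (w1, w2):
        hidden = relu(a_norm @ hidden @ weight)
        if rng is not None:  # training mode: inverted dropout
            mask = rng.random(hidden.shape) >= dropout
            hidden = hidden * mask / (1.0 - dropout)
    return softmax(hidden @ w_out)

def loss(probs, labels, params, weight_decay=5e-4):
    """Cross-entropy plus an L2 penalty; the penalty scaling is an assumption."""
    cross_entropy = -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))
    l2 = sum((w ** 2).sum() for w in params)
    return cross_entropy + weight_decay * l2

# Hypothetical toy graph: 6 students, 16 features each, random symmetric weights.
rng = np.random.default_rng(0)
upper = np.triu(rng.random((6, 6)), k=1)
adjacency = upper + upper.T
features = rng.normal(size=(6, 16))
labels = rng.integers(0, 4, size=6)

a_norm = normalize_adjacency(adjacency)
params = init_params(rng)
probs = forward(a_norm, features, params)  # evaluation mode: no dropout
print(probs.shape)  # (6, 4)
```

Passing `rng` to `forward` switches on dropout for training-style passes; leaving it out gives deterministic predictions for evaluation.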
The model outputs per‑node probability vectors ŷ_i ∈ [0, 1]^4 for the four classes. An array of baselines was tested, including Graph Attention Network (GAT), GraphSAGE, Support Vector Machine (SVM), linear regression, decision trees, and a rule‑based method. A random classifier served as a minimal baseline. The dataset was split into training (70% ≈ 512 samples), validation (15% ≈ 110), and test (15% ≈ 110), with stratified sampling to preserve label distribution. Five‑fold cross‑validation was conducted, and averages across five runs were reported to reduce randomness.
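The split sizes follow from 0.70 × 732 = 512.4 and 0.15 × 732 = 109.8. The sketch below shows one way to perform a stratified 70/15/15 split with the standard library. The evenly distributed labels are hypothetical, since the article does not give the class distribution, and the function name is illustrative.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split indices into train/validation/test while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Hypothetical labels: 732 students spread evenly over four classes.
labels = [i % 4 for i in range(732)]
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Because rounding happens within each class, the validation and test counts can differ by a sample or two from the approximate figures quoted above.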
Across the board, the GCN outperformed the baselines. In one figure, the GCN achieved an AUC of about 0.92, while the GAT reached approximately 0.88 and GraphSAGE about 0.85. In cross‑validation (Table 4 in the study), the GCN delivered precision around 88.52%, recall about 86.47%, and F1‑score near 87.32% (all rounded). Compared with traditional methods such as linear regression and decision trees, GCN showed superior performance, with the F1‑score advantage exceeding 13% relative to those baselines. When the training set ratio increased to 80% (Figure 5), the model reached an accuracy of 87.6%, an F1‑score of 87.3%, and an AUC of 0.91, with gains leveling off beyond this point.
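For a four-class task, precision, recall, and F1 are typically computed per class and then averaged. The sketch below assumes macro-averaging, which the article does not confirm, and uses a hypothetical confusion matrix of 110 test samples rather than the study's results.

```python
def macro_scores(confusion):
    """Macro precision, recall, and F1.

    confusion[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(confusion)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))
        actual_k = sum(confusion[k])
        p = true_positive / predicted_k if predicted_k else 0.0
        r = true_positive / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [25, 2, 0, 0],
    [3, 24, 2, 0],
    [0, 3, 23, 2],
    [0, 0, 3, 23],
]
precision, recall, f1 = macro_scores(confusion)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under this kind of averaging, a reported F1 need not equal the harmonic mean of the reported precision and recall, which may explain why the study's F1 of 87.32% differs slightly from the roughly 87.5% that the two averaged figures would give.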
Confusion analysis showed the strongest accuracy for identifying “Excellent” (about 91%) and “To be improved” (about 86%). In the ablation study, the full feature set with the social structure intact yielded the best AUC, around 0.91. Removing the social graph cut the AUC to roughly 0.68 and lowered overall accuracy to about 71%, underscoring the critical role of social relationships in the model. Using only individual attributes produced an AUC of around 0.74, while using only interaction features performed in between, illustrating the value of combining data sources.
Among the graph variants, the peer evaluation graph (mutual student ratings of each other’s learning communication) yielded the best performance with an AUC around 0.91. The Pearson similarity graph produced about 0.87, and a fully connected graph performed worst at roughly 0.81 due to noise from many edges. This points to the importance of a well‑chosen graph structure that respects social feedback in educational settings.
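To make the contrast between graph variants concrete, the sketch below builds a Pearson-correlation graph and a fully connected graph from the same toy features. The 0.5 threshold, the function names, and the data are hypothetical; the article does not state how the Pearson graph was thresholded.

```python
import numpy as np

def pearson_graph(features, threshold=0.5):
    """Connect students whose feature vectors correlate at or above a threshold."""
    correlation = np.corrcoef(features)  # rows are students
    adjacency = np.where(correlation >= threshold, correlation, 0.0)
    np.fill_diagonal(adjacency, 0.0)
    return adjacency

def fully_connected_graph(num_nodes):
    """Every pair of students is linked with weight 1."""
    return np.ones((num_nodes, num_nodes)) - np.eye(num_nodes)

def count_edges(adjacency):
    """Number of undirected edges with non-zero weight."""
    return int(np.count_nonzero(np.triu(adjacency, k=1)))

# Hypothetical feature vectors for three students.
features = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
])

sparse = pearson_graph(features)
dense = fully_connected_graph(len(features))
print(count_edges(sparse), count_edges(dense))  # 1 3
```

The fully connected variant links every pair regardless of how similar the students are, which is consistent with the article's explanation that its many edges add noise.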
Interpretability analyses used GNNExplainer to identify influential neighbors and input features for predictions on test samples. In an example where a student was predicted as “excellent,” frequent group collaboration and high teacher ratings were among the most influential factors. Visualization of the learned embeddings with t‑SNE showed distinct clustering by performance category, supporting the model’s discriminative power. Practically, the researchers argue that a two‑layer GCN offers a balance of performance and efficiency, making it more feasible for primary and secondary school settings than heavier GNNs.
The authors acknowledge that the current graph relies mainly on questionnaires and behavioral logs; richer multimodal data (voice, video, facial cues) could enhance future graphs. They also note that training on much larger networks could require more efficient architectures or distributed training. While the article provides detailed experimental settings (software versions, hardware, and data processing steps) to aid reproducibility, full public release of code is not stated. The work also emphasizes that further interpretability enhancements, such as additional attention mechanisms or explainers, would strengthen transparency for educational use.
The project is supported by multiple sources, including the Futian District Primary School Chinese Master Teacher Studio in Shenzhen, Guangdong Province’s 2024 “Hundred, Thousand and Ten Thousand Talents Project” Special Research Project, the Shenzhen Achievement Cultivation Project, and local school support from Liyuan Foreign Language Primary School. The study situates itself within the broader goal of advancing objectivity and personal‑level insight in classroom assessment through graph‑based learning models.
Overall, the study demonstrates that a lightweight GCN approach can meaningfully improve the objectivity and accuracy of classroom grade evaluation by integrating multi‑source data and focusing on the dynamics of social interaction. It provides a practical blueprint for schools considering data‑driven assessment tools and points toward future work in multimodal data fusion and scalable training for larger educational networks.
Feature area | What it means |
---|---|
Model | Lightweight two‑layer Graph Convolutional Network designed for classroom data |
Data inputs | Multi‑source data: student attributes, classroom behavior, and online behavior |
Graph construction | Weighted edges based on cooperation, online interactions, and peer ratings; alternative cosine‑based graphs examined |
Dataset size | 732 valid records from 12 classes across two schools |
Performance | AUC ≈ 0.91–0.92; precision ≈ 88.5%; recall ≈ 86.5%; F1 ≈ 87.3% |
Baselines | Compared with GAT, GraphSAGE, SVM, linear regression, decision trees, and rule‑based methods |
Interpretability | GNNExplainer highlights influential neighbors and features; t‑SNE shows embedding clusters |
Ethics and license | Ethics approval obtained; CC BY‑NC‑ND 4.0 license; data access on reasonable request |
Applications | Objective, data‑driven classroom assessment support; potential for personalized teaching decisions |
Key features | Description |
---|---|
Model type | Lightweight two‑layer Graph Convolutional Network designed for classroom data |
Data inputs | Multi‑source data: student attributes, classroom behavior, and online behavior |
Graph design | Weighted edges based on cooperation, online interactions, and peer ratings; cosine similarity as an alternative |
Dataset size | 732 valid records from 12 classes across two schools |
Performance highlights | AUC ≈ 0.91–0.92; precision ≈ 88.5%; recall ≈ 86.5%; F1 ≈ 87.3% |
Baselines | Outperforms GAT, GraphSAGE, SVM, linear regression, decision trees, and rule‑based methods |
Interpretability | GNNExplainer and embedding visualizations help explain predictions |
Ethics and license | Ethics approval obtained; CC BY‑NC‑ND 4.0 license; data available on request |
Practical impact | Supports objective, data‑driven classroom assessment and potential personalized teaching decisions |