Predicting Long-Term Student Outcomes with Relative First-Year Performance

Performance in first-year courses—particularly when viewed in terms of relative standing among peers—is a strong and consistent predictor of whether a student ultimately completes their degree. Students consistently in the top performance quintile across these early courses graduate at exceptionally high rates, while those consistently in the bottom quintile are far more likely to leave without a qualification. This contrast underscores the importance of identifying relative academic risk early—especially because such risk is not always visible through conventional pass/fail rates or average grade thresholds. Relative performance measures, such as quintile standing or distance from the median, offer insights that remain hidden when relying solely on aggregate indicators. By revealing how students perform in comparison to their peers, they provide a more sensitive signal of academic vulnerability, one that can trigger earlier and more tailored interventions. Institutions that incorporate these signals into predictive models and support systems can shift from reactive remediation to proactive, student-centered success strategies.

During an analytics meeting a couple of years ago, a colleague made an off-hand but memorable remark: “I always tell my students to not only look at their grades but also where they stand in relation to their friends.” The comment, though informal, sparked a line of thinking that reshaped how I approached academic performance metrics. It suggested that academic risk may not lie solely in failing grades or low averages, but in being consistently behind one’s peers—even when passing. This reflection led to the concept of “distance from the median”—a performance indicator that is not tied to the absolute value of the median itself, but to how far an individual deviates from the central tendency of the group. Unlike pass/fail markers or raw grade averages, this perspective offers a more context-sensitive understanding of academic performance and risk.
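
To make the idea concrete, here is a minimal sketch of how such a distance-from-the-median indicator might be computed with pandas. The records and column names are invented for illustration, not our actual schema.

```python
import pandas as pd

# Invented records: one row per student per first-year course.
grades = pd.DataFrame({
    "student_id": ["s1", "s2", "s3", "s4", "s5", "s1", "s2", "s3"],
    "course": ["MATH1000"] * 5 + ["PHYS1000"] * 3,
    "final_grade": [72, 55, 48, 81, 60, 65, 50, 70],
})

# Each student's distance from their class median: how far they sit
# from the central tendency of the group, regardless of the median's
# absolute value.
grades["class_median"] = grades.groupby("course")["final_grade"].transform("median")
grades["dist_from_median"] = grades["final_grade"] - grades["class_median"]

print(grades.sort_values("dist_from_median"))
```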

This insight found empirical traction in institutional research when I examined first-year performance in 1000-level courses. A clear pattern emerged: students whose grades are consistently higher than the median of their class (i.e., in the higher performance quintiles) graduate at much higher rates, while those consistently much lower than the median (e.g., in the bottom quintile) are far more likely to exit the institution either through academic exclusion or voluntary departure in good standing. These findings affirm that relative academic positioning offers a sharper, earlier, and more proactive lens for identifying risk than traditional measures alone.

Establishing these performance groupings is simple: students’ grades are sorted in descending order (ranked), and the ordered grades are then divided into five equal segments (quintiles), each comprising 20% of the student cohort. Those in the top quintile are among the highest performers in their first-year courses, while those in the bottom quintile are the lowest. This method isolates performance extremes, helping to highlight which students are most at risk and which patterns warrant further institutional attention.
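
For illustration, the quintile split can be produced with pandas’ qcut, which divides the ranked grades into five equal-sized groups. The toy cohort below is invented, and in practice the split would be computed per course and per year so that a student’s standing can be tracked across their first-year courses.

```python
import pandas as pd

# Invented cohort of first-year averages (column names are assumptions).
cohort = pd.DataFrame({
    "student_id": [f"s{i}" for i in range(1, 11)],
    "avg_first_year_grade": [83, 77, 71, 68, 65, 61, 58, 54, 49, 42],
})

# Rank-based split into five equal segments; label 5 marks the top
# quintile and label 1 the bottom.
cohort["quintile"] = pd.qcut(
    cohort["avg_first_year_grade"], q=5, labels=[1, 2, 3, 4, 5]
)

print(cohort.sort_values("avg_first_year_grade", ascending=False))
```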

Whether a student is excluded or chooses to leave, the result is an uncompleted degree. Encouragingly, the data suggest a modest upward trend in graduation rates even among those initially in the bottom quintile—perhaps an early signal that targeted academic interventions are gaining traction.

The implications of these patterns are substantial. If first-year course performance can reliably predict student trajectory, then those early signals must be treated as operational inputs into a system of proactive intervention. Predictive analytics allows universities to identify students who may be at risk within the first semester—or even the first few weeks—of enrollment. By aggregating signals from formative assessments, participation, and early course grades, institutions can construct actionable profiles for timely support.
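
As a sketch of what such aggregation might look like, the fragment below folds a few early signals into a crude composite flag. The signal names, thresholds, and the two-of-three rule are illustrative assumptions only, not a validated risk model.

```python
import pandas as pd

# Assumed early-semester signals, one row per student.
signals = pd.DataFrame({
    "student_id": ["s1", "s2", "s3"],
    "formative_avg": [72.0, 48.0, 55.0],   # formative assessment average
    "participation": [0.9, 0.4, 0.7],      # fraction of sessions attended
    "early_quintile": [4, 1, 2],           # quintile in early course grades
})

# Naive composite: flag students who are weak on at least two signals.
weak = (
    (signals["formative_avg"] < 50).astype(int)
    + (signals["participation"] < 0.5).astype(int)
    + (signals["early_quintile"] <= 1).astype(int)
)
signals["at_risk"] = weak >= 2

print(signals[["student_id", "at_risk"]])
```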

What emerges is not just a snapshot of student success, but a blueprint for institutional action. If the university takes these early academic signals seriously—treating them as diagnostic rather than merely descriptive—it can shift from passive observation to active intervention. In doing so, it transforms the first-year experience from a sorting mechanism into a launchpad. The first year is not simply a prerequisite for progress; it is a formative period that, if understood and acted upon, can shape the future of both individual learners and the institution itself.

Identifying ‘At-Risk’ Courses Through Campus-Wide Analysis of Grade Volatility

Analysis of longitudinal, campus-wide assessment data can be used to identify important differences between courses based on how grade volatility affects final grade distributions. The basic premise is that a well-organized course enrolling similarly capable cohorts of students year after year should have a relatively stable distribution of grades. Computing the Mean Absolute Deviation (MAD) of each course’s median grade over a 10-year period quickly produces a list of potentially problematic courses whose class medians and grade spreads vary widely. With minimal effort, and without delving into the complexities of pedagogy and academic administration, such an analysis provides an important signal that a course may be in trouble, motivating further investigation.

Strategies for student success mostly encompass some form of either (1) strengthening students through various support measures or (2) removing unreasonable barriers to their success. Academic analytics of assessment data can be used to illustrate differences between courses and potentially reveal problematic courses that may not be self-evident unless examined from a longitudinal perspective. This post is concerned with the latter: could there be courses that exhibit unreasonable variation, and if so, which courses are they and where are they located?

To answer this, we turn to statistical measures that can effectively quantify such variations. Mean Absolute Deviation (MAD) is particularly well-suited for this analysis, as it quantifies the average distance of data points from the median, making it a robust tool for assessing grade volatility over time. Additionally, when combined with the Coefficient of Variation (CoV), MAD enables a comprehensive evaluation of grading stability by considering both absolute median shifts and relative variability in student performance. These two measures together allow institutions to pinpoint courses with erratic grading patterns, guiding targeted academic interventions and quality assurance efforts.
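
A minimal sketch of both measures, assuming a long-format file of final grades with course, year, and grade columns (the file and column names are assumptions):

```python
import pandas as pd

def mad(series: pd.Series) -> float:
    """Mean absolute deviation of a series from its own median."""
    return (series - series.median()).abs().mean()

# Assumed shape: one row per student grade, per course, per year.
grades = pd.read_csv("final_grades.csv")  # columns: course, year, grade

# Yearly summary per course: median grade and coefficient of variation.
yearly = grades.groupby(["course", "year"])["grade"].agg(
    median_grade="median",
    cov=lambda g: g.std() / g.mean() * 100,  # CoV as a percentage
).reset_index()

# Volatility across the 10-year window: MAD of the yearly medians and
# MAD of the yearly CoV values, per course.
volatility = yearly.groupby("course").agg(
    mad_median=("median_grade", mad),
    mad_cov=("cov", mad),
)
print(volatility.sort_values("mad_median", ascending=False).head(10))
```

Using MAD rather than the standard deviation keeps a single anomalous year from dominating the volatility estimate.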

This plot visualizes course stability for six faculties by mapping the Mean Absolute Deviation (MAD) of median grades against the MAD of the Coefficient of Variation (CoV). The x-axis represents the MAD of median grades, capturing year-to-year shifts in a course’s central tendency, while the y-axis represents the MAD of CoV, capturing year-to-year shifts in the within-class spread of grades. The graph is divided into four quadrants using threshold lines at x=4 and y=4, creating a classification system for course stability. The bottom-left quadrant indicates courses with the least volatility, suggesting stable grading patterns and consistent student performance. In contrast, the top-right quadrant highlights courses with the highest volatility, signaling potential inconsistencies in assessment practices, instructional quality, or course design. Each course is plotted as an individual point, providing an intuitive way to identify outliers and prioritize further investigation into courses exhibiting extreme variability.
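
The plot itself can be reproduced along these lines, reusing the volatility table from the previous sketch; the x=4 and y=4 thresholds mirror those described above.

```python
import matplotlib.pyplot as plt

# `volatility` comes from the previous sketch: one row per course,
# with mad_median (x) and mad_cov (y).
fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(volatility["mad_median"], volatility["mad_cov"], alpha=0.6)

# Threshold lines at x=4 and y=4 split the plane into four quadrants.
ax.axvline(4, color="grey", linestyle="--")
ax.axhline(4, color="grey", linestyle="--")

ax.set_xlabel("MAD of yearly median grades")
ax.set_ylabel("MAD of yearly CoV")
ax.set_title("Course grade volatility: stable (bottom-left) vs volatile (top-right)")
plt.show()
```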

The broader significance of this approach lies in its ability to function as a signal. Courses that demonstrate significant grade volatility may not always be problematic, but they warrant closer scrutiny. In some cases, shifts in grading distributions may coincide with changes in faculty, curriculum reforms, or shifts in student demographics. In other cases, they may signal deeper issues—poorly designed assessments, inconsistent grading policies, or structural barriers that disproportionately impact student success.

From a systems theory perspective, analyzing final-grade distributions is part of the university’s necessary work as a self-referential entity: extracting signal from noise through the selective processing of information. Fluctuations in grading patterns are not mere statistical anomalies but alarm bells indicating that a course may require closer scrutiny. By leveraging MAD in a data-driven approach, institutions move beyond reliance on faculty self-reporting or periodic program reviews, creating a continuous feedback loop that highlights courses needing attention. This methodology fosters institutional reflexivity, encouraging universities to investigate root causes, implement necessary interventions, and ultimately improve student outcomes while reinforcing academic integrity.

From Noise to Meaning: Sifting “Course Repeat Rates” Through Systems Theory

Institutional research extends beyond data analysis, often functioning as a systemic process of self-observation in higher education. Even a cursory understanding of Luhmann’s Social Systems Theory reveals how the operation of self-observation makes it possible to transform raw data into actionable insights. It is precisely this process that enables universities to sift through vast amounts of information to identify, for example, key academic bottlenecks that influence student success—often without explicitly relying on theoretical frameworks. Recognizing metrics such as Course Repeat Rates (CRR) as institutional operations therefore presents an opportunity to illustrate how data-driven decision-making aligns with social systems theory. By providing a framework for analyzing complex interdependencies and communication flows within educational institutions, Luhmann’s theory empowers institutional researchers to uncover underlying patterns and dynamics previously inaccessible through conventional IR approaches. The significance of this alignment for institutional research is hard to overstate.

Institutional research often grapples with vast amounts of raw data, seeking to transform it into actionable insights that inform academic policy. One such dataset—Course Repeat Rates (CRR)—holds significant potential for understanding student progression and the structural barriers within degree programs. In a previous post, I examined how repeat rates function as indicators of academic bottlenecks, identifying courses that either facilitate student advancement or obstruct it. However, this exploration gains deeper analytical clarity when framed within Niklas Luhmann’s systems theory, particularly his model of how information moves from noise to signal to meaning.

Luhmann’s theories provide a robust conceptual foundation for understanding how universities, as autopoietic systems, filter, interpret, and act upon information. By situating institutional research within the broader academic discourse of systems theory, we do more than analyze data—we engage in a theoretical discussion about how knowledge is produced and operationalized within higher education.

Luhmann argues that systems exist in environments saturated with information, most of which is mere noise. Noise, in this sense, represents unprocessed data—vast amounts of student performance records, enrollment figures, and academic results that, without context, remain unintelligible. When examining course repeat rates, the initial dataset is just that: a collection of numbers indicating how many students enroll, pass, fail, or repeat specific courses. At this stage, the data is indiscriminate and without interpretive structure. It does not yet communicate anything meaningful to the institution.

The process of identifying signal occurs when the university system begins to filter through this mass of data, isolating patterns that warrant attention. Some courses emerge as outliers, with disproportionately high repeat rates. These courses potentially hinder student progression, delaying graduation and increasing dropout risks. Here, the system differentiates between random variations and persistent academic obstacles, recognizing that certain courses act as gatekeepers. The repeat rate ceases to be just a statistic; it becomes a signal—a piece of information that demands further investigation.

Yet, a signal alone does not equate to meaning. In Luhmannian terms, meaning only emerges when signals are contextualized within the system’s self-referential operations. At the institutional level, this means interpreting course repeat rates not merely as numerical trends but as reflections of deeper structural and pedagogical issues. The university, as a system, must ask: Are these high-repeat courses designed in ways that disproportionately disadvantage students? Do they require curricular revisions? Should additional academic support structures be implemented? Through this process of self-referential engagement, the institution constructs meaning from the data and translates it into policy discussions, resource allocations, and strategic interventions.

By framing course repeat rates within Luhmann’s model of meaning-making, institutional research becomes more than just data analysis—it becomes a theoretical exercise in understanding how universities process, adapt, and evolve. Higher education institutions are not passive recipients of data; they are systems that continuously redefine themselves through the selective interpretation of information. In this way, the study of course repeat rates, for example, demonstrates how institutional research can be deeply embedded in systems theory, shaping academic policies through an ongoing feedback loop of observation, selection, and adaptation.

This discussion (and this blog) is an attempt to locate institutional research within the epistemological framework of systems theory. By invoking Luhmann, we recognize that data-driven decision-making in higher education is not a straightforward process of collecting numbers and drawing conclusions. It is a complex, systemic function, where institutions filter out noise, extract meaningful signals, and ultimately construct the knowledge that informs their operations. Thus, tracking course repeat rates is not just about measuring academic performance—it is about understanding how universities, as self-referential systems, generate meaning from information and use it to sustain their functions.

Analyzing Course Repeat Rates as Indicators of Academic Progression

Student progression and eventual graduation are directly bound to the successful completion of a series of mandatory courses. These courses not only form the backbone of degree programs but also serve as critical gatekeepers in a student’s academic journey. This exploration investigates Course Repeat Rate (CRR) as a potential indicator of a course’s significance in determining academic progression and graduation outcomes. Given that students must repeat and pass required courses to advance through their programs, the frequency with which these courses are repeated by students in the same degree programs provides valuable insight into their role as pivotal checkpoints within degree pathways.

From time to time I am tasked with examining data trends that influence our academic environment. Recently, a request from one of our faculty prompted a closer investigation into the role of compulsory service courses within our university. These courses sometimes appear to be barriers, preventing students from advancing efficiently through their degree programs.

In addressing this issue, I proposed focusing on the course repeat rate as a tool for understanding these academic obstacles. At UCT, like many institutions worldwide, students’ progression and graduation depend on completing a series of mandatory courses. When students fail these required courses, they must retake and pass them to progress or graduate. This situation provides an opportunity to analyze how often these courses are repeated across various degree programs. By doing so, we can identify which courses function as significant gatekeepers in academic progression.
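
As a rough illustration, a repeat rate can be derived from enrollment records by counting, per course, the share of students with more than one attempt. The toy records below are invented; a production version would also need rules for equivalences, exemptions, and deregistrations.

```python
import pandas as pd

# Invented enrollment records: one row per student per course attempt.
enrolments = pd.DataFrame({
    "student_id": ["s1", "s1", "s2", "s3", "s3", "s3", "s4"],
    "course": ["STA1000", "STA1000", "STA1000", "STA1000",
               "STA1000", "MAM1000", "MAM1000"],
})

# Number of attempts per student per course.
attempts = enrolments.groupby(["course", "student_id"]).size()

# Course Repeat Rate: the share of a course's students who took it
# more than once.
crr = (attempts > 1).groupby("course").mean().rename("repeat_rate")
print(crr.sort_values(ascending=False))
```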

The importance of identifying high repeat rate courses lies in their dual role: facilitating student advancement or hindering it. By concentrating on these ‘gatekeeper’ courses, we can explore opportunities for intervention through curriculum modifications or additional support mechanisms. The goal is to ensure these courses act as facilitators rather than barriers. My proposal suggests using course repeat rates not just as data points but as indicators of importance within our academic structure. This approach aims to enhance educational efficacy at UCT by improving individual student outcomes and refining institutional practices.

About me and this blog

Institutional research, when viewed through the lens of systems theory, embodies the university’s capacity for self-observation and self-description—key operations that sustain and adapt complex systems. By exploring these concepts, I aim to locate institutional research within its proper theoretical context: as the mechanism by which the university reflects on itself, generates knowledge about its structures and processes, and adapts to changing conditions. This blog will serve as my laboratory for analyzing these ideas, testing their practical applications, and ultimately contributing to a richer understanding of how institutional research supports the university’s continuous evolution. Through thoughtful analysis and dialogue, I hope to bridge theory and practice, building a framework that not only enhances my professional growth but also advances the field of institutional research itself.
– KM Kefale


Welcome to “Systems Theory for Institutional Research”, a blog where I explore the intersections of social systems theory and higher education analytics. My name is Kende Kefale, and I am an information analyst with a particular interest in higher education. This blog reflects my continued work in analyzing institutions as complex systems and leveraging data-driven insights to improve their operations and outcomes.

In 2013, I completed my PhD titled “The University as a Social System,” inspired by the groundbreaking work of Niklas Luhmann. Luhmann’s theory of social systems, which emphasizes the self-referential and operationally closed nature of systems, closely informs my approach to understanding universities. This lens allows me to analyze the interplay of subsystems within academic institutions and identify the feedback loops that drive their adaptation and evolution.

Over my career, I have worked closely with the University of Cape Town, contributing to institutional research, data analytics, and decision-making. My current role in the Institutional Information Unit and the Data Analytics for Student Success (DASS) team involves transforming institutional data into actionable insights that improve student outcomes and support evidence-based policies. I use tools like PowerBI, SQL, and Python to create impactful visualizations and prototypes that inform decisions across various university departments.

With my career trajectory now firmly set towards becoming an institutional researcher, I see this blog as a space to refine my ideas, share insights, and engage with the broader academic and professional community.

Thank you for visiting “Systems Theory for Institutional Research.” I hope you find the ideas shared here thought-provoking and relevant. Let’s explore how data, theory, and systems thinking can converge to shape the future of higher education.