Student retention is critical for educational institutions, impacting financial sustainability and
academic success. High dropout rates can lead to revenue losses and reputational damage.
Study Group, a global education provider, aims to enhance student success by identifying at-risk
students early and implementing proactive interventions. This study applies supervised
machine learning techniques to predict dropout risks, enabling Study Group to refine its support
strategies and improve student retention.
- Project Definition
- Jupyter Notebook
- Report
Business context
Study Group specialises in providing educational services and resources to students and professionals across various fields. The company’s primary focus is on enhancing learning experiences through a range of services, including online courses, tutoring, and educational consulting. By leveraging cutting-edge technology and a team of experienced educators, Study Group aims to bridge the gap between traditional learning methods and the evolving needs of today’s learners.
Study Group serves its university partners by establishing strategic partnerships to enhance the universities’ global reach and diversity. It supports the universities in their efforts to attract international students, thereby enriching the cultural and academic landscape of their campuses. It works closely with university faculty and staff to ensure that the universities are prepared and equipped to welcome and support a growing international student body. Its partnership with universities also offers international students a seamless transition into their chosen academic environment. Study Group runs several International Study Centres across the UK and Dublin in partnership with universities with the aim of preparing a pipeline of talented international students from diverse backgrounds for degree study. These centres help international students adapt to the academic, cultural, and social aspects of studying abroad. This is achieved by improving conversational and subject-specific language skills and academic readiness before students progress to a full degree programme at university.
Through its comprehensive suite of services, Study Group supports learners and universities at every stage of their educational journey, from high school to postgraduate studies. Its approach is tailored to meet the unique needs of each learner, offering personalised learning paths and flexible scheduling options to accommodate various learning styles and commitments.
Study Group’s services are designed to be accessible and affordable, making quality education a reality for many individuals. By focusing on the integration of technology and personalised learning, the company aims to empower learners to achieve their full potential and succeed in their academic and professional pursuits. Study Group is at the forefront of transforming how people learn and grow through its dedication to innovation and excellence.
I have been provided with three data sets:
Applicant and course information (Stage1_data.csv)
The data set contains 25,060 rows, each representing a learner with details about the learner (e.g. nationality or home country). It encompasses data related to the entire course and provides an overview of the learner’s performance and engagement throughout the course. The data set has 16 features I can choose from:
CentreName: Study Group centre name/identifier
LearnerCode: Student identifier
BookingType: The type of booking made for the course
LeadSource: How the learner found out about the course
DiscountType: The type of discount applied to the learner’s course fees, if any
DateofBirth: The student’s date of birth
Gender: The student’s gender
Nationality: The student’s nationality
HomeState: The state or province where the learner’s permanent residence is located
HomeCity: The city where the learner’s permanent residence is located
CourseLevel: The academic level of the Study Group course, e.g. Foundation, International Year 1 or Pre-Masters
CourseName: The full name of the course studied, e.g. Pre-Masters Business
IsFirstIntake: A boolean indicating if this is the learner’s first intake for the course
CompletedCourse: Whether the student completed the course (Yes/No)
ProgressionDegree: The student’s intended degree of study at the partner university (if eligible to progress)
ProgressionUniversity: The student’s intended partner university (if eligible to progress)
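As a minimal sketch of how this data dictionary might be used before modelling, the snippet below checks a CSV header against the 16 listed columns and turns CompletedCourse into a binary dropout label. It assumes the column holds the literal values "Yes" and "No"; the helper names `missing_columns` and `dropout_label` and the sample rows are hypothetical, not part of the supplied data or notebook.

```python
import csv
import io

# Column names as listed in the data dictionary above.
STAGE1_COLUMNS = [
    "CentreName", "LearnerCode", "BookingType", "LeadSource", "DiscountType",
    "DateofBirth", "Gender", "Nationality", "HomeState", "HomeCity",
    "CourseLevel", "CourseName", "IsFirstIntake", "CompletedCourse",
    "ProgressionDegree", "ProgressionUniversity",
]


def missing_columns(header, expected=STAGE1_COLUMNS):
    """Return the expected columns absent from a CSV header, in listed order."""
    present = set(header)
    return [name for name in expected if name not in present]


def dropout_label(completed_course):
    """Map CompletedCourse to a binary target: 1 = dropped out, 0 = completed.

    Assumes the literal values "Yes" and "No"; anything else raises.
    """
    value = completed_course.strip().lower()
    if value == "yes":
        return 0
    if value == "no":
        return 1
    raise ValueError(f"unexpected CompletedCourse value: {completed_course!r}")


# Tiny made-up sample standing in for Stage1_data.csv (most columns omitted).
sample = io.StringIO(
    "LearnerCode,CourseLevel,CompletedCourse\n"
    "1001,Foundation,Yes\n"
    "1002,Pre-Masters,No\n"
)
reader = csv.DictReader(sample)
rows = list(reader)

# Report which listed columns the sample lacks, then the derived labels.
print(missing_columns(reader.fieldnames))
print([dropout_label(row["CompletedCourse"]) for row in rows])
```

Treating a non-completion as the positive class (1) is a design choice made here so that the model's positive predictions correspond to the students who may need an intervention.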
Student engagement data (Stage2_data.csv)
Contains all the fields provided in Stage1_data.csv plus 2 additional fields:
AuthorisedAbsenceCount: Count of lessons where the student’s absence was explained and authorised, e.g. for medical reasons or extenuating circumstances (across all modules)
UnauthorisedAbsenceCount: Count of lessons where no explanation was given for student absence, or explanation was deemed unsatisfactory (across all modules)
Academic performance data (Stage3_data.csv)
Contains all the fields provided in Stage2_data.csv plus 3 additional fields:
AssessedModule: Total number of modules with assessment data
PassedModules: The total number of modules the learner passed
FailedModules: The total number of modules the learner failed
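The engagement and performance counts above can be combined into ratio features, which put learners with different course lengths on the same scale. The sketch below is one possible approach, not the method used in the notebook: the feature names `UnauthorisedAbsenceShare` and `ModulePassRate`, the helper functions, and the example record are all hypothetical. It divides by passed plus failed modules rather than AssessedModule, because the data dictionary does not say whether those totals always agree.

```python
# Fields each later file adds, as listed in the data dictionary above.
STAGE2_EXTRA = ["AuthorisedAbsenceCount", "UnauthorisedAbsenceCount"]
STAGE3_EXTRA = ["AssessedModule", "PassedModules", "FailedModules"]


def ratio(part, whole):
    """Return part / whole, or None when whole is zero (nothing to measure)."""
    return part / whole if whole else None


def derive_features(row):
    """Add two illustrative ratio features to one Stage 3 record (a dict of counts)."""
    absences = row["AuthorisedAbsenceCount"] + row["UnauthorisedAbsenceCount"]
    graded = row["PassedModules"] + row["FailedModules"]
    return {
        **row,
        # Hypothetical feature names, not columns in the supplied data.
        "UnauthorisedAbsenceShare": ratio(row["UnauthorisedAbsenceCount"], absences),
        "ModulePassRate": ratio(row["PassedModules"], graded),
    }


# Made-up record for illustration only.
learner = {
    "AuthorisedAbsenceCount": 2,
    "UnauthorisedAbsenceCount": 6,
    "AssessedModule": 5,
    "PassedModules": 4,
    "FailedModules": 1,
}
features = derive_features(learner)
print(features["UnauthorisedAbsenceShare"], features["ModulePassRate"])
```

Returning None for a learner with no recorded absences or no graded modules keeps "no data" distinct from a true ratio of zero, so the choice of how to impute it is left to the modelling step.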
I prepared a report for stakeholders presenting the key insights from my analysis. The findings show how the solution could improve student retention and educational outcomes and save the institution money. Specifically, I:
- Developed a predictive model to anticipate student dropout among adult learners, using the learner data provided by Study Group.