Group Formation: Algorithm Explainer
Updated over 6 months ago

Introduction

This article explains how our group formation algorithm works and how it handles different situations and student input.

Where Can I Read More About the Group Formation Tool?

We have a full suite of user guides for our Group Formation Tool - you can find the rest of the guides here:

What is the Algorithm?

A group formation algorithm is a computational process used to organize individuals into groups based on predefined parameters.

In educational contexts, particularly in schools or universities, group formation algorithms are employed by teachers or administrators to create balanced and effective groups of students for collaborative projects or activities.

What Does it Consider?

The group formation algorithm considers the following parameters, all of which the teacher sets during the configuration stage:

Number of Students in the Group: The desired size of each group.

Question-Level Grouping Preference: For each individual question, teachers decide whether students should be grouped based on similar or dissimilar answers.

Multiple Answers: When a question allows multiple answers to be selected, all selected answers contribute to the similarity or dissimilarity score.

How It Works

Step 1: Calculating Normalized (Dis)Similarity Score

A normalized score is calculated for each pair of students and each question. This score measures the degree of (dis)similarity between the two students’ answers on an individual question.

Normalization ensures fair comparisons across questions with different numbers of possible answers.

Normalized Similarity Score

When the question is configured to group students based on similar answers, the normalized Similarity Score is calculated as follows:

Where:

  • C = Number of common answers chosen by both students

  • T = Total number of answers chosen by both students

Example: in Question 1

  • Student 1 answers: A, B

  • Student 2 answers: A

  • Number of common answers: 1 {A}

  • Total number of answers chosen by both students: 3 {A, A, B}
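As a minimal sketch, the two counts used in the Question 1 example can be derived from each student's selected answers as shown below. The function name `similarity_counts` is hypothetical and not part of the tool. The sketch deliberately stops at the counts C and T and does not compute the score itself.

```python
def similarity_counts(answers_1, answers_2):
    """Return (C, T) for one question and one pair of students.

    C = number of answers chosen by both students
    T = total number of answers chosen, counting repeats
    """
    a, b = set(answers_1), set(answers_2)
    common = len(a & b)        # C: answers both students picked
    total = len(a) + len(b)    # T: every selection by either student
    return common, total


# Question 1: Student 1 answers A, B and Student 2 answers A
print(similarity_counts({"A", "B"}, {"A"}))  # (1, 3)
```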

Normalized Dissimilarity Score

When the question is configured to group students based on dissimilar answers, the normalized Dissimilarity Score is calculated as follows:

Where:

  • U = Number of unique answers from all answers chosen by the students

  • C = Number of common answers chosen by both students

  • A = Total number of all answers (including repeating ones)

Example: in Question 2

  • Student 1 answers: A, B

  • Student 2 answers: B, C

  • Number of common answers: 1 {B}

  • Number of unique answers from all answers: 3 {A, B, C}

  • Total number of all answers: 4 {A, B, B, C}
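The three counts from the Question 2 example can be sketched in the same way. Again, `dissimilarity_counts` is a hypothetical helper name, and the sketch only produces U, C, and A as defined above, not the score itself.

```python
def dissimilarity_counts(answers_1, answers_2):
    """Return (U, C, A) for one question and one pair of students.

    U = number of unique answers across both students
    C = number of answers chosen by both students
    A = total number of answers, including repeating ones
    """
    a, b = set(answers_1), set(answers_2)
    unique = len(a | b)        # U: distinct answers overall
    common = len(a & b)        # C: answers both students picked
    total = len(a) + len(b)    # A: every selection by either student
    return unique, common, total


# Question 2: Student 1 answers A, B and Student 2 answers B, C
print(dissimilarity_counts({"A", "B"}, {"B", "C"}))  # (3, 1, 4)
```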

Skipping Questions

When both students skip a question, they are considered similar to each other on that question.

Example: in Question 3, where students should be grouped similarly

  • Student 1 skips

  • Student 2 skips

Example: in Question 4, where students should be grouped similarly

  • Student 1 skips

  • Student 2 skips

Step 2: Calculate Student Pair Score

For each student pair, a student pair score is calculated as the sum of all normalized question scores.

Using the examples from Questions 1-4, the student pair score for Students 1 and 2 is calculated as follows:

This process is repeated for every pair of students in the assignment.
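The summing step can be sketched as follows. This is an illustration only: `student_pair_scores` and the `question_score` callback are hypothetical names, and the per-question values in the example are made up rather than taken from the tool.

```python
from itertools import combinations


def student_pair_scores(students, question_ids, question_score):
    """Sum the normalized per-question scores for every pair of students.

    question_score(question_id, student_1, student_2) is assumed to
    return the normalized (dis)similarity score for that question.
    """
    totals = {}
    for s1, s2 in combinations(students, 2):
        totals[frozenset((s1, s2))] = sum(
            question_score(q, s1, s2) for q in question_ids
        )
    return totals


# Made-up normalized scores, for illustration only
example_scores = {
    ("Q1", "s1", "s2"): 0.5, ("Q1", "s1", "s3"): 1.0, ("Q1", "s2", "s3"): 0.0,
    ("Q2", "s1", "s2"): 0.25, ("Q2", "s1", "s3"): 0.5, ("Q2", "s2", "s3"): 0.75,
}
totals = student_pair_scores(
    ["s1", "s2", "s3"],
    ["Q1", "Q2"],
    lambda q, a, b: example_scores[(q, a, b)],
)
print(totals[frozenset(("s1", "s3"))])  # 1.5
```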

Step 3: Matrix of Normalized Scores

Then, a matrix is created from the student pair scores, that is, the sums of all normalized question scores. For each student, a sorted list of pair scores is created. The matrix is sorted by pair score from highest to lowest, both horizontally and vertically.

This means that the top left corner of the matrix holds the highest student pair score.
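One way to picture this sorting is the sketch below. It is an interpretation under stated assumptions, not the tool's implementation: each row is sorted from best to worst partner, and the rows themselves are ordered by their best score. The name `sorted_score_rows` and the scores are hypothetical.

```python
def sorted_score_rows(pair_totals, students):
    """Build one sorted row of (partner, score) per student.

    pair_totals maps frozenset({s1, s2}) to the student pair score.
    Assumes at least two students.
    """
    rows = {}
    for s in students:
        partners = [(t, pair_totals[frozenset((s, t))]) for t in students if t != s]
        # Horizontal sort: best partner first within the row
        rows[s] = sorted(partners, key=lambda item: item[1], reverse=True)
    # Vertical sort: rows with the highest top score come first
    return sorted(rows.items(), key=lambda row: row[1][0][1], reverse=True)


# Made-up pair scores, for illustration only
pair_totals = {
    frozenset((1, 2)): 2.5,
    frozenset((1, 3)): 0.5,
    frozenset((2, 3)): 1.0,
}
matrix = sorted_score_rows(pair_totals, [1, 2, 3])
print(matrix[0])  # (1, [(2, 2.5), (3, 0.5)])
```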

The following is an example of the matrix:

Step 4: Grouping Students by Algorithm

  1. Matrix Processing: The algorithm processes the matrix of normalized scores created in the previous steps. Each row in this matrix represents a student, and the scores in that row indicate how well this student pairs with every other student.

  2. Window Concept: The algorithm uses a "window" to find groups of students. The initial size of the window matches the desired group size. If it doesn't find enough matching students, it increases the window size.

  3. Identifying Groups:

    • The algorithm starts with the first row and sets the window size to match the desired group size.

    • It looks at the top matches within the window size for the first student.

    • Then, it goes to the row of the first match, and checks if the same students exist in that window. It repeats this process for all students in the window.

    • If it finds matching students in all windows, those students are grouped together.

    • If it doesn’t find enough students who match across their respective rows within this window, it increases the window size and repeats the process until it finds a sufficient number of matching students.

  4. Forming Groups: Once a group is formed, the students in that group are removed from the matrix. The algorithm then repeats the process with the remaining students until all groups are formed.

Example:

Suppose the desired group size is 3 students.

  • The algorithm starts with student 1 with a window size of 3, and looks at the matches within the window (students 4 and 2).

  • It then checks the row of student 4 to see if students 1 and 2 appear as top matches within the window size.

  • It also checks the row of student 2 to see if students 1 and 4 appear as top matches within the window size.

  • If not enough matches are found, it increases the window size and looks again until it finds a consistent set of students to form a group.

  • In this case, students 1, 2, and 4 appear in each other’s windows. Therefore, they are grouped together and removed from the matrix.
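The window procedure described above can be sketched roughly as follows. This is a simplified illustration, not the tool's actual implementation, and it rests on several assumptions: a window of size w covers a student plus their top w - 1 partners; with a larger window, any subset of the desired size whose members all appear in each other's windows is accepted; the search starts from the first remaining student in input order; and leftover students form a final, smaller group. All function names and scores are hypothetical.

```python
from itertools import combinations


def ranked_partners(scores, students):
    """For each student, list the other students from highest to lowest pair score."""
    return {
        s: sorted(
            (t for t in students if t != s),
            key=lambda t: scores[frozenset((s, t))],
            reverse=True,
        )
        for s in students
    }


def form_groups(scores, students, group_size):
    remaining = list(students)
    groups = []
    while len(remaining) > group_size:
        ranked = ranked_partners(scores, remaining)
        first = remaining[0]
        window = group_size  # initial window matches the desired group size
        group = None
        while group is None:
            candidates = ranked[first][: window - 1]
            for rest in combinations(candidates, group_size - 1):
                members = (first, *rest)
                # Every member must see all other members within their own window
                if all(
                    set(members) - {m} <= set(ranked[m][: window - 1])
                    for m in members
                ):
                    group = list(members)
                    break
            else:
                window += 1  # no consistent set found, widen the window
        groups.append(group)
        # Grouped students are removed before the next pass
        remaining = [s for s in remaining if s not in group]
    if remaining:
        groups.append(remaining)  # assumption: leftovers form the last group
    return groups


# Made-up pair scores in which students 1, 2, and 4 match each other best
scores = {frozenset(p): 0.5 for p in combinations(range(1, 7), 2)}
scores[frozenset((1, 4))] = 3.0
scores[frozenset((2, 4))] = 2.8
scores[frozenset((1, 2))] = 2.5
print(form_groups(scores, [1, 2, 3, 4, 5, 6], 3))  # [[1, 4, 2], [3, 5, 6]]
```

In this run, student 1's window of size 3 contains students 4 and 2, and both of their windows contain the other two members, so the group is formed without widening the window.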

Participants and Non-Participants

The algorithm runs separately for participants (students who answered some or all questions) and non-participants (students who didn't answer any questions).

This ensures that participants are grouped with participants and non-participants are grouped with non-participants, allowing for fair and relevant groupings based on survey responses.

Students who skipped all questions are still considered “participants” and are treated as having answered all questions.
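The split into two separate runs can be sketched as below. How the tool tells a student who skipped every question apart from one who never took part is not stated here, so this sketch assumes a hypothetical `responses` mapping in which `None` means no submission at all, while a submission with only skipped questions still counts as participation.

```python
def split_by_participation(responses):
    """Separate participants from non-participants.

    responses maps each student to their submission, or to None if
    they never took part (an assumption made for this sketch).
    """
    participants = [s for s, r in responses.items() if r is not None]
    non_participants = [s for s, r in responses.items() if r is None]
    return participants, non_participants


responses = {
    "s1": {"Q1": {"A"}},   # answered
    "s2": {"Q1": set()},   # skipped every question, still a participant
    "s3": None,            # never took part
}
print(split_by_participation(responses))  # (['s1', 's2'], ['s3'])
```

Each of the two lists would then be grouped on its own.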
