Welcome to this training material on Generative AI (GenAI). As you may know, publicly available Generative AI tools can generate fresh text responses to inputs. If you ask a tool like ChatGPT to "generate a children's story about a tree" three times, you will get three different stories.
This has obvious implications for universities, with students potentially using such tools to answer assessment questions without proper acknowledgement. Our aim with this material is to help you distinguish between text produced by students and text produced by Generative AI tools.
Although we have no resources to maintain this site, please contact us if you have any questions or comments relating to this material.
Please note: This site originally included online surveys and quizzes that were not anonymous. These have been removed, but we have outlined the nature of the questions asked.
This work was partially funded by the Assessment Innovation Fund of the NCFE as part of a study into the robustness of GenAI assessment submissions across a range of assessment types and the effectiveness of training markers in the detection of GenAI outputs (2023-2024). This work has also been supported by eSTEeM, the STEM Centre for Scholarship and Innovation at the Open University.
On completion of this training material, you should be aware of:
The focus of this staff development is to give an overview of issues to be aware of in identifying and responding to inappropriate use of Generative AI tools that generate textual output, in assessment material submitted by students.
We also briefly cover what the university expects around student use of AI. The university framework for Generative AI allows students to use AI outputs under certain conditions – i.e., provided its use is acknowledged and no attempt is made to pass the AI outputs off as the student’s own work. If this is not followed, or if the student shows poor practice in the use of AI (such as failing to ‘sense check’ the AI content they use), then a range of responses is open to the marker, from identifying teaching points in assessment feedback, through requesting additional study skills sessions, to reporting the issues for consideration by the Academic Conduct officers. All of these responses must, of course, be within university policies and module, qualification, and discipline practices.
This work will not cover how Generative AI tools work, how to frame assessment so as to avoid excessive use of AI-generated output, or the strengths and weaknesses of allowing students to learn with AI support. The focus is specifically on recognising when AI might have been used in assessment content.
We will not be considering non-textual Generative AI tools or tools built on technologies such as Stable Diffusion (a generative AI model that produces unique photorealistic images from text and image prompts).
We must begin to consider the use of GenAI in assessment submissions in much the same way as we consider poor academic practice in writing styles, tone, presentation and referencing. When identified, these often lead to teachable moments and sometimes trigger disciplinary investigations. We need to make similar judgements about the appropriate handling of GenAI. In all cases, this starts with identifying the problem.
Unfortunately, the wide variety of student behaviours makes the identification of artificial answers very difficult.
Because of the way GenAI tools work, we will never be able to identify the source documents or unreferenced content – it doesn’t exist; each new output is “generated” on demand.
GenAI tools are rapidly changing, and as more AI training data is gathered, and as new uses/application areas are found, the characteristics, strengths and weaknesses of the different tools will change. The advice given here will become less relevant as the tools develop.
Similarly, as the education sector’s experience of the uses and abuses of AI develops over time, we can expect that advice and guidance on its use in teaching, learning and assessment and the policies within organisations will also develop. Please be aware of changes to university advice and guidance over time.
Anecdotally, frequent users of GenAI find it easier to recognise the particular styles of language use, errors, and weaknesses common in text produced by GenAIs. Therefore, one of the best ways to develop an awareness of the writing style of the tools – within the remit of the university policy of not presenting university materials to AI tools – is to spend some time with them.
If you have never, or only rarely, used a GenAI tool, we recommend spending about 10 minutes exploring one. Pick one (or more) of the tools suggested and discuss one of your hobbies with it. Ask for a summary of an aspect of the hobby, then ask questions about it, both factual and opinion-based. Ask questions that use the kinds of language prompts we use in assessment elements – explain, describe, give an example, show how, reflect on, and discuss the impact of X on Y.
The library has some advice on using GenAIs and examples of how to write an effective prompt. (See [insert link here]).
You might also try some basic prompt engineering – ask ChatGPT or Gemini to write in different styles, such as that of a first-year university student or the holder of an advanced degree. Ask it to work to word lengths, or to shorten an answer it has already given. Ask it to include spelling and grammar errors.
Signing up to use software tools for practical activities

We make no recommendation as to which platform you should use, but the following provide the functionality you will need and have free accounts. You should read the terms and conditions of the platforms before signing up.
Finally, we'd appreciate you completing this short survey (8 questions), which captures information about your familiarity and experience with Generative AI tools. This, along with the closing survey at the end of this material, will help us review the content, style, and structure of the advice and guidance, and improve the material for future users. Please note: the survey is not anonymous.
Before reading the current guidance for students and assessment use of GenAI, please note the University has a blanket policy restricting the use of university materials with AI tools, covering teaching material, assessment content, and content from student scripts.
[Outline the academic contact advice on GenAI]
The generic advice and guidance for staff and students on appropriate uses of GenAI can be found on the following GenAI policies and statements pages.
Overview: [insert link]
Students: [insert link]
Staff: [insert link]
In addition, there may be specific guidance supplied for particular qualifications or modules which may apply to specific students and groups of students.
Before we start, it would be useful to get an idea of how well you can distinguish between answers produced by real people and synthetic solutions produced by Generative AI tools. This will help us assess whether the training is effective in supporting the recognition of student versus generative AI material.
The Pre-training quiz contains a selection of snippets from student coursework and generative AI responses to coursework questions. Each question requires you to decide if the answer to a question was produced by a human or by a generative AI tool. Each question should take around 2 minutes. At the end of the quiz we will give feedback to indicate some of the ‘tells’ within the text that we will explore further in the rest of this material.
Please note: The quiz is not anonymous within the VLE site. We’ll repeat this test after you’ve completed the rest of the material. We hope it will give you more confidence when engaging with assessment marking, and it will also help us review and improve the training content.
While GenAI tools, and the potential for students to inappropriately use GenAI outputs in their learning and assessment, have currently captured the academic zeitgeist, you should not spend additional time checking student scripts specifically for GenAI; it should be part of the general ‘take’ you get from a script when marking.
This is similar to what we do already. When marking student work, we sometimes get the feeling that something is not quite right, that something doesn’t ring true, or that an answer seems odd for a level 1, 2, or 3 undergraduate. If that feeling is sufficiently strong, we might look for evidence that the unease is a sign of plagiarism, copying, falsification or, more recently, the output of an essay mill. To this list, we now add GenAIs.
Only if you get an initial sense that something feels suspicious should you try to explore further. The purpose of the core of this training is to help you identify many elements that can suggest something is suspicious.
Deciding that a student-submitted assessment contains inappropriate use of GenAI is similar to deciding that a piece of work may be plagiarised, copied, or the output of an essay mill: you look for evidence to support the suspicion. However, further investigation of copying or plagiarism may lead to definitive evidence (an identical piece of work is found elsewhere). In the case of essay mill and GenAI output, there won't be an identifiable source or identical piece of work; in both these cases, your concerns will remain a suspicion.
Identifying GenAI tool output will always be a ‘balance of probabilities’ judgement. Except for a very few cases, there will be no irrefutable proof that text was generated by an AI.
The definitive cases will be where a student simply copies and pastes GenAI output directly into their assignment without even a cursory attempt to review and edit what has been copied. In a few cases you will spot phrases such as “As a Generative AI, I cannot….”
This also applies to the AI detection measures featured in tools such as Turnitin. Turnitin can report the percentage of text that may (with a probability of greater than 95%) have been generated by an AI. This measure is based on a statistical analysis of the patterns of language use and word selection within the text; it does not provide concrete evidence that the material is AI-generated, just a probability. Consequently, students can always challenge or deny the suspicions with responses such as "I write bland text", "I made a mistake with my facts" or "I’m not good at referencing". This is why the Academic Conduct Office stresses the need for careful documentation of the concerns you identify.
Most suspicions are likely to lead to study skills and teachable moments because, as we see later, many of the 'tells' that genAI might have been used are also the kinds of errors that students regularly make when submitting assessment materials.
For the detection of Generative AI output, there is no equivalent to Turnitin identifying blocks of verbatim text in external sources, or CopyCatch identifying blocks of identical text in two or more student submissions. There is also no point in putting suspected generated text into a search engine, or in matching a student submission against text generated by putting the same questions into a GenAI tool.
GenAIs create novel outputs in response to prompting. Only in very short passages of generated text are you likely to see the entire output repeated. So, there will be no exact matching text unless it has been prompted to provide an exact quote or a short fact.
And don’t forget, university policy says we cannot put student-generated material into an AI, as we do not have the student's permission to give away their text, and we should not be putting our own copyrighted questions into GenAI prompts.
A problem faced when trying to identify traces of GenAI in submitted text arises because GenAI’s text manifests many of the same problems we find in student work. GenAIs have, after all, been trained on a vast quantity of general content, not on curated academic or discipline-specific content, and have no sense of the meaning of the text they process or output.
For example, Generative AI output can have the appearance of text from someone who has weak academic skills, lacks specific study skills around written work, is a non-native speaker of English, has specific reading or writing disabilities resulting in the use of permitted support tools that use AI, or is poor at referencing source materials. In the absence of concerns related to AI, these would more often result in comments, feedback, or targeted suggestions for additional support or study skills practice than in an academic misconduct referral.
In cases where there is insufficient evidence of GenAI but the assessment material warrants giving student feedback, this fits with the university policy of helping students develop their work in an AI-enabled world.
When working through the examples in the next section, you will recognise many occasions on which you have seen student work that is indistinguishable from the examples of GenAI weaknesses we show.
The next main section gives examples of what might raise concerns when we see student scripts.
In the following sub-sections, each page will describe what might arouse suspicion, usually followed by one or more example “question and answer” pairs that illustrate aspects of that suspicion. In cases where an example would be too large to be useful, or too ‘obvious’ to be needed, it is usually placed at the bottom of the page.
The answers will consist of a mix of ChatGPT-generated and human-edited ChatGPT text. We’ve followed the answer, in many cases, with a note of what a marker might flag if they saw that answer, and how a student could respond to accusations of the use of a GenAI tool to prepare the answer, beyond the simple response of ‘Prove it!’.
The sections are grouped under the following headings:
All examples were generated by the public ChatGPT 3.5 between 25 January and 15 April 2024.
These are places where GenAIs simply get things wrong, usually because, despite appearances, they don't understand the text they are processing; they work on patterns and word associations. The output is often quite convincing at a surface level, but when you check it carefully you often find weaknesses, mistakes, and contradictions.
Similar to the basic mistakes of the previous section, Generative AI output does not always follow the guidance and common-sense expectations that apply when constraints are placed on how questions should be answered.
These relate to how features such as patterns of language use, discipline-specific terminology and practices, and typographical layout can suggest issues with the content.
It is not just the content of the answer that suggests something is odd about a student submission. The context and data around that submission can also raise concerns.
We are interested in whether the training is effective in supporting the identification of student-generated and potentially GenAI-generated content. Now you have completed the staff development materials, we'd like you to take a post-training quiz.
The questions are in the same format as the Pre-training quiz you completed earlier, but the content of those questions is different. As before, you will be asked to select whether an answer is from a student or a generative AI.
Please note: The quiz is not anonymous within the VLE site. Thank you for completing the quiz.
The impact GenAI is going to have on academia remains uncertain; hopefully this training material has given you some understanding of the current situation for exploring and reporting the suspicion of GenAI tool use. Some of the key takeaways are:
Finally, we'd appreciate you completing this closing survey to capture information about your experience in engaging with the material. This will help us review the content, style, and structure of the advice and guidance, and improve the topics for future users.