News:

Welcome to the new (and now only) Fora!

Main Menu

A whole new ballgame in cheating. Introducing ChatGPT

Started by Diogenes, December 08, 2022, 02:48:37 PM

Previous topic - Next topic

Diogenes

So a new "AI" chatbot called ChatGPT just hit the world. It's built by OpenAI, which trained it on massive amounts of internet data. The underlying software has existed for a while, but this is the first freely available, user-friendly platform. Ask it questions and it gives you answers, but in a conversational way. It'll even try to emulate voice and tone if you ask for, say, frat-bro lingo. Ask it to write you an essay on x, and it'll give it to you. And since it's not just copy/pasting from the net but using machine learning, plagiarism-detection software will not catch it.

I'm not in the "sky is falling" camp. First off, I've found that it's a lot of fun to play with. Second, when I tested it on standard definitions and essays from my field, it gave satisfactory responses, but only at a cursory level. Guessing Wikipedia had something to do with its "knowledge," I asked specific questions and got regurgitated answers very similar to Wikipedia's, but different enough. On higher-level questions in my field it often gave wrong or incomplete answers. Maybe the student wouldn't get full credit, but I wouldn't see cheating in the responses. It did give seemingly real-world examples of concepts, though, and for a quick essay or discussion post in a 101 course, I could be fooled. It just goes to show that scaffolding and hyper-specific criteria really matter.

The final thing I learned it can do in this context: it'll create multiple-choice and essay questions for you! Which convolutes the ethics even more. I mean, it wrote some better ones for me than some from the textbook banks, whose publishers will certainly use this kind of software in the future if they don't already.

https://chat.openai.com/

Morden

This is terrifying for 1st- and 2nd-year essays. By the time we're in 3rd-year courses (which require scholarly sources and citations), it's not as bad--although honestly, if you used it as a skeleton and then added a few sources here and there, you could come up with a decent mark.

Liquidambar

Diogenes, did you have to create an account to try this?  It wouldn't show me anything without an account, so I started creating one, but I stopped when it asked for my real name and phone number.
Let us think the unthinkable, let us do the undoable, let us prepare to grapple with the ineffable itself, and see if we may not eff it after all. ~ Dirk Gently

Diogenes

Quote from: Liquidambar on December 08, 2022, 04:20:00 PM
Diogenes, did you have to create an account to try this?  It wouldn't show me anything without an account, so I started creating one, but I stopped when it asked for my real name and phone number.

Oh, yes, it did require a free login. I don't recall what details were required, but it never asked for sensitive financial data or anything. Ironic that it has to keep the fakes out.

Parasaurolophus

I haven't tried it for multiple-choice questions. But for essays, it does a passable job. Whether it will continue to do so as more AI-generated text moves online is another question. (That's what ruined AI image generation.)

I've returned to in-class essays for lower-level courses.

People have started developing tools to detect AI-generated text: https://huggingface.co/openai-detector
I know it's a genus.

Puget

It is definitely the death of the take-home essay exam, at least when it comes to fairly generic questions.

However, what is most striking to me is how it recapitulates so many of the cognitive errors that humans make. Not surprisingly, cognitive-psychology Twitter spent a few days testing it on all the old heuristic classics and posting the results. It is especially bad at even elementary logic problems and math when they're embedded in word problems, but it isn't random: it fails in the same ways humans do (which raises the question: is the goal to be like humans, or to be right?).
"Never get separated from your lunch. Never get separated from your friends. Never climb up anything you can't climb down."
–Best Colorado Peak Hikes

Liquidambar

Quote from: Parasaurolophus on December 08, 2022, 06:08:08 PM
People have started developing tools to detect AI-generated text: https://huggingface.co/openai-detector

I typed a few sentences about my day (about twice the 50 tokens it said were needed for accuracy).  It gave a 99.73% chance my text was fake.  Okay, it wasn't an exciting day, but I'm a bit insulted.
Let us think the unthinkable, let us do the undoable, let us prepare to grapple with the ineffable itself, and see if we may not eff it after all. ~ Dirk Gently

Parasaurolophus

Quote from: Liquidambar on December 08, 2022, 07:28:01 PM
Quote from: Parasaurolophus on December 08, 2022, 06:08:08 PM
People have started developing tools to detect AI-generated text: https://huggingface.co/openai-detector

I typed a few sentences about my day (about twice the 50 tokens it said were needed for accuracy).  It gave a 99.73% chance my text was fake.  Okay, it wasn't an exciting day, but I'm a bit insulted.

We get several hundred bot registrations every day. Perhaps you slipped through!
I know it's a genus.

Liquidambar

Quote from: Parasaurolophus on December 08, 2022, 07:36:15 PM
Quote from: Liquidambar on December 08, 2022, 07:28:01 PM
Quote from: Parasaurolophus on December 08, 2022, 06:08:08 PM
People have started developing tools to detect AI-generated text: https://huggingface.co/openai-detector

I typed a few sentences about my day (about twice the 50 tokens it said were needed for accuracy).  It gave a 99.73% chance my text was fake.  Okay, it wasn't an exciting day, but I'm a bit insulted.

We get several hundred bot registrations every day. Perhaps you slipped through!

That must be it.  It's easy to fake being a professor here.  Blah, blah, plagiarism, kids these days never read the syllabus...
Let us think the unthinkable, let us do the undoable, let us prepare to grapple with the ineffable itself, and see if we may not eff it after all. ~ Dirk Gently

Morden

Quote from: Liquidambar on December 08, 2022, 04:20:00 PM
Diogenes, did you have to create an account to try this?  It wouldn't show me anything without an account, so I started creating one, but I stopped when it asked for my real name and phone number.

I lied about my name, but it did want my cell number; I tried a landline number from the university, but it didn't take it. Later there is info about how to cancel your account.

marshwiggle

Quote from: Diogenes on December 08, 2022, 04:23:14 PM
Quote from: Liquidambar on December 08, 2022, 04:20:00 PM
Diogenes, did you have to create an account to try this?  It wouldn't show me anything without an account, so I started creating one, but I stopped when it asked for my real name and phone number.

Oh, yes, it did require a free login. I don't recall what details were required, but it never asked for sensitive financial data or anything. Ironic that it has to keep the fakes out.

So, could you create one real account, and then use the bot itself to create fake accounts? A kind of anti-AI jiu-jitsu.
It takes so little to be above average.

Caracal

Quote from: Parasaurolophus on December 08, 2022, 06:08:08 PM
I haven't tried it for multiple-choice questions. But for essays, it does a passable job. Whether it will continue to do so as more AI-generated text moves online is another question. (That's what ruined AI image generation.)

I've returned to in-class essays for lower-level courses.

People have started developing tools to detect AI-generated text: https://huggingface.co/openai-detector

Yeah, when I returned to classroom teaching, I flirted with the idea of keeping the essay exams take-home, but I just didn't want to have to worry about the cheating. I really don't worry much about plagiarism with the out-of-class essays and assignments I give, because students are always supposed to be doing something quite specific involving primary sources. I'm sure I sometimes miss plagiarism, but when I do, it's probably because a disjointed cut-and-paste job that doesn't fulfill the requirements of the assignment can be hard to distinguish from a disjointed paper the student actually wrote that doesn't fulfill the requirements. The cheaters aren't usually getting an advantage from their cheating. I'm not even sure they save much time over the student who wrote their own crummy paper.

With an in-class or take-home essay exam, however, the questions basically have to be pretty broad, and when you don't expect students to spend huge amounts of time writing their answers, all but the very best exams are going to be pretty general and formulaic. Even when students discuss things we haven't covered in class or the reading, I can't assume that means they're cheating. I teach American history, so people bring in all kinds of stuff they learned in high school, or even from their own reading, and while I only give points for evidence from class, it wouldn't be reasonable to penalize students for knowing something else that's relevant to the question. Better to just keep the exams in class and not have to worry about it...

Wahoo Redux

There was a very alarmist, sky-is-falling thread over on Reddit about this very subject.

I took the unpopular position that we may just have to accommodate AI, whether we like it or not.

Eventually these programs will be able to handle higher-end, even expert-level writing projects.  "Writing a paper" for school may be as alien to your grandchildren as shoeing a horse is to most of us.  AI may be the new automobile of the brain.

I don't know how education will adjust, but it probably will have to adjust.  Perhaps a return to Platonic and Socratic oral presentation.
Come, fill the Cup, and in the fire of Spring
Your Winter-garment of Repentance fling:
The Bird of Time has but a little way
To flutter--and the Bird is on the Wing.

aprof

For fun, I gave ChatGPT the T/F portion of my final exam (1st-year grad STEM course).  It scored 64%. It's hard to point to a pattern in the problems it missed.  Anything that's just a basic definition check is easy for it to get correct, just as it would be for a student who had the opportunity to Google an answer.  The more inferences it has to draw to concepts beyond what's in the statement, the more it struggles to connect them properly.

I also gave it a few calculated problems but it bombed quite badly on those.  I would have probably given it 25% partial credit for identifying a few equations and concepts that were required but then applying them all wrong.

Overall, its responses read to me like that undergrad who is very confident but not quite as smart as they think they are.  It says things so authoritatively that they sound correct, and if you're just skimming or aren't an expert on the subject, you'd probably overlook the errors. I'm not sure if this is better or worse than actually being smart and capable.  Seems to me it can just lead to a greater spread of disinformation and false expertise.

marshwiggle

Quote from: aprof on December 09, 2022, 08:01:52 AM
For fun, I gave ChatGPT the T/F portion of my final exam (1st-year grad STEM course).  It scored 64%. It's hard to point to a pattern in the problems it missed.  Anything that's just a basic definition check is easy for it to get correct, just as it would be for a student who had the opportunity to Google an answer.  The more inferences it has to draw to concepts beyond what's in the statement, the more it struggles to connect them properly.

I also gave it a few calculated problems but it bombed quite badly on those.  I would have probably given it 25% partial credit for identifying a few equations and concepts that were required but then applying them all wrong.

Overall, its responses read to me like that undergrad who is very confident but not quite as smart as they think they are. It says things so authoritatively that they sound correct, and if you're just skimming or aren't an expert on the subject, you'd probably overlook the errors. I'm not sure if this is better or worse than actually being smart and capable.  Seems to me it can just lead to a greater spread of disinformation and false expertise.

Artificial Dunning-Kruger effect?
It takes so little to be above average.