NYT: A.I. Wrote Ivy League Application Essays

Started by Wahoo Redux, September 08, 2023, 08:13:13 AM

Previous topic - Next topic

Sun_Worshiper

Quote from: Hegemony on September 10, 2023, 12:51:34 AM
Quote from: Sun_Worshiper on September 09, 2023, 06:45:23 PM
Let's say that I ask ChatGPT to write me an essay about how the new rules in baseball are ruining the game. In doing so, I explain to it the argument that I want it to make. The LLM spits out about 1000 words in seconds, capturing the essence of my argument and providing some good prose in the process. But it is a bit clunky, with several run-on sentences and examples from history that turn out to be inaccurate. From there, I rewrite the essay to sharpen up the argument, clean up the grammar, and put in accurate historical examples. By this point, the essay looks rather different from the one that ChatGPT created, although it still follows the structure of the first draft and there are large segments of prose that the LLM wrote. If I submit it to a magazine for publication, the editor will have some ideas for how it should be updated and amended for publication, and once these changes are made the essay will look still more different from what ChatGPT created in the first place.

So is this plagiarism/cheating? It was my idea, yes, and I did a lot of work to bring it to fruition. But I had help in this scenario turning it into an article and some of the essay structure and prose will be from the program. If a human had written that first draft for me, then they should rightfully be the coauthor, but it isn't a human, so those same rules may not apply.

As for whether it is a good idea, I guess that is for every individual to decide. But I will say this: Nobody will know that I used an LLM, since it can't be accurately detected. And 95% of the other writers in newspapers and magazines are doing the same thing, perhaps in an even more egregious way.

Each individual piece of writing will look pretty much okay, though probably a bit bland. But soon people will get used to the clichéd writing that AI does. I am a reader for a fiction publication, and we are getting a ton of AI-written submissions. I imagine the submitter gives the AI some instructions like "Write a story about a romance that goes wrong. Include a cute dog." Something like that. But the stories are all the same. The sentence structure of the first sentence is always the same. They always develop in the same way. You can predict where the first dialogue will occur. You can predict when the narrator will summarize part of the story. This is true whether the story is four paragraphs long or twenty pages long. It is so, so recognizable.

I'm sure each submitter reads the product and thinks, "Yeah, this is a pretty good story! Totally worth submitting!" They don't have the experience to see that it's like the other dozen AI-written stories we received today, and the hundred we received last week.

I also notice that a lot of the anonymous "heartwarming stories" and anecdotes on Facebook and websites are now AI-written. Because once you're aware of the formula, you can spot it a mile off.

When my students started turning in AI-written stuff, I recognized it instantly because it showed the same formulas and patterns. The students confirmed my instincts by confessing.

I'd guess the same would be true of your piece about baseball. You may think the AI spits out something pretty intelligible. But if you leave some AI-written parts in after your revision, the editor will be all too familiar with those formulas, and will reject them because they're tired and weak and she gets a dozen submissions like that all the time. And if you don't leave in the AI-written formulas and clichéd expression, well, you might as well write the thing from scratch in the first place.

You are describing something very different from the scenario that I laid out:

I am explaining how an experienced writer can craft an article with an LLM, by feeding it a detailed argument (not "write a romance gone wrong with dogs") and then editing it heavily. You are describing a scenario where an inexperienced writer gives the LLM very simple instructions, then sends more-or-less exactly what ChatGPT wrote to a magazine. Of course the example you offered will not produce a great piece of writing, but an experienced writer using the approach that I described could certainly write a publishable editorial - every bit as good as those that are frequently published in Foreign Policy or Harvard Business Review, for example (and it is quite likely that many of the folks who write those sorts of articles are using ChatGPT just as I described).
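
To make that concrete, here is a minimal sketch of the kind of call I have in mind, using the OpenAI Python client. The model name and the prompt are illustrative stand-ins, not a recommendation; the point is that the writer supplies the full argument up front and then does the real revision work on the draft:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The writer's own argument, spelled out in detail -- not a one-line prompt.
detailed_argument = (
    "Write a ~1000-word opinion essay arguing that baseball's new rules are "
    "ruining the game. Thesis: the pitch clock and the shift ban trade "
    "strategic depth for pace. Make these points, in this order: ..."
)

response = client.chat.completions.create(
    model="gpt-4",  # assumed model name
    messages=[{"role": "user", "content": detailed_argument}],
)
first_draft = response.choices[0].message.content

# From here the human does the real work: restructure the argument, fix the
# run-on sentences, and replace any invented historical examples with
# verified ones before an editor ever sees the piece.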

As for students, of course some of them are transparently using ChatGPT to write an essay and pasting it into the Word doc as is. You can often tell that this was written by AI, although you won't really know for sure, since the AI detection tools we have are not reliable. However, I am very skeptical that you would be able to tell if a student followed the instructions that I offered above.

Parasaurolophus

I think that, where students are concerned, we're just far more likely to encounter Hegemony's strategy than the one you describe, Sun_Worshiper. And although I still think that the ideal scenario you describe violates the principles of academic integrity, it's certainly a lot closer to being acceptable.

As far as proof is concerned, you're right that we're unlikely to be able to reliably detect work generated in that ideal scenario. There's probably no point in trying to do so, beyond warning students against doing it at all. But I think that the standards of proof are fairly easily met for Hegemony's non-ideal scenario, which is the kind of thing I'm encountering a lot of here.

For the most part, the generated essays don't adequately deal with course material or speak to the prompt, or simply make up sources or attribute made-up stuff to them (e.g. my example in another thread of Foucault's Archaeology of Knowledge being about dinosaurs); for those, even if the student could prove it was their own work, it merits a zero. That's easy. Some are a step up, and harder to detect or prove considered in themselves. But when you place one side by side with other, very similar essays submitted on the same topic, then you have a pretty good indication that it was AI-generated. Similarly, if I can prompt the AI to generate something sufficiently similar, then I've got good proof. In my experience so far, 'sufficiently similar' means basically identical, but thesaurusized, like the good old days when students would run their work through paraphrasing software.
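
If you wanted to put a rough number on 'sufficiently similar', a minimal sketch with Python's standard library would do it; the essay snippets and the threshold below are made-up stand-ins for illustration, not calibrated values:

from difflib import SequenceMatcher
from itertools import combinations

essays = {
    "student_a": "The new pitch clock fundamentally alters the rhythm of the game, trading strategic depth for pace.",
    "student_b": "The new pitch timer fundamentally changes the rhythm of the game, trading strategic depth for speed.",
    "student_c": "My grandfather taught me to keep score by hand, and the old rhythms are what I remember best.",
}

# Compare word sequences pairwise: 'thesaurusized' copies match almost
# everywhere except the scattered synonyms.
for (name_x, text_x), (name_y, text_y) in combinations(essays.items(), 2):
    ratio = SequenceMatcher(None, text_x.split(), text_y.split()).ratio()
    if ratio > 0.75:  # assumed threshold; tune against known-independent work
        print(f"{name_x} vs {name_y}: {ratio:.0%} similar -- worth a closer look")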

The AI will, of course, get better over time, even with just the basic prompts. But for now, I think we can have sufficient confidence to fail a large class of generated essays.
I know it's a genus.

Sun_Worshiper

Quote from: Parasaurolophus on September 10, 2023, 10:32:34 AM
I think that, where students are concerned, we're just far more likely to encounter Hegemony's strategy than the one you describe, Sun_Worshiper. And although I still think that the ideal scenario you describe violates the principles of academic integrity, it's certainly a lot closer to being acceptable.

As far as proof is concerned, you're right that we're unlikely to be able to reliably detect work generated in that ideal scenario. There's probably no point in trying to do so, beyond warning students against doing it at all. But I think that the standards of proof are fairly easily met for Hegemony's non-ideal scenario, which is the kind of thing I'm encountering a lot of here.

For the most part, the generated essays don't adequately deal with course material or speak to the prompt, or simply make up sources or attribute made-up stuff to them (e.g. my example in another thread of Foucault's Archaeology of Knowledge being about dinosaurs); for those, even if the student could prove it was their own work, it merits a zero. That's easy. Some are a step up, and harder to detect or prove considered in themselves. But when you place one side by side with other, very similar essays submitted on the same topic, then you have a pretty good indication that it was AI-generated. Similarly, if I can prompt the AI to generate something sufficiently similar, then I've got good proof. In my experience so far, 'sufficiently similar' means basically identical, but thesaurusized, like the good old days when students would run their work through paraphrasing software.

The AI will, of course, get better over time, even with just the basic prompts. But for now, I think we can have sufficient confidence to fail a large class of generated essays.

What are the standards of proof? AI detection software is not reliable, so instead you are just using your own intuition about what is generated by an LLM? Seems like a questionable basis for failing students.

Anyway, people tend to look at LLMs in very black-and-white terms: either they are amazing or the writing they produce is crap. The reality is that an LLM is a very helpful writing tool that experienced writers can and do use to supercharge their productivity, and that many, if not most, students are using to complete assignments* - mostly without being detected. Trying to ban its use or failing students you suspect of using it is, imo, the wrong strategy. Better to accept that this technology is going to be used in professional writing for the foreseeable future and teach students to understand its strengths and weaknesses.

*89% of students have used it, according to this article: https://www.forbes.com/sites/chriswestfall/2023/01/28/educators-battle-plagiarism-as-89-of-students-admit-to-using-open-ais-chatgpt-for-homework/?sh=6a167cfc750d



Parasaurolophus

If I compare twenty papers on the same topic and they're virtually identical, barring some synonyms here and there, I'd say that satisfies any reasonable standard of proof. Under the old regime, that would have suggested they copied from the same source.

For now, that's what most of the AI work I detect looks like, and I'm sorry to say that the numbers aren't inflated. A big chunk of the rest just don't meet the minimum parameters for engaging with the material, so they fail regardless of whether I can comfortably demonstrate AI use.

As I see it, my task is to help students to develop particular thinking and writing skills for themselves. Once they have those, then they can work on using AI shortcuts. But they have to actually learn the long-cut first.
I know it's a genus.

Sun_Worshiper

Quote from: Parasaurolophus on September 10, 2023, 11:11:31 AM
If I compare twenty papers on the same topic and they're virtually identical, barring some synonyms here and there, I'd say that satisfies any reasonable standard of proof. Under the old regime, that would have suggested they copied from the same source.

For now, that's what most of the AI work I detect looks like, and I'm sorry to say that the numbers aren't inflated. A big chunk of the rest just don't meet the minimum parameters for engaging with the material, so they fail regardless of whether I can comfortably demonstrate AI use.

As I see it, my task is to help students to develop particular thinking and writing skills for themselves. Once they have those, then they can work on using AI shortcuts. But they have to actually learn the long-cut first.

Well, to each their own on detecting bad behavior in the classroom, but I'm not comfortable relying on anything less than overwhelming proof of plagiarism, whether the content is copied from a Wikipedia page or from ChatGPT, and what you are describing does not meet that standard.

As for the bolded part, I agree in theory with what you are saying, but the reality is that they are using AI and you can't stop them or detect it reliably, so this business of actually learning the long-cut first is out the window in practice.


marshwiggle

Quote from: Diogenes on September 09, 2023, 01:35:11 PM
It does bring up a bigger issue. If they were that formulaic of a hoop to jump through, why were we using them to begin with?

And since in private industry "A.I." has been screening resumes and cover letters for years to filter applicants, why has anyone wasted their time and cognitive energy on writing those?

I notice that no-one has picked up on this point yet, which was what had occurred to me as well. The whole emphasis on admission essays seems to leave a lot of room for manipulation of outcomes by those who know how the process works, to the detriment of people who don't. (I seem to recall discussions before about people hiring "consultants" and so on to do the essays, which just favours the rich and well-connected.)

It takes so little to be above average.

Hegemony

Quote from: Sun_Worshiper on September 10, 2023, 10:54:53 AM
What are the standards of proof? AI detection software is not reliable, so instead you are just using your own intuition about what is generated by an LLM? Seems like a questionable basis for failing students.


Here are the standards for failing students:

First, the dreck produced by ChatGPT is so bad that it would fail even if the student had written it entirely themselves. Weak arguments, no details, no references to the course material, wrong facts and made-up references. So the assignment is an F no matter what.

Second, the student confesses that they wrote it with ChatGPT. So far every AI-written assignment I've suspected has been confirmed by the student confessing. After all, they have nothing to lose — the assignment is a fail anyway.

Another giveaway, which the students don't yet understand, is that ChatGPT spells all its words correctly and punctuates correctly. The number of my students who do this on their own is vanishingly small.

As for the sophisticated revision of an AI article about baseball or whatever — sure, you could do that, just as you could take a crummy essay of whatever type and carefully revise it to be a better essay. You could even write your own crummy essay and revise it. But the revisions sufficient to make it look as if it was not written by machine are going to be a lot of trouble. As I say, ChatGPT's first draft will look well-phrased only to someone who isn't wearied of the same old formulas by reading reams of the stuff. So you would need to know what those formulas are, so you can edit them out, or a canny editor will say "Would you make this sound a little less dreary and bot-written?"


Sun_Worshiper

Quote from: Hegemony on September 10, 2023, 01:34:26 PM
Quote from: Sun_Worshiper on September 10, 2023, 10:54:53 AM
What are the standards of proof? AI detection software is not reliable, so instead you are just using your own intuition about what is generated by an LLM? Seems like a questionable basis for failing students.


Here are the standards for failing students:

First, the dreck produced by ChatGPT is so bad that it would fail even if the student had written it entirely themselves. Weak arguments, no details, no references to the course material, wrong facts and made-up references. So the assignment is an F no matter what.

Second, the student confesses that they wrote it with ChatGPT. So far every AI-written assignment I've suspected has been confirmed by the student confessing. After all, they have nothing to lose — the assignment is a fail anyway.

Another giveaway, which the students don't yet understand, is that ChatGPT spells all its words correctly and punctuates correctly. The number of my students who do this on their own is vanishingly small.

As for the sophisticated revision of an AI article about baseball or whatever — sure, you could do that, just as you could take a crummy essay of whatever type and carefully revise it to be a better essay. You could even write your own crummy essay and revise it. But the revisions sufficient to make it look as if it was not written by machine are going to be a lot of trouble. As I say, ChatGPT's first draft will look well-phrased only to someone who isn't wearied of the same old formulas by reading reams of the stuff. So you would need to know what those formulas are, so you can edit them out, or a canny editor will say "Would you make this sound a little less dreary and bot-written?"


You are really underestimating LLMs, both in terms of the quality of their output and the value they offer as a writing tool. Odds are that your students are using them more than you think and so are your favorite writers, and you can't tell the difference.


Parasaurolophus

Quote from: Sun_Worshiper on September 10, 2023, 03:10:24 PM
You are really underestimating LLMs, both in terms of the quality of their output and the value they offer as a writing tool. Odds are that your students are using them more than you think and so are your favorite writers, and you can't tell the difference.



Although she may also be underestimating the technology, I think that Hegemony's point is that your own view of ChatGPT use by students is perhaps a little idealistic. My experience here--where plagiarism and contract cheating were rampant before ChatGPT!--is that students are definitely not putting in the kind of work required to disguise their ChatGPT use/improve their own work/do something that's not essentially just contract cheating minus the contract. And that's because they're by and large not interested in using it to fine-tune their work; they're interested in using it instead of doing the work themselves. Putting in the work would be inimical to the task. So what we end up with are classes of 35 in which 20+ students turn in virtually identical answers to a short answer question or essay prompt. Not only is that unlikely in itself, it also never happened before ChatGPT (because their cheating was distributed across several different platforms), so we can be sufficiently confident that that's how the student got to that answer. Honestly, my experience is that it's easier to tell who's cheating now than it was (because they're all using the same source); it's just harder to prove (except when they all turn in the same thing).

But in most cases, as Hegemony said, the cheating question is often moot since the work submitted is just not passable anyway. Not because it couldn't be, with more work, but because the students aren't interested in putting in that kind of work to begin with. Honestly, if they were, they could spend just as much time doing the thing themselves, and they'd pass.
I know it's a genus.

Sun_Worshiper

Quote from: Parasaurolophus on September 10, 2023, 05:18:02 PM
Quote from: Sun_Worshiper on September 10, 2023, 03:10:24 PM
You are really underestimating LLMs, both in terms of the quality of their output and the value they offer as a writing tool. Odds are that your students are using them more than you think and so are your favorite writers, and you can't tell the difference.



Although she may also be underestimating the technology, I think that Hegemony's point is that your own view of ChatGPT use by students is perhaps a little idealistic. My experience here--where plagiarism and contract cheating were rampant before ChatGPT!--is that students are definitely not putting in the kind of work required to disguise their ChatGPT use/improve their own work/do something that's not essentially just contract cheating minus the contract. And that's because they're by and large not interested in using it to fine-tune their work; they're interested in using it instead of doing the work themselves. Putting in the work would be inimical to the task. So what we end up with are classes of 35 in which 20+ students turn in virtually identical answers to a short answer question or essay prompt. Not only is that unlikely in itself, it also never happened before ChatGPT (because their cheating was distributed across several different platforms), so we can be sufficiently confident that that's how the student got to that answer. Honestly, my experience is that it's easier to tell who's cheating now than it was (because they're all using the same source); it's just harder to prove (except when they all turn in the same thing).

But in most cases, as Hegemony said, the cheating question is often moot since the work submitted is just not passable anyway. Not because it couldn't be, with more work, but because the students aren't interested in putting in that kind of work to begin with. Honestly, if they were, they could spend just as much time doing the thing themselves, and they'd pass.

The post that set this off was in response to Wahoo saying that their skills are going to be useless in an LLM world. It was not about students, but rather about people who do serious writing professionally. That is why I focused on somebody using LLMs to write an essay for submission to a magazine. My point is that ChatGPT is a very useful tool that can help make a writer more efficient and productive. If you are a professional writer, then ignore or dismiss these tools at your own peril.

With regard to students, I deal with them and their writing constantly, and I am all too familiar with the lazy approach to assignments that many of them display. That is exactly why I keep emphasizing that it is important to teach them the strengths and weaknesses of LLMs - so that they know what can go wrong if they rely on the analysis written by ChatGPT without refining it. Telling them "LLMs produce crap and you should never use them" is totally wrongheaded, both because it is not true and because students actually will be expected to use these tools in their white-collar jobs. And students know this just as well as I do, which is why they're mostly all using it and few are being caught or penalized.




Hegemony

Quote from: Sun_Worshiper on September 10, 2023, 03:10:24 PM
You are really underestimating LLMs, both in terms of the quality of their output and the value they offer as a writing tool. Odds are that your students are using them more than you think and so are your favorite writers, and you can't tell the difference.


Nope, not a chance. The subjects I assign are too specialized, relying on class-specific readings and materials, for ChatGPT to make any headway whatsoever. The one assignment in which the students try it is a different type, but it too is supposed to make use of topics and motifs we have discussed at length in class. Do these topics and motifs ever show up in the bot-written submissions? They do not. Fail. To get them in there, you'd have to feed the entire semester's reading assignments into the instructions, and even then it wouldn't know how to pick good examples and analyze them perceptively. It's not smart.

I also am a reader for a fiction magazine, as I probably mentioned, and I can guaran-damn-tee you that not one accepted piece has used ChatGPT. ChatGPT has no imagination. It is predictive, not creative. It is literally the "Family Feud" method of writing — what next word is most predictable in this sentence? That's the one to use! A couple of days spent comparing our bot-written submissions and human-written submissions would persuade anyone.
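
For the curious, that "most predictable next word" move is easy to caricature in a few lines of Python. The little word table below is a made-up stand-in for a real model, and real chatbots sample from a probability distribution rather than always taking the top word, but low-temperature output drifts the same way:

# Toy bigram "model": for each word, the probability of each next word.
next_word_probs = {
    "the": {"dog": 0.4, "story": 0.35, "sunset": 0.25},
    "dog": {"barked": 0.6, "smiled": 0.3, "levitated": 0.1},
}

def greedy_continue(word: str, steps: int) -> list[str]:
    """Always pick the single most probable next word -- the 'Family Feud' move."""
    out = [word]
    for _ in range(steps):
        candidates = next_word_probs.get(out[-1])
        if not candidates:
            break
        out.append(max(candidates, key=candidates.get))
    return out

print(" ".join(greedy_continue("the", 2)))  # prints "the dog barked", every time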

If you're writing boilerplate guff about widget production, I guess ChatGPT is sufficient, though I have better things to do than to teach my students how to modify it and check all its facts and make it acceptable. But if you want actual original thought, not just machine-generated wording, ChatGPT is useless. As for its output, as Truman Capote famously said about Jack Kerouac, "That's not writing, that's typing."

Kron3007

I have been using it quite a bit to help with tedious/redundant tasks and it is pretty awesome.  It is better at EDI sections than I am, and I have actually learned a bunch of things from its output.  I use it as Sun_Worshiper suggests though, with a lot of editing.

One thing I have learned is how much the quality of the output is affected by the quality of the input. People who claim they can tell ChatGPT-generated text may be missing a lot, especially at the undergrad level. Students are becoming skilled with it, and you may be surprised....

Regarding ChatGPT in class, this will vary a lot by field. I am in STEM, and ChatGPT is pretty good at summarizing this type of thing. For written assignments, I will allow them to use it and warn them that it is prone to making things up, so they need to be careful. This is a new tool, and learning to use it well makes sense. That being said, I have shifted grades more toward non-written assignments.


Kron3007

Quote from: Wahoo Redux on September 09, 2023, 05:41:36 PM
Quote from: Sun_Worshiper on September 09, 2023, 02:10:43 PM
At this point LLMs can't replace serious writing/writers. What they can do is help a writer supercharge their productivity.

In theory, it is plagiarism and/or cheating to let a machine perform even a part of one's homework.

But is it a bad idea?

I'm sure that's what people were saying about spell check and grammar correction.  Now, they are ubiquitous.  AI isn't going anywhere.

Caracal

Quote from: Diogenes on September 09, 2023, 01:35:11 PM
It does bring up a bigger issue. If they were that formulaic of a hoop to jump through, why were we using them to begin with?

And since in private industry "A.I." has been screening resumes and cover letters for years to filter applicants, why has anyone wasted their time and cognitive energy on writing those?

When I first started applying for jobs, I asked a friend who had just gotten a job and had been on the market for a year before if I could see her letter. I didn't copy it, but I did use it as a basic template, because I didn't really know where to start. And when I write a new job letter, I don't start with a blank document, I pull out an old letter and edit it to fit.

Perhaps an AI program could manage a passable cover letter with some editing, but I can't really understand why you would want it to. The danger is that it's going to produce something that is fine as a generic cover letter, but is going to look weird in the context of the field. You can avoid that problem by getting an appropriate model to start from.

In general, this is where the idea that AI is going to be particularly useful for first drafts strikes me as strange. It's much easier to work with your own first draft (in the case of very formulaic writing, perhaps created with an appropriate template) than it is to take some random other first draft that isn't actually trying to say the things you want it to. There are some particular cases where it might work well, but fewer than many people seem to think.

Sun_Worshiper

Quote from: Hegemony on September 11, 2023, 02:37:45 AM
Quote from: Sun_Worshiper on September 10, 2023, 03:10:24 PM
You are really underestimating LLMs, both in terms of the quality of their output and the value they offer as a writing tool. Odds are that your students are using them more than you think and so are your favorite writers, and you can't tell the difference.


Nope, not a chance. The subjects I assign are too specialized, relying on class-specific readings and materials, for ChatGPT to make any headway whatsoever. The one assignment in which the students try it is a different type, but it too is supposed to make use of topics and motifs we have discussed at length in class. Do these topics and motifs ever show up in the bot-written submissions? They do not. Fail. To get them in there, you'd have to feed the entire semester's reading assignments into the instructions, and even then it wouldn't know how to pick good examples and analyze them perceptively. It's not smart.

I also am a reader for a fiction magazine, as I probably mentioned, and I can guaran-damn-tee you that not one accepted piece has used ChatGPT. ChatGPT has no imagination. It is predictive, not creative. It is literally the "Family Feud" method of writing — what next word is most predictable in this sentence? That's the one to use! A couple of days spent comparing our bot-written submissions and human-written submissions would persuade anyone.

If you're writing boilerplate guff about widget production, I guess ChatGPT is sufficient, though I have better things to do than to teach my students how to modify it and check all its facts and make it acceptable. But if you want actual original thought, not just machine-generated wording, ChatGPT is useless. As for its output, as Truman Capote famously said about Jack Kerouac, "That's not writing, that's typing."

Quote from: Sun_Worshiper on September 10, 2023, 08:15:14 PM
ignore or dismiss these tools at your own peril.