News Score: Score the News, Sort the News, Rewrite the Headlines

Large Language Models Pass the Turing Test

Abstract: We evaluated 4 systems (ELIZA, GPT-4o, LLaMa-3.1-405B, and GPT-4.5) in two randomised, controlled, and pre-registered Turing tests on independent populations. Participants had 5 minute conversations simultaneously with another human participant and one of these systems before judging which conversational partner they thought was human. When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time: significantly more often tha...

Read more at arxiv.org
