ChatGPT could be the reviewer of your next scientific paper. Evidence on the limits of AI-assisted academic reviews

David Carabantes; José L. González-Geraldo; Gonzalo Jover

doi:10.3145/EPI.2023.SEP.16

David Carabantes ¹
José L. González-Geraldo ²
Gonzalo Jover ¹

1 Universidad Complutense de Madrid, Madrid, Spain. ROR https://ror.org/02p0gd045

2 Universidad de Castilla-La Mancha, Ciudad Real, Spain. ROR https://ror.org/05r78ng12

Journal: El profesional de la información

ISSN: 1386-6710, 1699-2407

Year of publication: 2023

Issue title: Disinformation and online media

Volume: 32

Issue: 5

Type: Article

DOI: 10.3145/EPI.2023.SEP.16 (publisher open access)

Abstract

The irruption of artificial intelligence (AI) into every area of our lives is a reality to which the university, as an institution of higher education, must respond prudently but also decisively. This article discusses the potential of AI-based resources as reviewers of scientific articles in a hypothetical peer review of already-published articles. Using different models (GPT-3.5 and GPT-4) and platforms (ChatPDF and Bing), we obtained three complete reviews, both qualitative and quantitative, for each of the five articles examined, which allowed us to outline and contrast their results against the human reviews these same articles received at the time. The evidence found shows the extent to which we can and should rely on generative language models to support our decisions as qualified experts in our field. The results also corroborate the hallucinations characteristic of these models while pointing to one of their major current shortcomings: the limit of the context window. On the other hand, the study also highlights the inherent strengths of a model that is still in a phase of rapid expansion, providing a detailed view of the potential and limitations these models offer as possible assistants in the review of scientific articles, a key process in the communication and dissemination of academic research.


Data source: Dialnet
