Comment 8: Persisting in an Age of Generative Text

In most conversations about AI and research, both within and outside academia, the focus is on the output. We see this especially with hallucinations and failures to cite. While AI's results matter, of equal importance are the basic mechanisms of research, which get far less attention. Honestly, another conversation about AI is exhausting but, to be fair, it’s pretty much impossible to avoid. So today, we’ll discuss research and technology.

The process of research is forever evolving in reaction to technological innovation. Over the last few decades, it has adapted to (more or less in chronological order) the transition from card catalogs to computers, the internet, search engines, Wikipedia, and AI. In the 2010s, during the heyday of search engines (namely Google) and Wikipedia, I was working through graduate programs in History and Library and Information Sciences. During that time, I worked as a graduate assistant in the classroom and later as a student worker in the university library’s reference department. Suffice it to say, my days were filled with discussing resources with students. While search engines were embraced, the rote directive for Wikipedia was “Don’t look at it and don’t use it.”

Though I’d often advise students that looking at Wikipedia couldn’t hurt, since it is useful for getting up to speed on a topic and mining citations, I agreed with my colleagues that Wikipedia was not something to use (read: cite) in their research. The reasoning was sound. Wikipedia articles are plastic and lack the authority granted by permanence. Whatever editorial controls Wikipedia has in place, they are simply insufficient to make it usable in ninety-nine percent of research. My opinion on Wikipedia has held into my current position, and I still give similar advice: look but don’t use.

I bring up Wikipedia because AI has the exact same issues, both pro and con. AI can be very useful on a first pass or for getting up to speed on a topic, but the results, like Wikipedia articles, are plastic and lack authority. AI does provide some additional benefits: the models tend to be very good at summarizing and generating ‘mind maps’ of your sources, especially when using NotebookLM. It also has additional problems, most notably hallucinations. It reminds me of an old axiom in engineering: the more complex the machine, the more points of failure.

Research, from hard science to the humanities, is built on two fundamental mechanisms: replication and reproduction (R&R). Replication is getting consistent results from the same methods applied to new data. Reproduction is obtaining the same result from the same methods and the same data. Together they form a necessary system of checks and balances built into research, so necessary that many worry modern research has strayed so far from these mechanisms that current work in multiple fields is fundamentally broken.

AI intensifies this fracture and highlights why librarians frequently direct users to databases. Databases lack the permanence of a physical book, though they are still better than other digital options. A search engine passes the replication test but will always fail at reproduction. Why? Because a huge number of variables affect search results. Assume you use the exact same string for two different searches: the results are shaped by your current location, your previous searches, SEO (search engine optimization), and countless other factors. Even if you can control those variables for yourself, you can’t control them for others, because their variables are fundamentally different from yours. A person standing directly next to you, using the same search string in the same engine, may get radically different results!

AI used for research can fail both R&R because of how the technology works, and the problem exists both outside the researcher and within. The external issues are like those of search engines, but the internal issues need to be examined. For every prompt, even when you are following up, generative AI tools create something new. As a researcher, or as somebody testing research, how can you ever return to something that is perpetually ephemeral? This is why databases are preferred to search engines, Wikipedia, and AI. Database search results pass both R&R. The same search string used in the same database will get you nearly identical results every time. The same search string in other databases should get you different but broadly similar results. And the resources themselves, predominantly from academic journals, are reproducible and replicable.

What does this mean for law students and those who assist them? A researcher’s methodology needs to be repeatable and replicable because that is the best way to keep track of what you’ve tried (i.e., search strings) and where you’ve been, so that you can plan where to look next. When your tools (search engines, Wikipedia, AI) don’t support R&R natively, your twentieth search may not build on its nineteen predecessors, and your research has a hard time progressing from day one. Further, the resources you use need to be grounded in permanence so that R&R applies, and you are more likely to find that in a database than in search engines, Wikipedia, or AI. In the age of the computer, finding resources is easy, finding good resources takes time and ability, and your writing can only be as good as the resources you use.

Comment 8 to Model Rule 1.1 of the ABA Model Rules of Professional Conduct provides, “To maintain the requisite knowledge and skill, a lawyer should keep abreast of changes in the law and its practice, including the benefits and risks associated with relevant technology.” To that end, we write this regular series to help build the competence and skills necessary to responsibly choose and use the best technologies for your educational and professional lives. If you have any questions, concerns, or topics you would like to see discussed, please reach out to e.koltonski@csuohio.edu with “Comment 8” in the subject.