This compendium was produced for the 2023 Texas Advanced Business Law Conference. It is offered as a compendium, rather than a traditional paper, because the format lends itself to linked sources and, unlike a traditional paper, it can be updated in the future.
- Why Lawyers Should Care About AI
- What is Artificial Intelligence?
- Types of AI
- AI for the Practice of Law
- Legal Issues Involving AI
- A Few Things That Litigators Should Know About AI
- A Few Things That Transactional Attorneys Should Know About AI
- Regulation and Governance of AI
- What our profession is about: providing knowledge for, and acting on behalf of, clients
- How AI has elements of agency (but is not regulated by a Bar)
- Roberto Legaspi, Zhengqi He, and Taro Toyoizumi, "Synthetic agency: sense of agency in artificial intelligence", Current Opinion in Behavioral Sciences, Vol. 29, October 2019, Pages 84-90. (From the Abstract: "The concept of sense of agency (SoA) has garnered considerable attention in human science at least in the past two decades. Coincidentally, about two decades ago, artificial intelligence (AI) research witnessed an explosion of proposed theories on agency mostly based on dynamical approaches. However, despite this early burst of enthusiasm, SoA models in AI remain limited. We review the state of AI research on SoA, seen predominantly in developmental robotics, vis-à-vis the psychology and neurocognitive treatments, and examine how AI can further achieve stronger SoA models. We posit that AI is now poised to better inform SoA given its advances on self-attribution of action–outcome effects, action selection, and Bayesian inferencing, and argue that synthetic agency has never been more compelling.")
Defining artificial intelligence in a single phrase is extremely difficult and, for some commentators, unwise. See, e.g., Theodore F. Claypoole, "AI Classifications for Law and Regulation", Business Law Today (September 15, 2023). In fact, there is no single adequate definition of artificial intelligence. Fortunately, there is a "classic" formulation of AI organized around the categories of thinking and acting, as illustrated in the following set of quotes, courtesy of AI pioneers Stuart Russell and Peter Norvig in their book "Artificial Intelligence: A Modern Approach" (4th ed., 2023):
"The exciting new effort to make computers think ... machines with minds, in the full and literal sense." (Gaugeland, 1985)
"{The automation of} Activities that we associate with human thinking, activities such as decision-making, problem solving, learning..." (Bellman, 1978)
"The study of mental faculties through the use of computational models." (Charniak and McDermott, 1985)
"The study of the computations that make it possible to perceive, reason, and act." (Winston, 1992)
"The art of creating machines that perform functions that require intelligence when performed by people." (Kurzweil, 1990)
"The study of how to make computers do things at which, at the moment, people are better." (Rich and Knight, 1991)
"Computational Intelligence is the study of the design of intelligent agents." (Poole et al., 1998)
"AI ... is concerned with intelligent behavior in artifacts." (Nilsson, 1998)
You can find other attempts at defining AI here, here, and here.
A few things to note: all of those quotes are from the twentieth century; the intellectual roots of AI trace back at least to Aristotle's work on logic and reasoning; and work on neural networks started in the 1950s. What you see now is the culmination of decades of work by thousands of people.
Based on the definitional outline above (Thinking/Acting & Humanly/Rationally), you could derive a chart such as the following:
However, it is often more practical to group the types of AI into categories that resemble the products and services we are accustomed to or encounter regularly. Some types (or combinations of types) of AI are famous. Many types of AI labor in the background -- unseen and unnoticed -- yet some of them have significant legal implications.
The most important thing to remember is that there are many types of AI. One set of types has to do with the four categories above (thinking/acting humanly/rationally). A more practical set of categories can be distilled from the products that we encounter in our daily lives, as outlined in the chart below:
We will address each type of AI, as well as link to various other sources, below.
Natural Language Processing (NLP) is the aspect of AI that lawyers deal with most often -- both directly and indirectly.
What is Generative AI?
Generative AI is a subset of Natural Language Processing. Specifically, Generative AI employs the Answering Questions and Text Generation aspects of NLP. The Answering Questions aspect is the interface used to ask the question; in recent parlance, the question you ask is called a Prompt. The answer that comes back is produced by the Text Generation component. A classic example of the Prompt/Generation duo is ChatGPT.
Essentially, Generative AI makes use of a model (such as a "Large Language Model" like GPT-4) that enables the AI to generate new content with little human interaction. You should think of it as an automated paraphraser, albeit with some serious caveats. AI professionals have described Generative AI as "a stochastic parrot" or "a reality-starved paraphraser." The AI does not know what it is saying (or whether what it says is true), but it says it very nicely. Sometimes the Generative AI departs significantly from reality, and this is known as "hallucinating." However, even with those limitations, Generative AI has met with great commercial success. Since most Generative AI is not tied to reality, marketing people love it.
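To make the Prompt/Generation mechanics concrete, here is a minimal sketch assuming the open-source Hugging Face transformers library is installed. The small "gpt2" model named below is only an illustrative stand-in for the far larger models behind commercial services like ChatGPT, and the prompt text is made up.

```python
# A minimal sketch of the Prompt -> Text Generation loop, assuming the
# open-source "transformers" library is installed. The small "gpt2" model is
# only a stand-in for the much larger models behind commercial services.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "A force majeure clause in a commercial lease typically"
result = generator(prompt, max_new_tokens=60, do_sample=True)

# The model simply continues the prompt; nothing checks whether the
# continuation is true, which is why verification remains the lawyer's job.
print(result[0]["generated_text"])
```

Commercial chatbots wrap this same prompt-in, text-out loop in a conversational interface, with additional layers of training and filtering on top.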
Beware, however, for lawyers have been sanctioned for using Generative AI unwisely. See, e.g., Mata v. Avianca, Inc. ("Peter LoDuca, Steven A. Schwartz and the law firm of Levidow, Levidow & Oberman P.C. (the "Levidow Firm") (collectively, "Respondents") abandoned their responsibilities when they submitted non-existent judicial opinions with fake quotes and citations created by the artificial intelligence tool ChatGPT, then continued to stand by the fake opinions after judicial orders called their existence into question.")
Unwise reliance on Generative AI is enough of a problem that some courts now have standing orders related to AI-generated material. See, e.g., an order by Magistrate Judge Gabriel A. Fuentes (N.D. Ill.) that was adopted on May 31, 2023.
Having said all of this, you should know that you can "fine tune" your own AI on something real. In other words, you can direct the AI to look at a corpus of documents (e.g., opinions from the Texas Supreme Court, or ESI from a witness) and let it answer questions -- and identify the document(s) from which it drew to answer your question. Such fine tuning can reduce the amount of hallucination produced by the model and, in some cases, cause the model to tell you the source from which it derived the answer -- which makes it easier to verify the results.
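For the curious, here is a minimal sketch of the "ground the model in your own documents" idea described above. It uses simple keyword-style retrieval (scikit-learn's TF-IDF) rather than true parameter-level fine tuning, and the ask_llm() function, the file names, and the snippets are all hypothetical placeholders for whatever model and corpus you actually use.

```python
# A minimal sketch of grounding answers in your own documents: retrieve the
# most relevant document, then ask the model to answer only from it and to
# cite it. Assumes scikit-learn is installed; ask_llm(), the file names, and
# the snippets below are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = {
    "opinion_a.txt": "The court held the limitation-of-liability clause enforceable.",
    "opinion_b.txt": "The court found the non-compete covenant unreasonable in scope.",
}

names = list(documents)
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents.values())


def ask_llm(prompt: str) -> str:
    """Placeholder for a call to whatever language model you actually use."""
    raise NotImplementedError


def answer_with_source(question: str) -> tuple[str, str]:
    # Find the document most similar to the question.
    q_vec = vectorizer.transform([question])
    best = cosine_similarity(q_vec, doc_matrix).argmax()
    source = names[best]
    # Ask the model to answer ONLY from the retrieved excerpt and name it,
    # which makes the answer much easier to verify.
    prompt = (
        f"Answer using only this excerpt from {source}:\n"
        f"{documents[source]}\n\nQuestion: {question}"
    )
    return ask_llm(prompt), source
```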
Here is a set of articles that discuss Generative AI in depth.
- Gartner Experts Answer the Top Generative AI Questions for Your Enterprise (This article discusses what Generative AI is, the history, the hype, the reality, and the use cases.)
- Generative AI: Perspectives from Stanford HAI (This March 2023 paper from the Stanford University Human-Centered Artificial Intelligence initiative discusses the potential of Generative AI.)
- The Generative AI Revolution: Key Legal Considerations for the Consumer Products Industry (This article discusses the important legal aspects of generative AI systems such as OpenAI's ChatGPT, Microsoft's Copilot, and similar commercial offerings.)
- ChatGPT Goes to Law School (The authors of this paper used ChatGPT to generate answers to four real exams at the University of Minnesota Law School. The resulting answers were graded blindly, and ChatGPT received an average grade of C+ -- low, but passing. The authors also provided advice on how to use ChatGPT in legal writing.)
- Generative AI in the Legal Profession ("When Open AI released ChatGPT in late 2022, it galvanized the collective imagination—and collective anxiety. Even lawyers wondered how such new AI technologies would change their profession.")
- The Implications of ChatGPT for Legal Services and Society ("To demonstrate ChatGPT's remarkable sophistication and potential implications, for both legal services and society more generally, most of this paper was generated in about an hour through prompts within the chatbot. The disruptions from AI’s rapid development are no longer in the distant future. They have arrived, and this essay offers a small taste of what lies ahead.")
- A Complete Guide to Fine Tuning Large Language Models ("Fine-tuning in large language models (LLMs) involves re-training pre-trained models on specific datasets, allowing the model to adapt to the specific context of your business needs. This process can help you create highly accurate language models, tailored to your specific business use cases.")
- LLM Finetuning (discusses how to fine tune your large language model (LLM) with your own data.)
- Build LLM-powered chatbot in 5 minutes using HugChat and Streamlit ("We will dive into a step-by-step process of developing an LLM-powered chatbot using HugChat, a powerful Python library that simplifies the integration of LLMs into chatbot applications. Furthermore, we will leverage Streamlit, a user-friendly framework for creating interactive web applications, to provide a seamless user interface and deployment platform for our chatbot.")
- How to use HuggingChat (free ChatGPT) in Python (A step by step guide to using the HuggingChat Python API.)
- How To Build Your Own Custom ChatGPT With Custom Knowledge Base (How to feed your own ChatGPT bot with custom data sources, such as certain court opinions, custodian's ESI, etc.)
- Step-By-Step Guide to Building a Chatbot Knowledge Base ("Without data, AI chatbots are just a pretty face with nothing upstairs. An important first step in using automation and natural language processing to help your customers is to connect your chatbot to your knowledge base.")
- How To Create A Knowledge Base For Your Chatbot (If you are making a chatbot to interact with your clients (for such tasks as basic information intake, answering common questions, etc.), then this article is for you.)
- How to Train an AI Chatbot With Custom Knowledge Base Using ChatGPT API (This is a step-by-step guide for general users to create a specialized version of ChatGPT that is more specific to your practice. Note, this methodology works specifically with ChatGPT, so you'll need an account with OpenAI in order to implement the technique disclosed in the article.)
- Talk to Claude. Claude 2 is a large language model that can accommodate much larger documents for discussion purposes. For example, ChatGPT limits you to about 3,000 words (about 4,000 tokens) in the question. Claude 2, on the other hand, allows up to 100,000 tokens (about 75,000 words), so you can incorporate large contracts, deposition transcripts, etc. and ask questions about those documents specifically. Claude 2 can also ingest ordinary court opinions, so it is particularly well suited to legal research. (A short sketch of how to count the tokens in a document appears after this list of references.) See, e.g., Jeremy Caplan, "Meet Claude: A helpful new AI assistant", Wonder Tools Blog (July 20, 2023).
- Generative Legal Minds ("How ChatGPT and other technologies might change legal research and writing")
- Liepiņa, R., Sartor, G. & Wyner, A. (2019). "Arguing about causes in law: a semi-formal framework for causal arguments", Artificial Intelligence and Law (2019). ("Disputes over causes play a central role in legal argumentation and liability attribution. Legal approaches to causation often struggle to capture cause-in-fact in complex situations, e.g. overdetermination, preemption, omission. In this paper, we first assess three current theories of causation (but-for, NESS, ‘actual causation’) to illustrate their strengths and weaknesses in capturing cause-in-fact. Secondly, we introduce a semi-formal framework for modelling causal arguments through strict and defeasible rules. Thirdly, the framework is applied to the Althen vaccine injury case. And lastly, we discuss the need for new criteria based on a common causal argumentation framework and propose ideas on how to integrate the current theories of causation to assess the strength of causal arguments, while also acknowledging the tension between evidence-based and policy-based causal analysis in law.")
- Saskia van de Ven, Rinke Hoekstra, Joost Breuker, Lars Wortel, and Abdallah El-Ali, "Judging Amy: Automated Legal Assessment using OWL 2" (This paper discusses using AI (in the form of an Ontology of legal concepts) as a form of artificial judge. From the Abstract: "One of the most salient tasks in law is legal assessment, and concerns the problem of determining whether some case is allowed or disallowed given an appropriate body of legal norms. In this paper we describe a system and Protégé 4 plugin, called OWL Judge, that uses standard OWL 2 DL reasoning for legal assessment. Norms are represented in terms of the LKIF Core ontology, as generic situation descriptions in which something (state, action) is deemed obliged, prohibited or permitted. We demonstrate the design patterns for defining the norms and actual cases. Furthermore we show how a DL classifier can be used to assess individual cases and automatically generate a lex specialis exception structure using OWL Judge. We illustrate our approach with a worked-out example of university library regulations.")
- Adam Wyner, "A Legal Case OWL Ontology with an Instantiation of Popov v. Hayashi", The Knowledge Engineering Review, (January, 2010). ("The paper provides an OWL ontology for legal cases with an instantiation of the legal case Popov v. Hayashi. The ontology makes explicit the conceptual knowledge of the legal case domain, supports reasoning about the domain, and can be used to annotate the text of cases, which in turn can be used to populate the ontology. A populated ontology is a case base which can be used for information retrieval, information extraction, and case based reasoning. The ontology contains not only elements of indexing the case (e.g. the parties, jurisdiction, and date), but as well elements used to reason to a decision such as argument schemes and the components input to the schemes. We use the Prote ́ge ́ ontology editor and knowledge acquisition system, current guidelines for ontology development, and tools for visual and linguistic presentation of the ontology.")
- Shulayeva, O., Siddharthan, A. & Wyner, A., Recognizing cited facts and principles in legal judgements, Artificial Intelligence and Law, 25(1), 107-126 (2017). ("In common law jurisdictions, legal professionals cite facts and legal principles from precedent cases to support their arguments before the court for their intended outcome in a current case. This practice stems from the doctrine of stare decisis, where cases that have similar facts should receive similar decisions with respect to the principles. It is essential for legal professionals to identify such facts and principles in precedent cases, though this is a highly time intensive task. In this paper, we present studies that demonstrate that human annotators can achieve reasonable agreement on which sentences in legal judgements contain cited facts and principles (respectively, κ = 0.65 and κ = 0.95 for inter- and intra-annotator agreement). We further demonstrate that it is feasible to automatically annotate sentences containing such legal facts and principles in a supervised machine learning framework based on linguistic features, reporting per category precision and recall figures of between 0.79 and 0.89 for classifying sentences in legal judgements as cited facts, principles or neither using a Bayesian classifier, with an overall κ of 0.72 with the human-annotated gold standard.")
- Wyner, A., Bench-Capon, T., Dunne, P. & Cerutti, F, "Senses of ‘argument’ in instantiated argumentation frameworks" Argument & Computation, 6(1), 50-72 (2015). (From the Abstract: "Abstract Argumentation Frameworks (AFs) provide a fruitful basis for exploring issues of defeasible reasoning. Their power largely derives from the abstract nature of the arguments within the framework, where arguments are atomic nodes in an undifferentiated relation of attack. This abstraction conceals different senses of argument, namely a single-step reason to a claim, a series of reasoning steps to a single claim, and reasoning steps for and against a claim. Concrete instantiations encounter difficulties and complexities as a result of conflating these senses. To distinguish them, we provide an approach to instantiating AFs in which the nodes are restricted to literals and rules, encoding the underlying theory directly. Arguments in these senses emerge from this framework as distinctive structures of nodes and paths. As a consequence of the approach, we reduce the effort of computing argumentation extensions, which is in contrast to other approaches. Our framework retains the theoretical and computational benefits of an abstract AF, distinguishes senses of argument, and efficiently computes extensions. Given the mixed intended audience of the paper, the style of presentation is semi-formal.")
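As noted in the Talk to Claude entry above, context-window limits are measured in tokens rather than words. Here is a minimal sketch of how you might check whether a document will fit, assuming OpenAI's open-source tiktoken tokenizer; other vendors' models (including Claude 2) tokenize differently, so treat the count as an approximation, and the file name below is only an illustration.

```python
# A minimal sketch of checking whether a document fits within a model's
# context window. Assumes the open-source "tiktoken" tokenizer; other models
# (e.g., Claude 2) tokenize differently, so the count is approximate for them.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by recent OpenAI models


def fits_in_context(text: str, max_tokens: int) -> bool:
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens:,} tokens")
    return n_tokens <= max_tokens


# "deposition_transcript.txt" is an illustrative file name.
with open("deposition_transcript.txt") as f:
    transcript = f.read()

# Roughly 4,000 tokens for ChatGPT-era models vs. 100,000 for Claude 2.
print("Fits a ChatGPT-sized window:", fits_in_context(transcript, 4_000))
print("Fits a Claude 2-sized window:", fits_in_context(transcript, 100_000))
```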
There are many legal issues related to AI. The obvious ones include agency, but one of the major underlying issues with generative AI is the source of the text used to train the large language models (LLMs). Here are several articles on the topics.
Agency:
- (Myth) AI has agency. ("This web page look[s] at some of the ways that we ascribe agency to AI and thereby mask the human agency behind these systems. We've got an interactive tool for you to play with that rephrases dodgy headlines, and some tips on how to avoid anthropomorphising AI systems. Finally, we've got a piece on AI and legal personality and a collection of resources for further reading.")
- Lior, Anat (2020) "AI Entities as AI Agents: Artificial Intelligence Liability and the AI Respondeat Superior Analogy," Mitchell Hamline Law Review: Vol. 46 : Iss. 5 , Article 2. Available at: https://open.mitchellhamline.edu/mhlr/vol46/iss5/2. (From the Abstract: "Artificial Intelligence (AI) based entities are already causing damages and fatalities in today’s commercial world. As a result, the dispute about tort liability of AI-based machines, algorithms, agents, and robots is exponentially advancing in the scholarly world and outside of it. When it comes to AI accidents, different scholars and key figures in the AI industry advocate for different liability regimes. This ever-growing disagreement is condemning this new emergent technology, soon to be found in almost every home and street in the US and around the world, into a realm of regulatory uncertainty. This obstructs our ability to fully enjoy the many benefits AI has to offer us as consumers and as a society. This Article advocates for the adoption and application of a strict liability regime on current and future AI accidents. It does so by delving into and exploring the realm of legal analogies in the AI context and promoting the agency analogy, and subsequently, the respondeat superior doctrine. This Article explains and justifies why the agency analogy is the best-suited one in contrast to other analogies which have been suggested in the context of AI liability (e.g., products, animals, electronic persons and even slaves). As a result, the intuitive application of the respondeat superior doctrine provides the AI industry with a much-needed underlying liability regime which will enable it to continue to evolve in the years to come, and its victims to receive remedy once accidents occur.")
Copyright:
- AI and Copyright (a blog about artificial intelligence and copyright law). See, e.g., Peter Schoppert, "Has your book been used to train the AI?", AI and Copyright (March 5, 2023). (Discusses the Books3 dataset that is used to train one or more LLMs, but that (allegedly) contains copyrighted ebooks.)
- Grimm, et al., "Artificial Intelligence as Evidence" Northwestern Journal of Technology and Intellectual Property Volume 19 | Issue 1 | Article 2 (December, 2021). (From the Abstract: "This article explores issues that govern the admissibility of Artificial Intelligence (“AI”) applications in civil and criminal cases, from the perspective of a federal trial judge and two computer scientists, one of whom also is an experienced attorney. It provides a detailed yet intelligible discussion of what AI is and how it works, a history of its development, and a description of the wide variety of functions that it is designed to accomplish, stressing that AI applications are ubiquitous, both in the private and public sectors. Applications today include: health care, education, employment-related decision-making, finance, law enforcement, and the legal profession. The article underscores the importance of determining the validity of an AI application (i.e., how accurately the AI measures, classifies, or predicts what it is designed to), as well as its reliability (i.e., the consistency with which the AI produces accurate results when applied to the same or substantially similar circumstances), in deciding whether it should be admitted into evidence in civil and criminal cases. The article further discusses factors that can affect the validity and reliability of AI evidence, including bias of various types, “function creep,” lack of transparency and explainability, and the sufficiency of the objective testing of AI applications before they are released for public use. The article next provides an in-depth discussion of the evidentiary principles that govern whether AI evidence should be admitted in court cases, a topic which, at present, is not the subject of comprehensive analysis in decisional law. The focus of this discussion is on providing a step-by-step analysis of the most important issues, and the factors that affect decisions on whether to admit AI evidence. Finally, the article concludes with a discussion of practical suggestions intended to assist lawyers and judges as they are called upon to introduce, object to, or decide on whether to admit AI evidence.")
- Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi, "SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions" Arxiv.org (May, 2023). (AI that instructs itself! From the Abstract: "Large “instruction-tuned” language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in quantity, diversity, and creativity, therefore hindering the generality of the tuned model. We introduce SELF-INSTRUCT, a framework for improving the instruction-following capabilities of pre-trained language models by bootstrapping off their own generations. Our pipeline generates instructions, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model. Applying our method to the vanilla GPT3, we demonstrate a 33% absolute improvement over the original model on SUPER-NATURALINSTRUCTIONS, on par with the performance of InstructGPT001 which was trained with private user data and human annotations. For further evaluation, we curate a set of expert-written instructions for novel tasks, and show through human evaluation that tuning GPT3 with SELF-INSTRUCT outperforms using existing public instruction datasets by a large margin, leaving only a 5% absolute gap behind InstructGPT001. SELF-INSTRUCT provides an almost annotation-free method for aligning pre-trained language models with instructions, and we release our large synthetic dataset to facilitate future studies on instruction tuning." Incidentally, the code for SELF-INSTRUCT (and associated data) is available here.)
- Mark Chinen, "The need for the international governance of AI" The International Governance of Artificial Intelligence (pp. 8-33, (Chapter 1)). ("...[A]rtificial intelligence stands to impact every domain of human life, and several of these impacts will be international in scope. This raises the question whether it is desirable or possible to govern AI applications at the international level. This chapter lays the groundwork for that inquiry. It begins with basic concepts that will be used throughout the book. It then surveys possible international impacts of AI in various domains and concludes by placing debates about the governance of artificial intelligence within larger problems with the governance of technology in general.")
- El-Mahdi El-Mhamdi, Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Lê-Nguyên Hoang, Rafaël Pinot, Sébastien Rouault, and John Stephan, "On the Impossible Safety of Large AI Models" Arxiv.org (May, 2023). ("Large AI Models (LAIMs), of which large language models are the most prominent recent example, showcase some impressive performance. However they have been empirically found to pose serious security issues. This paper systematizes our knowledge about the fundamental impossibility of building arbitrarily accurate and secure machine learning models. More precisely, we identify key challenging features of many of today’s machine learning settings. Namely, high accuracy seems to require memorizing large training datasets, which are often user-generated and highly heterogeneous, with both sensitive information and fake users. We then survey statistical lower bounds that, we argue, constitute a compelling case against the possibility of designing high-accuracy LAIMs with strong security guarantees.")
- Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale, "Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback" Arxiv.org (March, 2023). ("Large language models (LLMs) are used to generate content for an increasingly wide range of tasks, and are set to reach a growing audience in coming years due to integration in product interfaces like ChatGPT or search engines like Bing. This intensifies the need to ensure that models are aligned with human preferences and do not produce unsafe, inaccurate or toxic outputs. While alignment techniques like reinforcement learning with human feedback (RLHF) and red-teaming can mitigate some safety concerns and improve model capabilities, it is unlikely that an aggregate fine-tuning process can adequately represent the full range of users’ preferences and values. Different people may legitimately disagree on their preferences for language and conversational norms, as well as on values or ideologies which guide their communication. Personalising LLMs through micro-level preference learning processes may result in models that are better aligned with each user. However, there are several normative challenges in defining the bounds of a societally-acceptable and safe degree of personalisation. In this paper, we ask how, and in what ways, LLMs should be personalised. First, we review literature on current paradigms for aligning LLMs with human feedback, and identify issues including (i) a lack of clarity regarding what alignment means; (ii) a tendency of technology providers to prescribe definitions of inherently subjective preferences and values; and (iii) a “tyranny of the crowdworker”, exacerbated by a lack of documentation in who we are really aligning to. Second, we present a taxonomy of benefits and risks associated with personalised LLMs, for individuals and society at large. Finally, we propose a three-tiered policy framework that allows users to experience the benefits of personalised alignment, while restraining unsafe and undesirable LLM-behaviours within (supra-)national and organisational bounds.")
- Gary Marcus, "Two models of AI oversight - and how things could go deeply wrong, The Road to AI We Can Trust Blog (June 8, 2023). (Discusses two disparate ways that governments are addressing the AI governance issue. For him, the "positive future" is one where a global AI agency is formed, and AI is thoroughly regulated. The "bleak future" is one where there is no agreed-upon reglation and, in the unregulated future, small numbers of "quickly become far more powerful that states..." that ultimately leads to "[a]narchy.")
- Ethics and Governance of AI, Berkman Klein Center for Internet & Society at Harvard University.