Legal Natural Language Processing Lab

Master Practical Course

See presentation slides.

Course Outline

The analysis of legal data/text and the design and development of systems that provide valuable functionality to legal practitioners pose various challenges. These include noisy raw data that must be carefully preprocessed, ill-defined tasks for which only small datasets exist and for which learning supervision and evaluation are difficult to obtain, and domain-specific information of various kinds that must be taken into account at many stages of the process.

This lab course gives students an opportunity to gain practical experience with legal data while working in small teams. The instructors offer projects, each centered on a research question or hypothesis. A project typically involves one or more datasets from a legal domain, one or more formal tasks, and one or more methods to be tried. Over the course of the semester, each team develops and evaluates an experimental system or prototype, thereby producing new insight about its hypothesis.

After an initial introduction to the legal informatics topic, students will be matched into teams and assigned projects. Teams will meet with their project mentors regularly to present work updates, discuss progress, and define action items. At the end of each of the three milestone intervals, teams will present their progress to the whole cohort and discuss all projects with their peers.

Screenshot of commercial legal Q&A software sold by LexisNexis.

Learning Outcomes

After completing this module, students will have gained practice in planning, implementing, and evaluating a legal data science/informatics project. In particular, they will have gained experience in:

  1. formulating an experimental hypothesis
  2. identifying characteristics of data from the legal domain and explaining how they influence technical aspects of project work
  3. conducting a targeted survey of prior work in the legal informatics literature for a given project context
  4. designing an experimental system towards producing insight from data and/or developing new functionality of interest
  5. conducting model evaluation and behavior analysis

Left: Commercial contract analysis software offered by Kira Systems. Right: Legal search engine "Parallel Search" by casetext.

Requirements

Students must have experience in machine learning and, ideally, natural language processing. They should have taken the following courses or be sufficiently proficient in the topics and methods they cover:

    IN2332: Statistical Modeling and Machine Learning

    IN2062: Grundlagen der künstlichen Intelligenz / Foundations of Artificial Intelligence

    IN2361: Natural Language Processing

    IN2395: Legal Data Science & Informatics

Students who have not taken IN2395 are expected to familiarize themselves with background materials relevant to their respective project.

References

2022

  1. Extractive summarization of legal decisions using multi-task learning and maximal marginal relevance
    Abhishek Agarwal, Shanshan Xu, and Matthias Grabmair
    arXiv preprint arXiv:2210.12437, 2022
  2. Attack on Unfair ToS Clause Detection: A Case Study using Universal Adversarial Triggers
    Shanshan Xu, Irina Broda, Rashid Haddad, and 2 more authors
    arXiv preprint arXiv:2211.15556, 2022