[TMP-058] Graybox methods for augmenting human-driven narrative analyses – beta release of the software package
Improvement of the Segram package for automated narrative analysis, aiming for user-friendly querying, analysis, and broad adoption
This project extends a previous microproject by further developing the Segram package for Python. The package automates narrative analysis of text, focusing on extracting basic narrative elements such as agents (active and passive), actions, events, and their relationships (e.g., subjects and objects of actions). It also includes descriptions of these elements. The package uses a graybox model: an opaque statistical language model provides linguistic annotations, which are processed by transparent deterministic algorithms to discover narrative elements. This approach ensures that the output is interpretable and can be validated by users, even those without linguistic or computer science expertise.
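The graybox idea described above can be illustrated with a minimal sketch. It does not use Segram's actual API. The `Token` class, the hand-written annotations, and the `extract_actions` function are all hypothetical, and the dependency labels (`nsubj`, `dobj`) are assumed to follow a common dependency-parsing convention. The annotations stand in for the output of the opaque language model, while the extraction step is a small set of transparent, deterministic rules that a user can read and check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One annotated word (hypothetical stand-in for language model output)."""
    i: int      # position in the sentence
    text: str   # surface form
    pos: str    # part-of-speech tag
    dep: str    # dependency label relative to the head
    head: int   # position of the syntactic head


# Opaque step (assumed): annotations a statistical parser might produce
# for the sentence "The dog chased the cat". Written by hand here.
ANNOTATED = [
    Token(0, "The", "DET", "det", 1),
    Token(1, "dog", "NOUN", "nsubj", 2),
    Token(2, "chased", "VERB", "ROOT", 2),
    Token(3, "the", "DET", "det", 4),
    Token(4, "cat", "NOUN", "dobj", 2),
]


def extract_actions(tokens):
    """Transparent step: deterministic rules over the annotations."""
    actions = []
    for verb in tokens:
        if verb.pos != "VERB":
            continue
        # Direct syntactic children of the verb (excluding the verb itself).
        children = [t for t in tokens if t.head == verb.i and t.i != verb.i]
        actions.append({
            "action": verb.text,
            "agents": [t.text for t in children if t.dep == "nsubj"],
            "patients": [t.text for t in children if t.dep in ("dobj", "obj")],
        })
    return actions


triples = extract_actions(ANNOTATED)
print(triples)
```

Because every rule is explicit, a reader without linguistic training can trace why "dog" was labelled the agent and "cat" the patient, which is the kind of validation the graybox design is meant to support.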
The package aims at language understanding and information extraction, rather than language generation. It organizes narrative data for easy querying and statistical analysis. Its semi-transparent design makes validation by human users straightforward, contributing to trustworthy shared representations of narratives, which are crucial for collaboration with AI systems.
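As a rough illustration of what querying and simple statistics over organized narrative data could look like, the sketch below works on plain Python dictionaries. The record layout, the example values, and the `query` helper are assumptions made for this example only and do not reflect Segram's own data structures or interface.

```python
from collections import Counter

# Hypothetical narrative records: one entry per extracted action.
records = [
    {"agent": "dog", "action": "chase", "patient": "cat"},
    {"agent": "cat", "action": "climb", "patient": "tree"},
    {"agent": "dog", "action": "bark", "patient": None},
]


def query(records, **conditions):
    """Return the records whose fields equal all the given values."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in conditions.items())
    ]


# Query: everything the dog does.
dog_actions = query(records, agent="dog")
print([r["action"] for r in dog_actions])

# Simple statistic: how often each agent acts.
agent_counts = Counter(r["agent"] for r in records)
print(agent_counts.most_common())
```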
The core functionalities of Segram are implemented in an alpha version. This microproject aims to improve and release a beta version, which includes a user-friendly interface for querying and analyzing narrative data, along with comprehensive documentation. The release will be suitable for a wide range of users, regardless of their linguistic or computational background.
The project delivered a Python software package for narrative analysis, as set out in the project description. The package is distributed through the Python Package Index (PyPI) under a permissive open-source license (MIT), so it is easily accessible and free to use. It also comes with a detailed documentation page that facilitates adoption by third parties. It is worth noting that the advent of the latest generation of large language models (LLMs) has partially limited the relevance of the project results.
Tangible Outcomes
- Package page at Python Package Index: https://pypi.org/project/segram/
- Tutorial page documenting how to use the package: https://segram.readthedocs.io/en/latest/