In line with our systemic action research approach, Congruence Engine is currently running multiple investigations in parallel, each led by different researchers. The investigations explore questions of historical and professional practice while experimenting with different computational tools and social processes in the development of a national collection social machine. Some of the investigations are developing digital pipelines that take collections of heritage material or datasets and transform them into linkable, machine-readable data, while others address the social, political and infrastructural issues that lie at the heart of a potential future digitally connected national collection. Current investigations include:
One of the issues in seeking to link collections using their online catalogue data is just how rudimentary those data often are. ‘Deeper Catalogue Data’ is exploring the potential of the Science Museum’s printed collections catalogues from a hundred years ago for enhancing the linkability of objects in our collections. At the time of digitisation in the 1980s these catalogues were overlooked, but we now think they may have real potential, as their entries are often significantly longer and more detailed than what is online. Starting with the 1921 textile machinery catalogue, we are using machine learning techniques to automatically extract from this richer detail the names of objects, people, companies and places, along with dates. We aim thereby to be able to link our textile machines both generically and specifically to relevant material in other collections nationwide, providing mutual contextualisation. In a second strand, we are investigating the thought style of the curators who created our collections so as to understand how the seeming strengths and weaknesses of the collections arose. Once we have proof of concept from this one collection, we will seek to expand the techniques to the whole Science Museum collection, and possibly beyond.
The investigation aims to contribute towards the design and development of a sector-level strategy for the optimisation, improvement and connectivity of the online catalogues of cultural heritage institutions as part of a national collection’s infrastructure. Through this investigation we explore and demystify common practices in museum online catalogues, aiming to reveal potential obstacles that hinder their connectivity and their advanced use by various users.
The investigation started with an interest in exploring museum (online) catalogues from a technical and data point of view, in order to assess potential obstacles and weaknesses in the catalogues’ underlying data that limit their linking with other museums’ collections and records, using the Science Museum Group (SMG) catalogue dataset as a case study.
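One simple way to make such weaknesses visible is to measure how often each catalogue field is actually filled in. The following is a minimal sketch only: the records and the field names (`title`, `maker`, `date`, `place`) are illustrative assumptions, not the actual SMG schema or data, and the investigation itself may use quite different measures.

```python
from collections import Counter

# Hypothetical catalogue records. Field names and values are illustrative
# assumptions, not taken from the Science Museum Group dataset.
records = [
    {"id": "obj-1", "title": "Spinning mule", "maker": "", "date": "1880", "place": None},
    {"id": "obj-2", "title": "Power loom", "maker": "Example & Co.", "date": "", "place": "Manchester"},
    {"id": "obj-3", "title": "Carding engine", "maker": None, "date": None, "place": ""},
]


def field_completeness(records, fields):
    """Return, for each field, the share of records holding a non-empty value."""
    filled = Counter()
    for record in records:
        for field in fields:
            value = record.get(field)
            # Treat missing keys, None and whitespace-only strings as empty.
            if value is not None and str(value).strip():
                filled[field] += 1
    total = len(records)
    return {field: (filled[field] / total if total else 0.0) for field in fields}


completeness = field_completeness(records, ["title", "maker", "date", "place"])
for field, share in completeness.items():
    print(f"{field}: {share:.0%} of records filled")
```

Fields with a low share are poor candidates for linking on their own, which is one reason sparsely populated maker, date and place fields can limit connections to other collections.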
Since the invention of sound recording, archival sound has become one of the most powerful mediums for preserving information in sonic form and creating a living, emotional connection with our past. However, the data currently available in oral history, sound and folk music archives often do not allow machine-readable content to be extracted from these sources, or the sources to be digitally linked with other datasets. Two investigations in the Congruence Engine project are exploring the connective potential of sound, looking at the processes and techniques for connecting different types of audio-based sources to museum collections.
Each oral history interview is a rich, dense, and multilayered historical source that encloses different types of connections with people, places and museum objects, as well as personal experiences, family memories and social issues. This investigation is exploring the use of Automatic Speech Recognition and the latest advancements in Natural Language Processing to unlock the multiple layers of meaning enclosed in oral history archives, with the ultimate objective of opening new perspectives on the use of these sources for connecting museum collections. The pipeline, which is being developed in collaboration with the Institute for Digital Culture of the University of Leicester, includes the development of a visual exploratory tool that will allow users to explore a range of audio interviews from different oral history projects and search across interconnected topics.
Workers’ songs are an incredibly rich source for the social history of the textile mills and mining industry. These songs are usually displayed in museums to complement an exhibit and collected in a variety of institutional and non-institutional repositories, but are usually disconnected from the objects, places and people they relate to. The aim of this investigation is to develop a pipeline to reconnect this sonic heritage to the material collections held in museums. In collaboration with the singing historian Jennifer Reid, we are developing an annotation tool able to capture the expert knowledge of folk song performers and historians, so revealing the web of connections, vocabulary and meanings enclosed in the songs’ lyrics.
Rapidly developing technological approaches are creating novel opportunities for the production of new forms of immersive engagement with the past, which will be enabled by the interoperability of metadata and data which Towards a National Collection aims to generate. However, the data in question is often ‘thin’: sparsely documented, partial in its representative quality, and of variable (in this case, visual) resolution. At the heart of this investigation is a practical and technology-focused test of the processes involved in producing animated 3D environments from historical image data (still and moving image) using Neural Radiance Fields (NeRF), together with an evaluation of the qualities of data required. In parallel, it interrogates the design requirements for a human-centric social machine pipeline for the identification, acquisition and preparation of the necessary image data.
The supply of energy forms one of the key technological infrastructures of society. How can visualising the history of power stations through maps help us better understand the history of this key infrastructure network? And how did the development of Parsons’ steam turbine alternator shape the network’s history, especially in and around Newcastle-upon-Tyne? By asking such questions, the investigation explores the uses of digital tools to connect museum collection catalogues, historical documents, and new transcriptions of archival company records with the aid of mapping visualisations.
The gendered nature of domestic and household appliance advertising has formed a key area of research for historians and researchers over the past decade. With an increasing number of historical magazines and newspapers being digitised, the number of such advertisements accessible to researchers is on the rise. The investigation explores the suitability of computer vision tools for identifying and classifying objects within such advertisements.
The investigation proposes to design an ontology that can describe C19th and early C20th industrial occupations in terms of who does what, where, when, with what and to what effect, in industries related to the project’s textiles, energy and communications themes. Informed by data from other projects and investigations, the development of the ontology will call upon data extracted from trade directories, census returns and a government-produced taxonomy of occupations created from the 1921 census return. Whilst the immediate practical focus of the ontology development is to support the open linking of data across collections, this work will also encompass an exploration of the potential for developing participatory forms of interaction around the collective authoring and maintenance of an ontology and its associated data resources.
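To make the "who does what, where, when, with what, to what effect" framing concrete, here is a minimal sketch of how one statement about an occupation might be held in code and flattened into subject-predicate-object triples for linking. The class name, field names and predicate names are all hypothetical, chosen for illustration; they are not the project's ontology, which is still being designed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OccupationStatement:
    """One illustrative statement about an occupation (all names hypothetical)."""
    occupation: str                    # who
    activity: str                      # does what
    place: Optional[str] = None        # where
    period: Optional[str] = None       # when
    equipment: Optional[str] = None    # with what
    outcome: Optional[str] = None      # to what effect

    def to_triples(self):
        """Flatten the statement into (subject, predicate, object) triples,
        skipping any facet that has not been recorded."""
        facets = [
            ("performsActivity", self.activity),
            ("atPlace", self.place),
            ("inPeriod", self.period),
            ("usesEquipment", self.equipment),
            ("producesOutcome", self.outcome),
        ]
        return [(self.occupation, predicate, value)
                for predicate, value in facets if value is not None]


# Example values are invented for illustration, not drawn from project data.
statement = OccupationStatement(
    occupation="weaver",
    activity="weaving cloth",
    place="West Riding of Yorkshire",
    period="1921",
    equipment="power loom",
)
triples = statement.to_triples()
for triple in triples:
    print(triple)
```

Keeping each facet optional reflects the fact that sources such as trade directories and census returns rarely record all six facets for a given occupation.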
Using pre-trained named entity recognition (NER) models, we have found it difficult to identify heritage artefacts as entities in text. This limits the usefulness of NER tools in finding candidates for linkage in blocks of text at an object level. This investigation seeks to explore how easy it would be to take a range of object name lists used in museum cataloguing and train an NER model with them, so that it is better able to recognise potential objects in entity recognition tasks. This is designed to be a proof of concept, not an exhaustive investigation that will train NER with all available object name vocabularies.
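As a rough illustration of how an object name list can be put to work, the sketch below tags object names in text by direct dictionary lookup. This is a simpler, swapped-in technique and not a trained NER model: it only shows how a vocabulary can produce labelled character spans, which could in principle serve as candidate training annotations. The vocabulary, the sample sentence and the `OBJECT` label are assumptions made for the example.

```python
import re


def build_matcher(object_names):
    """Compile a case-insensitive pattern matching any name in the list."""
    # Longest names first, so "power loom" is preferred over plain "loom".
    ordered = sorted(set(object_names), key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", flags=re.IGNORECASE)


def tag_objects(text, matcher, label="OBJECT"):
    """Return (start, end, label) character spans for each vocabulary match."""
    return [(m.start(), m.end(), label) for m in matcher.finditer(text)]


# Hypothetical object name list and sample sentence, for illustration only.
vocabulary = ["loom", "power loom", "spinning mule"]
text = "The spinning mule stood beside a power loom."

matcher = build_matcher(vocabulary)
spans = tag_objects(text, matcher)
for start, end, label in spans:
    print(label, repr(text[start:end]))
```

A lookup of this kind cannot recognise object names missing from the list or resolve ambiguous words, which is the gap a model trained on such vocabularies would be intended to close.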
This is a multi-stranded ‘umbrella’ investigation, focused initially on the West Riding of Yorkshire. Through a practical process of data construction, the project interrogates the requirements for aggregating and linking a diversity of analogue, digitised and digitalised/machine-readable data concerning the history of textile mills in a single region. The data encompasses a wide range of business records, spatial data at a range of scales (from national to room-level), and spatially defined linkage of, for example, aerial photographs, alongside primary data that might support micro-historiography.
This mini-investigation is part of a wider investigation into understanding the governance of the social machine. As part of trying to get a sense of how the sector is set up to operate within a potential future ‘national collection as social machine’, this investigation seeks to map out the different types of collections management systems being used by the data partners in the project. The goal is to understand the level of homogeneity in the systems and their range of complexity (from spreadsheets to complex integrated knowledge and information systems). To do this, a survey has been sent to all data partners and the results are currently being analysed.
Advances in Generative Pre-trained Transformer (GPT) technology, in combination with Large Language Models (LLMs), have, over the last year, opened radically new methods for working with data, from Natural Language Querying to Knowledge-Graph referenced Summarisation. This investigation examines how GPT technology can be tuned, using various approaches, to return reliable responses to users’ prompts and queries, with reference to specialist corpora derived from scanned analogue sources. The initial focus of the research is on the industrial processes involved in different stages of wool and worsted production, encompassing both popular and technical sources.
With the aim of creating a pathway towards a design specification for a national collection social machine, this investigation seeks to refine the emerging ideas around such a machine and to test their resonance. It will run through a series of iterative phases, undertaking dialogic interviews with project partners, deepening research strands on issues emerging from those interviews, and drafting new investigations to enact and explore different aspects of the social machine.
Our investigation will draw on questions generated through the first meeting of the Race and Decolonisation working group in May 2023, principally responding to a question of how we involve the whole of Congruence Engine in this discussion. We will use different disciplinary perspectives and expertise to help us to develop a decolonial approach we can use across all our ongoing investigations. There are three core strands of work proposed: personal reflection and development of racial consciousness, continued development of a decolonial approach through a reading group series, and social machine participation and governance.
AI models such as ChatGPT are likely to become commonplace in the not-too-distant future and a key part of knowledge discovery, in much the same way as Google and Bing are now. How will the use of such models, alongside public Knowledge Graphs such as Wikidata, affect how the UK’s National Collections are both published and interrogated? We are exploring the limitations of these technologies as well as the new opportunities enabled by them.