Houston, Natalie M An Evaluation of Rhyme Detection Using Historical Dictionaries As part of a larger project in distant reading nineteenth-century British poetry, a method for detecting line-end rhymes was devised that utilizes rhyme dictionaries published in the eighteenth and nineteenth centuries. This method was proposed in order to account for historical debates about the definition of poetic rhymes in English as well as historical changes in pronunciation.
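A minimal sketch of how a dictionary-based rhyme check of this kind could work, assuming a rhyme dictionary can be reduced to groups of words listed under a shared ending. The `RHYME_CLASSES` entries, the function names, and the matching rule are all hypothetical illustrations, not the project's actual data or method.

```python
# Hypothetical miniature rhyme dictionary: each rhyme class maps to the words
# a historical dictionary might list under it. Entries are illustrative only.
RHYME_CLASSES = {
    "OVE": {"love", "prove", "move", "dove"},
    "IGHT": {"night", "light", "bright", "sight"},
    "AY": {"day", "way", "say"},
}


def build_index(classes):
    """Invert the dictionary: word -> set of rhyme classes it belongs to."""
    index = {}
    for label, words in classes.items():
        for word in words:
            index.setdefault(word, set()).add(label)
    return index


def last_word(line):
    """Return the final word of a verse line, lowercased, without punctuation."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in line.split()]
    tokens = [t for t in tokens if t]
    return tokens[-1] if tokens else ""


def lines_rhyme(line_a, line_b, index):
    """Two lines rhyme if their end words share at least one rhyme class."""
    classes_a = index.get(last_word(line_a), set())
    classes_b = index.get(last_word(line_b), set())
    return bool(classes_a & classes_b)


index = build_index(RHYME_CLASSES)
print(lines_rhyme("I cannot prove", "the depth of love.", index))
```

Under these assumed entries, "prove" and "love" count as a rhyme because the dictionary lists them together, even though their present-day pronunciations differ.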

This paper describes an evaluation of this approach that compares it to a method commonly used in computational analysis, which is based on the CMU Pronouncing Dictionary, in order to understand what significant differences occur. Its fundamental idea is to build networks of Chinese characters that share the same syllabic elements. In response to these digital opportunities, research on the First World War has seen a digital 'big bang': the period from has greater digital coverage than any other historical period.

This paper will address the theme of community generated content (CGC) in the context of digital First World War initiatives across Europe, and explore the value and digital legacy of community generated content that is, methodologically, of significance to broader issues around using and sustaining digital histories.

It will discuss the sustainability and use of CGC in historiography, and as a disruption in the research life cycle. Hulden, Vilja Labor Witnesses at U. Congressional Hearings: Historical Patterns This paper examines what metadata about Congressional hearings can tell us about shifts in the relative power of workers in American society over time.

The metadata contains , instances of testimony between and ; it includes information about the witnesses appearing before Congressional committees as well as about the subjects of the hearings. This data is juxtaposed with strike and union density data to suggest that labor has been most consistently represented at Congressional hearings when (a) it has engaged in electoral politics and (b) it has possessed demonstrable strength in civil society as measured not only by isolated incidents but by consistent penetration.

Future work hopes to juxtapose these data sets with actual legislative outcomes; merely being heard does not, after all, necessarily translate into being listened to or deferred to. Such juxtapositions could help elucidate whether organizational strength converts to important outcomes as well as presence at hearings.

In particular, special attention is paid to the role of historical electronic maps and the use of geo-information technologies in the preservation of historical memory. At present, issues of ethnic and political repression in the USSR are studied in a rather fragmentary way, both in Russia and in Europe as a whole. Today, the problem of repression against the Volga Germans has two aspects: first, the necessary direct study of repressive practices and deportation processes; and second, the preservation of the memory of victims of repression and the creation of dialogue in society.

Achieving these goals is impossible without the use of information technology, especially when it comes to preserving memory and representing historical events in public space. Bibliographic metadata can represent important historical trends and resolve issues such as the ordering of editions. In this paper, we present a state-of-the-art analytical approach for determining editions and their ordering. By providing harmonized data and information on historical developments in book production, this will be a great aid for projects aiming to do large-scale text mining.

Contemporary text mining approaches do not utilize edition level information to the fullest extent and therefore are limited in their scope. Using the ESTC metadata, we have developed harmonizing techniques that convert free-form text into more coherent entries for statistical analysis. Furthermore, a new gold standard was developed for validation purposes, with multiple layers of information. The use of this data would significantly enhance the understanding of early modern publishing. Ilvanidou, Maria And The First One Now Will Later Be Last, For The Times They Are A-changin': Modeling Land Communication In Roman Crete The present contribution has a twofold aim: on the one hand it will seek to demonstrate how the use of digital tools and methods enabled the reconstruction of the Roman road network in Crete back in , while on the other hand it will showcase how rapid developments in digital tools often render research in the field of the Humanities outdated and obsolete.

Such an initiative as the integration, connection and modeling of complex data on Roman road networks in the digital domain was indeed quite innovative back then. An analogue approach would still have been up to date and re-usable. Sustainability of this Roman roads modeling project has proven to be next to impossible. Therefore, one could argue that what the digital so generously offered my work, it has taken back rather fiercely. Impett, Leonardo Laurence Early Modern Computer Vision Computer vision necessarily embodies a theory of vision (primarily a neuroscientific one); conversely, important discoveries in the theory of vision have come from computer vision algorithms.

This paper describes a project, Early Modern Computer Vision, which therefore attempts to prototype a computer vision (that is to say, a way for machines to read images) which is based on Italian theories of optics, vision and visual art of the 16th century, as an experimental apparatus to investigate those theories. I present a passage by Michael Baxandall in which he suggests something similar in the s (though he didn't attempt technical implementation), and sketch an initial prototype for an Early Modern Computer Vision: a digital colour-space based on Giovanni Paolo Lomazzo's Temple of Painting. This phenomenon can originate from a conflict of interest or uneasiness during an interview.

In some contexts, such experiences are associated with negative emotions such as fear or distress. People tend to adopt different hedging strategies in such situations to avoid criticism or evade questions. In this work, we analyze several survivor interview transcripts to determine different characteristics that play crucial roles in tense situations. We discuss key components of tension experiences and propose a natural language processing model which can effectively combine these components to identify tension points in text-based oral history interviews.

The model provides a framework that can be used in future research on tension phenomena in oral history interviews. By dissolving the structure of texts, the frequencies of different words can be determined and visualized with font size. But, there are crucial theoretical problems in the design of tag clouds that question their benefit for text analysis tasks. In this paper, we evaluate the value of several tag cloud visualization techniques that have been designed to support research tasks in various digital humanities scenarios.
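The basic tag cloud mechanism described above (count words, then map counts to font sizes) can be sketched as follows. This is an illustrative assumption about one common design, linear scaling between a minimum and maximum point size; the function names and size limits are hypothetical and not taken from the evaluated techniques.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Dissolve the text structure: lowercase it and count every word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def font_sizes(freqs, min_pt=10, max_pt=48):
    """Map each word's frequency linearly onto a font size in points."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = hi - lo
    sizes = {}
    for word, count in freqs.items():
        # If every word is equally frequent, draw them all at the largest size.
        scale = (count - lo) / span if span else 1.0
        sizes[word] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


verse = "In the beginning God created the heaven and the earth"
sizes = font_sizes(word_frequencies(verse))
print(sizes["the"], sizes["god"])
```

The sketch also makes one of the design problems visible: word order and context are discarded entirely, so only frequency survives into the visualization.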

We base our analysis on the King James Bible being the most influential English translation. Overview and detail are typically positioned as opposites in the visual representation of information spaces. However, when visualizing image collections annotated by art historians, there is an opportunity to reveal the visual details of individual images while at the same time exposing iconographic patterns prevalent within a collection. As part of an iterative research and design process in collaboration with a museum of arts and crafts, we have devised a visualization technique that arranges detailed close-ups into frequency-based collages.

The resulting visual interface is designed for open-ended exploration of digitized glass plate negatives without requiring prior knowledge about the collection or the need for entering search queries. We implemented the concept as a web-based interface and evaluated the potential of the approach. Including Minority Voices in the Oral History Archive Through Digital Practice While digital archiving practices in the Netherlands have provided better access to oral history collections, the effort has also demonstrated that the voices heard in those oral history projects are predominantly white.

This paper argues that the composition of the Dutch oral history archive is in dire need of revision and seeks to generate a dialogue on how to remedy this silence. In a discipline that has traditionally prided itself on its emancipatory potential, ethnic minorities and formerly colonized peoples in particular have received relatively little attention. First, I closely examine the state of the art of digital oral history in the Netherlands. Second, I will explore how digital research infrastructures and repositories can contribute to a more inclusive archive through closely collaborating with community archives Kiessling, Benjamin 1,2 Kraken - an Universal Text Recognizer for the Humanities Kraken is a language-agnostic optical character recognition engine that can be applied to both printed and handwritten texts with relatively modest training effort.

It includes a number of features making it of special interest to digitization work in the humanities. Its key research question revolves around measuring the success of approaches to immersive technologies at major heritage sites in Scotland, both in terms of outcomes against business plan expectations and in terms of visitor response, and the kinds of future development supported by the evidence.

Development of an evidence-based, decision-making model is currently under-way and will be presented at DH Formulated as a policy and risk assessment document, the model is meant to help heritage institutions identify the kinds of future immersive experiences that are supported by our evidence; as well as assess how to develop effective, meaningful content into leading edge inclusive and impactful immersive experiences. Some of these images appear several times throughout the corpus. We present how we identify and analyse recurring images using an image hashing algorithm and a data visualisation tool.
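As an illustration of how image hashing can flag recurring images, here is a minimal sketch of an average hash compared by Hamming distance. The abstract does not say which hashing algorithm was used, so this choice is an assumption; the sketch also assumes each image has already been reduced to a small grayscale grid of equal size, and the threshold `max_distance` is hypothetical.

```python
def average_hash(pixels):
    """Average hash: one bit per pixel, set when the pixel is above the mean.

    `pixels` is a small grayscale grid (list of rows of 0-255 values),
    assumed to be already downscaled to a fixed size.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming(hash_a, hash_b):
    """Number of bit positions in which two hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_recurring(pixels_a, pixels_b, max_distance=1):
    """Treat two images as the same picture if their hashes nearly match."""
    return hamming(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance


original = [[10, 200], [220, 30]]
reprint = [[20, 190], [210, 40]]   # same picture, slightly different scan
other = [[200, 10], [30, 220]]     # a different picture
print(is_recurring(original, reprint), is_recurring(original, other))
```

Because the hash depends only on which pixels are brighter than average, small differences in scanning or printing leave it unchanged, which is what makes it usable for matching reprinted images.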

The reappearance of images, combined with bibliographic metadata, can offer insights into the kind of knowledge that is being taught, which images have been successful, as well as which images might have been exchanged between different printers and publishers. It is used to collect oral data from the Germanic and Romance varieties spoken in the area in order to gain further insight into the different aspects of multilingualism and microvariation.

The data collection is done through a simple interface, which facilitates easy collaboration with speakers and speech communities, and aims to return all collected data to the community in a meaningful manner. Online use of minority languages plays an important role in increasing their prestige and visibility, and it can greatly contribute to the maintenance of a variety by increasing awareness of and pride in one's own language.

In , this corpus is among the most important digitized medieval sources, representing more than manuscripts: this is still a work in progress! The fantasy of digital immortality is widely shared, but in reality, digital resources are highly fragile. In short, over many years, we have built a very safe and costly digital necropolis progressively covered by layers of digital sand rather than a clean organized library. This paper will present the consecutive operations made during the preservation project of this very valuable collection of manuscripts.

Larrousse, Nicolas; Marchand, Joel A Techno-Human Mesh for Humanities in France: Dealing with preservation complexity Nowadays, as the use of digital data for research in Humanities has become the norm, researchers are dealing with a huge amount of data. As a consequence, the risk of data loss is increasing. Another difficulty is to provide full access to this flood of data to users often located in distant areas. These problems can no longer be addressed individually by researchers or even at a laboratory level: it is therefore necessary to use a technical infrastructure with specific skills to provide stable preservation services.

This solution is proposed by Huma-Num, the French national infrastructure dedicated to Digital Humanities. Li, Hui Dishes on the menu: Turning Historic Menu into Menu Network Historic menus contain abundant information about changing regional tastes, the ingredients of popular dishes, the arrangements of different meals, and fascinating stories behind the menu. However, research on the modeling, measurement, and analysis of menu networks is still at its very beginning. In this paper, we aim to propose a menu network that closely resembles today's social network based on the metadata and content of menus.

We set the formalization and standard for the basic elements in most menus, and introduce our menu network, which integrates temporal, geographical, economic and textual information into a graph structure. Although semantic web technologies are gradually being introduced in the digital humanities and cultural heritage institutions, the representation of linked data is still very abstract and hardly allows for interactions by researchers or other users.

The 2D interface aims to preserve and present the complexities rooted in historical sources through deep mapping. It aims at the visualization and analysis of the migration patterns of creative individuals within Amsterdam during the Dutch Golden Age.

The goal of creating chronicles from biographies led us to the very challenging task of sentence compression. We used the biographies of the Taipei gazetteers as the testbed, and abbreviated sentences based on the results of constituency and dependency parsing of Stanford tools. We shortened the original sentences by heuristically dropping some nodes in the parsing results. And yet many of these projects lack adoption of controlled terminologies to represent their content and aid search, discoverability, and use.

Or will you get different results by using an older controlled vocabulary from the same time period as the documents? Our presentation describes the results of our experiments comparing the output of current and historical vocabularies to automatically index historical documents, and discusses our findings.

Lorenzini, Matteo; Rospocher, Marco; Tonelli, Sara Computer Assisted Curation of Digital Cultural Heritage Repositories The objective of metadata curatorship is to ensure that users can effectively and efficiently access objects of interest from a repository, digital library, catalogue, etc. However, we are often facing problems related to the low quality of metadata used for the description of digital resources, for example wrong definitions, inconsistencies, or resources with incomplete descriptions.

There may be many reasons for that, all completely valid, e. Taking as reference the framework developed by Bruce and Hillmann , in this paper we present our ongoing work, which aims at defining computable metrics to assess metadata quality and automate the metadata quality check process.
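To make the idea of a computable metadata quality metric concrete, here is a minimal sketch of a completeness score, one of the quality dimensions commonly associated with the Bruce and Hillmann framework. The field list `REQUIRED_FIELDS` and the scoring rule are hypothetical assumptions for illustration, not the metrics defined in the paper.

```python
# Hypothetical list of fields a repository might require for every record.
REQUIRED_FIELDS = ("title", "creator", "date", "description")


def is_filled(value):
    """A field counts as filled if it is present and not empty or blank."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) > 0
    return True


def completeness(record, required=REQUIRED_FIELDS):
    """Share of required fields that are filled, between 0.0 and 1.0."""
    if not required:
        return 1.0
    filled = sum(1 for field in required if is_filled(record.get(field)))
    return filled / len(required)


record = {"title": "Vase", "creator": "", "date": "1650"}
print(completeness(record))
```

A score like this only detects missing values; wrong definitions or inconsistencies, which the abstract also mentions, would need separate checks.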

We argue that ARC's work with TAMU-L offers a model for sustaining digital editions in perpetuity, and ultimately, creating ways for future scholars to discover editorial commentary via the semantic web. This paper emphasizes insights we have learned while working on this project that are meaningful not only to computational musicology and music information retrieval researchers, but also to those working in the digital humanities in general.

We first focus on approaches to dataset construction and machine learning, and then explore approaches to making research data, software and results available, usable and attractive to other researchers in the humanities, including those not yet accustomed to computational approaches. McKee, Sarah E. DH And The Evolving Monograph This presentation will begin with a brief overview of the history and theoretical underpinnings of the Digital Publishing in the Humanities initiative at Emory University.

This four-year experiment seeks to find best practices for supporting faculty in the development of digital monographs, including securing funding and collaborating with publishers to create works that extend beyond the form of a traditional book. The focus will then shift to several case studies of digital monographs currently under development by Emory faculty. Mei, Ching-Hsuan; Hung, Jen-Jou Exploring Intertextuality in the Mahayoga Section of the Rin chen gter mdzod Although Tibetan scholars have already noticed the phenomenon of textual reuse in treasure literature (gter ma), it remains difficult to conduct comparative reading on a large scale and to further identify repeated sentences and locate their origin.

Deducing from previous studies, we estimate that there might be thicker intertextuality embedded in the writings of treasure texts than has already been noticed. There has so far been no systematic analysis of large Tibetan textual collections in academic circles; we therefore propose to apply digital textual analysis technology to deconstruct the great corpus of Tibetan treasure, the Mahayoga section in the Rin chen gter mdzod.

Considering the amount of data, we try to implement digital technology to compare each phrase in order to detect reused sentences, so that we can further interpret the so-called intertextuality in Tibetan treasure literature. After a trial period of this research project, we find it is an attainable goal. We believe that managing and characterizing the degradation of online digital humanities projects is a complex problem that demands further analysis. In this abstract, we go one step further into exploring the collectively shared distinctive signs of abandonment to quantify the planned obsolescence of online digital humanities projects.

For this purpose, we have created a framework that collectively quantifies the signs of abandonment in online digital humanities projects. Our study incorporates the retrieved HTTP response codes, number of redirects, a detailed examination of the contents and links returned by traversing the base node, external resources, HTTP headers and linked files. We intend this study to be a step forward towards better preservation mechanisms and towards adopting strategies for the planned obsolescence of digital humanities projects. Given a collection of 10, English-language sonnets, we stretch each poem to fit a standardized square, with each line fully justified.

We then create a visualization for each distinct word showing the position of all instances of that word. For example, we find that "start" and "apart" appear almost always at the end of lines, and that "start" rarely occurs in the first line. The visualization allows scholars to gain an abstracted view of poetry without losing the poets' individual choices about word placement.
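The placement of words inside the standardized square can be sketched as a mapping from each word to normalized (x, y) coordinates. This sketch assumes justification by word index rather than by character width, which is a simplification; the function name and the two-line sample are hypothetical.

```python
from collections import defaultdict


def word_positions(poem):
    """Map each word to its (x, y) positions in a unit square.

    y runs from 0.0 (first line) to 1.0 (last line); x runs from 0.0
    (first word of a line) to 1.0 (last word), so every line is stretched
    to the full width regardless of its length.
    """
    lines = [line.split() for line in poem.strip().splitlines() if line.strip()]
    positions = defaultdict(list)
    for row, words in enumerate(lines):
        y = row / (len(lines) - 1) if len(lines) > 1 else 0.0
        for col, raw in enumerate(words):
            word = raw.strip(".,;:!?\"'()").lower()
            if not word:
                continue
            x = col / (len(words) - 1) if len(words) > 1 else 0.0
            positions[word].append((x, y))
    return dict(positions)


sample = "We did not start\nso far apart\n"
positions = word_positions(sample)
print(positions["start"], positions["apart"])
```

Collecting these coordinates for one word across a whole corpus gives the point cloud that such a visualization would plot; a word that clusters at x = 1.0 is a line-end word.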

This tool can help scholars generate and test theories about the interplay of rhyme, meter, syntax, and emphasis. Every person in the database has their own page consisting of two parts: a biographical card with personal data and a field for the publication of documents and biographical texts. Crowdsourcing is a very important part of the project. There are several possible activities for our users: they can find appropriate pages for unsorted photos, parse data from biographies into fields in the biographical form, or identify and merge duplicate pages.

Our aim is to normalize data in biographical fields to make academic research easier. Miyake, Maki Applying Measures of Lexical Diversity to Classification of the Greek New Testament Editions The study focuses on decision tree models based on several measures of lexical diversity, aiming at classifying genres of authorship attribution and critical types in various editions of the Greek New Testament. We use measures of lexical diversity that are not significantly correlated with the number of tokens.
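To illustrate what a lexical diversity measure looks like in practice, the sketch below contrasts the plain type-token ratio with Yule's K, a measure often described as less sensitive to text length. The abstract does not name the measures used, so Yule's K is an assumption chosen for illustration, and the token sample is hypothetical.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Distinct words divided by total words; falls as texts get longer."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def yules_k(tokens):
    """Yule's K = 10^4 * (sum of squared word frequencies - N) / N^2.

    Higher values mean more repetition, that is, lower lexical diversity.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    n = len(tokens)
    squared = sum(freq * freq for freq in Counter(tokens).values())
    return 1e4 * (squared - n) / (n * n)


tokens = "en arche en ho logos".split()
print(type_token_ratio(tokens), yules_k(tokens))
```

Scores like these, computed per edition, could then serve as the input features for a decision tree or random forest classifier.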

After creating training and test subsets from several editions, we apply two classification algorithms: Classification and Regression Tree and Random Forest. We then assess the classification accuracy obtained with the token-independent measures. Mol, Angus A. These language games are not new: the complex classification of games was already used as a discussion of generalities in language by Wittgenstein in his Philosophical Investigations. To this end data has been collected from the digital distribution platform Steam. A user-based tag recommender system is used to explore game families and genres through network community detection algorithms.

Molineaux, Benjamin Joseph The Corpus of Historical Mapudungun: This paper presents the challenges and prospects of building the Corpus of Historical Mapudungun, focusing on the difficulties of materials and methods used to reconstruct the history of a Native American Language.

Special focus is placed on sound change, particularly epenthesis, in a language with abundant complex morphological structure (aka polysynthesis). To illustrate the potential use of our setup, we first introduce VedaWeb, a web-based platform that provides access to ancient Indian texts written in Vedic Sanskrit, the oldest form of ancient Indo-Aryan.

Building on that, we present the architecture behind these APIs and finally we summarize by analyzing the potential role of APIs in Digital Humanities projects. Morent, Stefan Sacred Sound - Sacred Space: In Search Of Lost Sound The project investigates the interaction of the architecture of sacred spaces with sound and the relations between concepts of sacred spaces and their socio-cultural construction and religious experience as well as the shaping of liturgical forms.

Such complex systems of relations are particularly demanding if sacred buildings don't exist anymore or at least not in their original form. New approaches of research are provided by recently refined methods of virtual reconstruction of historical acoustics based on reconstructed 3D-models of the architecture. This research project will explore the contextualization of liturgical singing in its original sound space.

The innovative character of the research project consists in the combination of musicological, liturgical and ritual studies with techniques of Digital Humanities. Peter and Paul at Hirsau, St. We describe the key concepts of our approach in the context of an exemplary use case application, where the application's topology is modeled in a TOSCA-compliant way.

Our use case is the Musical Competitions Database, a web application providing comprehensive information about music related competitions from to With this contribution, we want to trigger a discussion about the applicability of methods and technologies of professional cloud deployment and provisioning strategies to problems of long-term availability of research software in the DH-community. We argue that counterfactual analysis is key to understanding their roles in the art worlds of Amsterdam and Antwerp in the seventeenth century. Palladino, Chiara; Bergman, James; Trammell, Caroline; Mixon, Eleanor; Fulford, Rebecca Using Linked Open Data to Navigate the Past: An Experiment in Teaching Archaeology Linked Open Data is a powerful tool for navigating through the complexity of the inherently multifaceted reality of archaeological sites, which results from the intersections of space, materiality, language, visual culture, history, text, and so on.

However, LOD also poses the challenge of how to manage such complexity in a meaningful way. In this paper, we report on an experimental project developed during a Classical Archaeology course in , during which we researched four different Graeco-Roman sites, with the goal of reconstructing the main aspects of their material history through exclusively LOD-based resources.


These collections mainly include classical Chinese poems that were produced between and in Taiwan. We focus on the spatiotemporal analysis of the poets and poems, and provide three application examples in this proposal. The examples include the analysis of the distribution of birthplaces of the poets of different time periods, the distribution of place names in poems of different time periods, and the temporal distribution of place names that were mentioned in poems of a specific poet.

Peroni, Silvio The Open Citations Movement Purpose: This article introduces the benefits of releasing a huge set of open citation data as public domain material. Findings: The open citations movement has received extensive media coverage since the launch of the I4OC, and several projects and datasets have been released so far to leverage the open citation data available online. Implications: The open citation data available is still far from being competitive with well-known proprietary citation databases such as Scopus and Web of Science. However, recently, several federated and interlinked open citation databases have been released and are accessible and interoperable with each other by means of Web technologies.

Value: Open citation data are a positive disruption in the world of scholarly communication, since they entirely change how we approach science, its evolution, and all the related context, such as research assessment evaluations, science of science, bibliometrics, and future scientific discoveries. Povroznik, Nadezhda Georgievna Documentation of Digital Heritage Information Resources: Expanding Access for Research and Education This paper discusses the latest approaches to developing information systems for digital cultural heritage on a global scale, including the creation of catalogs and infrastructure for resource documentation.

Digital cultural heritage resources are diverse in content, origin, purpose, scale, technology and user audience. Documentation systems are essential to facilitate advanced digital humanities research and to provide greater user access to digital heritage information resources.

Such a documentation system has been developed. The platform includes a wide range of characteristics related to describing information resources for digital cultural heritage. The resource meta-description structure includes 39 fields that represent 3 groups of data: (1) data on the creators of the information resource; (2) general information about the information resource; (3) content description metadata.

The proposed method and solutions expand the possibilities for finding thematically similar information resources, and provide a global model to make such resources more accessible for research and education. Silk, however, has become a seriously endangered heritage. Although many European specialized museums are devoted to its preservation, they usually lack size and resources to establish networks or connections with other collections.

In this paper, we will present how we have defined this data model, and how we have specified the entities to be represented by the ontology and the existing relationships between these entities. The shortcomings of the Uppsala project will guide the design of an extended cross-linked online dictionary of early modern Hindustani based on little known wordlists and vocabularies compiled by European merchants and missionaries in the 17th c.

If successful, this approach can be applied to other early modern vocabularies constituting unique and valuable descriptions of non-European languages. Rajan, Vinodh 1 ; Stiehl, H. Recently, Visual Language-based applications like AppInventor have gained a lot of attention. By using an intuitive visual syntax, they let non-programmers create computational solutions easily.

It offers a largely self-usable toolbox that humanists can use to build solutions themselves. We initially outline the need and motivations for developing AMAP and further elaborate on the design and implementation of AMAP along with its potential applications. Ries, Thorsten Born-Digital Archives A Digital Forensic Perspective on the Historicity of Born-digital Primary Records The proposed paper will scope the complexity of born-digital archives from a digital forensic, historical and philological perspective.

Personal digital archives, institutional repositories, web archives, email archives and social media archives create digital primary records that the historical humanities struggle to fully recognize as documents in their own right. The historicity of the forensic materiality and structure of the born-digital record is a concept still to be methodologically and theoretically understood in the humanities and in archival science. The purpose of this paper is to argue that forensic materiality and analysis is methodologically relevant for critical appraisal and understanding of production processes of born-digital sources in the humanities as a whole, including history, social history, political and cultural studies, including literature, art history, etc.

Based on a dataset of four Dutch newspapers that span the period , we show how conceptual connections between the members of the MCE trinity are highly restricted. N-gram frequency measures and collocations are employed to map the word usage and, building on recent advancements in diachronic vector semantics, we use word embeddings to study changing relations between modernity, civilization and Europe. These methods show how the trinity is characterized by intermittent and alternating connections, but not by perennial semantic boundaries.
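One simple way to score collocations of the kind mentioned above is pointwise mutual information (PMI) over a fixed word window. The sketch below is an assumption about how such a measure could be computed, using a simplified probability estimate (counts divided by total tokens); it is not the paper's pipeline, and the toy sentences and function name are hypothetical.

```python
import math
from collections import Counter


def cooccurrence_pmi(sentences, target, window=2):
    """PMI between `target` and every word found within `window` tokens of it.

    Probabilities are estimated by dividing raw counts by the total number
    of tokens, which is a deliberate simplification.
    """
    word_counts = Counter()
    pair_counts = Counter()
    total = 0
    for sentence in sentences:
        tokens = sentence.lower().split()
        total += len(tokens)
        word_counts.update(tokens)
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start, stop = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    pair_counts[tokens[j]] += 1
    scores = {}
    for other, joint in pair_counts.items():
        p_joint = joint / total
        p_target = word_counts[target] / total
        p_other = word_counts[other] / total
        scores[other] = math.log2(p_joint / (p_target * p_other))
    return scores


corpus = ["modern europe", "old europe", "modern civilization"]
scores = cooccurrence_pmi(corpus, "europe", window=1)
print(sorted(scores, key=scores.get, reverse=True))
```

Running the same computation separately for each year or decade of a newspaper corpus would show whether a pairing such as "europe" and "civilization" is stable or only intermittent.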

Given that these results differ from research based on elite discourse, this paper demonstrates the need for digital research into conceptual interrelationships. Rybicki, Jan Analysis of Writer-Text-Translator Social Networks This paper is an analysis of the connections between writers, their texts and their translations through social network analysis. The data for this study was limited to literary translations into Polish, a total of almost 18, individual editions of novels or collections of short stories by authors and translators from languages.

This produces a complex mesh of writer-to-translator connections, which is analyzed using the Fruchterman-Reingold force-directed algorithm. Interesting phenomena can be observed using such a visualization of otherwise inaccessible links between items in the database.

As a test corpus, it uses four digitized, OCRed, and hand-cleaned nineteenth-century French chronicles of Ottoman Algerian history in order to model socio-political networks and uncover the positions and roles of women in this society. The challenge is to extract not only named entities and their relations to one another, but to extract unnamed persons and their relationships as well.


Those who remain unnamed are most often women, servants, slaves, and Indigenous people — the very people about whom scholars are most anxious to know more. This short presentation will share the complete information extraction code, its accuracy, the resulting visualizations, a brief analysis from the case study, and additional use cases that extend far beyond the initial case study to other languages and textual sources.

Schwartz, Daniel L. Syriac is a dialect of Aramaic used in the Near East between the 3rd and 8th centuries and continues to be used liturgically by Christians in the Middle East and India as well as expatriate communities in Europe and North America. This project employs a factoid-based approach to prosopography. Where most factoid-based prosopographies organize data in a relational database, SPEAR encodes prosopographical data from primary source texts in TEI XML using a customized schema designed to facilitate linking this prosopographical data to other linked data resources and for serialization into RDF.

SPEAR shows how a prosopography project can employ TEI, field-specific scholarly standards, and Linked Open Data to produce a highly structured and semantically rich database that maintains close ties to the texts from which it is derived. Schwartz, Michelle 1 ; Crompton, Constance 2 Where Our Responsibilities Lie: People, Method, and Digital Cultural History As cultural historians, should our responsibility be to the people in our historical data set or to our methodology? Is it possible to, as they say, have it both ways, and if so, what do Digital Humanities methods offer us as we seek to responsibly represent political history?

In digitizing and digitally remixing primary source data, should we value data collection consistency or value recovering information that the original methodology could not capture? We plan to report on the data collecting practices of a TEI-based Canadian history project.