Materials and Software


Below we provide the tools that were used to analyze medieval papal charters. They may also be useful for analyzing other document images.

Provided Tools

  • Annotation tool for annotating documents; the annotations are saved in XML format

    • Box and polygon annotations
    • Extraction of annotated snippets
    • Automatic word recognition for faster annotation

  • Snippet extraction tool to extract snippets more selectively than is possible with the annotation tool
  • Layout statistics tool to compute statistics from the XML files created by the annotation tool
  • Plot tool to visualize the statistics obtained with the layout statistics tool
  • Score tool to score image material

The code for writer identification that was used for the submission to the IEEE Winter Conference on Applications of Computer Vision (WACV) 2014 can be found below.





During the annotation process, many snippets (i.e. image excerpts) were annotated. They can be used, e.g., for graph/glyph recognition or as training data for various purposes.

All excerpts are from the Papstarchiv Göttingen, where the full papal charters can be retrieved.

Naming convention, explained by an example: "victor_iv__14480_11630907_apostolicae_sedis_consuetudo_ph_goe__211580__graph_p.png" tells you the name of the pope (Victor IV), the Jaffé number (14480), the probable date of the charter (11630907 = 7 September 1163), and finally the type of the instance (graph) and its label (p).
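
The naming convention above can be unpacked programmatically. The following is a minimal Python sketch, not part of the provided tools. The function name parse_snippet_name and the field names are hypothetical. It assumes that a double underscore separates the main blocks of the file name, that the Jaffé number and the 8-digit date are the first two fields of the second block, and that the last block has the form "<instance type>_<label>". Parts of the name that are not explained above (the words following the date and the number in the third block) are left unparsed.

```python
import re
from typing import NamedTuple


class SnippetName(NamedTuple):
    """Fields recovered from a snippet file name (field names are illustrative)."""
    pope: str           # e.g. "victor_iv"
    jaffe_number: str   # e.g. "14480"
    year: int
    month: int
    day: int
    instance_type: str  # e.g. "graph"
    label: str          # e.g. "p"


def parse_snippet_name(filename: str) -> SnippetName:
    # Drop the file extension (".png" in the example).
    stem = filename.rsplit(".", 1)[0]

    # Assumption: "__" separates the main blocks of the name.
    blocks = stem.split("__")
    if len(blocks) < 3:
        raise ValueError(f"unexpected file name layout: {filename!r}")

    pope = blocks[0]

    # Assumption: the second block starts with "<Jaffé number>_<YYYYMMDD>".
    charter_fields = blocks[1].split("_")
    if len(charter_fields) < 2:
        raise ValueError(f"missing Jaffé number or date: {filename!r}")
    jaffe_number, date_digits = charter_fields[0], charter_fields[1]
    if not re.fullmatch(r"\d{8}", date_digits):
        raise ValueError(f"date is not 8 digits: {date_digits!r}")

    # Assumption: the last block is "<instance type>_<label>"; everything
    # after the first underscore is treated as the label.
    instance_type, _, label = blocks[-1].partition("_")

    return SnippetName(
        pope=pope,
        jaffe_number=jaffe_number,
        year=int(date_digits[:4]),
        month=int(date_digits[4:6]),
        day=int(date_digits[6:8]),
        instance_type=instance_type,
        label=label,
    )


example = ("victor_iv__14480_11630907_apostolicae_sedis_consuetudo_ph_goe"
           "__211580__graph_p.png")
info = parse_snippet_name(example)
print(info.pope, info.jaffe_number, info.year, info.month, info.day,
      info.instance_type, info.label)
```

The date is kept as three integers rather than a calendar date object because the dates are only probable and the sketch makes no assumption about how uncertain months or days are encoded.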


Please note that mislabelled data may occur; we are not responsible for any resulting harm.



  • letters, graphs, graphical symbols, abbreviations, ligatures, words, example lines: file download (1119.6 MB)
  • all text areas (apprecatio, datum lines, first lines, cardinals' signatures, middle bands, pope names, pope subscriptions, plicas, rota inscriptions, subscriptions): file download (620.0 MB)