The encoding of local features is an essential part for writer identification and writer retrieval. While CNN activations have already been used as local features in related works, the encoding of these features has attracted little attention so far. In this work, we compare the established VLAD encoding with triangulation embedding. We further investigate generalized max pooling as an alternative to sum pooling and the impact of decorrelation and Exemplar SVMs. With these techniques, we set new standards on two publicly available datasets (ICDAR13, KHATT).

Articles in Conference Proceedings

Encoding CNN Activations for Writer Recognition

2018 13th IAPR International Workshop on Document Analysis Systems (DAS) (13th IAPR International Workshop on Document Analysis Systems (DAS)), Wien, 24.04.2018, pp. 169-174, 2018 (BiBTeX, Who cited this?)

In contrast to physiological biometric identifiers like fingerprints or iris scans, handwriting can be seen as a behavioral identifier. The process of writing is heavily influenced by factors like schooling or aging. Writer identification is the task of finding an individual scribe in a large data corpus. Typical applications lie in the fields of forensics or security. However, writer identification recently also raised interest in the analysis of historical texts, where they might give new insights of life in the past.

Group members in this project: Opens internal link in current window Vincent Christlein

Related Publications

Encoding CNN Activations for Writer Recognition [DAS 2018]
Vincent Christlein, Andreas Maier

The encoding of local features is an essential part for writer identification and writer retrieval. While CNN activations have already been used as local features in related works, the encoding of these features has attracted little attention so far. In this work, we compare the established VLAD encoding with triangulation embedding. We further investigate generalized max pooling as an alternative to sum pooling and the impact of decorrelation and Exemplar SVMs. With these techniques, we set new standards on two publicly available datasets (ICDAR13, KHATT).

Articles in Conference Proceedings

Christlein, Vincent; Maier, Andreas

Encoding CNN Activations for Writer Recognition

2018 13th IAPR International Workshop on Document Analysis Systems (DAS) (13th IAPR International Workshop on Document Analysis Systems (DAS)), Wien, 24.04.2018, pp. 169-174, 2018 (BiBTeX, Who cited this?)

Unsupervised Feature Learning for Writer Identification and Writer Retrieval [ICDAR 2017]
Vincent Christlein, Martin Gropp, Stefan Fiel, Andreas Maier

Deep Convolutional Neural Networks (CNN) have shown great success in supervised classification tasks such as character classification or dating. Deep learning methods typically need a lot of annotated training data, which is not available in many scenarios. In these cases, traditional methods are often better than or equivalent to deep learning methods. In this paper, we propose a simple, yet effective, way to learn CNN activation features in an unsupervised manner. Therefore, we train a deep residual network using surrogate classes. The surrogate classes are created by clustering the training dataset, where each cluster index represents one surrogate class. The activations from the penultimate CNN layer serve as features for subsequent classification tasks. We evaluate the feature representations on two publicly available datasets. The focus lies on the ICDAR17 competition dataset on historical document writer identification (Historical-WI). We show that the activation features trained without supervision are superior to descriptors of state-of-the- art writer identification methods. Additionally, we achieve comparable results in the case of handwriting classification using the ICFHR16 competition dataset on historical Latin script types (CLaMM16).

Articles in Conference Proceedings

Christlein, Vincent; Gropp, Martin; Fiel, Stefan; Maier, Andreas

Unsupervised Feature Learning for Writer Identification and Writer Retrieval

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (14th IAPR International Conference on Document Analysis and Recognition (ICDAR)), Kyoto, Japan, 13.11.2017, pp. 991-997, 2017 (BiBTeX, Who cited this?)

Writer Identification Using GMM Supervectors and Exemplar-SVMs [PR 2017]
Vincent Christlein, David Bernecker, Florian Hönig, Andreas Maier, Elli Angelopoulou

This paper describes a method for robust offline writer identification. We propose to use RootSIFT descriptors computed densely at the script contours. GMM supervectors are used as encoding method to describe the characteristic handwriting of an individual scribe. GMM supervectors are created by adapting a background model to the distribution of local feature descriptors. Finally, we propose to use Exemplar-SVMs to train a document-specific similarity measure. We evaluate the method on three publicly available datasets (ICDAR / CVL / KHATT) and show that our method sets new performance standards on all three datasets. Additionally, we compare different feature sampling strategies as well as other encoding methods.

Journal Articles

Christlein, Vincent; Bernecker, David; Hönig, Florian; Maier, Andreas; Angelopoulou, Elli

Writer Identification Using GMM Supervectors and Exemplar-SVMs

Pattern Recognition, vol. 63, pp. 258-267, 2017 (BiBTeX, Who cited this?)

Automatic Writer Identification in Historical Documents: A Case Study [ZfdG 2016]
Vincent Christlein, Markus Diem, Florian Kleber, Günter Mühlberger, Verena Schwägerl-Melchior, Esther van Gelder, Andreas Maier

In recent years, Automatic Writer Identification (AWI) has received a lot of attention in the document analysis community. However, most research has been conducted on contemporary benchmark sets. These datasets typically do not contain any noise or artefacts caused by the conversion methodology. This article analyses how current state-of-the-art methods in writer identification perform on historical documents. In contrast to contemporary documents, historical data often contain artefacts such as holes, rips, or water stains which make reliable identification error-prone. Experiments were conducted on two large letter collections with known authenticity and promising results of 82% and 89% TOP-1 accuracy were achieved.

Journal Articles

Christlein, Vincent; Diem, Markus; Kleber, Florian; Mühlberger, Günter; Schwägerl-Melchior, Verena; van Gelder, Esther; Maier, Andreas

Automatic Writer Identification in Historical Documents: A Case Study

Zeitschrift für digitale Geisteswissenschaften, pp. -, 2016 (BiBTeX, Who cited this?)

Offline Writer Identification Using Convolutional Neural Network Activation Features [GCPR 2015]
Vincent Christlein, David Bernecker, Andreas Maier, Elli Angelopoulou

Convolutional neural networks (CNNs) have recently become the state-of-the-art tool for large-scale image classification. In this work we propose the use of activation features from CNNs as local descriptors for writer identification. A global descriptor is then formed by means of GMM supervector encoding, which is further improved by normalization with the KL-Kernel. We evaluate our method on two publicly available datasets: the ICDAR 2013 benchmark database and the CVL dataset.

Project Page

Articles in Conference Proceedings

Christlein, Vincent; Bernecker, David; Maier, Andreas; Angelopoulou, Elli

Offline Writer Identification Using Convolutional Neural Network Activation Features

Pattern Recognition (German Conference on Pattern Recognition), Aachen, 07.10.2015, pp. 540-552, 2015, ISBN 978-3-319-24946-9 (BiBTeX, Who cited this?)

Writer Identification Using VLAD Encoded Contour-Zernike Moments [ICDAR 2015]
Vincent Christlein, David Bernecker, Elli Angelopoulou

→ Competition Winner: ICDAR 2015 Competition on Multi-Script Writer Identification and Gender Classification (Task 1)

→ Best Poster Award for a very significant contribution in the field of document analysis and recognition

Local feature descriptors in combination with bag of (visual) words have recently become the state of the art in writer identification. In this work we propose the use of Zernike moments evaluated at the contours of the script as local descriptor. We then form a global descriptor by encoding the extracted Zernike moments into Vectors of Locally Aggregated Descriptors (VLAD).

Project Page

Articles in Conference Proceedings

Christlein, Vincent; Bernecker, David; Angelopoulou, Elli

Writer Identification Using VLAD Encoded Contour-Zernike Moments

Document Analysis and Recognition (ICDAR), 2015 13th International Conference on (13th International Conference on Document Analysis and Recognition), Nancy, France, 23.08.2015, pp. 906-910, 2015 (BiBTeX, Who cited this?)

Writer Identification and Verification Using GMM Supervectors [WACV 2014]
Vincent Christlein, David Bernecker, Florian Honig, Elli Angelopoulou

→ Competition Winner: ICFHR2014 Competition on Arabic Writer Identification Using AHTID/MW and KHATT Databases,

This paper proposes a new system for offline writer identification and writer verification. The proposed method uses GMM supervectors to encode the feature distribution of individual writers. Each supervector originates from an individual GMM which has been adapted from a background model via a maximum-a-posteriori step followed by mixing the new statistics with the background model.

Project Page

Articles in Conference Proceedings

Christlein, Vincent; Bernecker, David; Hönig, Florian; Angelopoulou, Elli

Writer Identification and Verification Using GMM Supervectors

Proceedings of the 2014 IEEE Winter Conference on Applications of Computer Vision (2014 IEEE Winter Conference on Applications of Computer Vision (WACV)), Steamboat Springs, CO, 24-26.03.2014, pp. 998-1005, 2014 (BiBTeX, Who cited this?)

→ Competition Winner: ICFHR2014 Competition on Arabic Writer Identification Using AHTID/MW and KHATT Databases,

This paper proposes a new system for offline writer identification and writer verification. The proposed method uses GMM supervectors to encode the feature distribution of individual writers. Each supervector originates from an individual GMM which has been adapted from a background model via a maximum-a-posteriori step followed by mixing the new statistics with the background model.

Opens internal link in current window Project Page

Articles in Conference Proceedings

Christlein, Vincent; Bernecker, David; Hönig, Florian; Angelopoulou, Elli

Writer Identification and Verification Using GMM Supervectors

Proceedings of the 2014 IEEE Winter Conference on Applications of Computer Vision (2014 IEEE Winter Conference on Applications of Computer Vision (WACV)), Steamboat Springs, CO, 24-26.03.2014, pp. 998-1005, 2014 (BiBTeX, Who cited this?)

→ Competition Winner: ICDAR 2015 Competition on Multi-Script Writer Identification and Gender Classification (Task 1)

→ Best Poster Award for a very significant contribution in the field of document analysis and recognition

Local feature descriptors in combination with bag of (visual) words have recently become the state of the art in writer identification. In this work we propose the use of Zernike moments evaluated at the contours of the script as local descriptor. We then form a global descriptor by encoding the extracted Zernike moments into Vectors of Locally Aggregated Descriptors (VLAD).

Opens internal link in current window Project Page

Articles in Conference Proceedings

Christlein, Vincent; Bernecker, David; Angelopoulou, Elli

Writer Identification Using VLAD Encoded Contour-Zernike Moments

Document Analysis and Recognition (ICDAR), 2015 13th International Conference on (13th International Conference on Document Analysis and Recognition), Nancy, France, 23.08.2015, pp. 906-910, 2015 (BiBTeX, Who cited this?)

Convolutional neural networks (CNNs) have recently become the state-of-the-art tool for large-scale image classification. In this work we propose the use of activation features from CNNs as local descriptors for writer identification. A global descriptor is then formed by means of GMM supervector encoding, which is further improved by normalization with the KL-Kernel. We evaluate our method on two publicly available datasets: the ICDAR 2013 benchmark database and the CVL dataset.

Opens internal link in current window Project Page

In recent years, Automatic Writer Identification (AWI) has received a lot of attention in the document analysis community. However, most research has been conducted on contemporary benchmark sets. These datasets typically do not contain any noise or artefacts caused by the conversion methodology. This article analyses how current state-of-the-art methods in writer identification perform on historical documents. In contrast to contemporary documents, historical data often contain artefacts such as holes, rips, or water stains which make reliable identification error-prone. Experiments were conducted on two large letter collections with known authenticity and promising results of 82% and 89% TOP-1 accuracy were achieved.

This paper describes a method for robust offline writer identification. We propose to use RootSIFT descriptors computed densely at the script contours. GMM supervectors are used as encoding method to describe the characteristic handwriting of an individual scribe. GMM supervectors are created by adapting a background model to the distribution of local feature descriptors. Finally, we propose to use Exemplar-SVMs to train a document-specific similarity measure. We evaluate the method on three publicly available datasets (ICDAR / CVL / KHATT) and show that our method sets new performance standards on all three datasets. Additionally, we compare different feature sampling strategies as well as other encoding methods.

Deep Convolutional Neural Networks (CNN) have shown great success in supervised classification tasks such as character classification or dating. Deep learning methods typically need a lot of annotated training data, which is not available in many scenarios. In these cases, traditional methods are often better than or equivalent to deep learning methods. In this paper, we propose a simple, yet effective, way to learn CNN activation features in an unsupervised manner. Therefore, we train a deep residual network using surrogate classes. The surrogate classes are created by clustering the training dataset, where each cluster index represents one surrogate class. The activations from the penultimate CNN layer serve as features for subsequent classification tasks. We evaluate the feature representations on two publicly available datasets. The focus lies on the ICDAR17 competition dataset on historical document writer identification (Historical-WI). We show that the activation features trained without supervision are superior to descriptors of state-of-the- art writer identification methods. Additionally, we achieve comparable results in the case of handwriting classification using the ICFHR16 competition dataset on historical Latin script types (CLaMM16).

Articles in Conference Proceedings

Christlein, Vincent; Bernecker, David; Maier, Andreas; Angelopoulou, Elli

Offline Writer Identification Using Convolutional Neural Network Activation Features

Pattern Recognition (German Conference on Pattern Recognition), Aachen, 07.10.2015, pp. 540-552, 2015, ISBN 978-3-319-24946-9 (BiBTeX, Who cited this?)

Journal Articles

Christlein, Vincent; Diem, Markus; Kleber, Florian; Mühlberger, Günter; Schwägerl-Melchior, Verena; van Gelder, Esther; Maier, Andreas

Automatic Writer Identification in Historical Documents: A Case Study

Zeitschrift für digitale Geisteswissenschaften, pp. -, 2016 (BiBTeX, Who cited this?)

Journal Articles

Christlein, Vincent; Bernecker, David; Hönig, Florian; Maier, Andreas; Angelopoulou, Elli

Writer Identification Using GMM Supervectors and Exemplar-SVMs

Pattern Recognition, vol. 63, pp. 258-267, 2017 (BiBTeX, Who cited this?)

Articles in Conference Proceedings

Christlein, Vincent; Gropp, Martin; Fiel, Stefan; Maier, Andreas

Unsupervised Feature Learning for Writer Identification and Writer Retrieval

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (14th IAPR International Conference on Document Analysis and Recognition (ICDAR)), Kyoto, Japan, 13.11.2017, pp. 991-997, 2017 (BiBTeX, Who cited this?)