Website deprecated and outdated. Click here for the new site.
Daniel Stromer, M. Sc.
Researcher in the Learning Approaches for Medical Big Data Analysis (LAMBDA) group at the Pattern Recognition Lab of the Friedrich-Alexander-Universität Erlangen-Nürnberg

Referencing and Contact
Whenever you use this database, please reference the following paper:

If you have any questions, feel free to contact:

The LME-Book-CT dataset

Large Book Example

The animation shows an X-ray scanned and reconstructed book and is intended to give you an idea of this research. By scanning a document (for example a book or a scroll) in a non-invasive manner (in this case with X-ray CT), its content should be completely digitized. This concept only works if the materials used (inks and writing media) attenuate the penetrating rays differently than their surroundings (most likely air, but possibly also contaminating material such as earth). The writing becomes visible if the ink attenuates the rays differently than the writing material. In the given case, the iron-based ink is readable on the paper page when slicing through the volume. The animation shows that the entire book volume can be investigated digitally, so that the reader can virtually browse through the document. Because of the natural appearance of such books and the limited resolution and discretization, the scanned book cannot be read like its analog version; computer vision and machine learning algorithms have to remap the volumetric data into 2-D. This process can be compared to flattening the pages to get rid of their curvy and wavy structure.

The book
The X-ray CT scanned book shown in the figure below measures 17 cm x 13 cm x 3 cm (L x W x H) and consists of 56 pages of handmade paper wrapped in a buffalo leather cover. Each page has a thickness of around 150 µm.
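To get a feeling for the data size, the book's dimensions can be converted into voxel units. The following is a minimal sketch, assuming the 103 µm isotropic resolution reported for the scans on this page. The class and method names are hypothetical and not part of CONRAD, and the actual scan volumes may cover a larger region than the book's bounding box.

```java
/** Rough size estimate for the book in voxel units (illustrative helper, not part of CONRAD). */
final class BookVoxelEstimate {

    /** Number of voxels needed to cover a length, rounded up. */
    static long voxelsAlong(double lengthMm, double voxelSizeMm) {
        return (long) Math.ceil(lengthMm / voxelSizeMm);
    }

    /** Thickness of one page expressed in voxels. */
    static double pageThicknessInVoxels(double pageThicknessUm, double voxelSizeUm) {
        return pageThicknessUm / voxelSizeUm;
    }

    /** Raw size in bytes of a 16-bit volume with the given voxel counts. */
    static long rawBytes(long nx, long ny, long nz) {
        return nx * ny * nz * 2L;
    }

    public static void main(String[] args) {
        double voxelMm = 0.103;                 // isotropic resolution stated for the scans
        long nx = voxelsAlong(170.0, voxelMm);  // book length, 17 cm
        long ny = voxelsAlong(130.0, voxelMm);  // book width, 13 cm
        long nz = voxelsAlong(30.0, voxelMm);   // book height, 3 cm
        System.out.printf("Bounding box: %d x %d x %d voxels%n", nx, ny, nz);
        System.out.printf("Raw 16-bit size: %d bytes%n", rawBytes(nx, ny, nz));
        System.out.printf("Page thickness: %.2f voxels%n",
                pageThicknessInVoxels(150.0, 103.0));
    }
}
```

With these numbers, the book's bounding box spans 1651 x 1263 x 292 voxels (about 1.2 GB as raw 16-bit data), and a 150 µm page is only about 1.5 voxels thick, which illustrates why the limited resolution makes reading the pages directly so difficult.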
Contrast-to-noise ratio (CNR) estimation
The Excel sheet Ink_config.xlsx shows an exemplary calculation of the estimated CNR of an X-ray scan. The variable parameters are:
From those parameters, the CNR of a scan is estimated. The file can be downloaded to test your own configurations.

Scan volumes
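As a reference for the CNR estimation described above: the contrast-to-noise ratio is commonly defined as the absolute difference between the mean signals of two materials divided by the noise standard deviation. The sketch below computes this from gray-value samples of ink and paper, for example taken from regions of interest in a reconstructed volume. It uses this common textbook definition, which is an assumption here; the formula and parameters in Ink_config.xlsx may differ, and the class name and sample values are purely illustrative.

```java
/** CNR from two sets of gray-value samples (illustrative, not the spreadsheet's formula). */
final class CnrSketch {

    /** Arithmetic mean of the samples. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Sample standard deviation (divides by n - 1). */
    static double sampleStdDev(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double m = mean(values);
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - m) * (v - m);
        }
        return Math.sqrt(sumSq / (values.length - 1));
    }

    /** CNR = |mean(ink) - mean(paper)| / stdDev(paper); paper serves as the noise reference. */
    static double cnr(double[] ink, double[] paper) {
        return Math.abs(mean(ink) - mean(paper)) / sampleStdDev(paper);
    }

    public static void main(String[] args) {
        // Made-up gray values, only to show the call.
        double[] ink = {110, 112, 108, 110};
        double[] paper = {100, 102, 98, 100};
        System.out.printf("CNR = %.3f%n", cnr(ink, paper));
    }
}
```

Using the paper region as the noise reference is a design choice of this sketch; a homogeneous air region would be another plausible choice.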
The system we used for generating the volumes was a 3-D X-ray micro-CT system with cone-beam geometry. The dataset consists of three FDK-reconstructed volumes of the closed book, scanned with different acquisition parameters. The book was laid down on the turntable. The current of 3 mA and the exposure time of 2 s were kept constant. As tube voltages, we configured 30 kV, 40 kV and 50 kV. For the 50 kV scan, we additionally used a copper pre-filtration of 0.25 mm to narrow the polychromatic X-ray spectrum. With a source-to-object distance of 710 mm and a source-to-detector distance of 1377 mm, the scans have an isotropic resolution of 103 µm. Each volume is stored in a uint16 .tif file (3-D!). To reduce the large file size, the .tif files were zipped (1.5 GByte per file). We recommend ImageJ to open the files. Feel free to download the reconstructed volumes here:

Page Extraction
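The scan geometry described above determines the voxel size through the cone-beam magnification, which is the ratio of the source-to-detector distance to the source-to-object distance. A minimal sketch of this relation follows. The detector pixel pitch is not stated on this page, so the sketch only back-calculates the pitch implied by the reported 103 µm resolution; treat that value, as well as the class and method names, as assumptions.

```java
/** Cone-beam magnification and voxel size (illustrative helper, not part of CONRAD). */
final class ConeBeamGeometry {

    /** Magnification M = source-to-detector distance / source-to-object distance. */
    static double magnification(double sodMm, double sddMm) {
        return sddMm / sodMm;
    }

    /** Voxel size at the object: detector pixel pitch divided by the magnification. */
    static double voxelSizeMm(double detectorPitchMm, double sodMm, double sddMm) {
        return detectorPitchMm / magnification(sodMm, sddMm);
    }

    /** Inverse relation: detector pitch implied by a given voxel size. */
    static double impliedDetectorPitchMm(double voxelSizeMm, double sodMm, double sddMm) {
        return voxelSizeMm * magnification(sodMm, sddMm);
    }

    public static void main(String[] args) {
        double sod = 710.0;   // source-to-object distance in mm (from the text)
        double sdd = 1377.0;  // source-to-detector distance in mm (from the text)
        System.out.printf("Magnification: %.4f%n", magnification(sod, sdd));
        // Assumption: the 103 µm resolution equals pitch / magnification.
        System.out.printf("Implied detector pitch: %.4f mm%n",
                impliedDetectorPitchMm(0.103, sod, sdd));
    }
}
```

The stated distances give a magnification of about 1.94, so the reported resolution would correspond to an effective detector pitch of roughly 0.2 mm, possibly after binning.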
To make the writing on each page visible, a conversion from 3-D into 2-D pages has to be performed. If you use the algorithms, please reference our paper published at ICDAR 2017. All algorithmic details are discussed there, too. To use the code, you first need to install our CONRAD framework. Afterwards, create a package "book" under tutorials and place these files in it. You may have to adapt the first line so that it refers to the newly created directory.

File description:
Vesselness.java: Implementation of vesselness filtering.
GuidedFilter.java: Implementation of guided filtering (pre-processing).
PageFlattenerNNApproximation.java: Implementation of the algorithm discussed in the ICDAR paper. You just have to adapt the input file name. Once the code has run, the output is displayed in ImageJ and you can save it.
MapBookpagesTo2D.java: The texturing element of the pipeline. Maximum filtering along the page to obtain a 2-D result from the 3-D page.

Results
Here are the algorithm's results for five pages of the three scans. Furthermore, we provide photos of the original pages.
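The texturing step described for MapBookpagesTo2D.java, maximum filtering through a 3-D page to obtain a 2-D image, can be illustrated with a small sketch. This is not the code from that file: it assumes the page has already been flattened into a thin slab indexed as page[z][y][x], with z running through the page thickness, and that ink appears brighter than paper in the reconstruction. The class name and array layout are hypothetical.

```java
/** Maximum projection through the thickness of a flattened page (illustrative only). */
final class MaxProjectionSketch {

    /**
     * Collapses a slab page[z][y][x] into a 2-D image by keeping,
     * for every (y, x) position, the maximum value over all z.
     * Values are 16-bit gray values stored in ints.
     */
    static int[][] maxAlongThickness(int[][][] page) {
        if (page.length == 0) {
            throw new IllegalArgumentException("page slab must contain at least one slice");
        }
        int height = page[0].length;
        int width = page[0][0].length;
        int[][] result = new int[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int max = page[0][y][x];
                for (int z = 1; z < page.length; z++) {
                    max = Math.max(max, page[z][y][x]);
                }
                result[y][x] = max;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Tiny made-up slab: 2 slices of 2 x 2 pixels.
        int[][][] slab = {
            {{1, 5}, {3, 2}},
            {{4, 0}, {3, 9}}
        };
        int[][] image = maxAlongThickness(slab);
        System.out.println(java.util.Arrays.deepToString(image));
    }
}
```

Taking the maximum rather than the mean keeps thin ink strokes visible even when they occupy only one voxel layer of the slab, at the cost of also preserving bright noise.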