
Bamboo Scroll Dataset

News
20.4.2018    

The first release of the bamboo scroll X-ray CT dataset is now publicly available. The dataset consists of two CT scans (clean/dirty). In the future, we will extend this database and also provide our algorithms as a baseline.

Download Dataset

This dataset was created for the DAS 2018 Workshop. If using this dataset results in any publications or reports, we kindly ask you to cite the following publication:

Stromer D., Christlein V., Zippert P., Helmecke E., Hausotte T., Huang X. and Maier A. "Non-Destructive Digitization of Soiled Historical Chinese Bamboo Scrolls". In Proceedings of the 13th International Conference on Document Analysis and Recognition, IEEE, 2018.

dataset_bamboo.zip

For questions or comments, please feel free to contact Daniel Stromer.

The bamboo scroll. The bamboo scroll was bought in Beijing and consists of 32 bamboo slips (left/right image). One slip has a size of 1.1 mm x 15.7 mm x 0.3 mm (W x H x L). All slips are bound together with a string. Each slip has a different carving depicting drawings or Chinese symbols. The two leftmost and three rightmost slips have drawings and inverted writing, meaning that the area surrounding the symbols is carved into the wood. The drawings and symbols of the inner slips are carved directly into the bamboo. For all scans, the scroll was completely wrapped up.

 

                 


Towards more realism. As such scrolls are mostly found underground, they are heavily contaminated by soil. This makes it a difficult and time-consuming task for conservators to clean, unwrap, and digitize their contents. To recreate this more realistic scenario, we first scanned the wrapped-up, clean, uncontaminated scroll. Afterwards, we heavily contaminated the wrapped-up scroll (left image) with potting soil (including mineral particles, e.g., Ca, Mg, ...) and put it in a plastic bag (right image).

 

      


3-D X-ray Computed Tomography (CT) Scans. To allow a non-invasive digitization, we performed two 3-D X-ray CT scans of the scroll. First, the clean scroll was scanned to prove that digitization is possible. Next, the contaminated scroll in the plastic bag was measured. The scan parameters were the same for both scans, and some of them are included in the respective zip files (Scan_and_volume_info.txt). The scroll was placed upright in the scanner (longest dimension along the rotation axis) so that the cylindrical shape of the scroll guaranteed an equal X-ray beam length through the object, reducing artifacts. Both wood and earth contain cellulose, which makes it difficult to identify scanned carvings that are filled with earth. However, the X-ray attenuation measured in a CT scan depends on both the mass attenuation coefficient of a material and its mass density. The higher the X-ray tube energy, the more the attenuation is governed by the density of the material, and the density of bamboo is much higher than the density of the cellulose particles in the earth. Therefore, we configured a tube voltage of 130 kV to allow differentiation between earth and wood, so that the symbols can be read even if they are filled with earth.
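For reference, the relation between the quantities mentioned above can be written in the standard textbook form. This is the Beer-Lambert law for a monoenergetic beam, not a formula taken from the scan documentation. The symbols are introduced here for illustration only: I_0 is the unattenuated intensity, I the measured intensity along a ray L, \mu the linear attenuation coefficient, \mu/\rho the mass attenuation coefficient, and \rho the mass density.

```latex
% Beer-Lambert law along a ray L, and the split of the linear
% attenuation coefficient into a material term and a density term
I = I_0 \exp\!\left(-\int_L \mu(s)\,\mathrm{d}s\right),
\qquad
\mu = \left(\frac{\mu}{\rho}\right)\rho
```

Two materials with similar composition, and therefore similar \mu/\rho, can still be separated in the reconstructed volume when their densities \rho differ.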

 


Dataset description.

Directory raw_volume_clean:

Volume of the clean scroll scan. Information on the volume is included in the respective .txt file.

 

Directory raw_volume_soiled:

Volume of the soiled scroll scan. Information on the volume is included in the respective .txt file.
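A minimal sketch of how such a volume might be loaded in Python, assuming it is stored as a headerless binary file. This page does not state the file format, dimensions, or data type, so the function name load_raw_volume, the (z, y, x) axis order, and the uint16 default are assumptions; the actual values should be taken from the accompanying .txt file.

```python
import os

import numpy as np


def load_raw_volume(path, shape, dtype=np.uint16):
    """Memory-map a headerless raw volume as an array of the given shape.

    `shape` and `dtype` are not specified on this page. Read them from the
    .txt file shipped with the volume; uint16 is only a placeholder default.
    """
    shape = tuple(int(n) for n in shape)
    expected = int(np.prod(shape)) * np.dtype(dtype).itemsize
    actual = os.path.getsize(path)
    if actual != expected:
        # A size mismatch usually means wrong dimensions, wrong dtype,
        # or a file that has a header after all.
        raise ValueError(
            f"{path}: file has {actual} bytes, but shape {shape} with "
            f"dtype {np.dtype(dtype).name} needs {expected} bytes"
        )
    # mode="r" keeps the file read-only and avoids loading it all into RAM.
    return np.memmap(path, dtype=dtype, mode="r", shape=shape)


# Hypothetical usage (file name and dimensions are placeholders):
# volume = load_raw_volume("raw_volume_clean/volume.raw", shape=(1000, 800, 800))
# middle_slice = np.asarray(volume[volume.shape[0] // 2])
```

The size check is there because a wrong shape or data type would otherwise produce a silently scrambled volume.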

 

Directory segmented_ct_clips_clean:

Manually segmented, cropped slip volumes from the clean scroll scan, in ascending order (1: leftmost, 32: rightmost slip). stitched_2D_result.tif contains the surface-sampled 2-D slips stitched together into the complete scroll.

 

Directory segmented_ct_clips_soiled:

Manually segmented, cropped slip volumes from the soiled scroll scan, in descending order (1: leftmost, 32: rightmost slip). stitched_2D_result.tif contains the surface-sampled 2-D slips stitched together into the complete scroll.
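When iterating over the slip volumes in either directory, a plain alphabetical sort would place slip 10 before slip 2. The following sketch orders files by their slip number instead. The file names used here (slip_1.tif and so on) are hypothetical, since this page does not list the actual names, and the helpers slip_index and sort_slips are illustrative only.

```python
import re


def slip_index(filename):
    """Return the first integer found in a file name, e.g. 'slip_12.tif' -> 12."""
    match = re.search(r"\d+", filename)
    if match is None:
        raise ValueError(f"no slip number found in {filename!r}")
    return int(match.group())


def sort_slips(filenames):
    """Sort slip files numerically, so slip 2 comes before slip 10."""
    return sorted(filenames, key=slip_index)


# Hypothetical file names, for illustration only.
print(sort_slips(["slip_10.tif", "slip_2.tif", "slip_1.tif"]))
```

Because stitched_2D_result.tif also contains a digit, it should be filtered out of the list before sorting.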

 

 

Software

As we are still working on an automatic segmentation of the dataset, we will publish the source code for it around June.