BIT - September 2008
Janich&Klass: A scanner with a special mission
The Stasi Puzzle
16,250 sacks of torn documents from the Ministry for State Security – a gigantic puzzle that human hands cannot manage.
Scientists at the Fraunhofer Institute for Production Systems and Design Technology (IPK) in Berlin are automatically reassembling parts of this mountain of snippets in a pilot project.
Using a new scanning technology from Janich&Klass, arvato services first digitises both sides of each snippet in high quality and at high speed.
To preserve the secrets of the regime, files in the former Ministry for State Security were systematically destroyed between autumn 1989 and January 1990. The quantity of documents was so enormous that the shredders failed, and a large part of the documents had to be torn up by hand. An estimated 40 million DIN A4 pages were torn into 8 to 30 pieces each. So far, only a small part of these documents has been reconstructed, because manual reassembly is very time-consuming: it would take 30 people 600 to 800 years to assemble the approximately 600 million scraps of paper by hand.
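The scale of the manual task can be checked with a little arithmetic. The short sketch below uses only the figures quoted above; the number of working days per year is an assumption made for illustration and does not come from the project.

```python
# Figures quoted in the article.
PAGES = 40_000_000
PIECES_PER_PAGE = (8, 30)          # fewest and most pieces per page
SCRAPS = 600_000_000               # estimated total number of scraps
PEOPLE = 30
YEARS = (600, 800)                 # estimated duration of manual assembly

# Assumption for illustration only: working days per person and year.
WORKING_DAYS_PER_YEAR = 220

# Possible range of scraps implied by 8 to 30 pieces per page.
scrap_range = tuple(PAGES * n for n in PIECES_PER_PAGE)

# Average number of pieces per page implied by the 600 million estimate.
average_pieces = SCRAPS / PAGES

# Total effort in person-years for the short and the long estimate.
person_years = tuple(PEOPLE * y for y in YEARS)

# Scraps each person would have to place per working day.
scraps_per_day = tuple(
    SCRAPS / (py * WORKING_DAYS_PER_YEAR) for py in person_years
)

print(f"Scraps: {scrap_range[0]:,} to {scrap_range[1]:,}")
print(f"Average pieces per page: {average_pieces:.0f}")
print(f"Effort: {person_years[0]:,} to {person_years[1]:,} person-years")
print(f"Per person and day: {scraps_per_day[1]:.0f} to {scraps_per_day[0]:.0f}")
```

The estimate of 600 million scraps therefore corresponds to an average of 15 pieces per page, which lies within the stated range of 8 to 30.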
Researchers at the Fraunhofer IPK are currently developing a computer-assisted process that will be able to assemble the snippet puzzle faster and more efficiently in the future and thus enable a timely evaluation of the documents.
For the Stasi documents to be reconstructed in the automated process, the puzzle snippets must first be digitised on both sides. Scanners developed by Janich&Klass specifically for this task are used for this purpose.
About the Fraunhofer IPK
The Fraunhofer Institute for Production Systems and Design Technology IPK conducts applied research and development on future-oriented technologies for factory production processes and for other fields of application. The IPK employs around 210 people. (www.ipk.fraunhofer.de)
“Virtual puzzling follows the logic of manual puzzling,” explains Dr Bertram Nickolay, head of the security technology department at Fraunhofer IPK. To solve this game of patience, humans rely on a variety of features to decide whether two pieces fit together: the shape of the pieces, for example, as well as the colour or writing visible on them. This pre-selection makes it easier to find matching pieces. “The virtual puzzle process also starts this way,” says Nickolay.
“The system calculates various descriptive features such as shape or texture to reduce the search space. Within this smaller set, the actual reconstruction takes place.” For this, snippets are compared along their contours for matches. If matching parts are found, they are combined into a larger document. Then the process starts all over again. Snippet by snippet, page after page of the Stasi files is created.
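The two stages Nickolay describes, a coarse pre-selection by features followed by a comparison along the contours, can be illustrated with a small sketch. This is not the Fraunhofer IPK software: the `Snippet` structure, the two coarse features (paper colour and ruling), the sampled edge profiles and the tolerance value are all hypothetical simplifications chosen to keep the example short.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Dict, List, Optional, Tuple


@dataclass
class Snippet:
    """Hypothetical, heavily simplified description of one scanned scrap."""
    name: str
    paper_colour: str                  # coarse feature, e.g. "white"
    ruled: bool                        # coarse texture feature
    left_edge: Optional[List[float]]   # tear offsets in mm, None = page border
    right_edge: Optional[List[float]]


def coarse_key(s: Snippet) -> Tuple[str, bool]:
    """Stage 1: features used to shrink the search space."""
    return (s.paper_colour, s.ruled)


def contour_gap(a: Snippet, b: Snippet) -> float:
    """Stage 2: mean gap in mm when a's right edge meets b's left edge.

    Two torn edges fit when their profiles differ only by a constant shift.
    """
    if not a.right_edge or not b.left_edge:
        return float("inf")
    if len(a.right_edge) != len(b.left_edge):
        return float("inf")
    diffs = [r - l for r, l in zip(a.right_edge, b.left_edge)]
    shift = sum(diffs) / len(diffs)
    return sum(abs(d - shift) for d in diffs) / len(diffs)


def find_matches(snippets: List[Snippet],
                 tolerance: float = 0.5) -> List[Tuple[str, str]]:
    """Compare contours only among snippets that share coarse features."""
    groups: Dict[Tuple[str, bool], List[Snippet]] = {}
    for s in snippets:
        groups.setdefault(coarse_key(s), []).append(s)
    matches = []
    for group in groups.values():
        for a, b in permutations(group, 2):
            if contour_gap(a, b) <= tolerance:
                matches.append((a.name, b.name))
    return matches


snippets = [
    Snippet("A", "white", False, None, [0.0, 1.0, 3.0, 2.0]),
    Snippet("B", "white", False, [5.0, 6.0, 8.0, 7.0], None),
    Snippet("C", "yellow", False, [5.0, 6.0, 8.0, 7.0], None),
    Snippet("D", "white", False, [5.0, 5.0, 5.0, 5.0], None),
]
matches = find_matches(snippets)
print(matches)
```

In this toy data, snippet C has the same torn contour as B but a different paper colour, so the pre-selection never compares it with A. A real system would merge each matched pair into a larger piece and repeat the search, as the text describes.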
The pilot project for the automated reconstruction of the Stasi files runs for two years and covers 400 sacks. The pilot phase is primarily concerned with gaining knowledge about the procedures and technology of the reconstruction process. The client is the Federal Commissioner for the Records of the State Security Service of the Former German Democratic Republic (BStU).
About arvato services
Arvato Services, a subsidiary of Arvato AG, offers effective customer communication and supply chain management services in Europe and non-European markets. Around 28,000 people work for the company worldwide. (www.arvato-services.com)
Sisyphos 07: Technical data
The scanner captures both sides at a resolution of 300 dpi, uncompressed, with a colour depth of 24 bits. It processes paper scraps from 0.5 cm x 0.5 cm to 28 cm x 50 cm at a speed of 27 cm per second; this corresponds to an effective throughput of up to 500 sq cm or 40 MB per second. The straight belt transport is optimised for irregular documents; moreover, the transport system is free of rollers and of edges that scraps could catch on. From prototype 02 onwards, crumpled snippets are illuminated without shadows. To avoid image distortions caused by dust and dirt, the paper guide is glassless. Snippets lying next to each other are automatically separated.
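The stated data rate can be cross-checked from the other figures in the technical data. The following sketch assumes that the 40 MB per second refers to raw, uncompressed image data for both sides; the function name is chosen for illustration.

```python
DPI = 300                 # scan resolution from the technical data
BYTES_PER_PIXEL = 3       # 24-bit colour depth
SIDES = 2                 # both sides are scanned
CM_PER_INCH = 2.54


def raw_bytes_per_second(area_cm2_per_s: float) -> float:
    """Uncompressed image bytes produced per second for a scanned area rate."""
    pixels_per_cm2 = (DPI / CM_PER_INCH) ** 2
    return area_cm2_per_s * pixels_per_cm2 * BYTES_PER_PIXEL * SIDES


rate = raw_bytes_per_second(500)      # 500 sq cm per second
rate_mb = rate / 1_000_000            # decimal megabytes
rate_mib = rate / 2**20               # binary megabytes

# Average paper width that 500 sq cm/s implies at 27 cm/s transport speed.
effective_width_cm = 500 / 27

print(f"{rate_mb:.1f} MB/s, {rate_mib:.1f} MiB/s")
print(f"Effective width: {effective_width_cm:.1f} cm")
```

The result is roughly 42 million bytes per second, or just under 40 when counted in binary megabytes, which agrees with the figure quoted. At the maximum width of 28 cm the belt would cover 756 sq cm per second, so the 500 sq cm presumably reflects the gaps and irregular shapes of real snippets.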