Mr Samuel Goebert
CSCAN Network Research Student
Distributed preservation and synchronization of Open Data between untrusted machines
The Internet is changing from a web of documents to a web of data. Open data initiatives such as Wikipedia, the Internet Archive, Stack Exchange and OpenStreetMap have become important sources of global knowledge. Although these projects are built mainly by volunteers, the quality of their data has reached, and in some cases exceeded, that of proprietary offerings. Open data is freely available and everybody is invited to contribute, yet update operations have to happen on the main database or they will be ignored: the data is locked in a centralized architecture. Current efforts to archive open data are based on mirroring offsite copies of the main database. This prevents the raw binary data stream from being lost, but the copies are disconnected from the original and must be updated at regular intervals to reflect the state of the main database. Direct editing of a mirror is possible but creates a fork, because update operations are not synchronized back to the main project. With many forks of the same data, it becomes difficult to determine the leading fork that should be preserved and contributed to.
The MPhil stage will focus on research into distributed systems and consensus-finding algorithms. Existing algorithms are not suited to synchronizing data between untrusted machines: they assume that all parties are known and share the same trust level. The research at this stage will therefore concentrate on a decentralized protocol that synchronizes data between untrusted machines. Synchronizing the data in a multi-master database replication setup enables user contributions from multiple sources. This will lead to a specification of a peer-to-peer protocol that enables anonymous contributions in a formalized way.
Director of studies: Prof. Dr Bettina Harriehausen-Mühlbauer
Other supervisors: Prof. Dr Christoph Wentzel, Prof. Steven M Furnell
Towards A Unified OAI-PMH Registry
Decentralized Hosting and Preservation of Open Data
A non-proprietary RAID replacement for long term preservation systems
3 Conference papers