Embedding metadata and other semantics in word processing documents

Sefton, Peter and Barnes, Ian and Ward, Ron and Downing, Jim (2009) Embedding metadata and other semantics in word processing documents. In: DDC 2008: Radical Sharing: Transforming Science?, 1-3 Dec 2008, Edinburgh, Scotland.


This paper describes a technique for embedding document metadata, and potentially other semantic references, inline in word processing documents, which the authors have implemented with the help of a software development team. Several assumptions underlie the approach: it must be available across computing platforms and work with both Microsoft Word (because of its user base) and OpenOffice.org (because of its free availability). Further, the application needs to be acceptable to and usable by users, so the initial implementation covers only a small number of features, which will be extended only after user testing. Within these constraints, the system provides a mechanism not only for encoding simple metadata but also for inferring hierarchical relationships between metadata elements from a ‘flat’ word processing file. The paper includes links to open source code implementing the techniques as part of a broader suite of tools for academic writing. The work addresses tools and software, the semantic web and data curation, and the integration of curation into research workflows, and will provide a platform for integrating work on ontologies, vocabularies and folksonomies into word processing tools.

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: No
Item Status: Live Archive
Additional Information: This paper was presented at the International Digital Curation Conference in Edinburgh in Dec 2008. Jim Downing (University of Cambridge) (presenting) http://www.dcc.ac.uk/events/conferences/4th-international-digital-curation-conference/
Faculty / Department / School: Historic - Australian Digital Futures Institute
Date Deposited: 10 Mar 2010 12:36
Last Modified: 15 Jul 2014 04:16
Uncontrolled Keywords: metadata; wordprocessing; microformat; ICE; Integrated Content Environment
Fields of Research: 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
08 Information and Computing Sciences > 0806 Information Systems > 080604 Database Management
08 Information and Computing Sciences > 0803 Computer Software > 080309 Software Engineering
Socio-Economic Objective: E Expanding Knowledge > 97 Expanding Knowledge > 970108 Expanding Knowledge in the Information and Computing Sciences
URI: http://eprints.usq.edu.au/id/eprint/7051
