Automating Data Sharing Through Authoring Tools

dataset

posted on 2017-10-29, 16:27 authored by John Kitchin, Ana Van Gulick, Lisa Zilinski

In the current scientific publishing landscape, there is a need for an authoring workflow that easily integrates data and code into manuscripts and that enables the data and code to be published in reusable form. Automated embedding of data and code into published output will enable superior communication and data archiving. In this work, we demonstrate a proof of concept for a workflow, org-mode, which successfully provides this authoring capability and workflow integration. We illustrate this concept in a series of examples for potential uses of this workflow. First, we use data on citation counts to compute the h-index of an author, and show two code examples for calculating the h-index. The source for each example is automatically embedded in the PDF during the export of the document. We demonstrate how data can be embedded in image files, which themselves are embedded in the document. Finally, metadata about the embedded files can be automatically included in the exported PDF, and accessed by computer programs. In our customized export, we embedded metadata about the attached files in the PDF in an Info field. A computer program could parse this output to get a list of embedded files and carry out analyses on them. Authoring tools such as Emacs + org-mode can greatly facilitate the integration of data and code into technical writing. These tools can also automate the embedding of data into document formats intended for consumption.
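The h-index calculation mentioned in the abstract can be sketched briefly. The following is a minimal illustration in Python, not the code embedded in the paper: the function name h_index and the sample citation counts are assumptions made for this example. It uses the standard definition, in which an author has index h if h of their papers have at least h citations each.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts.

    Hypothetical helper for illustration; not taken from the paper's source.
    """
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The paper at this rank still has at least `rank` citations.
            h = rank
        else:
            # Counts only decrease from here, so no later rank can qualify.
            break
    return h


# Made-up citation counts, used only to exercise the function.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))  # prints 4
```

Sorting first keeps the logic easy to check by hand: in the sample list, four papers have at least four citations each, but there are not five papers with at least five.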

History

Publisher Statement

This is the author's accepted manuscript version of "Kitchin, J. R., Van Gulick, A. E., Zilinski, L. D. (June 2016). Automating Data Sharing Through Authoring Tools. International Journal on Digital Libraries. 18(2). 93-98. https://doi.org/10.1007/s00799-016-0173-7."

Date

2016-06-11

Keywords

data sharing, embedding, org-mode, authoring

Licence

In Copyright
