Subject: Copyright/Attribution of Material
Question: Dear all, My colleagues and I are hoping to conduct some research for which we need to develop a corpus of research articles. The articles in question need to be transferred into .txt format, and hence we were wondering whether there would be any issues related to copyright/attribution of the material (e.g., we would be copying and pasting the articles into a .txt file for easier use in corpus software). The papers in question are available under a Creative Commons license, and it is stated that it 'allows reuse subject only to the use being non-commercial and to the article being fully attributed'. If anyone can see any issues relating to this, we would very much appreciate your feedback. Best wishes, Alex Marsden
Reply: As someone who has had to worry about the British law of copyright in connexion with corpus linguistics myself, I would say that you are in a remarkably favourable situation and don't seem to have a thing to worry about there. Strictly, the licence wording asks you to identify the documents you use in the publications which emerge from your own research; but this wording was manifestly included with a view to use that relates to the ideas in the documents, and if in your case it would be impractical because you are using dozens or hundreds of documents and extracting purely structural or quantitative data from them, I am sure no-one would object if you failed to list each document individually (you could perhaps give a general indication of where they came from, and indeed you would probably need to do that for your own research purposes). Corpus linguists who work with material not published under these unusual conditions have larger legal issues to contend with! Regards, Geoffrey Sampson
Date: 09-Feb-2013
