The Problem With Data Sharing

I recently read an article on GradHacker discussing the benefits of data sharing, along with a few of the drawbacks. If you are reading this, you are visiting my blog, and if you look towards the top you will notice a tab titled Chaucer Project. The GradHacker article describes the exact balancing act that tab represents: my desire to share not just what I am working on but also my sources and methodology, tempered by the fear of theft and of gaining nothing in return. Now, before I am accused of being so pompous as to believe everyone wants to steal my work, I assure you I used to have no hesitation about posting everything I wrote here, and the first time I was warned about such theft I was all too flattered that someone held my research in such high regard.

My main purpose was to use my blog as a medium for exploration, not to mention a forum for feedback, which was always welcome and for which I was grateful. But then the warnings became a recurring theme, and I began to take them seriously. Yes, the ideas on my blog were not fully developed and my research was far from finished (not to mention often written hastily and unedited), but it was nonetheless all mine, and it held potential for future projects.

Over the past few months I began reducing what I share, including my full list of resources. While the research I post here will not be part of my dissertation, and will probably not yield more than a few articles at best, in a field where publication is of the utmost importance, losing even the smallest opportunity is borderline terrifying. Yet in the process of withholding I am losing other opportunities: to collaborate, to obtain feedback from strangers with similar if not identical interests, and to improve my work through these interactions.

Others lose out too, as I am not the only one harboring these sentiments. But to believe academia is so cut-throat that sharing with each other will invariably lead to theft of intellectual property dampens an atmosphere that has for centuries thrived on association and partnership. In all honesty, most scholars would not have their articles or books if it were not for those who supported them in some way throughout the process. This brings me to a second drawback of data sharing or, better stated, data mining: the lack of recognition or profit (not monetarily speaking) for producing useful data for others.

Compiling databases or coding manuscripts is simultaneously rewarding and thankless. While I have not been doing this for very long, I understand the importance of resources that are not always available and the difficulty of obtaining them. I know how much labor goes not into researching the manuscripts themselves, but simply into finding them. During the compilation stages of the Chaucer Project I relied heavily on books that categorized the different Chaucerian manuscripts, and I spent numerous hours finding as many of the existing digitized versions as I could. I realized along the way that the manuscripts were scattered across the internet in various forms. For this reason I began my Canterbury Tales Manuscripts Catalogue. What I am endeavoring to do is bring all the links to one place, so that anyone working with the Tales can easily find where the manuscripts are stored, whether they are digitized, and so forth. This is still very much a work in progress, as there are nearly a hundred such manuscripts, but my point is that when this project is complete my only reward will be the satisfaction I derive from it (and perhaps a thank you from a fellow scholar who will no longer have to scour dozens of libraries). This is also why the project is taking so long. While I am extremely dedicated to it and find it a worthwhile endeavor, I recognize that to move my academic career forward such projects need to take a back seat. Fewer people will care about my databases than about the number of articles I have managed to publish. This is not to say that research, publications, and original thoughts are not important (far from it), but rather to demonstrate the pitfalls of undertaking certain altruistic projects. If you are not being financed, or at the very least supported by an academic institution, to complete data sharing projects, the rewards consist solely of a feeling of personal accomplishment and the satisfaction of a job well done.

So while I am busy trying to present, get published, do school work, teach, work in general, and compile databases, sharing somehow gets lost in the shuffle, and merely striving comes to rely on selfish tactics based in withholding. What a shame.