Overview

The objective of this project is to assess whether adding a software tool to a package management system (PMS) significantly affects the tool's adoption. We use citation count as the metric for quantifying adoption.

For this study, we need the following information:

  • Metadata of all the tools hosted by a PMS; the minimum required metadata are:

    • Tool name;
    • Reference to the tool's scholarly paper;
    • Date when the tool was published to the PMS.
  • Citations per year for every scholarly paper. We get this information from Scopus.
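
As an illustration of the information listed above, the record for one tool could be sketched as follows. This is a minimal sketch, not the project's actual data model: the class `Tool`, its field names, the helper `split_citations`, and the sample values are all hypothetical, and counting the year of addition as "after" is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Tool:
    """Hypothetical record for one tool; field names are illustrative only."""
    name: str                 # tool name
    paper_doi: str            # reference to the tool's scholarly paper
    date_added: date          # date the tool was published to the PMS
    # Maps a calendar year to the number of citations received in that year.
    citations_per_year: dict[int, int] = field(default_factory=dict)

    def split_citations(self) -> tuple[int, int]:
        """Return (citations before, citations after) the year of addition.

        Assumption: the year of addition itself is counted as "after".
        """
        added_year = self.date_added.year
        before = sum(c for y, c in self.citations_per_year.items() if y < added_year)
        after = sum(c for y, c in self.citations_per_year.items() if y >= added_year)
        return before, after


# Made-up values, used only to show the shape of the data.
tool = Tool(
    name="example-tool",
    paper_doi="10.0000/example",  # placeholder, not a real DOI
    date_added=date(2016, 5, 1),
    citations_per_year={2014: 3, 2015: 5, 2016: 9, 2017: 14},
)
print(tool.split_citations())  # (8, 23)
```

Keeping the per-year counts together with the date of addition is what makes a before/after comparison possible for each tool.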

Therefore, we first obtain tool metadata from a PMS, then search Scopus for the citation information of each tool's paper, and finally perform statistical analysis to determine whether the data support or reject the hypothesis that adding software to a PMS significantly increases its citation count. Accordingly, the TVQ project consists of multiple components, each performing a distinct task in collecting or studying data. The components are:

  • Offline crawlers; you do not need to run them. They collect information about tools that is resource-expensive to obtain. These crawlers are run by the maintainers of this project, and their output is cached on GitHub to be used by the Webservice.
  • Webservice; this service collects tool metadata and searches Scopus for citation information. It also generates descriptive statistics about the tools and their citation counts, and exports data to be used as input for the analytics scripts.
  • Analytics scripts; these are Python scripts that take the data generated by the Webservice, perform statistical analysis, and report the results in tables and plots.
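
To make the analysis step more concrete, the following sketch compares citation counts before and after tools were added to a PMS. It is not taken from the analytics scripts, and the source does not say which statistical test they use; an exact paired sign-flip permutation test is used here as a stand-in because it needs only the standard library. The function name `paired_permutation_pvalue` and the sample numbers are hypothetical.

```python
from itertools import product


def paired_permutation_pvalue(before, after):
    """One-sided exact sign-flip permutation test on paired differences.

    Null hypothesis: the sign of each (after - before) difference is random,
    i.e. adding a tool to the PMS does not shift its citation count.
    Returns the share of sign assignments whose summed difference is at
    least as large as the observed one. Enumerates 2**n assignments, so it
    is only practical for a small number of tools.
    """
    diffs = [a - b for b, a in zip(before, after)]
    observed = sum(diffs)
    at_least_as_large = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            at_least_as_large += 1
    return at_least_as_large / total


# Made-up citation counts for six tools, for illustration only.
before = [8, 2, 5, 1, 4, 3]
after = [23, 6, 9, 4, 10, 5]
print(paired_permutation_pvalue(before, after))  # 0.015625
```

A small p-value here would count as evidence for the hypothesis that citation counts increase after addition. A real analysis would also need to account for the different number of years observed before and after the date of addition.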

The following figure is an abstract illustration of the components.


Figure: overview of the TVQ components