eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations

Automatic Builds of Large Software Repositories

Creative Commons 'BY' version 4.0 license
Abstract

A large number of open-source projects are hosted on the Internet by popular repository sites such as GitHub, SourceForge, and BitBucket. These repositories are becoming more popular and growing in size, and many research projects mine them for valuable information.

Compiling the projects found in these repositories can help identify the good, usable projects. Successful compilation guarantees that the source code is syntactically correct and that all the dependencies of the project are either self-contained or accessible on the Internet.

Projects can be maintained and organized in different ways depending on developer culture and practice. Unfortunately, repositories very often fail to capture the environmental assumptions made by the developers, such as build tools, versions, and the presence of external dependencies. This heterogeneity makes the successful compilation of large numbers of projects a challenging task, as no single solution can be applied to all of them, and it is impractical to manually correct the compilation of every project. We designed several heuristics to maximize the number of projects in a repository that compile successfully.

Sourcerer is an infrastructure for large-scale collection and analysis of open-source code. It crawls open-source Java projects from various sources on the Internet and builds an aggregated repository and database. We used the information found in the database to automatically compile more than 55,000 Java projects in the Sourcerer repository. We propose several general, language-independent heuristics to tackle the most common errors.

Using these heuristics, we were able to build 33.18% of the projects in the repository, which amounts to more than 18,000 Java projects.
