A Study of Feature Scattering in the Linux Kernel

Name: A Study of Feature Scattering in the Linux Kernel
Creator: Mukelabai Mukelabai
License: https://creativecommons.org/licenses/by/4.0/
Keywords: Other

Citation Author(s):

    Leonardo Passos (University of Waterloo, Electrical and Computer Engineering, Waterloo, ON, CAN)
    Rodrigo Queiroz (University of Waterloo, Electrical and Computer Engineering, Waterloo, ON, CAN)
    Mukelabai Mukelabai (Goteborgs Universitet, Computer Science and Engineering, Lindholmsplatsen 1, Goteborg, Vastra Gotaland, SE 412 96)
    Thorsten Berger (Goteborgs Universitet, Computer Science and Engineering, Goteborg, Vastra Gotaland, SE)
    Sven Apel (University of Passau, Department of Informatics and Mathematics, Innstr. 33, Passau, Bavaria, DE 94032)
    Krzysztof Czarnecki (University of Waterloo, Computer Science, Waterloo, ON, CAN)
    Jesús Padilla (SAP, Waterloo, ON, CAN)

Submitted by: Mukelabai Mukelabai
Last updated: Wed, 12/12/2018 - 15:02
DOI: 10.21227/aswj-q655
Data Format: ZIP (binary, txt and csv files), SQL, PDF, R
Links: Study's online appendix

Abstract

Feature code is often scattered across a software system. Scattering is not necessarily bad if used with care, as witnessed by systems with highly scattered features that evolved successfully. Feature scattering, often realized with a pre-processor, circumvents limitations of programming languages and software architectures. Unfortunately, little is known about the principles governing scattering in large and long-living software systems. We present a longitudinal study of feature scattering in the Linux kernel, complemented by a survey with 74 and interviews with nine Linux kernel developers. We analyzed almost eight years of the kernel's history, focusing on its largest subsystem: device drivers. We learned that the ratio of scattered features remained nearly constant and that most features were introduced without scattering. Yet, scattering easily crosses subsystem boundaries, and highly scattered outliers exist. Scattering often addresses a performance-maintenance tradeoff (alleviating complicated APIs) and hardware design limitations, and it avoids code duplication. While developers do not consciously enforce scattering limits, they do improve the system design and refactor code, thereby mitigating pre-processor idiosyncrasies or reducing pre-processor use.

Instructions:

# A Study of Feature Scattering in the Linux Kernel

# Scattering database
Our longitudinal study is based on a feature-oriented analysis of the Linux kernel git repository. Using a custom-made tool (see Infrastructure), we convert the kernel git repository into a relational database, which we make available for download (scatdb_dump.zip).
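
Before loading the dump into a database server, it can help to see which tables it declares. The following is a minimal sketch, assuming the archive contains plain-text SQL with `CREATE TABLE` statements; the actual layout and schema of scatdb_dump.zip are not described here, and the function names are hypothetical.

```python
import re
import zipfile

# Matches "CREATE TABLE [IF NOT EXISTS] name", with optional backticks or
# double quotes around the name. Assumes the statement starts on one line.
CREATE_TABLE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?[`\"]?([\w.]+)[`\"]?",
    re.IGNORECASE,
)


def list_tables(sql_text):
    """Return the table names declared in sql_text, in order of appearance."""
    return CREATE_TABLE.findall(sql_text)


def tables_in_archive(zip_source):
    """Map each archive member to the tables it declares.

    zip_source is a path or a file-like object. Members are streamed line
    by line, so a large dump is never held in memory as a whole.
    """
    result = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            tables = []
            with archive.open(name) as member:
                for raw_line in member:
                    line = raw_line.decode("utf-8", errors="replace")
                    tables.extend(list_tables(line))
            if tables:
                result[name] = tables
    return result
```

Calling `tables_in_archive("scatdb_dump.zip")` on the downloaded file would list the declared tables per member, provided the assumption about the dump format holds.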

# Sample classification and criteria

    Classification of a sample of scattered driver features (scat_grps.ods)
    Classification of all outlier features (outliers.ods)

The procedure for classifying features as infrastructure or platform is documented in criteria.tar.gz.

All documents in this section are compatible with OpenOffice: odt (text documents) and ods (spreadsheets).

# Infrastructure

To create and analyze features in the Linux kernel, we rely on the following tools:

    scat_linux (scat_linux_db.zip): a tool that, given a snapshot of a Linux kernel repository, generates a database with scattering information for Kconfig features (configuration options).
    kconfig_info (kconfig_info.zip): a tool to retrieve information about a single feature.
    A set of helper scripts in R and Bash (scripts.tar.gz).
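
To illustrate what "scattering information" for a Kconfig feature can look like, here is a minimal sketch. It assumes a feature's scattering degree is the number of conditional-compilation directives (`#if`, `#ifdef`, `#ifndef`, `#elif`) that reference its `CONFIG_` symbol. This is an illustrative simplification, not the scat_linux implementation; the function name is hypothetical.

```python
import re
from collections import Counter

# A conditional-compilation directive; the captured group is its condition.
DIRECTIVE = re.compile(r"^\s*#\s*(?:if|ifdef|ifndef|elif)\b(.*)$")
# A Kconfig symbol as it appears in C code, e.g. CONFIG_USB.
CONFIG_SYMBOL = re.compile(r"\bCONFIG_([A-Za-z0-9_]+)\b")


def scattering_degrees(source_text):
    """Count, per feature, the directives that reference its CONFIG_ symbol.

    Simplifications (assumptions of this sketch): directives continued with
    a trailing backslash are not joined, CONFIG_FOO_MODULE is treated as its
    own feature, and run-time checks outside directives are ignored.
    """
    degrees = Counter()
    for line in source_text.splitlines():
        match = DIRECTIVE.match(line)
        if match is None:
            continue
        # A feature named twice in one directive still counts once.
        for feature in set(CONFIG_SYMBOL.findall(match.group(1))):
            degrees[feature] += 1
    return degrees
```

Summing the per-file counters over all files of a kernel snapshot would give one scattering value per feature under this assumed definition.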

# Survey and Interviews

A summary report of the survey data is in the survey folder, and the interview guide is in the interviews folder.

# Contact

In case of any problem, please contact one of the following:

    Leonardo Passos (lpassos at gsd dot uwaterloo dot ca)
    Rodrigo Queiroz (rqueiroz at gsd dot uwaterloo dot ca)
    Mukelabai Mukelabai (mukelabai dot mukelabai at gu dot se)
    Thorsten Berger (thorsten dot berger at cse dot gu dot se)
    Sven Apel (apel at uni-passau dot de)
    Krzysztof Czarnecki (kczarnec at gsd dot uwaterloo dot ca)
    Jesús Padilla (jesalepad at gmail dot com)