On Technical Debt in Mathematical Programming: An Exploratory Study

Abstract

The Technical Debt (TD) metaphor describes development shortcuts taken for expediency that cause the degradation of internal software quality. It has served the discourse between engineers and management regarding how to invest resources in maintenance, and it extends into scientific software (the tools, the algorithms, and the analysis conducted with them). Mathematical programming has been considered ‘special purpose programming’, meant to program and simulate particular problem types (e.g., symbolic mathematics through Matlab). Likewise, more traditional mathematical programming has been considered ‘modelling programming’, used to program models by providing the programming structures required for mathematical formulations (e.g., GAMS, AMPL, AIMMS). Because of this, other authors have argued the need to consider mathematical programming as closely related to software development. As a result, this paper presents a novel exploration of TD in mathematical programming by assessing self-reported practices through a survey, which gathered 168 complete responses. This study discovered potential debts manifested through smells, as well as attitudinal causes towards them. Results uncovered a trend to refactor and polish the final mathematical model and to use version control and detailed comments. Nonetheless, we uncovered traces of negative practices regarding Code Debt and Documentation Debt, alongside hints indicating that most TD is deliberately introduced (i.e., modellers are aware that their practices are not the best). We aim to discuss the idea that TD is also present in mathematical programming and that it may hamper the reproducibility and maintainability of the models created. The overall goal is to outline future areas of work that can lead to changing current modellers’ habits and assist in extending existing mathematical programming (both practice and research) to eventually manage TD in mathematical programming.

Publication
In Mathematical Programming Computation


Contributions

Overall, our study uncovered modellers’ tendency to refactor and polish the final model and to use version control and very detailed comments. Nonetheless, we discovered traces of negative practices regarding Code and Documentation Debt (e.g., dead and duplicated code, and outdated or incomplete documentation). We also observed hints that TD appears to be deliberately introduced, with modellers being aware that their practices are not the best; this seems to align with prior findings related to scientific software practices. We also highlight four future areas of work to continue unveiling what TD means for Operations Research (OR). Finally, although the goal may be ambitious, this paper aspires to stimulate reflective thinking and promote a novel line of action and research among OR practitioners in pursuit of two goals: first, achieving better programming habits during model development, and second, approaching SE research in OR programming.


Ethics Declarations

The authors have no relevant financial or non-financial interests to disclose. The authors declare that they have no conflict of interest.

The methodology used in this study was approved by the ANU Human Research Ethics Committee (HREC), with project code 2020-23416-11101.


Replication Package

We provide a partial replication package on Zenodo. It includes the complete survey structure and the email invitation (with the Qualtrics embedded fields). The participant collection sheet used for the convenience sample is shared empty to demonstrate the type of data collected; note that we cannot provide the completed sheet (which included the name, email and affiliation of invited participants) because our Ethical Protocol requires us to protect participants’ identities. This is a problem known as the privacy vs utility paradox (Li et al., 2009), and its study was out of scope for this investigation.

The survey was implemented and distributed in Qualtrics, which provides an advanced WYSIWYG (‘what you see is what you get’) editor for results reporting that also produces plots and summarises data. We therefore used this system to produce the aggregated data. Because the tables were generated through Qualtrics, there is no code or script available for them.

Finally, some additional plots not included in the manuscript are part of the replication package, showing aggregated, unidentifiable data.


Acknowledgements

Open Access funding enabled and organized by CAUL and its Member Institutions.


Citation

@Article{VidoniCunico2022,
author={Vidoni, Melina and Cunico, Maria Laura},
title="{On Technical Debt in Mathematical Programming: An Exploratory Study}",
journal="{Mathematical Programming Computation}",
year={2022},
month={Aug},
day={05},
issn={1867-2957},
doi={10.1007/s12532-022-00225-1},
url={https://link.springer.com/article/10.1007/s12532-022-00225-1}
}


Venue Impact

The following is the venue impact, according to the SCImago Journal Rank (SJR):

SCImago Journal & Country Rank