- A variety of important machine learning applications require predictions on test data whose characteristics differ from those of the data on which a model was trained and validated. In particular, the test data may have a different relative frequency of positives and negatives (i.e., class distribution) and/or different misclassification costs for false positive and false negative errors (i.e., cost distribution) than the training data. Selecting models built under conditions substantially different from those under which they will be applied is more challenging than selecting models for identical conditions. Several approaches to this problem exist, but they have mostly been studied in theoretical contexts. This paper presents an empirical evaluation, based on Receiver Operating Characteristic (ROC) analysis, of approaches for model selection under class and cost distribution change. The evaluation compares the ROC Convex Hull (ROCCH) method with other candidate approaches for selecting discrete classifiers on several UCI Machine Learning Repository and simulated datasets. Surprisingly, the ROCCH method did not perform well in the experiments, despite being developed for this task. Instead, the results indicate that a reliable approach is to select, from among the discrete classifiers on the ROC convex hull, the model that optimizes the cost metric of interest on the validation data (which has the same characteristics as the training data), weighted by the class and/or cost distributions anticipated at test time.
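The selection rule described above can be sketched in a few lines. This is an illustrative example, not the paper's implementation: the classifier names and operating points below are hypothetical, and each classifier is summarized by its validation (false positive rate, true positive rate) pair. The expected cost of a classifier under an anticipated positive-class prior `pi` and error costs `c_fp`, `c_fn` is `c_fp * (1 - pi) * FPR + c_fn * pi * (1 - TPR)`, and the rule picks the classifier minimizing it.

```python
def expected_cost(fpr, tpr, pi, c_fp, c_fn):
    """Expected misclassification cost at anticipated test conditions.

    pi   -- anticipated fraction of positives at test time
    c_fp -- cost of a false positive
    c_fn -- cost of a false negative
    """
    return c_fp * (1.0 - pi) * fpr + c_fn * pi * (1.0 - tpr)

def select_classifier(classifiers, pi, c_fp=1.0, c_fn=1.0):
    """Pick the classifier minimizing expected cost.

    classifiers -- dict mapping name -> (validation FPR, validation TPR)
    """
    return min(classifiers,
               key=lambda name: expected_cost(*classifiers[name],
                                              pi, c_fp, c_fn))

# Hypothetical validation operating points (FPR, TPR) on the convex hull.
models = {"A": (0.1, 0.6), "B": (0.3, 0.9), "C": (0.5, 0.95)}

# Balanced test conditions favor a mid-hull classifier ...
print(select_classifier(models, pi=0.5))  # -> B
# ... while a rare positive class shifts the choice toward low FPR.
print(select_classifier(models, pi=0.1))  # -> A
```

Note that the (FPR, TPR) pairs are still measured on validation data; only the weighting reflects the anticipated test-time class and cost distributions.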