Though mutation analysis is the primary means of evaluating the quality of test suites, it suffers from inadequate standardization. Mutation analysis tools vary based on language, when mutants are generated (phase of compilation), and target audience. Mutation tools rarely implement the complete set of operators proposed in the literature,...
An important challenge in machine learning is to find ways of learning quickly from very small amounts of training data. The only way to learn from small data samples is to constrain the learning process by exploiting background knowledge. In this report, we present a theoretical analysis on the use...
Watering a lawn involves several factors such as laying out the sprinklers and valves, trenching, laying pipes, cables, etc. The primary goal is to ensure uniform watering subject to various constraints. We present some abstract problems involving sprinkler layout and some solutions with theoretical worst case bounds on their performance....
The amount of research data online is growing exponentially, scattered across a multitude of locations and stored in various formats on a wide variety of platforms. The value of metadata, which assists us in determining the relevance, location, and accessibility of data and related research, has become correspondingly more important....
Software Configuration Management (SCM) is the practice of grouping the changing artifacts of software development in order to manage the complexity of modern-day software. This paper provides a roadmap for improving the SCM support process based on a meta-model. I developed a model that defines key practice...
A weakness of many interactive visual programming languages (VPLs) is their static representations. Lack of an adequate static representation places a heavy cognitive burden on a VPL's programmers, because they must remember potentially long dynamic sequences of screen displays in order to understand a previously-written program. However, although this problem...
We believe concreteness, direct manipulation and responsiveness in a visual programming language increase its usefulness. However, these characteristics present a challenge in generalizing programs for reuse, especially when concrete examples are used as one way of achieving concreteness. In this paper, we present a technique to solve this problem by...
Visual Fortran D (VFD) is a graphical tool to assist parallel programmers in specifying data distributions. Its target is Fortran D, an extension to Fortran77 or Fortran90 which supports data parallelism. VFD provides an intuitive framework where the user employs simple, fast graphical manipulations to specify how data is to...
Most designers of object-based languages adopt a reference model of variables without explicit justification, despite its wide-ranging consequences. This paper argues that the traditional container model of variables is more efficient than the reference model, nearly as flexible, and more appropriate to parallel and distributed systems. The topics addressed...
The coverage of a learning algorithm is the number of concepts that can be learned by that algorithm from samples of a given size. This paper asks whether good learning algorithms can be designed by maximizing their coverage. The paper extends a previous upper bound on the coverage of any...
This paper applies learning techniques to make engineering optimization more efficient and reliable. When the function to be optimized is highly non-linear, the search space generally forms several disjoint convex regions. Unless gradient-descent search is begun in the right region, the solution found will be suboptimal. This paper formalizes...
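The dependence on the starting region can be illustrated with a toy problem. The sketch below is not taken from the report: the one-dimensional function `f`, the step size, and the starting points are all assumptions chosen for illustration. The function is convex on two disjoint intervals, and plain gradient descent settles into whichever basin it starts in.

```python
def f(x):
    # Hypothetical objective: convex for x < -0.707 and for x > 0.707,
    # with a non-convex stretch in between.
    return x**4 - 3 * x**2 + x


def grad_f(x):
    # Analytic derivative of f.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, step=0.01, iters=2000):
    # Fixed-step gradient descent from the starting point x0.
    x = x0
    for _ in range(iters):
        x -= step * grad_f(x)
    return x


left = gradient_descent(-2.0)   # starts in the left convex region
right = gradient_descent(2.0)   # starts in the right convex region

print(f"start -2.0 -> x = {left:.3f}, f(x) = {f(left):.3f}")
print(f"start +2.0 -> x = {right:.3f}, f(x) = {f(right):.3f}")
```

Both runs converge, but only the run started on the left reaches the lower of the two minima; the run started on the right stops at a suboptimal one.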
This paper describes efficient methods for exact and approximate implementation of the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This bias is useful for learning domains where many irrelevant features are present in the training data.
We first introduce FOCUS-2, a new algorithm that...
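As a rough illustration of the MIN-FEATURES bias itself (not of FOCUS-2, whose details are elided above), the following sketch exhaustively searches feature subsets in order of increasing size and returns the first one that is consistent with the training data. The function names and the sample data are hypothetical, and the brute-force search is an assumption made for clarity, not the efficient method the abstract refers to.

```python
from itertools import combinations


def is_sufficient(examples, features):
    """Return True if no two examples agree on `features` but differ in label."""
    seen = {}
    for x, label in examples:
        key = tuple(x[i] for i in features)
        # setdefault stores the first label seen for this projection.
        if seen.setdefault(key, label) != label:
            return False
    return True


def min_features(examples):
    """Smallest feature subset consistent with the data (brute force)."""
    n = len(examples[0][0])
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_sufficient(examples, subset):
                return subset
    return None  # identical inputs with different labels: no subset works


# Hypothetical data: label = x0 XOR x2; features x1 and x3 are irrelevant.
examples = [
    ((0, 0, 0, 1), 0),
    ((0, 1, 1, 0), 1),
    ((1, 0, 0, 0), 1),
    ((1, 1, 1, 1), 0),
    ((0, 1, 0, 0), 0),
    ((1, 0, 1, 1), 0),
]

print(min_features(examples))
```

The search is exponential in the number of features in the worst case, which is the motivation for more efficient exact and approximate methods.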
Leda is a strongly typed, compiled, multiparadigm programming language. This paper describes various implementation concerns which arose from the experience of writing a Leda compiler as part of the Leda research team. These include aspects of run-time representation, symbol-table information, and code generation. The paper concentrates on objects and classes....
Control abstraction is the process by which programmers define new control constructs by specifying an ordering of statement execution. Using control abstraction, we can create new control constructs for parallel programming, separate the specification of parallelism and synchronization from the rest of the application code, and vary the parallelism exploited...
A Belief Net is a factored representation for a joint probability distribution over a set of variables. This factoring is made possible by the conditional independence relationships among variables made evident in the sparseness of the graphical level of the net. There is, however, another source of factoring available which...
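The factoring described in the first two sentences can be made concrete with a small example. The sketch below assumes a hypothetical three-variable chain A -> B -> C with made-up probabilities; it is not drawn from the report and does not cover the additional source of factoring the abstract goes on to mention.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain A -> B -> C.
p_a = {True: 0.3, False: 0.7}
p_b_given_a = {
    True: {True: 0.9, False: 0.1},
    False: {True: 0.2, False: 0.8},
}
p_c_given_b = {
    True: {True: 0.5, False: 0.5},
    False: {True: 0.1, False: 0.9},
}


def joint(a, b, c):
    # Because C is conditionally independent of A given B, the joint
    # factors as P(A) * P(B | A) * P(C | B).
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


# The factored form defines a proper distribution over all 8 assignments.
total = sum(joint(a, b, c) for a, b, c in product([True, False], repeat=3))
print(f"sum over all assignments = {total:.6f}")
print(f"P(A=T, B=T, C=T) = {joint(True, True, True):.3f}")
```

The three tables need 1 + 2 + 2 = 5 independent numbers, whereas an unfactored joint table over three binary variables needs 7; the gap widens quickly as sparse networks grow.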
Leda is a newly evolving, strongly typed, compiled multi-paradigm programming language. This paper describes the integration of one of its supported paradigms, the logical (or relational) paradigm, into the language and the current implementation. It also describes implementational aspects of enumerated types and gives a variety of example programs that...
This document defines the language Leda as currently implemented by the authors and Vinoo Cherian. Leda is an evolving research language and readers may wish to consult the bibliography for a variety of papers concerning its raison d'être. Our purpose here is to guide investigators in the use of a...
Heuristics for static scheduling of task graphs using list scheduling techniques have continued to improve by adding real-world factors such as processor speed, network transmission speed, interconnection topology, and link contention considerations to the basic task graph model. Yet, the resulting schedules do not fully model program loops and branches,...