Scrutinize: Exploring a Project’s Revision History – Video @ CSCW 2008

by Tom on August 23, 2008

Scrutinize is a web-based tool that takes information from a source code repository and presents it in a way that lets project team members learn how the project has been changing and who has made those changes. Try Scrutinize.


Scrutinize on Vimeo.


What Makes a Good Bug Report? – FSE 2008

by Tom on August 1, 2008

In software development, bug reports provide crucial information to developers. However, these reports vary widely in quality. We conducted a survey among developers and users of APACHE, ECLIPSE, and MOZILLA to find out what makes a good bug report. The analysis of the 466 responses revealed an information mismatch between what developers need and what users supply. Most developers consider steps to reproduce, stack traces, and test cases helpful, yet these are also the items users find most difficult to provide. Such insight helps in designing new bug tracking tools that guide users in collecting and providing more helpful information. Our CUEZILLA prototype is such a tool: it measures the quality of new bug reports and recommends which elements should be added to improve that quality. We trained CUEZILLA on a sample of 289 bug reports, rated by developers as part of the survey. In our experiments, CUEZILLA was able to predict the quality of 31-48% of bug reports accurately.

[click to continue...]


Tag Clouds and Shuffle Effect in Apple Keynote

by Tom on July 31, 2008

Lately, I have been using and shuffling tag clouds a lot in my presentations. Here is a YouTube video demonstrating this effect, followed by instructions on how to create the effect in Apple Keynote. Click here to download the source Keynote file.

[click to continue...]


Duplicate Bug Reports Considered Harmful… Really? – ICSM 2008

by Tom on July 5, 2008

In a survey, we found that most developers have experienced duplicate bug reports; however, only a few considered them a serious problem. This contradicts the popular wisdom that bug duplicates are a serious problem for open source projects. In the survey, developers also pointed out that the additional information provided by duplicates helps to resolve bugs more quickly. In this paper, we therefore propose to merge bug duplicates rather than treat them separately. We quantify the amount of information that is added for developers and show that automatic triaging can be improved as well. In addition, we discuss the different reasons why users submit duplicate bug reports in the first place.

[click to continue...]


Towards the Next Generation of Bug Tracking Systems – VL/HCC 2008

by Tom on July 5, 2008

Developers typically rely on the information submitted by end-users to resolve bugs. We conducted a survey on information needs and commonly faced problems with bug reporting among several hundred developers and users of the APACHE, ECLIPSE, and MOZILLA projects. In this paper, we present the results of a card sort on the 175 comments sent back to us by the survey respondents. The card sort revealed several hurdles involved in reporting and resolving bugs, which we present as a collection of recommendations for the design of new bug tracking systems. Such systems could provide contextual assistance, reminders to add information, and, most importantly, assistance in collecting and reporting crucial information to developers.

[click to continue...]


Predicting Software Metrics at Design Time – PROFES 2008

by Tom on June 23, 2008

How do problem domains impact software features? We mine software code bases to relate problem domains (characterized by imports) to code features such as complexity, size, or quality. The resulting predictors take the specific imports of a component and predict its size, complexity, and quality metrics. In an experiment involving 89 plug-ins of the ECLIPSE project, we found good prediction accuracy for most metrics. Since the predictors rely only on import relationships, and since these are available at design time, our approach allows for early estimation of crucial software metrics.
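As a rough illustration of predicting a metric from imports alone, here is a minimal sketch. It is not the method from the paper: it swaps in a simple nearest-neighbor estimate based on Jaccard similarity of import sets. The function names, the package names used as imports, and the size values are all made up for the example.

```python
from statistics import mean


def jaccard(a: set, b: set) -> float:
    """Overlap between two import sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def predict_metric(imports: set, known: list, k: int = 2) -> float:
    """Average the metric of the k known components with the most similar imports."""
    ranked = sorted(known, key=lambda c: jaccard(imports, c[0]), reverse=True)
    return mean(value for _, value in ranked[:k])


# Hypothetical training data: (imports of a component, its size in lines of code).
known = [
    ({"org.eclipse.swt", "org.eclipse.jface"}, 1200),
    ({"org.eclipse.swt", "org.eclipse.ui"}, 1000),
    ({"org.eclipse.core.runtime"}, 300),
]

# Imports of a new component that exists only as a design so far.
new_imports = {"org.eclipse.swt", "org.eclipse.jface", "org.eclipse.ui"}
estimate = predict_metric(new_imports, known)
print(estimate)
```

The point of the sketch is only that the input is a set of imports, which is known before any code is written, so an estimate is available at design time.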

[click to continue...]


Changes and Bugs: Mining and Predicting Development Activities – PhD Defense

by Tom on June 23, 2008


Predicting Defects using Network Analysis on Dependency Graphs – ICSE 2008

by Tom on June 23, 2008


Most frequently cited papers in computer science

by Tom on May 4, 2008

In any conference or journal that is listed in the ACM Portal. Click here for the full list. You can also sort papers by downloads (6 weeks/12 months).

  1. A relational model of data for large shared data banks (Communications of the ACM 1970, 980 citations)
  2. Time, clocks, and the ordering of events in a distributed system (Communications of the ACM 1978, 916 citations)
  3. Graph-based algorithms for Boolean function manipulation (IEEE Transactions on Computers 1986, 911 citations)


Most frequently cited papers in software engineering

by Tom on May 4, 2008

In a conference or journal on “Software Engineering” that is listed in the ACM Portal. Click here for the full list. You can also sort papers by downloads (6 weeks/12 months).

  1. The Model Checker SPIN (TSE 1997, 302 citations)
  2. Foundations for the study of software architecture (SIGSOFT SEN 1992, 302 citations)
  3. A Metrics Suite for Object Oriented Design (TSE 1994, 214 citations)

Here is a breakdown by year (1993-2008). Click on “full list” to see all papers of a year ranked by citation count.
[click to continue...]
