Tesseract Write Up

 

Demo:

You can access a demo of Tesseract at:
http://crc.maccherone.com/tesseract/mpv.html


Description of the tool:

Tesseract is designed to provide a high-level view of an entire software project and its evolution. Specifically, it combines information about project activities (frequency of commits), file dependencies (logical coupling based on which files have been checked in together), social dependencies (dependencies among developers based on the underlying dependencies among the artifacts they are editing), and bug history.

Tesseract is designed to allow a user to investigate a project through different perspectives that are linked together to present a holistic view of the project.



Figure 1



Tool Walk-through

 

  1. Each entry in the combo list (Fig 1 (a)) represents a GNOME project. Select a project (e.g., “gnome/rhythmbox”) from the drop-down to populate the other panels.
  2. The search bar (Fig 1 (b)) allows you to search for a specific file name, developer name, or bug text. The matching node or text is highlighted in yellow.
  3. The date slider (Fig 1 (c)) displays the date range for which the project was active. For example, the project “gnome/rhythmbox” had commits between 2002-03-02 and 2007-01-03. The date range also shows the distribution of the number of commits and the communication frequency over that period. By default, the slider covers a six-month period beginning at the project's start date. The file network, developer network, and bug data display information for this time slice. Either slider thumb can be adjusted to show a different start or end date.

 

Network Data:


  1. The file-to-file network (Figure 2) shows the network of interdependent files in the project. Currently, two artifacts are considered coupled when they have been changed and committed together within the selected time range. The graph takes into consideration only changes to code files (i.e., it disregards .gif, .log, .txt, etc.). Hovering over a file node presents a tooltip displaying its file name and highlights its neighbors (by making those nodes darker). Tesseract provides two extra controls that allow the user to fine-tune which commits and file pairs are considered.

                          i.      The numeric combo box (top left in the panel) allows users to specify a threshold on the number of files per commit. For example, if the threshold is set to 10, then commits in which more than 10 files have changed will not be considered. This threshold helps filter out changes to a large set of files that might have been due to a licensing or authorship change. It also helps keep the file network graph scalable.

                        ii.      The numeric stepper (top right in the panel) allows the user to specify how many times two files have to be checked in together for them to be deemed coupled.


Figure 2: File network
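
The file coupling rule and its two controls can be illustrated with a short Python sketch. This is not Tesseract's actual implementation: the function name, the parameter names, and the list of non-code extensions are assumptions made for illustration, and the sketch assumes that the files-per-commit threshold is applied to all files in a commit before non-code files are dropped.

```python
from collections import Counter
from itertools import combinations

# Assumed list of non-code extensions; the write-up names these three as examples.
NON_CODE_EXTENSIONS = (".gif", ".log", ".txt")


def file_coupling(commits, max_files_per_commit=10, min_co_commits=1):
    """Count how often each pair of code files was committed together.

    commits: iterable of file-name lists, one list per commit in the
             selected time range (hypothetical input shape).
    Returns a dict mapping (file_a, file_b) to the co-commit count,
    keeping only pairs seen together at least min_co_commits times.
    """
    pair_counts = Counter()
    for files in commits:
        unique_files = set(files)
        # Skip very large commits (e.g., licensing or authorship changes).
        # Assumption: the threshold counts every file in the commit.
        if len(unique_files) > max_files_per_commit:
            continue
        # Keep code files only; sort so each pair has one canonical order.
        code_files = sorted(
            f for f in unique_files
            if not f.lower().endswith(NON_CODE_EXTENSIONS)
        )
        pair_counts.update(combinations(code_files, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_co_commits}
```

Raising min_co_commits corresponds to the numeric stepper (fewer, stronger edges), while lowering max_files_per_commit corresponds to the numeric combo box (fewer commits considered).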



  2. The developer-to-developer network (Figure 3) displays the congruence in the social network of the project. Congruence is defined as a match between the coordination requirements and the coordination behavior of a team, where developers who are working on interdependent artifacts are expected to coordinate with each other. We calculate coordination requirements based on the methodology developed by Cataldo et al. [], where the developer-to-developer dependency is calculated from the underlying logical coupling among the artifacts (i.e., files that have been committed together). The communication behavior in the project is based on communication activities in the mailing lists and the bug database. Specifically, when developers participate in the same email discussion, comment on the same bug/issue in the Bugzilla database, or work on the same bug/issue, they are considered to have communicated with each other. This communication link is then compared with the coordination requirement link to calculate congruence. When the communication link matches the coordination requirement link, the edge between the two nodes is colored “green”. When a coordination requirement exists but the communication link is missing, the edge is colored “red”, representing a gap. When there is an extra communication link (i.e., two developers have communicated but have not worked on coupled artifacts), the edge is colored “grey”. The developer network panel provides two controls.

                          i.            Developers may first discuss a bug or feature before editing the files concerning that bug/feature and committing them to the repository. To take such discussions into consideration, the numeric stepper (top left of the panel in Figure 3) allows the user to select the “number of days” prior to the editing of the files during which the discussions occurred.

                        ii.            The communication selection checkboxes (top right of the panel in Figure 3) allow users to select which communication channels (email, bug activity, bug comment) are used for the congruence calculation.


 Figure 3: Developer network
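
The edge coloring for the developer network can be sketched in Python as follows. This is a simplified illustration, not the method of Cataldo et al. and not Tesseract's code: all function names and input shapes are hypothetical, a coordination requirement is assumed to exist whenever two developers edited the same file or two coupled files, and the “number of days” control is assumed to mean a look-back window ending at the edit date.

```python
from datetime import date, timedelta
from itertools import combinations


def coordination_requirements(dev_files, coupled_pairs):
    """Developer pairs who edited the same file or two coupled files (assumed rule).

    dev_files: dict mapping developer name to the set of files they edited.
    coupled_pairs: iterable of (file_a, file_b) pairs deemed coupled.
    """
    coupled = {frozenset(pair) for pair in coupled_pairs}
    required = set()
    for (dev_a, files_a), (dev_b, files_b) in combinations(sorted(dev_files.items()), 2):
        if any(fa == fb or frozenset((fa, fb)) in coupled
               for fa in files_a for fb in files_b):
            required.add(frozenset((dev_a, dev_b)))
    return required


def recent_communication(events, edit_date, lookback_days, channels):
    """Developer pairs who communicated on a selected channel inside the window.

    events: iterable of (dev_a, dev_b, channel, day) tuples (hypothetical shape).
    """
    window_start = edit_date - timedelta(days=lookback_days)
    return {frozenset((a, b)) for a, b, channel, day in events
            if channel in channels and window_start <= day <= edit_date}


def classify_edges(required, communicated):
    """Color each edge: green = match, red = gap, grey = extra communication."""
    colors = {}
    for edge in required | communicated:
        if edge in required and edge in communicated:
            colors[edge] = "green"
        elif edge in required:
            colors[edge] = "red"
        else:
            colors[edge] = "grey"
    return colors
```

Passing a different set of channels to recent_communication plays the role of the communication checkboxes, and lookback_days plays the role of the numeric stepper.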





NOTES

 

Bug Data:

For the time range selected by the date slider (Fig 1 (c)), bug information is shown if (1) the bug was opened during this time period or (2) an open bug was closed during this period:

  1. The stacked area chart (Fig 1 (f)) displays the number of open bugs in the selected time range, classified (and colored) according to bug severity.
  2. The table (Fig 1 (f)) provides further information on the bugs shown in the stacked area chart. The value that a user provides when reporting a bug is called “priority”; the value assigned by the project's core developers is called “severity”. The state of the bug as reported by the developer working on it is shown as “status”, and the final decision about how the bug was resolved is shown as “resolution”.

                    i.            The user can use the checkboxes (Fig 1 (f)) to filter which bugs are displayed based on bug severity. Enhancement requests are included in this list because they appear in the bug database.

                  ii.            Clicking on a particular bug in the table selects, in the developer-to-developer network, the developer to whom the bug was assigned (colored yellow). Developers who communicated regarding that bug (bug activity or comment) can be found by hovering over that developer node; they are shown in a deeper color.
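
The rule for which bugs are shown, together with the severity checkboxes, can be sketched in Python. This is only an illustration under stated assumptions: the function name and the dictionary keys ("opened", "closed", "severity") are hypothetical and do not reflect Tesseract's data model or the Bugzilla schema.

```python
from datetime import date


def bugs_in_range(bugs, start, end, severities=None):
    """Select bugs opened in [start, end] or closed in [start, end].

    bugs: iterable of dicts with hypothetical keys "opened" (date),
          "closed" (date or None), and "severity" (str).
    severities: optional set of severities to keep (the checkboxes);
                None means no severity filtering.
    """
    selected = []
    for bug in bugs:
        opened_in_range = start <= bug["opened"] <= end
        closed = bug.get("closed")
        closed_in_range = closed is not None and start <= closed <= end
        if not (opened_in_range or closed_in_range):
            continue
        if severities is not None and bug["severity"] not in severities:
            continue
        selected.append(bug)
    return selected
```

Note that under this rule a bug opened before the range and still open at its end is not selected; whether Tesseract treats such bugs differently is not stated here.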