[Soc-coordination] Debian Teams Activity Metrics - Final Report

Sukhbir Singh sukhbir.in at gmail.com
Mon Aug 22 12:29:32 UTC 2011


DEBIAN TEAMS ACTIVITY METRICS
==

Mentors: Andreas Tille and Scott Howard
Student: Sukhbir Singh

Website: teammetrics.alioth.debian.org
Mailing list: http://lists.alioth.debian.org/pipermail/teammetrics-discuss/
Git repository: git://anonscm.debian.org/teammetrics/teammetrics.git

This is the final report for the Debian Teams Activity Metrics project.


Summary:
--
The Debian Teams Activity Metrics project is an investigation into the
performance of teams in Debian. Over the summer, we worked on
implementing various tools that will help us measure the
performance of teams using metrics such as mailing list activity,
commit statistics from project repositories and package upload records
from the UDD. We will continue our work towards presenting this
information and also incorporating the ideas for the enhancement of
this project which were suggested to us during our talk at DebConf11.


What we had planned to do:
-- 
As mentioned, the aim of this project was to measure the performance of
teams in Debian. Before development began, we outlined metrics that we
thought would give a good estimate (an exact measure is not feasible)
of how teams are performing: for example, which members contribute the
most, how closely knit the team is, and whether all members contribute
equally. Given that there is no definitive measure of this, we decided
on the following metrics:

    -   mailing list activity,
    -   commits to the repository,
    -   package upload records.

Our intention has been not to 'crown someone the king' of a team, but
rather to see how the team is performing as a group.


What we have done:
--
From 2nd August until now, we wrote an mbox filter (which went through
many changes), fixed the old code and added many new features, thus
finalizing the tools presented below. Listing all the changes would
take too much space here, so please run `git log --since='3 weeks ago'`
on our repository to see the complete list.

As of the last day of development (22nd August, 2011), we have
completed the following:

    -   mailing list statistics for lists on Alioth and lists.d.o,
    -   commit statistics for Git and SVN repositories,
    -   package upload records from the UDD.

We have not been able to implement the presentation of information
because we have only just finished the data gathering tools. We are
going to continue working on the presentation, which is the 'readable'
part of this project. Though we have not finished all that we
promised, we consider this a success because the tools are ready and
only the presentation part is left.

Some of the important tools:

    liststat.py
    Downloads mbox archives for the lists on Alioth and parses them

    nntpstat.py
    Fetches messages from Gmane and creates mbox archives that are
    then parsed by liststat.py

    commitstat.py
    Creates an SSH connection to Alioth that calls gitstat.py and
    svnstat.py for Git and SVN repositories

    upload_history.py and maintain_names_prefered.py
    Package upload data from the UDD.


Dead ends during development (only one!):
--
Unavailability of lists.d.o mbox archives

As those of you following this list might be aware, we were
(actually I, in this case) not given access to the mbox archives for
lists.d.o due to privacy concerns pointed out by the list masters.
This involved some lengthy discussions and, as a workaround, I wrote a
script that fetches the messages for a list from Gmane and then
creates an mbox from them. During our continued discussions with the
list masters, we were told to write a filter that would remove
specific headers from the mbox archives, after which we would be
provided access to them. We implemented that filter and are waiting
for a reply from them; for the time being, our script which fetches
the messages over NNTP from Gmane is being used.
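
To give an idea of what such a header filter can look like, here is a
minimal sketch using Python's standard mailbox module. It is not the
filter from our repository: the header names in PRIVATE_HEADERS and the
function names are placeholders chosen for illustration, since the
headers requested by the list masters are not listed in this report.

```python
import mailbox

# Hypothetical header names; the set actually requested by the list
# masters is not given in this report.
PRIVATE_HEADERS = ("Received", "X-Originating-IP", "Delivered-To")


def strip_headers(message, headers=PRIVATE_HEADERS):
    """Remove every occurrence of the named headers from a message in place."""
    for name in headers:
        del message[name]  # silently does nothing if the header is absent
    return message


def filter_mbox(src_path, dst_path, headers=PRIVATE_HEADERS):
    """Copy src_path to dst_path, dropping the given headers from each message."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    try:
        for message in src:
            dst.add(strip_headers(message, headers))
        dst.flush()
    finally:
        dst.close()
        src.close()
```

Calling filter_mbox('in.mbox', 'out.mbox') would write a copy of the
archive without those headers and leave the original file untouched.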


DebConf11
--
I attended DebConf11 (yayay!) where Andreas and I gave a talk about
our project to showcase our work and gather feedback and ideas from
the community. I am happy to say that we were given some wonderful
ideas that we will be implementing in our project; we were presented
with actual problems in a team and how our project could potentially
solve them. The feedback during and after our talk was amazing and it
showed that people were excited about our work.

The notes from this talk are available at [0], the presentation PDF at
[1] and the video at [2].


Statistical Data
--
As our project was about measuring the performance of teams, here is
the data for our 'Team Metrics' team:
(Frequency of posting)

          name       | count
    -----------------+-------
     Sukhbir Singh   |   236
     Andreas Tille   |   163
     Scott Howard    |     5

The total number of lines written (excluding blank lines and quotes)
shows that Andreas gives very detailed replies to what I ask:

          name       | sum
    -----------------+------
     Andreas Tille   | 3045
     Sukhbir Singh   | 2688
     Scott Howard    |   51
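
For readers curious how numbers like the two tables above can be
derived, here is a rough sketch. It is not the code of liststat.py: the
function names are made up for this example, and it assumes that a
quoted line is any line whose first non-blank character is '>'.

```python
from collections import Counter


def count_written_lines(body):
    """Count lines in a message body that are neither blank nor quoted."""
    return sum(
        1
        for line in body.splitlines()
        # Assumption: a quoted line starts with '>' after optional whitespace.
        if line.strip() and not line.lstrip().startswith(">")
    )


def list_stats(messages):
    """Return (posts, lines) Counters keyed by sender.

    `messages` is an iterable of (sender, body) pairs.
    """
    posts, lines = Counter(), Counter()
    for sender, body in messages:
        posts[sender] += 1
        lines[sender] += count_written_lines(body)
    return posts, lines
```

The (sender, body) pairs would come from parsing the mbox archives; how
multipart messages and signatures are handled is left out here.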

For our Git repository:
(Number of commits)

         name      | count
    ---------------+-------
     Sukhbir Singh |   125
     Andreas Tille |    62
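
A commit count like the one above can be approximated on a local clone
with a few lines of Python. This is only a sketch and not what
commitstat.py or gitstat.py do (those run against the repositories on
Alioth over SSH); the function names are invented for this example and
it assumes that git is installed.

```python
import subprocess
from collections import Counter


def count_commits(author_lines):
    """Count commits per author from an iterable of author names, one per commit."""
    return Counter(name.strip() for name in author_lines if name.strip())


def commits_per_author(repo_path):
    """Run `git log` in repo_path and return a Counter of commits per author."""
    output = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%an"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return count_commits(output.splitlines())
```

commits_per_author('teammetrics').most_common() would then list the
authors of a local clone ordered by their number of commits.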

So you see, we are a good team :)

If you have access to blends.d.n, this data for some teams is
available in the 'teammetrics' database.


What's next:
--
After GSoC ends, we are going to primarily work on presenting the
information and implementing the ideas from our talk. As for the
metrics, they are and will be constantly improved.


End notes:
--
I had a great summer because not only did I enjoy working on my
project but I also went to DebConf where I met all the people from the
Debian community.

Thanks to Andreas for being *very* responsive and patient in answering
my queries, taking out the time to mentor this project and allowing me
to reinvent the wheel! To Scott who helped us through some really bad
dead ends, thanks! From the perspective of a student, when your
mentors are passionate about what they do, it really helps.

If you have any queries about this project, as always, please use the
mailing list to get in touch with us. We would love to hear from you.

See you around!

-- 
Sukhbir Singh


[0] - http://lists.alioth.debian.org/pipermail/teammetrics-discuss/2011-August/000245.html
[1] - http://people.debian.org/~tille/talks/20110729-gsoc-teammetrics/liststats.pdf
[2] - http://meetings-archive.debian.net/pub/debian-meetings/2011/debconf11/high/712_Measuring_Team_Performance.ogv


