[Soc-coordination] [GSoC 2014] Metrics Portal project

Stefano Zacchiroli zack at debian.org
Mon Mar 10 14:22:34 UTC 2014


On Sat, Mar 08, 2014 at 06:59:18PM +0200, Nikolay Baluk wrote:
> Hi, Stefano, Stuart and the Debian community!

Hi Nikolay, thanks for your interest in this project! Sorry for the
delay in answering; in the future, to avoid it, please feel free to mail
me and Stuart directly (addresses on the proposal page).

> In our case, we are faced with a very different data. As I understand it,
> we need to be able to show "normal" data from the database in which there
> is an time-value point

Correct.

> In other hand, such data sources as a RDB is a little easier to
> handle.  They already have some scheme so we can just let user specify
> which column is a timestamp and which is value or even do it
> ourselves.

Well, the idea is to use a storage that is specific to the portal, no
matter what the current data storage is. So you can count on an RDB
(lossless), because that's what we want to use.

> $ dmp-client add <metric_id> <value> <timestamp>
>  url: “http://192.168.0.100:4242/dmp-rpc/”,

If I'm getting you right, you're proposing something along these lines
as a push interface from "clients" to the portal. That's a possibility,
yes. I was myself more inclined to have a pull-only interface, where the
portal itself fetches data from what are the clients in your scheme. But
offering both possibilities could certainly be imagined.

The idea is that each metric will be created by adding a manifest for it
to the portal infrastructure. That manifest could include a description
of the update logic and, if it's pull, prescribe an update frequency;
if it's push, the portal just sits and waits for updates.
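
To make the manifest idea concrete, here is a minimal sketch in Python.
Nothing in it is decided: the field names (metric_id, mode,
update_every_s) and the is_due() helper are hypothetical, chosen only to
illustrate "pull with a frequency" versus "push and wait".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetricManifest:
    """Hypothetical per-metric manifest; all field names are assumptions."""
    metric_id: str
    description: str
    mode: str                             # "pull" or "push"
    update_every_s: Optional[int] = None  # only meaningful for pull metrics

    def __post_init__(self):
        if self.mode not in ("pull", "push"):
            raise ValueError("mode must be 'pull' or 'push'")
        if self.mode == "pull" and not self.update_every_s:
            raise ValueError("pull metrics need an update frequency")


def is_due(manifest, last_run_ts, now_ts):
    """Return True if the portal should fetch this metric now."""
    if manifest.mode == "push":
        return False          # push metrics: just sit and wait for updates
    if last_run_ts is None:
        return True           # never fetched yet
    return now_ts - last_run_ts >= manifest.update_every_s
```

A scheduler in the portal could then loop over all manifests and fetch
only those for which is_due() returns True.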

> 2. Allow multiple display formats (adjusted for certain metrics,
> depending on the data type): charts, diagrams, etc.

ACK. Here too, the initial idea was to simply have time series and
simple line-plots. But we can imagine other types of diagrams. Keep in
mind that the higher the number of diagrams you want to support, the
more difficult finding the right abstraction to declare metrics
becomes. We want to optimize for ease in adding metrics... we definitely
do not want to end up reinventing matplotlib or gnuplot! :)

> 3.  In addition to DMP API, collect data as follows:

Nah, it should really be much simpler than this. We do not want to have
in the app the logic to parse different data storages. For push metrics
(*if* we will have them) there is basically nothing to do. For pull
metrics just define an API based on the execution of a metric-specific
script that the person adding the metric should provide. The portal
infrastructure will just have to run the script (which can be added only
by trusted members, so that's fine security-wise) and interpret its
output --- which could be as simple as a list of labeled measures.
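
As a rough sketch of that script-based pull API, assuming (this is not
settled) that each script prints one "label value" pair per line on
stdout, the portal side could be as small as this. The function names
and the output format are hypothetical.

```python
import subprocess


def parse_measures(output):
    """Parse 'label value' lines into a dict; blank and '#' lines are skipped."""
    measures = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.rsplit(None, 1)   # the last field is the value
        if len(parts) != 2:
            raise ValueError("malformed measure line: %r" % line)
        label, value = parts
        measures[label] = float(value)
    return measures


def run_metric_script(argv, timeout_s=60):
    """Run a trusted, metric-specific script and return its labeled measures."""
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout_s, check=True
    )
    return parse_measures(result.stdout)
```

With this shape, adding a metric means writing one script and pointing
the portal at it; the portal needs no knowledge of where the data comes
from.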

> 4. Provide support for real-time graphs for metrics such as "Online right
> now", which is not required to store a lot of data, since it is necessary
> to analyze only the current state at the current moment.

I don't think this is of immediate interest for us. But it could be an
extension.

> 5. DMP API - not only the entry point to send data. It also a way to
> querying it.

That would be good, yes. But as long as we define some standard "naming"
convention for the DB, it should be trivial.
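
For instance, here is a sketch of one possible naming convention, using
SQLite only to keep the example self-contained. The "metric_<id>" table
name and the (ts, label, value) columns are assumptions, not a decided
schema.

```python
import re
import sqlite3

_ID_RE = re.compile(r"[a-z][a-z0-9_]*")


def table_name(metric_id):
    """Map a metric id to its table; ids are validated since they end up in SQL."""
    if not _ID_RE.fullmatch(metric_id):
        raise ValueError("invalid metric id: %r" % metric_id)
    return "metric_" + metric_id


def create_metric(conn, metric_id):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS %s "
        "(ts INTEGER NOT NULL, label TEXT NOT NULL, value REAL NOT NULL)"
        % table_name(metric_id)
    )


def add_measure(conn, metric_id, ts, label, value):
    conn.execute(
        "INSERT INTO %s (ts, label, value) VALUES (?, ?, ?)" % table_name(metric_id),
        (ts, label, value),
    )


def query(conn, metric_id, since, until):
    """Return (ts, label, value) rows with since <= ts < until, oldest first."""
    cur = conn.execute(
        "SELECT ts, label, value FROM %s WHERE ts >= ? AND ts < ? ORDER BY ts"
        % table_name(metric_id),
        (since, until),
    )
    return cur.fetchall()
```

Once every metric lives in a table with a predictable name and layout,
a generic query endpoint reduces to a thin wrapper around query().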

> 6. Maybe provide an event-collector API for event-based data, so DMP
> would be also a open-source alternative of Google Analytics-like
> software.

That's probably far-fetched for us.

> Some feedback will help me to begin work on prototype (not just a
> script that uses matplotlib to graph some metrics, but a web app as a
> proof of concept). I think this project has the potential to become a
> great open-source solution to help Debian and other projects make
> their software better by analysis their stats.

HTH, feel free to ask more questions,
Cheers.
-- 
Stefano Zacchiroli  . . . . . . .  zack at upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader  . . @zack on identi.ca . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »