Goals for next weeks

Mon Jul 15 20:03:37 UTC 2013

On Thu, Jul 11, 2013 at 23:55 +0200, Eugenio Cano-Manuel Mendoza wrote:

> It would be nice if we could set a couple of goals for the next 2 weeks. Last
> time we met we agreed on solving some issues, mostly related to redesigning
> clojurehelper to make it more modular and some other things. They were all
> taken care of last week and this week I've been trying to improve it, build
> tests, etc. 

I've reviewed your code today and am quite happy with the changes. The
structure of the code is much nicer and better maintainable. I did, however,
found a couple of things that we have to discuss or that you have to take care
of.

--------
lein-xml
--------
* wonderful progress!

* placeholder test/leiningen/test/core.clj

    write actual tests or remove this

* check for file existence before writing to project.xml → warn user if it
  exists
* maybe extract the project-xml logic in the let clause into a standalone
  function

* fix indentation

  - in the reduce step, something like:
    (reduce
      (fn [deps d]
      (conj deps
            (xml/element :dependency {}
                         (xml/element :name {} (str (d 0)))
                         (xml/element :version {} (d 1)))))
      nil
      (:dependencies project))))]

    Which editor and indenting scheme do you use? I just use vim with the
    standard indenting rules (same as on emacs) and typically use indent by
    two spaces. As the code shifts massively to the left you might want to
    refactor the reduce step into a separate function.

lein_makepkg
------------

* Don't hardcode path separators. So code like this:

        outfile = 'debian/' + project['package_name'] + '.jlibs'

  should simply use os.path.join():

        os.path.join("debian", "foo.jlibs")

* String construction with "foo" + "bar" is also a bit ugly and I find code along the lines of

    "{} {}".format("foo", "bar")

  much nicer. If you commonly fill these from a dictionary it makes sense to
  simply refer directly to the dictionary with, for example:

     "{foo} {bar}".format(**{"foo": 1, "bar": 2})

  See http://docs.python.org/2.7/tutorial/inputoutput.html#fancier-output-formatting for details

* All the create_* functions are quite similar. It might make sense to abstract
  the functionality into a single function and simply call that.

* No exception handling

  You don't handle any exceptions. It would be nice to fail gracefully if the
  program runs into exceptions, logs (cf. logging module) the failure and
  informs the user what is wrong.

  This holds true for file operations, but also for a lot of things that might
  throw IndexErrors tup[0] or so.

  What I mean here is that where you have:

    self._foo = parse(open(path))[0]

  it should rather be:

    try:
        with open(path) as in_file:
            foo = parse(parse)
            self._foo = foo[0]
    except IOError, ioe:
        _log.error("File {} couldn't be opened".format(path))
        _log.error(ioe)
        handle_error(...)
    except IndexError, indexe:
        _log.error(...)

This makes the program much more robust and will also allow the user to find
usage errors faster than just random tracebacks.

* logging

  This isn't really necessary right away, but the logging module *is* quite
  nice and you should accustom yourself with it.

leinpkg/tools
-------------

* Don't hardcode paths and separators, but use os.path.join
* Old-Style vs New-Style classes.

  Back in the bad old days (pre Python 2.2) built-in types (list, dict, ...)
  and user defined classes were very different beasts. I'd recommend to read
  http://www.python.org/download/releases/2.2.3/descrintro/ and
  http://www.python.org/doc/newstyle/ for an introduction to this topic. Since
  then a lot has changed, but one still has to derive new classes explicitly
  from object in order to get new style classes. In short, you shouldn't use:

    class Foo:

  but

    class Foo(object):

* config_files -- no need to make this global

PropertiesParser / PomParser / ProjectParser

 * old-style classes
 * class members/variables

   I don't quite understand why you initialise all those class variables and,
   in particular, items. If you *really* have to initialise them then just use
   instance variables (self.foo) and initialise them there.

* get_items()

  I am not too fond of this design decision and you call this method exactly
  twice in lein_makepkg. These classes combined with this method are
  essentially just a wrapper around a dictionary that has been initialised
  with certain values. Unfortunately you do this is such a way that the actual
  classes no longer behave like a dictionary but are only a factory method for
  the magical "items" dictionary.

  What you *actually* want in the end is simply a dictionary or *some* data
  structure that contains the data you use to fill the templates. This data
  comes from different data sources and you rightfully make this explicit.

  One thing you have to decide is how you want to access the data. There are
  two ways that you use at the moment:

    foo.bar (property/attribute)

  or

    foo['bar'] (dictionary key)

 You get data from multiple sources and essentially have to construct suitable
 dictionaries that you want to hand to jinja. These sources are:

    * global configuration dict (from configuration file in /etc)
    * user configuration dict (from conf file in ~/)
    * command line arguments
    * environment variables

    * project.clj
    * project.pom

  Please make each of these sources explicit and it is also of utmost
  important to have one place in which we model which setting can be set where
  and which setting can overwrite other. Lets discuss this later.

General Design - Data
---------------------

We have to think about the general design and should make it explicit what can
be set where. Generally speaking what we want to do is:

    1. Gather data from multiple sources (see above)
    2. Combine/Merge data according to *explicit* rules
    3. Hand data to the templating engine

What I would like you to do right now is to compile a table such as:

TEMPLATE_ENTRY          SOURCE (earlier overrides later)
--------------------------------------------------------

maintainer_email        DEBEMAIL (env), EMAIL (env), .. (see dch)

That should give us some idea on how to merge/construct the final
dictionaries.

One approach I typically like at least for command line arguments is to
populate argparse's default values from os.environ (that way environment
variables can be easily overridden by command line arguments).

I guess that we have to make the merging of options a bit more explicit, but
the important part is that you have *one* place in which you do that. You
parse all available data individually and then combine them explicitly for
each option.

PomParser / ProjectParser

* remove items class variable
* Don't parse in __init__ (or rather don't initiate the parsing here)
  but write an explicit method for the parsing which returns the resulting
  dictionary.

I don't quite like the design of these and would prefer one of two approaches:

1. Return a (conceptionally static) instance that simply populates attributes
   from a POM file.

   The approach here would be something like:

   class POMParser(object):

    def __init__(self):
      self._xpaths = {
        'name': "/pom:project/pom:artifactId/text()",
        'version': "/pom:project/pom:version/text()",
        ...
        }
      self._ns = {'pom': 'http://maven.apache.org/POM/4.0.0'}

     self.name = None
     self....

    def parse(self, path):
        """Parse given ...

        """
        try:
            with open(path) as in_file:
                self._tree = et.parse(in_file)
        except IOError, ioe:
            log_error(...)
            handle_error(...)

    @property
    def name(self):
        return self._tree.xpath(self._xpaths['name'}, namespaces=self._ns)
    ...

*OR*

2. Write an explicit parser method that returns a dictionary

  class POMParser(object):

    def __init__(self):
      self._xpaths = {
        'name': "/pom:project/pom:artifactId/text()",
        'version': "/pom:project/pom:version/text()",
        ...
        }

      self._ns = {'pom': 'http://maven.apache.org/POM/4.0.0'}

    def _name(self):
        return self._tree.xpath(self._xpaths['name'}, namespaces=self._ns)

    def parse(self, path):
        """Parse given ...

        """
        try:
            with open(path) as in_file:
                self._tree = et.parse(in_file)
        except IOError, ioe:
            log_error(...)
            handle_error(...)

        return {
            "name": self._name()
            "foo: self._foo()
        }

Or something along those lines. Comments from Paul/Algernon?

Project class
-------------

* don't set class variables

We don't really need this "I contain all the data class" -- Just construct
dictionaries you hand to jinja explicitly according to the rules.

We should discuss the general design and nail it now.

-- 
Wolodja <debian at babilen5.org>

4096R/CAF14EFC
081C B7CD FF04 2BA9 94EA  36B2 8B7F 7D30 CAF1 4EFC
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.alioth.debian.org/pipermail/pkg-clojure-maintainers/attachments/20130715/9e349a5a/attachment.sig>