Fwd: Goals for next weeks

Eugenio Cano-Manuel Mendoza eugeniocanom at gmail.com
Tue Jul 16 02:45:32 UTC 2013


Sorry Wolodja you're getting this email twice. When I clicked on reply it
didn't send to pkg-clojure.

---------- Forwarded message ----------
From: Eugenio Cano-Manuel Mendoza <eugeniocanom at gmail.com>
Date: Tue, Jul 16, 2013 at 12:43 PM
Subject: Re: Goals for next weeks
To: Wolodja Wentland <debian at babilen5.org>



--------
> lein-xml
> --------
> * wonderful progress!
>
> * placeholder test/leiningen/test/core.clj
>
>     write actual tests or remove this


> * check for file existence before writing to project.xml → warn user if it
>   exists

* maybe extract the project-xml logic in the let clause into a standalone
>   function
>

Agree to all.


> * fix indentation
>
>   - in the reduce step, something like:
>     (reduce
>       (fn [deps d]
>       (conj deps
>             (xml/element :dependency {}
>                          (xml/element :name {} (str (d 0)))
>                          (xml/element :version {} (d 1)))))
>       nil
>       (:dependencies project))))]
>
>     Which editor and indenting scheme do you use? I just use vim with the
>     standard indenting rules (same as on emacs) and typically use indent by
>     two spaces. As the code shifts massively to the left you might want to
>     refactor the reduce step into a separate function.
>

I use vim with 4 spaces. I'm also using the vimclojure plugin but you got
me here, when I wrote this I didn't know how to indent it properly.


>
> lein_makepkg
> ------------
>
> * Don't hardcode path separators. So code like this:
>
>         outfile = 'debian/' + project['package_name'] + '.jlibs'
>
>   should simply use os.path.join():
>
>         os.path.join("debian", "foo.jlibs")
>

Roger that


>
> * String construction with "foo" + "bar" is also a bit ugly and I find
> code along the lines of
>
>     "{} {}".format("foo", "bar")
>
>   much nicer. If you commonly fill these from a dictionary it makes sense
> to
>   simply refer directly to the dictionary with, for example:
>
>      "{foo} {bar}".format(**{"foo": 1, "bar": 2})
>
>   See
> http://docs.python.org/2.7/tutorial/inputoutput.html#fancier-output-formattingfor details
>

Will look it up but I personally don't like how it looks, maybe when I see
it done I will change my mind


> * All the create_* functions are quite similar. It might make sense to
> abstract
>   the functionality into a single function and simply call that.
>

Agree

* No exception handling
>
>   You don't handle any exceptions. It would be nice to fail gracefully if
> the
>   program runs into exceptions, logs (cf. logging module) the failure and
>   informs the user what is wrong.
>
>   This holds true for file operations, but also for a lot of things that
> might
>   throw IndexErrors tup[0] or so.
>
>   What I mean here is that where you have:
>
>     self._foo = parse(open(path))[0]
>
>   it should rather be:
>
>     try:
>         with open(path) as in_file:
>             foo = parse(parse)
>             self._foo = foo[0]
>     except IOError, ioe:
>         _log.error("File {} couldn't be opened".format(path))
>         _log.error(ioe)
>         handle_error(...)
>     except IndexError, indexe:
>         _log.error(...)
>
> This makes the program much more robust and will also allow the user to
> find
> usage errors faster than just random tracebacks.
>

True story. Sometimes I find myself trying to decrypt tracebacks. I planned
to do this later but I guess it makes sense to do it now since the program
is substantially larger.


> * logging
>
>   This isn't really necessary right away, but the logging module *is* quite
>   nice and you should accustom yourself with it.
>

Will do.


> leinpkg/tools
> -------------
>
> * Don't hardcode paths and separators, but use os.path.join
> * Old-Style vs New-Style classes.
>
>   Back in the bad old days (pre Python 2.2) built-in types (list, dict,
> ...)
>   and user defined classes were very different beasts. I'd recommend to
> read
>   http://www.python.org/download/releases/2.2.3/descrintro/ and
>   http://www.python.org/doc/newstyle/ for an introduction to this topic.
> Since
>   then a lot has changed, but one still has to derive new classes
> explicitly
>   from object in order to get new style classes. In short, you shouldn't
> use:
>
>     class Foo:
>
>   but
>
>     class Foo(object):
>
> * config_files -- no need to make this global
>

Agree to all


>
> PropertiesParser / PomParser / ProjectParser
>
>  * old-style classes
>  * class members/variables
>
>    I don't quite understand why you initialise all those class variables
> and,
>    in particular, items. If you *really* have to initialise them then just
> use
>    instance variables (self.foo) and initialise them there.


I did it to make the variables that the methods are expecting more
explicit, but I don't think I understand what you're telling me. Could you
show me an example?


> * get_items()
>
>   I am not too fond of this design decision and you call this method
> exactly
>   twice in lein_makepkg. These classes combined with this method are
>   essentially just a wrapper around a dictionary that has been initialised
>   with certain values. Unfortunately you do this is such a way that the
> actual
>   classes no longer behave like a dictionary but are only a factory method
> for
>   the magical "items" dictionary.
>
>   What you *actually* want in the end is simply a dictionary or *some* data
>   structure that contains the data you use to fill the templates. This data
>   comes from different data sources and you rightfully make this explicit.
>

The data structure that contains the data for the templates is the Project
class. The goal of *Parser is to create a dictionary with some default
values that can be used within the Project class. These values depend on
whether they are being parsed from a POM, clj etc. Now that you mention it
maybe I didn't choose the correct implementation for this particular idea
but then how could I accomplish what I want? I do want these values (which
I called properties) separated from the rest of the logic since there is
really a set of properties inherent to the project we are packaging. Any
ideas?


>   One thing you have to decide is how you want to access the data. There
> are
>   two ways that you use at the moment:
>
>     foo.bar (property/attribute)
>
>   or
>
>     foo['bar'] (dictionary key)


Yes, I see a problem there. I'll think of something.


>
>  You get data from multiple sources and essentially have to construct
> suitable
>  dictionaries that you want to hand to jinja. These sources are:
>
>     * global configuration dict (from configuration file in /etc)
>     * user configuration dict (from conf file in ~/)
>     * command line arguments
>     * environment variables
>
>     * project.clj
>     * project.pom
>
>   Please make each of these sources explicit and it is also of utmost
>   important to have one place in which we model which setting can be set
> where
>   and which setting can overwrite other. Lets discuss this later.
>

Ok =)



>
> General Design - Data
> ---------------------
>
> We have to think about the general design and should make it explicit what
> can
> be set where. Generally speaking what we want to do is:
>
>     1. Gather data from multiple sources (see above)
>     2. Combine/Merge data according to *explicit* rules
>     3. Hand data to the templating engine
>
> What I would like you to do right now is to compile a table such as:
>
> TEMPLATE_ENTRY          SOURCE (earlier overrides later)
> --------------------------------------------------------
>
> maintainer_email        DEBEMAIL (env), EMAIL (env), .. (see dch)
>
> That should give us some idea on how to merge/construct the final
> dictionaries.
>

I agree on the 3 steps that you mentioned, in fact the current
implementation does that,

1. Gather data from multiple sources (see above): *Parsers (The
OptionParser for options and the PropertiesParser {pom,project}parser for
project properties.

2. Combine/Merge data according to *explicit* rules: The Project class
where each rule is implemented with each _get_* method.

3. Hand data to the templating engine: The DebianTemplates.render_write
method.

I understand what you're saying. If we want to be more explicit then maybe
we have to create one single place where the merging is being done. With
the current approach there's merging being done in OptionsParser and
overrides in Project so if we want to change a rule maybe is more
difficult. But if we want to easily add another source for properties or
options how do we do it without modifying the rules that take care of
creating variables for the templates? This sounds like the expression
problem...

By the way here's the table (Based on what the Project class does):

TEMPLATE_ENTRY              SOURCE (earlier overrides later)
------------------------------------------------------------
source_name
prop(project/name)                              [1]
package_name                    cmdline(-p),
prop(project/name)                 [2]
version                                prop(project/version),
cmdline(-v)              [3]
description                          prop(project/description)
itp_bug                               cmdline(--itp),
cmdline(--guess-itp)            [4]
homepage                          prop(project/url)
dependencies
prop(project/dependencies)                      [5]
genfiles                              cmdline(--jar),
prop(project/name)              [6]
fullpath_deps
prop(project/dependencies)                      [7]
use_javahelper                    cmdline(--javahelper)
maintainer_name                cmdline(-m), env(DEBFULLNAME),
conf(maintainer)
maintainer_email                cmdline(-e), env(DEBEMAIL), conf(email)

[1] prop(project/name).replace('.', '-') + '-clojure'
[2] cmdline(-p) _OR_ 'lib' + prop(project/name).replace('.', '-') +
'-clojure'
[3] re.search('\d[.\d]*', prop(project/version).group(0) _OR_ cmdline(-v)
[4] cmdline(--itp) _OR_ if cmdline(--guess-itp): wnpp-check _OR_ 'XXXXXX'
[5] [dep.split('/')[-1] for dep in prop(project/dependencies)] (PARSER)
[6] [cmdline(--jar)] _OR_ [prop(project/name).replace('.', '-') + '.jar']
[7] ['/usr/share/java/' + 'x' + '.jar' for x in [5]]

and titanpad link: http://titanpad.com/OGcNayAacG


> One approach I typically like at least for command line arguments is to
> populate argparse's default values from os.environ (that way environment
> variables can be easily overridden by command line arguments).
>
> I guess that we have to make the merging of options a bit more explicit,
> but
> the important part is that you have *one* place in which you do that. You
> parse all available data individually and then combine them explicitly for
> each option.
>

Ok so one place for everything.


> PomParser / ProjectParser
>
> * remove items class variable
> * Don't parse in __init__ (or rather don't initiate the parsing here)
>   but write an explicit method for the parsing which returns the resulting
>   dictionary.
>
> I don't quite like the design of these and would prefer one of two
> approaches:
>
> 1. Return a (conceptionally static) instance that simply populates
> attributes
>    from a POM file.
>
>    The approach here would be something like:
>
>    class POMParser(object):
>
>     def __init__(self):
>       self._xpaths = {
>         'name': "/pom:project/pom:artifactId/text()",
>         'version': "/pom:project/pom:version/text()",
>         ...
>         }
>       self._ns = {'pom': 'http://maven.apache.org/POM/4.0.0'}
>
>      self.name = None
>      self....
>
>     def parse(self, path):
>         """Parse given ...
>
>         """
>         try:
>             with open(path) as in_file:
>                 self._tree = et.parse(in_file)
>         except IOError, ioe:
>             log_error(...)
>             handle_error(...)
>
>     @property
>     def name(self):
>         return self._tree.xpath(self._xpaths['name'}, namespaces=self._ns)
>     ...
>
ARRGGG why didn't I think of self._xpaths = {} !!?? much cleaner than what
I have.

>
> *OR*
>
> 2. Write an explicit parser method that returns a dictionary
>
>   class POMParser(object):
>
>     def __init__(self):
>       self._xpaths = {
>         'name': "/pom:project/pom:artifactId/text()",
>         'version': "/pom:project/pom:version/text()",
>         ...
>         }
>
>       self._ns = {'pom': 'http://maven.apache.org/POM/4.0.0'}
>
>     def _name(self):
>         return self._tree.xpath(self._xpaths['name'}, namespaces=self._ns)
>
>     def parse(self, path):
>         """Parse given ...
>
>         """
>         try:
>             with open(path) as in_file:
>                 self._tree = et.parse(in_file)
>         except IOError, ioe:
>             log_error(...)
>             handle_error(...)
>
>         return {
>             "name": self._name()
>             "foo: self._foo()
>         }
>
> Or something along those lines. Comments from Paul/Algernon?
>

Hmm I like them both. They are better than the mixed thing we currently
have.


> Project class
> -------------
>
> * don't set class variables
>
> We don't really need this "I contain all the data class" -- Just construct
> dictionaries you hand to jinja explicitly according to the rules.
>
> We should discuss the general design and nail it now.
>

As stated above.

Wolodja thank you so much for your feedback. I'm very happy with the
comments and I find very positive that we have a lot of room for
improvements. I'll be working on the more obvious ones while we get more
feedback from paultag and algernon on the critical ones.

Cheers!
Eugenio
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.alioth.debian.org/pipermail/pkg-clojure-maintainers/attachments/20130716/a9677a32/attachment-0001.html>


More information about the Pkg-clojure-maintainers mailing list