[Pkg-gridengine-devel] pkg-gridengine first steps

Michael Banck mbanck at debian.org
Wed Apr 25 16:25:00 UTC 2007


Hi again,

On Wed, Apr 25, 2007 at 11:56:19AM +0200, Michael Banck wrote:
> > > Probably the most pressing right now is figuring out how to run the
> > > client programs without having to set environment variables (their
> > > startup expects SGE_ROOT SGE_QMASTER_PORT and SGE_EXECD_PORT to be set).
> > > Maybe we can hardcode the first one to /var/lib/gridengine (if $SGE_ROOT
> > > is set, it would take precedence, of course), and get the other two from
> > > /etc/services (there's support for this already).  We just need to get
> > > some ports (which?) allocated here.  I've never had to bother with this
> > > in Debian myself, anybody know where to request this?
> > >
> > > The daemons aren't a problem, they could just source
> > > /etc/default/gridengine in their init scripts (to be written) or so.

[...]

> > The problem is that (by my reading of policy) we can't require
> > environment variables to run at all.  So for SGE_ROOT (and presumably
> > SGE_CELL), we need to hardcode some defaults.  

This patch seems to work for me:

--- source/libs/gdi/sge_gdi_ctx.c.orig  2007-04-25 14:49:58.000000000 +0200
+++ source/libs/gdi/sge_gdi_ctx.c       2007-04-25 14:50:23.000000000 +0200
@@ -2007,8 +2007,7 @@
    */
    sge_root = getenv("SGE_ROOT");
    if (sge_root == NULL) {
-      answer_list_add_sprintf(alpp, STATUS_ESEMANTIC, ANSWER_QUALITY_CRITICAL, MSG_SGEROOTNOTSET);
-      DRETURN(AE_ERROR);
+      sge_root = "/var/lib/gridengine";
    }
    sge_cell = getenv("SGE_CELL")?getenv("SGE_CELL"):DEFAULT_CELL;
    sge_qmaster_port = sge_get_qmaster_port();

Or whatever other directory we agree on.  This only applies to the
clients; the server's SGE_ROOT gets set up elsewhere AFAICT, and we could
control that via /etc/default/gridengine as sourced by
/etc/init.d/gridengine-*.
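
For illustration, the fallback in the patch above boils down to the
following stand-alone sketch.  The helper name resolve_sge_root() and the
DEBIAN_SGE_ROOT macro are made up for this example and are not part of the
SGE sources; the default path is just the one proposed here and may still
change.

```c
#include <stdlib.h>

/* Assumed default; the actual directory is still to be agreed on. */
#define DEBIAN_SGE_ROOT "/var/lib/gridengine"

/*
 * Hypothetical helper (not in the SGE sources): return $SGE_ROOT if it is
 * set in the environment, otherwise the compiled-in default.  As in the
 * patch, only an unset variable (NULL) triggers the fallback.
 */
static const char *resolve_sge_root(void)
{
   const char *sge_root = getenv("SGE_ROOT");

   if (sge_root == NULL) {
      sge_root = DEBIAN_SGE_ROOT;
   }
   return sge_root;
}
```

A user who exports SGE_ROOT still gets their own value, so existing
setups keep working; only clients started with no environment at all pick
up the default.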

The next thing clients need is the act_qmaster file, i.e. the qmaster
hostname.  I propose we change the code slightly to check
/etc/gridengine/act_qmaster first and, if that is not available, fall
back to $SGE_ROOT/$SGE_CELL/$COMMON/act_qmaster as now:

--- source/libs/gdi/sge_gdi_ctx.c.orig  2007-04-25 17:20:44.000000000 +0200
+++ source/libs/gdi/sge_gdi_ctx.c       2007-04-25 17:22:15.000000000 +0200
@@ -1615,11 +1615,13 @@
       char err_str[SGE_PATH_MAX+128];
       char master_name[CL_MAXHOSTLEN];

-      if (get_qm_name(master_name, path_state->get_act_qmaster_file(path_state), err_str) == -1) {
-         if (eh != NULL) {
-            eh->error(eh, STATUS_EUNKNOWN, ANSWER_QUALITY_ERROR, MSG_GDI_READMASTERNAMEFAILED_S, err_str);
+      if (get_qm_name(master_name, "/etc/gridengine/act_qmaster", err_str) == -1) {
+         if (get_qm_name(master_name, path_state->get_act_qmaster_file(path_state), err_str) == -1) {
+            if (eh != NULL) {
+               eh->error(eh, STATUS_EUNKNOWN, ANSWER_QUALITY_ERROR, MSG_GDI_READMASTERNAMEFAILED_S, err_str);
+            }
+            DRETURN(NULL);
          }
-         DRETURN(NULL);
       }
       DPRINTF(("(re-)reading act_qmaster file. Got master host \"%s\"\n", master_name));
       /*
--- source/libs/gdi/qm_name.c.orig      2007-04-25 17:37:09.000000000 +0200
+++ source/libs/gdi/qm_name.c   2007-04-25 17:37:27.000000000 +0200
@@ -77,7 +77,7 @@
    }

    if (!(fp=fopen(master_file,"r"))) {
-      ERROR((SGE_EVENT, MSG_GDI_FOPEN_FAILED, master_file, strerror(errno)));
+//      ERROR((SGE_EVENT, MSG_GDI_FOPEN_FAILED, master_file, strerror(errno)));
       if (err_str) {
          sprintf(err_str, MSG_GDI_OPENMASTERFILEFAILED_S , master_file);
       }
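
The lookup order in the patches above could also be written as a generic
"try each candidate in turn" loop.  This is only a rough sketch under my
own assumptions: read_master_name() and find_master_name() are
hypothetical names, not SGE functions, and the real get_qm_name() does
more checking than this.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the first line of `path` into `host`.
 * Returns 0 on success, -1 if the file is missing, unreadable or empty.
 * A missing file is reported silently so the caller can try the next one.
 */
static int read_master_name(const char *path, char *host, size_t len)
{
   FILE *fp = fopen(path, "r");

   if (fp == NULL) {
      return -1;
   }
   if (fgets(host, (int)len, fp) == NULL) {
      fclose(fp);
      return -1;
   }
   fclose(fp);
   host[strcspn(host, "\r\n")] = '\0';   /* strip the trailing newline */
   return host[0] != '\0' ? 0 : -1;
}

/*
 * Try the candidate files in order.  Returns the index of the first file
 * that yielded a hostname, or -1 if none did.
 */
static int find_master_name(const char *const *paths, size_t npaths,
                            char *host, size_t len)
{
   size_t i;

   for (i = 0; i < npaths; i++) {
      if (read_master_name(paths[i], host, len) == 0) {
         return (int)i;
      }
   }
   return -1;
}
```

The caller would pass "/etc/gridengine/act_qmaster" first and the
per-cell path second, and only report an error once both have failed.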

The last thing they need is $SGE_ROOT/$SGE_CELL/$COMMON/bootstrap,
though I am not quite sure why (it could be that they just have one init
routine for everything and don't bother to split it up).  This is
probably the hardest part, as bootstrap probably shouldn't vary among the
machines in a cluster.

We could ship a generic version in gridengine-common and link to it from
$SGE_ROOT/$SGE_CELL/$COMMON/bootstrap, but I don't think this is ideal.
It may be a good interim solution, though.

BTW, I checked in debian/file_structure, where I've outlined a rough
first draft of which files/directories go into which package.  Can you
guys please review it and change it as necessary?


cheers,

Michael


