Explanation and solution of bug report #345931
    Mats Erik Andersson 
    ynglingatal at yahoo.se
       
    Thu Sep  7 20:11:22 UTC 2006
    
    
  
   Dear Maintainers of Grub,
 I would like to offer a solution to bug #345931. The
 problem is that the grub-shell of 0.97 breaks stage1
 of version 0.96 and earlier. The damage is so
 severe that stage1 only displays unintelligible
 "shadows" on the screen.
 I am writing this for your information and I intend
 to locate the offending code snippets later.
  Hands-on solution in a test case:
  ---------------------------------
 *  Functional grub 0.95-2004... with Debian Sarge;
 *  Reconfiguring with grub 0.97-4 and 0.97-13,
       using grml-small-0.2 and grml-0.8 respectively,
       both Debian-based:
      device (hd0) /dev/hda
      root (hd0,5)
      setup (hd0)
 *  Now the bootloader is completely broken, with no
      sign of correct booting activity.
 *  Remedy:  write 0x90 to byte 0x004d of the MBR.
    This restores full functionality of grub.
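    As a minimal sketch of this remedy, assuming the MBR has
    first been copied into a 512-byte image (for instance with
    dd), the patch amounts to overwriting a single byte. The
    function name and the file names in the comments are
    hypothetical; nothing here is taken from the grub sources.

```python
MBR_SIZE = 512
PATCH_OFFSET = 0x4D  # byte 0x004d of the MBR, i.e. address 0x7c4d once loaded
NOP = 0x90


def apply_remedy(mbr: bytes) -> bytes:
    """Return a copy of the MBR image with byte 0x4d set to nop (0x90)."""
    if len(mbr) != MBR_SIZE:
        raise ValueError("expected a 512-byte MBR image")
    patched = bytearray(mbr)
    patched[PATCH_OFFSET] = NOP
    return bytes(patched)


# Illustrative usage on an image file (file names are assumptions):
#   with open("mbr.img", "rb") as f:
#       mbr = f.read(MBR_SIZE)
#   with open("mbr-fixed.img", "wb") as f:
#       f.write(apply_remedy(mbr))
```

    Working on a copy rather than on the device itself keeps
    the original sector available if the patch has to be undone.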
 Explanations
 ------------
   Stage1 of grub 0.95 and 0.96 begins with the
   following machine code.
       0x7c00  eb48     jmp  0x7c4a
       0x7c4a  fa       cli
       0x7c4b  80ca00   or dl,0x00
       0x7c4e  .......   a corrective jump instruction
   (For buggy BIOSes the instruction at 0x7c4b will be
    changed to 80ca80  or dl,0x80).
   Observe the final byte 0x00 of that instruction, which
   is different from   nop = 0x90 .
   In contrast, grub 0.97 commences stage1 as
        0x7c00  eb48     jmp short 0x7c4a
        0x7c4a  fa       cli
        0x7c4b  eb07     jmp short 0x7c54
        0x7c4d  f6...    four corrective instructions
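    The addresses in both listings can be checked by hand: a
    'jmp short' is two bytes long and its second byte is a
    signed displacement counted from the end of the instruction.
    The following sketch redoes that arithmetic, assuming the
    usual load address 0x7c00 for the boot sector; the helper
    name is hypothetical.

```python
LOAD_ADDR = 0x7C00  # address at which the BIOS places the boot sector


def short_jump_target(addr: int, opcode: bytes) -> int:
    """Target of a two-byte 'jmp short' (eb xx) located at addr."""
    if len(opcode) != 2 or opcode[0] != 0xEB:
        raise ValueError("not a jmp short instruction")
    # The displacement is a signed 8-bit value.
    disp = opcode[1] - 0x100 if opcode[1] >= 0x80 else opcode[1]
    return addr + 2 + disp


print(hex(short_jump_target(0x7C00, bytes.fromhex("eb48"))))  # 0x7c4a
print(hex(short_jump_target(0x7C4B, bytes.fromhex("eb07"))))  # 0x7c54
print(hex(0x7C4B - LOAD_ADDR))  # 0x4b, the offset of that jump inside the MBR
```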
    **** The problem  ****
    The setup command of the grub-shell uses some
    technique to judge the sanity of the BIOS and
    usually replaces the TWO-BYTE INSTRUCTION
                         --------------------
          eb07      jmp  short 0x7c54
     by
          90        nop
          90        nop
     However, if stage1 originates from grub 0.95/96
     the corresponding instruction is THREE BYTES long
          80ca00    or dl,0x00
     thus being "corrected" to
        
          90         nop
          90         nop
          00ea597c      (processor runs havoc now)
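     The effect can be reproduced on the quoted bytes alone.
     This is only a sketch: the six bytes are those given in
     this message for 0x7c4b onward in an old stage1, and the
     helper name is hypothetical.

```python
# Bytes from 0x7c4b onward in stage1 of grub 0.95/0.96, as quoted above:
# 80 ca 00 (or dl,0x00) followed by the start of the jump at 0x7c4e.
OLD_STAGE1 = bytes.fromhex("80ca00ea597c")


def overwrite(code: bytes, patch: bytes) -> bytes:
    """Overwrite the first len(patch) bytes of code with patch."""
    return patch + code[len(patch):]


# Two-byte correction meant for the 0.97 layout: one stray 00 byte survives.
broken = overwrite(OLD_STAGE1, bytes.fromhex("9090"))
print(broken.hex())  # 909000ea597c

# Three-byte correction: the jump at 0x7c4e stays intact.
intact = overwrite(OLD_STAGE1, bytes.fromhex("909090"))
print(intact.hex())  # 909090ea597c
```

     In the first result the leftover 00 is decoded as an opcode
     that takes the following ea as its operand byte, so the jump
     at 0x7c4e is never executed as a jump.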
     The corrective means that grub 0.95/96 uses
     is a much simpler choice for the byte at
          0x7c4d     (either)  0x00  or  0x80.
   I see two ways out of this incompatibility:
     1)  Pad the code snippet 'jmp short 0x7c54' at 0x7c4b
         in stage1 of grub 0.97 with an extra
         nop instruction, making it three bytes long.
         Then the grub-shell can always correct with
         the code snippet 0x909090 at address 0x7c4b
         for all versions of grub.
      2) Better testing in the grub-shell to determine
         whether location 0x7c4b holds 0x80****, which
         calls for a three-byte correction (or an
         alteration at 0x7c4d), or holds 0xeb**
         and needs a two-byte correction.
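     The second way could look roughly as follows. This is a
     sketch in Python under stated assumptions, not the code of
     the grub-shell: the function names are hypothetical, the
     old layout is recognised by the two bytes 80 ca, and for
     that layout the variant "alteration at 0x7c4d" is chosen.

```python
OFFSET = 0x4B  # 0x7c4b - 0x7c00, where the two layouts differ


def classify_stage1(mbr: bytes) -> str:
    """Guess the stage1 layout from the instruction at 0x7c4b."""
    head = mbr[OFFSET:OFFSET + 2]
    if head == b"\x80\xca":
        return "old"  # or dl,imm8, as in grub 0.95/0.96
    if head[:1] == b"\xeb":
        return "new"  # jmp short, as in grub 0.97
    return "unknown"


def correct(mbr: bytes) -> bytes:
    """Apply the correction that fits the detected layout."""
    out = bytearray(mbr)
    kind = classify_stage1(mbr)
    if kind == "old":
        out[OFFSET + 2] = 0x80  # turn 'or dl,0x00' into 'or dl,0x80'
    elif kind == "new":
        out[OFFSET:OFFSET + 2] = b"\x90\x90"  # replace jmp short by two nops
    else:
        raise ValueError("unrecognised stage1, refusing to patch")
    return bytes(out)
```

     Refusing to patch an unrecognised stage1 is a deliberate
     choice in this sketch, since a blind overwrite is what
     causes the breakage described above.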
    Since I have only glanced at the source code for
    stage1 and not the rest of grub's sources, there
    is no indication yet of which method is preferable.
    Do you have a suggestion? If not sooner, I will
    ask you for advice when I have dissected the
    source code.
         Best regards
           Mats E Andersson
           ynglingatal at yahoo.se
    
    