[Python-modules-commits] [python-wget] 01/03: Imported Upstream version 3.2

Balasankar C balasankarc-guest at moszumanska.debian.org
Sun Nov 8 05:21:11 UTC 2015


This is an automated email from the git hooks/post-receive script.

balasankarc-guest pushed a commit to branch master
in repository python-wget.

commit ff067562404e1c6b21b4a6a76fe9422757bcb26a
Author: Balasankar C <balasankarc at autistici.org>
Date:   Sun Nov 8 10:46:56 2015 +0530

    Imported Upstream version 3.2
---
 PKG-INFO   | 240 ++++++++++++++++++++++++++++--------------------------
 README.txt |  14 ++++
 wget.py    | 267 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 389 insertions(+), 132 deletions(-)

diff --git a/PKG-INFO b/PKG-INFO
index 676da07..a773a38 100644
--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,113 +1,127 @@
-Metadata-Version: 1.1
-Name: wget
-Version: 2.2
-Summary: pure python download utility
-Home-page: http://bitbucket.org/techtonik/python-wget/
-Author: anatoly techtonik <techtonik at gmail.com>
-Author-email: UNKNOWN
-License: Public Domain
-Description: Usage
-        =====
-        
-          python -m wget [options] <URL>
-        
-          options:
-            -o --output FILE|DIR   output filename or directory
-        
-        
-        API Usage
-        =========
-        
-          >>> import wget
-          >>> url = 'http://www.futurecrew.com/skaven/song_files/mp3/razorback.mp3'
-          >>> filename = wget.download(url)
-          100% [................................................] 3841532 / 3841532>
-          >> filename
-          'razorback.mp3'
-        
-        The skew that you see above is a documented side effect.
-        Alternative progress bar:
-        
-          >>> wget.download(url, bar=bar_thermometer)
-        
-        
-        ChangeLog
-        =========
-        2.2 (2014-07-19)
-         * it again can download without -o option
-        
-        2.1 (2014-07-10)
-         * it shows command line help
-         * -o option allows to select output file/directory
-        
-           * download(url, out, bar) contains out parameter
-        
-        2.0 (2013-04-26)
-         * it shows percentage
-         * it has usage examples
-         * it changes if being used as a library
-        
-           * download shows progress bar by default
-           * bar_adaptive gets improved algorithm
-           * download(url, bar) contains bar parameter
-             * bar(current, total)
-           * progress_callback is named callback_progress
-        
-        1.0 (2012-11-13)
-         * it runs with Python 3
-        
-        0.9 (2012-11-13)
-         * it renames file if it already exists
-         * it can be used as a library
-        
-           * download(url) returns filename
-           * bar_adaptive() draws progress bar
-           * bar_thermometer() simplified bar
-        
-        0.8 (2011-05-03)
-         * it detects filename from HTTP headers
-        
-        0.7 (2011-03-01)
-         * compatibility fix for Python 2.5
-         * limit width of progress bar to 100 chars
-        
-        0.6 (2010-04-24)
-         * it detects console width on POSIX
-        
-        0.5 (2010-04-23)
-         * it detects console width on Windows
-        
-        0.4 (2010-04-15)
-         * it shows cute progress bar
-        
-        0.3 (2010-04-05)
-         * it creates temp file in current dir
-        
-        0.2 (2010-02-16)
-         * it tries to detect filename from URL
-        
-        0.1 (2010-02-04)
-         * it can download file
-        
-        
-        Release Checklist
-        =================
-        
-        | [ ] update version in wget.py
-        | [x] update description in setup.py
-        | [ ] python setup.py check -mrs
-        | [ ] python setup.py sdist upload
-        | [ ] tag hg version
-        
-        -- 
-        anatoly techtonik <techtonik at gmail.com>
-        
-Platform: UNKNOWN
-Classifier: Environment :: Console
-Classifier: License :: Public Domain
-Classifier: Operating System :: OS Independent
-Classifier: Programming Language :: Python :: 2
-Classifier: Programming Language :: Python :: 3
-Classifier: Topic :: Software Development :: Libraries :: Python Modules
-Classifier: Topic :: System :: Networking
-Classifier: Topic :: Utilities
+Metadata-Version: 1.1
+Name: wget
+Version: 3.2
+Summary: pure python download utility
+Home-page: http://bitbucket.org/techtonik/python-wget/
+Author: anatoly techtonik <techtonik at gmail.com>
+Author-email: UNKNOWN
+License: Public Domain
+Description: Usage
+        =====
+        
+          python -m wget [options] <URL>
+        
+          options:
+            -o --output FILE|DIR   output filename or directory
+        
+        
+        API Usage
+        =========
+        
+          >>> import wget
+          >>> url = 'http://www.futurecrew.com/skaven/song_files/mp3/razorback.mp3'
+          >>> filename = wget.download(url)
+          100% [................................................] 3841532 / 3841532>
+          >> filename
+          'razorback.mp3'
+        
+        The skew that you see above is a documented side effect.
+        Alternative progress bar:
+        
+          >>> wget.download(url, bar=bar_thermometer)
+        
+        
+        ChangeLog
+        =========
+        3.2 (2015-10-22)
+         * download(url) can again be unicode on Python 2.7
+           https://bitbucket.org/techtonik/python-wget/issues/8
+        
+        3.1 (2015-10-18)
+         * it saves unknown files under download.wget filename
+           https://bitbucket.org/techtonik/python-wget/issues/6
+         * it prints unicode chars to Windows console
+         * it downloads unicode urls with Python 3
+        
+        3.0 (2015-10-17)
+         * it can download and save unicode filenames
+           https://bitbucket.org/techtonik/python-wget/issues/7
+        
+        2.2 (2014-07-19)
+         * it again can download without -o option
+        
+        2.1 (2014-07-10)
+         * it shows command line help
+         * -o option allows to select output file/directory
+        
+           * download(url, out, bar) contains out parameter
+        
+        2.0 (2013-04-26)
+         * it shows percentage
+         * it has usage examples
+         * it changes if being used as a library
+        
+           * download shows progress bar by default
+           * bar_adaptive gets improved algorithm
+           * download(url, bar) contains bar parameter
+             * bar(current, total)
+           * progress_callback is named callback_progress
+        
+        1.0 (2012-11-13)
+         * it runs with Python 3
+        
+        0.9 (2012-11-13)
+         * it renames file if it already exists
+         * it can be used as a library
+        
+           * download(url) returns filename
+           * bar_adaptive() draws progress bar
+           * bar_thermometer() simplified bar
+        
+        0.8 (2011-05-03)
+         * it detects filename from HTTP headers
+        
+        0.7 (2011-03-01)
+         * compatibility fix for Python 2.5
+         * limit width of progress bar to 100 chars
+        
+        0.6 (2010-04-24)
+         * it detects console width on POSIX
+        
+        0.5 (2010-04-23)
+         * it detects console width on Windows
+        
+        0.4 (2010-04-15)
+         * it shows cute progress bar
+        
+        0.3 (2010-04-05)
+         * it creates temp file in current dir
+        
+        0.2 (2010-02-16)
+         * it tries to detect filename from URL
+        
+        0.1 (2010-02-04)
+         * it can download file
+        
+        
+        Release Checklist
+        =================
+        
+        | [ ] update version in wget.py
+        | [x] update description in setup.py
+        | [ ] python setup.py check -mrs
+        | [ ] python setup.py sdist upload
+        | [ ] tag hg version
+        
+        -- 
+        anatoly techtonik <techtonik at gmail.com>
+        
+Platform: UNKNOWN
+Classifier: Environment :: Console
+Classifier: License :: Public Domain
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 2
+Classifier: Programming Language :: Python :: 3
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: System :: Networking
+Classifier: Topic :: Utilities
diff --git a/README.txt b/README.txt
index eedfa72..a85d176 100644
--- a/README.txt
+++ b/README.txt
@@ -25,6 +25,20 @@ Alternative progress bar:
 
 ChangeLog
 =========
+3.2 (2015-10-22)
+ * download(url) can again be unicode on Python 2.7
+   https://bitbucket.org/techtonik/python-wget/issues/8
+
+3.1 (2015-10-18)
+ * it saves unknown files under download.wget filename
+   https://bitbucket.org/techtonik/python-wget/issues/6
+ * it prints unicode chars to Windows console
+ * it downloads unicode urls with Python 3
+
+3.0 (2015-10-17)
+ * it can download and save unicode filenames
+   https://bitbucket.org/techtonik/python-wget/issues/7
+
 2.2 (2014-07-19)
  * it again can download without -o option
 
diff --git a/wget.py b/wget.py
old mode 100755
new mode 100644
index b35e22a..c0fbb83
--- a/wget.py
+++ b/wget.py
@@ -13,9 +13,11 @@ to make command line interface intuitive for new people.
 
 Public domain by anatoly techtonik <techtonik at gmail.com>
 Also available under the terms of MIT license
-Copyright (c) 2010-2014 anatoly techtonik
+Copyright (c) 2010-2015 anatoly techtonik
 """
 
+__version__ = "3.2"
+
 
 import sys, shutil, os
 import tempfile
@@ -23,22 +25,212 @@ import math
 
 PY3K = sys.version_info >= (3, 0)
 if PY3K:
-  import urllib.request as urllib
+  import urllib.request as ulib
   import urllib.parse as urlparse
 else:
-  import urllib
+  import urllib as ulib
   import urlparse
 
 
-__version__ = "2.2"
+# --- workarounds for Python misbehavior ---
+
+# enable passing unicode arguments from command line in Python 2.x
+# https://stackoverflow.com/questions/846850/read-unicode-characters
+def win32_utf8_argv():
+    """Uses shell32.GetCommandLineArgvW to get sys.argv as a list of Unicode
+    strings.
+
+    Versions 2.x of Python don't support Unicode in sys.argv on
+    Windows, with the underlying Windows API instead replacing multi-byte
+    characters with '?'.
+    """
 
+    from ctypes import POINTER, byref, cdll, c_int, windll
+    from ctypes.wintypes import LPCWSTR, LPWSTR
+
+    GetCommandLineW = cdll.kernel32.GetCommandLineW
+    GetCommandLineW.argtypes = []
+    GetCommandLineW.restype = LPCWSTR
+
+    CommandLineToArgvW = windll.shell32.CommandLineToArgvW
+    CommandLineToArgvW.argtypes = [LPCWSTR, POINTER(c_int)]
+    CommandLineToArgvW.restype = POINTER(LPWSTR)
+
+    cmd = GetCommandLineW()
+    argc = c_int(0)
+    argv = CommandLineToArgvW(cmd, byref(argc))
+    argnum = argc.value
+    sysnum = len(sys.argv)
+    result = []
+    if argnum > 0:
+        # Remove Python executable and commands if present
+        start = argnum - sysnum
+        for i in range(start, argnum):
+            result.append(argv[i].encode('utf-8'))
+    return result
+
+
+# enable unicode output to windows console
+# https://stackoverflow.com/questions/878972/windows-cmd-encoding-change-causes-python-crash
+def win32_unicode_console():
+    import codecs
+    from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_int
+    from ctypes.wintypes import BOOL, HANDLE, DWORD, LPWSTR, LPCWSTR, LPVOID
+
+    original_stderr = sys.stderr
+
+    # Output exceptions in this code to original_stderr, so that we can at least see them
+    def _complain(message):
+        original_stderr.write(message if isinstance(message, str) else repr(message))
+        original_stderr.write('\n')
+
+    codecs.register(lambda name: codecs.lookup('utf-8') if name == 'cp65001' else None)
+
+    try:
+        GetStdHandle = WINFUNCTYPE(HANDLE, DWORD)(("GetStdHandle", windll.kernel32))
+        STD_OUTPUT_HANDLE = DWORD(-11)
+        STD_ERROR_HANDLE = DWORD(-12)
+        GetFileType = WINFUNCTYPE(DWORD, DWORD)(("GetFileType", windll.kernel32))
+        FILE_TYPE_CHAR = 0x0002
+        FILE_TYPE_REMOTE = 0x8000
+        GetConsoleMode = WINFUNCTYPE(BOOL, HANDLE, POINTER(DWORD))(("GetConsoleMode", windll.kernel32))
+        INVALID_HANDLE_VALUE = DWORD(-1).value
+
+        def not_a_console(handle):
+            if handle == INVALID_HANDLE_VALUE or handle is None:
+                return True
+            return ((GetFileType(handle) & ~FILE_TYPE_REMOTE) != FILE_TYPE_CHAR
+                    or GetConsoleMode(handle, byref(DWORD())) == 0)
+
+        old_stdout_fileno = None
+        old_stderr_fileno = None
+        if hasattr(sys.stdout, 'fileno'):
+            old_stdout_fileno = sys.stdout.fileno()
+        if hasattr(sys.stderr, 'fileno'):
+            old_stderr_fileno = sys.stderr.fileno()
+
+        STDOUT_FILENO = 1
+        STDERR_FILENO = 2
+        real_stdout = (old_stdout_fileno == STDOUT_FILENO)
+        real_stderr = (old_stderr_fileno == STDERR_FILENO)
+
+        if real_stdout:
+            hStdout = GetStdHandle(STD_OUTPUT_HANDLE)
+            if not_a_console(hStdout):
+                real_stdout = False
+
+        if real_stderr:
+            hStderr = GetStdHandle(STD_ERROR_HANDLE)
+            if not_a_console(hStderr):
+                real_stderr = False
+
+        if real_stdout or real_stderr:
+            WriteConsoleW = WINFUNCTYPE(BOOL, HANDLE, LPWSTR, DWORD, POINTER(DWORD), LPVOID)(("WriteConsoleW", windll.kernel32))
+
+            class UnicodeOutput:
+                def __init__(self, hConsole, stream, fileno, name):
+                    self._hConsole = hConsole
+                    self._stream = stream
+                    self._fileno = fileno
+                    self.closed = False
+                    self.softspace = False
+                    self.mode = 'w'
+                    self.encoding = 'utf-8'
+                    self.name = name
+                    self.flush()
+
+                def isatty(self):
+                    return False
+
+                def close(self):
+                    # don't really close the handle, that would only cause problems
+                    self.closed = True
+
+                def fileno(self):
+                    return self._fileno
+
+                def flush(self):
+                    if self._hConsole is None:
+                        try:
+                            self._stream.flush()
+                        except Exception as e:
+                            _complain("%s.flush: %r from %r" % (self.name, e, self._stream))
+                            raise
+
+                def write(self, text):
+                    try:
+                        if self._hConsole is None:
+                            if not PY3K and isinstance(text, unicode):
+                                text = text.encode('utf-8')
+                            elif PY3K and isinstance(text, str):
+                                text = text.encode('utf-8')
+                            self._stream.write(text)
+                        else:
+                            if not PY3K and not isinstance(text, unicode):
+                                text = str(text).decode('utf-8')
+                            elif PY3K and not isinstance(text, str):
+                                text = text.decode('utf-8')
+                            remaining = len(text)
+                            while remaining:
+                                n = DWORD(0)
+                                # There is a shorter-than-documented limitation on the
+                                # length of the string passed to WriteConsoleW (see
+                                # <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1232>).
+                                retval = WriteConsoleW(self._hConsole, text, min(remaining, 10000), byref(n), None)
+                                if retval == 0 or n.value == 0:
+                                    raise IOError("WriteConsoleW returned %r, n.value = %r" % (retval, n.value))
+                                remaining -= n.value
+                                if not remaining:
+                                    break
+                                text = text[n.value:]
+                    except Exception as e:
+                        _complain("%s.write: %r" % (self.name, e))
+                        raise
+
+                def writelines(self, lines):
+                    try:
+                        for line in lines:
+                            self.write(line)
+                    except Exception as e:
+                        _complain("%s.writelines: %r" % (self.name, e))
+                        raise
+
+            if real_stdout:
+                sys.stdout = UnicodeOutput(hStdout, None, STDOUT_FILENO, '<Unicode console stdout>')
+            else:
+                sys.stdout = UnicodeOutput(None, sys.stdout, old_stdout_fileno, '<Unicode redirected stdout>')
+
+            if real_stderr:
+                sys.stderr = UnicodeOutput(hStderr, None, STDERR_FILENO, '<Unicode console stderr>')
+            else:
+                sys.stderr = UnicodeOutput(None, sys.stderr, old_stderr_fileno, '<Unicode redirected stderr>')
+    except Exception as e:
+        _complain("exception %r while fixing up sys.stdout and sys.stderr" % (e,))
+
+
+# --- helpers ---
+
+def to_unicode(filename):
+    """:return: filename decoded from utf-8 to unicode"""
+    #
+    if PY3K:
+        # [ ] test this on Python 3 + (Windows, Linux)
+        # [ ] port filename_from_headers once this works
+        # [ ] add test to repository / Travis
+        return filename
+    else:
+        if isinstance(filename, unicode): 
+            return filename
+        else:
+            return unicode(filename, 'utf-8')
 
 def filename_from_url(url):
-    """:return: detected filename or None"""
+    """:return: detected filename as unicode or None"""
+    # [ ] test urlparse behavior with unicode url
     fname = os.path.basename(urlparse.urlparse(url).path)
     if len(fname.strip(" \n\t.")) == 0:
         return None
-    return fname
+    return to_unicode(fname)
 
 def filename_from_headers(headers):
     """Detect filename from Content-Disposition headers if present.
@@ -73,7 +265,7 @@ def filename_fix_existing(filename):
     """Expands name portion of filename with numeric ' (x)' suffix to
     return filename that doesn't exist already.
     """
-    dirname = '.' 
+    dirname = u'.'
     name, ext = filename.rsplit('.', 1)
     names = [x for x in os.listdir(dirname) if x.startswith(name)]
     names = [x.rsplit('.', 1)[0] for x in names]
@@ -129,7 +321,8 @@ def get_console_width():
                         ("dwMaximumWindowSize", DWORD)]
 
         sbi = CONSOLE_SCREEN_BUFFER_INFO()
-        ret = windll.kernel32.GetConsoleScreenBufferInfo(console_handle, byref(sbi))
+        ret = windll.kernel32.GetConsoleScreenBufferInfo(
+            console_handle, byref(sbi))
         if ret == 0:
             return 0
         return sbi.srWindow.Right+1
@@ -280,6 +473,19 @@ def callback_progress(blocks, block_size, total_size, bar_function):
         sys.stdout.write("\r" + progress)
 
 
+def detect_filename(url=None, out=None, headers=None, default="download.wget"):
+    """Return filename for saving file. If no filename is detected from output
+    argument, url or headers, return default (download.wget)
+    """
+    names = dict(out='', url='', headers='')
+    if out:
+        names["out"] = out or ''
+    if url:
+        names["url"] = filename_from_url(url) or ''
+    if headers:
+        names["headers"] = filename_from_headers(headers) or ''
+    return names["out"] or names["headers"] or names["url"] or default
+
 def download(url, out=None, bar=bar_adaptive):
     """High level function, which downloads URL into tmp file in current
     directory and then renames it to filename autodetected from either URL
@@ -289,11 +495,14 @@ def download(url, out=None, bar=bar_adaptive):
     :param out: output filename or directory
     :return:    filename where URL is downloaded to
     """
-    names = dict()
-    names["out"] = out or ''
-    names["url"] = filename_from_url(url)
+    # detect if out is a directory
+    outdir = None
+    if out and os.path.isdir(out):
+        outdir = out
+        out = None
+
     # get filename for temp file in current directory
-    prefix = (names["url"] or names["out"] or ".") + "."
+    prefix = detect_filename(url, out)
     (fd, tmpfile) = tempfile.mkstemp(".tmp", prefix=prefix, dir=".")
     os.close(fd)
     os.unlink(tmpfile)
@@ -307,13 +516,18 @@ def download(url, out=None, bar=bar_adaptive):
     else:
         callback = None
 
-    (tmpfile, headers) = urllib.urlretrieve(url, tmpfile, callback)
-    names["header"] = filename_from_headers(headers)
-    if os.path.isdir(names["out"]):
-        filename = names["header"] or names["url"]
-        filename = names["out"] + "/" + filename
+    if PY3K:
+        # Python 3 can not quote URL as needed
+        binurl = list(urlparse.urlsplit(url))
+        binurl[2] = urlparse.quote(binurl[2])
+        binurl = urlparse.urlunsplit(binurl)
     else:
-        filename = names["out"] or names["header"] or names["url"]
+        binurl = url
+    (tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
+    filename = detect_filename(url, out, headers)
+    if outdir:
+        filename = outdir + "/" + filename
+
     # add numeric ' (x)' suffix if filename already exists
     if os.path.exists(filename):
         filename = filename_fix_existing(filename)
@@ -338,6 +552,13 @@ if __name__ == "__main__":
     if "--version" in sys.argv:
         sys.exit("wget.py " + __version__)
 
+    # patch Python 2.x to read unicode from command line
+    if not PY3K and sys.platform == "win32":
+        sys.argv = win32_utf8_argv()
+    # patch Python to write unicode characters to console
+    if sys.platform == "win32":
+        win32_unicode_console()
+
     from optparse import OptionParser
     parser = OptionParser()
     parser.add_option("-o", "--output", dest="output")
@@ -386,7 +607,15 @@ _ 30.0Mb at  3.0 Mbps  eta:   0:00:20   30% [=====         ]
 urllib.ContentTooShortError: retrieval incomplete: got only 15239952 out of 24807571 bytes
 
 [ ] find out if urlretrieve may return unicode headers
-[ ] test suite for unsafe filenames from url and from headers
+[ ] write files with unicode characters
+    https://bitbucket.org/techtonik/python-wget/issues/7/filename-issue
+  [x] Python 2, Windows
+  [x] Python 3, Windows
+  [ ] Linux
+[ ] add automatic tests
+  [ ] specify unicode URL from command line
+  [ ] specify unicode output file from command line
+  [ ] test suite for unsafe filenames from url and from headers
 
 [ ] security checks
   [ ] filename_from_url
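
Note on the new detect_filename() helper in the diff above: it prefers the explicit output name, then the name from the response headers, then the name from the URL, and falls back to "download.wget". The following is only a minimal sketch of that precedence, written for Python 3. The names pick_name and name_from_url and the example.com URLs are hypothetical and are not part of wget.py; header parsing is left out and the header-derived name is passed in directly.

```python
import os
from urllib.parse import urlparse


def name_from_url(url):
    """Hypothetical stand-in for filename_from_url(): basename of the URL path."""
    fname = os.path.basename(urlparse(url).path)
    # A name made only of spaces, newlines, tabs or dots counts as "not detected".
    return fname if fname.strip(" \n\t.") else None


def pick_name(url=None, out=None, header_name=None, default="download.wget"):
    """Same precedence as detect_filename(): out, then headers, then URL, then default."""
    url_name = name_from_url(url) if url else None
    return out or header_name or url_name or default


print(pick_name("http://example.com/files/a.zip"))                      # a.zip
print(pick_name("http://example.com/files/a.zip", header_name="b.zip")) # b.zip
print(pick_name("http://example.com/files/a.zip", out="c.zip",
                header_name="b.zip"))                                   # c.zip
print(pick_name("http://example.com/"))                                 # download.wget
```

The last call shows the 3.1 changelog behaviour: a URL with no usable basename is saved under the default name.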

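Note on the Python 3 branch of download() in the diff above: it splits the URL, percent-encodes only the path component, and reassembles it before calling urlretrieve. A rough Python 3 sketch of that step follows. The function name quote_url_path and the example URL are assumptions made for illustration; only urlsplit, quote and urlunsplit come from the standard library.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_url_path(url):
    """Percent-encode only the path component of url, leaving the rest intact."""
    parts = list(urlsplit(url))  # [scheme, netloc, path, query, fragment]
    parts[2] = quote(parts[2])   # non-ASCII becomes UTF-8 percent escapes, "/" is kept
    return urlunsplit(parts)


print(quote_url_path("http://example.com/файл.txt"))
# http://example.com/%D1%84%D0%B0%D0%B9%D0%BB.txt
```

One consequence of this approach, which may or may not matter for a given caller: a path that is already percent-encoded gets encoded a second time, because quote() also escapes "%" (for example "a%20b" becomes "a%2520b").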
-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/python-modules/packages/python-wget.git


