[Python-modules-commits] [pypandoc] 01/01: Import pypandoc_1.2.0+ds0.orig.tar.xz

Elena Grandi valhalla-guest at moszumanska.debian.org
Sat Jul 30 14:43:49 UTC 2016


This is an automated email from the git hooks/post-receive script.

valhalla-guest pushed a commit to branch upstream
in repository pypandoc.

commit b70546c9a83595b0717b8ecccd10bb141641161a
Author: Elena Grandi <valhalla-d at trueelena.org>
Date:   Sat Jul 30 15:38:31 2016 +0200

    Import pypandoc_1.2.0+ds0.orig.tar.xz
---
 README.md                   | 446 ++++++++++++-----------
 pypandoc/__init__.py        | 865 ++++++++++++++++++++++++++------------------
 pypandoc/pandoc_download.py | 173 +++++++++
 setup.py                    | 209 ++---------
 tests.py                    | 653 +++++++++++++++++++--------------
 5 files changed, 1341 insertions(+), 1005 deletions(-)
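
Note on the version selection that the README in this commit describes ("uses the version with the higher version number and if both are the same, the already installed version"): one way to compare dotted version strings is to parse them into integer tuples and rely on Python's tuple ordering. This is a minimal sketch, not the code from this commit. `parse_version` and `pick_newest` are hypothetical helper names, and the candidate list stands in for the result of probing each binary with `pandoc --version`.

```python
def parse_version(version_string, width=3):
    """Turn '1.17' into (1, 17, 0) so versions compare position by position."""
    parts = [int(x) for x in version_string.split(".")]
    # Pad short versions with zeros so '1.17' and '1.17.0' compare equal.
    parts += [0] * (width - len(parts))
    return tuple(parts)


def pick_newest(candidates):
    """Return the path with the highest version, or None if there is none.

    `candidates` is a list of (path, version_string) pairs in search order.
    On a tie the earlier entry wins, so listing the installed pandoc first
    keeps it preferred when both versions are equal.
    """
    best_path, best_version = None, None
    for path, version_string in candidates:
        version = parse_version(version_string)
        # Strict '>' means a later candidate must be newer to replace the best.
        if best_version is None or version > best_version:
            best_path, best_version = path, version
    return best_path
```

Comparing whole tuples avoids the pitfall of comparing components one at a time, where a larger minor number could otherwise outweigh a smaller major number.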

diff --git a/README.md b/README.md
index e0cee5d..43f1759 100644
--- a/README.md
+++ b/README.md
@@ -1,203 +1,243 @@
-# pypandoc
-
-[![Build Status](https://travis-ci.org/bebraw/pypandoc.svg?branch=master)](https://travis-ci.org/bebraw/pypandoc)
-[![PyPI version](https://badge.fury.io/py/pypandoc.svg)](https://pypi.python.org/pypi/pypandoc/)
-[![conda version](https://anaconda.org/janschulz/pypandoc/badges/version.svg)](https://anaconda.org/janschulz/pypandoc/)
-
-pypandoc provides a thin wrapper for [pandoc](http://johnmacfarlane.net/pandoc/), a universal
-document converter.
-
-## Installation
-
-pypandoc uses pandoc, so it needs an available installation of pandoc. For some common cases
-(wheels, conda packages), pypandoc already includes pandoc (and pandoc_citeproc) in it's
-prebuilt package.
-
-If pandoc is already installed (`pandoc` is in the PATH), pypandoc uses the version with the
-higher version number and if both are the same, the already installed version. You can point
-to a specific version by setting the environment variable `PYPANDOC_PANDOC` to the full path to the pandoc binary (`PYPANDOC_PANDOC=/home/x/whatever/pandoc` or `PYPANDOC_PANDOC=c:\pandoc\pandoc.exe`). If this environment variabel is set, this is the only
-place where pandoc is searched for.
-
-To use pandoc filters, you must have the relevant filter installed on your machine.
-
-### Installing via pip
-
-Install via `pip install pypandoc`
-
-Prebuilt [wheels for Windows and Mac OS X](https://pypi.python.org/pypi/pypandoc/) include
-pandoc. If there is no prebuilt binary available, you have to
-[install pandoc yourself](#installing-pandoc).
-
-If you use Linux and have [your own wheelhouse](http://wheel.readthedocs.org/en/latest/#usage),
-you can build a wheel which includes pandoc with
-`python setup.py download_pandoc; python setup.py bdist_wheel`. Be aware that this works only
-on 64bit intel systems, as we only download it from the
-[official source](https://github.com/jgm/pandoc/releases).
-
-### Installing via conda
-
-Install via `conda install -c https://conda.anaconda.org/janschulz pypandoc`.
-
-You can also add the channel to your conda config via
-`conda config --add channels https://conda.anaconda.org/janschulz`. This makes it possible to
-use `conda install pypandoc` directly and also lets you update via `conda update pypandoc`.
-
-Conda packages include pandoc and are available for py2.7, py3.4 and py3.5,
-for Windows (32bit and 64bit), Mac OS X (64bit) and Linux (64bit).
-
-### Installing pandoc
-
-pandoc is available for many different platforms:
-
-- Ubuntu/Debian: `sudo apt-get install pandoc`
-- Fedora/Red Hat: `sudo yum install pandoc`
-- Arch: `sudo pacman -S pandoc`
-- Mac OS X with Homebrew: `brew install pandoc`
-- Machine with Haskell: `cabal-install pandoc`
-- Windows: There is an installer available
-  [here](http://johnmacfarlane.net/pandoc/installing.html)
-- [FreeBSD port](http://www.freshports.org/textproc/pandoc/)
-  - Or see http://johnmacfarlane.net/pandoc/installing.html
-
-## Usage
-
-The basic invocation looks like this: `pypandoc.convert('input', 'output format')`. `pypandoc`
-tries to infer the type of the input automatically. If it's a file, it will load it. In case you
-pass a string, you can define the `format` using the parameter. The example below should clarify
-the usage:
-
-```python
-import pypandoc
-
-output = pypandoc.convert('somefile.md', 'rst')
-
-# alternatively you could just pass some string to it and define its format
-output = pypandoc.convert('#some title', 'rst', format='md')
-# output == 'some title\r\n==========\r\n\r\n'
-```
-
-If you pass in a string (and not a filename), `convert` expects this string to be unicode or
-utf-8 encoded bytes. `convert` will always return a unicode string.
-
-It's also possible to directly let pandoc write the output to a file. This is the only way to
-convert to some output formats (e.g. odt, docx, epub, epub3, pdf). In that case `convert()` will
-return an empty string.
-
-```python
-import pypandoc
-
-output = pypandoc.convert('somefile.md', 'docx', outputfile="somefile.docx")
-assert output == ""
-```
-
-In addition to `format`, it is possible to pass `extra_args`.
-That makes it possible to access various pandoc options easily.
-
-```python
-output = pypandoc.convert(
-    '<h1>Primary Heading</h1>',
-    'md', format='html',
-    extra_args=['--atx-headers'])
-# output == '# Primary Heading\r\n'
-output = pypandoc.convert(
-    '# Primary Heading',
-    'html', format='md',
-    extra_args=['--base-header-level=2'])
-# output == '<h2 id="primary-heading">Primary Heading</h2>\r\n'
-```
-pypandoc now supports easy addition of
-[pandoc filters](http://johnmacfarlane.net/pandoc/scripting.html).
-
-```python
-filters = ['pandoc-citeproc']
-pdoc_args = ['--mathjax',
-             '--smart']
-output = pd.convert(source=filename,
-                    to='html5',
-                    format='md',
-                    extra_args=pdoc_args,
-                    filters=filters)
-```
-Please pass any filters in as a list and not a string.
-
-Please refer to `pandoc -h` and the
-[official documentation](http://johnmacfarlane.net/pandoc/README.html) for further details.
-
-## Dealing with Formatting Arguments
-
-Pandoc supports custom formatting though `-V` parameter. In order to use it through pypandoc, use code such as this:
-
-```python
-output = pypandoc.convert('demo.md', 'pdf', outputfile='demo.pdf',
-  extra_args=['-V', 'geometry:margin=1.5cm'])
-```
-
-Note that it's important to separate `-V` and its argument within a list like that or else it won't work. This gotcha has to do with the way `subprocess.Popen` works.
-
-## Getting Pandoc Version
-
-As it can be useful sometimes to check what Pandoc version is available at your system, `pypandoc` provides an utility for this. Example:
-
-```
-version = pypandoc.get_pandoc_version()
-```
-
-## Related
-
-[pydocverter](https://github.com/msabramo/pydocverter) is a client for a service called
-[Docverter](http://www.docverter.com/), which offers pandoc as a service (plus some extra goodies).
-It has the same API as pypandoc, so you can easily write code that uses one and falls back to the
-other. E.g.:
-
-```python
-try:
-    import pypandoc as converter
-except ImportError:
-    import pydocverter as converter
-
-converter.convert('somefile.md', 'rst')
-```
-
-See [pyandoc](http://pypi.python.org/pypi/pyandoc/) for an alternative implementation of a pandoc
-wrapper from Kenneth Reitz. This one hasn't been active in a while though.
-
-## Contributing
-
-Contributions are welcome. When opening a PR, please keep the following guidelines in mind:
-
-1. Before implementing, please open an issue for discussion.
-2. Make sure you have tests for the new logic.
-3. Make sure your code passes `flake8 pypandoc.py tests.py`
-4. Add yourself to contributors at `README.md` unless you are already there. In that case tweak your contributions.
-
-Note that for citeproc tests to pass you'll need to have [pandoc-citeproc](https://github.com/jgm/pandoc-citeproc) installed. If you installed a prebuilt wheel or conda package, it is already included.
-
-## Contributors
-
-* [Valentin Haenel](https://github.com/esc) - String conversion fix
-* [Daniel Sanchez](https://github.com/ErunamoJAZZ) - Automatic parsing of input/output formats
-* [Thomas G.](https://github.com/coldfix) - Python 3 support
-* [Ben Jao Ming](https://github.com/benjaoming) - Fail gracefully if `pandoc` is missing
-* [Ross Crawford-d'Heureuse](http://github.com/rosscdh) - Encode input in UTF-8 and add Django
-  example
-* [Michael Chow](https://github.com/machow) - Decode output in UTF-8
-* [Janusz Skonieczny](https://github.com/wooyek) - Support Windows newlines and allow encoding to
-  be specified.
-* [gabeos](https://github.com/gabeos) - Fix help parsing
-* [Marc Abramowitz](https://github.com/msabramo) - Make `setup.py` fail hard if `pandoc` is
-  missing, Travis, Dockerfile, PyPI badge, Tox, PEP-8, improved documentation
-* [Daniel L.](https://github.com/mcktrtl) - Add `extra_args` example to README
-* [Amy Guy](https://github.com/rhiaro) - Exception handling for unicode errors
-* [Florian Eßer](https://github.com/flesser) - Allow Markdown extensions in output format
-* [Philipp Wendler](https://github.com/PhilippWendler) - Allow Markdown extensions in input format
-* [Jan Schulz](https://github.com/JanSchulz) - Handling output to a file, Travis to work on newer version of Pandoc, return code checking, get_pandoc_version. Helped to fix the Travis build.
-* [Aaron Gonzales](https://github.com/xysmas) - Added better filter handling
-* [David Lukes](https://github.com/dlukes) - Enabled input from non-plain-text files and made sure tests clean up template files correctly if they fail
-* [valholl](https://github.com/valholl) - Set up licensing information correctly and include examples to distribution version
-* [Cyrille Rossant](https://github.com/rossant) - Fixed bug by trimming out stars in the list of pandoc formats. Helped to fix the Travis build.
-* [Paul Osborne](https://github.com/posborne) - Don't require pandoc to install pypandoc.
-* [Felix Yan](https://github.com/felixonmars) - Added installation instructions for Arch Linux.
-
-## License
-
-`pypandoc` is available under MIT license. See LICENSE for more details. `pandoc` itself is [available under the GPL2 license](https://github.com/jgm/pandoc/blob/master/COPYING).
+# pypandoc
+
+[![Build Status](https://travis-ci.org/bebraw/pypandoc.svg?branch=master)](https://travis-ci.org/bebraw/pypandoc)
+[![PyPI version](https://badge.fury.io/py/pypandoc.svg)](https://pypi.python.org/pypi/pypandoc/)
+[![conda version](https://anaconda.org/conda-forge/pypandoc/badges/version.svg)](https://anaconda.org/conda-forge/pypandoc/)
+
+pypandoc provides a thin wrapper for [pandoc](http://johnmacfarlane.net/pandoc/), a universal
+document converter.
+
+## Installation
+
+pypandoc uses `pandoc`, so it needs an available installation of `pandoc`. For some common cases
+(wheels, conda packages), pypandoc already includes `pandoc` (and `pandoc-citeproc`) in its
+prebuilt package.
+
+If `pandoc` is already installed (`pandoc` is in the PATH), `pypandoc` uses the version with the
+higher version number and if both are the same, the already installed version. See [Specifying the location of pandoc binaries](#specifying_binaries) for more.
+
+To use `pandoc` filters, you must have the relevant filter installed on your machine.
+
+### Installing via pip
+
+Install via `pip install pypandoc`.
+
+Prebuilt [wheels for Windows and Mac OS X](https://pypi.python.org/pypi/pypandoc/) include
+pandoc. If there is no prebuilt binary available, you have to
+[install `pandoc` yourself](#installing-pandoc).
+
+If you use Linux and have [your own wheelhouse](http://wheel.readthedocs.org/en/latest/#usage),
+you can build a wheel which includes `pandoc` with
+`python setup.py download_pandoc; python setup.py bdist_wheel`. Be aware that this works only
+on 64bit intel systems, as we only download it from the
+[official source](https://github.com/jgm/pandoc/releases).
+
+### Installing via conda
+
+`pypandoc` is included in [conda-forge](https://conda-forge.github.io/). The conda packages will
+also install the `pandoc` package, so `pandoc` is available in the installation.
+
+Install via `conda install -c conda-forge pypandoc`.
+
+You can also add the channel to your conda config via
+`conda config --add channels conda-forge`. This makes it possible to
+use `conda install pypandoc` directly and also lets you update via `conda update pypandoc`.
+
+### Installing pandoc
+
+If you don't get `pandoc` installed via a prebuilt wheel which includes `pandoc` or via the
+conda package dependencies, you need to install `pandoc` yourself.
+
+#### Installing pandoc via pypandoc
+
+Installing via pypandoc is possible on Windows, Mac OS X or Linux (Intel-based):
+
+```python
+# expects an installed pypandoc: pip install pypandoc
+from pypandoc.pandoc_download import download_pandoc
+# see the documentation for how to customize the installation path,
+# but be aware that you then need to include it in the PATH
+download_pandoc()
+```
+
+The default install location is included in the search path for `pandoc`, so you
+don't need to add it to `PATH`.
+
+#### Installing pandoc manually
+
+Installing manually via the system mechanism is also possible. Such installation mechanisms
+make `pandoc` available on many more platforms:
+
+- Ubuntu/Debian: `sudo apt-get install pandoc`
+- Fedora/Red Hat: `sudo yum install pandoc`
+- Arch: `sudo pacman -S pandoc`
+- Mac OS X with Homebrew: `brew install pandoc pandoc-citeproc Caskroom/cask/mactex`
+- Machine with Haskell: `cabal-install pandoc`
+- Windows: There is an installer available
+  [here](http://johnmacfarlane.net/pandoc/installing.html)
+- [FreeBSD port](http://www.freshports.org/textproc/pandoc/)
+  - Or see http://johnmacfarlane.net/pandoc/installing.html
+
+Be aware that not all install mechanisms put `pandoc` in `PATH`, so you either
+have to change `PATH` yourself or set the full path to `pandoc` in
+`PYPANDOC_PANDOC`. See the next section for more information.
+
+### <a name="specifying_binaries"></a>Specifying the location of pandoc binaries
+
+You can point to a specific pandoc version by setting the environment variable
+`PYPANDOC_PANDOC` to the full path to the pandoc binary
+(`PYPANDOC_PANDOC=/home/x/whatever/pandoc` or `PYPANDOC_PANDOC=c:\pandoc\pandoc.exe`).
+If this environment variable is set, this is the only place where pandoc is searched for.
+
+In certain cases, e.g. pandoc is installed but a web server with its own user
+cannot find the binaries, it is useful to specify the location at runtime:
+
+```python
+import os
+os.environ.setdefault('PYPANDOC_PANDOC', '/home/x/whatever/pandoc')
+```
+
+## Usage
+
+There are two basic ways to use `pypandoc`: with input files or with input
+strings.
+
+
+```python
+import pypandoc
+
+# With an input file: it will infer the input format from the filename
+output = pypandoc.convert_file('somefile.md', 'rst')
+
+# ...but you can override the format via the `format` argument:
+output = pypandoc.convert_file('somefile.txt', 'rst', format='md')
+
+# alternatively you could just pass some string. In this case you need to
+# define the input format:
+output = pypandoc.convert_text('#some title', 'rst', format='md')
+# output == 'some title\r\n==========\r\n\r\n'
+```
+
+`convert_text` expects this string to be unicode or utf-8 encoded bytes. `convert_*` will always
+return a unicode string.
+
+It's also possible to directly let `pandoc` write the output to a file. This is the only way to
+convert to some output formats (e.g. odt, docx, epub, epub3, pdf). In that case `convert_*()` will
+return an empty string.
+
+```python
+import pypandoc
+
+output = pypandoc.convert_file('somefile.md', 'docx', outputfile="somefile.docx")
+assert output == ""
+```
+
+In addition to `format`, it is possible to pass `extra_args`.
+That makes it possible to access various `pandoc` options easily.
+
+```python
+output = pypandoc.convert_text(
+    '<h1>Primary Heading</h1>',
+    'md', format='html',
+    extra_args=['--atx-headers'])
+# output == '# Primary Heading\r\n'
+output = pypandoc.convert_text(
+    '# Primary Heading',
+    'html', format='md',
+    extra_args=['--base-header-level=2'])
+# output == '<h2 id="primary-heading">Primary Heading</h2>\r\n'
+```
+pypandoc now supports easy addition of
+[pandoc filters](http://johnmacfarlane.net/pandoc/scripting.html).
+
+```python
+filters = ['pandoc-citeproc']
+pdoc_args = ['--mathjax',
+             '--smart']
+output = pypandoc.convert_file(filename,
+                               to='html5',
+                               format='md',
+                               extra_args=pdoc_args,
+                               filters=filters)
+```
+Please pass any filters in as a list and not as a string.
+
+Please refer to `pandoc -h` and the
+[official documentation](http://johnmacfarlane.net/pandoc/README.html) for further details.
+
+> Note: the old way of using `convert(input, output)` is deprecated as in some cases it wasn't
+possible to determine whether the input should be used as a filename or as text.
+
+## Dealing with Formatting Arguments
+
+Pandoc supports custom formatting through the `-V` parameter. In order to use it through
+pypandoc, use code such as this:
+
+```python
+output = pypandoc.convert_file('demo.md', 'pdf', outputfile='demo.pdf',
+  extra_args=['-V', 'geometry:margin=1.5cm'])
+```
+
+> Note: it's important to separate `-V` and its argument within a list like that or else
+it won't work. This gotcha has to do with the way
+[`subprocess.Popen`](https://docs.python.org/2/library/subprocess.html#subprocess.Popen) works.
+
+## Getting Pandoc Version
+
+It can be useful to check which Pandoc version is available on your system or which
+particular `pandoc` binary is used by `pypandoc`. For that, `pypandoc` provides the following
+utility functions. Example:
+
+```python
+print(pypandoc.get_pandoc_version())
+print(pypandoc.get_pandoc_path())
+print(pypandoc.get_pandoc_formats())
+```
+
+## Related
+
+* [pydocverter](https://github.com/msabramo/pydocverter) is a client for a service called
+[Docverter](http://www.docverter.com/), which offers `pandoc` as a service (plus some extra goodies).
+* See [pyandoc](http://pypi.python.org/pypi/pyandoc/) for an alternative implementation of a `pandoc`
+wrapper from Kenneth Reitz. This one hasn't been active in a while though.
+
+## Contributing
+
+Contributions are welcome. When opening a PR, please keep the following guidelines in mind:
+
+1. Before implementing, please open an issue for discussion.
+2. Make sure you have tests for the new logic.
+3. Make sure your code passes `flake8 pypandoc/*.py tests.py`
+4. Add yourself to contributors at `README.md` unless you are already there. In that case tweak your contributions.
+
+Note that for citeproc tests to pass you'll need to have [pandoc-citeproc](https://github.com/jgm/pandoc-citeproc) installed. If you installed a prebuilt wheel or conda package, it is already included.
+
+## Contributors
+
+* [Valentin Haenel](https://github.com/esc) - String conversion fix
+* [Daniel Sanchez](https://github.com/ErunamoJAZZ) - Automatic parsing of input/output formats
+* [Thomas G.](https://github.com/coldfix) - Python 3 support
+* [Ben Jao Ming](https://github.com/benjaoming) - Fail gracefully if `pandoc` is missing
+* [Ross Crawford-d'Heureuse](http://github.com/rosscdh) - Encode input in UTF-8 and add Django
+  example
+* [Michael Chow](https://github.com/machow) - Decode output in UTF-8
+* [Janusz Skonieczny](https://github.com/wooyek) - Support Windows newlines and allow encoding to
+  be specified.
+* [gabeos](https://github.com/gabeos) - Fix help parsing
+* [Marc Abramowitz](https://github.com/msabramo) - Make `setup.py` fail hard if `pandoc` is
+  missing, Travis, Dockerfile, PyPI badge, Tox, PEP-8, improved documentation
+* [Daniel L.](https://github.com/mcktrtl) - Add `extra_args` example to README
+* [Amy Guy](https://github.com/rhiaro) - Exception handling for unicode errors
+* [Florian Eßer](https://github.com/flesser) - Allow Markdown extensions in output format
+* [Philipp Wendler](https://github.com/PhilippWendler) - Allow Markdown extensions in input format
+* [Jan Schulz](https://github.com/JanSchulz) - Handling output to a file, Travis to work on newer version of Pandoc, return code checking, get_pandoc_version. Helped to fix the Travis build, new `convert_*` API
+* [Aaron Gonzales](https://github.com/xysmas) - Added better filter handling
+* [David Lukes](https://github.com/dlukes) - Enabled input from non-plain-text files and made sure tests clean up template files correctly if they fail
+* [valholl](https://github.com/valholl) - Set up licensing information correctly and include examples to distribution version
+* [Cyrille Rossant](https://github.com/rossant) - Fixed bug by trimming out stars in the list of `pandoc` formats. Helped to fix the Travis build.
+* [Paul Osborne](https://github.com/posborne) - Don't require `pandoc` to install pypandoc.
+* [Felix Yan](https://github.com/felixonmars) - Added installation instructions for Arch Linux.
+
+## License
+
+`pypandoc` is available under MIT license. See LICENSE for more details. `pandoc` itself is [available under the GPL2 license](https://github.com/jgm/pandoc/blob/master/COPYING).
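
Note on the `-V` remark in the README hunk above: when a command is handed to `subprocess` as a list, each list element reaches the child process as exactly one argument, with no shell-style splitting on spaces. The sketch below illustrates that with a small Python child process standing in for `pandoc`, so it runs without pandoc installed. `child_argv` is a hypothetical helper, not part of pypandoc.

```python
import json
import subprocess
import sys


def child_argv(args):
    """Run a child Python process and return the arguments it received."""
    code = "import json, sys; print(json.dumps(sys.argv[1:]))"
    out = subprocess.check_output([sys.executable, "-c", code] + list(args))
    return json.loads(out.decode("utf-8"))


# Two list elements arrive as two arguments: the option and its value.
separate = child_argv(["-V", "geometry:margin=1.5cm"])

# One list element arrives as a single argument that still contains the space.
joined = child_argv(["-V geometry:margin=1.5cm"])
```

In the joined form the program receives a single argument with an embedded space, which is why the README asks for `-V` and its value to be passed as separate items in `extra_args`.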
diff --git a/pypandoc/__init__.py b/pypandoc/__init__.py
index e8816d0..06e8635 100644
--- a/pypandoc/__init__.py
+++ b/pypandoc/__init__.py
@@ -1,354 +1,511 @@
-# -*- coding: utf-8 -*-
-from __future__ import with_statement, absolute_import
-
-import subprocess
-import sys
-import textwrap
-import os
-import re
-
-from .py3compat import string_types, cast_bytes, cast_unicode
-
-__author__ = u'Juho Vepsäläinen'
-__version__ = '1.1.3'
-__license__ = 'MIT'
-__all__ = ['convert', 'get_pandoc_formats', 'get_pandoc_version', 'get_pandoc_path']
-
-
-def convert(source, to, format=None, extra_args=(), encoding='utf-8',
-            outputfile=None, filters=None):
-    """Converts given `source` from `format` `to` another.
-
-    :param str source: Unicode string or bytes or a file path (see encoding)
-
-    :param str to: format into which the input should be converted; can be one of
-            `pypandoc.get_pandoc_formats()[1]`
-
-    :param str format: the format of the inputs; will be inferred if input is a file with an
-            known filename extension; can be one of `pypandoc.get_pandoc_formats()[1]`
-            (Default value = None)
-
-    :param list extra_args: extra arguments (list of strings) to be passed to pandoc
-            (Default value = ())
-
-    :param str encoding: the encoding of the file or the input bytes (Default value = 'utf-8')
-
-    :param str outputfile: output will be written to outfilename or the converted content
-            returned if None (Default value = None)
-
-    :param list filters: pandoc filters e.g. filters=['pandoc-citeproc']
-
-    :returns: converted string (unicode) or an empty string if an outputfile was given
-    :rtype: unicode
-
-    :raises RuntimeError: if any of the inputs are not valid of if pandoc fails with an error
-    :raises OSError: if pandoc is not found; make sure it has been installed and is available at
-            path.
-    """
-    return _convert(_read_file, _process_file, source, to,
-                    format, extra_args, encoding=encoding,
-                    outputfile=outputfile, filters=filters)
-
-
-def _convert(reader, processor, source, to, format=None, extra_args=(), encoding=None,
-             outputfile=None, filters=None):
-    source, format, input_type = reader(source, format, encoding=encoding)
-
-    formats = {
-        'dbk': 'docbook',
-        'md': 'markdown',
-        'rest': 'rst',
-        'tex': 'latex',
-    }
-
-    format = formats.get(format, format)
-    to = formats.get(to, to)
-
-    if not format:
-        raise RuntimeError('Missing format!')
-
-    from_formats, to_formats = get_pandoc_formats()
-
-    if _get_base_format(format) not in from_formats:
-        raise RuntimeError(
-            'Invalid input format! Got "%s" but expected one of these: %s' % (
-                _get_base_format(format), ', '.join(from_formats)))
-
-    base_to_format = _get_base_format(to)
-    if base_to_format not in to_formats:
-        raise RuntimeError(
-            'Invalid output format! Expected one of these: ' +
-            ', '.join(to_formats))
-
-    # list from https://github.com/jgm/pandoc/blob/master/pandoc.hs
-    # `[...] where binaries = ["odt","docx","epub","epub3"] [...]`
-    if base_to_format in ["odt", "docx", "epub", "epub3"] and not outputfile:
-        raise RuntimeError(
-            'Output to %s only works by using a outputfile.' % base_to_format
-        )
-
-    return processor(source, input_type, to, format, extra_args,
-                     outputfile=outputfile, filters=filters)
-
-
-def _read_file(source, format, encoding='utf-8'):
-    try:
-        path = os.path.exists(source)
-    except UnicodeEncodeError:
-        path = os.path.exists(source.encode('utf-8'))
-    except ValueError:
-        path = ''
-    if path:
-        format = format or os.path.splitext(source)[1].strip('.')
-        input_type = 'path'
-    else:
-        if encoding != 'utf-8':
-            # if a source and a different encoding is given, try to decode the the source into a
-            # unicode string
-            try:
-                source = cast_unicode(source, encoding=encoding)
-            except (UnicodeDecodeError, UnicodeEncodeError):
-                pass
-        input_type = 'string'
-    return source, format, input_type
-
-
-def _process_file(source, input_type, to, format, extra_args, outputfile=None,
-                  filters=None):
-    _ensure_pandoc_path()
-    string_input = input_type == 'string'
-    input_file = [source] if not string_input else []
-    args = [__pandoc_path, '--from=' + format]
-
-    # #59 - pdf output won't work with `--to` set!
-    if to is not 'pdf':
-        args.append('--to=' + to)
-
-    args += input_file
-
-    if outputfile:
-        args.append("--output="+outputfile)
-
-    args.extend(extra_args)
-
-    # adds the proper filter syntax for each item in the filters list
-    if filters is not None:
-        if isinstance(filters, string_types):
-            filters = filters.split()
-        f = ['--filter=' + x for x in filters]
-        args.extend(f)
-
-    p = subprocess.Popen(
-        args,
-        stdin=subprocess.PIPE if string_input else None,
-        stdout=subprocess.PIPE,
-        stderr=subprocess.PIPE)
-
-    # something else than 'None' indicates that the process already terminated
-    if not (p.returncode is None):
-        raise RuntimeError(
-            'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode,
-                                                                           p.stderr.read())
-        )
-
-    try:
-        source = cast_bytes(source, encoding='utf-8')
-    except (UnicodeDecodeError, UnicodeEncodeError):
-        # assume that it is already a utf-8 encoded string
-        pass
-    try:
-        stdout, stderr = p.communicate(source if string_input else None)
-    except OSError:
-        # this is happening only on Py2.6 when pandoc dies before reading all
-        # the input. We treat that the same as when we exit with an error...
-        raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))
-
-    try:
-        stdout = stdout.decode('utf-8')
-    except UnicodeDecodeError:
-        # this shouldn't happen: pandoc more or less garantees that the output is utf-8!
-        raise RuntimeError('Pandoc output was not utf-8.')
-
-    # check that pandoc returned successfully
-    if p.returncode != 0:
-        raise RuntimeError(
-            'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
-        )
-
-    # if there is an outputfile, then stdout is likely empty!
-    return stdout
-
-
-def _get_base_format(format):
-    '''
-    According to http://johnmacfarlane.net/pandoc/README.html#general-options,
-    syntax extensions for markdown can be individually enabled or disabled by
-    appending +EXTENSION or -EXTENSION to the format name.
-    Return the base format without any extensions.
-    '''
-    return re.split('\+|-', format)[0]
-
-
-def get_pandoc_formats():
-    '''
-    Dynamic preprocessor for Pandoc formats.
-    Return 2 lists. "from_formats" and "to_formats".
-    '''
-    _ensure_pandoc_path()
-    p = subprocess.Popen(
-        [__pandoc_path, '-h'],
-        stdin=subprocess.PIPE,
-        stdout=subprocess.PIPE)
-
-    comm = p.communicate()
-    help_text = comm[0].decode().splitlines(False)
-    if p.returncode != 0 or 'Options:' not in help_text:
-        raise RuntimeError("Couldn't call pandoc to get output formats. Output from pandoc:\n%s" %
-                           str(comm))
-    txt = ' '.join(help_text[1:help_text.index('Options:')])
-
-    aux = txt.split('Output formats: ')
-    in_ = re.sub('Input\sformats:\s|\*|\[.*?\]', '', aux[0]).split(',')
-    out = re.sub('\*|\[.*?\]', '', aux[1]).split(',')
-
-    return [f.strip() for f in in_], [f.strip() for f in out]
-
-
-# copied and adapted from jupyter_nbconvert/utils/pandoc.py, Modified BSD License
-
-def _get_pandoc_version(pandoc_path):
-    p = subprocess.Popen(
-        [pandoc_path, '--version'],
-        stdin=subprocess.PIPE,
-        stdout=subprocess.PIPE)
-    comm = p.communicate()
-    out_lines = comm[0].decode().splitlines(False)
-    if p.returncode != 0 or len(out_lines) == 0:
-        raise RuntimeError("Couldn't call pandoc to get version information. Output from "
-                           "pandoc:\n%s" % str(comm))
-
-    version_pattern = re.compile(r"^\d+(\.\d+){1,}$")
-    for tok in out_lines[0].split():
-        if version_pattern.match(tok):
-            version = tok
-            break
-    return version
-
-
-def get_pandoc_version():
-    """Gets the Pandoc version if Pandoc is installed.
-
-    It will probe Pandoc for its version, cache it and return that value. If a cached version is
-    found, it will return the cached version and stop probing Pandoc
-    (unless :func:`clean_version_cache()` is called).
-
-    :raises OSError: if pandoc is not found; make sure it has been installed and is available at
-            path.
-    """
-    global __version
-
-    if __version is None:
-        _ensure_pandoc_path()
-        __version = _get_pandoc_version(__pandoc_path)
-    return __version
-
-
-def get_pandoc_path():
-    """Gets the Pandoc path if Pandoc is installed.
-
-    It will return a path to pandoc which is used by pypandoc.
-
-    This might be a full path or, if pandoc is on PATH, simple `pandoc`. It's garanteed
-    to be callable (i.e. we could get version information from `pandoc --version`).
-    If `PYPANDOC_PANDOC` is set and valid, it will return that value. If the environment
-    variable is not set, either the full path to the included pandoc or the pandoc in
-    `PATH` (whatever is the higher version) will be returned.
-
-    If a cached path is found, it will return the cached path and stop probing Pandoc
-    (unless :func:`clean_pandocpath_cache()` is called).
-
-    :raises OSError: if pandoc is not found
-    """
-    _ensure_pandoc_path()
-    return __pandoc_path
-
-
-def _ensure_pandoc_path():
-    global __pandoc_path
-
-    if __pandoc_path is None:
-        included_pandoc = os.path.join(os.path.dirname(os.path.realpath(__file__)),
-                                       "files", "pandoc")
-        search_paths = ["pandoc",  included_pandoc]
-        # If a user added the complete path to pandoc to an env, use that as the
-        # only way to get pandoc so that a user can overwrite even a higher
-        # version in some other places.
-        if os.getenv('PYPANDOC_PANDOC', None):
-            search_paths = [os.getenv('PYPANDOC_PANDOC')]
-        for path in search_paths:
-            curr_version = [0, 0, 0]
-            version_string = "0.0.0"
-            try:
-                version_string = _get_pandoc_version(path)
-            except:
-                # we can't use that path...
-                # print(e)
-                continue
-            version = [int(x) for x in version_string.split(".")]
-            while len(version) < len(curr_version):
-                version.append(0)
-            # print("%s, %s" % (path, version))
-            for pos in range(len(curr_version)):
-                # Only use the new version if it is any bigger...
-                if version[pos] > curr_version[pos]:
-                    # print("Found: %s" % path)
-                    __pandoc_path = path
-                    curr_version = version
-                    break
-
-        if __pandoc_path is None:
-            if os.path.exists('/usr/local/bin/brew'):
-                sys.stderr.write(textwrap.dedent("""\
-                    Maybe try:
-
-                        brew install pandoc
-                """))
-            elif os.path.exists('/usr/bin/apt-get'):
-                sys.stderr.write(textwrap.dedent("""\
-                    Maybe try:
-
-                        sudo apt-get install pandoc
-                """))
-            elif os.path.exists('/usr/bin/yum'):
-                sys.stderr.write(textwrap.dedent("""\
-                    Maybe try:
-
-                        sudo yum install pandoc
-                """))
-            sys.stderr.write(textwrap.dedent("""\
-                See http://johnmacfarlane.net/pandoc/installing.html
-                for installation options
-            """))
-            sys.stderr.write(textwrap.dedent("""\
-                ---------------------------------------------------------------
-
-            """))
-            raise OSError("No pandoc was found: either install pandoc and add it\n"
-                          "to your PATH or install pypandoc wheels with included pandoc.")
-
-
-# -----------------------------------------------------------------------------
-# Internal state management
-# -----------------------------------------------------------------------------
-def clean_version_cache():
-    global __version
-    __version = None
-
-
-def clean_pandocpath_cache():
-    global __pandoc_path
-    __pandoc_path = None
-
-
-__version = None
-__pandoc_path = None
+# -*- coding: utf-8 -*-
+from __future__ import with_statement, absolute_import, print_function
+
+import subprocess
+import sys
+import textwrap
+import os
+import re
+import warnings
+import tempfile
+
+from .py3compat import string_types, cast_bytes, cast_unicode
+
+from pypandoc.pandoc_download import DEFAULT_TARGET_FOLDER, download_pandoc
+
+__author__ = u'Juho Vepsäläinen'
+__version__ = '1.2.0'
+__license__ = 'MIT'
+__all__ = ['convert', 'convert_file', 'convert_text',
+           'get_pandoc_formats', 'get_pandoc_version', 'get_pandoc_path',
+           'download_pandoc']
+
+
+def convert(source, to, format=None, extra_args=(), encoding='utf-8',
+            outputfile=None, filters=None):
+    """Converts given `source` from `format` to `to` (deprecated).
+
+    :param str source: Unicode string or bytes or a file path (see encoding)
+
+    :param str to: format into which the input should be converted; can be one of
+            `pypandoc.get_pandoc_formats()[1]`
+
+    :param str format: the format of the inputs; will be inferred if input is a file with a
+            known filename extension; can be one of `pypandoc.get_pandoc_formats()[1]`
+            (Default value = None)
+
+    :param list extra_args: extra arguments (list of strings) to be passed to pandoc
+            (Default value = ())
+
+    :param str encoding: the encoding of the file or the input bytes (Default value = 'utf-8')
+
+    :param str outputfile: output will be written to outfilename or the converted content
+            returned if None (Default value = None)
+
+    :param list filters: pandoc filters e.g. filters=['pandoc-citeproc']
+
+    :returns: converted string (unicode) or an empty string if an outputfile was given
+    :rtype: unicode
+
+    :raises RuntimeError: if any of the inputs are not valid or if pandoc fails with an error
+    :raises OSError: if pandoc is not found; make sure it has been installed and is available at
+            path.
+    """
+    msg = ("Due to possible ambiguity, 'convert()' is deprecated. "
+           "Use 'convert_file()'  or 'convert_text()'.")
+    warnings.warn(msg, DeprecationWarning, stacklevel=2)
+
+    path = _identify_path(source)
+    if path:
+        format = _identify_format_from_path(source, format)
+        input_type = 'path'
+    else:
+        source = _as_unicode(source, encoding)
+        input_type = 'string'
+        if not format:
+            raise RuntimeError("Format missing, but need one (identified source as text as no "
+                               "file with that name was found).")
+    return _convert_input(source, format, input_type, to, extra_args=extra_args,
+                          outputfile=outputfile, filters=filters)
+
+
+def convert_text(source, to, format, extra_args=(), encoding='utf-8',
+                 outputfile=None, filters=None):
+
+    """Converts given `source` from `format` to `to`.
+
+    :param str source: Unicode string or bytes (see encoding)
+
+    :param str to: format into which the input should be converted; can be one of
+            `pypandoc.get_pandoc_formats()[1]`
+
+    :param str format: the format of the inputs; can be one of `pypandoc.get_pandoc_formats()[1]`
+
+    :param list extra_args: extra arguments (list of strings) to be passed to pandoc
+            (Default value = ())
+
+    :param str encoding: the encoding of the input bytes (Default value = 'utf-8')
+
+    :param str outputfile: output will be written to outfilename or the converted content
+            returned if None (Default value = None)
+
+    :param list filters: pandoc filters e.g. filters=['pandoc-citeproc']
+
+    :returns: converted string (unicode) or an empty string if an outputfile was given
+    :rtype: unicode
+
+    :raises RuntimeError: if any of the inputs are not valid or if pandoc fails with an error
+    :raises OSError: if pandoc is not found; make sure it has been installed and is available at
+            path.
+    """
+    source = _as_unicode(source, encoding)
+    return _convert_input(source, format, 'string', to, extra_args=extra_args,
+                          outputfile=outputfile, filters=filters)
+
+
+def convert_file(source_file, to, format=None, extra_args=(), encoding='utf-8',
+                 outputfile=None, filters=None):
+    """Converts given `source_file` from `format` to `to`.
+
+    :param str source_file: file path (see encoding)
+
+    :param str to: format into which the input should be converted; can be one of
+            `pypandoc.get_pandoc_formats()[1]`
+
+    :param str format: the format of the inputs; will be inferred from the source_file with a
+            known filename extension; can be one of `pypandoc.get_pandoc_formats()[1]`
+            (Default value = None)
+
+    :param list extra_args: extra arguments (list of strings) to be passed to pandoc
+            (Default value = ())
+
+    :param str encoding: the encoding of the file or the input bytes (Default value = 'utf-8')
+
+    :param str outputfile: output will be written to outfilename or the converted content
+            returned if None (Default value = None)
+
+    :param list filters: pandoc filters e.g. filters=['pandoc-citeproc']
+
+    :returns: converted string (unicode) or an empty string if an outputfile was given
+    :rtype: unicode
+
+    :raises RuntimeError: if any of the inputs are not valid or if pandoc fails with an error
+    :raises OSError: if pandoc is not found; make sure it has been installed and is available at
+            path.
+    """
+    if not _identify_path(source_file):
+        raise RuntimeError("source_file is not a valid path")
+    format = _identify_format_from_path(source_file, format)
+    return _convert_input(source_file, format, 'path', to, extra_args=extra_args,
+                          outputfile=outputfile, filters=filters)
+
+
+def _identify_path(source):
+    try:
+        path = os.path.exists(source)
+    except UnicodeEncodeError:
+        path = os.path.exists(source.encode('utf-8'))
+    except ValueError:
+        path = False
+    except TypeError:
+        # source is None...
+        path = False
+    return path
+
+
+def _identify_format_from_path(sourcefile, format):
+    return format or os.path.splitext(sourcefile)[1].strip('.')
+
+
+def _as_unicode(source, encoding):
+    if encoding != 'utf-8':
+        # if a source and a different encoding is given, try to decode the source into a
+        # unicode string
+        try:
+            source = cast_unicode(source, encoding=encoding)
+        except (UnicodeDecodeError, UnicodeEncodeError):
+            pass
+    return source
+
+
+def _identify_input_type(source, format, encoding='utf-8'):
+    path = _identify_path(source)
+    if path:
+        format = _identify_format_from_path(source, format)
+        input_type = 'path'
+    else:
+        source = _as_unicode(source, encoding)
... 1431 lines suppressed ...
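
For readers of the truncated diff above: the new `convert_file()` falls back to the file extension when no `format` is passed, via `_identify_format_from_path()`. The sketch below reproduces that one-line rule in isolation so its edge cases are easy to see. The name `infer_format` is hypothetical and is not part of pypandoc; only the `os.path.splitext` expression is taken from the commit.

```python
import os


def infer_format(source_file, fmt=None):
    """Return `fmt` if given, otherwise the file extension without the dot.

    Hypothetical stand-alone copy of the rule used by
    `_identify_format_from_path()` in the diff; not a pypandoc API.
    """
    return fmt or os.path.splitext(source_file)[1].strip('.')


# An explicit format always wins over the extension.
print(infer_format("notes.txt", "rst"))

# Without an explicit format, the last extension is used.
print(infer_format("notes.md"))
print(infer_format("archive.tar.gz"))

# A file without an extension yields an empty string, so a caller
# would still have to pass a format explicitly in that case.
print(repr(infer_format("README")))
```

Note that the rule only looks at the last extension and does no validation; whether the resulting string is a format pandoc accepts is decided later in the conversion.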

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/python-modules/packages/pypandoc.git