[Python-modules-commits] [python-nameparser] 01/04: import python-nameparser_0.3.11.orig.tar.gz

Edward Betts edward at moszumanska.debian.org
Fri Feb 26 09:20:23 UTC 2016


This is an automated email from the git hooks/post-receive script.

edward pushed a commit to branch master
in repository python-nameparser.

commit 5ac427392709a183228d48685054dfaf80d839fe
Author: Edward Betts <edward at 4angle.com>
Date:   Fri Feb 26 09:15:46 2016 +0000

    import python-nameparser_0.3.11.orig.tar.gz
---
 .gitignore                          |   17 +
 .hgignore                           |    8 +
 .tm_properties                      |    1 +
 .travis.yml                         |   16 +
 AUTHORS                             |    1 +
 CONTRIBUTING.md                     |   82 ++
 LICENSE                             |   19 +
 MANIFEST.in                         |    3 +
 README.rst                          |  118 +++
 dev-requirements.txt                |    7 +
 docs/Makefile                       |  177 ++++
 docs/conf.py                        |  269 +++++
 docs/contributing.rst               |   20 +
 docs/customize.rst                  |  261 +++++
 docs/index.rst                      |   85 ++
 docs/modules.rst                    |   36 +
 docs/release_log.rst                |   80 ++
 docs/resources.rst                  |   14 +
 docs/usage.rst                      |  135 +++
 nameparser/__init__.py              |    9 +
 nameparser/config/__init__.py       |  185 ++++
 nameparser/config/capitalization.py |   13 +
 nameparser/config/conjunctions.py   |   18 +
 nameparser/config/prefixes.py       |   28 +
 nameparser/config/regexes.py        |   17 +
 nameparser/config/suffixes.py       |  120 +++
 nameparser/config/titles.py         |  413 ++++++++
 nameparser/parser.py                |  706 +++++++++++++
 nameparser/util.py                  |   39 +
 setup.py                            |   40 +
 tests.py                            | 1984 +++++++++++++++++++++++++++++++++++
 31 files changed, 4921 insertions(+)

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..c8428c9
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,17 @@
+*.hgrc
+*.DS_Store
+__pycache__/
+*.py[cod]
+.python2/
+MANIFEST
+nameparser.egg-info/
+dummycert.pem
+build
+*.egg
+.env
+.env3
+.coverage
+dist
+
+# docs
+docs/_*
diff --git a/.hgignore b/.hgignore
new file mode 100644
index 0000000..b7123be
--- /dev/null
+++ b/.hgignore
@@ -0,0 +1,8 @@
+.hgrc
+.DS_Store
+.pyc
+__pycache__
+.python2
+MANIFEST
+nameparser.egg-info/*
+dummycert.pem
\ No newline at end of file
diff --git a/.tm_properties b/.tm_properties
new file mode 100644
index 0000000..3f04e72
--- /dev/null
+++ b/.tm_properties
@@ -0,0 +1 @@
+excludeDirectories = "{$excludeDirectories,dist,*.egg-info,build,docs/_*}"
diff --git a/.travis.yml b/.travis.yml
new file mode 100644
index 0000000..571244d
--- /dev/null
+++ b/.travis.yml
@@ -0,0 +1,16 @@
+language: python
+python:
+  - "2.6"
+  - "2.7"
+  - "3.2"
+  - "3.3"
+  - "3.4"
+  - "3.5"
+# command to install dependencies
+install: 
+  - if [[ $TRAVIS_PYTHON_VERSION == '2.6' ]]; then pip install --use-mirrors unittest2; fi
+  - "pip install dill"
+  - "python setup.py install"
+# command to run tests
+script: python tests.py
+sudo: false
diff --git a/AUTHORS b/AUTHORS
new file mode 100644
index 0000000..08e5031
--- /dev/null
+++ b/AUTHORS
@@ -0,0 +1 @@
+Derek Gulbranson <derek73 at gmail.com>
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..b4a0208
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,82 @@
+Contributing
+==============
+
+Development Environment Setup
+--------------------------------
+
+There are some external dependencies required in order to run the
+tests, located in the dev-requirements.txt file.
+
+    pip install -r dev-requirements.txt
+
+If you are running Python 2.6 you will also need to `pip install unittest2`
+in order to run the tests.
+
+Travis CI
+---------
+
+[![Build Status](https://travis-ci.org/derek73/python-nameparser.svg?branch=master)](https://travis-ci.org/derek73/python-nameparser)
+
+The GitHub project is set up with Travis CI. Tests are run
+automatically against new code pushes to any branch in the main
+repository. Test results may be viewed here:
+
+https://travis-ci.org/derek73/python-nameparser
+
+Running Tests
+---------------
+
+To run the tests locally, run `python tests.py`.
+
+
+    python tests.py
+
+
+You can also pass a name string to `tests.py` to see how it will be parsed.
+
+    $ python tests.py "Secretary of State Hillary Rodham-Clinton"
+    <HumanName : [
+    	Title: 'Secretary of State' 
+    	First: 'Hillary' 
+    	Middle: '' 
+    	Last: 'Rodham-Clinton' 
+    	Suffix: ''
+    ]>
+
+
+Writing Tests
+----------------
+
+If you make changes, please make sure you include tests with example
+names that you want to be parsed correctly.
+
+It's a good idea to include tests of alternate comma placement formats
+of the name to ensure that the 3 code paths for the 3 formats work in
+the same way.
+
+The tests could be MUCH better. If the spirit moves you to design or
+implement a much more intelligent test strategy, please know that your
+efforts will be welcome and appreciated.
+
+Unless you add better coverage someplace else, add a few examples of
+your names to `TEST_NAMES`. A test attempts to try the 3 different
+comma variations of these names automatically and make sure things
+don't blow up, so it can be a helpful regression indicator.
+
+
+Provide Example Data
+----------------------
+
+We humans are the learning machine behind this code, and we can't do
+it without real world data. If it doesn't work, start a new issue
+because we probably don't know.
+
+If you have a dataset that has lots of issues, add the data to a
+[gist](https://gist.github.com) and [create a new
+issue](https://github.com/derek73/python-nameparser/issues) so we can
+try to get it working as expected.
+
+Feel free to update this documentation to address any questions that I
+missed. GitHub makes it pretty easy to edit it right on the web site
+now.
+
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..9aaf664
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,19 @@
+Copyright Derek Gulbranson <derek73 at gmail>.
+http://derekgulbranson.com/
+
+Parser logic based on PHP nameParser.php by G. Miernicki
+http://code.google.com/p/nameparser/
+
+-----
+
+LGPL
+http://www.opensource.org/licenses/lgpl-license.html
+
+This library is free software; you can redistribute it and/or modify it under the
+terms of the GNU Lesser General Public License as published by the Free Software
+Foundation; either version 2.1 of the License, or (at your option) any later
+version.
+
+This library is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
+PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
diff --git a/MANIFEST.in b/MANIFEST.in
new file mode 100644
index 0000000..7c985df
--- /dev/null
+++ b/MANIFEST.in
@@ -0,0 +1,3 @@
+include AUTHORS
+include LICENSE
+include README.rst
diff --git a/README.rst b/README.rst
new file mode 100644
index 0000000..c42aa1c
--- /dev/null
+++ b/README.rst
@@ -0,0 +1,118 @@
+Name Parser
+===========
+
+.. image:: https://travis-ci.org/derek73/python-nameparser.svg?branch=master
+   :target: https://travis-ci.org/derek73/python-nameparser
+.. image:: https://badge.fury.io/py/nameparser.svg
+    :target: http://badge.fury.io/py/nameparser
+
+A simple Python (3.2+ & 2.6+) module for parsing human names into their individual
+components. The HumanName class splits a name string up into name parts
+based on placement in the string and matches against known name pieces
+like titles. It joins name pieces on conjunctions and special prefixes to
+last names like "del". Titles can be chained together and include conjunctions
+to handle titles like "Asst Secretary of State". It can also try to 
+correct capitalization of all upper or lowercase names.
+
+It attempts the best guess that can be made with a simple, rule-based approach. 
+Unicode is supported, but the parser is not likely to be useful for languages 
+that do not share the same structure as English names. It's not perfect, but it 
+gets you pretty far.
+
+Quick Start Example
+-------------------
+
+::
+
+    >>> from nameparser import HumanName
+    >>> name = HumanName("Dr. Juan Q. Xavier de la Vega III (Doc Vega)")
+    >>> name 
+    <HumanName : [
+    	title: 'Dr.' 
+    	first: 'Juan' 
+    	middle: 'Q. Xavier' 
+    	last: 'de la Vega' 
+    	suffix: 'III'
+    	nickname: 'Doc Vega'
+    ]>
+    >>> name.last
+    'de la Vega'
+    >>> name.as_dict()
+    {'last': 'de la Vega', 'suffix': 'III', 'title': 'Dr.', 'middle': 'Q. Xavier', 'nickname': 'Doc Vega', 'first': 'Juan'}
+    >>> name.string_format = "{first} {last}"
+    >>> str(name)
+    'Juan de la Vega'
+
+
+3 different comma placement variations are supported for the string that you pass.
+
+* Title Firstname "Nickname" Middle Middle Lastname Suffix
+* Lastname [Suffix], Title Firstname (Nickname) Middle Middle[,] Suffix [, Suffix]
+* Title Firstname M Lastname [Suffix], Suffix [Suffix] [, Suffix]
+
+The parser does not make any attempt to clean the data. It mostly just splits on white
+space and puts things in buckets based on their position in the string. This also means
+the difference between 'title' and 'suffix' is positional, not semantic. ("Pre-nominal"
+and "post-nominal" would probably be better names.)
+
+::
+
+    >>> name = HumanName("1 & 2, 3 4 5, Mr.")
+    >>> name 
+    <HumanName : [
+    	title: '' 
+    	first: '3' 
+    	middle: '4 5' 
+    	last: '1 & 2' 
+    	suffix: 'Mr.'
+    	nickname: ''
+    ]>
+
+Customization
+-------------
+
+Your project may need some adjustments for your dataset. You can
+do this in your own pre- or post-processing, by `customizing the configured pre-defined 
+sets`_ of titles, prefixes, etc., or by subclassing the `HumanName` class. See the 
+`full documentation`_ for more information.
+
+
+`Full documentation`_
+~~~~~~~~~~~~~~~~~~~~~
+
+.. _customizing the configured pre-defined sets: http://nameparser.readthedocs.org/en/latest/customize.html
+.. _Full documentation: http://nameparser.readthedocs.org/en/latest/
+
+
+Installation
+------------
+
+``pip install nameparser``
+
+If you want to try out the latest code from GitHub you can
+install with pip using the command below.
+
+``pip install -e git+git://github.com/derek73/python-nameparser.git#egg=nameparser``
+
+If you're looking for a web service, check out
+`eyeseast's nameparse service <https://github.com/eyeseast/nameparse>`_, a
+simple Heroku-friendly Flask wrapper for this module.
+
+
+Contributing
+------------
+
+If you come across a name piece that you think should be in the default config, you're
+probably right. `Start a New Issue`_ and we can get it added. 
+
+Please let me know if there are ways this library could be structured to make
+it easier for you to use in your projects. Read CONTRIBUTING.md_ for more info
+on running the tests and contributing to the project.
+
+**GitHub Project**
+
+https://github.com/derek73/python-nameparser
+
+.. _CONTRIBUTING.md: https://github.com/derek73/python-nameparser/tree/master/CONTRIBUTING.md
+.. _Start a New Issue: https://github.com/derek73/python-nameparser/issues
+.. _click here to propose changes to the titles: https://github.com/derek73/python-nameparser/edit/master/nameparser/config/titles.py
\ No newline at end of file
diff --git a/dev-requirements.txt b/dev-requirements.txt
new file mode 100644
index 0000000..cdd1ba4
--- /dev/null
+++ b/dev-requirements.txt
@@ -0,0 +1,7 @@
+ipdb==0.8.1
+nose==1.3.7
+Sphinx==1.3.1
+coverage==3.7.1
+ipython==4.0.0
+Pygments==2.0.2
+dill==0.2.4
diff --git a/docs/Makefile b/docs/Makefile
new file mode 100644
index 0000000..ff4d382
--- /dev/null
+++ b/docs/Makefile
@@ -0,0 +1,177 @@
+# Makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS    =
+SPHINXBUILD   = sphinx-build
+PAPER         =
+BUILDDIR      = _build
+
+# User-friendly check for sphinx-build
+ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
+$(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
+endif
+
+# Internal variables.
+PAPEROPT_a4     = -D latex_paper_size=a4
+PAPEROPT_letter = -D latex_paper_size=letter
+ALLSPHINXOPTS   = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
+# the i18n builder cannot share the environment and doctrees with the others
+I18NSPHINXOPTS  = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
+
+.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext
+
+help:
+	@echo "Please use \`make <target>' where <target> is one of"
+	@echo "  html       to make standalone HTML files"
+	@echo "  dirhtml    to make HTML files named index.html in directories"
+	@echo "  singlehtml to make a single large HTML file"
+	@echo "  pickle     to make pickle files"
+	@echo "  json       to make JSON files"
+	@echo "  htmlhelp   to make HTML files and a HTML help project"
+	@echo "  qthelp     to make HTML files and a qthelp project"
+	@echo "  devhelp    to make HTML files and a Devhelp project"
+	@echo "  epub       to make an epub"
+	@echo "  latex      to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
+	@echo "  latexpdf   to make LaTeX files and run them through pdflatex"
+	@echo "  latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
+	@echo "  text       to make text files"
+	@echo "  man        to make manual pages"
+	@echo "  texinfo    to make Texinfo files"
+	@echo "  info       to make Texinfo files and run them through makeinfo"
+	@echo "  gettext    to make PO message catalogs"
+	@echo "  changes    to make an overview of all changed/added/deprecated items"
+	@echo "  xml        to make Docutils-native XML files"
+	@echo "  pseudoxml  to make pseudoxml-XML files for display purposes"
+	@echo "  linkcheck  to check all external links for integrity"
+	@echo "  doctest    to run all doctests embedded in the documentation (if enabled)"
+
+clean:
+	rm -rf $(BUILDDIR)/*
+
+html:
+	$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
+	@echo
+	@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
+
+dirhtml:
+	$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
+	@echo
+	@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
+
+singlehtml:
+	$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
+	@echo
+	@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
+
+pickle:
+	$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
+	@echo
+	@echo "Build finished; now you can process the pickle files."
+
+json:
+	$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
+	@echo
+	@echo "Build finished; now you can process the JSON files."
+
+htmlhelp:
+	$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
+	@echo
+	@echo "Build finished; now you can run HTML Help Workshop with the" \
+	      ".hhp project file in $(BUILDDIR)/htmlhelp."
+
+qthelp:
+	$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
+	@echo
+	@echo "Build finished; now you can run "qcollectiongenerator" with the" \
+	      ".qhcp project file in $(BUILDDIR)/qthelp, like this:"
+	@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/Nameparser.qhcp"
+	@echo "To view the help file:"
+	@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/Nameparser.qhc"
+
+devhelp:
+	$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
+	@echo
+	@echo "Build finished."
+	@echo "To view the help file:"
+	@echo "# mkdir -p $$HOME/.local/share/devhelp/Nameparser"
+	@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/Nameparser"
+	@echo "# devhelp"
+
+epub:
+	$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
+	@echo
+	@echo "Build finished. The epub file is in $(BUILDDIR)/epub."
+
+latex:
+	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
+	@echo
+	@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
+	@echo "Run \`make' in that directory to run these through (pdf)latex" \
+	      "(use \`make latexpdf' here to do that automatically)."
+
+latexpdf:
+	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
+	@echo "Running LaTeX files through pdflatex..."
+	$(MAKE) -C $(BUILDDIR)/latex all-pdf
+	@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
+
+latexpdfja:
+	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
+	@echo "Running LaTeX files through platex and dvipdfmx..."
+	$(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
+	@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
+
+text:
+	$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
+	@echo
+	@echo "Build finished. The text files are in $(BUILDDIR)/text."
+
+man:
+	$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
+	@echo
+	@echo "Build finished. The manual pages are in $(BUILDDIR)/man."
+
+texinfo:
+	$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
+	@echo
+	@echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
+	@echo "Run \`make' in that directory to run these through makeinfo" \
+	      "(use \`make info' here to do that automatically)."
+
+info:
+	$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
+	@echo "Running Texinfo files through makeinfo..."
+	make -C $(BUILDDIR)/texinfo info
+	@echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."
+
+gettext:
+	$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
+	@echo
+	@echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."
+
+changes:
+	$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
+	@echo
+	@echo "The overview file is in $(BUILDDIR)/changes."
+
+linkcheck:
+	$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
+	@echo
+	@echo "Link check complete; look for any errors in the above output " \
+	      "or in $(BUILDDIR)/linkcheck/output.txt."
+
+doctest:
+	$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
+	@echo "Testing of doctests in the sources finished, look at the " \
+	      "results in $(BUILDDIR)/doctest/output.txt."
+
+xml:
+	$(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
+	@echo
+	@echo "Build finished. The XML files are in $(BUILDDIR)/xml."
+
+pseudoxml:
+	$(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
+	@echo
+	@echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."
diff --git a/docs/conf.py b/docs/conf.py
new file mode 100644
index 0000000..0595889
--- /dev/null
+++ b/docs/conf.py
@@ -0,0 +1,269 @@
+# -*- coding: utf-8 -*-
+#
+# Nameparser documentation build configuration file, created by
+# sphinx-quickstart on Fri May 16 01:29:58 2014.
+#
+# This file is execfile()d with the current directory set to its
+# containing dir.
+#
+# Note that not all possible configuration values are present in this
+# autogenerated file.
+#
+# All configuration values have a default; values that are commented out
+# serve to show the default.
+
+import sys
+import os
+from datetime import date
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+sys.path.insert(0, os.path.abspath('..'))
+import nameparser
+
+# -- General configuration ------------------------------------------------
+
+# If your documentation needs a minimal Sphinx version, state it here.
+#needs_sphinx = '1.0'
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
+# ones.
+extensions = [
+    'sphinx.ext.autodoc',
+    'sphinx.ext.doctest',
+    'sphinx.ext.viewcode',
+]
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ['_templates']
+
+# The suffix of source filenames.
+source_suffix = '.rst'
+
+# The encoding of source files.
+#source_encoding = 'utf-8-sig'
+
+# The master toctree document.
+master_doc = 'index'
+
+# General information about the project.
+project = u'Nameparser'
+copyright = u'{:%Y}, Derek Gulbranson'.format(date.today())
+
+# The version info for the project you're documenting, acts as replacement for
+# |version| and |release|, also used in various other places throughout the
+# built documents.
+#
+# The short X.Y version.
+version = nameparser.__version__
+# The full version, including alpha/beta/rc tags.
+release = version
+
+# The language for content autogenerated by Sphinx. Refer to documentation
+# for a list of supported languages.
+#language = None
+
+# There are two options for replacing |today|: either, you set today to some
+# non-false value, then it is used:
+#today = ''
+# Else, today_fmt is used as the format for a strftime call.
+#today_fmt = '%B %d, %Y'
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+exclude_patterns = ['_build']
+
+# The reST default role (used for this markup: `text`) to use for all
+# documents.
+#default_role = None
+
+# If true, '()' will be appended to :func: etc. cross-reference text.
+#add_function_parentheses = True
+
+# If true, the current module name will be prepended to all description
+# unit titles (such as .. function::).
+#add_module_names = True
+
+# If true, sectionauthor and moduleauthor directives will be shown in the
+# output. They are ignored by default.
+#show_authors = False
+
+# The name of the Pygments (syntax highlighting) style to use.
+pygments_style = 'sphinx'
+
+# A list of ignored prefixes for module index sorting.
+#modindex_common_prefix = []
+
+# If true, keep warnings as "system message" paragraphs in the built documents.
+#keep_warnings = False
+
+
+# -- Options for HTML output ----------------------------------------------
+
+# The theme to use for HTML and HTML Help pages.  See the documentation for
+# a list of builtin themes.
+html_theme = 'default'
+
+# Theme options are theme-specific and customize the look and feel of a theme
+# further.  For a list of options available for each theme, see the
+# documentation.
+#html_theme_options = {}
+
+# Add any paths that contain custom themes here, relative to this directory.
+#html_theme_path = []
+
+# The name for this set of Sphinx documents.  If None, it defaults to
+# "<project> v<release> documentation".
+#html_title = None
+
+# A shorter title for the navigation bar.  Default is the same as html_title.
+#html_short_title = None
+
+# The name of an image file (relative to this directory) to place at the top
+# of the sidebar.
+#html_logo = None
+
+# The name of an image file (within the static path) to use as favicon of the
+# docs.  This file should be a Windows icon file (.ico) being 16x16 or 32x32
+# pixels large.
+#html_favicon = None
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ['_static']
+
+# Add any extra paths that contain custom files (such as robots.txt or
+# .htaccess) here, relative to this directory. These files are copied
+# directly to the root of the documentation.
+#html_extra_path = []
+
+# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
+# using the given strftime format.
+#html_last_updated_fmt = '%b %d, %Y'
+
+# If true, SmartyPants will be used to convert quotes and dashes to
+# typographically correct entities.
+#html_use_smartypants = True
+
+# Custom sidebar templates, maps document names to template names.
+#html_sidebars = {}
+
+# Additional templates that should be rendered to pages, maps page names to
+# template names.
+#html_additional_pages = {}
+
+# If false, no module index is generated.
+#html_domain_indices = True
+
+# If false, no index is generated.
+#html_use_index = True
+
+# If true, the index is split into individual pages for each letter.
+#html_split_index = False
+
+# If true, links to the reST sources are added to the pages.
+#html_show_sourcelink = True
+
+# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
+#html_show_sphinx = True
+
+# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
+#html_show_copyright = True
+
+# If true, an OpenSearch description file will be output, and all pages will
+# contain a <link> tag referring to it.  The value of this option must be the
+# base URL from which the finished HTML is served.
+#html_use_opensearch = ''
+
+# This is the file name suffix for HTML files (e.g. ".xhtml").
+#html_file_suffix = None
+
+# Output file base name for HTML help builder.
+htmlhelp_basename = 'Nameparserdoc'
+
+
+# -- Options for LaTeX output ---------------------------------------------
+
+latex_elements = {
+# The paper size ('letterpaper' or 'a4paper').
+#'papersize': 'letterpaper',
+
+# The font size ('10pt', '11pt' or '12pt').
+#'pointsize': '10pt',
+
+# Additional stuff for the LaTeX preamble.
+#'preamble': '',
+}
+
+# Grouping the document tree into LaTeX files. List of tuples
+# (source start file, target name, title,
+#  author, documentclass [howto, manual, or own class]).
+latex_documents = [
+  ('index', 'Nameparser.tex', u'Nameparser Documentation',
+   u'Derek Gulbranson', 'manual'),
+]
+
+# The name of an image file (relative to this directory) to place at the top of
+# the title page.
+#latex_logo = None
+
+# For "manual" documents, if this is true, then toplevel headings are parts,
+# not chapters.
+#latex_use_parts = False
+
+# If true, show page references after internal links.
+#latex_show_pagerefs = False
+
+# If true, show URL addresses after external links.
+#latex_show_urls = False
+
+# Documents to append as an appendix to all manuals.
+#latex_appendices = []
+
+# If false, no module index is generated.
+#latex_domain_indices = True
+
+
+# -- Options for manual page output ---------------------------------------
+
+# One entry per manual page. List of tuples
+# (source start file, name, description, authors, manual section).
+man_pages = [
+    ('index', 'nameparser', u'Nameparser Documentation',
+     [u'Derek Gulbranson'], 1)
+]
+
+# If true, show URL addresses after external links.
+#man_show_urls = False
+
+
+# -- Options for Texinfo output -------------------------------------------
+
+# Grouping the document tree into Texinfo files. List of tuples
+# (source start file, target name, title, author,
+#  dir menu entry, description, category)
+texinfo_documents = [
+  ('index', 'Nameparser', u'Nameparser Documentation',
+   u'Derek Gulbranson', 'Nameparser', 'A simple Python module for parsing human names into components.',
+   'Miscellaneous'),
+]
+
+# Documents to append as an appendix to all manuals.
+#texinfo_appendices = []
+
+# If false, no module index is generated.
+#texinfo_domain_indices = True
+
+# How to display URL addresses: 'footnote', 'no', or 'inline'.
+#texinfo_show_urls = 'footnote'
+
+# If true, do not generate a @detailmenu in the "Top" node's menu.
+#texinfo_no_detailmenu = False
+
+doctest_global_setup = """from nameparser import HumanName
+from nameparser.config import CONSTANTS, Constants
+CONSTANTS = Constants()
+"""
diff --git a/docs/contributing.rst b/docs/contributing.rst
new file mode 100644
index 0000000..90f53d4
--- /dev/null
+++ b/docs/contributing.rst
@@ -0,0 +1,20 @@
+Contributing
+============
+
+The project is hosted on GitHub:
+
+https://github.com/derek73/python-nameparser
+
+Find more information about running tests and contributing to the project in the project's contribution guide.
+
+https://github.com/derek73/python-nameparser/blob/master/CONTRIBUTING.md
+
+Providing Example Data
+----------------------
+
+We humans are the learning machine behind this code, and we can't do it without real world data. If it doesn't work, start a new issue because we probably don't know. 
+
+If you have a dataset that has lots of issues, add the data to a `gist <https://gist.github.com>`_ and `create a new issue <https://github.com/derek73/python-nameparser/issues>`_ so we can try to get it working as expected.
+
+Feel free to update this documentation to address any questions that I missed. GitHub makes it pretty easy to edit it right on the web site now. 
+
diff --git a/docs/customize.rst b/docs/customize.rst
new file mode 100644
index 0000000..68d0bd0
--- /dev/null
+++ b/docs/customize.rst
@@ -0,0 +1,261 @@
+Pre-processing
+=================
+
+
+Name buckets
+++++++++++++++
+
+Each attribute has a corresponding ordered list of name pieces. 
+
+* o.title_list
+* o.first_list
+* o.middle_list
+* o.last_list
+* o.suffix_list
+* o.nickname_list
+
+If you're doing pre- or post-processing you may wish to manipulate these lists directly. 
+The strings returned by the attribute names just join these lists with spaces.
+
+::
+
+  >>> hn = HumanName("Juan Q. Xavier Velasquez y Garcia, Jr.")
+  >>> hn.middle_list
+  [u'Q.', u'Xavier']
+  >>> hn.middle_list += ["Ricardo"]
+  >>> hn.middle_list
+  [u'Q.', u'Xavier', 'Ricardo']
+
+
+You can also replace any name bucket's contents by assigning a string or a list
+directly to the attribute.
+
+::
+
+  >>> hn = HumanName("Dr. John A. Kenneth Doe")
+  >>> hn.title = ["Associate","Professor"]
+  >>> hn.suffix = "Md."
+  >>> hn
+  <HumanName : [
+  	title: 'Associate Professor' 
+  	first: 'John' 
+  	middle: 'A. Kenneth' 
+  	last: 'Doe' 
+  	suffix: 'Md.'
+  	nickname: ''
+  ]>
+
+
+Customizing the Parser with Your Own Configuration
+==================================================
+
+Recognition of titles, prefixes, suffixes and conjunctions is provided by
+matching the lower case characters of a name piece with pre-defined sets
+of strings located in :py:mod:`nameparser.config`. You can easily adjust
+these predefined sets to help fine tune the parser for your dataset.
+
+
+Changing the Predefined Variables
++++++++++++++++++++++++++++++++++
+
+There are a few ways to adjust the parser configuration depending on your
+needs. The config is available in two places.
+
+The first is via ``from nameparser.config import CONSTANTS``.
+
+.. doctest::
+
+    >>> from nameparser.config import CONSTANTS
+    >>> CONSTANTS
+    <Constants() instance>
+
+The other is the ``C`` attribute of a ``HumanName`` instance, e.g.
+``hn.C``.
+
+.. doctest::
+
+    >>> from nameparser import HumanName
+    >>> hn = HumanName("Dean Robert Johns")
+    >>> hn.C
+    <Constants() instance>
+
+Both places are usually a reference to the same shared module-level 
+:py:class:`~nameparser.config.Constants` instance, depending on how you 
+instantiate the :py:class:`~nameparser.parser.HumanName` class (see below).
+
+Take a look at the :py:mod:`nameparser.config` documentation to see what's
+in the constants. Here's a quick walk through of some examples where you
+might want to adjust them.
+
+
+Parser Customization Examples
++++++++++++++++++++++++++++++
+
+"Hon" is a common abbreviation for "Honorable", a title used when
+addressing judges, and is included in the default titles constant. This
+means it will never be considered a first name, because titles are the
+pieces before first names. 
+
+But "Hon" is also sometimes a first name. If your dataset contains more
+"Hon"s than "Honorable"s, you may wish to remove it from the titles
+constant so that "Hon" can be parsed as a first name.
+
+.. doctest::
+    :options: +ELLIPSIS, +NORMALIZE_WHITESPACE
+
+    >>> from nameparser import HumanName
+    >>> hn = HumanName("Hon Solo")
+    >>> hn
+    <HumanName : [
+    	title: 'Hon' 
+    	first: '' 
+    	middle: '' 
+    	last: 'Solo' 
+    	suffix: ''
+    	nickname: ''
+    ]>
+    >>> from nameparser.config import CONSTANTS
+    >>> CONSTANTS.titles.remove('hon')
+    SetManager(set([u'msgt', ..., u'adjutant']))
+    >>> hn = HumanName("Hon Solo")
+    >>> hn
+    <HumanName : [
+    	title: '' 
+    	first: 'Hon' 
+    	middle: '' 
+    	last: 'Solo' 
+    	suffix: ''
+    	nickname: ''
+    ]>
+
+
+"Dean" is a common first name so it is not included in the default titles
+constant. But in some contexts it is more common as a title. If you would
+like "Dean" to be parsed as a title, simply add it to the titles constant.
+
+You can pass multiple strings to both the ``add()`` and ``remove()``
+methods and each string will be added or removed. Both functions
... 4151 lines suppressed ...

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/python-modules/packages/python-nameparser.git


