[med-svn] [python-avro] 07/14: Imported Upstream version 1.8.0~rc0+dfsg
Afif Elghraoui
afif-guest at moszumanska.debian.org
Sun Oct 25 00:46:24 UTC 2015
This is an automated email from the git hooks/post-receive script.
afif-guest pushed a commit to branch master
in repository python-avro.
commit 9d6eb1fc1ed7c9efe1790ca15732688cb4ef5af4
Author: Afif Elghraoui <afif at ghraoui.name>
Date: Sat Oct 24 15:51:54 2015 -0700
Imported Upstream version 1.8.0~rc0+dfsg
---
.gitignore | 4 +
BUILD.txt | 18 +-
CHANGES.txt | 181 ++
README.txt | 2 +-
build.sh | 40 +-
doc/src/content/xdocs/gettingstartedpython.xml | 20 +-
doc/src/content/xdocs/spec.xml | 63 +-
lang/py/build.xml | 52 +-
lang/py/ivy.xml | 24 +
lang/py/ivysettings.xml | 30 +
lang/py/lib/pyAntTasks-1.3-LICENSE.txt | 202 --
lang/py/lib/pyAntTasks-1.3.jar | Bin 18788 -> 0 bytes
lang/py/lib/simplejson/LICENSE.txt | 19 -
lang/py/lib/simplejson/__init__.py | 318 ---
lang/py/lib/simplejson/_speedups.c | 2329 --------------------
lang/py/lib/simplejson/decoder.py | 354 ---
lang/py/lib/simplejson/encoder.py | 440 ----
lang/py/lib/simplejson/scanner.py | 65 -
lang/py/lib/simplejson/tool.py | 37 -
lang/py/src/avro/schema.py | 6 +-
lang/py/src/avro/tether/__init__.py | 7 +
lang/py/src/avro/tether/tether_task.py | 498 +++++
lang/py/src/avro/tether/tether_task_runner.py | 227 ++
lang/py/src/avro/tether/util.py | 34 +
lang/py/test/mock_tether_parent.py | 95 +
lang/py/test/set_avro_test_path.py | 40 +
lang/py/test/test_datafile.py | 3 +
lang/py/test/test_datafile_interop.py | 3 +
lang/py/test/test_io.py | 3 +
lang/py/test/test_ipc.py | 2 +
lang/py/test/test_schema.py | 6 +
lang/py/test/test_tether_task.py | 116 +
lang/py/test/test_tether_task_runner.py | 191 ++
lang/py/test/test_tether_word_count.py | 213 ++
lang/py/test/word_count_task.py | 96 +
lang/py3/avro/schema.py | 8 +-
lang/py3/avro/tests/run_tests.py | 1 +
.../avro/tests/test_enum.py} | 40 +-
lang/py3/avro/tests/test_schema.py | 5 +
lang/py3/setup.py | 5 +-
pom.xml | 8 +-
share/VERSION.txt | 2 +-
share/docker/Dockerfile | 58 +
share/rat-excludes.txt | 1 +
.../org/apache/avro/ipc/trace/avroTrace.avdl | 68 -
.../org/apache/avro/ipc/trace/avroTrace.avpr | 82 -
share/test/schemas/http.avdl | 66 +
share/test/schemas/reserved.avsc | 2 +
share/test/schemas/specialtypes.avdl | 98 +
49 files changed, 2221 insertions(+), 3961 deletions(-)
diff --git a/.gitignore b/.gitignore
index 8c6b133..372789a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,7 @@
+*.iml
+*.ipr
+*.iws
+.idea/
.project
.settings
.classpath
diff --git a/BUILD.txt b/BUILD.txt
index a59c80c..7c3eea7 100644
--- a/BUILD.txt
+++ b/BUILD.txt
@@ -21,9 +21,25 @@ The following packages must be installed before Avro can be built:
- Apache Forrest 0.8 (for documentation)
- md5sum, sha1sum, used by top-level dist target
+To simplify this, you can run a Docker container with all of the above
+dependencies preinstalled: install docker.io, then type:
+
+ ./build.sh docker
+
+When this completes you will be in a shell running inside the
+container. Building the image the first time may take a while (20
+minutes or more) since dependencies must be downloaded and
+installed. However, subsequent invocations are much faster, as the
+cached image is reused.
+
+The working directory in the container is mounted from your host. This
+allows you to access the files in your Avro development tree from the
+Docker container.
+
BUILDING
-Once the requirements are installed, build.sh can be used as follows:
+Once the requirements are installed (or you are working inside the
+Docker container), build.sh can be used as follows:
'./build.sh test' runs tests for all languages
'./build.sh dist' creates all release distribution files in dist/
diff --git a/CHANGES.txt b/CHANGES.txt
index 188ec44..afedefb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,186 @@
Avro Change Log
+Avro 1.8.0 (10 August 2014)
+
+ INCOMPATIBLE CHANGES
+
+ AVRO-1334. Java: Update versions of many dependencies. (scottcarey, cutting)
+
+ AVRO-997. Java: For enum values, no longer sometimes permit any
+ Object whose toString() names an enum symbol, but rather always
+ require use of distinct enum types. (Sean Busbey via cutting)
+
+ AVRO-1602. Java: Remove Dapper-style RPC trace facility. This
+ seems unused and has been a source of build problems. (cutting)
+
+ AVRO-1586. Build against Hadoop 2. With this change the avro-mapred and
+ trevni-avro JARs without a hadoop1 or hadoop2 Maven classifier are Hadoop 2
+ artifacts. To use with Hadoop 1, set the classifier to hadoop1.
+ (tomwhite)
+
+ AVRO-1502. Java: Generated classes now implement Serializable.
+ Generated classes need to be regenerated to use this release. (cutting)
+
+ NEW FEATURES
+
+ AVRO-1555. C#: Add support for RPC over HTTP. (Dmitry Kovalev via cutting)
+
+ AVRO-739. Add date, time, timestamp, and duration binary types to
+ specification. (Dmitry Kovalev and Ryan Blue via tomwhite)
+
+ AVRO-1590. Java: In resolving records in unions, permit structural
+ and shortname matches when fullname matching fails.
+ (Ryan Blue via cutting)
+
+ AVRO-570. Python: Add connector for tethered mapreduce.
+ (Jeremy Lewi and Steven Willis via cutting)
+
+ AVRO-834. Java: Data File corruption recovery tool.
+ (scottcarey and tomwhite)
+
+ AVRO-1614. Java: In generated builder classes, add accessors to
+ field sub-builders, permitting easier creation of nested, optional
+ structures. (Niels Basjes via cutting)
+
+ AVRO-1537. Make it easier to set up a multi-language build environment.
+ Support for running a Docker container with all build dependencies.
+ (tomwhite)
+
+ AVRO-680. Java: Support non-string map keys. (Sachin Goyal via Ryan Blue).
+
+ AVRO-1497. Java: Add support for logical types. (blue)
+
+ AVRO-1685. Java: Allow specifying sync in DataFileWriter.create
+ (Sehrope Sarkuni via tomwhite)
+
+ AVRO-1683. Add microsecond time and timestamp logical types to the
+ specification. (blue)
+
+ AVRO-1672. Java: Add date/time logical types and conversions. (blue)
+
+ OPTIMIZATIONS
+
+ IMPROVEMENTS
+
+ AVRO-843. C#: Change Visual Studio project files to specify .NET 3.5.
+ (Dmitry Kovalev via cutting)
+
+ AVRO-1583. Java: Add stdin support to the tojson tool.
+ (Clément Mahtieu via cutting)
+
+ AVRO-1551. Java: Add an output encoding option to the compiler
+ command line tool. (Keegan Witt via cutting)
+
+ AVRO-1585. Java: Deprecate Jackson classes in public API. (tomwhite)
+
+ AVRO-1619. Java: Improve javadoc comments in generated code.
+ (Niels Basjes via cutting)
+
+ AVRO-1616. Add IntelliJ files to .gitignore. (Niels Basjes via cutting)
+
+ AVRO-1539. Java: Add FileSystem based FsInput constructor.
+ (Allan Shoup via cutting)
+
+ AVRO-1628. Java: Add Schema#createUnion(Schema ...) convenience method.
+ (Clément Mahtieu via cutting)
+
+ AVRO-1655. Java: Add Schema.createRecord with field list.
+ (Lars Francke via blue)
+
+ AVRO-1681. Improve generated JavaDocs.
+ (Charles Gariépy-Ikeson via tomwhite)
+
+ AVRO-1645. Ruby: Improved handling of missing named types.
+ (Daniel Schierbeck via tomwhite)
+
+ AVRO-1693. Ruby: Allow writing arbitrary metadata to data files.
+ (Daniel Schierbeck via tomwhite)
+
+ AVRO-1692. Allow more than one logical type for a Java class. (blue via
+ tomwhite)
+
+ AVRO-1697. Ruby: Add support for the Snappy codec to the Ruby library.
+ (Daniel Schierbeck via tomwhite)
+
+ BUG FIXES
+
+ AVRO-1553. Java: MapReduce never uses MapOutputValueSchema (tomwhite)
+
+ AVRO-1544. Java: Fix GenericData#validate for unions with null.
+ (Matthew Hayes via cutting)
+
+ AVRO-1589. Java: Fix ReflectData.AllowNulls to not create unions
+ for primitive types. (Ryan Blue via cutting)
+
+ AVRO-1591. Java: Fix specific RPC so that proxies implement hashCode(),
+ equals() and toString(). (Mark Spadoni via cutting)
+
+ AVRO-1489. Java: Avro fails to build with OpenJDK 8. (Ricardo Arguello via
+ tomwhite)
+
+ AVRO-1302. Python: Update documentation to open files as binary to
+ prevent EOL substitution. (Lars Francke via cutting)
+
+ AVRO-1598. Java: Fix flakiness in TestFileSpanStorage.
+ (Ryan Blue via cutting)
+
+ AVRO-1592. Java: Fix handling of Java reserved words as enum
+ constants in generated code. (Lukas Steiblys via cutting)
+
+ AVRO-1597. Java: Random data tool writes corrupt files to standard out.
+ (cutting)
+
+ AVRO-1596. Java: Cannot read past corrupted block in Avro data file.
+ (tomwhite)
+
+ AVRO-1564. Java: Fix handling of optional byte field in Thrift.
+ (Michael Pershyn via cutting)
+
+ AVRO-1407. Java: Fix infinite loop on slow connect in NettyTransceiver.
+ (Gareth Davis via cutting)
+
+ AVRO-1604. Java: Fix ReflectData.AllowNull to work with @Nullable
+ annotations. (Ryan Blue via cutting)
+
+ AVRO-1545. Python. Fix to retain schema properties on primitive types.
+ (Dustin Spicuzza via cutting)
+
+ AVRO-1623. Java: Fix GenericData#validate to correctly resolve unions.
+ (Jeffrey Mullins via cutting)
+
+ AVRO-1621. PHP: FloatIntEncodingTest fails for NAN. (tomwhite)
+
+ AVRO-1573. Javascript. Upgrade to Grunt 0.4 for testing. (tomwhite)
+
+ AVRO-1624. Java. Surefire forkMode is deprecated. (Niels Basjes via
+ tomwhite)
+
+ AVRO-1630. Java: Creating Builder from instance loses data. (Niels Basjes
+ via tomwhite)
+
+ AVRO-1653. Fix typo in spec (lenghted => length). (Sehrope Sarkuni via blue)
+
+ AVRO-1656. Fix 'How to Contribute' link. (Benjamin Clauss via blue)
+
+ AVRO-1652. Java: Do not warn or validate defaults if validation is off.
+ (Michael D'Angelo via blue)
+
+ AVRO-1655. Java: Fix NPE in RecordSchema#toString when fields are null.
+ (Lars Francke via blue)
+
+ AVRO-1689. Update Dockerfile to use official Java repository. (tomwhite)
+
+ AVRO-1576. TestSchemaCompatibility is platform dependant.
+ (Stevo Slavic via tomwhite)
+
+ AVRO-1688. Ruby test_union(TestIO) is failing. (tomwhite)
+
+ AVRO-1673. Python 3 EnumSchema changes the order of symbols.
+ (Marcin Białoń via tomwhite)
+
+ AVRO-1491. Avro.ipc.dll not included in release zip/build file.
+ (Dmitry Kovalev via tomwhite)
+
Avro 1.7.7 (23 July 2014)
NEW FEATURES
diff --git a/README.txt b/README.txt
index a8f66f7..566f192 100644
--- a/README.txt
+++ b/README.txt
@@ -6,4 +6,4 @@ Learn more about Avro, please visit our website at:
To contribute to Avro, please read:
- https://cwiki.apache.org/AVRO/how-to-contribute.html
+ https://cwiki.apache.org/confluence/display/AVRO/How+To+Contribute
diff --git a/build.sh b/build.sh
index 06961c0..cce0cfb 100755
--- a/build.sh
+++ b/build.sh
@@ -22,7 +22,7 @@ cd `dirname "$0"` # connect to root
VERSION=`cat share/VERSION.txt`
function usage {
- echo "Usage: $0 {test|dist|sign|clean}"
+ echo "Usage: $0 {test|dist|sign|clean|docker}"
exit 1
}
@@ -96,8 +96,10 @@ case "$target" in
# build lang-specific artifacts
- (cd lang/java; mvn package -DskipTests -Dhadoop.version=2; rm -rf mapred/target/classes/;
- mvn -P dist package -DskipTests -Davro.version=$VERSION javadoc:aggregate)
+ (cd lang/java; mvn package -DskipTests -Dhadoop.version=1;
+ rm -rf mapred/target/{classes,test-classes}/;
+ rm -rf trevni/avro/target/{classes,test-classes}/;
+ mvn -P dist package -DskipTests -Davro.version=$VERSION javadoc:aggregate)
(cd lang/java/trevni/doc; mvn site)
(mvn -N -P copy-artifacts antrun:run)
@@ -169,9 +171,39 @@ case "$target" in
(cd lang/php; ./build.sh clean)
- (cd lang/perl; [ -f Makefile ] && make clean)
+ (cd lang/perl; [ ! -f Makefile ] || make clean)
;;
+ docker)
+ docker build -t avro-build share/docker
+ if [ "$(uname -s)" == "Linux" ]; then
+ USER_NAME=${SUDO_USER:=$USER}
+ USER_ID=$(id -u $USER_NAME)
+ GROUP_ID=$(id -g $USER_NAME)
+ else # boot2docker uid and gid
+ USER_NAME=$USER
+ USER_ID=1000
+ GROUP_ID=50
+ fi
+ docker build -t avro-build-${USER_NAME} - <<UserSpecificDocker
+FROM avro-build
+RUN groupadd -g ${GROUP_ID} ${USER_NAME} || true
+RUN useradd -g ${GROUP_ID} -u ${USER_ID} -k /root -m ${USER_NAME}
+ENV HOME /home/${USER_NAME}
+UserSpecificDocker
+ # By mapping the .m2 directory you can do an mvn install from
+ # within the container and use the result on your normal
+ # system. This also significantly speeds up subsequent
+ # builds because the dependencies are downloaded only once.
+ docker run --rm=true -t -i \
+ -v ${PWD}:/home/${USER_NAME}/avro \
+ -w /home/${USER_NAME}/avro \
+ -v ${HOME}/.m2:/home/${USER_NAME}/.m2 \
+ -v ${HOME}/.gnupg:/home/${USER_NAME}/.gnupg \
+ -u ${USER_NAME} \
+ avro-build-${USER_NAME}
+ ;;
+
*)
usage
;;
diff --git a/doc/src/content/xdocs/gettingstartedpython.xml b/doc/src/content/xdocs/gettingstartedpython.xml
index d8d9df8..156646a 100644
--- a/doc/src/content/xdocs/gettingstartedpython.xml
+++ b/doc/src/content/xdocs/gettingstartedpython.xml
@@ -136,14 +136,14 @@ import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter
-schema = avro.schema.parse(open("user.avsc").read())
+schema = avro.schema.parse(open("user.avsc", "rb").read())
-writer = DataFileWriter(open("users.avro", "w"), DatumWriter(), schema)
+writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "Alyssa", "favorite_number": 256})
writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
writer.close()
-reader = DataFileReader(open("users.avro", "r"), DatumReader())
+reader = DataFileReader(open("users.avro", "rb"), DatumReader())
for user in reader:
print user
reader.close()
@@ -154,10 +154,18 @@ reader.close()
{u'favorite_color': u'red', u'favorite_number': 7, u'name': u'Ben'}
</source>
<p>
+ Do make sure that you open your files in binary mode (i.e. using the modes
+ <code>wb</code> or <code>rb</code> respectively). Otherwise you might
+ generate corrupt files due to
+ <a href="http://docs.python.org/library/functions.html#open">
+ automatic replacement</a> of newline characters with their
+ platform-specific representations.
+ </p>
+ <p>
Let's take a closer look at what's going on here.
</p>
<source>
-schema = avro.schema.parse(open("user.avsc").read())
+schema = avro.schema.parse(open("user.avsc", "rb").read())
</source>
<p>
<code>avro.schema.parse</code> takes a string containing a JSON schema
@@ -167,7 +175,7 @@ schema = avro.schema.parse(open("user.avsc").read())
user.avsc schema file here.
</p>
<source>
-writer = DataFileWriter(open("users.avro", "w"), DatumWriter(), schema)
+writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
</source>
<p>
We create a <code>DataFileWriter</code>, which we'll use to write
@@ -201,7 +209,7 @@ writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
ignored.
</p>
<source>
-reader = DataFileReader(open("users.avro", "r"), DatumReader())
+reader = DataFileReader(open("users.avro", "rb"), DatumReader())
</source>
<p>
We open the file again, this time for reading back from disk. We use
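For reference, the getting-started snippets changed in the hunks above assemble into the following self-contained Python 2 script; a minimal sketch, assuming a user.avsc in the working directory that defines the User record (name, favorite_number, favorite_color) used by the appends:

    import avro.schema
    from avro.datafile import DataFileReader, DataFileWriter
    from avro.io import DatumReader, DatumWriter

    # Parse the schema; binary mode avoids newline translation.
    schema = avro.schema.parse(open("user.avsc", "rb").read())

    # Write two records, again opening the data file in binary mode.
    writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
    writer.append({"name": "Alyssa", "favorite_number": 256})
    writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
    writer.close()

    # Read the records back from disk.
    reader = DataFileReader(open("users.avro", "rb"), DatumReader())
    for user in reader:
        print user
    reader.close()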
diff --git a/doc/src/content/xdocs/spec.xml b/doc/src/content/xdocs/spec.xml
index 8c108c8..83c0420 100644
--- a/doc/src/content/xdocs/spec.xml
+++ b/doc/src/content/xdocs/spec.xml
@@ -871,7 +871,7 @@
<li>that many bytes of <em>buffer data</em>.</li>
</ul>
</li>
- <li>A message is always terminated by a zero-lenghted buffer.</li>
+ <li>A message is always terminated by a zero-length buffer.</li>
</ul>
<p>Framing is transparent to request and response message
@@ -1406,6 +1406,67 @@ void initFPTable() {
precisions match.</p>
</section>
+
+ <section>
+ <title>Date</title>
+ <p>
+ The <code>date</code> logical type represents a date within the calendar, with no reference to a particular time zone or time of day.
+ </p>
+ <p>
+ A <code>date</code> logical type annotates an Avro <code>int</code>, where the int stores the number of days from the unix epoch, 1 January 1970 (ISO calendar).
+ </p>
+ </section>
+
+ <section>
+ <title>Time (millisecond precision)</title>
+ <p>
+ The <code>time-millis</code> logical type represents a time of day, with no reference to a particular calendar, time zone or date, with a precision of one millisecond.
+ </p>
+ <p>
+ A <code>time-millis</code> logical type annotates an Avro <code>int</code>, where the int stores the number of milliseconds after midnight, 00:00:00.000.
+ </p>
+ </section>
+
+ <section>
+ <title>Time (microsecond precision)</title>
+ <p>
+ The <code>time-micros</code> logical type represents a time of day, with no reference to a particular calendar, time zone or date, with a precision of one microsecond.
+ </p>
+ <p>
+ A <code>time-micros</code> logical type annotates an Avro <code>long</code>, where the long stores the number of microseconds after midnight, 00:00:00.000000.
+ </p>
+ </section>
+
+ <section>
+ <title>Timestamp (millisecond precision)</title>
+ <p>
+ The <code>timestamp-millis</code> logical type represents an instant on the global timeline, independent of a particular time zone or calendar, with a precision of one millisecond.
+ </p>
+ <p>
+ A <code>timestamp-millis</code> logical type annotates an Avro <code>long</code>, where the long stores the number of milliseconds from the unix epoch, 1 January 1970 00:00:00.000 UTC.
+ </p>
+ </section>
+
+ <section>
+ <title>Timestamp (microsecond precision)</title>
+ <p>
+ The <code>timestamp-micros</code> logical type represents an instant on the global timeline, independent of a particular time zone or calendar, with a precision of one microsecond.
+ </p>
+ <p>
+ A <code>timestamp-micros</code> logical type annotates an Avro <code>long</code>, where the long stores the number of microseconds from the unix epoch, 1 January 1970 00:00:00.000000 UTC.
+ </p>
+ </section>
+
+ <section>
+ <title>Duration</title>
+ <p>
+ The <code>duration</code> logical type represents an amount of time defined by a number of months, days and milliseconds. This is not equivalent to a number of milliseconds, because, depending on the moment in time from which the duration is measured, the number of days in the month and number of milliseconds in a day may differ. Other standard periods such as years, quarters, hours and minutes can be expressed through these basic periods.
+ </p>
+ <p>
+ A <code>duration</code> logical type annotates an Avro <code>fixed</code> type of size 12, which stores three little-endian unsigned integers that represent durations at different granularities of time. The first stores a number of months, the second a number of days, and the third a number of milliseconds.
+ </p>
+ </section>
+
</section>
<p><em>Apache Avro, Avro, Apache, and the Avro and Apache logos are
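For illustration only (this is not part of the specification or the diff), here is a minimal Python sketch of the representations the new sections define, with hypothetical helper names: date and timestamp-millis reduce to a day or millisecond count from the unix epoch, time-millis to milliseconds after midnight, and duration packs three little-endian unsigned 32-bit integers into a 12-byte fixed:

    import datetime
    import struct

    EPOCH_DATE = datetime.date(1970, 1, 1)
    EPOCH_DATETIME = datetime.datetime(1970, 1, 1)

    def encode_date(d):
        # date: an int counting days from the unix epoch.
        return (d - EPOCH_DATE).days

    def encode_time_millis(t):
        # time-millis: an int counting milliseconds after midnight.
        return ((t.hour * 3600 + t.minute * 60 + t.second) * 1000
                + t.microsecond // 1000)

    def encode_timestamp_millis(dt):
        # timestamp-millis: a long counting milliseconds from
        # 1 January 1970 00:00:00.000 UTC (dt is assumed to be UTC).
        delta = dt - EPOCH_DATETIME
        return ((delta.days * 86400 + delta.seconds) * 1000
                + delta.microseconds // 1000)

    def encode_duration(months, days, millis):
        # duration: a fixed of size 12 holding three little-endian
        # unsigned integers: months, days, milliseconds.
        return struct.pack('<III', months, days, millis)

    assert encode_date(datetime.date(1970, 1, 2)) == 1
    assert len(encode_duration(1, 2, 3)) == 12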
diff --git a/lang/py/build.xml b/lang/py/build.xml
index 6d371ea..61c3f4c 100644
--- a/lang/py/build.xml
+++ b/lang/py/build.xml
@@ -16,7 +16,7 @@
limitations under the License.
-->
-<project name="Avro" default="dist">
+<project name="Avro" default="dist" xmlns:ivy="antlib:org.apache.ivy.ant">
<!-- Load user's default properties. -->
<property file="${user.home}/build.properties"/>
@@ -36,6 +36,9 @@
<property name="lib.dir" value="${basedir}/lib"/>
<property name="test.dir" value="${basedir}/test"/>
+ <property name="ivy.version" value="2.2.0"/>
+ <property name="ivy.jar" value="${basedir}/lib/ivy-${ivy.version}.jar"/>
+
<!-- Load shared properties -->
<loadfile srcFile="${share.dir}/VERSION.txt" property="avro.version" />
<loadfile srcFile="${share.schema.dir}/org/apache/avro/ipc/HandshakeRequest.avsc" property="handshake.request.json"/>
@@ -55,6 +58,17 @@
<target name="init" description="Create the build directory.">
<mkdir dir="${build.dir}"/>
+ <available file="${ivy.jar}" property="ivy.jar.found"/>
+ <antcall target="ivy-download"/>
+ <typedef uri="antlib:org.apache.ivy.ant">
+ <classpath>
+ <pathelement location="${ivy.jar}" />
+ </classpath>
+ </typedef>
+ </target>
+
+ <target name="ivy-download" unless="ivy.jar.found" >
+ <get src="http://repo2.maven.org/maven2/org/apache/ivy/ivy/${ivy.version}/ivy-${ivy.version}.jar" dest="${ivy.jar}" usetimestamp="true" />
</target>
<target name="build"
@@ -77,6 +91,12 @@
<fileset dir="${lib.dir}" />
</copy>
+ <!-- Copy the protocols used for tethering -->
+ <copy todir="${build.dir}/src/avro/tether">
+ <fileset dir="${share.schema.dir}/org/apache/avro/mapred/tether/">
+ <include name="*.avpr"/>
+ </fileset>
+ </copy>
<!-- Inline the handshake schemas -->
<copy file="${src.dir}/avro/ipc.py"
toFile="${build.dir}/src/avro/ipc.py"
@@ -120,6 +140,20 @@
<filter token="INTEROP_DATA_DIR" value="${interop.data.dir}"/>
</filterset>
</copy>
+
+ <!-- Ensure we have a local copy of the tools jar -->
+ <ivy:retrieve
+ pattern="${basedir}/../java/tools/target/[artifact]-[revision].[ext]"/>
+
+ <!-- Inline the location of the tools jar -->
+ <copy file="${test.dir}/test_tether_word_count.py"
+ toFile="${build.dir}/test/test_tether_word_count.py"
+ overwrite="true">
+ <filterset>
+ <filter token="AVRO_VERSION" value="${avro.version}"/>
+ <filter token="TOPDIR" value="${basedir}"/>
+ </filterset>
+ </copy>
</target>
<target name="test"
@@ -135,6 +169,22 @@
</py-test>
</target>
+ <!-- Runs just the unit tests for tethered jobs.
+ -->
+ <target name="test-tether"
+ description="Run unit tests for a hadoop python-tethered job."
+ depends="build">
+ <taskdef name="py-test" classname="org.pyant.tasks.PythonTestTask"
+ classpathref="java.classpath"/>
+ <py-test python="${python}" pythonpathref="test.path">
+ <fileset dir="${build.dir}/test">
+ <include name="test_tether*.py"/>
+ <!--<exclude name="test_datafile_interop.py"/>-->
+ </fileset>
+ </py-test>
+ </target>
+
+
<target name="interop-data-test"
description="Run python interop data tests"
depends="build">
diff --git a/lang/py/ivy.xml b/lang/py/ivy.xml
new file mode 100644
index 0000000..c37216c
--- /dev/null
+++ b/lang/py/ivy.xml
@@ -0,0 +1,24 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<ivy-module version="2.0">
+ <info organisation="org.apache.avro" module="python"/>
+ <configurations defaultconfmapping="default"/>
+ <dependencies>
+ <dependency org="org.apache.avro" name="avro-tools"
+ rev="${avro.version}" transitive="false"/>
+ </dependencies>
+</ivy-module>
diff --git a/lang/py/ivysettings.xml b/lang/py/ivysettings.xml
new file mode 100644
index 0000000..31de16e
--- /dev/null
+++ b/lang/py/ivysettings.xml
@@ -0,0 +1,30 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<ivysettings>
+ <settings defaultResolver="repos" />
+ <property name="m2-pattern" value="${user.home}/.m2/repository/[organisation]/[module]/[revision]/[module]-[revision](-[classifier]).[ext]" override="false" />
+ <resolvers>
+ <chain name="repos">
+ <ibiblio name="central" m2compatible="true"/>
+ <ibiblio name="apache-snapshots" m2compatible="true" root="https://repository.apache.org/content/groups/snapshots"/>
+ <filesystem name="local-maven2" m2compatible="true"> <!-- needed when building non-snapshot version for release -->
+ <artifact pattern="${m2-pattern}"/>
+ <ivy pattern="${m2-pattern}"/>
+ </filesystem>
+ </chain>
+ </resolvers>
+</ivysettings>
diff --git a/lang/py/lib/pyAntTasks-1.3-LICENSE.txt b/lang/py/lib/pyAntTasks-1.3-LICENSE.txt
deleted file mode 100644
index d645695..0000000
--- a/lang/py/lib/pyAntTasks-1.3-LICENSE.txt
+++ /dev/null
@@ -1,202 +0,0 @@
-
- Apache License
- Version 2.0, January 2004
- http://www.apache.org/licenses/
-
- TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
- 1. Definitions.
-
- "License" shall mean the terms and conditions for use, reproduction,
- and distribution as defined by Sections 1 through 9 of this document.
-
- "Licensor" shall mean the copyright owner or entity authorized by
- the copyright owner that is granting the License.
-
- "Legal Entity" shall mean the union of the acting entity and all
- other entities that control, are controlled by, or are under common
- control with that entity. For the purposes of this definition,
- "control" means (i) the power, direct or indirect, to cause the
- direction or management of such entity, whether by contract or
- otherwise, or (ii) ownership of fifty percent (50%) or more of the
- outstanding shares, or (iii) beneficial ownership of such entity.
-
- "You" (or "Your") shall mean an individual or Legal Entity
- exercising permissions granted by this License.
-
- "Source" form shall mean the preferred form for making modifications,
- including but not limited to software source code, documentation
- source, and configuration files.
-
- "Object" form shall mean any form resulting from mechanical
- transformation or translation of a Source form, including but
- not limited to compiled object code, generated documentation,
- and conversions to other media types.
-
- "Work" shall mean the work of authorship, whether in Source or
- Object form, made available under the License, as indicated by a
- copyright notice that is included in or attached to the work
- (an example is provided in the Appendix below).
-
- "Derivative Works" shall mean any work, whether in Source or Object
- form, that is based on (or derived from) the Work and for which the
- editorial revisions, annotations, elaborations, or other modifications
- represent, as a whole, an original work of authorship. For the purposes
- of this License, Derivative Works shall not include works that remain
- separable from, or merely link (or bind by name) to the interfaces of,
- the Work and Derivative Works thereof.
-
- "Contribution" shall mean any work of authorship, including
- the original version of the Work and any modifications or additions
- to that Work or Derivative Works thereof, that is intentionally
- submitted to Licensor for inclusion in the Work by the copyright owner
- or by an individual or Legal Entity authorized to submit on behalf of
- the copyright owner. For the purposes of this definition, "submitted"
- means any form of electronic, verbal, or written communication sent
- to the Licensor or its representatives, including but not limited to
- communication on electronic mailing lists, source code control systems,
- and issue tracking systems that are managed by, or on behalf of, the
- Licensor for the purpose of discussing and improving the Work, but
- excluding communication that is conspicuously marked or otherwise
- designated in writing by the copyright owner as "Not a Contribution."
-
- "Contributor" shall mean Licensor and any individual or Legal Entity
- on behalf of whom a Contribution has been received by Licensor and
- subsequently incorporated within the Work.
-
- 2. Grant of Copyright License. Subject to the terms and conditions of
- this License, each Contributor hereby grants to You a perpetual,
- worldwide, non-exclusive, no-charge, royalty-free, irrevocable
- copyright license to reproduce, prepare Derivative Works of,
- publicly display, publicly perform, sublicense, and distribute the
- Work and such Derivative Works in Source or Object form.
-
- 3. Grant of Patent License. Subject to the terms and conditions of
- this License, each Contributor hereby grants to You a perpetual,
- worldwide, non-exclusive, no-charge, royalty-free, irrevocable
- (except as stated in this section) patent license to make, have made,
- use, offer to sell, sell, import, and otherwise transfer the Work,
- where such license applies only to those patent claims licensable
- by such Contributor that are necessarily infringed by their
- Contribution(s) alone or by combination of their Contribution(s)
- with the Work to which such Contribution(s) was submitted. If You
- institute patent litigation against any entity (including a
- cross-claim or counterclaim in a lawsuit) alleging that the Work
- or a Contribution incorporated within the Work constitutes direct
- or contributory patent infringement, then any patent licenses
- granted to You under this License for that Work shall terminate
- as of the date such litigation is filed.
-
- 4. Redistribution. You may reproduce and distribute copies of the
- Work or Derivative Works thereof in any medium, with or without
- modifications, and in Source or Object form, provided that You
- meet the following conditions:
-
- (a) You must give any other recipients of the Work or
- Derivative Works a copy of this License; and
-
- (b) You must cause any modified files to carry prominent notices
- stating that You changed the files; and
-
- (c) You must retain, in the Source form of any Derivative Works
- that You distribute, all copyright, patent, trademark, and
- attribution notices from the Source form of the Work,
- excluding those notices that do not pertain to any part of
- the Derivative Works; and
-
- (d) If the Work includes a "NOTICE" text file as part of its
- distribution, then any Derivative Works that You distribute must
- include a readable copy of the attribution notices contained
- within such NOTICE file, excluding those notices that do not
- pertain to any part of the Derivative Works, in at least one
- of the following places: within a NOTICE text file distributed
- as part of the Derivative Works; within the Source form or
- documentation, if provided along with the Derivative Works; or,
- within a display generated by the Derivative Works, if and
- wherever such third-party notices normally appear. The contents
- of the NOTICE file are for informational purposes only and
- do not modify the License. You may add Your own attribution
- notices within Derivative Works that You distribute, alongside
- or as an addendum to the NOTICE text from the Work, provided
- that such additional attribution notices cannot be construed
- as modifying the License.
-
- You may add Your own copyright statement to Your modifications and
- may provide additional or different license terms and conditions
- for use, reproduction, or distribution of Your modifications, or
- for any such Derivative Works as a whole, provided Your use,
- reproduction, and distribution of the Work otherwise complies with
- the conditions stated in this License.
-
- 5. Submission of Contributions. Unless You explicitly state otherwise,
- any Contribution intentionally submitted for inclusion in the Work
- by You to the Licensor shall be under the terms and conditions of
- this License, without any additional terms or conditions.
- Notwithstanding the above, nothing herein shall supersede or modify
- the terms of any separate license agreement you may have executed
- with Licensor regarding such Contributions.
-
- 6. Trademarks. This License does not grant permission to use the trade
- names, trademarks, service marks, or product names of the Licensor,
- except as required for reasonable and customary use in describing the
- origin of the Work and reproducing the content of the NOTICE file.
-
- 7. Disclaimer of Warranty. Unless required by applicable law or
- agreed to in writing, Licensor provides the Work (and each
- Contributor provides its Contributions) on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied, including, without limitation, any warranties or conditions
- of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
- PARTICULAR PURPOSE. You are solely responsible for determining the
- appropriateness of using or redistributing the Work and assume any
- risks associated with Your exercise of permissions under this License.
-
- 8. Limitation of Liability. In no event and under no legal theory,
- whether in tort (including negligence), contract, or otherwise,
- unless required by applicable law (such as deliberate and grossly
- negligent acts) or agreed to in writing, shall any Contributor be
- liable to You for damages, including any direct, indirect, special,
- incidental, or consequential damages of any character arising as a
- result of this License or out of the use or inability to use the
- Work (including but not limited to damages for loss of goodwill,
- work stoppage, computer failure or malfunction, or any and all
- other commercial damages or losses), even if such Contributor
- has been advised of the possibility of such damages.
-
- 9. Accepting Warranty or Additional Liability. While redistributing
- the Work or Derivative Works thereof, You may choose to offer,
- and charge a fee for, acceptance of support, warranty, indemnity,
- or other liability obligations and/or rights consistent with this
- License. However, in accepting such obligations, You may act only
- on Your own behalf and on Your sole responsibility, not on behalf
- of any other Contributor, and only if You agree to indemnify,
- defend, and hold each Contributor harmless for any liability
- incurred by, or claims asserted against, such Contributor by reason
- of your accepting any such warranty or additional liability.
-
- END OF TERMS AND CONDITIONS
-
- APPENDIX: How to apply the Apache License to your work.
-
- To apply the Apache License to your work, attach the following
- boilerplate notice, with the fields enclosed by brackets "[]"
- replaced with your own identifying information. (Don't include
- the brackets!) The text should be enclosed in the appropriate
- comment syntax for the file format. We also recommend that a
- file or class name and description of purpose be included on the
- same "printed page" as the copyright notice for easier
- identification within third-party archives.
-
- Copyright [yyyy] [name of copyright owner]
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
diff --git a/lang/py/lib/pyAntTasks-1.3.jar b/lang/py/lib/pyAntTasks-1.3.jar
deleted file mode 100644
index 53a7877..0000000
Binary files a/lang/py/lib/pyAntTasks-1.3.jar and /dev/null differ
diff --git a/lang/py/lib/simplejson/LICENSE.txt b/lang/py/lib/simplejson/LICENSE.txt
deleted file mode 100644
index ad95f29..0000000
--- a/lang/py/lib/simplejson/LICENSE.txt
+++ /dev/null
@@ -1,19 +0,0 @@
-Copyright (c) 2006 Bob Ippolito
-
-Permission is hereby granted, free of charge, to any person obtaining a copy of
-this software and associated documentation files (the "Software"), to deal in
-the Software without restriction, including without limitation the rights to
-use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
-of the Software, and to permit persons to whom the Software is furnished to do
-so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
diff --git a/lang/py/lib/simplejson/__init__.py b/lang/py/lib/simplejson/__init__.py
deleted file mode 100644
index d5b4d39..0000000
--- a/lang/py/lib/simplejson/__init__.py
+++ /dev/null
@@ -1,318 +0,0 @@
-r"""JSON (JavaScript Object Notation) <http://json.org> is a subset of
-JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data
-interchange format.
-
-:mod:`simplejson` exposes an API familiar to users of the standard library
-:mod:`marshal` and :mod:`pickle` modules. It is the externally maintained
-version of the :mod:`json` library contained in Python 2.6, but maintains
-compatibility with Python 2.4 and Python 2.5 and (currently) has
-significant performance advantages, even without using the optional C
-extension for speedups.
-
-Encoding basic Python object hierarchies::
-
- >>> import simplejson as json
- >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
- '["foo", {"bar": ["baz", null, 1.0, 2]}]'
- >>> print json.dumps("\"foo\bar")
- "\"foo\bar"
- >>> print json.dumps(u'\u1234')
- "\u1234"
- >>> print json.dumps('\\')
- "\\"
- >>> print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)
- {"a": 0, "b": 0, "c": 0}
- >>> from StringIO import StringIO
- >>> io = StringIO()
- >>> json.dump(['streaming API'], io)
- >>> io.getvalue()
- '["streaming API"]'
-
-Compact encoding::
-
- >>> import simplejson as json
- >>> json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',',':'))
- '[1,2,3,{"4":5,"6":7}]'
-
-Pretty printing::
-
- >>> import simplejson as json
- >>> s = json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4)
- >>> print '\n'.join([l.rstrip() for l in s.splitlines()])
- {
- "4": 5,
- "6": 7
- }
-
-Decoding JSON::
-
- >>> import simplejson as json
- >>> obj = [u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
- >>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]') == obj
- True
- >>> json.loads('"\\"foo\\bar"') == u'"foo\x08ar'
- True
- >>> from StringIO import StringIO
- >>> io = StringIO('["streaming API"]')
- >>> json.load(io)[0] == 'streaming API'
- True
-
-Specializing JSON object decoding::
-
- >>> import simplejson as json
- >>> def as_complex(dct):
- ... if '__complex__' in dct:
- ... return complex(dct['real'], dct['imag'])
- ... return dct
- ...
- >>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
- ... object_hook=as_complex)
- (1+2j)
- >>> import decimal
- >>> json.loads('1.1', parse_float=decimal.Decimal) == decimal.Decimal('1.1')
- True
-
-Specializing JSON object encoding::
-
- >>> import simplejson as json
- >>> def encode_complex(obj):
- ... if isinstance(obj, complex):
- ... return [obj.real, obj.imag]
- ... raise TypeError(repr(obj) + " is not JSON serializable")
- ...
- >>> json.dumps(2 + 1j, default=encode_complex)
- '[2.0, 1.0]'
- >>> json.JSONEncoder(default=encode_complex).encode(2 + 1j)
- '[2.0, 1.0]'
- >>> ''.join(json.JSONEncoder(default=encode_complex).iterencode(2 + 1j))
- '[2.0, 1.0]'
-
-
-Using simplejson.tool from the shell to validate and pretty-print::
-
- $ echo '{"json":"obj"}' | python -m simplejson.tool
- {
- "json": "obj"
- }
- $ echo '{ 1.2:3.4}' | python -m simplejson.tool
- Expecting property name: line 1 column 2 (char 2)
-"""
-__version__ = '2.0.9'
-__all__ = [
- 'dump', 'dumps', 'load', 'loads',
- 'JSONDecoder', 'JSONEncoder',
-]
-
-__author__ = 'Bob Ippolito <bob at redivi.com>'
-
-from decoder import JSONDecoder
-from encoder import JSONEncoder
-
-_default_encoder = JSONEncoder(
- skipkeys=False,
- ensure_ascii=True,
- check_circular=True,
- allow_nan=True,
- indent=None,
- separators=None,
- encoding='utf-8',
- default=None,
-)
-
-def dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True,
- allow_nan=True, cls=None, indent=None, separators=None,
- encoding='utf-8', default=None, **kw):
- """Serialize ``obj`` as a JSON formatted stream to ``fp`` (a
- ``.write()``-supporting file-like object).
-
- If ``skipkeys`` is true then ``dict`` keys that are not basic types
- (``str``, ``unicode``, ``int``, ``long``, ``float``, ``bool``, ``None``)
- will be skipped instead of raising a ``TypeError``.
-
- If ``ensure_ascii`` is false, then some chunks written to ``fp``
- may be ``unicode`` instances, subject to normal Python ``str`` to
- ``unicode`` coercion rules. Unless ``fp.write()`` explicitly
- understands ``unicode`` (as in ``codecs.getwriter()``) this is likely
- to cause an error.
-
- If ``check_circular`` is false, then the circular reference check
- for container types will be skipped and a circular reference will
- result in an ``OverflowError`` (or worse).
-
- If ``allow_nan`` is false, then it will be a ``ValueError`` to
- serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``)
- in strict compliance of the JSON specification, instead of using the
- JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
-
- If ``indent`` is a non-negative integer, then JSON array elements and object
- members will be pretty-printed with that indent level. An indent level
- of 0 will only insert newlines. ``None`` is the most compact representation.
-
- If ``separators`` is an ``(item_separator, dict_separator)`` tuple
- then it will be used instead of the default ``(', ', ': ')`` separators.
- ``(',', ':')`` is the most compact JSON representation.
-
- ``encoding`` is the character encoding for str instances, default is UTF-8.
-
- ``default(obj)`` is a function that should return a serializable version
- of obj or raise TypeError. The default simply raises TypeError.
-
- To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
- ``.default()`` method to serialize additional types), specify it with
- the ``cls`` kwarg.
-
- """
- # cached encoder
- if (not skipkeys and ensure_ascii and
- check_circular and allow_nan and
- cls is None and indent is None and separators is None and
- encoding == 'utf-8' and default is None and not kw):
- iterable = _default_encoder.iterencode(obj)
- else:
- if cls is None:
- cls = JSONEncoder
- iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii,
- check_circular=check_circular, allow_nan=allow_nan, indent=indent,
- separators=separators, encoding=encoding,
- default=default, **kw).iterencode(obj)
- # could accelerate with writelines in some versions of Python, at
- # a debuggability cost
- for chunk in iterable:
- fp.write(chunk)
-
-
-def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
- allow_nan=True, cls=None, indent=None, separators=None,
- encoding='utf-8', default=None, **kw):
- """Serialize ``obj`` to a JSON formatted ``str``.
-
- If ``skipkeys`` is true then ``dict`` keys that are not basic types
- (``str``, ``unicode``, ``int``, ``long``, ``float``, ``bool``, ``None``)
- will be skipped instead of raising a ``TypeError``.
-
- If ``ensure_ascii`` is false, then the return value will be a
- ``unicode`` instance subject to normal Python ``str`` to ``unicode``
- coercion rules instead of being escaped to an ASCII ``str``.
-
- If ``check_circular`` is false, then the circular reference check
- for container types will be skipped and a circular reference will
- result in an ``OverflowError`` (or worse).
-
- If ``allow_nan`` is false, then it will be a ``ValueError`` to
- serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in
- strict compliance of the JSON specification, instead of using the
- JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
-
- If ``indent`` is a non-negative integer, then JSON array elements and
- object members will be pretty-printed with that indent level. An indent
- level of 0 will only insert newlines. ``None`` is the most compact
- representation.
-
- If ``separators`` is an ``(item_separator, dict_separator)`` tuple
- then it will be used instead of the default ``(', ', ': ')`` separators.
- ``(',', ':')`` is the most compact JSON representation.
-
- ``encoding`` is the character encoding for str instances, default is UTF-8.
-
- ``default(obj)`` is a function that should return a serializable version
- of obj or raise TypeError. The default simply raises TypeError.
-
- To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
- ``.default()`` method to serialize additional types), specify it with
- the ``cls`` kwarg.
-
- """
- # cached encoder
- if (not skipkeys and ensure_ascii and
- check_circular and allow_nan and
- cls is None and indent is None and separators is None and
- encoding == 'utf-8' and default is None and not kw):
- return _default_encoder.encode(obj)
- if cls is None:
- cls = JSONEncoder
- return cls(
- skipkeys=skipkeys, ensure_ascii=ensure_ascii,
- check_circular=check_circular, allow_nan=allow_nan, indent=indent,
- separators=separators, encoding=encoding, default=default,
- **kw).encode(obj)
-
-
-_default_decoder = JSONDecoder(encoding=None, object_hook=None)
-
-
-def load(fp, encoding=None, cls=None, object_hook=None, parse_float=None,
- parse_int=None, parse_constant=None, **kw):
- """Deserialize ``fp`` (a ``.read()``-supporting file-like object containing
- a JSON document) to a Python object.
-
- If the contents of ``fp`` is encoded with an ASCII based encoding other
- than utf-8 (e.g. latin-1), then an appropriate ``encoding`` name must
- be specified. Encodings that are not ASCII based (such as UCS-2) are
- not allowed, and should be wrapped with
- ``codecs.getreader(fp)(encoding)``, or simply decoded to a ``unicode``
- object and passed to ``loads()``
-
- ``object_hook`` is an optional function that will be called with the
- result of any object literal decode (a ``dict``). The return value of
- ``object_hook`` will be used instead of the ``dict``. This feature
- can be used to implement custom decoders (e.g. JSON-RPC class hinting).
-
- To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
- kwarg.
-
- """
- return loads(fp.read(),
- encoding=encoding, cls=cls, object_hook=object_hook,
- parse_float=parse_float, parse_int=parse_int,
- parse_constant=parse_constant, **kw)
-
-
-def loads(s, encoding=None, cls=None, object_hook=None, parse_float=None,
- parse_int=None, parse_constant=None, **kw):
- """Deserialize ``s`` (a ``str`` or ``unicode`` instance containing a JSON
- document) to a Python object.
-
- If ``s`` is a ``str`` instance and is encoded with an ASCII based encoding
- other than utf-8 (e.g. latin-1) then an appropriate ``encoding`` name
- must be specified. Encodings that are not ASCII based (such as UCS-2)
- are not allowed and should be decoded to ``unicode`` first.
-
- ``object_hook`` is an optional function that will be called with the
- result of any object literal decode (a ``dict``). The return value of
- ``object_hook`` will be used instead of the ``dict``. This feature
- can be used to implement custom decoders (e.g. JSON-RPC class hinting).
-
- ``parse_float``, if specified, will be called with the string
- of every JSON float to be decoded. By default this is equivalent to
- float(num_str). This can be used to use another datatype or parser
- for JSON floats (e.g. decimal.Decimal).
-
- ``parse_int``, if specified, will be called with the string
- of every JSON int to be decoded. By default this is equivalent to
- int(num_str). This can be used to use another datatype or parser
- for JSON integers (e.g. float).
-
- ``parse_constant``, if specified, will be called with one of the
- following strings: -Infinity, Infinity, NaN, null, true, false.
- This can be used to raise an exception if invalid JSON numbers
- are encountered.
-
- To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
- kwarg.
-
- """
- if (cls is None and encoding is None and object_hook is None and
- parse_int is None and parse_float is None and
- parse_constant is None and not kw):
- return _default_decoder.decode(s)
- if cls is None:
- cls = JSONDecoder
- if object_hook is not None:
- kw['object_hook'] = object_hook
- if parse_float is not None:
- kw['parse_float'] = parse_float
- if parse_int is not None:
- kw['parse_int'] = parse_int
- if parse_constant is not None:
- kw['parse_constant'] = parse_constant
- return cls(encoding=encoding, **kw).decode(s)
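The bundled simplejson removed above is, per its own docstring, the externally maintained version of the json module shipped with Python 2.6+, so the same four entry points remain available from the standard library. A quick sketch of the equivalent stdlib calls, mirroring the compact-encoding example from the deleted docstring:

    import json  # stdlib counterpart of the removed bundled simplejson

    # Compact encoding, as in the removed module's docstring.
    s = json.dumps([1, 2, 3, {'4': 5, '6': 7}], separators=(',', ':'))
    assert s == '[1,2,3,{"4":5,"6":7}]'

    # And back again.
    assert json.loads(s) == [1, 2, 3, {'4': 5, '6': 7}]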
diff --git a/lang/py/lib/simplejson/_speedups.c b/lang/py/lib/simplejson/_speedups.c
deleted file mode 100644
index 23b5f4a..0000000
--- a/lang/py/lib/simplejson/_speedups.c
+++ /dev/null
@@ -1,2329 +0,0 @@
-#include "Python.h"
-#include "structmember.h"
-#if PY_VERSION_HEX < 0x02060000 && !defined(Py_TYPE)
-#define Py_TYPE(ob) (((PyObject*)(ob))->ob_type)
-#endif
-#if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)
-typedef int Py_ssize_t;
-#define PY_SSIZE_T_MAX INT_MAX
-#define PY_SSIZE_T_MIN INT_MIN
-#define PyInt_FromSsize_t PyInt_FromLong
-#define PyInt_AsSsize_t PyInt_AsLong
-#endif
-#ifndef Py_IS_FINITE
-#define Py_IS_FINITE(X) (!Py_IS_INFINITY(X) && !Py_IS_NAN(X))
-#endif
-
-#ifdef __GNUC__
-#define UNUSED __attribute__((__unused__))
-#else
-#define UNUSED
-#endif
-
-#define DEFAULT_ENCODING "utf-8"
-
-#define PyScanner_Check(op) PyObject_TypeCheck(op, &PyScannerType)
-#define PyScanner_CheckExact(op) (Py_TYPE(op) == &PyScannerType)
-#define PyEncoder_Check(op) PyObject_TypeCheck(op, &PyEncoderType)
-#define PyEncoder_CheckExact(op) (Py_TYPE(op) == &PyEncoderType)
-
-static PyTypeObject PyScannerType;
-static PyTypeObject PyEncoderType;
-
-typedef struct _PyScannerObject {
- PyObject_HEAD
- PyObject *encoding;
- PyObject *strict;
- PyObject *object_hook;
- PyObject *parse_float;
- PyObject *parse_int;
- PyObject *parse_constant;
-} PyScannerObject;
-
-static PyMemberDef scanner_members[] = {
- {"encoding", T_OBJECT, offsetof(PyScannerObject, encoding), READONLY, "encoding"},
- {"strict", T_OBJECT, offsetof(PyScannerObject, strict), READONLY, "strict"},
- {"object_hook", T_OBJECT, offsetof(PyScannerObject, object_hook), READONLY, "object_hook"},
- {"parse_float", T_OBJECT, offsetof(PyScannerObject, parse_float), READONLY, "parse_float"},
- {"parse_int", T_OBJECT, offsetof(PyScannerObject, parse_int), READONLY, "parse_int"},
- {"parse_constant", T_OBJECT, offsetof(PyScannerObject, parse_constant), READONLY, "parse_constant"},
- {NULL}
-};
-
-typedef struct _PyEncoderObject {
- PyObject_HEAD
- PyObject *markers;
- PyObject *defaultfn;
- PyObject *encoder;
- PyObject *indent;
- PyObject *key_separator;
- PyObject *item_separator;
- PyObject *sort_keys;
- PyObject *skipkeys;
- int fast_encode;
- int allow_nan;
-} PyEncoderObject;
-
-static PyMemberDef encoder_members[] = {
- {"markers", T_OBJECT, offsetof(PyEncoderObject, markers), READONLY, "markers"},
- {"default", T_OBJECT, offsetof(PyEncoderObject, defaultfn), READONLY, "default"},
- {"encoder", T_OBJECT, offsetof(PyEncoderObject, encoder), READONLY, "encoder"},
- {"indent", T_OBJECT, offsetof(PyEncoderObject, indent), READONLY, "indent"},
- {"key_separator", T_OBJECT, offsetof(PyEncoderObject, key_separator), READONLY, "key_separator"},
- {"item_separator", T_OBJECT, offsetof(PyEncoderObject, item_separator), READONLY, "item_separator"},
- {"sort_keys", T_OBJECT, offsetof(PyEncoderObject, sort_keys), READONLY, "sort_keys"},
- {"skipkeys", T_OBJECT, offsetof(PyEncoderObject, skipkeys), READONLY, "skipkeys"},
- {NULL}
-};
-
-static Py_ssize_t
-ascii_escape_char(Py_UNICODE c, char *output, Py_ssize_t chars);
-static PyObject *
-ascii_escape_unicode(PyObject *pystr);
-static PyObject *
-ascii_escape_str(PyObject *pystr);
-static PyObject *
-py_encode_basestring_ascii(PyObject* self UNUSED, PyObject *pystr);
-void init_speedups(void);
-static PyObject *
-scan_once_str(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr);
-static PyObject *
-scan_once_unicode(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr);
-static PyObject *
-_build_rval_index_tuple(PyObject *rval, Py_ssize_t idx);
-static PyObject *
-scanner_new(PyTypeObject *type, PyObject *args, PyObject *kwds);
-static int
-scanner_init(PyObject *self, PyObject *args, PyObject *kwds);
-static void
-scanner_dealloc(PyObject *self);
-static int
-scanner_clear(PyObject *self);
-static PyObject *
-encoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds);
-static int
-encoder_init(PyObject *self, PyObject *args, PyObject *kwds);
-static void
-encoder_dealloc(PyObject *self);
-static int
-encoder_clear(PyObject *self);
-static int
-encoder_listencode_list(PyEncoderObject *s, PyObject *rval, PyObject *seq, Py_ssize_t indent_level);
-static int
-encoder_listencode_obj(PyEncoderObject *s, PyObject *rval, PyObject *obj, Py_ssize_t indent_level);
-static int
-encoder_listencode_dict(PyEncoderObject *s, PyObject *rval, PyObject *dct, Py_ssize_t indent_level);
-static PyObject *
-_encoded_const(PyObject *const);
-static void
-raise_errmsg(char *msg, PyObject *s, Py_ssize_t end);
-static PyObject *
-encoder_encode_string(PyEncoderObject *s, PyObject *obj);
-static int
-_convertPyInt_AsSsize_t(PyObject *o, Py_ssize_t *size_ptr);
-static PyObject *
-_convertPyInt_FromSsize_t(Py_ssize_t *size_ptr);
-static PyObject *
-encoder_encode_float(PyEncoderObject *s, PyObject *obj);
-
-#define S_CHAR(c) (c >= ' ' && c <= '~' && c != '\\' && c != '"')
-#define IS_WHITESPACE(c) (((c) == ' ') || ((c) == '\t') || ((c) == '\n') || ((c) == '\r'))
-
-#define MIN_EXPANSION 6
-#ifdef Py_UNICODE_WIDE
-#define MAX_EXPANSION (2 * MIN_EXPANSION)
-#else
-#define MAX_EXPANSION MIN_EXPANSION
-#endif
-
-static int
-_convertPyInt_AsSsize_t(PyObject *o, Py_ssize_t *size_ptr)
-{
- /* PyObject to Py_ssize_t converter */
- *size_ptr = PyInt_AsSsize_t(o);
- if (*size_ptr == -1 && PyErr_Occurred())
- return 0;
- return 1;
-}
-
-static PyObject *
-_convertPyInt_FromSsize_t(Py_ssize_t *size_ptr)
-{
- /* Py_ssize_t to PyObject converter */
- return PyInt_FromSsize_t(*size_ptr);
-}
-
-static Py_ssize_t
-ascii_escape_char(Py_UNICODE c, char *output, Py_ssize_t chars)
-{
- /* Escape unicode code point c to ASCII escape sequences
- in char *output. output must have at least 12 bytes unused to
- accommodate an escaped surrogate pair "\uXXXX\uXXXX" */
- output[chars++] = '\\';
- switch (c) {
- case '\\': output[chars++] = (char)c; break;
- case '"': output[chars++] = (char)c; break;
- case '\b': output[chars++] = 'b'; break;
- case '\f': output[chars++] = 'f'; break;
- case '\n': output[chars++] = 'n'; break;
- case '\r': output[chars++] = 'r'; break;
- case '\t': output[chars++] = 't'; break;
- default:
-#ifdef Py_UNICODE_WIDE
- if (c >= 0x10000) {
- /* UTF-16 surrogate pair */
- Py_UNICODE v = c - 0x10000;
- c = 0xd800 | ((v >> 10) & 0x3ff);
- output[chars++] = 'u';
- output[chars++] = "0123456789abcdef"[(c >> 12) & 0xf];
- output[chars++] = "0123456789abcdef"[(c >> 8) & 0xf];
- output[chars++] = "0123456789abcdef"[(c >> 4) & 0xf];
- output[chars++] = "0123456789abcdef"[(c ) & 0xf];
- c = 0xdc00 | (v & 0x3ff);
- output[chars++] = '\\';
- }
-#endif
- output[chars++] = 'u';
- output[chars++] = "0123456789abcdef"[(c >> 12) & 0xf];
- output[chars++] = "0123456789abcdef"[(c >> 8) & 0xf];
- output[chars++] = "0123456789abcdef"[(c >> 4) & 0xf];
- output[chars++] = "0123456789abcdef"[(c ) & 0xf];
- }
- return chars;
-}
-
-static PyObject *
-ascii_escape_unicode(PyObject *pystr)
-{
- /* Take a PyUnicode pystr and return a new ASCII-only escaped PyString */
- Py_ssize_t i;
- Py_ssize_t input_chars;
- Py_ssize_t output_size;
- Py_ssize_t max_output_size;
- Py_ssize_t chars;
- PyObject *rval;
- char *output;
- Py_UNICODE *input_unicode;
-
- input_chars = PyUnicode_GET_SIZE(pystr);
- input_unicode = PyUnicode_AS_UNICODE(pystr);
-
- /* One char input can be up to 6 chars output, estimate 4 of these */
- output_size = 2 + (MIN_EXPANSION * 4) + input_chars;
- max_output_size = 2 + (input_chars * MAX_EXPANSION);
- rval = PyString_FromStringAndSize(NULL, output_size);
- if (rval == NULL) {
- return NULL;
- }
- output = PyString_AS_STRING(rval);
- chars = 0;
- output[chars++] = '"';
- for (i = 0; i < input_chars; i++) {
- Py_UNICODE c = input_unicode[i];
- if (S_CHAR(c)) {
- output[chars++] = (char)c;
- }
- else {
- chars = ascii_escape_char(c, output, chars);
- }
- if (output_size - chars < (1 + MAX_EXPANSION)) {
- /* There's more than four, so let's resize by a lot */
- Py_ssize_t new_output_size = output_size * 2;
- /* This is an upper bound */
- if (new_output_size > max_output_size) {
- new_output_size = max_output_size;
- }
- /* Make sure that the output size changed before resizing */
- if (new_output_size != output_size) {
- output_size = new_output_size;
- if (_PyString_Resize(&rval, output_size) == -1) {
- return NULL;
- }
- output = PyString_AS_STRING(rval);
- }
- }
- }
- output[chars++] = '"';
- if (_PyString_Resize(&rval, chars) == -1) {
- return NULL;
- }
- return rval;
-}
-
-static PyObject *
-ascii_escape_str(PyObject *pystr)
-{
- /* Take a PyString pystr and return a new ASCII-only escaped PyString */
- Py_ssize_t i;
- Py_ssize_t input_chars;
- Py_ssize_t output_size;
- Py_ssize_t chars;
- PyObject *rval;
- char *output;
- char *input_str;
-
- input_chars = PyString_GET_SIZE(pystr);
- input_str = PyString_AS_STRING(pystr);
-
- /* Fast path for a string that's already ASCII */
- for (i = 0; i < input_chars; i++) {
- Py_UNICODE c = (Py_UNICODE)(unsigned char)input_str[i];
- if (!S_CHAR(c)) {
- /* If we have to escape something, scan the string for unicode */
- Py_ssize_t j;
- for (j = i; j < input_chars; j++) {
- c = (Py_UNICODE)(unsigned char)input_str[j];
- if (c > 0x7f) {
- /* We hit a non-ASCII character, bail to unicode mode */
- PyObject *uni;
- uni = PyUnicode_DecodeUTF8(input_str, input_chars, "strict");
- if (uni == NULL) {
- return NULL;
- }
- rval = ascii_escape_unicode(uni);
- Py_DECREF(uni);
- return rval;
- }
- }
- break;
- }
- }
-
- if (i == input_chars) {
- /* Input is already ASCII */
- output_size = 2 + input_chars;
- }
- else {
- /* One char input can be up to 6 chars output, estimate 4 of these */
- output_size = 2 + (MIN_EXPANSION * 4) + input_chars;
- }
- rval = PyString_FromStringAndSize(NULL, output_size);
- if (rval == NULL) {
- return NULL;
- }
- output = PyString_AS_STRING(rval);
- output[0] = '"';
-
- /* We know that everything up to i is ASCII already */
- chars = i + 1;
- memcpy(&output[1], input_str, i);
-
- for (; i < input_chars; i++) {
- Py_UNICODE c = (Py_UNICODE)(unsigned char)input_str[i];
- if (S_CHAR(c)) {
- output[chars++] = (char)c;
- }
- else {
- chars = ascii_escape_char(c, output, chars);
- }
- /* An ASCII char can't possibly expand to a surrogate! */
- if (output_size - chars < (1 + MIN_EXPANSION)) {
-        /* Not enough room left for another fully-escaped char; grow the buffer */
- output_size *= 2;
- if (output_size > 2 + (input_chars * MIN_EXPANSION)) {
- output_size = 2 + (input_chars * MIN_EXPANSION);
- }
- if (_PyString_Resize(&rval, output_size) == -1) {
- return NULL;
- }
- output = PyString_AS_STRING(rval);
- }
- }
- output[chars++] = '"';
- if (_PyString_Resize(&rval, chars) == -1) {
- return NULL;
- }
- return rval;
-}
-
-static void
-raise_errmsg(char *msg, PyObject *s, Py_ssize_t end)
-{
- /* Use the Python function simplejson.decoder.errmsg to raise a nice
- looking ValueError exception */
- static PyObject *errmsg_fn = NULL;
- PyObject *pymsg;
- if (errmsg_fn == NULL) {
- PyObject *decoder = PyImport_ImportModule("simplejson.decoder");
- if (decoder == NULL)
- return;
- errmsg_fn = PyObject_GetAttrString(decoder, "errmsg");
- Py_DECREF(decoder);
- if (errmsg_fn == NULL)
- return;
- }
- pymsg = PyObject_CallFunction(errmsg_fn, "(zOO&)", msg, s, _convertPyInt_FromSsize_t, &end);
- if (pymsg) {
- PyErr_SetObject(PyExc_ValueError, pymsg);
- Py_DECREF(pymsg);
- }
-}
-
-static PyObject *
-join_list_unicode(PyObject *lst)
-{
- /* return u''.join(lst) */
- static PyObject *joinfn = NULL;
- if (joinfn == NULL) {
- PyObject *ustr = PyUnicode_FromUnicode(NULL, 0);
- if (ustr == NULL)
- return NULL;
-
- joinfn = PyObject_GetAttrString(ustr, "join");
- Py_DECREF(ustr);
- if (joinfn == NULL)
- return NULL;
- }
- return PyObject_CallFunctionObjArgs(joinfn, lst, NULL);
-}
-
-static PyObject *
-join_list_string(PyObject *lst)
-{
- /* return ''.join(lst) */
- static PyObject *joinfn = NULL;
- if (joinfn == NULL) {
- PyObject *ustr = PyString_FromStringAndSize(NULL, 0);
- if (ustr == NULL)
- return NULL;
-
- joinfn = PyObject_GetAttrString(ustr, "join");
- Py_DECREF(ustr);
- if (joinfn == NULL)
- return NULL;
- }
- return PyObject_CallFunctionObjArgs(joinfn, lst, NULL);
-}
-
-static PyObject *
-_build_rval_index_tuple(PyObject *rval, Py_ssize_t idx) {
-    /* Return an (rval, idx) tuple, stealing the reference to rval */
-    PyObject *tpl;
-    PyObject *pyidx;
- if (rval == NULL) {
- return NULL;
- }
- pyidx = PyInt_FromSsize_t(idx);
- if (pyidx == NULL) {
- Py_DECREF(rval);
- return NULL;
- }
- tpl = PyTuple_New(2);
- if (tpl == NULL) {
- Py_DECREF(pyidx);
- Py_DECREF(rval);
- return NULL;
- }
- PyTuple_SET_ITEM(tpl, 0, rval);
- PyTuple_SET_ITEM(tpl, 1, pyidx);
- return tpl;
-}
-
-static PyObject *
-scanstring_str(PyObject *pystr, Py_ssize_t end, char *encoding, int strict, Py_ssize_t *next_end_ptr)
-{
- /* Read the JSON string from PyString pystr.
- end is the index of the first character after the quote.
- encoding is the encoding of pystr (must be an ASCII superset)
- if strict is zero then literal control characters are allowed
- *next_end_ptr is a return-by-reference index of the character
- after the end quote
-
- Return value is a new PyString (if ASCII-only) or PyUnicode
- */
- PyObject *rval;
- Py_ssize_t len = PyString_GET_SIZE(pystr);
- Py_ssize_t begin = end - 1;
- Py_ssize_t next = begin;
- int has_unicode = 0;
- char *buf = PyString_AS_STRING(pystr);
- PyObject *chunks = PyList_New(0);
- if (chunks == NULL) {
- goto bail;
- }
- if (end < 0 || len <= end) {
- PyErr_SetString(PyExc_ValueError, "end is out of bounds");
- goto bail;
- }
- while (1) {
- /* Find the end of the string or the next escape */
- Py_UNICODE c = 0;
- PyObject *chunk = NULL;
- for (next = end; next < len; next++) {
- c = (unsigned char)buf[next];
- if (c == '"' || c == '\\') {
- break;
- }
- else if (strict && c <= 0x1f) {
- raise_errmsg("Invalid control character at", pystr, next);
- goto bail;
- }
- else if (c > 0x7f) {
- has_unicode = 1;
- }
- }
- if (!(c == '"' || c == '\\')) {
- raise_errmsg("Unterminated string starting at", pystr, begin);
- goto bail;
- }
- /* Pick up this chunk if it's not zero length */
- if (next != end) {
- PyObject *strchunk = PyString_FromStringAndSize(&buf[end], next - end);
- if (strchunk == NULL) {
- goto bail;
- }
- if (has_unicode) {
- chunk = PyUnicode_FromEncodedObject(strchunk, encoding, NULL);
- Py_DECREF(strchunk);
- if (chunk == NULL) {
- goto bail;
- }
- }
- else {
- chunk = strchunk;
- }
- if (PyList_Append(chunks, chunk)) {
- Py_DECREF(chunk);
- goto bail;
- }
- Py_DECREF(chunk);
- }
- next++;
- if (c == '"') {
- end = next;
- break;
- }
- if (next == len) {
- raise_errmsg("Unterminated string starting at", pystr, begin);
- goto bail;
- }
- c = buf[next];
- if (c != 'u') {
- /* Non-unicode backslash escapes */
- end = next + 1;
- switch (c) {
- case '"': break;
- case '\\': break;
- case '/': break;
- case 'b': c = '\b'; break;
- case 'f': c = '\f'; break;
- case 'n': c = '\n'; break;
- case 'r': c = '\r'; break;
- case 't': c = '\t'; break;
- default: c = 0;
- }
- if (c == 0) {
- raise_errmsg("Invalid \\escape", pystr, end - 2);
- goto bail;
- }
- }
- else {
- c = 0;
- next++;
- end = next + 4;
- if (end >= len) {
- raise_errmsg("Invalid \\uXXXX escape", pystr, next - 1);
- goto bail;
- }
- /* Decode 4 hex digits */
- for (; next < end; next++) {
- Py_UNICODE digit = buf[next];
- c <<= 4;
- switch (digit) {
- case '0': case '1': case '2': case '3': case '4':
- case '5': case '6': case '7': case '8': case '9':
- c |= (digit - '0'); break;
- case 'a': case 'b': case 'c': case 'd': case 'e':
- case 'f':
- c |= (digit - 'a' + 10); break;
- case 'A': case 'B': case 'C': case 'D': case 'E':
- case 'F':
- c |= (digit - 'A' + 10); break;
- default:
- raise_errmsg("Invalid \\uXXXX escape", pystr, end - 5);
- goto bail;
- }
- }
-#ifdef Py_UNICODE_WIDE
- /* Surrogate pair */
- if ((c & 0xfc00) == 0xd800) {
- Py_UNICODE c2 = 0;
- if (end + 6 >= len) {
- raise_errmsg("Unpaired high surrogate", pystr, end - 5);
- goto bail;
- }
- if (buf[next++] != '\\' || buf[next++] != 'u') {
- raise_errmsg("Unpaired high surrogate", pystr, end - 5);
- goto bail;
- }
- end += 6;
- /* Decode 4 hex digits */
- for (; next < end; next++) {
-                Py_UNICODE digit = buf[next];
-                c2 <<= 4;
- switch (digit) {
- case '0': case '1': case '2': case '3': case '4':
- case '5': case '6': case '7': case '8': case '9':
- c2 |= (digit - '0'); break;
- case 'a': case 'b': case 'c': case 'd': case 'e':
- case 'f':
- c2 |= (digit - 'a' + 10); break;
- case 'A': case 'B': case 'C': case 'D': case 'E':
- case 'F':
- c2 |= (digit - 'A' + 10); break;
- default:
- raise_errmsg("Invalid \\uXXXX escape", pystr, end - 5);
- goto bail;
- }
- }
- if ((c2 & 0xfc00) != 0xdc00) {
- raise_errmsg("Unpaired high surrogate", pystr, end - 5);
- goto bail;
- }
- c = 0x10000 + (((c - 0xd800) << 10) | (c2 - 0xdc00));
- }
- else if ((c & 0xfc00) == 0xdc00) {
- raise_errmsg("Unpaired low surrogate", pystr, end - 5);
- goto bail;
- }
-#endif
- }
- if (c > 0x7f) {
- has_unicode = 1;
- }
- if (has_unicode) {
- chunk = PyUnicode_FromUnicode(&c, 1);
- if (chunk == NULL) {
- goto bail;
- }
- }
- else {
- char c_char = Py_CHARMASK(c);
- chunk = PyString_FromStringAndSize(&c_char, 1);
- if (chunk == NULL) {
- goto bail;
- }
- }
- if (PyList_Append(chunks, chunk)) {
- Py_DECREF(chunk);
- goto bail;
- }
- Py_DECREF(chunk);
- }
-
- rval = join_list_string(chunks);
- if (rval == NULL) {
- goto bail;
- }
- Py_CLEAR(chunks);
- *next_end_ptr = end;
- return rval;
-bail:
- *next_end_ptr = -1;
- Py_XDECREF(chunks);
- return NULL;
-}
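
The recombination at the end of the surrogate-pair branch is the standard
UTF-16 formula; a quick worked check:

    hi, lo = 0xd834, 0xdd20   # from the escape sequence \ud834\udd20
    assert 0x10000 + (((hi - 0xd800) << 10) | (lo - 0xdc00)) == 0x1d120
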
-
-
-static PyObject *
-scanstring_unicode(PyObject *pystr, Py_ssize_t end, int strict, Py_ssize_t *next_end_ptr)
-{
- /* Read the JSON string from PyUnicode pystr.
- end is the index of the first character after the quote.
- if strict is zero then literal control characters are allowed
- *next_end_ptr is a return-by-reference index of the character
- after the end quote
-
- Return value is a new PyUnicode
- */
- PyObject *rval;
- Py_ssize_t len = PyUnicode_GET_SIZE(pystr);
- Py_ssize_t begin = end - 1;
- Py_ssize_t next = begin;
- const Py_UNICODE *buf = PyUnicode_AS_UNICODE(pystr);
- PyObject *chunks = PyList_New(0);
- if (chunks == NULL) {
- goto bail;
- }
- if (end < 0 || len <= end) {
- PyErr_SetString(PyExc_ValueError, "end is out of bounds");
- goto bail;
- }
- while (1) {
- /* Find the end of the string or the next escape */
- Py_UNICODE c = 0;
- PyObject *chunk = NULL;
- for (next = end; next < len; next++) {
- c = buf[next];
- if (c == '"' || c == '\\') {
- break;
- }
- else if (strict && c <= 0x1f) {
- raise_errmsg("Invalid control character at", pystr, next);
- goto bail;
- }
- }
- if (!(c == '"' || c == '\\')) {
- raise_errmsg("Unterminated string starting at", pystr, begin);
- goto bail;
- }
- /* Pick up this chunk if it's not zero length */
- if (next != end) {
- chunk = PyUnicode_FromUnicode(&buf[end], next - end);
- if (chunk == NULL) {
- goto bail;
- }
- if (PyList_Append(chunks, chunk)) {
- Py_DECREF(chunk);
- goto bail;
- }
- Py_DECREF(chunk);
- }
- next++;
- if (c == '"') {
- end = next;
- break;
- }
- if (next == len) {
- raise_errmsg("Unterminated string starting at", pystr, begin);
- goto bail;
- }
- c = buf[next];
- if (c != 'u') {
- /* Non-unicode backslash escapes */
- end = next + 1;
- switch (c) {
- case '"': break;
- case '\\': break;
- case '/': break;
- case 'b': c = '\b'; break;
- case 'f': c = '\f'; break;
- case 'n': c = '\n'; break;
- case 'r': c = '\r'; break;
- case 't': c = '\t'; break;
- default: c = 0;
- }
- if (c == 0) {
- raise_errmsg("Invalid \\escape", pystr, end - 2);
- goto bail;
- }
- }
- else {
- c = 0;
- next++;
- end = next + 4;
- if (end >= len) {
- raise_errmsg("Invalid \\uXXXX escape", pystr, next - 1);
- goto bail;
- }
- /* Decode 4 hex digits */
- for (; next < end; next++) {
- Py_UNICODE digit = buf[next];
- c <<= 4;
- switch (digit) {
- case '0': case '1': case '2': case '3': case '4':
- case '5': case '6': case '7': case '8': case '9':
- c |= (digit - '0'); break;
- case 'a': case 'b': case 'c': case 'd': case 'e':
- case 'f':
- c |= (digit - 'a' + 10); break;
- case 'A': case 'B': case 'C': case 'D': case 'E':
- case 'F':
- c |= (digit - 'A' + 10); break;
- default:
- raise_errmsg("Invalid \\uXXXX escape", pystr, end - 5);
- goto bail;
- }
- }
-#ifdef Py_UNICODE_WIDE
- /* Surrogate pair */
- if ((c & 0xfc00) == 0xd800) {
- Py_UNICODE c2 = 0;
- if (end + 6 >= len) {
- raise_errmsg("Unpaired high surrogate", pystr, end - 5);
- goto bail;
- }
- if (buf[next++] != '\\' || buf[next++] != 'u') {
- raise_errmsg("Unpaired high surrogate", pystr, end - 5);
- goto bail;
- }
- end += 6;
- /* Decode 4 hex digits */
- for (; next < end; next++) {
-                    Py_UNICODE digit = buf[next];
-                    c2 <<= 4;
- switch (digit) {
- case '0': case '1': case '2': case '3': case '4':
- case '5': case '6': case '7': case '8': case '9':
- c2 |= (digit - '0'); break;
- case 'a': case 'b': case 'c': case 'd': case 'e':
- case 'f':
- c2 |= (digit - 'a' + 10); break;
- case 'A': case 'B': case 'C': case 'D': case 'E':
- case 'F':
- c2 |= (digit - 'A' + 10); break;
- default:
- raise_errmsg("Invalid \\uXXXX escape", pystr, end - 5);
- goto bail;
- }
- }
- if ((c2 & 0xfc00) != 0xdc00) {
- raise_errmsg("Unpaired high surrogate", pystr, end - 5);
- goto bail;
- }
- c = 0x10000 + (((c - 0xd800) << 10) | (c2 - 0xdc00));
- }
- else if ((c & 0xfc00) == 0xdc00) {
- raise_errmsg("Unpaired low surrogate", pystr, end - 5);
- goto bail;
- }
-#endif
- }
- chunk = PyUnicode_FromUnicode(&c, 1);
- if (chunk == NULL) {
- goto bail;
- }
- if (PyList_Append(chunks, chunk)) {
- Py_DECREF(chunk);
- goto bail;
- }
- Py_DECREF(chunk);
- }
-
- rval = join_list_unicode(chunks);
- if (rval == NULL) {
- goto bail;
- }
- Py_DECREF(chunks);
- *next_end_ptr = end;
- return rval;
-bail:
- *next_end_ptr = -1;
- Py_XDECREF(chunks);
- return NULL;
-}
-
-PyDoc_STRVAR(pydoc_scanstring,
- "scanstring(basestring, end, encoding, strict=True) -> (str, end)\n"
- "\n"
- "Scan the string s for a JSON string. End is the index of the\n"
- "character in s after the quote that started the JSON string.\n"
- "Unescapes all valid JSON string escape sequences and raises ValueError\n"
- "on attempt to decode an invalid string. If strict is False then literal\n"
- "control characters are allowed in the string.\n"
- "\n"
- "Returns a tuple of the decoded string and the index of the character in s\n"
- "after the end quote."
-);
-
-static PyObject *
-py_scanstring(PyObject* self UNUSED, PyObject *args)
-{
- PyObject *pystr;
- PyObject *rval;
- Py_ssize_t end;
- Py_ssize_t next_end = -1;
- char *encoding = NULL;
- int strict = 1;
- if (!PyArg_ParseTuple(args, "OO&|zi:scanstring", &pystr, _convertPyInt_AsSsize_t, &end, &encoding, &strict)) {
- return NULL;
- }
- if (encoding == NULL) {
- encoding = DEFAULT_ENCODING;
- }
- if (PyString_Check(pystr)) {
- rval = scanstring_str(pystr, end, encoding, strict, &next_end);
- }
- else if (PyUnicode_Check(pystr)) {
- rval = scanstring_unicode(pystr, end, strict, &next_end);
- }
- else {
- PyErr_Format(PyExc_TypeError,
- "first argument must be a string, not %.80s",
- Py_TYPE(pystr)->tp_name);
- return NULL;
- }
- return _build_rval_index_tuple(rval, next_end);
-}
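
Both implementations are reachable from Python as scanstring (the C one when
_speedups imports, the pure-Python fallback otherwise); a usage sketch against
the bundled module this commit removes:

    from simplejson.decoder import scanstring
    # Index 1 is the first character after the opening quote.
    s, end = scanstring('"hello\\nworld"', 1)
    assert s == u'hello\nworld' and end == 14
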
-
-PyDoc_STRVAR(pydoc_encode_basestring_ascii,
- "encode_basestring_ascii(basestring) -> str\n"
- "\n"
- "Return an ASCII-only JSON representation of a Python string"
-);
-
-static PyObject *
-py_encode_basestring_ascii(PyObject* self UNUSED, PyObject *pystr)
-{
- /* Return an ASCII-only JSON representation of a Python string */
- /* METH_O */
- if (PyString_Check(pystr)) {
- return ascii_escape_str(pystr);
- }
- else if (PyUnicode_Check(pystr)) {
- return ascii_escape_unicode(pystr);
- }
- else {
- PyErr_Format(PyExc_TypeError,
- "first argument must be a string, not %.80s",
- Py_TYPE(pystr)->tp_name);
- return NULL;
- }
-}
-
-static void
-scanner_dealloc(PyObject *self)
-{
- /* Deallocate scanner object */
- scanner_clear(self);
- Py_TYPE(self)->tp_free(self);
-}
-
-static int
-scanner_traverse(PyObject *self, visitproc visit, void *arg)
-{
- PyScannerObject *s;
- assert(PyScanner_Check(self));
- s = (PyScannerObject *)self;
- Py_VISIT(s->encoding);
- Py_VISIT(s->strict);
- Py_VISIT(s->object_hook);
- Py_VISIT(s->parse_float);
- Py_VISIT(s->parse_int);
- Py_VISIT(s->parse_constant);
- return 0;
-}
-
-static int
-scanner_clear(PyObject *self)
-{
- PyScannerObject *s;
- assert(PyScanner_Check(self));
- s = (PyScannerObject *)self;
- Py_CLEAR(s->encoding);
- Py_CLEAR(s->strict);
- Py_CLEAR(s->object_hook);
- Py_CLEAR(s->parse_float);
- Py_CLEAR(s->parse_int);
- Py_CLEAR(s->parse_constant);
- return 0;
-}
-
-static PyObject *
-_parse_object_str(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr) {
- /* Read a JSON object from PyString pystr.
- idx is the index of the first character after the opening curly brace.
- *next_idx_ptr is a return-by-reference index to the first character after
- the closing curly brace.
-
- Returns a new PyObject (usually a dict, but object_hook can change that)
- */
- char *str = PyString_AS_STRING(pystr);
- Py_ssize_t end_idx = PyString_GET_SIZE(pystr) - 1;
- PyObject *rval = PyDict_New();
- PyObject *key = NULL;
- PyObject *val = NULL;
- char *encoding = PyString_AS_STRING(s->encoding);
- int strict = PyObject_IsTrue(s->strict);
- Py_ssize_t next_idx;
- if (rval == NULL)
- return NULL;
-
- /* skip whitespace after { */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* only loop if the object is non-empty */
- if (idx <= end_idx && str[idx] != '}') {
- while (idx <= end_idx) {
- /* read key */
- if (str[idx] != '"') {
- raise_errmsg("Expecting property name", pystr, idx);
- goto bail;
- }
- key = scanstring_str(pystr, idx + 1, encoding, strict, &next_idx);
- if (key == NULL)
- goto bail;
- idx = next_idx;
-
- /* skip whitespace between key and : delimiter, read :, skip whitespace */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
- if (idx > end_idx || str[idx] != ':') {
- raise_errmsg("Expecting : delimiter", pystr, idx);
- goto bail;
- }
- idx++;
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* read any JSON data type */
- val = scan_once_str(s, pystr, idx, &next_idx);
- if (val == NULL)
- goto bail;
-
- if (PyDict_SetItem(rval, key, val) == -1)
- goto bail;
-
- Py_CLEAR(key);
- Py_CLEAR(val);
- idx = next_idx;
-
- /* skip whitespace before } or , */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* bail if the object is closed or we didn't get the , delimiter */
- if (idx > end_idx) break;
- if (str[idx] == '}') {
- break;
- }
- else if (str[idx] != ',') {
- raise_errmsg("Expecting , delimiter", pystr, idx);
- goto bail;
- }
- idx++;
-
- /* skip whitespace after , delimiter */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
- }
- }
- /* verify that idx < end_idx, str[idx] should be '}' */
- if (idx > end_idx || str[idx] != '}') {
- raise_errmsg("Expecting object", pystr, end_idx);
- goto bail;
- }
- /* if object_hook is not None: rval = object_hook(rval) */
- if (s->object_hook != Py_None) {
- val = PyObject_CallFunctionObjArgs(s->object_hook, rval, NULL);
- if (val == NULL)
- goto bail;
- Py_DECREF(rval);
- rval = val;
- val = NULL;
- }
- *next_idx_ptr = idx + 1;
- return rval;
-bail:
- Py_XDECREF(key);
- Py_XDECREF(val);
- Py_DECREF(rval);
- return NULL;
-}
-
-static PyObject *
-_parse_object_unicode(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr) {
- /* Read a JSON object from PyUnicode pystr.
- idx is the index of the first character after the opening curly brace.
- *next_idx_ptr is a return-by-reference index to the first character after
- the closing curly brace.
-
- Returns a new PyObject (usually a dict, but object_hook can change that)
- */
- Py_UNICODE *str = PyUnicode_AS_UNICODE(pystr);
- Py_ssize_t end_idx = PyUnicode_GET_SIZE(pystr) - 1;
- PyObject *val = NULL;
- PyObject *rval = PyDict_New();
- PyObject *key = NULL;
- int strict = PyObject_IsTrue(s->strict);
- Py_ssize_t next_idx;
- if (rval == NULL)
- return NULL;
-
- /* skip whitespace after { */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* only loop if the object is non-empty */
- if (idx <= end_idx && str[idx] != '}') {
- while (idx <= end_idx) {
- /* read key */
- if (str[idx] != '"') {
- raise_errmsg("Expecting property name", pystr, idx);
- goto bail;
- }
- key = scanstring_unicode(pystr, idx + 1, strict, &next_idx);
- if (key == NULL)
- goto bail;
- idx = next_idx;
-
- /* skip whitespace between key and : delimiter, read :, skip whitespace */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
- if (idx > end_idx || str[idx] != ':') {
- raise_errmsg("Expecting : delimiter", pystr, idx);
- goto bail;
- }
- idx++;
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* read any JSON term */
- val = scan_once_unicode(s, pystr, idx, &next_idx);
- if (val == NULL)
- goto bail;
-
- if (PyDict_SetItem(rval, key, val) == -1)
- goto bail;
-
- Py_CLEAR(key);
- Py_CLEAR(val);
- idx = next_idx;
-
- /* skip whitespace before } or , */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* bail if the object is closed or we didn't get the , delimiter */
- if (idx > end_idx) break;
- if (str[idx] == '}') {
- break;
- }
- else if (str[idx] != ',') {
- raise_errmsg("Expecting , delimiter", pystr, idx);
- goto bail;
- }
- idx++;
-
- /* skip whitespace after , delimiter */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
- }
- }
-
- /* verify that idx < end_idx, str[idx] should be '}' */
- if (idx > end_idx || str[idx] != '}') {
- raise_errmsg("Expecting object", pystr, end_idx);
- goto bail;
- }
-
- /* if object_hook is not None: rval = object_hook(rval) */
- if (s->object_hook != Py_None) {
- val = PyObject_CallFunctionObjArgs(s->object_hook, rval, NULL);
- if (val == NULL)
- goto bail;
- Py_DECREF(rval);
- rval = val;
- val = NULL;
- }
- *next_idx_ptr = idx + 1;
- return rval;
-bail:
- Py_XDECREF(key);
- Py_XDECREF(val);
- Py_DECREF(rval);
- return NULL;
-}
-
-static PyObject *
-_parse_array_str(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr) {
- /* Read a JSON array from PyString pystr.
-    idx is the index of the first character after the opening bracket.
-    *next_idx_ptr is a return-by-reference index to the first character after
-    the closing bracket.
-
- Returns a new PyList
- */
- char *str = PyString_AS_STRING(pystr);
- Py_ssize_t end_idx = PyString_GET_SIZE(pystr) - 1;
- PyObject *val = NULL;
- PyObject *rval = PyList_New(0);
- Py_ssize_t next_idx;
- if (rval == NULL)
- return NULL;
-
- /* skip whitespace after [ */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* only loop if the array is non-empty */
- if (idx <= end_idx && str[idx] != ']') {
- while (idx <= end_idx) {
-
- /* read any JSON term and de-tuplefy the (rval, idx) */
- val = scan_once_str(s, pystr, idx, &next_idx);
- if (val == NULL)
- goto bail;
-
- if (PyList_Append(rval, val) == -1)
- goto bail;
-
- Py_CLEAR(val);
- idx = next_idx;
-
- /* skip whitespace between term and , */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* bail if the array is closed or we didn't get the , delimiter */
- if (idx > end_idx) break;
- if (str[idx] == ']') {
- break;
- }
- else if (str[idx] != ',') {
- raise_errmsg("Expecting , delimiter", pystr, idx);
- goto bail;
- }
- idx++;
-
- /* skip whitespace after , */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
- }
- }
-
- /* verify that idx < end_idx, str[idx] should be ']' */
- if (idx > end_idx || str[idx] != ']') {
- raise_errmsg("Expecting object", pystr, end_idx);
- goto bail;
- }
- *next_idx_ptr = idx + 1;
- return rval;
-bail:
- Py_XDECREF(val);
- Py_DECREF(rval);
- return NULL;
-}
-
-static PyObject *
-_parse_array_unicode(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr) {
-    /* Read a JSON array from PyUnicode pystr.
-    idx is the index of the first character after the opening bracket.
-    *next_idx_ptr is a return-by-reference index to the first character after
-    the closing bracket.
-
- Returns a new PyList
- */
- Py_UNICODE *str = PyUnicode_AS_UNICODE(pystr);
- Py_ssize_t end_idx = PyUnicode_GET_SIZE(pystr) - 1;
- PyObject *val = NULL;
- PyObject *rval = PyList_New(0);
- Py_ssize_t next_idx;
- if (rval == NULL)
- return NULL;
-
- /* skip whitespace after [ */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* only loop if the array is non-empty */
- if (idx <= end_idx && str[idx] != ']') {
- while (idx <= end_idx) {
-
- /* read any JSON term */
- val = scan_once_unicode(s, pystr, idx, &next_idx);
- if (val == NULL)
- goto bail;
-
- if (PyList_Append(rval, val) == -1)
- goto bail;
-
- Py_CLEAR(val);
- idx = next_idx;
-
- /* skip whitespace between term and , */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
-
- /* bail if the array is closed or we didn't get the , delimiter */
- if (idx > end_idx) break;
- if (str[idx] == ']') {
- break;
- }
- else if (str[idx] != ',') {
- raise_errmsg("Expecting , delimiter", pystr, idx);
- goto bail;
- }
- idx++;
-
- /* skip whitespace after , */
- while (idx <= end_idx && IS_WHITESPACE(str[idx])) idx++;
- }
- }
-
- /* verify that idx < end_idx, str[idx] should be ']' */
- if (idx > end_idx || str[idx] != ']') {
- raise_errmsg("Expecting object", pystr, end_idx);
- goto bail;
- }
- *next_idx_ptr = idx + 1;
- return rval;
-bail:
- Py_XDECREF(val);
- Py_DECREF(rval);
- return NULL;
-}
-
-static PyObject *
-_parse_constant(PyScannerObject *s, char *constant, Py_ssize_t idx, Py_ssize_t *next_idx_ptr) {
-    /* Handle a JSON constant that the scanner already matched.
- constant is the constant string that was found
- ("NaN", "Infinity", "-Infinity").
- idx is the index of the first character of the constant
- *next_idx_ptr is a return-by-reference index to the first character after
- the constant.
-
- Returns the result of parse_constant
- */
- PyObject *cstr;
- PyObject *rval;
- /* constant is "NaN", "Infinity", or "-Infinity" */
- cstr = PyString_InternFromString(constant);
- if (cstr == NULL)
- return NULL;
-
- /* rval = parse_constant(constant) */
- rval = PyObject_CallFunctionObjArgs(s->parse_constant, cstr, NULL);
- idx += PyString_GET_SIZE(cstr);
- Py_DECREF(cstr);
- *next_idx_ptr = idx;
- return rval;
-}
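
The parse_constant hook is still part of the json API that grew out of
simplejson, so its effect is easy to observe with the standard library:

    import json
    json.loads('Infinity')                                    # float('inf')
    json.loads('Infinity', parse_constant=lambda name: name)  # 'Infinity'
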
-
-static PyObject *
-_match_number_str(PyScannerObject *s, PyObject *pystr, Py_ssize_t start, Py_ssize_t *next_idx_ptr) {
- /* Read a JSON number from PyString pystr.
-    start is the index of the first character of the number
- *next_idx_ptr is a return-by-reference index to the first character after
- the number.
-
- Returns a new PyObject representation of that number:
- PyInt, PyLong, or PyFloat.
- May return other types if parse_int or parse_float are set
- */
- char *str = PyString_AS_STRING(pystr);
- Py_ssize_t end_idx = PyString_GET_SIZE(pystr) - 1;
- Py_ssize_t idx = start;
- int is_float = 0;
- PyObject *rval;
- PyObject *numstr;
-
- /* read a sign if it's there, make sure it's not the end of the string */
- if (str[idx] == '-') {
- idx++;
- if (idx > end_idx) {
- PyErr_SetNone(PyExc_StopIteration);
- return NULL;
- }
- }
-
- /* read as many integer digits as we find as long as it doesn't start with 0 */
- if (str[idx] >= '1' && str[idx] <= '9') {
- idx++;
- while (idx <= end_idx && str[idx] >= '0' && str[idx] <= '9') idx++;
- }
- /* if it starts with 0 we only expect one integer digit */
- else if (str[idx] == '0') {
- idx++;
- }
- /* no integer digits, error */
- else {
- PyErr_SetNone(PyExc_StopIteration);
- return NULL;
- }
-
- /* if the next char is '.' followed by a digit then read all float digits */
- if (idx < end_idx && str[idx] == '.' && str[idx + 1] >= '0' && str[idx + 1] <= '9') {
- is_float = 1;
- idx += 2;
- while (idx <= end_idx && str[idx] >= '0' && str[idx] <= '9') idx++;
- }
-
- /* if the next char is 'e' or 'E' then maybe read the exponent (or backtrack) */
- if (idx < end_idx && (str[idx] == 'e' || str[idx] == 'E')) {
-
- /* save the index of the 'e' or 'E' just in case we need to backtrack */
- Py_ssize_t e_start = idx;
- idx++;
-
- /* read an exponent sign if present */
- if (idx < end_idx && (str[idx] == '-' || str[idx] == '+')) idx++;
-
- /* read all digits */
- while (idx <= end_idx && str[idx] >= '0' && str[idx] <= '9') idx++;
-
- /* if we got a digit, then parse as float. if not, backtrack */
- if (str[idx - 1] >= '0' && str[idx - 1] <= '9') {
- is_float = 1;
- }
- else {
- idx = e_start;
- }
- }
-
- /* copy the section we determined to be a number */
- numstr = PyString_FromStringAndSize(&str[start], idx - start);
- if (numstr == NULL)
- return NULL;
- if (is_float) {
- /* parse as a float using a fast path if available, otherwise call user defined method */
- if (s->parse_float != (PyObject *)&PyFloat_Type) {
- rval = PyObject_CallFunctionObjArgs(s->parse_float, numstr, NULL);
- }
- else {
- rval = PyFloat_FromDouble(PyOS_ascii_atof(PyString_AS_STRING(numstr)));
- }
- }
- else {
- /* parse as an int using a fast path if available, otherwise call user defined method */
- if (s->parse_int != (PyObject *)&PyInt_Type) {
- rval = PyObject_CallFunctionObjArgs(s->parse_int, numstr, NULL);
- }
- else {
- rval = PyInt_FromString(PyString_AS_STRING(numstr), NULL, 10);
- }
- }
- Py_DECREF(numstr);
- *next_idx_ptr = idx;
- return rval;
-}
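
The hand-rolled scan above accepts the JSON number grammar; expressed as a
regular expression (roughly what the pure-Python scanner uses) it is:

    import re
    NUMBER_RE = re.compile(r'-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][-+]?\d+)?')
    assert NUMBER_RE.match('-12.5e+3').group() == '-12.5e+3'
    assert NUMBER_RE.match('007').group() == '0'  # only one digit after a leading 0
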
-
-static PyObject *
-_match_number_unicode(PyScannerObject *s, PyObject *pystr, Py_ssize_t start, Py_ssize_t *next_idx_ptr) {
- /* Read a JSON number from PyUnicode pystr.
- idx is the index of the first character of the number
- *next_idx_ptr is a return-by-reference index to the first character after
- the number.
-
- Returns a new PyObject representation of that number:
- PyInt, PyLong, or PyFloat.
- May return other types if parse_int or parse_float are set
- */
- Py_UNICODE *str = PyUnicode_AS_UNICODE(pystr);
- Py_ssize_t end_idx = PyUnicode_GET_SIZE(pystr) - 1;
- Py_ssize_t idx = start;
- int is_float = 0;
- PyObject *rval;
- PyObject *numstr;
-
- /* read a sign if it's there, make sure it's not the end of the string */
- if (str[idx] == '-') {
- idx++;
- if (idx > end_idx) {
- PyErr_SetNone(PyExc_StopIteration);
- return NULL;
- }
- }
-
- /* read as many integer digits as we find as long as it doesn't start with 0 */
- if (str[idx] >= '1' && str[idx] <= '9') {
- idx++;
- while (idx <= end_idx && str[idx] >= '0' && str[idx] <= '9') idx++;
- }
- /* if it starts with 0 we only expect one integer digit */
- else if (str[idx] == '0') {
- idx++;
- }
- /* no integer digits, error */
- else {
- PyErr_SetNone(PyExc_StopIteration);
- return NULL;
- }
-
- /* if the next char is '.' followed by a digit then read all float digits */
- if (idx < end_idx && str[idx] == '.' && str[idx + 1] >= '0' && str[idx + 1] <= '9') {
- is_float = 1;
- idx += 2;
-        while (idx <= end_idx && str[idx] >= '0' && str[idx] <= '9') idx++;
- }
-
- /* if the next char is 'e' or 'E' then maybe read the exponent (or backtrack) */
- if (idx < end_idx && (str[idx] == 'e' || str[idx] == 'E')) {
- Py_ssize_t e_start = idx;
- idx++;
-
- /* read an exponent sign if present */
- if (idx < end_idx && (str[idx] == '-' || str[idx] == '+')) idx++;
-
- /* read all digits */
- while (idx <= end_idx && str[idx] >= '0' && str[idx] <= '9') idx++;
-
- /* if we got a digit, then parse as float. if not, backtrack */
- if (str[idx - 1] >= '0' && str[idx - 1] <= '9') {
- is_float = 1;
- }
- else {
- idx = e_start;
- }
- }
-
- /* copy the section we determined to be a number */
- numstr = PyUnicode_FromUnicode(&str[start], idx - start);
- if (numstr == NULL)
- return NULL;
- if (is_float) {
- /* parse as a float using a fast path if available, otherwise call user defined method */
- if (s->parse_float != (PyObject *)&PyFloat_Type) {
- rval = PyObject_CallFunctionObjArgs(s->parse_float, numstr, NULL);
- }
- else {
- rval = PyFloat_FromString(numstr, NULL);
- }
- }
- else {
- /* no fast path for unicode -> int, just call */
- rval = PyObject_CallFunctionObjArgs(s->parse_int, numstr, NULL);
- }
- Py_DECREF(numstr);
- *next_idx_ptr = idx;
- return rval;
-}
-
-static PyObject *
-scan_once_str(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr)
-{
- /* Read one JSON term (of any kind) from PyString pystr.
- idx is the index of the first character of the term
- *next_idx_ptr is a return-by-reference index to the first character after
-    the term.
-
- Returns a new PyObject representation of the term.
- */
- char *str = PyString_AS_STRING(pystr);
- Py_ssize_t length = PyString_GET_SIZE(pystr);
- if (idx >= length) {
- PyErr_SetNone(PyExc_StopIteration);
- return NULL;
- }
- switch (str[idx]) {
- case '"':
- /* string */
- return scanstring_str(pystr, idx + 1,
- PyString_AS_STRING(s->encoding),
- PyObject_IsTrue(s->strict),
- next_idx_ptr);
- case '{':
- /* object */
- return _parse_object_str(s, pystr, idx + 1, next_idx_ptr);
- case '[':
- /* array */
- return _parse_array_str(s, pystr, idx + 1, next_idx_ptr);
- case 'n':
- /* null */
- if ((idx + 3 < length) && str[idx + 1] == 'u' && str[idx + 2] == 'l' && str[idx + 3] == 'l') {
- Py_INCREF(Py_None);
- *next_idx_ptr = idx + 4;
- return Py_None;
- }
- break;
- case 't':
- /* true */
- if ((idx + 3 < length) && str[idx + 1] == 'r' && str[idx + 2] == 'u' && str[idx + 3] == 'e') {
- Py_INCREF(Py_True);
- *next_idx_ptr = idx + 4;
- return Py_True;
- }
- break;
- case 'f':
- /* false */
- if ((idx + 4 < length) && str[idx + 1] == 'a' && str[idx + 2] == 'l' && str[idx + 3] == 's' && str[idx + 4] == 'e') {
- Py_INCREF(Py_False);
- *next_idx_ptr = idx + 5;
- return Py_False;
- }
- break;
- case 'N':
- /* NaN */
- if ((idx + 2 < length) && str[idx + 1] == 'a' && str[idx + 2] == 'N') {
- return _parse_constant(s, "NaN", idx, next_idx_ptr);
- }
- break;
- case 'I':
- /* Infinity */
- if ((idx + 7 < length) && str[idx + 1] == 'n' && str[idx + 2] == 'f' && str[idx + 3] == 'i' && str[idx + 4] == 'n' && str[idx + 5] == 'i' && str[idx + 6] == 't' && str[idx + 7] == 'y') {
- return _parse_constant(s, "Infinity", idx, next_idx_ptr);
- }
- break;
- case '-':
- /* -Infinity */
- if ((idx + 8 < length) && str[idx + 1] == 'I' && str[idx + 2] == 'n' && str[idx + 3] == 'f' && str[idx + 4] == 'i' && str[idx + 5] == 'n' && str[idx + 6] == 'i' && str[idx + 7] == 't' && str[idx + 8] == 'y') {
- return _parse_constant(s, "-Infinity", idx, next_idx_ptr);
- }
- break;
- }
- /* Didn't find a string, object, array, or named constant. Look for a number. */
- return _match_number_str(s, pystr, idx, next_idx_ptr);
-}
-
-static PyObject *
-scan_once_unicode(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr)
-{
- /* Read one JSON term (of any kind) from PyUnicode pystr.
- idx is the index of the first character of the term
- *next_idx_ptr is a return-by-reference index to the first character after
-    the term.
-
- Returns a new PyObject representation of the term.
- */
- Py_UNICODE *str = PyUnicode_AS_UNICODE(pystr);
- Py_ssize_t length = PyUnicode_GET_SIZE(pystr);
- if (idx >= length) {
- PyErr_SetNone(PyExc_StopIteration);
- return NULL;
- }
- switch (str[idx]) {
- case '"':
- /* string */
- return scanstring_unicode(pystr, idx + 1,
- PyObject_IsTrue(s->strict),
- next_idx_ptr);
- case '{':
- /* object */
- return _parse_object_unicode(s, pystr, idx + 1, next_idx_ptr);
- case '[':
- /* array */
- return _parse_array_unicode(s, pystr, idx + 1, next_idx_ptr);
- case 'n':
- /* null */
- if ((idx + 3 < length) && str[idx + 1] == 'u' && str[idx + 2] == 'l' && str[idx + 3] == 'l') {
- Py_INCREF(Py_None);
- *next_idx_ptr = idx + 4;
- return Py_None;
- }
- break;
- case 't':
- /* true */
- if ((idx + 3 < length) && str[idx + 1] == 'r' && str[idx + 2] == 'u' && str[idx + 3] == 'e') {
- Py_INCREF(Py_True);
- *next_idx_ptr = idx + 4;
- return Py_True;
- }
- break;
- case 'f':
- /* false */
- if ((idx + 4 < length) && str[idx + 1] == 'a' && str[idx + 2] == 'l' && str[idx + 3] == 's' && str[idx + 4] == 'e') {
- Py_INCREF(Py_False);
- *next_idx_ptr = idx + 5;
- return Py_False;
- }
- break;
- case 'N':
- /* NaN */
- if ((idx + 2 < length) && str[idx + 1] == 'a' && str[idx + 2] == 'N') {
- return _parse_constant(s, "NaN", idx, next_idx_ptr);
- }
- break;
- case 'I':
- /* Infinity */
- if ((idx + 7 < length) && str[idx + 1] == 'n' && str[idx + 2] == 'f' && str[idx + 3] == 'i' && str[idx + 4] == 'n' && str[idx + 5] == 'i' && str[idx + 6] == 't' && str[idx + 7] == 'y') {
- return _parse_constant(s, "Infinity", idx, next_idx_ptr);
- }
- break;
- case '-':
- /* -Infinity */
- if ((idx + 8 < length) && str[idx + 1] == 'I' && str[idx + 2] == 'n' && str[idx + 3] == 'f' && str[idx + 4] == 'i' && str[idx + 5] == 'n' && str[idx + 6] == 'i' && str[idx + 7] == 't' && str[idx + 8] == 'y') {
- return _parse_constant(s, "-Infinity", idx, next_idx_ptr);
- }
- break;
- }
- /* Didn't find a string, object, array, or named constant. Look for a number. */
- return _match_number_unicode(s, pystr, idx, next_idx_ptr);
-}
-
-static PyObject *
-scanner_call(PyObject *self, PyObject *args, PyObject *kwds)
-{
- /* Python callable interface to scan_once_{str,unicode} */
- PyObject *pystr;
- PyObject *rval;
- Py_ssize_t idx;
- Py_ssize_t next_idx = -1;
- static char *kwlist[] = {"string", "idx", NULL};
- PyScannerObject *s;
- assert(PyScanner_Check(self));
- s = (PyScannerObject *)self;
- if (!PyArg_ParseTupleAndKeywords(args, kwds, "OO&:scan_once", kwlist, &pystr, _convertPyInt_AsSsize_t, &idx))
- return NULL;
-
- if (PyString_Check(pystr)) {
- rval = scan_once_str(s, pystr, idx, &next_idx);
- }
- else if (PyUnicode_Check(pystr)) {
- rval = scan_once_unicode(s, pystr, idx, &next_idx);
- }
- else {
- PyErr_Format(PyExc_TypeError,
- "first argument must be a string, not %.80s",
- Py_TYPE(pystr)->tp_name);
- return NULL;
- }
- return _build_rval_index_tuple(rval, next_idx);
-}
-
-static PyObject *
-scanner_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
-{
- PyScannerObject *s;
- s = (PyScannerObject *)type->tp_alloc(type, 0);
- if (s != NULL) {
- s->encoding = NULL;
- s->strict = NULL;
- s->object_hook = NULL;
- s->parse_float = NULL;
- s->parse_int = NULL;
- s->parse_constant = NULL;
- }
- return (PyObject *)s;
-}
-
-static int
-scanner_init(PyObject *self, PyObject *args, PyObject *kwds)
-{
- /* Initialize Scanner object */
- PyObject *ctx;
- static char *kwlist[] = {"context", NULL};
- PyScannerObject *s;
-
- assert(PyScanner_Check(self));
- s = (PyScannerObject *)self;
-
- if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:make_scanner", kwlist, &ctx))
- return -1;
-
- /* PyString_AS_STRING is used on encoding */
- s->encoding = PyObject_GetAttrString(ctx, "encoding");
- if (s->encoding == Py_None) {
- Py_DECREF(Py_None);
- s->encoding = PyString_InternFromString(DEFAULT_ENCODING);
- }
- else if (PyUnicode_Check(s->encoding)) {
- PyObject *tmp = PyUnicode_AsEncodedString(s->encoding, NULL, NULL);
- Py_DECREF(s->encoding);
- s->encoding = tmp;
- }
- if (s->encoding == NULL || !PyString_Check(s->encoding))
- goto bail;
-
- /* All of these will fail "gracefully" so we don't need to verify them */
- s->strict = PyObject_GetAttrString(ctx, "strict");
- if (s->strict == NULL)
- goto bail;
- s->object_hook = PyObject_GetAttrString(ctx, "object_hook");
- if (s->object_hook == NULL)
- goto bail;
- s->parse_float = PyObject_GetAttrString(ctx, "parse_float");
- if (s->parse_float == NULL)
- goto bail;
- s->parse_int = PyObject_GetAttrString(ctx, "parse_int");
- if (s->parse_int == NULL)
- goto bail;
- s->parse_constant = PyObject_GetAttrString(ctx, "parse_constant");
- if (s->parse_constant == NULL)
- goto bail;
-
- return 0;
-
-bail:
- Py_CLEAR(s->encoding);
- Py_CLEAR(s->strict);
- Py_CLEAR(s->object_hook);
- Py_CLEAR(s->parse_float);
- Py_CLEAR(s->parse_int);
- Py_CLEAR(s->parse_constant);
- return -1;
-}
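
scanner_init reads everything off the single context argument, so any object
with these attributes will do; a minimal hypothetical context, mirroring what
JSONDecoder supplies:

    from simplejson import _speedups  # the extension removed by this commit

    class Context(object):
        encoding = 'utf-8'
        strict = True
        object_hook = None             # None leaves dicts untouched
        parse_float = float            # takes the PyFloat fast path in _match_number_str
        parse_int = int                # takes the PyInt fast path
        parse_constant = {'NaN': float('nan'),
                          'Infinity': float('inf'),
                          '-Infinity': float('-inf')}.__getitem__

    scan_once = _speedups.make_scanner(Context())
    value, end = scan_once('{"a": [1, 2.5]}', 0)
    assert value == {'a': [1, 2.5]} and end == 15
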
-
-PyDoc_STRVAR(scanner_doc, "JSON scanner object");
-
-static
-PyTypeObject PyScannerType = {
- PyObject_HEAD_INIT(NULL)
- 0, /* tp_internal */
- "simplejson._speedups.Scanner", /* tp_name */
- sizeof(PyScannerObject), /* tp_basicsize */
- 0, /* tp_itemsize */
- scanner_dealloc, /* tp_dealloc */
- 0, /* tp_print */
- 0, /* tp_getattr */
- 0, /* tp_setattr */
- 0, /* tp_compare */
- 0, /* tp_repr */
- 0, /* tp_as_number */
- 0, /* tp_as_sequence */
- 0, /* tp_as_mapping */
- 0, /* tp_hash */
- scanner_call, /* tp_call */
- 0, /* tp_str */
- 0,/* PyObject_GenericGetAttr, */ /* tp_getattro */
- 0,/* PyObject_GenericSetAttr, */ /* tp_setattro */
- 0, /* tp_as_buffer */
- Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, /* tp_flags */
- scanner_doc, /* tp_doc */
- scanner_traverse, /* tp_traverse */
- scanner_clear, /* tp_clear */
- 0, /* tp_richcompare */
- 0, /* tp_weaklistoffset */
- 0, /* tp_iter */
- 0, /* tp_iternext */
- 0, /* tp_methods */
- scanner_members, /* tp_members */
- 0, /* tp_getset */
- 0, /* tp_base */
- 0, /* tp_dict */
- 0, /* tp_descr_get */
- 0, /* tp_descr_set */
- 0, /* tp_dictoffset */
- scanner_init, /* tp_init */
- 0,/* PyType_GenericAlloc, */ /* tp_alloc */
- scanner_new, /* tp_new */
- 0,/* PyObject_GC_Del, */ /* tp_free */
-};
-
-static PyObject *
-encoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
-{
- PyEncoderObject *s;
- s = (PyEncoderObject *)type->tp_alloc(type, 0);
- if (s != NULL) {
- s->markers = NULL;
- s->defaultfn = NULL;
- s->encoder = NULL;
- s->indent = NULL;
- s->key_separator = NULL;
- s->item_separator = NULL;
- s->sort_keys = NULL;
- s->skipkeys = NULL;
- }
- return (PyObject *)s;
-}
-
-static int
-encoder_init(PyObject *self, PyObject *args, PyObject *kwds)
-{
- /* initialize Encoder object */
- static char *kwlist[] = {"markers", "default", "encoder", "indent", "key_separator", "item_separator", "sort_keys", "skipkeys", "allow_nan", NULL};
-
- PyEncoderObject *s;
- PyObject *allow_nan;
-
- assert(PyEncoder_Check(self));
- s = (PyEncoderObject *)self;
-
- if (!PyArg_ParseTupleAndKeywords(args, kwds, "OOOOOOOOO:make_encoder", kwlist,
- &s->markers, &s->defaultfn, &s->encoder, &s->indent, &s->key_separator, &s->item_separator, &s->sort_keys, &s->skipkeys, &allow_nan))
- return -1;
-
- Py_INCREF(s->markers);
- Py_INCREF(s->defaultfn);
- Py_INCREF(s->encoder);
- Py_INCREF(s->indent);
- Py_INCREF(s->key_separator);
- Py_INCREF(s->item_separator);
- Py_INCREF(s->sort_keys);
- Py_INCREF(s->skipkeys);
- s->fast_encode = (PyCFunction_Check(s->encoder) && PyCFunction_GetFunction(s->encoder) == (PyCFunction)py_encode_basestring_ascii);
- s->allow_nan = PyObject_IsTrue(allow_nan);
- return 0;
-}
-
-static PyObject *
-encoder_call(PyObject *self, PyObject *args, PyObject *kwds)
-{
- /* Python callable interface to encode_listencode_obj */
- static char *kwlist[] = {"obj", "_current_indent_level", NULL};
- PyObject *obj;
- PyObject *rval;
- Py_ssize_t indent_level;
- PyEncoderObject *s;
- assert(PyEncoder_Check(self));
- s = (PyEncoderObject *)self;
- if (!PyArg_ParseTupleAndKeywords(args, kwds, "OO&:_iterencode", kwlist,
- &obj, _convertPyInt_AsSsize_t, &indent_level))
- return NULL;
- rval = PyList_New(0);
- if (rval == NULL)
- return NULL;
- if (encoder_listencode_obj(s, rval, obj, indent_level)) {
- Py_DECREF(rval);
- return NULL;
- }
- return rval;
-}
-
-static PyObject *
-_encoded_const(PyObject *obj)
-{
- /* Return the JSON string representation of None, True, False */
- if (obj == Py_None) {
- static PyObject *s_null = NULL;
- if (s_null == NULL) {
- s_null = PyString_InternFromString("null");
- }
- Py_INCREF(s_null);
- return s_null;
- }
- else if (obj == Py_True) {
- static PyObject *s_true = NULL;
- if (s_true == NULL) {
- s_true = PyString_InternFromString("true");
- }
- Py_INCREF(s_true);
- return s_true;
- }
- else if (obj == Py_False) {
- static PyObject *s_false = NULL;
- if (s_false == NULL) {
- s_false = PyString_InternFromString("false");
- }
- Py_INCREF(s_false);
- return s_false;
- }
- else {
- PyErr_SetString(PyExc_ValueError, "not a const");
- return NULL;
- }
-}
-
-static PyObject *
-encoder_encode_float(PyEncoderObject *s, PyObject *obj)
-{
- /* Return the JSON representation of a PyFloat */
- double i = PyFloat_AS_DOUBLE(obj);
- if (!Py_IS_FINITE(i)) {
- if (!s->allow_nan) {
- PyErr_SetString(PyExc_ValueError, "Out of range float values are not JSON compliant");
- return NULL;
- }
- if (i > 0) {
- return PyString_FromString("Infinity");
- }
- else if (i < 0) {
- return PyString_FromString("-Infinity");
- }
- else {
- return PyString_FromString("NaN");
- }
- }
- /* Use a better float format here? */
- return PyObject_Repr(obj);
-}
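
Non-finite floats become the JavaScript constant names unless allow_nan is
false; the standard library json module, a descendant of this code, behaves
the same way:

    import json
    assert json.dumps(float('inf')) == 'Infinity'
    try:
        json.dumps(float('nan'), allow_nan=False)
    except ValueError:
        pass  # "Out of range float values are not JSON compliant"
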
-
-static PyObject *
-encoder_encode_string(PyEncoderObject *s, PyObject *obj)
-{
- /* Return the JSON representation of a string */
- if (s->fast_encode)
- return py_encode_basestring_ascii(NULL, obj);
- else
- return PyObject_CallFunctionObjArgs(s->encoder, obj, NULL);
-}
-
-static int
-_steal_list_append(PyObject *lst, PyObject *stolen)
-{
- /* Append stolen and then decrement its reference count */
- int rval = PyList_Append(lst, stolen);
- Py_DECREF(stolen);
- return rval;
-}
-
-static int
-encoder_listencode_obj(PyEncoderObject *s, PyObject *rval, PyObject *obj, Py_ssize_t indent_level)
-{
- /* Encode Python object obj to a JSON term, rval is a PyList */
- PyObject *newobj;
- int rv;
-
- if (obj == Py_None || obj == Py_True || obj == Py_False) {
- PyObject *cstr = _encoded_const(obj);
- if (cstr == NULL)
- return -1;
- return _steal_list_append(rval, cstr);
- }
- else if (PyString_Check(obj) || PyUnicode_Check(obj))
- {
- PyObject *encoded = encoder_encode_string(s, obj);
- if (encoded == NULL)
- return -1;
- return _steal_list_append(rval, encoded);
- }
- else if (PyInt_Check(obj) || PyLong_Check(obj)) {
- PyObject *encoded = PyObject_Str(obj);
- if (encoded == NULL)
- return -1;
- return _steal_list_append(rval, encoded);
- }
- else if (PyFloat_Check(obj)) {
- PyObject *encoded = encoder_encode_float(s, obj);
- if (encoded == NULL)
- return -1;
- return _steal_list_append(rval, encoded);
- }
- else if (PyList_Check(obj) || PyTuple_Check(obj)) {
- return encoder_listencode_list(s, rval, obj, indent_level);
- }
- else if (PyDict_Check(obj)) {
- return encoder_listencode_dict(s, rval, obj, indent_level);
- }
- else {
- PyObject *ident = NULL;
- if (s->markers != Py_None) {
- int has_key;
- ident = PyLong_FromVoidPtr(obj);
- if (ident == NULL)
- return -1;
- has_key = PyDict_Contains(s->markers, ident);
- if (has_key) {
- if (has_key != -1)
- PyErr_SetString(PyExc_ValueError, "Circular reference detected");
- Py_DECREF(ident);
- return -1;
- }
- if (PyDict_SetItem(s->markers, ident, obj)) {
- Py_DECREF(ident);
- return -1;
- }
- }
- newobj = PyObject_CallFunctionObjArgs(s->defaultfn, obj, NULL);
- if (newobj == NULL) {
- Py_XDECREF(ident);
- return -1;
- }
- rv = encoder_listencode_obj(s, rval, newobj, indent_level);
- Py_DECREF(newobj);
- if (rv) {
- Py_XDECREF(ident);
- return -1;
- }
- if (ident != NULL) {
- if (PyDict_DelItem(s->markers, ident)) {
- Py_XDECREF(ident);
- return -1;
- }
- Py_XDECREF(ident);
- }
- return rv;
- }
-}
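
The markers dict is what turns a self-referencing container into a clean
error instead of unbounded recursion; the check is observable from Python
(stdlib json shown, same lineage):

    import json
    lst = []
    lst.append(lst)   # a list that contains itself
    try:
        json.dumps(lst)
    except ValueError:
        pass  # "Circular reference detected"
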
-
-static int
-encoder_listencode_dict(PyEncoderObject *s, PyObject *rval, PyObject *dct, Py_ssize_t indent_level)
-{
-    /* Encode Python dict dct to a JSON term, rval is a PyList */
- static PyObject *open_dict = NULL;
- static PyObject *close_dict = NULL;
- static PyObject *empty_dict = NULL;
- PyObject *kstr = NULL;
- PyObject *ident = NULL;
- PyObject *key, *value;
- Py_ssize_t pos;
- int skipkeys;
- Py_ssize_t idx;
-
- if (open_dict == NULL || close_dict == NULL || empty_dict == NULL) {
- open_dict = PyString_InternFromString("{");
- close_dict = PyString_InternFromString("}");
- empty_dict = PyString_InternFromString("{}");
- if (open_dict == NULL || close_dict == NULL || empty_dict == NULL)
- return -1;
- }
- if (PyDict_Size(dct) == 0)
- return PyList_Append(rval, empty_dict);
-
- if (s->markers != Py_None) {
- int has_key;
- ident = PyLong_FromVoidPtr(dct);
- if (ident == NULL)
- goto bail;
- has_key = PyDict_Contains(s->markers, ident);
- if (has_key) {
- if (has_key != -1)
- PyErr_SetString(PyExc_ValueError, "Circular reference detected");
- goto bail;
- }
- if (PyDict_SetItem(s->markers, ident, dct)) {
- goto bail;
- }
- }
-
- if (PyList_Append(rval, open_dict))
- goto bail;
-
- if (s->indent != Py_None) {
- /* TODO: DOES NOT RUN */
- indent_level += 1;
- /*
- newline_indent = '\n' + (' ' * (_indent * _current_indent_level))
- separator = _item_separator + newline_indent
- buf += newline_indent
- */
- }
-
- /* TODO: C speedup not implemented for sort_keys */
-
- pos = 0;
- skipkeys = PyObject_IsTrue(s->skipkeys);
- idx = 0;
- while (PyDict_Next(dct, &pos, &key, &value)) {
- PyObject *encoded;
-
- if (PyString_Check(key) || PyUnicode_Check(key)) {
- Py_INCREF(key);
- kstr = key;
- }
- else if (PyFloat_Check(key)) {
- kstr = encoder_encode_float(s, key);
- if (kstr == NULL)
- goto bail;
- }
- else if (PyInt_Check(key) || PyLong_Check(key)) {
- kstr = PyObject_Str(key);
- if (kstr == NULL)
- goto bail;
- }
- else if (key == Py_True || key == Py_False || key == Py_None) {
- kstr = _encoded_const(key);
- if (kstr == NULL)
- goto bail;
- }
- else if (skipkeys) {
- continue;
- }
- else {
- /* TODO: include repr of key */
- PyErr_SetString(PyExc_ValueError, "keys must be a string");
- goto bail;
- }
-
- if (idx) {
- if (PyList_Append(rval, s->item_separator))
- goto bail;
- }
-
- encoded = encoder_encode_string(s, kstr);
- Py_CLEAR(kstr);
- if (encoded == NULL)
- goto bail;
- if (PyList_Append(rval, encoded)) {
- Py_DECREF(encoded);
- goto bail;
- }
- Py_DECREF(encoded);
- if (PyList_Append(rval, s->key_separator))
- goto bail;
- if (encoder_listencode_obj(s, rval, value, indent_level))
- goto bail;
- idx += 1;
- }
- if (ident != NULL) {
- if (PyDict_DelItem(s->markers, ident))
- goto bail;
- Py_CLEAR(ident);
- }
- if (s->indent != Py_None) {
- /* TODO: DOES NOT RUN */
- indent_level -= 1;
- /*
- yield '\n' + (' ' * (_indent * _current_indent_level))
- */
- }
- if (PyList_Append(rval, close_dict))
- goto bail;
- return 0;
-
-bail:
- Py_XDECREF(kstr);
- Py_XDECREF(ident);
- return -1;
-}
-
-
-static int
-encoder_listencode_list(PyEncoderObject *s, PyObject *rval, PyObject *seq, Py_ssize_t indent_level)
-{
- /* Encode Python list seq to a JSON term, rval is a PyList */
- static PyObject *open_array = NULL;
- static PyObject *close_array = NULL;
- static PyObject *empty_array = NULL;
- PyObject *ident = NULL;
- PyObject *s_fast = NULL;
- Py_ssize_t num_items;
- PyObject **seq_items;
- Py_ssize_t i;
-
- if (open_array == NULL || close_array == NULL || empty_array == NULL) {
- open_array = PyString_InternFromString("[");
- close_array = PyString_InternFromString("]");
- empty_array = PyString_InternFromString("[]");
- if (open_array == NULL || close_array == NULL || empty_array == NULL)
- return -1;
- }
- ident = NULL;
- s_fast = PySequence_Fast(seq, "_iterencode_list needs a sequence");
- if (s_fast == NULL)
- return -1;
- num_items = PySequence_Fast_GET_SIZE(s_fast);
- if (num_items == 0) {
- Py_DECREF(s_fast);
- return PyList_Append(rval, empty_array);
- }
-
- if (s->markers != Py_None) {
- int has_key;
- ident = PyLong_FromVoidPtr(seq);
- if (ident == NULL)
- goto bail;
- has_key = PyDict_Contains(s->markers, ident);
- if (has_key) {
- if (has_key != -1)
- PyErr_SetString(PyExc_ValueError, "Circular reference detected");
- goto bail;
- }
- if (PyDict_SetItem(s->markers, ident, seq)) {
- goto bail;
- }
- }
-
- seq_items = PySequence_Fast_ITEMS(s_fast);
- if (PyList_Append(rval, open_array))
- goto bail;
- if (s->indent != Py_None) {
- /* TODO: DOES NOT RUN */
- indent_level += 1;
- /*
- newline_indent = '\n' + (' ' * (_indent * _current_indent_level))
- separator = _item_separator + newline_indent
- buf += newline_indent
- */
- }
- for (i = 0; i < num_items; i++) {
- PyObject *obj = seq_items[i];
- if (i) {
- if (PyList_Append(rval, s->item_separator))
- goto bail;
- }
- if (encoder_listencode_obj(s, rval, obj, indent_level))
- goto bail;
- }
- if (ident != NULL) {
- if (PyDict_DelItem(s->markers, ident))
- goto bail;
- Py_CLEAR(ident);
- }
- if (s->indent != Py_None) {
- /* TODO: DOES NOT RUN */
- indent_level -= 1;
- /*
- yield '\n' + (' ' * (_indent * _current_indent_level))
- */
- }
- if (PyList_Append(rval, close_array))
- goto bail;
- Py_DECREF(s_fast);
- return 0;
-
-bail:
- Py_XDECREF(ident);
- Py_DECREF(s_fast);
- return -1;
-}
-
-static void
-encoder_dealloc(PyObject *self)
-{
- /* Deallocate Encoder */
- encoder_clear(self);
- Py_TYPE(self)->tp_free(self);
-}
-
-static int
-encoder_traverse(PyObject *self, visitproc visit, void *arg)
-{
- PyEncoderObject *s;
- assert(PyEncoder_Check(self));
- s = (PyEncoderObject *)self;
- Py_VISIT(s->markers);
- Py_VISIT(s->defaultfn);
- Py_VISIT(s->encoder);
- Py_VISIT(s->indent);
- Py_VISIT(s->key_separator);
- Py_VISIT(s->item_separator);
- Py_VISIT(s->sort_keys);
- Py_VISIT(s->skipkeys);
- return 0;
-}
-
-static int
-encoder_clear(PyObject *self)
-{
-    /* Clear the Encoder's references (GC support) */
- PyEncoderObject *s;
- assert(PyEncoder_Check(self));
- s = (PyEncoderObject *)self;
- Py_CLEAR(s->markers);
- Py_CLEAR(s->defaultfn);
- Py_CLEAR(s->encoder);
- Py_CLEAR(s->indent);
- Py_CLEAR(s->key_separator);
- Py_CLEAR(s->item_separator);
- Py_CLEAR(s->sort_keys);
- Py_CLEAR(s->skipkeys);
- return 0;
-}
-
-PyDoc_STRVAR(encoder_doc, "_iterencode(obj, _current_indent_level) -> iterable");
-
-static
-PyTypeObject PyEncoderType = {
- PyObject_HEAD_INIT(NULL)
- 0, /* tp_internal */
- "simplejson._speedups.Encoder", /* tp_name */
- sizeof(PyEncoderObject), /* tp_basicsize */
- 0, /* tp_itemsize */
- encoder_dealloc, /* tp_dealloc */
- 0, /* tp_print */
- 0, /* tp_getattr */
- 0, /* tp_setattr */
- 0, /* tp_compare */
- 0, /* tp_repr */
- 0, /* tp_as_number */
- 0, /* tp_as_sequence */
- 0, /* tp_as_mapping */
- 0, /* tp_hash */
- encoder_call, /* tp_call */
- 0, /* tp_str */
- 0, /* tp_getattro */
- 0, /* tp_setattro */
- 0, /* tp_as_buffer */
- Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, /* tp_flags */
- encoder_doc, /* tp_doc */
- encoder_traverse, /* tp_traverse */
- encoder_clear, /* tp_clear */
- 0, /* tp_richcompare */
- 0, /* tp_weaklistoffset */
- 0, /* tp_iter */
- 0, /* tp_iternext */
- 0, /* tp_methods */
- encoder_members, /* tp_members */
- 0, /* tp_getset */
- 0, /* tp_base */
- 0, /* tp_dict */
- 0, /* tp_descr_get */
- 0, /* tp_descr_set */
- 0, /* tp_dictoffset */
- encoder_init, /* tp_init */
- 0, /* tp_alloc */
- encoder_new, /* tp_new */
- 0, /* tp_free */
-};
-
-static PyMethodDef speedups_methods[] = {
- {"encode_basestring_ascii",
- (PyCFunction)py_encode_basestring_ascii,
- METH_O,
- pydoc_encode_basestring_ascii},
- {"scanstring",
- (PyCFunction)py_scanstring,
- METH_VARARGS,
- pydoc_scanstring},
- {NULL, NULL, 0, NULL}
-};
-
-PyDoc_STRVAR(module_doc,
-"simplejson speedups\n");
-
-void
-init_speedups(void)
-{
- PyObject *m;
- PyScannerType.tp_new = PyType_GenericNew;
- if (PyType_Ready(&PyScannerType) < 0)
- return;
- PyEncoderType.tp_new = PyType_GenericNew;
- if (PyType_Ready(&PyEncoderType) < 0)
- return;
- m = Py_InitModule3("_speedups", speedups_methods, module_doc);
- Py_INCREF((PyObject*)&PyScannerType);
- PyModule_AddObject(m, "make_scanner", (PyObject*)&PyScannerType);
- Py_INCREF((PyObject*)&PyEncoderType);
- PyModule_AddObject(m, "make_encoder", (PyObject*)&PyEncoderType);
-}
diff --git a/lang/py/lib/simplejson/decoder.py b/lang/py/lib/simplejson/decoder.py
deleted file mode 100644
index b769ea4..0000000
--- a/lang/py/lib/simplejson/decoder.py
+++ /dev/null
@@ -1,354 +0,0 @@
-"""Implementation of JSONDecoder
-"""
-import re
-import sys
-import struct
-
-from simplejson.scanner import make_scanner
-try:
- from simplejson._speedups import scanstring as c_scanstring
-except ImportError:
- c_scanstring = None
-
-__all__ = ['JSONDecoder']
-
-FLAGS = re.VERBOSE | re.MULTILINE | re.DOTALL
-
-def _floatconstants():
- _BYTES = '7FF80000000000007FF0000000000000'.decode('hex')
- if sys.byteorder != 'big':
- _BYTES = _BYTES[:8][::-1] + _BYTES[8:][::-1]
- nan, inf = struct.unpack('dd', _BYTES)
- return nan, inf, -inf
-
-NaN, PosInf, NegInf = _floatconstants()
-
-
-def linecol(doc, pos):
- lineno = doc.count('\n', 0, pos) + 1
- if lineno == 1:
- colno = pos
- else:
- colno = pos - doc.rindex('\n', 0, pos)
- return lineno, colno
-
-
-def errmsg(msg, doc, pos, end=None):
- # Note that this function is called from _speedups
- lineno, colno = linecol(doc, pos)
- if end is None:
- #fmt = '{0}: line {1} column {2} (char {3})'
- #return fmt.format(msg, lineno, colno, pos)
- fmt = '%s: line %d column %d (char %d)'
- return fmt % (msg, lineno, colno, pos)
- endlineno, endcolno = linecol(doc, end)
- #fmt = '{0}: line {1} column {2} - line {3} column {4} (char {5} - {6})'
- #return fmt.format(msg, lineno, colno, endlineno, endcolno, pos, end)
- fmt = '%s: line %d column %d - line %d column %d (char %d - %d)'
- return fmt % (msg, lineno, colno, endlineno, endcolno, pos, end)
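
A worked check of the position arithmetic (the document text is hypothetical):

    doc = '{"a": 1,\n "b" }'
    assert linecol(doc, 10) == (2, 2)
    # errmsg('Expecting property name', doc, 10)
    #   -> 'Expecting property name: line 2 column 2 (char 10)'
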
-
-
-_CONSTANTS = {
- '-Infinity': NegInf,
- 'Infinity': PosInf,
- 'NaN': NaN,
-}
-
-STRINGCHUNK = re.compile(r'(.*?)(["\\\x00-\x1f])', FLAGS)
-BACKSLASH = {
- '"': u'"', '\\': u'\\', '/': u'/',
- 'b': u'\b', 'f': u'\f', 'n': u'\n', 'r': u'\r', 't': u'\t',
-}
-
-DEFAULT_ENCODING = "utf-8"
-
-def py_scanstring(s, end, encoding=None, strict=True, _b=BACKSLASH, _m=STRINGCHUNK.match):
- """Scan the string s for a JSON string. End is the index of the
- character in s after the quote that started the JSON string.
- Unescapes all valid JSON string escape sequences and raises ValueError
- on attempt to decode an invalid string. If strict is False then literal
- control characters are allowed in the string.
-
- Returns a tuple of the decoded string and the index of the character in s
- after the end quote."""
- if encoding is None:
- encoding = DEFAULT_ENCODING
- chunks = []
- _append = chunks.append
- begin = end - 1
- while 1:
- chunk = _m(s, end)
- if chunk is None:
- raise ValueError(
- errmsg("Unterminated string starting at", s, begin))
- end = chunk.end()
- content, terminator = chunk.groups()
- # Content is contains zero or more unescaped string characters
- if content:
- if not isinstance(content, unicode):
- content = unicode(content, encoding)
- _append(content)
- # Terminator is the end of string, a literal control character,
- # or a backslash denoting that an escape sequence follows
- if terminator == '"':
- break
- elif terminator != '\\':
- if strict:
- msg = "Invalid control character %r at" % (terminator,)
- #msg = "Invalid control character {0!r} at".format(terminator)
- raise ValueError(errmsg(msg, s, end))
- else:
- _append(terminator)
- continue
- try:
- esc = s[end]
- except IndexError:
- raise ValueError(
- errmsg("Unterminated string starting at", s, begin))
- # If not a unicode escape sequence, must be in the lookup table
- if esc != 'u':
- try:
- char = _b[esc]
- except KeyError:
- msg = "Invalid \\escape: " + repr(esc)
- raise ValueError(errmsg(msg, s, end))
- end += 1
- else:
- # Unicode escape sequence
- esc = s[end + 1:end + 5]
- next_end = end + 5
- if len(esc) != 4:
- msg = "Invalid \\uXXXX escape"
- raise ValueError(errmsg(msg, s, end))
- uni = int(esc, 16)
- # Check for surrogate pair on UCS-4 systems
- if 0xd800 <= uni <= 0xdbff and sys.maxunicode > 65535:
- msg = "Invalid \\uXXXX\\uXXXX surrogate pair"
- if not s[end + 5:end + 7] == '\\u':
- raise ValueError(errmsg(msg, s, end))
- esc2 = s[end + 7:end + 11]
- if len(esc2) != 4:
- raise ValueError(errmsg(msg, s, end))
- uni2 = int(esc2, 16)
- uni = 0x10000 + (((uni - 0xd800) << 10) | (uni2 - 0xdc00))
- next_end += 6
- char = unichr(uni)
- end = next_end
- # Append the unescaped character
- _append(char)
- return u''.join(chunks), end
-
-
-# Use speedup if available
-scanstring = c_scanstring or py_scanstring
-
-WHITESPACE = re.compile(r'[ \t\n\r]*', FLAGS)
-WHITESPACE_STR = ' \t\n\r'
-
-def JSONObject((s, end), encoding, strict, scan_once, object_hook, _w=WHITESPACE.match, _ws=WHITESPACE_STR):
- pairs = {}
- # Use a slice to prevent IndexError from being raised, the following
- # check will raise a more specific ValueError if the string is empty
- nextchar = s[end:end + 1]
- # Normally we expect nextchar == '"'
- if nextchar != '"':
- if nextchar in _ws:
- end = _w(s, end).end()
- nextchar = s[end:end + 1]
- # Trivial empty object
- if nextchar == '}':
- return pairs, end + 1
- elif nextchar != '"':
- raise ValueError(errmsg("Expecting property name", s, end))
- end += 1
- while True:
- key, end = scanstring(s, end, encoding, strict)
-
- # To skip some function call overhead we optimize the fast paths where
- # the JSON key separator is ": " or just ":".
- if s[end:end + 1] != ':':
- end = _w(s, end).end()
- if s[end:end + 1] != ':':
- raise ValueError(errmsg("Expecting : delimiter", s, end))
-
- end += 1
-
- try:
- if s[end] in _ws:
- end += 1
- if s[end] in _ws:
- end = _w(s, end + 1).end()
- except IndexError:
- pass
-
- try:
- value, end = scan_once(s, end)
- except StopIteration:
- raise ValueError(errmsg("Expecting object", s, end))
- pairs[key] = value
-
- try:
- nextchar = s[end]
- if nextchar in _ws:
- end = _w(s, end + 1).end()
- nextchar = s[end]
- except IndexError:
- nextchar = ''
- end += 1
-
- if nextchar == '}':
- break
- elif nextchar != ',':
- raise ValueError(errmsg("Expecting , delimiter", s, end - 1))
-
- try:
- nextchar = s[end]
- if nextchar in _ws:
- end += 1
- nextchar = s[end]
- if nextchar in _ws:
- end = _w(s, end + 1).end()
- nextchar = s[end]
- except IndexError:
- nextchar = ''
-
- end += 1
- if nextchar != '"':
- raise ValueError(errmsg("Expecting property name", s, end - 1))
-
- if object_hook is not None:
- pairs = object_hook(pairs)
- return pairs, end
-
-def JSONArray((s, end), scan_once, _w=WHITESPACE.match, _ws=WHITESPACE_STR):
- values = []
- nextchar = s[end:end + 1]
- if nextchar in _ws:
- end = _w(s, end + 1).end()
- nextchar = s[end:end + 1]
- # Look-ahead for trivial empty array
- if nextchar == ']':
- return values, end + 1
- _append = values.append
- while True:
- try:
- value, end = scan_once(s, end)
- except StopIteration:
- raise ValueError(errmsg("Expecting object", s, end))
- _append(value)
- nextchar = s[end:end + 1]
- if nextchar in _ws:
- end = _w(s, end + 1).end()
- nextchar = s[end:end + 1]
- end += 1
- if nextchar == ']':
- break
- elif nextchar != ',':
- raise ValueError(errmsg("Expecting , delimiter", s, end))
-
- try:
- if s[end] in _ws:
- end += 1
- if s[end] in _ws:
- end = _w(s, end + 1).end()
- except IndexError:
- pass
-
- return values, end
-
-class JSONDecoder(object):
- """Simple JSON <http://json.org> decoder
-
- Performs the following translations in decoding by default:
-
- +---------------+-------------------+
- | JSON | Python |
- +===============+===================+
- | object | dict |
- +---------------+-------------------+
- | array | list |
- +---------------+-------------------+
- | string | unicode |
- +---------------+-------------------+
- | number (int) | int, long |
- +---------------+-------------------+
- | number (real) | float |
- +---------------+-------------------+
- | true | True |
- +---------------+-------------------+
- | false | False |
- +---------------+-------------------+
- | null | None |
- +---------------+-------------------+
-
- It also understands ``NaN``, ``Infinity``, and ``-Infinity`` as
- their corresponding ``float`` values, which is outside the JSON spec.
-
- """
-
- def __init__(self, encoding=None, object_hook=None, parse_float=None,
- parse_int=None, parse_constant=None, strict=True):
- """``encoding`` determines the encoding used to interpret any ``str``
- objects decoded by this instance (utf-8 by default). It has no
- effect when decoding ``unicode`` objects.
-
- Note that currently only encodings that are a superset of ASCII work,
- strings of other encodings should be passed in as ``unicode``.
-
- ``object_hook``, if specified, will be called with the result
- of every JSON object decoded and its return value will be used in
- place of the given ``dict``. This can be used to provide custom
- deserializations (e.g. to support JSON-RPC class hinting).
-
- ``parse_float``, if specified, will be called with the string
- of every JSON float to be decoded. By default this is equivalent to
- float(num_str). This can be used to use another datatype or parser
- for JSON floats (e.g. decimal.Decimal).
-
- ``parse_int``, if specified, will be called with the string
- of every JSON int to be decoded. By default this is equivalent to
- int(num_str). This can be used to use another datatype or parser
- for JSON integers (e.g. float).
-
- ``parse_constant``, if specified, will be called with one of the
- following strings: -Infinity, Infinity, NaN.
- This can be used to raise an exception if invalid JSON numbers
- are encountered.
-
- """
- self.encoding = encoding
- self.object_hook = object_hook
- self.parse_float = parse_float or float
- self.parse_int = parse_int or int
- self.parse_constant = parse_constant or _CONSTANTS.__getitem__
- self.strict = strict
- self.parse_object = JSONObject
- self.parse_array = JSONArray
- self.parse_string = scanstring
- self.scan_once = make_scanner(self)
-
- def decode(self, s, _w=WHITESPACE.match):
- """Return the Python representation of ``s`` (a ``str`` or ``unicode``
- instance containing a JSON document)
-
- """
- obj, end = self.raw_decode(s, idx=_w(s, 0).end())
- end = _w(s, end).end()
- if end != len(s):
- raise ValueError(errmsg("Extra data", s, end, len(s)))
- return obj
-
- def raw_decode(self, s, idx=0):
- """Decode a JSON document from ``s`` (a ``str`` or ``unicode`` beginning
- with a JSON document) and return a 2-tuple of the Python
- representation and the index in ``s`` where the document ended.
-
- This can be used to decode a JSON document from a string that may
- have extraneous data at the end.
-
- """
- try:
- obj, end = self.scan_once(s, idx)
- except StopIteration:
- raise ValueError("No JSON object could be decoded")
- return obj, end
diff --git a/lang/py/lib/simplejson/encoder.py b/lang/py/lib/simplejson/encoder.py
deleted file mode 100644
index cf58290..0000000
--- a/lang/py/lib/simplejson/encoder.py
+++ /dev/null
@@ -1,440 +0,0 @@
-"""Implementation of JSONEncoder
-"""
-import re
-
-try:
- from simplejson._speedups import encode_basestring_ascii as c_encode_basestring_ascii
-except ImportError:
- c_encode_basestring_ascii = None
-try:
- from simplejson._speedups import make_encoder as c_make_encoder
-except ImportError:
- c_make_encoder = None
-
-ESCAPE = re.compile(r'[\x00-\x1f\\"\b\f\n\r\t]')
-ESCAPE_ASCII = re.compile(r'([\\"]|[^\ -~])')
-HAS_UTF8 = re.compile(r'[\x80-\xff]')
-ESCAPE_DCT = {
- '\\': '\\\\',
- '"': '\\"',
- '\b': '\\b',
- '\f': '\\f',
- '\n': '\\n',
- '\r': '\\r',
- '\t': '\\t',
-}
-for i in range(0x20):
- #ESCAPE_DCT.setdefault(chr(i), '\\u{0:04x}'.format(i))
- ESCAPE_DCT.setdefault(chr(i), '\\u%04x' % (i,))
-
-# Assume this produces an infinity on all machines (probably not guaranteed)
-INFINITY = float('1e66666')
-FLOAT_REPR = repr
-
-def encode_basestring(s):
- """Return a JSON representation of a Python string
-
- """
- def replace(match):
- return ESCAPE_DCT[match.group(0)]
- return '"' + ESCAPE.sub(replace, s) + '"'
-
-
-def py_encode_basestring_ascii(s):
- """Return an ASCII-only JSON representation of a Python string
-
- """
- if isinstance(s, str) and HAS_UTF8.search(s) is not None:
- s = s.decode('utf-8')
- def replace(match):
- s = match.group(0)
- try:
- return ESCAPE_DCT[s]
- except KeyError:
- n = ord(s)
- if n < 0x10000:
- #return '\\u{0:04x}'.format(n)
- return '\\u%04x' % (n,)
- else:
- # surrogate pair
- n -= 0x10000
- s1 = 0xd800 | ((n >> 10) & 0x3ff)
- s2 = 0xdc00 | (n & 0x3ff)
- #return '\\u{0:04x}\\u{1:04x}'.format(s1, s2)
- return '\\u%04x\\u%04x' % (s1, s2)
- return '"' + str(ESCAPE_ASCII.sub(replace, s)) + '"'
-
-
-encode_basestring_ascii = c_encode_basestring_ascii or py_encode_basestring_ascii
-
-class JSONEncoder(object):
- """Extensible JSON <http://json.org> encoder for Python data structures.
-
- Supports the following objects and types by default:
-
- +-------------------+---------------+
- | Python | JSON |
- +===================+===============+
- | dict | object |
- +-------------------+---------------+
- | list, tuple | array |
- +-------------------+---------------+
- | str, unicode | string |
- +-------------------+---------------+
- | int, long, float | number |
- +-------------------+---------------+
- | True | true |
- +-------------------+---------------+
- | False | false |
- +-------------------+---------------+
- | None | null |
- +-------------------+---------------+
-
- To extend this to recognize other objects, subclass and implement a
- ``.default()`` method with another method that returns a serializable
- object for ``o`` if possible, otherwise it should call the superclass
- implementation (to raise ``TypeError``).
-
- """
- item_separator = ', '
- key_separator = ': '
- def __init__(self, skipkeys=False, ensure_ascii=True,
- check_circular=True, allow_nan=True, sort_keys=False,
- indent=None, separators=None, encoding='utf-8', default=None):
- """Constructor for JSONEncoder, with sensible defaults.
-
- If skipkeys is false, then it is a TypeError to attempt
- encoding of keys that are not str, int, long, float or None. If
- skipkeys is True, such items are simply skipped.
-
- If ensure_ascii is true, the output is guaranteed to be str
- objects with all incoming unicode characters escaped. If
- ensure_ascii is false, the output will be unicode object.
-
- If check_circular is true, then lists, dicts, and custom encoded
- objects will be checked for circular references during encoding to
- prevent an infinite recursion (which would cause an OverflowError).
- Otherwise, no such check takes place.
-
- If allow_nan is true, then NaN, Infinity, and -Infinity will be
- encoded as such. This behavior is not JSON specification compliant,
- but is consistent with most JavaScript based encoders and decoders.
- Otherwise, it will be a ValueError to encode such floats.
-
- If sort_keys is true, then the output of dictionaries will be
- sorted by key; this is useful for regression tests to ensure
- that JSON serializations can be compared on a day-to-day basis.
-
- If indent is a non-negative integer, then JSON array
- elements and object members will be pretty-printed with that
- indent level. An indent level of 0 will only insert newlines.
- None is the most compact representation.
-
- If specified, separators should be a (item_separator, key_separator)
- tuple. The default is (', ', ': '). To get the most compact JSON
- representation you should specify (',', ':') to eliminate whitespace.
-
- If specified, default is a function that gets called for objects
- that can't otherwise be serialized. It should return a JSON encodable
- version of the object or raise a ``TypeError``.
-
- If encoding is not None, then all input strings will be
- transformed into unicode using that encoding prior to JSON-encoding.
- The default is UTF-8.
-
- """
-
- self.skipkeys = skipkeys
- self.ensure_ascii = ensure_ascii
- self.check_circular = check_circular
- self.allow_nan = allow_nan
- self.sort_keys = sort_keys
- self.indent = indent
- if separators is not None:
- self.item_separator, self.key_separator = separators
- if default is not None:
- self.default = default
- self.encoding = encoding
-
- def default(self, o):
- """Implement this method in a subclass such that it returns
- a serializable object for ``o``, or calls the base implementation
- (to raise a ``TypeError``).
-
- For example, to support arbitrary iterators, you could
- implement default like this::
-
- def default(self, o):
- try:
- iterable = iter(o)
- except TypeError:
- pass
- else:
- return list(iterable)
- return JSONEncoder.default(self, o)
-
- """
- raise TypeError(repr(o) + " is not JSON serializable")
-
- def encode(self, o):
- """Return a JSON string representation of a Python data structure.
-
- >>> JSONEncoder().encode({"foo": ["bar", "baz"]})
- '{"foo": ["bar", "baz"]}'
-
- """
- # This is for extremely simple cases and benchmarks.
- if isinstance(o, basestring):
- if isinstance(o, str):
- _encoding = self.encoding
- if (_encoding is not None
- and not (_encoding == 'utf-8')):
- o = o.decode(_encoding)
- if self.ensure_ascii:
- return encode_basestring_ascii(o)
- else:
- return encode_basestring(o)
- # This doesn't pass the iterator directly to ''.join() because the
- # exceptions aren't as detailed. The list call should be roughly
- # equivalent to the PySequence_Fast that ''.join() would do.
- chunks = self.iterencode(o, _one_shot=True)
- if not isinstance(chunks, (list, tuple)):
- chunks = list(chunks)
- return ''.join(chunks)
-
- def iterencode(self, o, _one_shot=False):
- """Encode the given object and yield each string
- representation as available.
-
- For example::
-
- for chunk in JSONEncoder().iterencode(bigobject):
- mysocket.write(chunk)
-
- """
- if self.check_circular:
- markers = {}
- else:
- markers = None
- if self.ensure_ascii:
- _encoder = encode_basestring_ascii
- else:
- _encoder = encode_basestring
- if self.encoding != 'utf-8':
- def _encoder(o, _orig_encoder=_encoder, _encoding=self.encoding):
- if isinstance(o, str):
- o = o.decode(_encoding)
- return _orig_encoder(o)
-
- def floatstr(o, allow_nan=self.allow_nan, _repr=FLOAT_REPR, _inf=INFINITY, _neginf=-INFINITY):
- # Check for specials. Note that this type of test is processor- and/or
- # platform-specific, so do tests which don't depend on the internals.
-
- if o != o:
- text = 'NaN'
- elif o == _inf:
- text = 'Infinity'
- elif o == _neginf:
- text = '-Infinity'
- else:
- return _repr(o)
-
- if not allow_nan:
- raise ValueError(
- "Out of range float values are not JSON compliant: " +
- repr(o))
-
- return text
-
-
- if _one_shot and c_make_encoder is not None and not self.indent and not self.sort_keys:
- _iterencode = c_make_encoder(
- markers, self.default, _encoder, self.indent,
- self.key_separator, self.item_separator, self.sort_keys,
- self.skipkeys, self.allow_nan)
- else:
- _iterencode = _make_iterencode(
- markers, self.default, _encoder, self.indent, floatstr,
- self.key_separator, self.item_separator, self.sort_keys,
- self.skipkeys, _one_shot)
- return _iterencode(o, 0)
-
-def _make_iterencode(markers, _default, _encoder, _indent, _floatstr, _key_separator, _item_separator, _sort_keys, _skipkeys, _one_shot,
- ## HACK: hand-optimized bytecode; turn globals into locals
- False=False,
- True=True,
- ValueError=ValueError,
- basestring=basestring,
- dict=dict,
- float=float,
- id=id,
- int=int,
- isinstance=isinstance,
- list=list,
- long=long,
- str=str,
- tuple=tuple,
- ):
-
- def _iterencode_list(lst, _current_indent_level):
- if not lst:
- yield '[]'
- return
- if markers is not None:
- markerid = id(lst)
- if markerid in markers:
- raise ValueError("Circular reference detected")
- markers[markerid] = lst
- buf = '['
- if _indent is not None:
- _current_indent_level += 1
- newline_indent = '\n' + (' ' * (_indent * _current_indent_level))
- separator = _item_separator + newline_indent
- buf += newline_indent
- else:
- newline_indent = None
- separator = _item_separator
- first = True
- for value in lst:
- if first:
- first = False
- else:
- buf = separator
- if isinstance(value, basestring):
- yield buf + _encoder(value)
- elif value is None:
- yield buf + 'null'
- elif value is True:
- yield buf + 'true'
- elif value is False:
- yield buf + 'false'
- elif isinstance(value, (int, long)):
- yield buf + str(value)
- elif isinstance(value, float):
- yield buf + _floatstr(value)
- else:
- yield buf
- if isinstance(value, (list, tuple)):
- chunks = _iterencode_list(value, _current_indent_level)
- elif isinstance(value, dict):
- chunks = _iterencode_dict(value, _current_indent_level)
- else:
- chunks = _iterencode(value, _current_indent_level)
- for chunk in chunks:
- yield chunk
- if newline_indent is not None:
- _current_indent_level -= 1
- yield '\n' + (' ' * (_indent * _current_indent_level))
- yield ']'
- if markers is not None:
- del markers[markerid]
-
- def _iterencode_dict(dct, _current_indent_level):
- if not dct:
- yield '{}'
- return
- if markers is not None:
- markerid = id(dct)
- if markerid in markers:
- raise ValueError("Circular reference detected")
- markers[markerid] = dct
- yield '{'
- if _indent is not None:
- _current_indent_level += 1
- newline_indent = '\n' + (' ' * (_indent * _current_indent_level))
- item_separator = _item_separator + newline_indent
- yield newline_indent
- else:
- newline_indent = None
- item_separator = _item_separator
- first = True
- if _sort_keys:
- items = dct.items()
- items.sort(key=lambda kv: kv[0])
- else:
- items = dct.iteritems()
- for key, value in items:
- if isinstance(key, basestring):
- pass
- # JavaScript is weakly typed for these, so it makes sense to
- # also allow them. Many encoders seem to do something like this.
- elif isinstance(key, float):
- key = _floatstr(key)
- elif key is True:
- key = 'true'
- elif key is False:
- key = 'false'
- elif key is None:
- key = 'null'
- elif isinstance(key, (int, long)):
- key = str(key)
- elif _skipkeys:
- continue
- else:
- raise TypeError("key " + repr(key) + " is not a string")
- if first:
- first = False
- else:
- yield item_separator
- yield _encoder(key)
- yield _key_separator
- if isinstance(value, basestring):
- yield _encoder(value)
- elif value is None:
- yield 'null'
- elif value is True:
- yield 'true'
- elif value is False:
- yield 'false'
- elif isinstance(value, (int, long)):
- yield str(value)
- elif isinstance(value, float):
- yield _floatstr(value)
- else:
- if isinstance(value, (list, tuple)):
- chunks = _iterencode_list(value, _current_indent_level)
- elif isinstance(value, dict):
- chunks = _iterencode_dict(value, _current_indent_level)
- else:
- chunks = _iterencode(value, _current_indent_level)
- for chunk in chunks:
- yield chunk
- if newline_indent is not None:
- _current_indent_level -= 1
- yield '\n' + (' ' * (_indent * _current_indent_level))
- yield '}'
- if markers is not None:
- del markers[markerid]
-
- def _iterencode(o, _current_indent_level):
- if isinstance(o, basestring):
- yield _encoder(o)
- elif o is None:
- yield 'null'
- elif o is True:
- yield 'true'
- elif o is False:
- yield 'false'
- elif isinstance(o, (int, long)):
- yield str(o)
- elif isinstance(o, float):
- yield _floatstr(o)
- elif isinstance(o, (list, tuple)):
- for chunk in _iterencode_list(o, _current_indent_level):
- yield chunk
- elif isinstance(o, dict):
- for chunk in _iterencode_dict(o, _current_indent_level):
- yield chunk
- else:
- if markers is not None:
- markerid = id(o)
- if markerid in markers:
- raise ValueError("Circular reference detected")
- markers[markerid] = o
- o = _default(o)
- for chunk in _iterencode(o, _current_indent_level):
- yield chunk
- if markers is not None:
- del markers[markerid]
-
- return _iterencode
diff --git a/lang/py/lib/simplejson/scanner.py b/lang/py/lib/simplejson/scanner.py
deleted file mode 100644
index adbc6ec..0000000
--- a/lang/py/lib/simplejson/scanner.py
+++ /dev/null
@@ -1,65 +0,0 @@
-"""JSON token scanner
-"""
-import re
-try:
- from simplejson._speedups import make_scanner as c_make_scanner
-except ImportError:
- c_make_scanner = None
-
-__all__ = ['make_scanner']
-
-NUMBER_RE = re.compile(
- r'(-?(?:0|[1-9]\d*))(\.\d+)?([eE][-+]?\d+)?',
- (re.VERBOSE | re.MULTILINE | re.DOTALL))
-
-def py_make_scanner(context):
- parse_object = context.parse_object
- parse_array = context.parse_array
- parse_string = context.parse_string
- match_number = NUMBER_RE.match
- encoding = context.encoding
- strict = context.strict
- parse_float = context.parse_float
- parse_int = context.parse_int
- parse_constant = context.parse_constant
- object_hook = context.object_hook
-
- def _scan_once(string, idx):
- try:
- nextchar = string[idx]
- except IndexError:
- raise StopIteration
-
- if nextchar == '"':
- return parse_string(string, idx + 1, encoding, strict)
- elif nextchar == '{':
- return parse_object((string, idx + 1), encoding, strict, _scan_once, object_hook)
- elif nextchar == '[':
- return parse_array((string, idx + 1), _scan_once)
- elif nextchar == 'n' and string[idx:idx + 4] == 'null':
- return None, idx + 4
- elif nextchar == 't' and string[idx:idx + 4] == 'true':
- return True, idx + 4
- elif nextchar == 'f' and string[idx:idx + 5] == 'false':
- return False, idx + 5
-
- m = match_number(string, idx)
- if m is not None:
- integer, frac, exp = m.groups()
- if frac or exp:
- res = parse_float(integer + (frac or '') + (exp or ''))
- else:
- res = parse_int(integer)
- return res, m.end()
- elif nextchar == 'N' and string[idx:idx + 3] == 'NaN':
- return parse_constant('NaN'), idx + 3
- elif nextchar == 'I' and string[idx:idx + 8] == 'Infinity':
- return parse_constant('Infinity'), idx + 8
- elif nextchar == '-' and string[idx:idx + 9] == '-Infinity':
- return parse_constant('-Infinity'), idx + 9
- else:
- raise StopIteration
-
- return _scan_once
-
-make_scanner = c_make_scanner or py_make_scanner
diff --git a/lang/py/lib/simplejson/tool.py b/lang/py/lib/simplejson/tool.py
deleted file mode 100644
index 9044331..0000000
--- a/lang/py/lib/simplejson/tool.py
+++ /dev/null
@@ -1,37 +0,0 @@
-r"""Command-line tool to validate and pretty-print JSON
-
-Usage::
-
- $ echo '{"json":"obj"}' | python -m simplejson.tool
- {
- "json": "obj"
- }
- $ echo '{ 1.2:3.4}' | python -m simplejson.tool
- Expecting property name: line 1 column 2 (char 2)
-
-"""
-import sys
-import simplejson
-
-def main():
- if len(sys.argv) == 1:
- infile = sys.stdin
- outfile = sys.stdout
- elif len(sys.argv) == 2:
- infile = open(sys.argv[1], 'rb')
- outfile = sys.stdout
- elif len(sys.argv) == 3:
- infile = open(sys.argv[1], 'rb')
- outfile = open(sys.argv[2], 'wb')
- else:
- raise SystemExit(sys.argv[0] + " [infile [outfile]]")
- try:
- obj = simplejson.load(infile)
- except ValueError, e:
- raise SystemExit(e)
- simplejson.dump(obj, outfile, sort_keys=True, indent=4)
- outfile.write('\n')
-
-
-if __name__ == '__main__':
- main()
diff --git a/lang/py/src/avro/schema.py b/lang/py/src/avro/schema.py
index 86ce86a..f946d0a 100644
--- a/lang/py/src/avro/schema.py
+++ b/lang/py/src/avro/schema.py
@@ -385,13 +385,13 @@ class Field(object):
#
class PrimitiveSchema(Schema):
"""Valid primitive types are in PRIMITIVE_TYPES."""
- def __init__(self, type):
+ def __init__(self, type, other_props=None):
# Ensure valid ctor args
if type not in PRIMITIVE_TYPES:
raise AvroException("%s is not a valid primitive type." % type)
# Call parent ctor
- Schema.__init__(self, type)
+ Schema.__init__(self, type, other_props=other_props)
self.fullname = type
@@ -723,7 +723,7 @@ def make_avsc_object(json_data, names=None):
type = json_data.get('type')
other_props = get_other_props(json_data, SCHEMA_RESERVED_PROPS)
if type in PRIMITIVE_TYPES:
- return PrimitiveSchema(type)
+ return PrimitiveSchema(type, other_props)
elif type in NAMED_TYPES:
name = json_data.get('name')
namespace = json_data.get('namespace', names.default_namespace)
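
The schema.py change above makes PrimitiveSchema pass unreserved attributes through
to the Schema base class, so custom properties on primitive types are no longer
dropped. A minimal sketch of the effect (assuming the Python 2 avro API, where
parsed schemas expose unreserved attributes as other_props):

    from avro import schema

    # A primitive type carrying a custom property, as exercised by the
    # new OTHER_PROP_EXAMPLES entry in test_schema.py below.
    s = schema.parse('{"type": "long", "date": "true"}')
    print s.type         # 'long'
    print s.other_props  # {'date': 'true'} -- dropped before this patch
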
diff --git a/lang/py/src/avro/tether/__init__.py b/lang/py/src/avro/tether/__init__.py
new file mode 100644
index 0000000..458c692
--- /dev/null
+++ b/lang/py/src/avro/tether/__init__.py
@@ -0,0 +1,7 @@
+from .util import *
+from .tether_task import *
+from .tether_task_runner import *
+
+__all__=util.__all__
+__all__+=tether_task.__all__
+__all__+=tether_task_runner.__all__
diff --git a/lang/py/src/avro/tether/tether_task.py b/lang/py/src/avro/tether/tether_task.py
new file mode 100644
index 0000000..90a8788
--- /dev/null
+++ b/lang/py/src/avro/tether/tether_task.py
@@ -0,0 +1,498 @@
+"""
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+"""
+
+__all__=["TetherTask","TaskType","inputProtocol","outputProtocol","HTTPRequestor"]
+
+from avro import schema, protocol
+from avro import io as avio
+from avro import ipc
+
+import io as pyio
+import sys
+import os
+import traceback
+import logging
+import collections
+from StringIO import StringIO
+import threading
+
+
+# create protocol objects for the input and output protocols
+# The build process should copy InputProtocol.avpr and OutputProtocol.avpr
+# into the same directory as this module
+inputProtocol=None
+outputProtocol=None
+
+TaskType=None
+if (inputProtocol is None):
+ pfile=os.path.split(__file__)[0]+os.sep+"InputProtocol.avpr"
+
+ if not(os.path.exists(pfile)):
+ raise Exception("Could not locate the InputProtocol: {0} does not exist".format(pfile))
+
+ with file(pfile,'r') as hf:
+ prototxt=hf.read()
+
+ inputProtocol=protocol.parse(prototxt)
+
+ # use a named tuple to represent the tasktype enumeration
+ taskschema=inputProtocol.types_dict["TaskType"]
+ _ttype=collections.namedtuple("_tasktype",taskschema.symbols)
+ TaskType=_ttype(*taskschema.symbols)
+
+if (outputProtocol is None):
+ pfile=os.path.split(__file__)[0]+os.sep+"OutputProtocol.avpr"
+
+ if not(os.path.exists(pfile)):
+ raise Exception("Could not locate the OutputProtocol: {0} does not exist".format(pfile))
+
+ with file(pfile,'r') as hf:
+ prototxt=hf.read()
+
+ outputProtocol=protocol.parse(prototxt)
+
+class Collector(object):
+ """
+ Collector for map and reduce output values
+ """
+ def __init__(self,scheme=None,outputClient=None):
+ """
+
+ Parameters
+ ---------------------------------------------
+ scheme - The scheme for the datums to output - can be a json string
+ - or an instance of Schema
+ outputClient - The output client used to send messages to the parent
+ """
+
+ if not(isinstance(scheme,schema.Schema)):
+ scheme=schema.parse(scheme)
+
+ if (outputClient is None):
+ raise ValueError("output client can't be none.")
+
+ self.scheme=scheme
+ self.buff=StringIO()
+ self.encoder=avio.BinaryEncoder(self.buff)
+
+ self.datum_writer = avio.DatumWriter(writers_schema=self.scheme)
+ self.outputClient=outputClient
+
+ def collect(self,record,partition=None):
+ """Collect a map or reduce output value
+
+ Parameters
+ ------------------------------------------------------
+ record - The record to write
+ partition - Indicates the partition for a pre-partitioned map output
+ - currently not supported
+ """
+
+ self.buff.truncate(0)
+    self.datum_writer.write(record, self.encoder)
+    self.buff.flush()
+ self.buff.seek(0)
+
+ # delete all the data in the buffer
+ if (partition is None):
+
+ # TODO: Is there a more efficient way to read the data in self.buff?
+ # we could use self.buff.read() but that returns the byte array as a string
+ # will that work? We can also use self.buff.readinto to read it into
+ # a bytearray but the byte array must be pre-allocated
+ # self.outputClient.output(self.buff.buffer.read())
+
+      # it's not a StringIO
+ self.outputClient.request("output",{"datum":self.buff.read()})
+ else:
+ self.outputClient.request("outputPartitioned",{"datum":self.buff.read(),"partition":partition})
+
+
+
+def keys_are_equal(rec1,rec2,fkeys):
+ """Check if the "keys" in two records are equal. The key fields
+ are all fields for which order isn't marked ignore.
+
+ Parameters
+ -------------------------------------------------------------------------
+ rec1 - The first record
+ rec2 - The second record
+ fkeys - A list of the fields to compare
+ """
+
+ for f in fkeys:
+ if not(rec1[f]==rec2[f]):
+ return False
+
+ return True
+
+
+class HTTPRequestor(object):
+ """
+  This is a small requestor subclass I created for the HTTP protocol.
+  Since the HTTP protocol isn't persistent, we need to instantiate
+  a new transceiver and a new requestor for each request.
+  But I wanted use of the requestor to be identical to that for
+  SocketTransceiver so that we can seamlessly switch between the two.
+ """
+
+ def __init__(self, server,port,protocol):
+ """
+ Instantiate the class.
+
+ Parameters
+ ----------------------------------------------------------------------
+ server - The server hostname
+ port - Which port to use
+ protocol - The protocol for the communication
+ """
+
+ self.server=server
+ self.port=port
+ self.protocol=protocol
+
+ def request(self,*args,**param):
+    transceiver=ipc.HTTPTransceiver(self.server,self.port)
+    requestor=ipc.Requestor(self.protocol, transceiver)
+ return requestor.request(*args,**param)
+
+
+class TetherTask(object):
+ """
+ Base class for python tether mapreduce programs.
+
+ ToDo: Currently the subclass has to implement both reduce and reduceFlush.
+ This is not very pythonic. A pythonic way to implement the reducer
+ would be to pass the reducer a generator (as dumbo does) so that the user
+ could iterate over the records for the given key.
+  How would we do this? We would need two threads: one would run the
+  user's reduce function, suspended whenever no reducer records were available.
+  The other thread would read in the records for the reducer, buffering only
+  so many records at a time (i.e. if the buffer is full, self.input shouldn't
+  return right away but wait for space to free up).
+ """
+
+ def __init__(self,inschema=None,midschema=None,outschema=None):
+ """
+
+ Parameters
+ ---------------------------------------------------------
+    inschema - The schema for the input to the mapper
+    midschema - The schema for the output of the mapper
+    outschema - The schema for the output of the reducer
+
+    An example schema for the prototypical word count job would be
+    inschema='{"type":"record", "name":"Pair","namespace":"org.apache.avro.mapred","fields":[
+ {"name":"key","type":"string"},
+ {"name":"value","type":"long","order":"ignore"}]
+ }'
+
+    Important: The records are split into (key,value) pairs as required by map reduce:
+    the fields with "order" set to "ignore" form the value, and the remaining fields form the key.
+
+ The subclass provides these schemas in order to tell this class which schemas it expects.
+ The configure request will also provide the schemas that the parent process is using.
+ This allows us to check whether the schemas match and if not whether we can resolve
+    the differences (see http://avro.apache.org/docs/current/spec.html#Schema+Resolution).
+
+ """
+
+
+ if (inschema is None):
+ raise ValueError("inschema can't be None")
+
+ if (midschema is None):
+ raise ValueError("midschema can't be None")
+
+ if (outschema is None):
+ raise ValueError("outschema can't be None")
+
+ # make sure we can parse the schemas
+ # Should we call fail if we can't parse the schemas?
+ self.inschema=schema.parse(inschema)
+ self.midschema=schema.parse(midschema)
+ self.outschema=schema.parse(outschema)
+
+
+ # declare various variables
+    self.clientTransceiver=None
+
+ # output client is used to communicate with the parent process
+ # in particular to transmit the outputs of the mapper and reducer
+ self.outputClient = None
+
+ # collectors for the output of the mapper and reducer
+ self.midCollector=None
+ self.outCollector=None
+
+ self._partitions=None
+
+ # cache a list of the fields used by the reducer as the keys
+ # we need the fields to decide when we have finished processing all values for
+ # a given key. We cache the fields to be more efficient
+ self._red_fkeys=None
+
+ # We need to keep track of the previous record fed to the reducer
+    # because we need to be able to determine when we start processing a new group
+ # in the reducer
+ self.midRecord=None
+
+ # create an event object to signal when
+ # http server is ready to be shutdown
+ self.ready_for_shutdown=threading.Event()
+ self.log=logging.getLogger("TetherTask")
+
+ def open(self, inputport,clientPort=None):
+    """Open the output client - i.e. the connection to the parent process
+
+ Parameters
+ ---------------------------------------------------------------
+    inputport - This is the port that the subprocess is listening on, i.e. the
+ subprocess starts a server listening on this port to accept requests from
+ the parent process
+ clientPort - The port on which the server in the parent process is listening
+ - If this is None we look for the environment variable AVRO_TETHER_OUTPUT_PORT
+ - This is mainly provided for debugging purposes. In practice
+ we want to use the environment variable
+
+ """
+
+
+ # Open the connection to the parent process
+ # The port the parent process is listening on is set in the environment
+ # variable AVRO_TETHER_OUTPUT_PORT
+ # open output client, connecting to parent
+
+ if (clientPort is None):
+ clientPortString = os.getenv("AVRO_TETHER_OUTPUT_PORT")
+ if (clientPortString is None):
+ raise Exception("AVRO_TETHER_OUTPUT_PORT env var is not set")
+
+ clientPort = int(clientPortString)
+
+ self.log.info("TetherTask.open: Opening connection to parent server on port={0}".format(clientPort))
+
+ # We use the HTTP protocol although we hope to shortly have
+    # support for SocketServer.
+ usehttp=True
+
+ if(usehttp):
+ # self.outputClient = ipc.Requestor(outputProtocol, self.clientTransceiver)
+      # since HTTP is stateless, a new transceiver
+      # is created and closed for each request. We therefore set clientTransceiver to None.
+      # We still declare clientTransceiver because for other (stateful) protocols we will need
+      # it, and we want to check, when we get the fail message, whether the transceiver
+      # needs to be closed.
+      # self.clientTransceiver=None
+ self.outputClient = HTTPRequestor("127.0.0.1",clientPort,outputProtocol)
+
+ else:
+ raise NotImplementedError("Only http protocol is currently supported")
+
+ try:
+ self.outputClient.request('configure',{"port":inputport})
+ except Exception as e:
+ estr= traceback.format_exc()
+ self.fail(estr)
+
+
+ def configure(self,taskType, inSchemaText, outSchemaText):
+ """
+
+ Parameters
+ -------------------------------------------------------------------
+ taskType - What type of task (e.g map, reduce)
+ - This is an enumeration which is specified in the input protocol
+ inSchemaText - string containing the input schema
+ - This is the actual schema with which the data was encoded
+ i.e it is the writer_schema (see http://avro.apache.org/docs/current/spec.html#Schema+Resolution)
+ This is the schema the parent process is using which might be different
+ from the one provided by the subclass of tether_task
+
+ outSchemaText - string containing the output scheme
+ - This is the schema expected by the parent process for the output
+ """
+ self.taskType = taskType
+
+ try:
+ inSchema = schema.parse(inSchemaText)
+ outSchema = schema.parse(outSchemaText)
+
+ if (taskType==TaskType.MAP):
+ self.inReader=avio.DatumReader(writers_schema=inSchema,readers_schema=self.inschema)
+ self.midCollector=Collector(outSchemaText,self.outputClient)
+
+ elif(taskType==TaskType.REDUCE):
+ self.midReader=avio.DatumReader(writers_schema=inSchema,readers_schema=self.midschema)
+ # this.outCollector = new Collector<OUT>(outSchema);
+ self.outCollector=Collector(outSchemaText,self.outputClient)
+
+        # determine which fields in the input record are the keys for the reducer
+ self._red_fkeys=[f.name for f in self.midschema.fields if not(f.order=='ignore')]
+
+ except Exception as e:
+
+ estr= traceback.format_exc()
+ self.fail(estr)
+
+ def set_partitions(self,npartitions):
+
+ try:
+ self._partitions=npartitions
+ except Exception as e:
+ estr= traceback.format_exc()
+ self.fail(estr)
+
+  def get_partitions(self):
+ """ Return the number of map output partitions of this job."""
+ return self._partitions
+
+ def input(self,data,count):
+    """ Receive input from the server
+
+    Parameters
+    ------------------------------------------------------
+    data - Should contain the bytes encoding the serialized data
+         - I think this gets represented as a string
+    count - how many input records are provided in the binary stream
+ """
+ try:
+      # wrap the data in a StringIO so it can be fed to avio.BinaryDecoder
+ bdata=StringIO(data)
+ decoder = avio.BinaryDecoder(bdata)
+
+ for i in range(count):
+ if (self.taskType==TaskType.MAP):
+ inRecord = self.inReader.read(decoder)
+
+          # Do we need to pass midCollector if it's declared as an instance variable?
+ self.map(inRecord, self.midCollector)
+
+ elif (self.taskType==TaskType.REDUCE):
+
+ # store the previous record
+ prev = self.midRecord
+
+ # read the new record
+          self.midRecord = self.midReader.read(decoder)
+ if (prev != None and not(keys_are_equal(self.midRecord,prev,self._red_fkeys))):
+ # since the key has changed we need to finalize the processing
+ # for this group of key,value pairs
+ self.reduceFlush(prev, self.outCollector)
+ self.reduce(self.midRecord, self.outCollector)
+
+ except Exception as e:
+ estr= traceback.format_exc()
+ self.log.warning("failing: "+estr)
+ self.fail(estr)
+
+ def complete(self):
+ """
+ Process the complete request
+ """
+ if ((self.taskType == TaskType.REDUCE ) and not(self.midRecord is None)):
+ try:
+        self.reduceFlush(self.midRecord, self.outCollector)
+ except Exception as e:
+ estr=traceback.format_exc()
+        self.log.warning("failing: "+estr)
+ self.fail(estr)
+
+ self.outputClient.request("complete",dict())
+
+ def map(self,record,collector):
+    """Called with input values to generate intermediate values (i.e. mapper output).
+
+ Parameters
+ ----------------------------------------------------------------------------
+ record - The input record
+ collector - The collector to collect the output
+
+ This is an abstract function which should be overloaded by the application specific
+ subclass.
+ """
+
+ raise NotImplementedError("This is an abstract method which should be overloaded in the subclass")
+
+ def reduce(self,record, collector):
+ """ Called with input values to generate reducer output. Inputs are sorted by the mapper
+ key.
+
+ The reduce function is invoked once for each value belonging to a given key outputted
+ by the mapper.
+
+ Parameters
+ ----------------------------------------------------------------------------
+ record - The mapper output
+ collector - The collector to collect the output
+
+ This is an abstract function which should be overloaded by the application specific
+ subclass.
+ """
+
+ raise NotImplementedError("This is an abstract method which should be overloaded in the subclass")
+
+ def reduceFlush(self,record, collector):
+ """
+ Called with the last intermediate value in each equivalence run.
+ In other words, reduceFlush is invoked once for each key produced in the reduce
+ phase. It is called after reduce has been invoked on each value for the given key.
+
+ Parameters
+ ------------------------------------------------------------------
+ record - the last record on which reduce was invoked.
+ """
+ raise NotImplementedError("This is an abstract method which should be overloaded in the subclass")
+
+ def status(self,message):
+ """
+ Called to update task status
+ """
+ self.outputClient.request("status",{"message":message})
+
+ def count(self,group, name, amount):
+ """
+ Called to increment a counter
+ """
+ self.outputClient.request("count",{"group":group, "name":name, "amount":amount})
+
+ def fail(self,message):
+ """
+ Call to fail the task.
+ """
+    self.log.error("TetherTask.fail: failure occurred, message follows:\n{0}".format(message))
+ try:
+ self.outputClient.request("fail",{"message":message})
+ except Exception as e:
+ estr=traceback.format_exc()
+      self.log.error("TetherTask.fail: an exception occurred while trying to send the fail message to the output server:\n{0}".format(estr))
+
+ self.close()
+
+ def close(self):
+ self.log.info("TetherTask.close: closing")
+    if not(self.clientTransceiver is None):
+      try:
+        self.clientTransceiver.close()
+
+ except Exception as e:
+ # ignore exceptions
+ pass
+
+ # http server is ready to be shutdown
+ self.ready_for_shutdown.set()
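
To make the TetherTask contract above concrete, here is a rough word-count
sketch. It is modeled on the word_count_task.py test added elsewhere in this
commit, but abbreviated here; the names and schemas are illustrative:

    from avro.tether import TetherTask

    IN_SCHEMA = '"string"'
    MID_SCHEMA = """{"type":"record", "name":"Pair", "namespace":"org.apache.avro.mapred",
      "fields":[{"name":"key","type":"string"},
                {"name":"value","type":"long","order":"ignore"}]}"""

    class WordCountTask(TetherTask):
      def __init__(self):
        TetherTask.__init__(self, inschema=IN_SCHEMA, midschema=MID_SCHEMA,
                            outschema=MID_SCHEMA)
        self.psum = 0  # running count for the current key

      def map(self, record, collector):
        # emit one (word, 1) pair per word in the input line
        for word in record.split():
          collector.collect({"key": word, "value": 1})

      def reduce(self, record, collector):
        # invoked once per value for a given key; accumulate
        self.psum += record["value"]

      def reduceFlush(self, record, collector):
        # invoked when the key changes; emit the total and reset
        collector.collect({"key": record["key"], "value": self.psum})
        self.psum = 0
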
diff --git a/lang/py/src/avro/tether/tether_task_runner.py b/lang/py/src/avro/tether/tether_task_runner.py
new file mode 100644
index 0000000..7d223d3
--- /dev/null
+++ b/lang/py/src/avro/tether/tether_task_runner.py
@@ -0,0 +1,227 @@
+"""
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+"""
+
+__all__=["TaskRunner"]
+
+if __name__ == "__main__":
+ # Relative imports don't work when being run directly
+ from avro import tether
+ from avro.tether import TetherTask, find_port, inputProtocol
+
+else:
+ from . import TetherTask, find_port, inputProtocol
+
+from avro import ipc
+from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
+import logging
+import weakref
+import threading
+import sys
+import traceback
+
+class TaskRunnerResponder(ipc.Responder):
+ """
+  The responder for the tethered process
+ """
+ def __init__(self,runner):
+ """
+ Param
+ ----------------------------------------------------------
+ runner - Instance of TaskRunner
+ """
+ ipc.Responder.__init__(self, inputProtocol)
+
+ self.log=logging.getLogger("TaskRunnerResponder")
+
+ # should we use weak references to avoid circular references?
+    # We use weak references because self.runner owns this instance of TaskRunnerResponder
+ if isinstance(runner,weakref.ProxyType):
+ self.runner=runner
+ else:
+ self.runner=weakref.proxy(runner)
+
+ self.task=weakref.proxy(runner.task)
+
+ def invoke(self, message, request):
+ try:
+ if message.name=='configure':
+        self.log.info("TetherTaskRunner: Received configure")
+ self.task.configure(request["taskType"],request["inSchema"],request["outSchema"])
+ elif message.name=='partitions':
+        self.log.info("TetherTaskRunner: Received partitions")
+ try:
+ self.task.set_partitions(request["partitions"])
+ except Exception as e:
+          self.log.error("Exception occurred while processing the partitions message:\n"+traceback.format_exc())
+ raise
+ elif message.name=='input':
+        self.log.info("TetherTaskRunner: Received input")
+ self.task.input(request["data"],request["count"])
+ elif message.name=='abort':
+        self.log.info("TetherTaskRunner: Received abort")
+ self.runner.close()
+ elif message.name=='complete':
+        self.log.info("TetherTaskRunner: Received complete")
+ self.task.complete()
+ self.task.close()
+ self.runner.close()
+ else:
+        self.log.warning("TetherTaskRunner: received unknown message {0}".format(message.name))
+
+ except Exception as e:
+      self.log.error("Error occurred while processing message: {0}".format(message.name))
+ emsg=traceback.format_exc()
+ self.task.fail(emsg)
+
+ return None
+
+
+def HTTPHandlerGen(runner):
+ """
+ This is a class factory for the HTTPHandler. We need
+  a factory because we need a reference to the runner
+
+ Parameters
+ -----------------------------------------------------------------
+ runner - instance of the task runner
+ """
+
+ if not(isinstance(runner,weakref.ProxyType)):
+ runnerref=weakref.proxy(runner)
+ else:
+ runnerref=runner
+
+ class TaskRunnerHTTPHandler(BaseHTTPRequestHandler):
+ """Create a handler for the parent.
+ """
+
+ runner=runnerref
+ def __init__(self,*args,**param):
+ """
+ """
+ BaseHTTPRequestHandler.__init__(self,*args,**param)
+
+ def do_POST(self):
+ self.responder =TaskRunnerResponder(self.runner)
+ call_request_reader = ipc.FramedReader(self.rfile)
+ call_request = call_request_reader.read_framed_message()
+ resp_body = self.responder.respond(call_request)
+ self.send_response(200)
+ self.send_header('Content-Type', 'avro/binary')
+ self.end_headers()
+ resp_writer = ipc.FramedWriter(self.wfile)
+ resp_writer.write_framed_message(resp_body)
+
+ return TaskRunnerHTTPHandler
+
+class TaskRunner(object):
+ """This class ties together the server handling the requests from
+ the parent process and the instance of TetherTask which actually
+ implements the logic for the mapper and reducer phases
+  implements the logic for the mapper and reducer phases.
+
+ def __init__(self,task):
+ """
+ Construct the runner
+
+ Parameters
+ ---------------------------------------------------------------
+ task - An instance of tether task
+ """
+
+ self.log=logging.getLogger("TaskRunner:")
+
+ if not(isinstance(task,TetherTask)):
+ raise ValueError("task must be an instance of tether task")
+ self.task=task
+
+ self.server=None
+ self.sthread=None
+
+ def start(self,outputport=None,join=True):
+ """
+ Start the server
+
+ Parameters
+ -------------------------------------------------------------------
+ outputport - (optional) The port on which the parent process is listening
+ for requests from the task.
+      - This will typically be supplied by an environment variable;
+        we allow it to be supplied as an argument mainly for debugging.
+    join - (optional) If set to false then we don't issue a join to block
+           until the thread executing the server terminates.
+ This is mainly for debugging. By setting it to false,
+ we can resume execution in this thread so that we can do additional
+ testing
+ """
+
+ port=find_port()
+ address=("localhost",port)
+
+
+ def thread_run(task_runner=None):
+ task_runner.server = HTTPServer(address, HTTPHandlerGen(task_runner))
+ task_runner.server.allow_reuse_address = True
+ task_runner.server.serve_forever()
+
+ # create a separate thread for the http server
+ sthread=threading.Thread(target=thread_run,kwargs={"task_runner":self})
+ sthread.start()
+
+ self.sthread=sthread
+    # This needs to run in a separate thread because serve_forever() blocks
+ self.task.open(port,clientPort=outputport)
+
+ # wait for the other thread to finish
+ if (join):
+ self.task.ready_for_shutdown.wait()
+ self.server.shutdown()
+
+ # should we do some kind of check to make sure it exits
+      self.log.info("Shutting down the logger")
+ # shutdown the logging
+ logging.shutdown()
+
+ def close(self):
+ """
+ Handler for the close message
+ """
+
+ self.task.close()
+
+if __name__ == '__main__':
+  # TODO: Make the logging level a parameter we can set
+ # logging.basicConfig(level=logging.INFO,filename='/tmp/log',filemode='w')
+ logging.basicConfig(level=logging.INFO)
+
+ if (len(sys.argv)<=1):
+ print "Error: tether_task_runner.__main__: Usage: tether_task_runner task_package.task_module.TaskClass"
+ raise ValueError("Usage: tether_task_runner task_package.task_module.TaskClass")
+
+ fullcls=sys.argv[1]
+ mod,cname=fullcls.rsplit(".",1)
+
+ logging.info("tether_task_runner.__main__: Task: {0}".format(fullcls))
+
+ modobj=__import__(mod,fromlist=cname)
+
+ taskcls=getattr(modobj,cname)
+ task=taskcls()
+
+ runner=TaskRunner(task=task)
+ runner.start()
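
The runner is normally launched by the parent process via the __main__ block
above, but for debugging it can also be driven directly; a sketch under the
same assumptions as the start() docstring (the port number and the
word_count_task module here are hypothetical):

    from avro.tether.tether_task_runner import TaskRunner
    from word_count_task import WordCountTask  # hypothetical task module

    runner = TaskRunner(task=WordCountTask())
    # join=False returns control to this thread; outputport stands in for
    # the AVRO_TETHER_OUTPUT_PORT environment variable.
    runner.start(outputport=50007, join=False)
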
diff --git a/lang/py/src/avro/tether/util.py b/lang/py/src/avro/tether/util.py
new file mode 100644
index 0000000..071b4a1
--- /dev/null
+++ b/lang/py/src/avro/tether/util.py
@@ -0,0 +1,34 @@
+"""
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+"""
+
+__all__=["find_port"]
+
+import socket
+
+
+def find_port():
+ """
+ Return an unbound port
+ """
+ s=socket.socket()
+ s.bind(("127.0.0.1",0))
+
+ port=s.getsockname()[1]
+ s.close()
+
+ return port
\ No newline at end of file
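
find_port() uses the usual ephemeral-port trick: binding to port 0 lets the OS
choose a free port, which is then read back and released. Note the inherent
race: the port is only probably free, since another process could claim it
between close() and the caller's own bind, so callers should bind as soon as
possible. A usage sketch (the example port is hypothetical):

    from avro.tether.util import find_port

    port = find_port()             # an OS-assigned free port, e.g. 50321
    address = ("localhost", port)  # bind promptly; the port could be reclaimed
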
diff --git a/lang/py/test/mock_tether_parent.py b/lang/py/test/mock_tether_parent.py
new file mode 100644
index 0000000..399a03a
--- /dev/null
+++ b/lang/py/test/mock_tether_parent.py
@@ -0,0 +1,95 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import set_avro_test_path
+from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
+from avro import ipc
+from avro import protocol
+from avro import tether
+
+import socket
+
+def find_port():
+ """
+ Return an unbound port
+ """
+ s=socket.socket()
+ s.bind(("127.0.0.1",0))
+
+ port=s.getsockname()[1]
+ s.close()
+
+ return port
+
+SERVER_ADDRESS = ('localhost', find_port())
+
+class MockParentResponder(ipc.Responder):
+ """
+ The responder for the mocked parent
+ """
+ def __init__(self):
+ ipc.Responder.__init__(self, tether.outputProtocol)
+
+ def invoke(self, message, request):
+    if message.name=='configure':
+      print "MockParentResponder: Received 'configure': inputPort={0}".format(request["port"])
+
+    elif message.name=='status':
+      print "MockParentResponder: Received 'status': message={0}".format(request["message"])
+    elif message.name=='fail':
+      print "MockParentResponder: Received 'fail': message={0}".format(request["message"])
+    else:
+      print "MockParentResponder: Received {0}".format(message.name)
+
+ # flush the output so it shows up in the parent process
+ sys.stdout.flush()
+
+ return None
+
+class MockParentHandler(BaseHTTPRequestHandler):
+ """Create a handler for the parent.
+ """
+ def do_POST(self):
+ self.responder =MockParentResponder()
+ call_request_reader = ipc.FramedReader(self.rfile)
+ call_request = call_request_reader.read_framed_message()
+ resp_body = self.responder.respond(call_request)
+ self.send_response(200)
+ self.send_header('Content-Type', 'avro/binary')
+ self.end_headers()
+ resp_writer = ipc.FramedWriter(self.wfile)
+ resp_writer.write_framed_message(resp_body)
+
+if __name__ == '__main__':
+ if (len(sys.argv)<=1):
+ raise ValueError("Usage: mock_tether_parent command")
+
+ cmd=sys.argv[1].lower()
+  if (cmd=='start_server'):
+ if (len(sys.argv)==3):
+ port=int(sys.argv[2])
+ else:
+ raise ValueError("Usage: mock_tether_parent start_server port")
+
+ SERVER_ADDRESS=(SERVER_ADDRESS[0],port)
+ print "mock_tether_parent: Launching Server on Port: {0}".format(SERVER_ADDRESS[1])
+
+ # flush the output so it shows up in the parent process
+ sys.stdout.flush()
+ parent_server = HTTPServer(SERVER_ADDRESS, MockParentHandler)
+ parent_server.allow_reuse_address = True
+ parent_server.serve_forever()
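
The mock parent is meant to be driven from another process; the tether tests
launch it roughly like this (a sketch only; the port is hypothetical, and the
real tests poll until the server is actually up before proceeding):

    import subprocess, sys

    port = 50007
    proc = subprocess.Popen([sys.executable, "mock_tether_parent.py",
                             "start_server", str(port)])
    # ... exercise the task against this parent, then clean up:
    proc.kill()
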
diff --git a/lang/py/test/set_avro_test_path.py b/lang/py/test/set_avro_test_path.py
new file mode 100644
index 0000000..d8b0098
--- /dev/null
+++ b/lang/py/test/set_avro_test_path.py
@@ -0,0 +1,40 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This module adjusts PYTHONPATH so the unittests
+will work even if an egg for AVRO is already installed.
+By default eggs always appear higher on Python's path than
+directories set via the environment variable PYTHONPATH.
+
+For reference see:
+http://www.velocityreviews.com/forums/t716589-pythonpath-and-eggs.html
+http://stackoverflow.com/questions/897792/pythons-sys-path-value.
+
+Unittests would therefore use the installed AVRO and not the AVRO
+being built. To work around this, the unittests import this module before
+importing AVRO. This module in turn adjusts the Python path so that the test
+build of AVRO is higher on the path than any installed eggs.
+"""
+import sys
+import os
+
+# determine the build directory and then make sure all paths that start with the
+# build directory are at the top of the path
+builddir=os.path.split(os.path.split(__file__)[0])[0]
+bpaths=filter(lambda s:s.startswith(builddir), sys.path)
+
+for p in bpaths:
+ sys.path.insert(0,p)
\ No newline at end of file
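
A quick, hypothetical way to confirm the reordering worked, run from
lang/py/test with an avro egg installed:

    import set_avro_test_path  # must be imported before avro
    import avro
    print avro.__file__  # should point into the build tree, not the egg
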
diff --git a/lang/py/test/test_datafile.py b/lang/py/test/test_datafile.py
index b3ce692..72994f3 100644
--- a/lang/py/test/test_datafile.py
+++ b/lang/py/test/test_datafile.py
@@ -15,6 +15,9 @@
# limitations under the License.
import os
import unittest
+
+import set_avro_test_path
+
from avro import schema
from avro import io
from avro import datafile
diff --git a/lang/py/test/test_datafile_interop.py b/lang/py/test/test_datafile_interop.py
index 8f4e883..7204529 100644
--- a/lang/py/test/test_datafile_interop.py
+++ b/lang/py/test/test_datafile_interop.py
@@ -15,6 +15,9 @@
# limitations under the License.
import os
import unittest
+
+import set_avro_test_path
+
from avro import io
from avro import datafile
diff --git a/lang/py/test/test_io.py b/lang/py/test/test_io.py
index 05a6f80..1e79d3e 100644
--- a/lang/py/test/test_io.py
+++ b/lang/py/test/test_io.py
@@ -19,6 +19,9 @@ try:
except ImportError:
from StringIO import StringIO
from binascii import hexlify
+
+import set_avro_test_path
+
from avro import schema
from avro import io
diff --git a/lang/py/test/test_ipc.py b/lang/py/test/test_ipc.py
index 2545b15..7fffe49 100644
--- a/lang/py/test/test_ipc.py
+++ b/lang/py/test/test_ipc.py
@@ -19,6 +19,8 @@ servers yet available.
"""
import unittest
+import set_avro_test_path
+
# This test does import this code, to make sure it at least passes
# compilation.
from avro import ipc
diff --git a/lang/py/test/test_schema.py b/lang/py/test/test_schema.py
index b9c84b3..204d1b1 100644
--- a/lang/py/test/test_schema.py
+++ b/lang/py/test/test_schema.py
@@ -17,6 +17,8 @@
Test the schema parsing logic.
"""
import unittest
+import set_avro_test_path
+
from avro import schema
def print_test_name(test_name):
@@ -287,6 +289,10 @@ OTHER_PROP_EXAMPLES = [
"symbols": [ "one", "two", "three" ],
"cp_float" : 1.0 }
""",True),
+ ExampleSchema("""\
+ {"type": "long",
+ "date": "true"}
+ """, True)
]
EXAMPLES = PRIMITIVE_EXAMPLES
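
The new example above exercises arbitrary attributes on a primitive schema; a
short sketch of the behaviour it pins down (py2 API, assuming avro is
importable):

    from avro import schema

    s = schema.parse('{"type": "long", "date": "true"}')
    print s.type         # long
    print s.other_props  # expected: {'date': 'true'}
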
diff --git a/lang/py/test/test_tether_task.py b/lang/py/test/test_tether_task.py
new file mode 100644
index 0000000..32265e6
--- /dev/null
+++ b/lang/py/test/test_tether_task.py
@@ -0,0 +1,116 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+
+import os
+import subprocess
+import sys
+import time
+import unittest
+
+import set_avro_test_path
+
+class TestTetherTask(unittest.TestCase):
+ """
+  TODO: We should validate the server response by looking at stdout
+ """
+ def test1(self):
+ """
+    Test that the tether_task is working. We run the mock_tether_parent in a separate
+ subprocess
+ """
+ from avro import tether
+ from avro import io as avio
+ from avro import schema
+ from avro.tether import HTTPRequestor,inputProtocol, find_port
+
+ import StringIO
+ import mock_tether_parent
+ from word_count_task import WordCountTask
+
+ task=WordCountTask()
+
+ proc=None
+ try:
+ # launch the server in a separate process
+ # env["AVRO_TETHER_OUTPUT_PORT"]=output_port
+ env=dict()
+ env["PYTHONPATH"]=':'.join(sys.path)
+ server_port=find_port()
+
+ pyfile=mock_tether_parent.__file__
+ proc=subprocess.Popen(["python", pyfile,"start_server","{0}".format(server_port)])
+ input_port=find_port()
+
+ print "Mock server started process pid={0}".format(proc.pid)
+ # Possible race condition? open tries to connect to the subprocess before the subprocess is fully started
+ # so we give the subprocess time to start up
+ time.sleep(1)
+ task.open(input_port,clientPort=server_port)
+
+      # TODO: We should validate that open worked by grabbing the STDOUT of the subprocess
+      # and ensuring that it printed the correct message.
+
+ #***************************************************************
+ # Test the mapper
+ task.configure(tether.TaskType.MAP,str(task.inschema),str(task.midschema))
+
+ # Serialize some data so we can send it to the input function
+ datum="This is a line of text"
+ writer = StringIO.StringIO()
+ encoder = avio.BinaryEncoder(writer)
+ datum_writer = avio.DatumWriter(task.inschema)
+ datum_writer.write(datum, encoder)
+
+ writer.seek(0)
+ data=writer.read()
+
+ # Call input to simulate calling map
+ task.input(data,1)
+
+ # Test the reducer
+ task.configure(tether.TaskType.REDUCE,str(task.midschema),str(task.outschema))
+
+ # Serialize some data so we can send it to the input function
+ datum={"key":"word","value":2}
+ writer = StringIO.StringIO()
+ encoder = avio.BinaryEncoder(writer)
+ datum_writer = avio.DatumWriter(task.midschema)
+ datum_writer.write(datum, encoder)
+
+ writer.seek(0)
+ data=writer.read()
+
+ # Call input to simulate calling reduce
+ task.input(data,1)
+
+ task.complete()
+
+ # try a status
+ task.status("Status message")
+
+ except Exception as e:
+ raise
+ finally:
+ # close the process
+ if not(proc is None):
+ proc.kill()
+
+ pass
+
+if __name__ == '__main__':
+ unittest.main()
\ No newline at end of file
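
For completeness, the inverse of the encode pattern used in the test, handy
when checking what input() actually receives (same py2 API):

    import StringIO
    from avro import io as avio
    from avro import schema

    sch = schema.parse('{"type": "string"}')
    decoder = avio.BinaryDecoder(StringIO.StringIO(data))  # 'data' as built above
    print avio.DatumReader(sch).read(decoder)  # -> "This is a line of text"
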
diff --git a/lang/py/test/test_tether_task_runner.py b/lang/py/test/test_tether_task_runner.py
new file mode 100644
index 0000000..a3f10fe
--- /dev/null
+++ b/lang/py/test/test_tether_task_runner.py
@@ -0,0 +1,191 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import subprocess
+import sys
+import time
+import unittest
+
+import set_avro_test_path
+
+
+class TestTetherTaskRunner(unittest.TestCase):
+ """ unit test for a tethered task runner.
+ """
+
+ def test1(self):
+ from word_count_task import WordCountTask
+ from avro.tether import TaskRunner, find_port,HTTPRequestor,inputProtocol, TaskType
+ from avro import io as avio
+ import mock_tether_parent
+ import subprocess
+ import StringIO
+ import logging
+
+ # set the logging level to debug so that debug messages are printed
+ logging.basicConfig(level=logging.DEBUG)
+
+ proc=None
+ try:
+ # launch the server in a separate process
+ env=dict()
+ env["PYTHONPATH"]=':'.join(sys.path)
+ parent_port=find_port()
+
+ pyfile=mock_tether_parent.__file__
+ proc=subprocess.Popen(["python", pyfile,"start_server","{0}".format(parent_port)])
+ input_port=find_port()
+
+ print "Mock server started process pid={0}".format(proc.pid)
+ # Possible race condition? open tries to connect to the subprocess before the subprocess is fully started
+ # so we give the subprocess time to start up
+ time.sleep(1)
+
+ runner=TaskRunner(WordCountTask())
+
+ runner.start(outputport=parent_port,join=False)
+
+ # Test sending various messages to the server and ensuring they are
+ # processed correctly
+ requestor=HTTPRequestor("localhost",runner.server.server_address[1],inputProtocol)
+
+      # TODO: We should validate that open worked by grabbing the STDOUT of the subprocess
+      # and ensuring that it printed the correct message.
+
+ # Test the mapper
+ requestor.request("configure",{"taskType":TaskType.MAP,"inSchema":str(runner.task.inschema),"outSchema":str(runner.task.midschema)})
+
+ # Serialize some data so we can send it to the input function
+ datum="This is a line of text"
+ writer = StringIO.StringIO()
+ encoder = avio.BinaryEncoder(writer)
+ datum_writer = avio.DatumWriter(runner.task.inschema)
+ datum_writer.write(datum, encoder)
+
+ writer.seek(0)
+ data=writer.read()
+
+
+ # Call input to simulate calling map
+ requestor.request("input",{"data":data,"count":1})
+
+ #Test the reducer
+ requestor.request("configure",{"taskType":TaskType.REDUCE,"inSchema":str(runner.task.midschema),"outSchema":str(runner.task.outschema)})
+
+ #Serialize some data so we can send it to the input function
+ datum={"key":"word","value":2}
+ writer = StringIO.StringIO()
+ encoder = avio.BinaryEncoder(writer)
+ datum_writer = avio.DatumWriter(runner.task.midschema)
+ datum_writer.write(datum, encoder)
+
+ writer.seek(0)
+ data=writer.read()
+
+
+ #Call input to simulate calling reduce
+ requestor.request("input",{"data":data,"count":1})
+
+ requestor.request("complete",{})
+
+
+ runner.task.ready_for_shutdown.wait()
+ runner.server.shutdown()
+ #time.sleep(2)
+ #runner.server.shutdown()
+
+ sthread=runner.sthread
+
+ #Possible race condition?
+ time.sleep(1)
+
+ #make sure the other thread terminated
+ self.assertFalse(sthread.isAlive())
+
+ #shutdown the logging
+ logging.shutdown()
+
+ except Exception as e:
+ raise
+ finally:
+ #close the process
+ if not(proc is None):
+ proc.kill()
+
+
+ def test2(self):
+ """
+ In this test we want to make sure that when we run "tether_task_runner.py"
+ as our main script everything works as expected. We do this by using subprocess to run it
+    in a separate process.
+ """
+ from word_count_task import WordCountTask
+ from avro.tether import TaskRunner, find_port,HTTPRequestor,inputProtocol, TaskType
+ from avro.tether import tether_task_runner
+ from avro import io as avio
+ import mock_tether_parent
+ import subprocess
+ import StringIO
+
+
+ proc=None
+
+ runnerproc=None
+ try:
+ #launch the server in a separate process
+ env=dict()
+ env["PYTHONPATH"]=':'.join(sys.path)
+ parent_port=find_port()
+
+ pyfile=mock_tether_parent.__file__
+ proc=subprocess.Popen(["python", pyfile,"start_server","{0}".format(parent_port)])
+
+      # Possible race condition? When we start tether_task_runner it will call
+      # open, which tries to connect to the mock parent before it is fully
+      # started, so we give the subprocess time to start up
+ time.sleep(1)
+
+
+ #start the tether_task_runner in a separate process
+ env={"AVRO_TETHER_OUTPUT_PORT":"{0}".format(parent_port)}
+ env["PYTHONPATH"]=':'.join(sys.path)
+
+ runnerproc=subprocess.Popen(["python",tether_task_runner.__file__,"word_count_task.WordCountTask"],env=env)
+
+ #possible race condition wait for the process to start
+ time.sleep(1)
+
+
+
+ print "Mock server started process pid={0}".format(proc.pid)
+ #Possible race condition? open tries to connect to the subprocess before the subprocess is fully started
+ #so we give the subprocess time to start up
+ time.sleep(1)
+
+
+ except Exception as e:
+ raise
+ finally:
+ #close the process
+ if not(runnerproc is None):
+ runnerproc.kill()
+
+ if not(proc is None):
+ proc.kill()
+
+if __name__==("__main__"):
+ unittest.main()
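
find_port, imported from avro.tether above, picks a free TCP port for the
servers in these tests; a sketch of the bind-to-port-0 idiom it is assumed to
wrap:

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("localhost", 0))
    port = s.getsockname()[1]  # the OS picked a free port
    s.close()
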
diff --git a/lang/py/test/test_tether_word_count.py b/lang/py/test/test_tether_word_count.py
new file mode 100644
index 0000000..6e51d31
--- /dev/null
+++ b/lang/py/test/test_tether_word_count.py
@@ -0,0 +1,213 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import inspect
+import subprocess
+import sys
+import time
+import unittest
+import os
+
+import set_avro_test_path
+
+class TestTetherWordCount(unittest.TestCase):
+ """ unittest for a python tethered map-reduce job.
+ """
+
+ def _write_lines(self,lines,fname):
+ """
+ Write the lines to an avro file named fname
+
+ Parameters
+ --------------------------------------------------------
+ lines - list of strings to write
+ fname - the name of the file to write to.
+ """
+ import avro.io as avio
+ from avro.datafile import DataFileReader,DataFileWriter
+ from avro import schema
+
+ #recursively make all directories
+ dparts=fname.split(os.sep)[:-1]
+ for i in range(len(dparts)):
+ pdir=os.sep+os.sep.join(dparts[:i+1])
+ if not(os.path.exists(pdir)):
+ os.mkdir(pdir)
+
+
+    with open(fname,'wb') as hf:
+ inschema="""{"type":"string"}"""
+ writer=DataFileWriter(hf,avio.DatumWriter(inschema),writers_schema=schema.parse(inschema))
+
+ #encoder = avio.BinaryEncoder(writer)
+ #datum_writer = avio.DatumWriter()
+ for datum in lines:
+ writer.append(datum)
+
+ writer.close()
+
+
+
+
+ def _count_words(self,lines):
+ """Return a dictionary counting the words in lines
+ """
+ counts={}
+
+ for line in lines:
+ words=line.split()
+
+ for w in words:
+        w=w.strip()
+        counts[w]=counts.get(w,0)+1
+
+ return counts
+
+ def test1(self):
+ """
+ Run a tethered map-reduce job.
+
+ Assumptions: 1) bash is available in /bin/bash
+ """
+ from word_count_task import WordCountTask
+ from avro.tether import tether_task_runner
+ from avro.datafile import DataFileReader
+ from avro.io import DatumReader
+ import avro
+
+ import subprocess
+ import StringIO
+ import shutil
+ import tempfile
+ import inspect
+
+ proc=None
+
+ try:
+
+
+      # TODO: use the tempfile module to generate random names
+      # for the files
+ base_dir = "/tmp/test_tether_word_count"
+ if os.path.exists(base_dir):
+ shutil.rmtree(base_dir)
+
+ inpath = os.path.join(base_dir, "in")
+ infile=os.path.join(inpath, "lines.avro")
+ lines=["the quick brown fox jumps over the lazy dog",
+ "the cow jumps over the moon",
+ "the rain in spain falls mainly on the plains"]
+
+ self._write_lines(lines,infile)
+
+ true_counts=self._count_words(lines)
+
+ if not(os.path.exists(infile)):
+ self.fail("Missing the input file {0}".format(infile))
+
+
+ # The schema for the output of the mapper and reducer
+ oschema="""
+{"type":"record",
+ "name":"Pair","namespace":"org.apache.avro.mapred","fields":[
+ {"name":"key","type":"string"},
+ {"name":"value","type":"long","order":"ignore"}
+ ]
+}
+"""
+
+ # write the schema to a temporary file
+ osfile=tempfile.NamedTemporaryFile(mode='w',suffix=".avsc",prefix="wordcount",delete=False)
+ outschema=osfile.name
+ osfile.write(oschema)
+ osfile.close()
+
+ if not(os.path.exists(outschema)):
+ self.fail("Missing the schema file")
+
+ outpath = os.path.join(base_dir, "out")
+
+ args=[]
+
+ args.append("java")
+ args.append("-jar")
+      args.append(os.path.abspath("@TOPDIR@/../java/tools/target/avro-tools-@AVRO_VERSION@.jar"))
+
+
+ args.append("tether")
+ args.extend(["--in",inpath])
+ args.extend(["--out",outpath])
+ args.extend(["--outschema",outschema])
+ args.extend(["--protocol","http"])
+
+ # form the arguments for the subprocess
+ subargs=[]
+
+ srcfile=inspect.getsourcefile(tether_task_runner)
+
+ # Create a shell script to act as the program we want to execute
+ # We do this so we can set the python path appropriately
+ script="""#!/bin/bash
+export PYTHONPATH={0}
+python -m avro.tether.tether_task_runner word_count_task.WordCountTask
+"""
+ # We need to make sure avro is on the path
+ # getsourcefile(avro) returns .../avro/__init__.py
+ asrc=inspect.getsourcefile(avro)
+ apath=asrc.rsplit(os.sep,2)[0]
+
+ # path to where the tests lie
+ tpath=os.path.split(__file__)[0]
+
+ exhf=tempfile.NamedTemporaryFile(mode='w',prefix="exec_word_count_",delete=False)
+ exfile=exhf.name
+ exhf.write(script.format((os.pathsep).join([apath,tpath]),srcfile))
+ exhf.close()
+
+ # make it world executable
+ os.chmod(exfile,0755)
+
+ args.extend(["--program",exfile])
+
+ print "Command:\n\t{0}".format(" ".join(args))
+ proc=subprocess.Popen(args)
+
+
+ proc.wait()
+
+ # read the output
+      with open(os.path.join(outpath,"part-00000.avro"),'rb') as hf:
+ reader=DataFileReader(hf, DatumReader())
+ for record in reader:
+ self.assertEqual(record["value"],true_counts[record["key"]])
+
+ reader.close()
+
+ except Exception as e:
+ raise
+ finally:
+ # close the process
+ if proc is not None and proc.returncode is None:
+ proc.kill()
+ if os.path.exists(base_dir):
+ shutil.rmtree(base_dir)
+ if os.path.exists(exfile):
+ os.remove(exfile)
+
+if __name__== "__main__":
+ unittest.main()
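
A standalone sketch of the container-file round trip that _write_lines and the
final verification rely on (py2 API, assuming avro is importable):

    from avro import schema
    from avro.datafile import DataFileReader, DataFileWriter
    from avro.io import DatumReader, DatumWriter

    sch = schema.parse('{"type": "string"}')
    writer = DataFileWriter(open("/tmp/lines.avro", "wb"), DatumWriter(),
                            writers_schema=sch)
    writer.append("the quick brown fox")
    writer.close()  # also closes the underlying file

    for line in DataFileReader(open("/tmp/lines.avro", "rb"), DatumReader()):
        print line
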
diff --git a/lang/py/test/word_count_task.py b/lang/py/test/word_count_task.py
new file mode 100644
index 0000000..30dcc51
--- /dev/null
+++ b/lang/py/test/word_count_task.py
@@ -0,0 +1,96 @@
+"""
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+"""
+
+__all__=["WordCountTask"]
+
+from avro.tether import TetherTask
+
+import logging
+
+# TODO: Make the logging level a parameter we can set
+#logging.basicConfig(level=logging.INFO)
+class WordCountTask(TetherTask):
+ """
+  Implements the mapper and reducer for the word count example
+ """
+
+ def __init__(self):
+ """
+ """
+
+ inschema="""{"type":"string"}"""
+ midschema="""{"type":"record", "name":"Pair","namespace":"org.apache.avro.mapred","fields":[
+ {"name":"key","type":"string"},
+ {"name":"value","type":"long","order":"ignore"}]
+ }"""
+ outschema=midschema
+ TetherTask.__init__(self,inschema,midschema,outschema)
+
+
+ #keep track of the partial sums of the counts
+ self.psum=0
+
+
+ def map(self,record,collector):
+ """Implement the mapper for the word count example
+
+ Parameters
+ ----------------------------------------------------------------------------
+ record - The input record
+ collector - The collector to collect the output
+ """
+
+ words=record.split()
+
+ for w in words:
+ logging.info("WordCountTask.Map: word={0}".format(w))
+ collector.collect({"key":w,"value":1})
+
+ def reduce(self,record, collector):
+ """Called with input values to generate reducer output. Inputs are sorted by the mapper
+ key.
+
+    The reduce function is invoked once for each value belonging to a given key emitted
+ by the mapper.
+
+ Parameters
+ ----------------------------------------------------------------------------
+ record - The mapper output
+ collector - The collector to collect the output
+ """
+
+ self.psum+=record["value"]
+
+ def reduceFlush(self,record, collector):
+ """
+ Called with the last intermediate value in each equivalence run.
+ In other words, reduceFlush is invoked once for each key produced in the reduce
+ phase. It is called after reduce has been invoked on each value for the given key.
+
+ Parameters
+ ------------------------------------------------------------------
+ record - the last record on which reduce was invoked.
+ """
+
+ #collect the current record
+ logging.info("WordCountTask.reduceFlush key={0} value={1}".format(record["key"],self.psum))
+
+ collector.collect({"key":record["key"],"value":self.psum})
+
+ #reset the sum
+ self.psum=0
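
A hypothetical driver for the class above, simulating the sort-and-group
contract the tether framework provides (Collector here is a stand-in, not the
real tether class; assumes avro's tether module is importable):

    class Collector(object):
        def __init__(self):
            self.pairs = []
        def collect(self, record):
            self.pairs.append(record)

    task = WordCountTask()
    mapped = Collector()
    task.map("the cow jumps over the moon", mapped)  # one pair per word

    out = Collector()
    prev = None
    for pair in sorted(mapped.pairs, key=lambda p: p["key"]):
        if prev is not None and prev["key"] != pair["key"]:
            task.reduceFlush(prev, out)  # end of the previous key's run
        task.reduce(pair, out)
        prev = pair
    if prev is not None:
        task.reduceFlush(prev, out)
    print out.pairs  # expect {'key': 'the', 'value': 2} among the results
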
diff --git a/lang/py3/avro/schema.py b/lang/py3/avro/schema.py
index b5d17fe..c3f73c5 100644
--- a/lang/py3/avro/schema.py
+++ b/lang/py3/avro/schema.py
@@ -643,7 +643,7 @@ class PrimitiveSchema(Schema):
Valid primitive types are defined in PRIMITIVE_TYPES.
"""
- def __init__(self, type):
+ def __init__(self, type, other_props=None):
"""Initializes a new schema object for the specified primitive type.
Args:
@@ -651,7 +651,7 @@ class PrimitiveSchema(Schema):
"""
if type not in PRIMITIVE_TYPES:
raise AvroException('%r is not a valid primitive type.' % type)
- super(PrimitiveSchema, self).__init__(type)
+ super(PrimitiveSchema, self).__init__(type, other_props=other_props)
@property
def name(self):
@@ -752,7 +752,7 @@ class EnumSchema(NamedSchema):
other_props=other_props,
)
- self._props['symbols'] = tuple(sorted(symbol_set))
+ self._props['symbols'] = symbols
if doc is not None:
self._props['doc'] = doc
@@ -1153,7 +1153,7 @@ def _SchemaFromJSONObject(json_object, names):
if type in PRIMITIVE_TYPES:
# FIXME should not ignore other properties
- return PrimitiveSchema(type)
+ return PrimitiveSchema(type, other_props=other_props)
elif type in NAMED_TYPES:
name = json_object.get('name')
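
Taken together, the changes above preserve extra attributes on primitive
schemas and stop sorting enum symbols; a sketch against the py3 API, using the
same constructor arguments as the new test below:

    from avro import schema

    enum = schema.EnumSchema('Test', '', ['B', 'A'], schema.Names(), '', {})
    print(enum.symbols[0])  # 'B' -- declaration order is now preserved
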
diff --git a/lang/py3/avro/tests/run_tests.py b/lang/py3/avro/tests/run_tests.py
index 738c8e5..d7e6512 100644
--- a/lang/py3/avro/tests/run_tests.py
+++ b/lang/py3/avro/tests/run_tests.py
@@ -54,6 +54,7 @@ from avro.tests.test_ipc import *
from avro.tests.test_protocol import *
from avro.tests.test_schema import *
from avro.tests.test_script import *
+from avro.tests.test_enum import *
def SetupLogging():
diff --git a/lang/py/test/test_datafile_interop.py b/lang/py3/avro/tests/test_enum.py
similarity index 54%
copy from lang/py/test/test_datafile_interop.py
copy to lang/py3/avro/tests/test_enum.py
index 8f4e883..7e55359 100644
--- a/lang/py/test/test_datafile_interop.py
+++ b/lang/py3/avro/tests/test_enum.py
@@ -1,39 +1,35 @@
+#!/usr/bin/env python3
+# -*- mode: python -*-
+# -*- coding: utf-8 -*-
+
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
-# regarding copyright ownership. The ASF licenses this file
+# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
-#
+#
# http://www.apache.org/licenses/LICENSE-2.0
-#
+#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-import os
+
import unittest
-from avro import io
-from avro import datafile
-class TestDataFileInterop(unittest.TestCase):
- def test_interop(self):
- print ''
- print 'TEST INTEROP'
- print '============'
- print ''
- for f in os.listdir('@INTEROP_DATA_DIR@'):
- print 'READING %s' % f
- print ''
+from avro import schema
+
+class TestEnum(unittest.TestCase):
+ def testSymbolsInOrder(self):
+ enum = schema.EnumSchema('Test', '', ['A', 'B'], schema.Names(), '', {})
+ self.assertEqual('A', enum.symbols[0])
- # read data in binary from file
- reader = open(os.path.join('@INTEROP_DATA_DIR@', f), 'rb')
- datum_reader = io.DatumReader()
- dfr = datafile.DataFileReader(reader, datum_reader)
- for datum in dfr:
- assert datum is not None
+ def testSymbolsInReverseOrder(self):
+ enum = schema.EnumSchema('Test', '', ['B', 'A'], schema.Names(), '', {})
+ self.assertEqual('B', enum.symbols[0])
if __name__ == '__main__':
- unittest.main()
+ raise Exception('Use run_tests.py')
diff --git a/lang/py3/avro/tests/test_schema.py b/lang/py3/avro/tests/test_schema.py
index 3aaa6b3..c836528 100644
--- a/lang/py3/avro/tests/test_schema.py
+++ b/lang/py3/avro/tests/test_schema.py
@@ -426,6 +426,11 @@ OTHER_PROP_EXAMPLES = [
""",
valid=True,
),
+ ExampleSchema("""
+ {"type": "long", "date": "true"}
+ """,
+ valid=True,
+ ),
]
EXAMPLES = PRIMITIVE_EXAMPLES
diff --git a/lang/py3/setup.py b/lang/py3/setup.py
index 426ad1d..53b76ad 100644
--- a/lang/py3/setup.py
+++ b/lang/py3/setup.py
@@ -27,6 +27,9 @@ from setuptools import setup
VERSION_FILE_NAME = 'VERSION.txt'
+# The following prevents distutils from using hardlinks (which may not always be
+# available, e.g. on a Docker volume). See http://bugs.python.org/issue8876
+del os.link
def RunsFromSourceDist():
"""Tests whether setup.py is invoked from a source distribution.
@@ -120,7 +123,7 @@ def Main():
avro_version = ReadVersion()
setup(
- name = 'avro-python3-snapshot',
+ name = 'avro-python3',
version = avro_version,
packages = ['avro'],
package_dir = {'avro': 'avro'},
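
A more defensive variant of the hardlink workaround above (hypothetical; the
shipped setup.py deletes the attribute unconditionally):

    import os

    if hasattr(os, "link"):
        del os.link  # force distutils to copy files instead of hardlinking
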
diff --git a/pom.xml b/pom.xml
index e188eb0..c3b6197 100644
--- a/pom.xml
+++ b/pom.xml
@@ -19,6 +19,10 @@
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
+ <prerequisites>
+ <maven>2.2.1</maven>
+ </prerequisites>
+
<parent>
<groupId>org.apache</groupId>
<artifactId>apache</artifactId>
@@ -27,7 +31,7 @@
<groupId>org.apache.avro</groupId>
<artifactId>avro-toplevel</artifactId>
- <version>1.7.7</version>
+ <version>1.8.0</version>
<packaging>pom</packaging>
<name>Apache Avro Toplevel</name>
@@ -47,7 +51,7 @@
<!-- plugin versions -->
<antrun-plugin.version>1.7</antrun-plugin.version>
- <enforcer-plugin.version>1.0.1</enforcer-plugin.version>
+ <enforcer-plugin.version>1.3.1</enforcer-plugin.version>
</properties>
<modules>
diff --git a/share/VERSION.txt b/share/VERSION.txt
index 73c8b4f..afa2b35 100644
--- a/share/VERSION.txt
+++ b/share/VERSION.txt
@@ -1 +1 @@
-1.7.7
\ No newline at end of file
+1.8.0
\ No newline at end of file
diff --git a/share/docker/Dockerfile b/share/docker/Dockerfile
new file mode 100644
index 0000000..3bc0b33
--- /dev/null
+++ b/share/docker/Dockerfile
@@ -0,0 +1,58 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Dockerfile for installing the necessary dependencies for building Avro.
+# See BUILD.txt.
+
+FROM java:7-jdk
+
+WORKDIR /root
+
+# Install dependencies from packages
+RUN apt-get update && apt-get install --no-install-recommends -y \
+ git subversion curl ant make maven \
+ gcc cmake asciidoc source-highlight \
+ g++ flex bison libboost-all-dev doxygen \
+ mono-devel mono-gmcs nunit \
+ nodejs nodejs-legacy npm \
+ perl \
+ php5 phpunit php5-gmp bzip2 \
+ python python-setuptools python3-setuptools \
+ ruby ruby-dev rake \
+ libsnappy1 libsnappy-dev
+
+# Install Forrest
+RUN mkdir -p /usr/local/apache-forrest
+RUN curl -O http://archive.apache.org/dist/forrest/0.8/apache-forrest-0.8.tar.gz
+RUN tar xzf *forrest* --strip-components 1 -C /usr/local/apache-forrest
+RUN echo 'forrest.home=/usr/local/apache-forrest' > build.properties
+RUN chmod -R 0777 /usr/local/apache-forrest/build /usr/local/apache-forrest/main \
+ /usr/local/apache-forrest/plugins
+ENV FORREST_HOME /usr/local/apache-forrest
+
+# Install Perl modules
+RUN curl -L http://cpanmin.us | perl - --self-upgrade # non-interactive cpan
+RUN cpanm install Module::Install Module::Install::ReadmeFromPod \
+ Module::Install::Repository \
+ Math::BigInt JSON::XS Try::Tiny Regexp::Common Encode \
+ IO::String Object::Tiny Compress::Zlib Test::More \
+ Test::Exception Test::Pod
+
+# Install Ruby modules
+RUN gem install echoe yajl-ruby multi_json snappy
+
+# Install global Node modules
+RUN npm install -g grunt-cli
diff --git a/share/rat-excludes.txt b/share/rat-excludes.txt
index c123a93..9b05e70 100644
--- a/share/rat-excludes.txt
+++ b/share/rat-excludes.txt
@@ -8,6 +8,7 @@
**/*.js
**/*.la
**/*.m4
+**/*.md
**/*.md5
**/*.pom
**/*.properties
diff --git a/share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl b/share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl
deleted file mode 100644
index 9fd5680..0000000
--- a/share/schemas/org/apache/avro/ipc/trace/avroTrace.avdl
+++ /dev/null
@@ -1,68 +0,0 @@
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements. See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership. The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License. You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-/**
- * A Span is our basic unit of tracing. It tracks the critical points
- * of a single RPC call and records other call meta-data. It also
- * allows arbitrary string annotations. Both the client and server create
- * Span objects, each of which is populated with half of the relevant event
- * data. They share a span ID, which allows us to merge them into one complete
- * span later on.
- */
-@namespace("org.apache.avro.ipc.trace")
-
-protocol AvroTrace {
- enum SpanEvent { SERVER_RECV, SERVER_SEND, CLIENT_RECV, CLIENT_SEND }
-
- fixed ID(8);
-
- record TimestampedEvent {
- long timeStamp; // Unix time, in nanoseconds
- union { SpanEvent, string} event;
- }
-
- /**
- * An individual span is the basic unit of testing.
- * The record is used by both \"client\" and \"server\".
- */
- record Span {
- ID traceID; // ID shared by all Spans in a given trace
- ID spanID; // Random ID for this Span
- union { ID, null } parentSpanID; // Parent Span ID (null if root Span)
- string messageName; // Function call represented
- long requestPayloadSize; // Size (bytes) of the request
- long responsePayloadSize; // Size (byts) of the response
- union { string, null} requestorHostname; // Hostname of requestor
-// int requestorPort; // Port of the requestor (currently unused)
- union { string, null } responderHostname; // Hostname of the responder
-// int responderPort; // Port of the responder (currently unused)
- array<TimestampedEvent> events; // List of critical events
- boolean complete; // Whether includes data from both sides
- }
-
- /**
- * Get all spans stored on this host.
- */
- array<Span> getAllSpans();
-
- /**
- * Get spans occuring between start and end. Each is a unix timestamp
- * in nanosecond units (for consistency with TimestampedEvent).
- */
- array<Span> getSpansInRange(long start, long end);
-}
diff --git a/share/schemas/org/apache/avro/ipc/trace/avroTrace.avpr b/share/schemas/org/apache/avro/ipc/trace/avroTrace.avpr
deleted file mode 100644
index 041f3e8..0000000
--- a/share/schemas/org/apache/avro/ipc/trace/avroTrace.avpr
+++ /dev/null
@@ -1,82 +0,0 @@
-{
- "protocol" : "AvroTrace",
- "namespace" : "org.apache.avro.ipc.trace",
- "types" : [ {
- "type" : "enum",
- "name" : "SpanEvent",
- "symbols" : [ "SERVER_RECV", "SERVER_SEND", "CLIENT_RECV", "CLIENT_SEND" ]
- }, {
- "type" : "fixed",
- "name" : "ID",
- "size" : 8
- }, {
- "type" : "record",
- "name" : "TimestampedEvent",
- "fields" : [ {
- "name" : "timeStamp",
- "type" : "long"
- }, {
- "name" : "event",
- "type" : [ "SpanEvent", "string" ]
- } ]
- }, {
- "type" : "record",
- "name" : "Span",
- "fields" : [ {
- "name" : "traceID",
- "type" : "ID"
- }, {
- "name" : "spanID",
- "type" : "ID"
- }, {
- "name" : "parentSpanID",
- "type" : [ "ID", "null" ]
- }, {
- "name" : "messageName",
- "type" : "string"
- }, {
- "name" : "requestPayloadSize",
- "type" : "long"
- }, {
- "name" : "responsePayloadSize",
- "type" : "long"
- }, {
- "name" : "requestorHostname",
- "type" : [ "string", "null" ]
- }, {
- "name" : "responderHostname",
- "type" : [ "string", "null" ]
- }, {
- "name" : "events",
- "type" : {
- "type" : "array",
- "items" : "TimestampedEvent"
- }
- }, {
- "name" : "complete",
- "type" : "boolean"
- } ]
- } ],
- "messages" : {
- "getAllSpans" : {
- "request" : [ ],
- "response" : {
- "type" : "array",
- "items" : "Span"
- }
- },
- "getSpansInRange" : {
- "request" : [ {
- "name" : "start",
- "type" : "long"
- }, {
- "name" : "end",
- "type" : "long"
- } ],
- "response" : {
- "type" : "array",
- "items" : "Span"
- }
- }
- }
-}
\ No newline at end of file
diff --git a/share/test/schemas/http.avdl b/share/test/schemas/http.avdl
new file mode 100644
index 0000000..52313e7
--- /dev/null
+++ b/share/test/schemas/http.avdl
@@ -0,0 +1,66 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/** NOTE: This structure was inspired by HTTP and deliberately skewed to produce the effects that needed testing */
+
+@namespace("org.apache.avro.test.http")
+protocol Http {
+
+ enum NetworkType {
+ IPv4,
+ IPv6
+ }
+
+ record NetworkConnection {
+ NetworkType networkType;
+ string networkAddress;
+ }
+
+ record UserAgent {
+ union { null, string } id = null;
+ string useragent;
+ }
+
+ enum HttpMethod {
+ GET,
+ POST
+ }
+
+ record QueryParameter {
+ string name;
+ union { null, string } value; // Sometimes there is no value.
+ }
+
+ record HttpURI {
+ HttpMethod method;
+ string path;
+ array<QueryParameter> parameters = [];
+ }
+
+ record HttpRequest {
+ UserAgent userAgent;
+ HttpURI URI;
+ }
+
+ record Request {
+ long timestamp;
+ NetworkConnection connection;
+ HttpRequest httpRequest;
+ }
+
+}
diff --git a/share/test/schemas/reserved.avsc b/share/test/schemas/reserved.avsc
new file mode 100644
index 0000000..40f4849
--- /dev/null
+++ b/share/test/schemas/reserved.avsc
@@ -0,0 +1,2 @@
+{"name": "org.apache.avro.test.Reserved", "type": "enum",
+ "symbols": ["default","class","int"]},
diff --git a/share/test/schemas/specialtypes.avdl b/share/test/schemas/specialtypes.avdl
new file mode 100644
index 0000000..623e016
--- /dev/null
+++ b/share/test/schemas/specialtypes.avdl
@@ -0,0 +1,98 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/** NOTE: This structure is intended to contain names that are likely to cause collisions with the generated code. */
+
+@namespace("org.apache.avro.test.specialtypes")
+protocol LetsBreakIt {
+
+ enum Enum {
+ builder,
+ Builder,
+ builderBuider,
+ value,
+ this
+ }
+
+ record One {
+ Enum this;
+ }
+
+ record Two {
+ union { null, string } this = null;
+ string String;
+ }
+
+ record Variables {
+ One this;
+
+ One Boolean;
+ One Integer;
+ One Long;
+ One Float;
+ One String;
+ }
+
+ enum Boolean {
+ Yes,
+ No
+ }
+
+ record String {
+ string value;
+ }
+
+ record builder {
+ One this;
+ Two builder;
+ }
+
+ record builderBuilder {
+ One this;
+ Two that;
+ }
+
+ record Builder {
+ One this;
+ Two that;
+ }
+
+ record value {
+ One this;
+ Two that;
+ }
+
+ record Types {
+ Boolean one;
+ builder two;
+ Builder three;
+ builderBuilder four;
+ String five;
+ value six;
+ }
+
+ record Names {
+ string Boolean;
+ string builder;
+ string Builder;
+ string builderBuilder;
+ string String;
+ string value;
+ }
+
+}
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/python-avro.git