[mapnik] 01/05: Imported Upstream version 3.0.4+ds
Sebastiaan Couwenberg
sebastic at moszumanska.debian.org
Sun Aug 30 14:19:59 UTC 2015
This is an automated email from the git hooks/post-receive script.
sebastic pushed a commit to branch master
in repository mapnik.
commit 2108614bff665e12a28d47ca68d21b7263f07396
Author: Bas Couwenberg <sebastic at xs4all.nl>
Date: Sun Aug 30 13:37:52 2015 +0200
Imported Upstream version 3.0.4+ds
---
.travis.yml | 12 +-
CHANGELOG.md | 54 +-
benchmark/test_polygon_clipping.cpp | 20 +-
include/build.py | 1 +
include/mapnik/csv/csv_grammar.hpp | 103 ++
include/mapnik/datasource_cache.hpp | 6 +
include/mapnik/marker_helpers.hpp | 1 -
include/mapnik/value.hpp | 6 +
include/mapnik/version.hpp | 2 +-
plugins/input/csv/build.py | 2 +
plugins/input/csv/csv_datasource.cpp | 838 ++++---------
plugins/input/csv/csv_datasource.hpp | 43 +-
.../csv_featureset.cpp} | 62 +-
.../csv_featureset.hpp} | 40 +-
plugins/input/csv/csv_inline_featureset.cpp | 78 ++
.../csv_inline_featureset.hpp} | 41 +-
plugins/input/csv/csv_utils.hpp | 297 ++++-
plugins/input/geojson/large_geojson_featureset.cpp | 1 -
plugins/input/geojson/large_geojson_featureset.hpp | 2 -
src/datasource_cache.cpp | 14 +-
src/image_util_jpeg.cpp | 2 +-
test/standalone/csv_test.cpp | 1229 ++++++++++----------
test/standalone/datasource_registration_test.cpp | 46 +
test/unit/svg/svg_parser_test.cpp | 4 +-
test/visual/run.cpp | 2 +
test/visual/runner.cpp | 52 +-
test/visual/runner.hpp | 11 +-
27 files changed, 1583 insertions(+), 1386 deletions(-)
diff --git a/.travis.yml b/.travis.yml
index c0e2afb..b9493a9 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -31,12 +31,12 @@ matrix:
- os: linux
compiler: gcc
env: JOBS=6
- - os: osx
- compiler: clang
- env: JOBS=8 MASON_PUBLISH=true
- - os: osx
- compiler: clang
- env: JOBS=8 COVERAGE=true
+ #- os: osx
+ # compiler: clang
+ # env: JOBS=8 MASON_PUBLISH=true
+ #- os: osx
+ # compiler: clang
+ # env: JOBS=8 COVERAGE=true
before_install:
- export COVERAGE=${COVERAGE:-false}
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 789b725..fb40544 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,6 +12,20 @@ Released: YYYY XX, 2015
(Packaged from xxxx)
+## 3.0.4
+
+Released: August 26, 2015
+
+(Packaged from 17bb81c)
+
+#### Summary
+
+- CSV.input: plug-in has been refactored to minimise memory usage and to improve handling of larger input.
+ (NOTE: [large_csv](https://github.com/mapnik/mapnik/tree/large_csv) branch adds experimental trunsduction parser with deferred string initialisation)
+- CSV.input: added internal spatial index (boost::geometry::index::tree) for fast `bounding box` queries (https://github.com/mapnik/mapnik/pull/3010)
+- Fixed deadlock in recursive datasource registration via @zerebubuth (https://github.com/mapnik/mapnik/pull/3038)
+- Introduced new command line argument `--limit` or `-l` to limit number of failed tests via @talaj (https://github.com/mapnik/mapnik/pull/2996)
+
## 3.0.3
Released: August 12, 2015
@@ -20,12 +34,12 @@ Released: August 12, 2015
#### Summary
-- Fixed an issue with fields over size of int32 in OGR plugin (https://github.com/mapnik/node-mapnik/issues/499)
+- Fixed an issue with fields over size of `int32` in `OGR` plugin (https://github.com/mapnik/node-mapnik/issues/499)
- Added 3 new image-filters to simulate types of colorblindness (`color-blind-protanope`,`color-blind-deuteranope`,`color-blind-tritanope`)
- Fix so that null text boxes have no bounding boxes when attempting placement ( 162f82cba5b0fb984c425586c6a4b354917abc47 )
- Patch to add legacy method for setting JPEG quality in images ( #3024 )
-- Added `filter_image` method which can modify an image in place or return a new image that is filtered.
-- Added missing typedef's in mapnik::geometry to allow experiementing with different containers
+- Added `filter_image` method which can modify an image in place or return a new image that is filtered
+- Added missing typedef's in `mapnik::geometry` to allow experimenting with different containers
## 3.0.2
@@ -127,7 +141,7 @@ The 3.0 release is a major milestone for Mapnik and includes many performance an
- Shield icons are now pixel snapped for crisp rendering
-- `MarkersSymbolizer` now supports `avoid-edges`, `offset`, `geometry-transform`, `simplify` for `line` placement and two new `placement` options called `vertex-last` and `vertex-first` to place a single marker at the end or beginning of a path. Also `clip` is now respected when rendering markers on a LineString
+- `MarkersSymbolizer` now supports `avoid-edges`, `offset`, `geometry-transform`, `simplify` for `line` placement and two new `placement` options called `vertex-last` and `vertex-first` to place a single marker at the end or beginning of a path. Also `clip` is now respected when rendering markers on a LineString
geometry.
- `TextSymbolizer` now supports `smooth`, `simplify`, `halo-opacity`, `halo-comp-op`, and `halo-transform`
@@ -208,7 +222,7 @@ geometry.
- Optimized expression evaluation of text by avoiding extra copy (1dd1275)
-- Added Map level `background-image-comp-op` to control the compositing operation used to blend the
+- Added Map level `background-image-comp-op` to control the compositing operation used to blend the
`background-image` onto the `background-color`. Has no meaning if `background-color` or `background-image`
are not set. (#1966)
@@ -396,8 +410,8 @@ Summary: The 2.2.0 release is primarily a performance and stability release. The
- Enabled default input plugin directory and fonts path to be set inherited from environment settings in
python bindings to make it easier to run tests locally (#1594). New environment settings are:
- - MAPNIK_INPUT_PLUGINS_DIRECTORY
- - MAPNIK_FONT_DIRECTORY
+ - MAPNIK_INPUT_PLUGINS_DIRECTORY
+ - MAPNIK_FONT_DIRECTORY
- Added support for controlling rendering behavior of markers on multi-geometries `marker-multi-policy` (#1555,#1573)
@@ -789,7 +803,7 @@ Released January, 19 2010
- Gdal Plugin: Added support for Gdal overviews, enabling fast loading of > 1GB rasters (#54)
- * Use the gdaladdo utility to add overviews to existing GDAL datasets
+ * Use the gdaladdo utility to add overviews to existing GDAL datasets
- PostGIS: Added an optional `geometry_table` parameter. The `geometry_table` used by Mapnik to look up
metadata in the geometry_columns and calculate extents (when the `geometry_field` and `srid` parameters
@@ -814,23 +828,23 @@ Released January, 19 2010
complex queries that may aggregate geometries to be kept fast by allowing proper placement of the bbox
query to be used by indexes. (#415)
- * Pass the bbox token inside a subquery like: !bbox!
+ * Pass the bbox token inside a subquery like: !bbox!
- * Valid Usages include:
+ * Valid Usages include:
- <Parameter name="table">
- (Select ST_Union(geom) as geom from table where ST_Intersects(geometry,!bbox!)) as map
- </Parameter>
+ <Parameter name="table">
+ (Select ST_Union(geom) as geom from table where ST_Intersects(geometry,!bbox!)) as map
+ </Parameter>
- <Parameter name="table">
- (Select * from table where geom && !bbox!) as map
- </Parameter>
+ <Parameter name="table">
+ (Select * from table where geom && !bbox!) as map
+ </Parameter>
- PostGIS Plugin: Added `scale_denominator` substitution ability in sql query string (#415/#465)
- * Pass the scale_denominator token inside a subquery like: !scale_denominator!
+ * Pass the scale_denominator token inside a subquery like: !scale_denominator!
- * e.g. (Select * from table where field_value > !scale_denominator!) as map
+ * e.g. (Select * from table where field_value > !scale_denominator!) as map
- PostGIS Plugin: Added support for quoted table names (r1454) (#393)
@@ -862,14 +876,14 @@ Released January, 19 2010
- TextSymbolizer: Large set of new attributes: `text_transform`, `line_spacing`, `character_spacing`,
`wrap_character`, `wrap_before`, `horizontal_alignment`, `justify_alignment`, and `opacity`.
- * More details at changesets: r1254 and r1341
+ * More details at changesets: r1254 and r1341
- SheildSymbolizer: Added special new attributes: `unlock_image`, `VERTEX` placement, `no_text` and many
attributes previously only supported in the TextSymbolizer: `allow_overlap`, `vertical_alignment`,
`horizontal_alignment`, `justify_alignment`, `wrap_width`, `wrap_character`, `wrap_before`, `text_transform`,
`line_spacing`, `character_spacing`, and `opacity`.
- * More details at changeset r1341
+ * More details at changeset r1341
- XML: Added support for using CDATA with libxml2 parser (r1364)
diff --git a/benchmark/test_polygon_clipping.cpp b/benchmark/test_polygon_clipping.cpp
index 25eacea..0005fc9 100644
--- a/benchmark/test_polygon_clipping.cpp
+++ b/benchmark/test_polygon_clipping.cpp
@@ -9,12 +9,14 @@
#include <mapnik/util/fs.hpp>
#include <mapnik/geometry.hpp>
#include <mapnik/vertex_adapters.hpp>
+#include <mapnik/geometry.hpp>
#include <mapnik/geometry_adapters.hpp>
#include <mapnik/geometry_envelope.hpp>
#include <mapnik/geometry_correct.hpp>
#include <mapnik/geometry_is_empty.hpp>
#include <mapnik/image_util.hpp>
#include <mapnik/color.hpp>
+// boost geometry
#include <boost/geometry.hpp>
// agg
#include "agg_conv_clip_polygon.h"
@@ -240,8 +242,15 @@ public:
mapnik::geometry::polygon<double> & poly = mapnik::util::get<mapnik::geometry::polygon<double> >(geom);
mapnik::geometry::correct(poly);
+ mapnik::geometry::linear_ring<double> bbox;
+ bbox.add_coord(extent_.minx(), extent_.miny());
+ bbox.add_coord(extent_.minx(), extent_.maxy());
+ bbox.add_coord(extent_.maxx(), extent_.maxy());
+ bbox.add_coord(extent_.maxx(), extent_.miny());
+ bbox.add_coord(extent_.minx(), extent_.miny());
+
std::deque<mapnik::geometry::polygon<double> > result;
- boost::geometry::intersection(extent_,poly,result);
+ boost::geometry::intersection(bbox, poly, result);
std::string expect = expected_+".png";
std::string actual = expected_+"_actual.png";
@@ -281,11 +290,18 @@ public:
mapnik::geometry::polygon<double> & poly = mapnik::util::get<mapnik::geometry::polygon<double> >(geom);
mapnik::geometry::correct(poly);
+ mapnik::geometry::linear_ring<double> bbox;
+ bbox.add_coord(extent_.minx(), extent_.miny());
+ bbox.add_coord(extent_.minx(), extent_.maxy());
+ bbox.add_coord(extent_.maxx(), extent_.maxy());
+ bbox.add_coord(extent_.maxx(), extent_.miny());
+ bbox.add_coord(extent_.minx(), extent_.miny());
+
bool valid = true;
for (unsigned i=0;i<iterations_;++i)
{
std::deque<mapnik::geometry::polygon<double> > result;
- boost::geometry::intersection(extent_,poly,result);
+ boost::geometry::intersection(bbox, poly, result);
unsigned count = 0;
for (auto const& _geom : result)
{
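Editor's note on the benchmark change above: both hunks stop passing the extent box straight to boost::geometry::intersection and instead build a closed ring from its four corners, repeating the first corner as the fifth point. The sketch below shows the same corner walk using only the standard library. The names box, point, ring and box_to_ring are hypothetical stand-ins, not Mapnik types.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the extent box; only the four bounds are modelled.
struct box
{
    double minx, miny, maxx, maxy;
};

using point = std::pair<double, double>;
using ring = std::vector<point>;

// Visit the corners in the same order as the benchmark code and repeat the
// first corner at the end so the ring is explicitly closed.
ring box_to_ring(box const& b)
{
    assert(b.minx <= b.maxx && b.miny <= b.maxy);
    return {
        {b.minx, b.miny},
        {b.minx, b.maxy},
        {b.maxx, b.maxy},
        {b.maxx, b.miny},
        {b.minx, b.miny}};
}
```

With y pointing up, this order runs clockwise, and the result always holds five points with the first equal to the last.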
diff --git a/include/build.py b/include/build.py
index 54bb16b..9c225a7 100644
--- a/include/build.py
+++ b/include/build.py
@@ -27,6 +27,7 @@ Import('env')
base = './mapnik/'
subdirs = [
'',
+ 'csv',
'svg',
'wkt',
'cairo',
diff --git a/include/mapnik/csv/csv_grammar.hpp b/include/mapnik/csv/csv_grammar.hpp
new file mode 100644
index 0000000..195542b
--- /dev/null
+++ b/include/mapnik/csv/csv_grammar.hpp
@@ -0,0 +1,103 @@
+/*****************************************************************************
+ *
+ * This file is part of Mapnik (c++ mapping toolkit)
+ *
+ * Copyright (C) 2015 Artem Pavlenko
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ *****************************************************************************/
+
+#ifndef MAPNIK_CVS_GRAMMAR_HPP
+#define MAPNIK_CVS_GRAMMAR_HPP
+
+//#define BOOST_SPIRIT_DEBUG
+
+#include <boost/spirit/include/qi.hpp>
+#include <boost/spirit/include/phoenix.hpp>
+
+namespace mapnik {
+
+namespace qi = boost::spirit::qi;
+using csv_value = std::string;
+using csv_line = std::vector<csv_value>;
+using csv_data = std::vector<csv_line>;
+
+template <typename Iterator>
+struct csv_line_grammar : qi::grammar<Iterator, csv_line(std::string const&), qi::blank_type>
+{
+ csv_line_grammar() : csv_line_grammar::base_type(line)
+ {
+ using namespace qi;
+ qi::_a_type _a;
+ qi::_r1_type _r1;
+ qi::lit_type lit;
+ //qi::eol_type eol;
+ qi::_1_type _1;
+ qi::char_type char_;
+ qi::omit_type omit;
+ unesc_char.add
+ ("\\a", '\a')
+ ("\\b", '\b')
+ ("\\f", '\f')
+ ("\\n", '\n')
+ ("\\r", '\r')
+ ("\\t", '\t')
+ ("\\v", '\v')
+ ("\\\\",'\\')
+ ("\\\'", '\'')
+ ("\\\"", '\"')
+ ("\"\"", '\"') // double quote
+ ;
+
+ line = column(_r1) % char_(_r1)
+ ;
+ column = quoted | *(char_ - (lit(_r1) /*| eol*/))
+ ;
+ quoted = omit[char_("\"'")[_a = _1]] > text(_a) > -lit(_a)
+ ;
+ text = *(unesc_char | (char_ - char_(_r1)))
+ ;
+ BOOST_SPIRIT_DEBUG_NODES((line)(column)(quoted));
+ }
+ private:
+ qi::rule<Iterator, csv_line(std::string const&), qi::blank_type> line;
+ qi::rule<Iterator, csv_value(std::string const&)> column; // no-skip
+ qi::rule<Iterator, csv_value(char)> text;
+ qi::rule<Iterator, qi::locals<char>, csv_value()> quoted;
+ qi::symbols<char const, char const> unesc_char;
+};
+
+template <typename Iterator>
+struct csv_file_grammar : qi::grammar<Iterator, csv_data(std::string const&), qi::blank_type>
+{
+ csv_file_grammar() : csv_file_grammar::base_type(start)
+ {
+ using namespace qi;
+ qi::eol_type eol;
+ qi::_r1_type _r1;
+ start = -line(_r1) % eol
+ ;
+ BOOST_SPIRIT_DEBUG_NODES((start));
+ }
+ private:
+ qi::rule<Iterator, csv_data(std::string const&), qi::blank_type> start;
+ csv_line_grammar<Iterator> line;
+};
+
+
+}
+
+#endif // MAPNIK_CVS_GRAMMAR_HPP
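Editor's note on csv_grammar.hpp above: the grammar reads a line as columns separated by the separator character, where a column is either a quoted run (opened by ' or " and closed by the same character) or a plain run of characters up to the next separator. The hand-written sketch below approximates that behaviour without Boost.Spirit. It is a simplification, not the grammar itself: backslash escapes from the unesc_char table are left out, blank skipping is not modelled, and a doubled quote is unescaped for whichever quote character opened the field. The function name split_csv_line is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split one CSV line on `sep`, honouring fields wrapped in ' or ".
std::vector<std::string> split_csv_line(std::string const& line, char sep)
{
    assert(sep != '"' && sep != '\''); // the separator must not be a quote character
    std::vector<std::string> fields;
    std::string field;
    std::size_t i = 0;
    std::size_t const n = line.size();
    while (true)
    {
        field.clear();
        if (i < n && (line[i] == '"' || line[i] == '\''))
        {
            char const quote = line[i++]; // remember which quote opened the field
            while (i < n)
            {
                if (line[i] == quote)
                {
                    if (i + 1 < n && line[i + 1] == quote)
                    {
                        field += quote; // doubled quote becomes one literal quote
                        i += 2;
                        continue;
                    }
                    ++i; // closing quote
                    break;
                }
                field += line[i++];
            }
            // ignore anything between the closing quote and the next separator
            while (i < n && line[i] != sep) ++i;
        }
        else
        {
            while (i < n && line[i] != sep) field += line[i++];
        }
        fields.push_back(field);
        if (i >= n) break;
        ++i; // consume the separator
    }
    return fields;
}
```

A trailing separator yields a final empty field, so "a," splits into two columns.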
diff --git a/include/mapnik/datasource_cache.hpp b/include/mapnik/datasource_cache.hpp
index bf951e7..4f90f89 100644
--- a/include/mapnik/datasource_cache.hpp
+++ b/include/mapnik/datasource_cache.hpp
@@ -33,6 +33,7 @@
#include <set>
#include <vector>
#include <memory>
+#include <mutex>
namespace mapnik {
@@ -56,6 +57,11 @@ private:
~datasource_cache();
std::map<std::string,std::shared_ptr<PluginInfo> > plugins_;
std::set<std::string> plugin_directories_;
+ // the singleton has a mutex protecting the instance pointer,
+ // but the instance also needs its own mutex to protect the
+ // plugins_ and plugin_directories_ members which are potentially
+ // modified recusrively by register_datasources(path, true);
+ std::recursive_mutex instance_mutex_;
};
extern template class MAPNIK_DECL singleton<datasource_cache, CreateStatic>;
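Editor's note on the datasource_cache.hpp change above: the new member is a std::recursive_mutex because, per the added comment, registration can re-enter itself when directories are walked recursively. Locking a plain std::mutex a second time from the same thread is undefined behaviour and in practice usually hangs, which matches the deadlock named in the changelog. The sketch below shows the re-entrant pattern in miniature. The registry class, the dir_tree alias and the in-memory directory map are hypothetical and only stand in for the real cache and filesystem.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory "filesystem": directory -> sub-directories.
using dir_tree = std::map<std::string, std::vector<std::string>>;

class registry
{
public:
    explicit registry(dir_tree tree) : tree_(std::move(tree)) {}

    void register_directory(std::string const& dir)
    {
        // Re-entrant lock: the recursive call below acquires it again on the
        // same thread, which std::recursive_mutex permits.
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        if (!directories_.insert(dir).second) return; // already registered
        auto it = tree_.find(dir);
        if (it == tree_.end()) return; // no sub-directories
        for (auto const& child : it->second) register_directory(child);
    }

    std::size_t size() const
    {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        return directories_.size();
    }

private:
    dir_tree tree_;
    std::set<std::string> directories_;
    mutable std::recursive_mutex mutex_;
};
```

The lock is held for the whole walk, so other threads see either none or all of a directory tree registered.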
diff --git a/include/mapnik/marker_helpers.hpp b/include/mapnik/marker_helpers.hpp
index e6fdfa2..9183d6a 100644
--- a/include/mapnik/marker_helpers.hpp
+++ b/include/mapnik/marker_helpers.hpp
@@ -232,7 +232,6 @@ void apply_markers_multi(feature_impl const& feature, attributes const& vars, Co
for (geometry::polygon<double> const& poly : multi_poly)
{
box2d<double> bbox = geometry::envelope(poly);
- geometry::polygon_vertex_adapter<double> va(poly);
double area = bbox.width() * bbox.height();
if (area > maxarea)
{
diff --git a/include/mapnik/value.hpp b/include/mapnik/value.hpp
index 8a59b83..a56d76f 100644
--- a/include/mapnik/value.hpp
+++ b/include/mapnik/value.hpp
@@ -992,6 +992,10 @@ inline bool value::is_null() const
// support for std::unordered_xxx
namespace std
{
+
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wmismatched-tags"
+
template <>
struct hash<mapnik::value>
{
@@ -1001,6 +1005,8 @@ struct hash<mapnik::value>
}
};
+#pragma clang diagnostic pop
+
}
#endif // MAPNIK_VALUE_HPP
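Editor's note on the value.hpp change above: the hunk wraps the std::hash<mapnik::value> specialization in clang pragmas that silence -Wmismatched-tags, a warning raised when a specialization is written with struct while the primary template was declared with class, or the reverse. The sketch below shows the same pattern for a hypothetical type demo::value. The pragmas are guarded so that other compilers do not see them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

namespace demo {

// Hypothetical value type; it needs operator== to serve as an unordered key.
struct value
{
    std::string text;
};

inline bool operator==(value const& a, value const& b)
{
    return a.text == b.text;
}

} // namespace demo

namespace std {

#if defined(__clang__)
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wmismatched-tags"
#endif

// Specialising std::hash for a program-defined type is permitted.
template <>
struct hash<demo::value>
{
    size_t operator()(demo::value const& v) const
    {
        return hash<string>()(v.text);
    }
};

#if defined(__clang__)
#pragma clang diagnostic pop
#endif

} // namespace std
```

With the specialization in place, demo::value can be used directly as the key of std::unordered_map or std::unordered_set.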
diff --git a/include/mapnik/version.hpp b/include/mapnik/version.hpp
index 9b16040..642932b 100644
--- a/include/mapnik/version.hpp
+++ b/include/mapnik/version.hpp
@@ -27,7 +27,7 @@
#define MAPNIK_MAJOR_VERSION 3
#define MAPNIK_MINOR_VERSION 0
-#define MAPNIK_PATCH_VERSION 3
+#define MAPNIK_PATCH_VERSION 4
// translates to 300003
#define MAPNIK_VERSION (MAPNIK_MAJOR_VERSION*100000) + (MAPNIK_MINOR_VERSION*100) + (MAPNIK_PATCH_VERSION)
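Editor's note on the version.hpp change above: with the patch level raised to 4, the MAPNIK_VERSION formula gives 3*100000 + 0*100 + 4 = 300004. The unchanged context comment still reads 300003, which was the value for 3.0.3. The sketch below restates the arithmetic as a constexpr function so the compiler checks it. The name encode_version is hypothetical and not part of Mapnik.

```cpp
#include <cassert>

// Same arithmetic as the MAPNIK_VERSION macro: major*100000 + minor*100 + patch.
constexpr int encode_version(int major, int minor, int patch)
{
    return major * 100000 + minor * 100 + patch;
}

static_assert(encode_version(3, 0, 3) == 300003, "3.0.3 encodes to 300003");
static_assert(encode_version(3, 0, 4) == 300004, "3.0.4 encodes to 300004");
```

The macro body is a sum without enclosing parentheses, so an expression such as 2 * MAPNIK_VERSION multiplies only the first term. A function avoids that pitfall.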
diff --git a/plugins/input/csv/build.py b/plugins/input/csv/build.py
index d1f3716..c2beb24 100644
--- a/plugins/input/csv/build.py
+++ b/plugins/input/csv/build.py
@@ -30,6 +30,8 @@ plugin_env = plugin_base.Clone()
plugin_sources = Split(
"""
%(PLUGIN_NAME)s_datasource.cpp
+ %(PLUGIN_NAME)s_featureset.cpp
+ %(PLUGIN_NAME)s_inline_featureset.cpp
""" % locals()
)
diff --git a/plugins/input/csv/csv_datasource.cpp b/plugins/input/csv/csv_datasource.cpp
index fef1a51..a727524 100644
--- a/plugins/input/csv/csv_datasource.cpp
+++ b/plugins/input/csv/csv_datasource.cpp
@@ -20,34 +20,26 @@
*
*****************************************************************************/
-#include "csv_datasource.hpp"
#include "csv_utils.hpp"
-
+#include "csv_datasource.hpp"
+#include "csv_featureset.hpp"
+#include "csv_inline_featureset.hpp"
// boost
-#include <boost/tokenizer.hpp>
#include <boost/algorithm/string.hpp>
-
// mapnik
#include <mapnik/debug.hpp>
#include <mapnik/util/utf_conv_win.hpp>
#include <mapnik/unicode.hpp>
#include <mapnik/feature_layer_desc.hpp>
#include <mapnik/feature_factory.hpp>
-#include <mapnik/geometry.hpp>
-#include <mapnik/geometry_correct.hpp>
#include <mapnik/memory_featureset.hpp>
-#include <mapnik/wkt/wkt_factory.hpp>
-#include <mapnik/json/geometry_parser.hpp>
-#include <mapnik/util/conversions.hpp>
#include <mapnik/boolean.hpp>
#include <mapnik/util/trim.hpp>
#include <mapnik/util/geometry_to_ds_type.hpp>
#include <mapnik/value_types.hpp>
-
// stl
#include <sstream>
#include <fstream>
-#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
@@ -57,47 +49,31 @@ using mapnik::parameters;
DATASOURCE_PLUGIN(csv_datasource)
+
+namespace {
+
+using cvs_value = mapnik::util::variant<std::string, mapnik::value_integer, mapnik::value_double, mapnik::value_bool>;
+
+}
+
csv_datasource::csv_datasource(parameters const& params)
- : datasource(params),
+: datasource(params),
desc_(csv_datasource::name(), *params.get<std::string>("encoding", "utf-8")),
extent_(),
filename_(),
- inline_string_(),
- file_length_(0),
row_limit_(*params.get<mapnik::value_integer>("row_limit", 0)),
- features_(),
+ inline_string_(),
escape_(*params.get<std::string>("escape", "")),
separator_(*params.get<std::string>("separator", "")),
quote_(*params.get<std::string>("quote", "")),
headers_(),
manual_headers_(mapnik::util::trim_copy(*params.get<std::string>("headers", ""))),
strict_(*params.get<mapnik::boolean_type>("strict", false)),
- filesize_max_(*params.get<double>("filesize_max", 20.0)), // MB
ctx_(std::make_shared<mapnik::context_type>()),
- extent_initialized_(false)
+ extent_initialized_(false),
+ tree_(nullptr),
+ locator_()
{
- /* TODO:
- general:
- - refactor parser into generic class
- - tests of grid_renderer output
- - ensure that the attribute desc_ matches the first feature added
- alternate large file pipeline:
- - stat file, detect > 15 MB
- - build up csv line-by-line iterator
- - creates opportunity to filter attributes by map query
- speed:
- - add properties for wkt/json/lon/lat at parse time
- - add ability to pass 'filter' keyword to drop attributes at layer init
- - create quad tree on the fly for small/med size files
- - memory map large files for reading
- - smaller features (less memory overhead)
- usability:
- - enforce column names without leading digit
- - better error messages (add filepath) if not reading from string
- - move to spirit to tokenize and add character level error feedback:
- http://boost-spirit.com/home/articles/qi-example/tracking-the-input-position-while-parsing/
- */
-
boost::optional<std::string> ext = params.get<std::string>("extent");
if (ext && !ext->empty())
{
@@ -113,7 +89,6 @@ csv_datasource::csv_datasource(parameters const& params)
{
boost::optional<std::string> file = params.get<std::string>("file");
if (!file) throw mapnik::datasource_exception("CSV Plugin: missing <file> parameter");
-
boost::optional<std::string> base = params.get<std::string>("base");
if (base)
filename_ = *base + "/" + *file;
@@ -123,7 +98,7 @@ csv_datasource::csv_datasource(parameters const& params)
if (!inline_string_.empty())
{
std::istringstream in(inline_string_);
- parse_csv(in,escape_, separator_, quote_);
+ parse_csv(in, escape_, separator_, quote_);
}
else
{
@@ -136,13 +111,12 @@ csv_datasource::csv_datasource(parameters const& params)
{
throw mapnik::datasource_exception("CSV Plugin: could not open: '" + filename_ + "'");
}
- parse_csv(in,escape_, separator_, quote_);
+ parse_csv(in, escape_, separator_, quote_);
in.close();
}
}
-
-csv_datasource::~csv_datasource() { }
+csv_datasource::~csv_datasource() {}
template <typename T>
void csv_datasource::parse_csv(T & stream,
@@ -150,98 +124,28 @@ void csv_datasource::parse_csv(T & stream,
std::string const& separator,
std::string const& quote)
{
- stream.seekg(0, std::ios::end);
- file_length_ = stream.tellg();
-
- if (filesize_max_ > 0)
- {
- double file_mb = static_cast<double>(file_length_)/1048576;
-
- // throw if this is an unreasonably large file to read into memory
- if (file_mb > filesize_max_)
- {
- std::ostringstream s;
- s << "CSV Plugin: csv file is greater than ";
- s << filesize_max_ << "MB - you should use a more efficient data format like sqlite, postgis or a shapefile to render this data (set 'filesize_max=0' to disable this restriction if you have lots of memory)";
- throw mapnik::datasource_exception(s.str());
- }
- }
-
+ auto file_length = detail::file_length(stream);
// set back to start
stream.seekg(0, std::ios::beg);
-
- // autodetect newlines
- char newline = '\n';
- bool has_newline = false;
- for (unsigned lidx = 0; lidx < file_length_ && lidx < 4000; lidx++)
- {
- char c = static_cast<char>(stream.get());
- if (c == '\r')
- {
- newline = '\r';
- has_newline = true;
- break;
- }
- if (c == '\n')
- {
- has_newline = true;
- break;
- }
- }
-
+ char newline;
+ bool has_newline;
+ std::tie(newline, has_newline) = detail::autodect_newline(stream, file_length);
// set back to start
stream.seekg(0, std::ios::beg);
-
// get first line
std::string csv_line;
- std::getline(stream,csv_line,newline);
+ std::getline(stream,csv_line,stream.widen(newline));
// if user has not passed a separator manually
// then attempt to detect by reading first line
- std::string sep = mapnik::util::trim_copy(separator);
- if (sep.empty())
- {
- // default to ','
- sep = ",";
- int num_commas = std::count(csv_line.begin(), csv_line.end(), ',');
- // detect tabs
- int num_tabs = std::count(csv_line.begin(), csv_line.end(), '\t');
- if (num_tabs > 0)
- {
- if (num_tabs > num_commas)
- {
- sep = "\t";
- MAPNIK_LOG_DEBUG(csv) << "csv_datasource: auto detected tab separator";
- }
- }
- else // pipes
- {
- int num_pipes = std::count(csv_line.begin(), csv_line.end(), '|');
- if (num_pipes > num_commas)
- {
- sep = "|";
-
- MAPNIK_LOG_DEBUG(csv) << "csv_datasource: auto detected '|' separator";
- }
- else // semicolons
- {
- int num_semicolons = std::count(csv_line.begin(), csv_line.end(), ';');
- if (num_semicolons > num_commas)
- {
- sep = ";";
-
- MAPNIK_LOG_DEBUG(csv) << "csv_datasource: auto detected ';' separator";
- }
- }
- }
- }
+ std::string sep = mapnik::util::trim_copy(separator);
+ if (sep.empty()) sep = detail::detect_separator(csv_line);
+ separator_ = sep;
// set back to start
stream.seekg(0, std::ios::beg);
- using escape_type = boost::escaped_list_separator<char>;
-
std::string esc = mapnik::util::trim_copy(escape);
if (esc.empty()) esc = "\\";
@@ -251,104 +155,41 @@ void csv_datasource::parse_csv(T & stream,
MAPNIK_LOG_DEBUG(csv) << "csv_datasource: csv grammar: sep: '" << sep
<< "' quo: '" << quo << "' esc: '" << esc << "'";
- boost::escaped_list_separator<char> grammer;
- try
- {
- // grammer = boost::escaped_list_separator<char>('\\', ',', '\"');
- grammer = boost::escaped_list_separator<char>(esc, sep, quo);
- }
- catch(std::exception const& ex)
- {
- std::string s("CSV Plugin: ");
- s += ex.what();
- throw mapnik::datasource_exception(s);
- }
-
- using Tokenizer = boost::tokenizer< escape_type >;
-
int line_number = 1;
- bool has_wkt_field = false;
- bool has_json_field = false;
- bool has_lat_field = false;
- bool has_lon_field = false;
- unsigned wkt_idx = 0;
- unsigned json_idx = 0;
- unsigned lat_idx = 0;
- unsigned lon_idx = 0;
-
if (!manual_headers_.empty())
{
- Tokenizer tok(manual_headers_, grammer);
- Tokenizer::iterator beg = tok.begin();
- unsigned idx = 0;
- for (; beg != tok.end(); ++beg)
+ std::size_t index = 0;
+ auto headers = csv_utils::parse_line(manual_headers_, sep);
+ for (auto const& header : headers)
{
- std::string val = mapnik::util::trim_copy(*beg);
- std::string lower_val = val;
- std::transform(lower_val.begin(), lower_val.end(), lower_val.begin(), ::tolower);
- if (lower_val == "wkt"
- || (lower_val.find("geom") != std::string::npos))
- {
- wkt_idx = idx;
- has_wkt_field = true;
- }
- if (lower_val == "geojson")
- {
- json_idx = idx;
- has_json_field = true;
- }
- if (lower_val == "x"
- || lower_val == "lon"
- || lower_val == "lng"
- || lower_val == "long"
- || (lower_val.find("longitude") != std::string::npos))
- {
- lon_idx = idx;
- has_lon_field = true;
- }
- if (lower_val == "y"
- || lower_val == "lat"
- || (lower_val.find("latitude") != std::string::npos))
- {
- lat_idx = idx;
- has_lat_field = true;
- }
- ++idx;
+ std::string val = mapnik::util::trim_copy(header);
+ detail::locate_geometry_column(val, index++, locator_);
headers_.push_back(val);
}
}
else // parse first line as headers
{
- while (std::getline(stream,csv_line,newline))
+ while (std::getline(stream,csv_line,stream.widen(newline)))
{
try
{
- Tokenizer tok(csv_line, grammer);
- Tokenizer::iterator beg = tok.begin();
- std::string val;
- if (beg != tok.end())
- val = mapnik::util::trim_copy(*beg);
-
+ auto headers = csv_utils::parse_line(csv_line, sep);
// skip blank lines
- if (val.empty())
- {
- // do nothing
- ++line_number;
- }
+ std::string val;
+ if (headers.size() > 0 && headers[0].empty()) ++line_number;
else
{
- int idx = -1;
- for (; beg != tok.end(); ++beg)
+ std::size_t index = 0;
+ for (auto const& header : headers)
{
- ++idx;
- val = mapnik::util::trim_copy(*beg);
+ val = mapnik::util::trim_copy(header);
if (val.empty())
{
if (strict_)
{
std::ostringstream s;
s << "CSV Plugin: expected a column header at line ";
- s << line_number << ", column " << idx;
+ s << line_number << ", column " << index;
s << " - ensure this row contains valid header fields: '";
s << csv_line << "'\n";
throw mapnik::datasource_exception(s.str());
@@ -357,49 +198,22 @@ void csv_datasource::parse_csv(T & stream,
{
// create a placeholder for the empty header
std::ostringstream s;
- s << "_" << idx;
+ s << "_" << index;
headers_.push_back(s.str());
}
}
else
{
- std::string lower_val = val;
- std::transform(lower_val.begin(), lower_val.end(), lower_val.begin(), ::tolower);
- if (lower_val == "wkt"
- || (lower_val.find("geom") != std::string::npos))
- {
- wkt_idx = idx;
- has_wkt_field = true;
- }
- if (lower_val == "geojson")
- {
- json_idx = idx;
- has_json_field = true;
- }
- if (lower_val == "x"
- || lower_val == "lon"
- || lower_val == "lng"
- || lower_val == "long"
- || (lower_val.find("longitude") != std::string::npos))
- {
- lon_idx = idx;
- has_lon_field = true;
- }
- if (lower_val == "y"
- || lower_val == "lat"
- || (lower_val.find("latitude") != std::string::npos))
- {
- lat_idx = idx;
- has_lat_field = true;
- }
+ detail::locate_geometry_column(val, index, locator_);
headers_.push_back(val);
}
+ ++index;
}
++line_number;
break;
}
}
- catch(const std::exception & ex)
+ catch (std::exception const& ex)
{
std::string s("CSV Plugin: error parsing headers: ");
s += ex.what();
@@ -408,16 +222,16 @@ void csv_datasource::parse_csv(T & stream,
}
}
- if (!has_wkt_field && !has_json_field && (!has_lon_field || !has_lat_field) )
+ if (locator_.type == detail::geometry_column_locator::UNKNOWN)
{
- throw mapnik::datasource_exception("CSV Plugin: could not detect column headers with the name of wkt, geojson, x/y, or latitude/longitude - this is required for reading geometry data");
+ throw mapnik::datasource_exception("CSV Plugin: could not detect column headers with the name of wkt, geojson, x/y, or "
+ "latitude/longitude - this is required for reading geometry data");
}
mapnik::value_integer feature_count = 0;
bool extent_started = false;
std::size_t num_headers = headers_.size();
-
std::for_each(headers_.begin(), headers_.end(),
[ & ](std::string const& header){ ctx_->push(header); });
@@ -434,15 +248,20 @@ void csv_datasource::parse_csv(T & stream,
is_first_row = true;
}
}
- while (std::getline(stream,csv_line,newline) || is_first_row)
+
+ std::vector<item_type> boxes;
+ auto pos = stream.tellg();
+ while (std::getline(stream, csv_line, stream.widen(newline)) || is_first_row)
{
- is_first_row = false;
- if ((row_limit_ > 0) && (line_number > row_limit_))
+ if ((row_limit_ > 0) && (line_number++ > row_limit_))
{
MAPNIK_LOG_DEBUG(csv) << "csv_datasource: row limit hit, exiting at feature: " << feature_count;
break;
}
-
+ auto record_offset = pos;
+ auto record_size = csv_line.length();
+ pos = stream.tellg();
+ is_first_row = false;
// skip blank lines
unsigned line_length = csv_line.length();
if (line_length <= 10)
@@ -451,7 +270,6 @@ void csv_datasource::parse_csv(T & stream,
boost::trim_if(trimmed,boost::algorithm::is_any_of("\",'\r\n "));
if (trimmed.empty())
{
- ++line_number;
MAPNIK_LOG_DEBUG(csv) << "csv_datasource: empty row encountered at line: " << line_number;
continue;
}
@@ -459,17 +277,8 @@ void csv_datasource::parse_csv(T & stream,
try
{
- // special handling for varieties of quoting that we will enounter with json
- // TODO - test with custom "quo" option
- if (has_json_field && (quo == "\"") && (std::count(csv_line.begin(), csv_line.end(), '"') >= 6))
- {
- csv_utils::fix_json_quoting(csv_line);
- }
-
- Tokenizer tok(csv_line, grammer);
- Tokenizer::iterator beg = tok.begin();
-
- unsigned num_fields = std::distance(beg,tok.end());
+ auto values = csv_utils::parse_line(csv_line, sep);
+ unsigned num_fields = values.size();
if (num_fields > num_headers)
{
std::ostringstream s;
@@ -494,378 +303,108 @@ void csv_datasource::parse_csv(T & stream,
}
}
- // NOTE: we use ++feature_count here because feature id's should start at 1;
- mapnik::feature_ptr feature(mapnik::feature_factory::create(ctx_,++feature_count));
- double x = 0;
- double y = 0;
- bool parsed_x = false;
- bool parsed_y = false;
- bool parsed_wkt = false;
- bool parsed_json = false;
- std::vector<std::string> collected;
- for (unsigned i = 0; i < num_headers; ++i)
+ auto geom = detail::extract_geometry(values, locator_);
+ if (!geom.is<mapnik::geometry::geometry_empty>())
{
- std::string fld_name(headers_.at(i));
- collected.push_back(fld_name);
- std::string value;
- if (beg == tok.end()) // there are more headers than column values for this row
+ auto box = mapnik::geometry::envelope(geom);
+ boxes.emplace_back(std::move(box), make_pair(record_offset, record_size));
+ if (!extent_initialized_)
{
- // add an empty string here to represent a missing value
- // not using null type here since nulls are not a csv thing
- feature->put(fld_name,tr.transcode(value.c_str()));
- if (feature_count == 1)
+ if (!extent_started)
{
- desc_.add_descriptor(mapnik::attribute_descriptor(fld_name,mapnik::String));
+ extent_started = true;
+ extent_ = mapnik::geometry::envelope(geom);
}
- // continue here instead of break so that all missing values are
- // encoded consistenly as empty strings
- continue;
- }
- else
- {
- value = mapnik::util::trim_copy(*beg);
- ++beg;
- }
-
- int value_length = value.length();
-
- // parse wkt
- if (has_wkt_field)
- {
- if (i == wkt_idx)
- {
- // skip empty geoms
- if (value.empty())
- {
- break;
- }
- mapnik::geometry::geometry<double> geom;
- if (mapnik::from_wkt(value, geom))
- {
- // correct orientations etc
- mapnik::geometry::correct(geom);
- // set geometry
- feature->set_geometry(std::move(geom));
- parsed_wkt = true;
- }
- else
- {
- std::ostringstream s;
- s << "CSV Plugin: expected well known text geometry: could not parse row "
- << line_number
- << ",column "
- << i << " - found: '"
- << value << "'";
- if (strict_)
- {
- throw mapnik::datasource_exception(s.str());
- }
- else
- {
- MAPNIK_LOG_ERROR(csv) << s.str();
- }
- }
- }
- }
- // TODO - support both wkt/geojson columns
- // at once to create multi-geoms?
- // parse as geojson
- else if (has_json_field)
- {
- if (i == json_idx)
+ else
{
- // skip empty geoms
- if (value.empty())
- {
- break;
- }
- mapnik::geometry::geometry<double> geom;
- if (mapnik::json::from_geojson(value, geom))
- {
- feature->set_geometry(std::move(geom));
- parsed_json = true;
- }
- else
- {
- std::ostringstream s;
- s << "CSV Plugin: expected geojson geometry: could not parse row "
- << line_number
- << ",column "
- << i << " - found: '"
- << value << "'";
- if (strict_)
- {
- throw mapnik::datasource_exception(s.str());
- }
- else
- {
- MAPNIK_LOG_ERROR(csv) << s.str();
- }
- }
+ extent_.expand_to_include(mapnik::geometry::envelope(geom));
}
}
- else
+ if (++feature_count != 1) continue;
+ auto beg = values.begin();
+ auto end = values.end();
+ for (std::size_t i = 0; i < num_headers; ++i)
{
- // longitude
- if (i == lon_idx)
- {
- // skip empty geoms
- if (value.empty())
- {
- break;
- }
-
- if (mapnik::util::string2double(value,x))
- {
- parsed_x = true;
- }
- else
- {
- std::ostringstream s;
- s << "CSV Plugin: expected a float value for longitude: could not parse row "
- << line_number
- << ", column "
- << i << " - found: '"
- << value << "'";
- if (strict_)
- {
- throw mapnik::datasource_exception(s.str());
- }
- else
- {
- MAPNIK_LOG_ERROR(csv) << s.str();
- }
- }
- }
- // latitude
- else if (i == lat_idx)
+ std::string const& header = headers_.at(i);
+ if (beg == end) // there are more headers than column values for this row
{
- // skip empty geoms
- if (value.empty())
- {
- break;
- }
-
- if (mapnik::util::string2double(value,y))
+ // add an empty string here to represent a missing value
+ // not using null type here since nulls are not a csv thing
+ if (feature_count == 1)
{
- parsed_y = true;
- }
- else
- {
- std::ostringstream s;
- s << "CSV Plugin: expected a float value for latitude: could not parse row "
- << line_number
- << ", column "
- << i << " - found: '"
- << value << "'";
- if (strict_)
- {
- throw mapnik::datasource_exception(s.str());
- }
- else
- {
- MAPNIK_LOG_ERROR(csv) << s.str();
- }
+ desc_.add_descriptor(mapnik::attribute_descriptor(header, mapnik::String));
}
+ // continue here instead of break so that all missing values are
+ // encoded consistently as empty strings
+ continue;
}
- }
-
- // now, add attributes, skipping any WKT or JSON fields
- if ((has_wkt_field) && (i == wkt_idx)) continue;
- if ((has_json_field) && (i == json_idx)) continue;
- /* First we detect likely strings,
- then try parsing likely numbers,
- then try converting to bool,
- finally falling back to string type.
- An empty string or a string of "null" will be parsed
- as a string rather than a true null value.
- Likely strings are either empty values, very long values
- or values with leading zeros like 001 (which are not safe
- to assume are numbers)
- */
-
- bool matched = false;
- bool has_dot = value.find(".") != std::string::npos;
- if (value.empty() ||
- (value_length > 20) ||
- (value_length > 1 && !has_dot && value[0] == '0'))
- {
- matched = true;
- feature->put(fld_name,std::move(tr.transcode(value.c_str())));
- if (feature_count == 1)
+ std::string value = mapnik::util::trim_copy(*beg++);
+ int value_length = value.length();
+ if (locator_.index == i && (locator_.type == detail::geometry_column_locator::WKT
+ || locator_.type == detail::geometry_column_locator::GEOJSON)) continue;
+
+ // First we detect likely strings,
+ // then try parsing likely numbers,
+ // then try converting to bool,
+ // finally falling back to string type.
+
+ // An empty string or a string of "null" will be parsed
+ // as a string rather than a true null value.
+ // Likely strings are either empty values, very long values
+ // or values with leading zeros like 001 (which are not safe
+ // to assume are numbers)
+
+ bool matched = false;
+ bool has_dot = value.find(".") != std::string::npos;
+ if (value.empty() || (value_length > 20) || (value_length > 1 && !has_dot && value[0] == '0'))
{
- desc_.add_descriptor(mapnik::attribute_descriptor(fld_name,mapnik::String));
+ matched = true;
+ desc_.add_descriptor(mapnik::attribute_descriptor(header, mapnik::String));
}
- }
- else if (csv_utils::is_likely_number(value))
- {
- bool has_e = value.find("e") != std::string::npos;
- if (has_dot || has_e)
+ else if (csv_utils::is_likely_number(value))
{
- double float_val = 0.0;
- if (mapnik::util::string2double(value,float_val))
+ bool has_e = value.find("e") != std::string::npos;
+ if (has_dot || has_e)
{
- matched = true;
- feature->put(fld_name,float_val);
- if (feature_count == 1)
+ double float_val = 0.0;
+ if (mapnik::util::string2double(value,float_val))
{
- desc_.add_descriptor(
- mapnik::attribute_descriptor(
- fld_name,mapnik::Double));
+ matched = true;
+ desc_.add_descriptor(mapnik::attribute_descriptor(header,mapnik::Double));
}
}
- }
- else
- {
- mapnik::value_integer int_val = 0;
- if (mapnik::util::string2int(value,int_val))
+ else
{
- matched = true;
- feature->put(fld_name,int_val);
- if (feature_count == 1)
+ mapnik::value_integer int_val = 0;
+ if (mapnik::util::string2int(value,int_val))
{
- desc_.add_descriptor(
- mapnik::attribute_descriptor(
- fld_name,mapnik::Integer));
+ matched = true;
+ desc_.add_descriptor(mapnik::attribute_descriptor(header,mapnik::Integer));
}
}
}
- }
- if (!matched)
- {
- // NOTE: we don't use mapnik::util::string2bool
- // here because we don't want to treat 'on' and 'off'
- // as booleans, only 'true' and 'false'
- bool bool_val = false;
- std::string lower_val = value;
- std::transform(lower_val.begin(), lower_val.end(), lower_val.begin(), ::tolower);
- if (lower_val == "true")
+ if (!matched)
{
- matched = true;
- bool_val = true;
- }
- else if (lower_val == "false")
- {
- matched = true;
- bool_val = false;
- }
- if (matched)
- {
- feature->put(fld_name,bool_val);
- if (feature_count == 1)
+ // NOTE: we don't use mapnik::util::string2bool
+ // here because we don't want to treat 'on' and 'off'
+ // as booleans, only 'true' and 'false'
+ if (csv_utils::ignore_case_equal(value, "true") || csv_utils::ignore_case_equal(value, "false"))
{
- desc_.add_descriptor(
- mapnik::attribute_descriptor(
- fld_name,mapnik::Boolean));
+ desc_.add_descriptor(mapnik::attribute_descriptor(header, mapnik::Boolean));
}
- }
- else
- {
- // fallback to normal string
- feature->put(fld_name,std::move(tr.transcode(value.c_str())));
- if (feature_count == 1)
+ else // fallback to normal string
{
- desc_.add_descriptor(
- mapnik::attribute_descriptor(
- fld_name,mapnik::String));
+ desc_.add_descriptor(mapnik::attribute_descriptor(header, mapnik::String));
}
}
}
}
-
- bool null_geom = true;
- if (has_wkt_field || has_json_field)
- {
- if (parsed_wkt || parsed_json)
- {
- if (!extent_initialized_)
- {
- if (!extent_started)
- {
- extent_started = true;
- extent_ = feature->envelope();
- }
- else
- {
- extent_.expand_to_include(feature->envelope());
- }
- }
- features_.push_back(feature);
- null_geom = false;
- }
- else
- {
- std::ostringstream s;
- s << "CSV Plugin: could not read WKT or GeoJSON geometry "
- << "for line " << line_number << " - found " << headers_.size()
- << " with values like: " << csv_line << "\n";
- if (strict_)
- {
- throw mapnik::datasource_exception(s.str());
- }
- else
- {
- MAPNIK_LOG_ERROR(csv) << s.str();
- continue;
- }
- }
- }
- else if (has_lat_field || has_lon_field)
- {
- if (parsed_x && parsed_y)
- {
- mapnik::geometry::point<double> pt(x,y);
- feature->set_geometry(std::move(pt));
- features_.push_back(feature);
- null_geom = false;
- if (!extent_initialized_)
- {
- if (!extent_started)
- {
- extent_started = true;
- extent_ = feature->envelope();
- }
- else
- {
- extent_.expand_to_include(feature->envelope());
- }
- }
- }
- else if (parsed_x || parsed_y)
- {
- std::ostringstream s;
- s << "CSV Plugin: does your csv have valid headers?\n";
- if (!parsed_x)
- {
- s << "Could not detect or parse any rows named 'x' or 'longitude' "
- << "for line " << line_number << " but found " << headers_.size()
- << " with values like: " << csv_line << "\n"
- << "for: " << boost::algorithm::join(collected, ",") << "\n";
- }
- if (!parsed_y)
- {
- s << "Could not detect or parse any rows named 'y' or 'latitude' "
- << "for line " << line_number << " but found " << headers_.size()
- << " with values like: " << csv_line << "\n"
- << "for: " << boost::algorithm::join(collected, ",") << "\n";
- }
- if (strict_)
- {
- throw mapnik::datasource_exception(s.str());
- }
- else
- {
- MAPNIK_LOG_ERROR(csv) << s.str();
- continue;
- }
- }
- }
-
- if (null_geom)
+ else
{
std::ostringstream s;
- s << "CSV Plugin: could not detect and parse valid lat/lon fields or wkt/json geometry for line "
- << line_number;
+ s << "CSV Plugin: expected geometry column: could not parse row "
+ << line_number << " - found: '"
+ << values[locator_.index] << "'";
if (strict_)
{
throw mapnik::datasource_exception(s.str());
@@ -873,27 +412,18 @@ void csv_datasource::parse_csv(T & stream,
else
{
MAPNIK_LOG_ERROR(csv) << s.str();
- // with no geometry we will never
- // add this feature so drop the count
- feature_count--;
- continue;
}
}
-
- ++line_number;
}
- catch(mapnik::datasource_exception const& ex )
+ catch (mapnik::datasource_exception const& ex )
{
- if (strict_)
- {
- throw mapnik::datasource_exception(ex.what());
- }
+ if (strict_) throw;
else
{
MAPNIK_LOG_ERROR(csv) << ex.what();
}
}
- catch(std::exception const& ex)
+ catch (std::exception const& ex)
{
std::ostringstream s;
s << "CSV Plugin: unexpected error parsing line: " << line_number
@@ -909,10 +439,8 @@ void csv_datasource::parse_csv(T & stream,
}
}
}
- if (feature_count < 1)
- {
- MAPNIK_LOG_ERROR(csv) << "CSV Plugin: could not parse any lines of data";
- }
+ // bulk insert initialise r-tree
+ tree_ = std::make_unique<spatial_index_type>(boxes);
}
const char * csv_datasource::name()
@@ -939,19 +467,58 @@ boost::optional<mapnik::datasource_geometry_t> csv_datasource::get_geometry_type
{
boost::optional<mapnik::datasource_geometry_t> result;
int multi_type = 0;
- unsigned num_features = features_.size();
- for (unsigned i = 0; i < num_features && i < 5; ++i)
+ auto itr = tree_->qbegin(boost::geometry::index::intersects(extent_));
+ auto end = tree_->qend();
+ mapnik::context_ptr ctx = std::make_shared<mapnik::context_type>();
+ for (std::size_t count = 0; itr != end && count < 5; ++itr, ++count)
{
- result = mapnik::util::to_ds_type(features_[i]->get_geometry());
- if (result)
+ csv_datasource::item_type const& item = *itr;
+ std::size_t file_offset = item.second.first;
+ std::size_t size = item.second.second;
+
+ std::string str;
+ if (inline_string_.empty())
{
- int type = static_cast<int>(*result);
- if (multi_type > 0 && multi_type != type)
+#if defined (_WINDOWS)
+ std::ifstream in(mapnik::utf8_to_utf16(filename_),std::ios_base::in | std::ios_base::binary);
+#else
+ std::ifstream in(filename_.c_str(),std::ios_base::in | std::ios_base::binary);
+#endif
+ if (!in.is_open())
{
- result.reset(mapnik::datasource_geometry_t::Collection);
- return result;
+ throw mapnik::datasource_exception("CSV Plugin: could not open: '" + filename_ + "'");
}
- multi_type = type;
+ in.seekg(file_offset);
+ std::vector<char> record;
+ record.resize(size);
+ in.read(record.data(), size);
+ str = std::string(record.begin(), record.end());
+ }
+ else
+ {
+ str = inline_string_.substr(file_offset, size);
+ }
+
+ try
+ {
+ auto values = csv_utils::parse_line(str, separator_);
+ auto geom = detail::extract_geometry(values, locator_);
+ result = mapnik::util::to_ds_type(geom);
+ if (result)
+ {
+ int type = static_cast<int>(*result);
+ if (multi_type > 0 && multi_type != type)
+ {
+ result.reset(mapnik::datasource_geometry_t::Collection);
+ return result;
+ }
+ multi_type = type;
+ }
+ }
+ catch (std::exception const& ex)
+ {
+ if (strict_) throw;
+ else MAPNIK_LOG_ERROR(csv) << ex.what();
}
}
return result;
@@ -959,32 +526,61 @@ boost::optional<mapnik::datasource_geometry_t> csv_datasource::get_geometry_type
mapnik::featureset_ptr csv_datasource::features(mapnik::query const& q) const
{
- const std::set<std::string>& attribute_names = q.property_names();
- std::set<std::string>::const_iterator pos = attribute_names.begin();
- while (pos != attribute_names.end())
+
+ for (auto const& name : q.property_names())
{
bool found_name = false;
- for (std::size_t i = 0; i < headers_.size(); ++i)
+ for (auto const& header : headers_)
{
- if (headers_[i] == *pos)
+ if (header == name)
{
found_name = true;
break;
}
}
- if (! found_name)
+ if (!found_name)
{
std::ostringstream s;
- s << "CSV Plugin: no attribute '" << *pos << "'. Valid attributes are: "
+ s << "CSV Plugin: no attribute '" << name << "'. Valid attributes are: "
<< boost::algorithm::join(headers_, ",") << ".";
throw mapnik::datasource_exception(s.str());
}
- ++pos;
}
- return std::make_shared<mapnik::memory_featureset>(q.get_bbox(),features_);
+
+ mapnik::box2d<double> const& box = q.get_bbox();
+ if (extent_.intersects(box))
+ {
+ csv_featureset::array_type index_array;
+ if (tree_)
+ {
+ tree_->query(boost::geometry::index::intersects(box),std::back_inserter(index_array));
+ std::sort(index_array.begin(),index_array.end(),
+ [] (item_type const& item0, item_type const& item1)
+ {
+ return item0.second.first < item1.second.first;
+ });
+ if (inline_string_.empty())
+ {
+ return std::make_shared<csv_featureset>(filename_, locator_, separator_, headers_, ctx_, std::move(index_array));
+ }
+ else
+ {
+ return std::make_shared<csv_inline_featureset>(inline_string_, locator_, separator_, headers_, ctx_, std::move(index_array));
+ }
+ }
+ }
+ return mapnik::featureset_ptr();
}
mapnik::featureset_ptr csv_datasource::features_at_point(mapnik::coord2d const& pt, double tol) const
{
- throw mapnik::datasource_exception("CSV Plugin: features_at_point is not supported yet");
+ mapnik::box2d<double> query_bbox(pt, pt);
+ query_bbox.pad(tol);
+ mapnik::query q(query_bbox);
+ std::vector<mapnik::attribute_descriptor> const& desc = desc_.get_descriptors();
+ for (auto const& item : desc)
+ {
+ q.add_property_name(item.get_name());
+ }
+ return features(q);
}
diff --git a/plugins/input/csv/csv_datasource.hpp b/plugins/input/csv/csv_datasource.hpp
index 7881af8..0c8864b 100644
--- a/plugins/input/csv/csv_datasource.hpp
+++ b/plugins/input/csv/csv_datasource.hpp
@@ -35,15 +35,51 @@
// boost
#include <boost/optional.hpp>
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wunused-parameter"
+#pragma GCC diagnostic ignored "-Wunused-variable"
+#pragma GCC diagnostic ignored "-Wunused-local-typedef"
+#pragma GCC diagnostic ignored "-Wshadow"
+#pragma GCC diagnostic ignored "-Wsign-conversion"
+#pragma GCC diagnostic ignored "-Wconversion"
+#include <boost/version.hpp>
+#include <boost/geometry/index/rtree.hpp>
+#pragma GCC diagnostic pop
// stl
#include <vector>
#include <deque>
#include <string>
+template <std::size_t Max, std::size_t Min>
+struct csv_linear : boost::geometry::index::linear<Max,Min> {};
+
+namespace boost { namespace geometry { namespace index { namespace detail { namespace rtree {
+
+template <std::size_t Max, std::size_t Min>
+struct options_type<csv_linear<Max,Min> >
+{
+ using type = options<csv_linear<Max, Min>,
+ insert_default_tag,
+ choose_by_content_diff_tag,
+ split_default_tag,
+ linear_tag,
+#if BOOST_VERSION >= 105700
+ node_variant_static_tag>;
+#else
+ node_s_mem_static_tag>;
+
+#endif
+};
+}}}}}
+
class csv_datasource : public mapnik::datasource
{
public:
+ using box_type = mapnik::box2d<double>;
+ using item_type = std::pair<box_type, std::pair<std::size_t, std::size_t>>;
+ using spatial_index_type = boost::geometry::index::rtree<item_type,csv_linear<16,4>>;
+
csv_datasource(mapnik::parameters const& params);
virtual ~csv_datasource ();
mapnik::datasource::datasource_t type() const;
@@ -63,19 +99,18 @@ private:
mapnik::layer_descriptor desc_;
mapnik::box2d<double> extent_;
std::string filename_;
- std::string inline_string_;
- unsigned file_length_;
mapnik::value_integer row_limit_;
- std::deque<mapnik::feature_ptr> features_;
+ std::string inline_string_;
std::string escape_;
std::string separator_;
std::string quote_;
std::vector<std::string> headers_;
std::string manual_headers_;
bool strict_;
- double filesize_max_;
mapnik::context_ptr ctx_;
bool extent_initialized_;
+ std::unique_ptr<spatial_index_type> tree_;
+ detail::geometry_column_locator locator_;
};
#endif // MAPNIK_CSV_DATASOURCE_HPP
diff --git a/plugins/input/geojson/large_geojson_featureset.cpp b/plugins/input/csv/csv_featureset.cpp
similarity index 58%
copy from plugins/input/geojson/large_geojson_featureset.cpp
copy to plugins/input/csv/csv_featureset.cpp
index 1df7dce..4a9e74a 100644
--- a/plugins/input/geojson/large_geojson_featureset.cpp
+++ b/plugins/input/csv/csv_featureset.cpp
@@ -21,62 +21,66 @@
*****************************************************************************/
// mapnik
+#include "csv_featureset.hpp"
+#include <mapnik/debug.hpp>
#include <mapnik/feature.hpp>
#include <mapnik/feature_factory.hpp>
-#include <mapnik/json/geometry_grammar.hpp>
-#include <mapnik/json/feature_grammar.hpp>
#include <mapnik/util/utf_conv_win.hpp>
// stl
#include <string>
#include <vector>
#include <deque>
-#include "large_geojson_featureset.hpp"
-
-large_geojson_featureset::large_geojson_featureset(std::string const& filename,
- array_type && index_array)
-:
+csv_featureset::csv_featureset(std::string const& filename, detail::geometry_column_locator const& locator, std::string const& separator,
+ std::vector<std::string> const& headers, mapnik::context_ptr const& ctx, array_type && index_array)
+ :
#ifdef _WINDOWS
file_(_wfopen(mapnik::utf8_to_utf16(filename).c_str(), L"rb"), std::fclose),
#else
file_(std::fopen(filename.c_str(),"rb"), std::fclose),
#endif
+ separator_(separator),
+ headers_(headers),
index_array_(std::move(index_array)),
index_itr_(index_array_.begin()),
index_end_(index_array_.end()),
- ctx_(std::make_shared<mapnik::context_type>())
+ ctx_(ctx),
+ locator_(locator),
+ tr_("utf8")
{
if (!file_) throw std::runtime_error("Can't open " + filename);
}
-large_geojson_featureset::~large_geojson_featureset() {}
+csv_featureset::~csv_featureset() {}
-mapnik::feature_ptr large_geojson_featureset::next()
+mapnik::feature_ptr csv_featureset::parse_feature(char const* beg, char const* end)
+{
+ auto values = csv_utils::parse_line(beg, end, separator_, headers_.size());
+ auto geom = detail::extract_geometry(values, locator_);
+ if (!geom.is<mapnik::geometry::geometry_empty>())
+ {
+ mapnik::feature_ptr feature(mapnik::feature_factory::create(ctx_, ++feature_id_));
+ feature->set_geometry(std::move(geom));
+ detail::process_properties(*feature, headers_, values, locator_, tr_);
+ return feature;
+ }
+ return mapnik::feature_ptr();
+}
+
+mapnik::feature_ptr csv_featureset::next()
{
if (index_itr_ != index_end_)
{
- geojson_datasource::item_type const& item = *index_itr_++;
+ csv_datasource::item_type const& item = *index_itr_++;
std::size_t file_offset = item.second.first;
std::size_t size = item.second.second;
std::fseek(file_.get(), file_offset, SEEK_SET);
- std::vector<char> json;
- json.resize(size);
- std::fread(json.data(), size, 1, file_.get());
-
- using chr_iterator_type = char const*;
- chr_iterator_type start = json.data();
- chr_iterator_type end = start + json.size();
-
- static const mapnik::transcoder tr("utf8");
- static const mapnik::json::feature_grammar<chr_iterator_type,mapnik::feature_impl> grammar(tr);
- using namespace boost::spirit;
- standard::space_type space;
- mapnik::feature_ptr feature(mapnik::feature_factory::create(ctx_,1));
- if (!qi::phrase_parse(start, end, (grammar)(boost::phoenix::ref(*feature)), space))
- {
- throw std::runtime_error("Failed to parse geojson feature");
- }
- return feature;
+ std::vector<char> record;
+ record.resize(size);
+ std::fread(record.data(), size, 1, file_.get());
+ auto const* start = record.data();
+ auto const* end = start + record.size();
+ return parse_feature(start, end);
}
return mapnik::feature_ptr();
}
diff --git a/plugins/input/geojson/large_geojson_featureset.hpp b/plugins/input/csv/csv_featureset.hpp
similarity index 59%
copy from plugins/input/geojson/large_geojson_featureset.hpp
copy to plugins/input/csv/csv_featureset.hpp
index a67eec5..1fc2103 100644
--- a/plugins/input/geojson/large_geojson_featureset.hpp
+++ b/plugins/input/csv/csv_featureset.hpp
@@ -20,35 +20,43 @@
*
*****************************************************************************/
-#ifndef LARGE_GEOJSON_FEATURESET_HPP
-#define LARGE_GEOJSON_FEATURESET_HPP
+#ifndef CSV_FEATURESET_HPP
+#define CSV_FEATURESET_HPP
#include <mapnik/feature.hpp>
-#include "geojson_datasource.hpp"
-
-#include <vector>
+#include <mapnik/unicode.hpp>
+#include "csv_utils.hpp"
+#include "csv_datasource.hpp"
#include <deque>
-#include <fstream>
#include <cstdio>
-class large_geojson_featureset : public mapnik::Featureset
+class csv_featureset : public mapnik::Featureset
{
-public:
- using array_type = std::deque<geojson_datasource::item_type>;
using file_ptr = std::unique_ptr<std::FILE, int (*)(std::FILE *)>;
-
- large_geojson_featureset(std::string const& filename,
- array_type && index_array);
- virtual ~large_geojson_featureset();
+ using locator_type = detail::geometry_column_locator;
+public:
+ using array_type = std::deque<csv_datasource::item_type>;
+ csv_featureset(std::string const& filename,
+ locator_type const& locator,
+ std::string const& separator,
+ std::vector<std::string> const& headers,
+ mapnik::context_ptr const& ctx,
+ array_type && index_array);
+ ~csv_featureset();
mapnik::feature_ptr next();
-
private:
+ mapnik::feature_ptr parse_feature(char const* beg, char const* end);
file_ptr file_;
-
+ std::string const& separator_;
+ std::vector<std::string> const& headers_;
const array_type index_array_;
array_type::const_iterator index_itr_;
array_type::const_iterator index_end_;
mapnik::context_ptr ctx_;
+ mapnik::value_integer feature_id_ = 0;
+ detail::geometry_column_locator const& locator_;
+ mapnik::transcoder tr_;
};
-#endif // LARGE_GEOJSON_FEATURESET_HPP
+
+#endif // CSV_FEATURESET_HPP
diff --git a/plugins/input/csv/csv_inline_featureset.cpp b/plugins/input/csv/csv_inline_featureset.cpp
new file mode 100644
index 0000000..29b2203
--- /dev/null
+++ b/plugins/input/csv/csv_inline_featureset.cpp
@@ -0,0 +1,78 @@
+/*****************************************************************************
+ *
+ * This file is part of Mapnik (c++ mapping toolkit)
+ *
+ * Copyright (C) 2015 Artem Pavlenko
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ *****************************************************************************/
+
+// mapnik
+#include "csv_inline_featureset.hpp"
+#include <mapnik/debug.hpp>
+#include <mapnik/feature.hpp>
+#include <mapnik/feature_factory.hpp>
+#include <mapnik/util/utf_conv_win.hpp>
+#include <mapnik/util/trim.hpp>
+// stl
+#include <string>
+#include <vector>
+#include <deque>
+
+csv_inline_featureset::csv_inline_featureset(std::string const& inline_string,
+ detail::geometry_column_locator const& locator,
+ std::string const& separator,
+ std::vector<std::string> const& headers,
+ mapnik::context_ptr const& ctx,
+ array_type && index_array)
+ : inline_string_(inline_string),
+ separator_(separator),
+ headers_(headers),
+ index_array_(std::move(index_array)),
+ index_itr_(index_array_.begin()),
+ index_end_(index_array_.end()),
+ ctx_(ctx),
+ locator_(locator),
+ tr_("utf8") {}
+
+csv_inline_featureset::~csv_inline_featureset() {}
+
+mapnik::feature_ptr csv_inline_featureset::parse_feature(std::string const& str)
+{
+ auto values = csv_utils::parse_line(str, separator_);
+ auto geom = detail::extract_geometry(values, locator_);
+ if (!geom.is<mapnik::geometry::geometry_empty>())
+ {
+ mapnik::feature_ptr feature(mapnik::feature_factory::create(ctx_, ++feature_id_));
+ feature->set_geometry(std::move(geom));
+ detail::process_properties(*feature, headers_, values, locator_, tr_);
+ return feature;
+ }
+ return mapnik::feature_ptr();
+}
+
+mapnik::feature_ptr csv_inline_featureset::next()
+{
+ if (index_itr_ != index_end_)
+ {
+ csv_datasource::item_type const& item = *index_itr_++;
+ std::size_t file_offset = item.second.first;
+ std::size_t size = item.second.second;
+ std::string str = inline_string_.substr(file_offset, size);
+ return parse_feature(str);
+ }
+ return mapnik::feature_ptr();
+}
diff --git a/plugins/input/geojson/large_geojson_featureset.hpp b/plugins/input/csv/csv_inline_featureset.hpp
similarity index 55%
copy from plugins/input/geojson/large_geojson_featureset.hpp
copy to plugins/input/csv/csv_inline_featureset.hpp
index a67eec5..9e06be8 100644
--- a/plugins/input/geojson/large_geojson_featureset.hpp
+++ b/plugins/input/csv/csv_inline_featureset.hpp
@@ -20,35 +20,42 @@
*
*****************************************************************************/
-#ifndef LARGE_GEOJSON_FEATURESET_HPP
-#define LARGE_GEOJSON_FEATURESET_HPP
+#ifndef CSV_INLINE_FEATURESET_HPP
+#define CSV_INLINE_FEATURESET_HPP
#include <mapnik/feature.hpp>
-#include "geojson_datasource.hpp"
-
-#include <vector>
+#include <mapnik/unicode.hpp>
+#include "csv_utils.hpp"
+#include "csv_datasource.hpp"
#include <deque>
-#include <fstream>
#include <cstdio>
-class large_geojson_featureset : public mapnik::Featureset
+class csv_inline_featureset : public mapnik::Featureset
{
+ using locator_type = detail::geometry_column_locator;
public:
- using array_type = std::deque<geojson_datasource::item_type>;
- using file_ptr = std::unique_ptr<std::FILE, int (*)(std::FILE *)>;
-
- large_geojson_featureset(std::string const& filename,
- array_type && index_array);
- virtual ~large_geojson_featureset();
+ using array_type = std::deque<csv_datasource::item_type>;
+ csv_inline_featureset(std::string const& inline_string,
+ locator_type const& locator,
+ std::string const& separator,
+ std::vector<std::string> const& headers,
+ mapnik::context_ptr const& ctx,
+ array_type && index_array);
+ ~csv_inline_featureset();
mapnik::feature_ptr next();
-
private:
- file_ptr file_;
-
+ mapnik::feature_ptr parse_feature(std::string const& str);
+ std::string const& inline_string_;
+ std::string const& separator_;
+ std::vector<std::string> headers_;
const array_type index_array_;
array_type::const_iterator index_itr_;
array_type::const_iterator index_end_;
mapnik::context_ptr ctx_;
+ mapnik::value_integer feature_id_ = 0;
+ detail::geometry_column_locator const& locator_;
+ mapnik::transcoder tr_;
};
-#endif // LARGE_GEOJSON_FEATURESET_HPP
+
+#endif // CSV_INLINE_FEATURESET_HPP
diff --git a/plugins/input/csv/csv_utils.hpp b/plugins/input/csv/csv_utils.hpp
index c55065e..b2981ce 100644
--- a/plugins/input/csv/csv_utils.hpp
+++ b/plugins/input/csv/csv_utils.hpp
@@ -23,6 +23,16 @@
#ifndef MAPNIK_CSV_UTILS_DATASOURCE_HPP
#define MAPNIK_CSV_UTILS_DATASOURCE_HPP
+// mapnik
+#include <mapnik/debug.hpp>
+#include <mapnik/geometry.hpp>
+#include <mapnik/geometry_correct.hpp>
+#include <mapnik/wkt/wkt_factory.hpp>
+#include <mapnik/json/geometry_parser.hpp>
+#include <mapnik/util/conversions.hpp>
+#include <mapnik/csv/csv_grammar.hpp>
+#include <mapnik/util/trim.hpp>
+// boost
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
#pragma GCC diagnostic ignored "-Wunused-local-typedef"
@@ -32,66 +42,275 @@
#include <string>
#include <cstdio>
+#include <algorithm>
namespace csv_utils
{
- static inline bool is_likely_number(std::string const& value)
+
+static const mapnik::csv_line_grammar<char const*> line_g;
+
+template <typename Iterator>
+static mapnik::csv_line parse_line(Iterator start, Iterator end, std::string const& separator, std::size_t num_columns)
+{
+ mapnik::csv_line values;
+ if (num_columns > 0) values.reserve(num_columns);
+ boost::spirit::standard::blank_type blank;
+ if (!boost::spirit::qi::phrase_parse(start, end, (line_g)(boost::phoenix::cref(separator)), blank, values))
{
- return( strspn( value.c_str(), "e-.+0123456789" ) == value.size() );
+ throw std::runtime_error("Failed to parse CSV line:\n" + std::string(start, end));
}
+ return values;
+}
+
+static inline mapnik::csv_line parse_line(std::string const& line_str, std::string const& separator)
+{
+ auto start = line_str.c_str();
+ auto end = start + line_str.length();
+ return parse_line(start, end, separator, 0);
+}
- static inline void fix_json_quoting(std::string & csv_line)
+static inline bool is_likely_number(std::string const& value)
+{
+ return( strspn( value.c_str(), "e-.+0123456789" ) == value.size() );
+}
+
+struct ignore_case_equal_pred
+{
+ bool operator () (unsigned char a, unsigned char b) const
+ {
+ return std::tolower(a) == std::tolower(b);
+ }
+};
+
+inline bool ignore_case_equal(std::string const& s0, std::string const& s1)
+{
+ return s0.size() == s1.size()
+ && std::equal(s0.begin(), s0.end(), s1.begin(), ignore_case_equal_pred());
+}
+
+}
+
+
+namespace detail {
+
+template <typename T>
+std::size_t file_length(T & stream)
+{
+ stream.seekg(0, std::ios::end);
+ return stream.tellg();
+}
+
+static inline std::string detect_separator(std::string const& str)
+{
+ std::string separator = ","; // default
+ int num_commas = std::count(str.begin(), str.end(), ',');
+ // detect tabs
+ int num_tabs = std::count(str.begin(), str.end(), '\t');
+ if (num_tabs > 0)
{
- std::string wrapping_char;
- std::string::size_type j_idx = std::string::npos;
- std::string::size_type post_idx = std::string::npos;
- std::string::size_type j_idx_double = csv_line.find("\"{");
- std::string::size_type j_idx_single = csv_line.find("'{");
- if (j_idx_double != std::string::npos)
+ if (num_tabs > num_commas)
{
- wrapping_char = "\"";
- j_idx = j_idx_double;
- post_idx = csv_line.find("}\"");
+ separator = "\t";
+ MAPNIK_LOG_DEBUG(csv) << "csv_datasource: auto detected tab separator";
+ }
+ }
+ else // pipes
+ {
+ int num_pipes = std::count(str.begin(), str.end(), '|');
+ if (num_pipes > num_commas)
+ {
+ separator = "|";
+ MAPNIK_LOG_DEBUG(csv) << "csv_datasource: auto detected '|' separator";
+ }
+ else // semicolons
+ {
+ int num_semicolons = std::count(str.begin(), str.end(), ';');
+ if (num_semicolons > num_commas)
+ {
+ separator = ";";
+ MAPNIK_LOG_DEBUG(csv) << "csv_datasource: auto detected ';' separator";
+ }
+ }
+ }
+ return separator;
+}
+template <typename T>
+std::tuple<char,bool> autodect_newline(T & stream, std::size_t file_length)
+{
+ // autodetect newlines
+ char newline = '\n';
+ bool has_newline = false;
+ for (std::size_t lidx = 0; lidx < file_length && lidx < 4000; ++lidx)
+ {
+ char c = static_cast<char>(stream.get());
+ if (c == '\r')
+ {
+ newline = '\r';
+ has_newline = true;
+ break;
}
- else if (j_idx_single != std::string::npos)
+ if (c == '\n')
{
- wrapping_char = "'";
- j_idx = j_idx_single;
- post_idx = csv_line.find("}'");
+ has_newline = true;
+ break;
}
- // we are positive it is valid json
- if (!wrapping_char.empty())
+ }
+ return std::make_tuple(newline,has_newline);
+}
+
+
+struct geometry_column_locator
+{
+ geometry_column_locator()
+ : type(UNKNOWN), index(-1), index2(-1) {}
+
+ enum { UNKNOWN = 0, WKT, GEOJSON, LON_LAT } type;
+ std::size_t index;
+ std::size_t index2;
+};
+
+static inline void locate_geometry_column(std::string const& header, std::size_t index, geometry_column_locator & locator)
+{
+ std::string lower_val(header);
+ std::transform(lower_val.begin(), lower_val.end(), lower_val.begin(), ::tolower);
+ if (lower_val == "wkt" || (lower_val.find("geom") != std::string::npos))
+ {
+ locator.type = geometry_column_locator::WKT;
+ locator.index = index;
+ }
+ else if (lower_val == "geojson")
+ {
+ locator.type = geometry_column_locator::GEOJSON;
+ locator.index = index;
+ }
+ else if (lower_val == "x" || lower_val == "lon"
+ || lower_val == "lng" || lower_val == "long"
+ || (lower_val.find("longitude") != std::string::npos))
+ {
+ locator.index = index;
+ locator.type = geometry_column_locator::LON_LAT;
+ }
+
+ else if (lower_val == "y"
+ || lower_val == "lat"
+ || (lower_val.find("latitude") != std::string::npos))
+ {
+ locator.index2 = index;
+ locator.type = geometry_column_locator::LON_LAT;
+ }
+}
+
+static mapnik::geometry::geometry<double> extract_geometry(std::vector<std::string> const& row, geometry_column_locator const& locator)
+{
+ mapnik::geometry::geometry<double> geom;
+ if (locator.type == geometry_column_locator::WKT)
+ {
+ if (mapnik::from_wkt(row[locator.index], geom))
{
- // grab the json chunk
- std::string json_chunk = csv_line.substr(j_idx,post_idx+wrapping_char.size());
- bool does_not_have_escaped_double_quotes = (json_chunk.find("\\\"") == std::string::npos);
- // ignore properly escaped quotes like \" which need no special handling
- if (does_not_have_escaped_double_quotes)
+ // correct orientations ..
+ mapnik::geometry::correct(geom);
+ }
+ else
+ {
+ throw std::runtime_error("Failed to parse WKT:" + row[locator.index]);
+ }
+ }
+ else if (locator.type == geometry_column_locator::GEOJSON)
+ {
+
+ if (!mapnik::json::from_geojson(row[locator.index], geom))
+ {
+ throw std::runtime_error("Failed to parse GeoJSON:" + row[locator.index]);
+ }
+ }
+ else if (locator.type == geometry_column_locator::LON_LAT)
+ {
+ double x, y;
+ if (!mapnik::util::string2double(row[locator.index],x))
+ {
+ throw std::runtime_error("Failed to parse Longitude(Easting):" + row[locator.index]);
+ }
+ if (!mapnik::util::string2double(row[locator.index2],y))
+ {
+ throw std::runtime_error("Failed to parse Latitude(Northing):" + row[locator.index2]);
+ }
+ geom = mapnik::geometry::point<double>(x,y);
+ }
+ return geom;
+}
+
+template <typename Feature, typename Headers, typename Values, typename Locator, typename Transcoder>
+void process_properties(Feature & feature, Headers const& headers, Values const& values, Locator const& locator, Transcoder const& tr)
+{
+ auto val_beg = values.begin();
+ auto val_end = values.end();
+ auto num_headers = headers.size();
+ for (std::size_t i = 0; i < num_headers; ++i)
+ {
+ std::string const& fld_name = headers.at(i);
+ if (val_beg == val_end)
+ {
+ feature.put(fld_name,tr.transcode(""));
+ continue;
+ }
+ std::string value = mapnik::util::trim_copy(*val_beg++);
+ int value_length = value.length();
+
+ if (locator.index == i && (locator.type == detail::geometry_column_locator::WKT
+ || locator.type == detail::geometry_column_locator::GEOJSON) ) continue;
+
+
+ bool matched = false;
+ bool has_dot = value.find(".") != std::string::npos;
+ if (value.empty() ||
+ (value_length > 20) ||
+ (value_length > 1 && !has_dot && value[0] == '0'))
+ {
+ matched = true;
+ feature.put(fld_name,std::move(tr.transcode(value.c_str())));
+ }
+ else if (csv_utils::is_likely_number(value))
+ {
+ bool has_e = value.find("e") != std::string::npos;
+ if (has_dot || has_e)
{
- std::string pre_json = csv_line.substr(0,j_idx);
- std::string post_json = csv_line.substr(post_idx+wrapping_char.size());
- // handle "" in a string wrapped in "
- // http://tools.ietf.org/html/rfc4180#section-2 item 7.
- // e.g. "{""type"":""Point"",""coordinates"":[30.0,10.0]}"
- if (json_chunk.find("\"\"") != std::string::npos)
+ double float_val = 0.0;
+ if (mapnik::util::string2double(value,float_val))
{
- boost::algorithm::replace_all(json_chunk,"\"\"","\\\"");
- csv_line = pre_json + json_chunk + post_json;
+ matched = true;
+ feature.put(fld_name,float_val);
}
- // handle " in a string wrapped in '
- // e.g. '{"type":"Point","coordinates":[30.0,10.0]}'
- else
+ }
+ else
+ {
+ mapnik::value_integer int_val = 0;
+ if (mapnik::util::string2int(value,int_val))
{
- // escape " because we cannot exchange for single quotes
- // https://github.com/mapnik/mapnik/issues/1408
- boost::algorithm::replace_all(json_chunk,"\"","\\\"");
- boost::algorithm::replace_all(json_chunk,"'","\"");
- csv_line = pre_json + json_chunk + post_json;
+ matched = true;
+ feature.put(fld_name,int_val);
}
}
}
+ if (!matched)
+ {
+ if (csv_utils::ignore_case_equal(value, "true"))
+ {
+ feature.put(fld_name, true);
+ }
+ else if (csv_utils::ignore_case_equal(value, "false"))
+ {
+ feature.put(fld_name, false);
+ }
+ else // fallback to string
+ {
+ feature.put(fld_name,std::move(tr.transcode(value.c_str())));
+ }
+ }
}
}
+
+}// ns detail
+
#endif // MAPNIK_CSV_UTILS_DATASOURCE_HPP
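
Reviewer note on ignore_case_equal in the csv_utils.hpp hunk above: the three-iterator form of std::equal walks only the first range and does not compare lengths. As written, the helper therefore appears to accept a prefix of the second string (for example "t" against "true") and to read past the end of the second string when the first is longer. The sketch below shows one possible length-checked variant. The names iequals_checked and check_iequals are hypothetical and are not part of this commit or of the mapnik API.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, not part of the patch: case-insensitive equality
// that compares lengths before walking the characters, so the second
// range is never read past its end.
inline bool iequals_checked(std::string const& a, std::string const& b)
{
    if (a.size() != b.size()) return false;
    return std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y)
                      { return std::tolower(x) == std::tolower(y); });
}

// Small self-check covering differently cased, longer, shorter and empty inputs.
inline void check_iequals()
{
    assert(iequals_checked("TRUE", "true"));
    assert(!iequals_checked("truest", "true"));
    assert(!iequals_checked("t", "true"));
    assert(iequals_checked("", ""));
}
```

With a guard like this, a cell containing only "t" or "f" would fall through to the string fallback in process_properties instead of being stored as a boolean. Whether that is the behaviour upstream intends is not stated in this commit.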
diff --git a/plugins/input/geojson/large_geojson_featureset.cpp b/plugins/input/geojson/large_geojson_featureset.cpp
index 1df7dce..6f61d53 100644
--- a/plugins/input/geojson/large_geojson_featureset.cpp
+++ b/plugins/input/geojson/large_geojson_featureset.cpp
@@ -29,7 +29,6 @@
// stl
#include <string>
#include <vector>
-#include <deque>
#include "large_geojson_featureset.hpp"
diff --git a/plugins/input/geojson/large_geojson_featureset.hpp b/plugins/input/geojson/large_geojson_featureset.hpp
index a67eec5..8321ff3 100644
--- a/plugins/input/geojson/large_geojson_featureset.hpp
+++ b/plugins/input/geojson/large_geojson_featureset.hpp
@@ -26,9 +26,7 @@
#include <mapnik/feature.hpp>
#include "geojson_datasource.hpp"
-#include <vector>
#include <deque>
-#include <fstream>
#include <cstdio>
class large_geojson_featureset : public mapnik::Featureset
diff --git a/src/datasource_cache.cpp b/src/datasource_cache.cpp
index eb2cfd0..f412446 100644
--- a/src/datasource_cache.cpp
+++ b/src/datasource_cache.cpp
@@ -88,7 +88,7 @@ datasource_ptr datasource_cache::create(parameters const& params)
// add scope to ensure lock is released asap
{
#ifdef MAPNIK_THREADSAFE
- std::lock_guard<std::mutex> lock(mutex_);
+ std::lock_guard<std::recursive_mutex> lock(instance_mutex_);
#endif
itr=plugins_.find(*type);
if (itr == plugins_.end())
@@ -132,6 +132,9 @@ datasource_ptr datasource_cache::create(parameters const& params)
std::string datasource_cache::plugin_directories()
{
+#ifdef MAPNIK_THREADSAFE
+ std::lock_guard<std::recursive_mutex> lock(instance_mutex_);
+#endif
return boost::algorithm::join(plugin_directories_,", ");
}
@@ -143,6 +146,10 @@ std::vector<std::string> datasource_cache::plugin_names()
names = get_static_datasource_names();
#endif
+#ifdef MAPNIK_THREADSAFE
+ std::lock_guard<std::recursive_mutex> lock(instance_mutex_);
+#endif
+
std::map<std::string,std::shared_ptr<PluginInfo> >::const_iterator itr;
for (itr = plugins_.begin(); itr != plugins_.end(); ++itr)
{
@@ -155,7 +162,7 @@ std::vector<std::string> datasource_cache::plugin_names()
bool datasource_cache::register_datasources(std::string const& dir, bool recurse)
{
#ifdef MAPNIK_THREADSAFE
- std::lock_guard<std::mutex> lock(mutex_);
+ std::lock_guard<std::recursive_mutex> lock(instance_mutex_);
#endif
if (!mapnik::util::exists(dir))
{
@@ -202,6 +209,9 @@ bool datasource_cache::register_datasources(std::string const& dir, bool recurse
bool datasource_cache::register_datasource(std::string const& filename)
{
+#ifdef MAPNIK_THREADSAFE
+ std::lock_guard<std::recursive_mutex> lock(instance_mutex_);
+#endif
try
{
if (!mapnik::util::exists(filename))
diff --git a/src/image_util_jpeg.cpp b/src/image_util_jpeg.cpp
index 8966eb2..13860fd 100644
--- a/src/image_util_jpeg.cpp
+++ b/src/image_util_jpeg.cpp
@@ -43,7 +43,7 @@ jpeg_saver::jpeg_saver(std::ostream & stream, std::string const& t)
namespace detail {
-MAPNIK_DECL int parse_jpeg_quality(std::string const& params)
+int parse_jpeg_quality(std::string const& params)
{
int quality = 85;
if (params != "jpeg")
diff --git a/test/standalone/csv_test.cpp b/test/standalone/csv_test.cpp
index 2023f67..c156929 100644
--- a/test/standalone/csv_test.cpp
+++ b/test/standalone/csv_test.cpp
@@ -21,139 +21,145 @@
namespace bfs = boost::filesystem;
namespace {
-void add_csv_files(bfs::path dir, std::vector<bfs::path> &csv_files) {
- for (auto const &entry : boost::make_iterator_range(
- bfs::directory_iterator(dir), bfs::directory_iterator())) {
- auto path = entry.path();
- if (path.extension().native() == ".csv") {
- csv_files.emplace_back(path);
+void add_csv_files(bfs::path dir, std::vector<bfs::path> &csv_files)
+{
+ for (auto const &entry : boost::make_iterator_range(
+ bfs::directory_iterator(dir), bfs::directory_iterator()))
+ {
+ auto path = entry.path();
+ if (path.extension().native() == ".csv")
+ {
+ csv_files.emplace_back(path);
+ }
}
- }
}
-mapnik::datasource_ptr get_csv_ds(std::string const &file_name, bool strict = true) {
- mapnik::parameters params;
- params["type"] = std::string("csv");
- params["file"] = file_name;
- params["strict"] = mapnik::value_bool(strict);
- auto ds = mapnik::datasource_cache::instance().create(params);
- // require a non-null pointer returned
- REQUIRE(bool(ds));
- return ds;
+mapnik::datasource_ptr get_csv_ds(std::string const &file_name, bool strict = true)
+{
+ mapnik::parameters params;
+ params["type"] = std::string("csv");
+ params["file"] = file_name;
+ params["strict"] = mapnik::value_bool(strict);
+ auto ds = mapnik::datasource_cache::instance().create(params);
+ // require a non-null pointer returned
+ REQUIRE(ds != nullptr);
+ return ds;
}
void require_field_names(std::vector<mapnik::attribute_descriptor> const &fields,
- std::initializer_list<std::string> const &names) {
- REQUIRE(fields.size() == names.size());
- auto itr_a = fields.begin();
- auto const end_a = fields.end();
- auto itr_b = names.begin();
- for (; itr_a != end_a; ++itr_a, ++itr_b) {
- CHECK(itr_a->get_name() == *itr_b);
- }
+ std::initializer_list<std::string> const &names)
+{
+ REQUIRE(fields.size() == names.size());
+ auto itr_a = fields.begin();
+ auto const end_a = fields.end();
+ auto itr_b = names.begin();
+ for (; itr_a != end_a; ++itr_a, ++itr_b)
+ {
+ CHECK(itr_a->get_name() == *itr_b);
+ }
}
void require_field_types(std::vector<mapnik::attribute_descriptor> const &fields,
std::initializer_list<mapnik::eAttributeType> const &types) {
- REQUIRE(fields.size() == types.size());
- auto itr_a = fields.begin();
- auto const end_a = fields.end();
- auto itr_b = types.begin();
- for (; itr_a != end_a; ++itr_a, ++itr_b) {
- CHECK(itr_a->get_type() == *itr_b);
- }
+ REQUIRE(fields.size() == types.size());
+ auto itr_a = fields.begin();
+ auto const end_a = fields.end();
+ auto itr_b = types.begin();
+ for (; itr_a != end_a; ++itr_a, ++itr_b) {
+ CHECK(itr_a->get_type() == *itr_b);
+ }
}
mapnik::featureset_ptr all_features(mapnik::datasource_ptr ds) {
- auto fields = ds->get_descriptor().get_descriptors();
- mapnik::query query(ds->envelope());
- for (auto const &field : fields) {
- query.add_property_name(field.get_name());
- }
- return ds->features(query);
+ auto fields = ds->get_descriptor().get_descriptors();
+ mapnik::query query(ds->envelope());
+ for (auto const &field : fields) {
+ query.add_property_name(field.get_name());
+ }
+ return ds->features(query);
}
std::size_t count_features(mapnik::featureset_ptr features) {
- std::size_t count = 0;
- while (features->next()) {
- ++count;
- }
- return count;
+ std::size_t count = 0;
+ while (features->next()) {
+ ++count;
+ }
+ return count;
}
using attr = std::tuple<std::string, mapnik::value>;
void require_attributes(mapnik::feature_ptr feature,
std::initializer_list<attr> const &attrs) {
- REQUIRE(bool(feature));
- for (auto const &kv : attrs) {
- REQUIRE(feature->has_key(std::get<0>(kv)));
- CHECK(feature->get(std::get<0>(kv)) == std::get<1>(kv));
- }
+ REQUIRE(bool(feature));
+ for (auto const &kv : attrs) {
+ REQUIRE(feature->has_key(std::get<0>(kv)));
+ CHECK(feature->get(std::get<0>(kv)) == std::get<1>(kv));
+ }
}
namespace detail {
struct feature_count {
- template <typename T>
- std::size_t operator()(T const &geom) const {
- return mapnik::util::apply_visitor(*this, geom);
- }
-
- std::size_t operator()(mapnik::geometry::geometry_empty const &) const {
- return 0;
- }
-
- template <typename T>
- std::size_t operator()(mapnik::geometry::point<T> const &) const {
- return 1;
- }
-
- template <typename T>
- std::size_t operator()(mapnik::geometry::line_string<T> const &) const {
- return 1;
- }
-
- template <typename T>
- std::size_t operator()(mapnik::geometry::polygon<T> const &) const {
- return 1;
- }
-
- template <typename T>
- std::size_t operator()(mapnik::geometry::multi_point<T> const &mp) const {
- return mp.size();
- }
-
- template <typename T>
- std::size_t operator()(mapnik::geometry::multi_line_string<T> const &mls) const {
- return mls.size();
- }
-
- template <typename T>
- std::size_t operator()(mapnik::geometry::multi_polygon<T> const &mp) const {
- return mp.size();
- }
-
- template <typename T>
- std::size_t operator()(mapnik::geometry::geometry_collection<T> const &col) const {
- std::size_t sum = 0;
- for (auto const &geom : col) {
- sum += operator()(geom);
+ template <typename T>
+ std::size_t operator()(T const &geom) const {
+ return mapnik::util::apply_visitor(*this, geom);
+ }
+
+ std::size_t operator()(mapnik::geometry::geometry_empty const &) const {
+ return 0;
+ }
+
+ template <typename T>
+ std::size_t operator()(mapnik::geometry::point<T> const &) const {
+ return 1;
+ }
+
+ template <typename T>
+ std::size_t operator()(mapnik::geometry::line_string<T> const &) const {
+ return 1;
+ }
+
+ template <typename T>
+ std::size_t operator()(mapnik::geometry::polygon<T> const &) const {
+ return 1;
+ }
+
+ template <typename T>
+ std::size_t operator()(mapnik::geometry::multi_point<T> const &mp) const {
+ return mp.size();
+ }
+
+ template <typename T>
+ std::size_t operator()(mapnik::geometry::multi_line_string<T> const &mls) const {
+ return mls.size();
+ }
+
+ template <typename T>
+ std::size_t operator()(mapnik::geometry::multi_polygon<T> const &mp) const {
+ return mp.size();
+ }
+
+ template <typename T>
+ std::size_t operator()(mapnik::geometry::geometry_collection<T> const &col) const {
+ std::size_t sum = 0;
+ for (auto const &geom : col) {
+ sum += operator()(geom);
+ }
+ return sum;
}
- return sum;
- }
};
} // namespace detail
template <typename T>
std::size_t feature_count(mapnik::geometry::geometry<T> const &g) {
- return detail::feature_count()(g);
+ return detail::feature_count()(g);
}
void require_geometry(mapnik::feature_ptr feature,
std::size_t num_parts,
mapnik::geometry::geometry_types type) {
- REQUIRE(bool(feature));
- CHECK(mapnik::geometry::geometry_type(feature->get_geometry()) == type);
- CHECK(feature_count(feature->get_geometry()) == num_parts);
+ REQUIRE(bool(feature));
+ CHECK(mapnik::geometry::geometry_type(feature->get_geometry()) == type);
+ CHECK(feature_count(feature->get_geometry()) == num_parts);
}
} // anonymous namespace
@@ -163,520 +169,519 @@ const bool registered = mapnik::datasource_cache::instance().register_datasource
TEST_CASE("csv") {
- if (mapnik::util::exists(csv_plugin))
- {
-
- REQUIRE(registered);
-
- // make the tests silent since we intentionally test error conditions that are noisy
- auto const severity = mapnik::logger::instance().get_severity();
- mapnik::logger::instance().set_severity(mapnik::logger::none);
-
- // check the CSV datasource is loaded
- const std::vector<std::string> plugin_names =
- mapnik::datasource_cache::instance().plugin_names();
- const bool have_csv_plugin =
- std::find(plugin_names.begin(), plugin_names.end(), "csv") != plugin_names.end();
-
- SECTION("broken files") {
- if (have_csv_plugin) {
- std::vector<bfs::path> broken;
- add_csv_files("test/data/csv/fails", broken);
- add_csv_files("test/data/csv/warns", broken);
- broken.emplace_back("test/data/csv/fails/does_not_exist.csv");
-
- for (auto const &path : broken) {
- REQUIRE_THROWS(get_csv_ds(path.native()));
- }
- }
- } // END SECTION
-
- SECTION("good files") {
- if (have_csv_plugin) {
- std::vector<bfs::path> good;
- add_csv_files("test/data/csv", good);
- add_csv_files("test/data/csv/warns", good);
-
- for (auto const &path : good) {
- auto ds = get_csv_ds(path.native(), false);
- // require a non-null pointer returned
+ if (mapnik::util::exists(csv_plugin))
+ {
+ REQUIRE(registered);
+ // make the tests silent since we intentionally test error conditions that are noisy
+ auto const severity = mapnik::logger::instance().get_severity();
+ mapnik::logger::instance().set_severity(mapnik::logger::none);
+
+ // check the CSV datasource is loaded
+ const std::vector<std::string> plugin_names =
+ mapnik::datasource_cache::instance().plugin_names();
+ const bool have_csv_plugin =
+ std::find(plugin_names.begin(), plugin_names.end(), "csv") != plugin_names.end();
+
+ SECTION("broken files") {
+ if (have_csv_plugin) {
+ std::vector<bfs::path> broken;
+ add_csv_files("test/data/csv/fails", broken);
+ add_csv_files("test/data/csv/warns", broken);
+ broken.emplace_back("test/data/csv/fails/does_not_exist.csv");
+
+ for (auto const &path : broken)
+ {
+ REQUIRE_THROWS(get_csv_ds(path.native()));
+ }
+ }
+ } // END SECTION
+
+ SECTION("good files") {
+ if (have_csv_plugin) {
+ std::vector<bfs::path> good;
+ add_csv_files("test/data/csv", good);
+ add_csv_files("test/data/csv/warns", good);
+
+ for (auto const& path : good)
+ {
+ auto ds = get_csv_ds(path.native(), false);
+ // require a non-null pointer returned
+ REQUIRE(bool(ds));
+ }
+ }
+ } // END SECTION
+
+ SECTION("lon/lat detection")
+ {
+ for (auto const& lon_name : {std::string("lon"), std::string("lng")})
+ {
+ auto ds = get_csv_ds((boost::format("test/data/csv/%1%_lat.csv") % lon_name).str());
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {lon_name, "lat"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer});
+
+ CHECK(ds->get_geometry_type() == mapnik::datasource_geometry_t::Point);
+
+ mapnik::query query(ds->envelope());
+ for (auto const &field : fields)
+ {
+ query.add_property_name(field.get_name());
+ }
+ auto features = ds->features(query);
+ auto feature = features->next();
+
+ require_attributes(feature, {
+ attr { lon_name, mapnik::value_integer(0) },
+ attr { "lat", mapnik::value_integer(0) }
+ });
+ }
+ } // END SECTION
+
+ SECTION("type detection") {
+ auto ds = get_csv_ds("test/data/csv/nypd.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"Precinct", "Phone", "Address", "City", "geo_longitude", "geo_latitude", "geo_accuracy"});
+ require_field_types(fields, {mapnik::String, mapnik::String, mapnik::String, mapnik::String, mapnik::Double, mapnik::Double, mapnik::String});
+
+ CHECK(ds->get_geometry_type() == mapnik::datasource_geometry_t::Point);
+ CHECK(count_features(all_features(ds)) == 2);
+
+ auto feature = all_features(ds)->next();
+ require_attributes(feature, {
+ attr { "City", mapnik::value_unicode_string("New York, NY") }
+ , attr { "geo_accuracy", mapnik::value_unicode_string("house") }
+ , attr { "Phone", mapnik::value_unicode_string("(212) 334-0711") }
+ , attr { "Address", mapnik::value_unicode_string("19 Elizabeth Street") }
+ , attr { "Precinct", mapnik::value_unicode_string("5th Precinct") }
+ , attr { "geo_longitude", mapnik::value_integer(-70) }
+ , attr { "geo_latitude", mapnik::value_integer(40) }
+ });
+ } // END SECTION
+
+ SECTION("skipping blank rows") {
+ auto ds = get_csv_ds("test/data/csv/blank_rows.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "name"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String});
+
+ CHECK(ds->get_geometry_type() == mapnik::datasource_geometry_t::Point);
+ CHECK(count_features(all_features(ds)) == 2);
+ } // END SECTION
+
+ SECTION("empty rows") {
+ auto ds = get_csv_ds("test/data/csv/empty_rows.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "text", "date", "integer", "boolean", "float", "time", "datetime", "empty_column"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String, mapnik::String, mapnik::Integer, mapnik::Boolean, mapnik::Double, mapnik::String, mapnik::String, mapnik::String});
+
+ CHECK(ds->get_geometry_type() == mapnik::datasource_geometry_t::Point);
+ CHECK(count_features(all_features(ds)) == 4);
+
+ auto featureset = all_features(ds);
+ auto feature = featureset->next();
+ require_attributes(feature, {
+ attr { "x", mapnik::value_integer(0) }
+ , attr { "empty_column", mapnik::value_unicode_string("") }
+ , attr { "text", mapnik::value_unicode_string("a b") }
+ , attr { "float", mapnik::value_double(1.0) }
+ , attr { "datetime", mapnik::value_unicode_string("1971-01-01T04:14:00") }
+ , attr { "y", mapnik::value_integer(0) }
+ , attr { "boolean", mapnik::value_bool(true) }
+ , attr { "time", mapnik::value_unicode_string("04:14:00") }
+ , attr { "date", mapnik::value_unicode_string("1971-01-01") }
+ , attr { "integer", mapnik::value_integer(40) }
+ });
+
+ while (bool(feature = featureset->next())) {
+ CHECK(feature->size() == 10);
+ CHECK(feature->get("empty_column") == mapnik::value_unicode_string(""));
+ }
+ } // END SECTION
+
+ SECTION("slashes") {
+ auto ds = get_csv_ds("test/data/csv/has_attributes_with_slashes.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "name"});
+ // NOTE: y column is integer, even though a double value is used below in the test?
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String});
+
+ auto featureset = all_features(ds);
+ require_attributes(featureset->next(), {
+ attr{"x", 0}
+ , attr{"y", 0}
+ , attr{"name", mapnik::value_unicode_string("a/a") } });
+ require_attributes(featureset->next(), {
+ attr{"x", 1}
+ , attr{"y", 4}
+ , attr{"name", mapnik::value_unicode_string("b/b") } });
+ require_attributes(featureset->next(), {
+ attr{"x", 10}
+ , attr{"y", 2.5}
+ , attr{"name", mapnik::value_unicode_string("c/c") } });
+ } // END SECTION
+
+ SECTION("wkt field") {
+ using mapnik::geometry::geometry_types;
+
+ auto ds = get_csv_ds("test/data/csv/wkt.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"type"});
+ require_field_types(fields, {mapnik::String});
+
+ auto featureset = all_features(ds);
+ require_geometry(featureset->next(), 1, geometry_types::Point);
+ require_geometry(featureset->next(), 1, geometry_types::LineString);
+ require_geometry(featureset->next(), 1, geometry_types::Polygon);
+ require_geometry(featureset->next(), 1, geometry_types::Polygon);
+ require_geometry(featureset->next(), 4, geometry_types::MultiPoint);
+ require_geometry(featureset->next(), 2, geometry_types::MultiLineString);
+ require_geometry(featureset->next(), 2, geometry_types::MultiPolygon);
+ require_geometry(featureset->next(), 2, geometry_types::MultiPolygon);
+ } // END SECTION
+
+ SECTION("handling of missing header") {
+ // TODO: does this mean 'missing_header.csv' should be in the warnings
+ // subdirectory, since it doesn't work in strict mode?
+ auto ds = get_csv_ds("test/data/csv/missing_header.csv", false);
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"one", "two", "x", "y", "_4", "aftermissing"});
+ auto feature = all_features(ds)->next();
+ REQUIRE(feature);
+ REQUIRE(feature->has_key("_4"));
+ CHECK(feature->get("_4") == mapnik::value_unicode_string("missing"));
+ } // END SECTION
+
+ SECTION("handling of headers that are numbers") {
+ auto ds = get_csv_ds("test/data/csv/numbers_for_headers.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "1990", "1991", "1992"});
+ auto feature = all_features(ds)->next();
+ require_attributes(feature, {
+ attr{"x", 0}
+ , attr{"y", 0}
+ , attr{"1990", 1}
+ , attr{"1991", 2}
+ , attr{"1992", 3}
+ });
+ auto expression = mapnik::parse_expression("[1991]=2");
+ REQUIRE(bool(expression));
+ auto value = mapnik::util::apply_visitor(
+ mapnik::evaluate<mapnik::feature_impl, mapnik::value_type, mapnik::attributes>(
+ *feature, mapnik::attributes()), *expression);
+ CHECK(value == true);
+ } // END SECTION
+
+ SECTION("quoted numbers") {
+ using ustring = mapnik::value_unicode_string;
+
+ auto ds = get_csv_ds("test/data/csv/quoted_numbers.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "label"});
+ auto featureset = all_features(ds);
+
+ require_attributes(featureset->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"label", ustring("0,0") } });
+ require_attributes(featureset->next(), {
+ attr{"x", 5}, attr{"y", 5}, attr{"label", ustring("5,5") } });
+ require_attributes(featureset->next(), {
+ attr{"x", 0}, attr{"y", 5}, attr{"label", ustring("0,5") } });
+ require_attributes(featureset->next(), {
+ attr{"x", 5}, attr{"y", 0}, attr{"label", ustring("5,0") } });
+ require_attributes(featureset->next(), {
+ attr{"x", 2.5}, attr{"y", 2.5}, attr{"label", ustring("2.5,2.5") } });
+
+ } // END SECTION
+
+ SECTION("reading newlines") {
+ for (auto const &platform : {std::string("windows"), std::string("mac")}) {
+ std::string file_name = (boost::format("test/data/csv/%1%_newlines.csv") % platform).str();
+ auto ds = get_csv_ds(file_name);
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "z"});
+ require_attributes(all_features(ds)->next(), {
+ attr{"x", 1}, attr{"y", 10}, attr{"z", 9999.9999} });
+ }
+ } // END SECTION
+
+ SECTION("mixed newlines") {
+ using ustring = mapnik::value_unicode_string;
+
+ for (auto const &file : {
+ std::string("test/data/csv/mac_newlines_with_unix_inline.csv")
+ , std::string("test/data/csv/mac_newlines_with_unix_inline_escaped.csv")
+ , std::string("test/data/csv/windows_newlines_with_unix_inline.csv")
+ , std::string("test/data/csv/windows_newlines_with_unix_inline_escaped.csv")
+ }) {
+ auto ds = get_csv_ds(file);
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "line"});
+ require_attributes(all_features(ds)->next(), {
+ attr{"x", 0}, attr{"y", 0}
+ , attr{"line", ustring("many\n lines\n of text\n with unix newlines")} });
+ }
+ } // END SECTION
+
+ SECTION("tabs") {
+ auto ds = get_csv_ds("test/data/csv/tabs_in_csv.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "z"});
+ require_attributes(all_features(ds)->next(), {
+ attr{"x", -122}, attr{"y", 48}, attr{"z", 0} });
+ } // END SECTION
+
+ SECTION("separators") {
+ using ustring = mapnik::value_unicode_string;
+
+ for (auto const &file : {
+ std::string("test/data/csv/pipe_delimiters.csv")
+ , std::string("test/data/csv/semicolon_delimiters.csv")
+ }) {
+ auto ds = get_csv_ds(file);
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "z"});
+ require_attributes(all_features(ds)->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"z", ustring("hello")} });
+ }
+ } // END SECTION
+
+ SECTION("null and bool keywords are empty strings") {
+ using ustring = mapnik::value_unicode_string;
+
+ auto ds = get_csv_ds("test/data/csv/nulls_and_booleans_as_strings.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "null", "boolean"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String, mapnik::Boolean});
+
+ auto featureset = all_features(ds);
+ require_attributes(featureset->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"null", ustring("null")}, attr{"boolean", true}});
+ require_attributes(featureset->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"null", ustring("")}, attr{"boolean", false}});
+ } // END SECTION
+
+ SECTION("nonexistent query fields throw") {
+ auto ds = get_csv_ds("test/data/csv/lon_lat.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"lon", "lat"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer});
+
+ mapnik::query query(ds->envelope());
+ for (auto const &field : fields) {
+ query.add_property_name(field.get_name());
+ }
+ // also add an invalid one, triggering throw
+ query.add_property_name("bogus");
+
+ REQUIRE_THROWS(ds->features(query));
+ } // END SECTION
+
+ SECTION("leading zeros mean strings") {
+ using ustring = mapnik::value_unicode_string;
+
+ auto ds = get_csv_ds("test/data/csv/leading_zeros.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "fips"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String});
+
+ auto featureset = all_features(ds);
+ require_attributes(featureset->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"fips", ustring("001")}});
+ require_attributes(featureset->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"fips", ustring("003")}});
+ require_attributes(featureset->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"fips", ustring("005")}});
+ } // END SECTION
+
+ SECTION("advanced geometry detection") {
+ using row = std::pair<std::string, mapnik::datasource_geometry_t>;
+
+ for (row r : {
+ row{"point", mapnik::datasource_geometry_t::Point}
+ , row{"poly", mapnik::datasource_geometry_t::Polygon}
+ , row{"multi_poly", mapnik::datasource_geometry_t::Polygon}
+ , row{"line", mapnik::datasource_geometry_t::LineString}
+ }) {
+ std::string file_name = (boost::format("test/data/csv/%1%_wkt.csv") % r.first).str();
+ auto ds = get_csv_ds(file_name);
+ CHECK(ds->get_geometry_type() == r.second);
+ }
+ } // END SECTION
+
+ SECTION("creation of CSV from in-memory strings") {
+ using ustring = mapnik::value_unicode_string;
+
+ for (auto const &name : {std::string("Winthrop, WA"), std::string(u8"Qu\u00e9bec")}) {
+ std::string csv_string =
+ (boost::format(
+ "wkt,Name\n"
+ "\"POINT (120.15 48.47)\",\"%1%\"\n"
+ ) % name).str();
+
+ mapnik::parameters params;
+ params["type"] = std::string("csv");
+ params["inline"] = csv_string;
+ auto ds = mapnik::datasource_cache::instance().create(params);
+ REQUIRE(bool(ds));
+
+ auto feature = all_features(ds)->next();
+ REQUIRE(bool(feature));
+ REQUIRE(feature->has_key("Name"));
+ CHECK(feature->get("Name") == ustring(name.c_str()));
+ }
+ } // END SECTION
+
+ SECTION("geojson quoting") {
+ using mapnik::geometry::geometry_types;
+
+ for (auto const &file : {
+ std::string("test/data/csv/geojson_double_quote_escape.csv")
+ , std::string("test/data/csv/geojson_single_quote.csv")
+ , std::string("test/data/csv/geojson_2x_double_quote_filebakery_style.csv")
+ }) {
+ auto ds = get_csv_ds(file);
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"type"});
+ require_field_types(fields, {mapnik::String});
+
+ auto featureset = all_features(ds);
+ require_geometry(featureset->next(), 1, geometry_types::Point);
+ require_geometry(featureset->next(), 1, geometry_types::LineString);
+ require_geometry(featureset->next(), 1, geometry_types::Polygon);
+ require_geometry(featureset->next(), 1, geometry_types::Polygon);
+ require_geometry(featureset->next(), 4, geometry_types::MultiPoint);
+ require_geometry(featureset->next(), 2, geometry_types::MultiLineString);
+ require_geometry(featureset->next(), 2, geometry_types::MultiPolygon);
+ require_geometry(featureset->next(), 2, geometry_types::MultiPolygon);
+ }
+ } // END SECTION
+
+ SECTION("blank undelimited rows are still parsed") {
+ using ustring = mapnik::value_unicode_string;
+
+ // TODO: does this mean this CSV file should be in the warnings
+ // subdirectory, since it doesn't work in strict mode?
+ auto ds = get_csv_ds("test/data/csv/more_headers_than_column_values.csv", false);
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "one", "two", "three"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String, mapnik::String, mapnik::String});
+
+ require_attributes(all_features(ds)->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"one", ustring("")}, attr{"two", ustring("")}, attr{"three", ustring("")} });
+ } // END SECTION
+
+ SECTION("fewer headers than rows throws") {
+ REQUIRE_THROWS(get_csv_ds("test/data/csv/more_column_values_than_headers.csv"));
+ } // END SECTION
+
+ SECTION("feature ID only incremented for valid rows") {
+ auto ds = get_csv_ds("test/data/csv/warns/feature_id_counting.csv", false);
+ auto fs = all_features(ds);
+
+ // first
+ auto feature = fs->next();
+ REQUIRE(bool(feature));
+ CHECK(feature->id() == 1);
+
+ // second, should have skipped bogus one
+ feature = fs->next();
+ REQUIRE(bool(feature));
+ CHECK(feature->id() == 2);
+
+ feature = fs->next();
+ CHECK(!feature);
+ } // END SECTION
+
+ SECTION("dynamically defining headers") {
+ using ustring = mapnik::value_unicode_string;
+ using row = std::pair<std::string, std::size_t>;
+
+ for (auto const &r : {
+ row{"test/data/csv/fails/needs_headers_two_lines.csv", 2},
+ row{"test/data/csv/fails/needs_headers_one_line.csv", 1},
+ row{"test/data/csv/fails/needs_headers_one_line_no_newline.csv", 1}})
+ {
+ mapnik::parameters params;
+ params["type"] = std::string("csv");
+ params["file"] = r.first;
+ params["headers"] = "x,y,name";
+ auto ds = mapnik::datasource_cache::instance().create(params);
+ REQUIRE(bool(ds));
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "name"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String});
+ require_attributes(all_features(ds)->next(), {
+ attr{"x", 0}, attr{"y", 0}, attr{"name", ustring("data_name")} });
+ REQUIRE(count_features(all_features(ds)) == r.second);
+ }
+ } // END SECTION
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wlong-long"
+ SECTION("64bit int fields work") {
+ auto ds = get_csv_ds("test/data/csv/64bit_int.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "bigint"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::Integer});
+
+ auto fs = all_features(ds);
+ auto feature = fs->next();
+ require_attributes(feature, {
+ attr{"x", 0}, attr{"y", 0}, attr{"bigint", 2147483648} });
+
+ feature = fs->next();
+ require_attributes(feature, {
+ attr{"x", 0}, attr{"y", 0}, attr{"bigint", 9223372036854775807ll} });
+ require_attributes(feature, {
+ attr{"x", 0}, attr{"y", 0}, attr{"bigint", 0x7FFFFFFFFFFFFFFFll} });
+ } // END SECTION
+#pragma GCC diagnostic pop
+
+ SECTION("various number types") {
+ auto ds = get_csv_ds("test/data/csv/number_types.csv");
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {"x", "y", "floats"});
+ require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::Double});
+ auto fs = all_features(ds);
+ for (double d : { .0, +.0, 1e-06, -1e-06, 0.000001, 1.234e+16, 1.234e+16 }) {
+ auto feature = fs->next();
+ REQUIRE(bool(feature));
+ CHECK(feature->get("floats").get<mapnik::value_double>() == Approx(d));
+ }
+ } // END SECTION
+
+ SECTION("manually supplied extent") {
+ std::string csv_string("wkt,Name\n");
+ mapnik::parameters params;
+ params["type"] = std::string("csv");
+ params["inline"] = csv_string;
+ params["extent"] = "-180,-90,180,90";
+ auto ds = mapnik::datasource_cache::instance().create(params);
REQUIRE(bool(ds));
- }
- }
- } // END SECTION
-
- SECTION("lon/lat detection") {
- for (auto const &lon_name : {std::string("lon"), std::string("lng")}) {
- auto ds = get_csv_ds((boost::format("test/data/csv/%1%_lat.csv") % lon_name).str());
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {lon_name, "lat"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer});
-
- CHECK(ds->get_geometry_type() == mapnik::datasource_geometry_t::Point);
-
- mapnik::query query(ds->envelope());
- for (auto const &field : fields) {
- query.add_property_name(field.get_name());
- }
- auto features = ds->features(query);
- auto feature = features->next();
-
- require_attributes(feature, {
- attr { lon_name, mapnik::value_integer(0) },
- attr { "lat", mapnik::value_integer(0) }
- });
- }
- } // END SECTION
-
- SECTION("type detection") {
- auto ds = get_csv_ds("test/data/csv/nypd.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"Precinct", "Phone", "Address", "City", "geo_longitude", "geo_latitude", "geo_accuracy"});
- require_field_types(fields, {mapnik::String, mapnik::String, mapnik::String, mapnik::String, mapnik::Double, mapnik::Double, mapnik::String});
-
- CHECK(ds->get_geometry_type() == mapnik::datasource_geometry_t::Point);
- CHECK(count_features(all_features(ds)) == 2);
-
- auto feature = all_features(ds)->next();
- require_attributes(feature, {
- attr { "City", mapnik::value_unicode_string("New York, NY") }
- , attr { "geo_accuracy", mapnik::value_unicode_string("house") }
- , attr { "Phone", mapnik::value_unicode_string("(212) 334-0711") }
- , attr { "Address", mapnik::value_unicode_string("19 Elizabeth Street") }
- , attr { "Precinct", mapnik::value_unicode_string("5th Precinct") }
- , attr { "geo_longitude", mapnik::value_integer(-70) }
- , attr { "geo_latitude", mapnik::value_integer(40) }
- });
- } // END SECTION
-
- SECTION("skipping blank rows") {
- auto ds = get_csv_ds("test/data/csv/blank_rows.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "name"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String});
-
- CHECK(ds->get_geometry_type() == mapnik::datasource_geometry_t::Point);
- CHECK(count_features(all_features(ds)) == 2);
- } // END SECTION
-
- SECTION("empty rows") {
- auto ds = get_csv_ds("test/data/csv/empty_rows.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "text", "date", "integer", "boolean", "float", "time", "datetime", "empty_column"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String, mapnik::String, mapnik::Integer, mapnik::Boolean, mapnik::Double, mapnik::String, mapnik::String, mapnik::String});
-
- CHECK(ds->get_geometry_type() == mapnik::datasource_geometry_t::Point);
- CHECK(count_features(all_features(ds)) == 4);
-
- auto featureset = all_features(ds);
- auto feature = featureset->next();
- require_attributes(feature, {
- attr { "x", mapnik::value_integer(0) }
- , attr { "empty_column", mapnik::value_unicode_string("") }
- , attr { "text", mapnik::value_unicode_string("a b") }
- , attr { "float", mapnik::value_double(1.0) }
- , attr { "datetime", mapnik::value_unicode_string("1971-01-01T04:14:00") }
- , attr { "y", mapnik::value_integer(0) }
- , attr { "boolean", mapnik::value_bool(true) }
- , attr { "time", mapnik::value_unicode_string("04:14:00") }
- , attr { "date", mapnik::value_unicode_string("1971-01-01") }
- , attr { "integer", mapnik::value_integer(40) }
- });
-
- while (bool(feature = featureset->next())) {
- CHECK(feature->size() == 10);
- CHECK(feature->get("empty_column") == mapnik::value_unicode_string(""));
- }
- } // END SECTION
-
- SECTION("slashes") {
- auto ds = get_csv_ds("test/data/csv/has_attributes_with_slashes.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "name"});
- // NOTE: y column is integer, even though a double value is used below in the test?
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String});
-
- auto featureset = all_features(ds);
- require_attributes(featureset->next(), {
- attr{"x", 0}
- , attr{"y", 0}
- , attr{"name", mapnik::value_unicode_string("a/a") } });
- require_attributes(featureset->next(), {
- attr{"x", 1}
- , attr{"y", 4}
- , attr{"name", mapnik::value_unicode_string("b/b") } });
- require_attributes(featureset->next(), {
- attr{"x", 10}
- , attr{"y", 2.5}
- , attr{"name", mapnik::value_unicode_string("c/c") } });
- } // END SECTION
-
- SECTION("wkt field") {
- using mapnik::geometry::geometry_types;
-
- auto ds = get_csv_ds("test/data/csv/wkt.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"type"});
- require_field_types(fields, {mapnik::String});
-
- auto featureset = all_features(ds);
- require_geometry(featureset->next(), 1, geometry_types::Point);
- require_geometry(featureset->next(), 1, geometry_types::LineString);
- require_geometry(featureset->next(), 1, geometry_types::Polygon);
- require_geometry(featureset->next(), 1, geometry_types::Polygon);
- require_geometry(featureset->next(), 4, geometry_types::MultiPoint);
- require_geometry(featureset->next(), 2, geometry_types::MultiLineString);
- require_geometry(featureset->next(), 2, geometry_types::MultiPolygon);
- require_geometry(featureset->next(), 2, geometry_types::MultiPolygon);
- } // END SECTION
-
- SECTION("handling of missing header") {
- // TODO: does this mean 'missing_header.csv' should be in the warnings
- // subdirectory, since it doesn't work in strict mode?
- auto ds = get_csv_ds("test/data/csv/missing_header.csv", false);
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"one", "two", "x", "y", "_4", "aftermissing"});
- auto feature = all_features(ds)->next();
- REQUIRE(feature);
- REQUIRE(feature->has_key("_4"));
- CHECK(feature->get("_4") == mapnik::value_unicode_string("missing"));
- } // END SECTION
-
- SECTION("handling of headers that are numbers") {
- auto ds = get_csv_ds("test/data/csv/numbers_for_headers.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "1990", "1991", "1992"});
- auto feature = all_features(ds)->next();
- require_attributes(feature, {
- attr{"x", 0}
- , attr{"y", 0}
- , attr{"1990", 1}
- , attr{"1991", 2}
- , attr{"1992", 3}
- });
- auto expression = mapnik::parse_expression("[1991]=2");
- REQUIRE(bool(expression));
- auto value = mapnik::util::apply_visitor(
- mapnik::evaluate<mapnik::feature_impl, mapnik::value_type, mapnik::attributes>(
- *feature, mapnik::attributes()), *expression);
- CHECK(value == true);
- } // END SECTION
-
- SECTION("quoted numbers") {
- using ustring = mapnik::value_unicode_string;
-
- auto ds = get_csv_ds("test/data/csv/quoted_numbers.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "label"});
- auto featureset = all_features(ds);
-
- require_attributes(featureset->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"label", ustring("0,0") } });
- require_attributes(featureset->next(), {
- attr{"x", 5}, attr{"y", 5}, attr{"label", ustring("5,5") } });
- require_attributes(featureset->next(), {
- attr{"x", 0}, attr{"y", 5}, attr{"label", ustring("0,5") } });
- require_attributes(featureset->next(), {
- attr{"x", 5}, attr{"y", 0}, attr{"label", ustring("5,0") } });
- require_attributes(featureset->next(), {
- attr{"x", 2.5}, attr{"y", 2.5}, attr{"label", ustring("2.5,2.5") } });
-
- } // END SECTION
-
- SECTION("reading newlines") {
- for (auto const &platform : {std::string("windows"), std::string("mac")}) {
- std::string file_name = (boost::format("test/data/csv/%1%_newlines.csv") % platform).str();
- auto ds = get_csv_ds(file_name);
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "z"});
- require_attributes(all_features(ds)->next(), {
- attr{"x", 1}, attr{"y", 10}, attr{"z", 9999.9999} });
- }
- } // END SECTION
-
- SECTION("mixed newlines") {
- using ustring = mapnik::value_unicode_string;
-
- for (auto const &file : {
- std::string("test/data/csv/mac_newlines_with_unix_inline.csv")
- , std::string("test/data/csv/mac_newlines_with_unix_inline_escaped.csv")
- , std::string("test/data/csv/windows_newlines_with_unix_inline.csv")
- , std::string("test/data/csv/windows_newlines_with_unix_inline_escaped.csv")
- }) {
- auto ds = get_csv_ds(file);
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "line"});
- require_attributes(all_features(ds)->next(), {
- attr{"x", 0}, attr{"y", 0}
- , attr{"line", ustring("many\n lines\n of text\n with unix newlines")} });
- }
- } // END SECTION
-
- SECTION("tabs") {
- auto ds = get_csv_ds("test/data/csv/tabs_in_csv.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "z"});
- require_attributes(all_features(ds)->next(), {
- attr{"x", -122}, attr{"y", 48}, attr{"z", 0} });
- } // END SECTION
-
- SECTION("separators") {
- using ustring = mapnik::value_unicode_string;
-
- for (auto const &file : {
- std::string("test/data/csv/pipe_delimiters.csv")
- , std::string("test/data/csv/semicolon_delimiters.csv")
- }) {
- auto ds = get_csv_ds(file);
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "z"});
- require_attributes(all_features(ds)->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"z", ustring("hello")} });
- }
- } // END SECTION
-
- SECTION("null and bool keywords are empty strings") {
- using ustring = mapnik::value_unicode_string;
-
- auto ds = get_csv_ds("test/data/csv/nulls_and_booleans_as_strings.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "null", "boolean"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String, mapnik::Boolean});
-
- auto featureset = all_features(ds);
- require_attributes(featureset->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"null", ustring("null")}, attr{"boolean", true}});
- require_attributes(featureset->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"null", ustring("")}, attr{"boolean", false}});
- } // END SECTION
-
- SECTION("nonexistent query fields throw") {
- auto ds = get_csv_ds("test/data/csv/lon_lat.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"lon", "lat"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer});
-
- mapnik::query query(ds->envelope());
- for (auto const &field : fields) {
- query.add_property_name(field.get_name());
- }
- // also add an invalid one, triggering throw
- query.add_property_name("bogus");
-
- REQUIRE_THROWS(ds->features(query));
- } // END SECTION
-
- SECTION("leading zeros mean strings") {
- using ustring = mapnik::value_unicode_string;
-
- auto ds = get_csv_ds("test/data/csv/leading_zeros.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "fips"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String});
-
- auto featureset = all_features(ds);
- require_attributes(featureset->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"fips", ustring("001")}});
- require_attributes(featureset->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"fips", ustring("003")}});
- require_attributes(featureset->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"fips", ustring("005")}});
- } // END SECTION
-
- SECTION("advanced geometry detection") {
- using row = std::pair<std::string, mapnik::datasource_geometry_t>;
-
- for (row r : {
- row{"point", mapnik::datasource_geometry_t::Point}
- , row{"poly", mapnik::datasource_geometry_t::Polygon}
- , row{"multi_poly", mapnik::datasource_geometry_t::Polygon}
- , row{"line", mapnik::datasource_geometry_t::LineString}
- }) {
- std::string file_name = (boost::format("test/data/csv/%1%_wkt.csv") % r.first).str();
- auto ds = get_csv_ds(file_name);
- CHECK(ds->get_geometry_type() == r.second);
- }
- } // END SECTION
-
- SECTION("creation of CSV from in-memory strings") {
- using ustring = mapnik::value_unicode_string;
-
- for (auto const &name : {std::string("Winthrop, WA"), std::string(u8"Qu\u00e9bec")}) {
- std::string csv_string =
- (boost::format(
- "wkt,Name\n"
- "\"POINT (120.15 48.47)\",\"%1%\"\n"
- ) % name).str();
-
- mapnik::parameters params;
- params["type"] = std::string("csv");
- params["inline"] = csv_string;
- auto ds = mapnik::datasource_cache::instance().create(params);
- REQUIRE(bool(ds));
-
- auto feature = all_features(ds)->next();
- REQUIRE(bool(feature));
- REQUIRE(feature->has_key("Name"));
- CHECK(feature->get("Name") == ustring(name.c_str()));
- }
- } // END SECTION
-
- SECTION("geojson quoting") {
- using mapnik::geometry::geometry_types;
-
- for (auto const &file : {
- std::string("test/data/csv/geojson_double_quote_escape.csv")
- , std::string("test/data/csv/geojson_single_quote.csv")
- , std::string("test/data/csv/geojson_2x_double_quote_filebakery_style.csv")
- }) {
- auto ds = get_csv_ds(file);
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"type"});
- require_field_types(fields, {mapnik::String});
-
- auto featureset = all_features(ds);
- require_geometry(featureset->next(), 1, geometry_types::Point);
- require_geometry(featureset->next(), 1, geometry_types::LineString);
- require_geometry(featureset->next(), 1, geometry_types::Polygon);
- require_geometry(featureset->next(), 1, geometry_types::Polygon);
- require_geometry(featureset->next(), 4, geometry_types::MultiPoint);
- require_geometry(featureset->next(), 2, geometry_types::MultiLineString);
- require_geometry(featureset->next(), 2, geometry_types::MultiPolygon);
- require_geometry(featureset->next(), 2, geometry_types::MultiPolygon);
- }
- } // END SECTION
-
- SECTION("blank undelimited rows are still parsed") {
- using ustring = mapnik::value_unicode_string;
-
- // TODO: does this mean this CSV file should be in the warnings
- // subdirectory, since it doesn't work in strict mode?
- auto ds = get_csv_ds("test/data/csv/more_headers_than_column_values.csv", false);
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "one", "two", "three"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String, mapnik::String, mapnik::String});
-
- require_attributes(all_features(ds)->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"one", ustring("")}, attr{"two", ustring("")}, attr{"three", ustring("")} });
- } // END SECTION
-
- SECTION("fewer headers than rows throws") {
- REQUIRE_THROWS(get_csv_ds("test/data/csv/more_column_values_than_headers.csv"));
- } // END SECTION
-
- SECTION("feature ID only incremented for valid rows") {
- auto ds = get_csv_ds("test/data/csv/warns/feature_id_counting.csv", false);
- auto fs = all_features(ds);
-
- // first
- auto feature = fs->next();
- REQUIRE(bool(feature));
- CHECK(feature->id() == 1);
-
- // second, should have skipped bogus one
- feature = fs->next();
- REQUIRE(bool(feature));
- CHECK(feature->id() == 2);
-
- feature = fs->next();
- CHECK(!feature);
- } // END SECTION
-
- SECTION("dynamically defining headers") {
- using ustring = mapnik::value_unicode_string;
- using row = std::pair<std::string, std::size_t>;
-
- for (auto const &r : {
- row{"test/data/csv/fails/needs_headers_two_lines.csv", 2}
- , row{"test/data/csv/fails/needs_headers_one_line.csv", 1}
- , row{"test/data/csv/fails/needs_headers_one_line_no_newline.csv", 1}
- }) {
- mapnik::parameters params;
- params["type"] = std::string("csv");
- params["file"] = r.first;
- params["headers"] = "x,y,name";
- auto ds = mapnik::datasource_cache::instance().create(params);
- REQUIRE(bool(ds));
-
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "name"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::String});
- require_attributes(all_features(ds)->next(), {
- attr{"x", 0}, attr{"y", 0}, attr{"name", ustring("data_name")} });
- REQUIRE(count_features(all_features(ds)) == r.second);
- }
- } // END SECTION
-
- #pragma GCC diagnostic push
- #pragma GCC diagnostic ignored "-Wlong-long"
- SECTION("64bit int fields work") {
- auto ds = get_csv_ds("test/data/csv/64bit_int.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "bigint"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::Integer});
-
- auto fs = all_features(ds);
- auto feature = fs->next();
- require_attributes(feature, {
- attr{"x", 0}, attr{"y", 0}, attr{"bigint", 2147483648} });
-
- feature = fs->next();
- require_attributes(feature, {
- attr{"x", 0}, attr{"y", 0}, attr{"bigint", 9223372036854775807ll} });
- require_attributes(feature, {
- attr{"x", 0}, attr{"y", 0}, attr{"bigint", 0x7FFFFFFFFFFFFFFFll} });
- } // END SECTION
- #pragma GCC diagnostic pop
-
- SECTION("various number types") {
- auto ds = get_csv_ds("test/data/csv/number_types.csv");
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {"x", "y", "floats"});
- require_field_types(fields, {mapnik::Integer, mapnik::Integer, mapnik::Double});
-
- auto fs = all_features(ds);
- for (double d : { .0, +.0, 1e-06, -1e-06, 0.000001, 1.234e+16, 1.234e+16 }) {
- auto feature = fs->next();
- REQUIRE(bool(feature));
- CHECK(feature->get("floats").get<mapnik::value_double>() == Approx(d));
- }
- } // END SECTION
-
- SECTION("manually supplied extent") {
- std::string csv_string("wkt,Name\n");
- mapnik::parameters params;
- params["type"] = std::string("csv");
- params["inline"] = csv_string;
- params["extent"] = "-180,-90,180,90";
- auto ds = mapnik::datasource_cache::instance().create(params);
- REQUIRE(bool(ds));
-
- auto box = ds->envelope();
- CHECK(box.minx() == -180);
- CHECK(box.miny() == -90);
- CHECK(box.maxx() == 180);
- CHECK(box.maxy() == 90);
- } // END SECTION
-
- SECTION("inline geojson") {
- std::string csv_string = "geojson\n'{\"coordinates\":[-92.22568,38.59553],\"type\":\"Point\"}'";
- mapnik::parameters params;
- params["type"] = std::string("csv");
- params["inline"] = csv_string;
- auto ds = mapnik::datasource_cache::instance().create(params);
- REQUIRE(bool(ds));
-
- auto fields = ds->get_descriptor().get_descriptors();
- require_field_names(fields, {});
-
- // TODO: this originally had the following comment:
- // - re-enable after https://github.com/mapnik/mapnik/issues/2319 is fixed
- // but that seems to have been merged and tested separately?
- auto fs = all_features(ds);
- auto feat = fs->next();
- CHECK(feature_count(feat->get_geometry()) == 1);
- } // END SECTION
-
- mapnik::logger::instance().set_severity(severity);
- }
+ auto box = ds->envelope();
+ CHECK(box.minx() == -180);
+ CHECK(box.miny() == -90);
+ CHECK(box.maxx() == 180);
+ CHECK(box.maxy() == 90);
+ } // END SECTION
+
+ SECTION("inline geojson") {
+ std::string csv_string = "geojson\n'{\"coordinates\":[-92.22568,38.59553],\"type\":\"Point\"}'";
+ mapnik::parameters params;
+ params["type"] = std::string("csv");
+ params["inline"] = csv_string;
+ auto ds = mapnik::datasource_cache::instance().create(params);
+ REQUIRE(bool(ds));
+
+ auto fields = ds->get_descriptor().get_descriptors();
+ require_field_names(fields, {});
+
+ // TODO: this originally had the following comment:
+ // - re-enable after https://github.com/mapnik/mapnik/issues/2319 is fixed
+ // but that seems to have been merged and tested separately?
+ auto fs = all_features(ds);
+ auto feat = fs->next();
+ CHECK(feature_count(feat->get_geometry()) == 1);
+ } // END SECTION
+ mapnik::logger::instance().set_severity(severity);
+ }
} // END TEST CASE
diff --git a/test/standalone/datasource_registration_test.cpp b/test/standalone/datasource_registration_test.cpp
new file mode 100644
index 0000000..2edbcd7
--- /dev/null
+++ b/test/standalone/datasource_registration_test.cpp
@@ -0,0 +1,46 @@
+#define CATCH_CONFIG_MAIN
+#include "catch.hpp"
+
+#include <mapnik/datasource_cache.hpp>
+#include <mapnik/debug.hpp>
+
+#include <iostream>
+#include <vector>
+#include <algorithm>
+
+TEST_CASE("datasource_cache") {
+
+SECTION("registration") {
+ try
+ {
+ mapnik::logger logger;
+ mapnik::logger::severity_type original_severity = logger.get_severity();
+ bool success = false;
+ auto &cache = mapnik::datasource_cache::instance();
+
+ // registering a directory without any plugins should return false
+ success = cache.register_datasources("test/data/vrt");
+ CHECK(success == false);
+
+ // registering a directory for the first time should return true
+ success = cache.register_datasources("plugins/input");
+ REQUIRE(success == true);
+
+ // registering the same directory again should now return false
+ success = cache.register_datasources("plugins/input");
+ CHECK(success == false);
+
+ // registering the same directory again, but recursively should
+ // still return false - even though there are subdirectories, they
+ // do not contain any more plugins.
+ success = cache.register_datasources("plugins/input", true);
+ CHECK(success == false);
+ }
+ catch (std::exception const & ex)
+ {
+ std::clog << ex.what() << "\n";
+ REQUIRE(false);
+ }
+
+}
+}
diff --git a/test/unit/svg/svg_parser_test.cpp b/test/unit/svg/svg_parser_test.cpp
index f33962d..276e3c5 100644
--- a/test/unit/svg/svg_parser_test.cpp
+++ b/test/unit/svg/svg_parser_test.cpp
@@ -163,11 +163,10 @@ TEST_CASE("SVG parser") {
svg_path_adapter svg_path(stl_storage);
svg_converter_type svg(svg_path, path.attributes());
svg_parser p(svg);
-
if (!p.parse_from_string(svg_str))
{
auto const& errors = p.error_messages();
- REQUIRE(errors.size() == 14);
+ REQUIRE(errors.size() == 13);
REQUIRE(errors[0] == "parse_rect: Invalid width");
REQUIRE(errors[1] == "Failed to parse double: \"FAIL\"");
REQUIRE(errors[2] == "parse_rect: Invalid height");
@@ -181,7 +180,6 @@ TEST_CASE("SVG parser") {
REQUIRE(errors[10] == "Failed to parse <polyline> 'points'");
REQUIRE(errors[11] == "parse_ellipse: Invalid rx");
REQUIRE(errors[12] == "parse_ellipse: Invalid ry");
- REQUIRE(errors[13] == "parse_rect: Invalid height");
}
}
diff --git a/test/visual/run.cpp b/test/visual/run.cpp
index 62ca73b..833f97b 100644
--- a/test/visual/run.cpp
+++ b/test/visual/run.cpp
@@ -56,6 +56,7 @@ int main(int argc, char** argv)
("duration,d", "output rendering duration")
("iterations,i", po::value<std::size_t>()->default_value(1), "number of iterations for benchmarking")
("jobs,j", po::value<std::size_t>()->default_value(1), "number of parallel threads")
+ ("limit,l", po::value<std::size_t>()->default_value(0), "limit number of failures")
("styles-dir", po::value<std::string>()->default_value("test/data-visual/styles"), "directory with styles")
("images-dir", po::value<std::string>()->default_value("test/data-visual/images"), "directory with reference images")
("output-dir", po::value<std::string>()->default_value("/tmp/mapnik-visual-images"), "directory for output files")
@@ -111,6 +112,7 @@ int main(int argc, char** argv)
vm["images-dir"].as<std::string>(),
vm.count("overwrite"),
vm["iterations"].as<std::size_t>(),
+ vm["limit"].as<std::size_t>(),
vm["jobs"].as<std::size_t>());
bool show_duration = vm.count("duration");
report_type report(vm.count("verbose") ? report_type((console_report(show_duration))) : report_type((console_short_report(show_duration))));
diff --git a/test/visual/runner.cpp b/test/visual/runner.cpp
index 45dc366..a987d05 100644
--- a/test/visual/runner.cpp
+++ b/test/visual/runner.cpp
@@ -23,6 +23,7 @@
// stl
#include <algorithm>
#include <future>
+#include <atomic>
#include <mapnik/load_map.hpp>
@@ -40,14 +41,18 @@ public:
double scale_factor,
result_list & results,
report_type & report,
- std::size_t iterations)
+ std::size_t iterations,
+ bool is_fail_limit,
+ std::atomic<std::size_t> & fail_count)
: name_(name),
map_(map),
tiles_(tiles),
scale_factor_(scale_factor),
results_(results),
report_(report),
- iterations_(iterations)
+ iterations_(iterations),
+ is_fail_limit_(is_fail_limit),
+ fail_count_(fail_count)
{
}
@@ -82,6 +87,10 @@ private:
r.duration = end - start;
mapnik::util::apply_visitor(report_visitor(r), report_);
results_.push_back(std::move(r));
+ if (is_fail_limit_ && r.state == STATE_FAIL)
+ {
+ ++fail_count_;
+ }
}
}
}
@@ -112,6 +121,8 @@ private:
result_list & results_;
report_type & report_;
std::size_t iterations_;
+ bool is_fail_limit_;
+ std::atomic<std::size_t> & fail_count_;
};
runner::runner(runner::path_type const & styles_dir,
@@ -119,12 +130,14 @@ runner::runner(runner::path_type const & styles_dir,
runner::path_type const & reference_dir,
bool overwrite,
std::size_t iterations,
+ std::size_t fail_limit,
std::size_t jobs)
: styles_dir_(styles_dir),
output_dir_(output_dir),
reference_dir_(reference_dir),
jobs_(jobs),
iterations_(iterations),
+ fail_limit_(fail_limit),
renderers_{ renderer<agg_renderer>(output_dir_, reference_dir_, overwrite)
#if defined(HAVE_CAIRO)
,renderer<cairo_renderer>(output_dir_, reference_dir_, overwrite)
@@ -182,6 +195,7 @@ result_list runner::test_parallel(std::vector<runner::path_type> const & files,
std::launch launch(jobs == 1 ? std::launch::deferred : std::launch::async);
std::vector<std::future<result_list>> futures(jobs);
+ std::atomic<std::size_t> fail_count(0);
for (std::size_t i = 0; i < jobs; i++)
{
@@ -194,7 +208,7 @@ result_list runner::test_parallel(std::vector<runner::path_type> const & files,
end = files.end();
}
- futures[i] = std::async(launch, &runner::test_range, this, begin, end, std::ref(report));
+ futures[i] = std::async(launch, &runner::test_range, this, begin, end, std::ref(report), std::ref(fail_count));
}
for (auto & f : futures)
@@ -206,7 +220,10 @@ result_list runner::test_parallel(std::vector<runner::path_type> const & files,
return results;
}
-result_list runner::test_range(files_iterator begin, files_iterator end, std::reference_wrapper<report_type> report) const
+result_list runner::test_range(files_iterator begin,
+ files_iterator end,
+ std::reference_wrapper<report_type> report,
+ std::reference_wrapper<std::atomic<std::size_t>> fail_count) const
{
config defaults;
result_list results;
@@ -218,7 +235,7 @@ result_list runner::test_range(files_iterator begin, files_iterator end, std::re
{
try
{
- result_list r = test_one(file, defaults, report);
+ result_list r = test_one(file, defaults, report, fail_count.get());
std::move(r.begin(), r.end(), std::back_inserter(results));
}
catch (std::exception const& ex)
@@ -227,16 +244,25 @@ result_list runner::test_range(files_iterator begin, files_iterator end, std::re
r.state = STATE_ERROR;
r.name = file.string();
r.error_message = ex.what();
+ r.duration = std::chrono::high_resolution_clock::duration::zero();
results.emplace_back(r);
mapnik::util::apply_visitor(report_visitor(r), report.get());
+ ++fail_count.get();
}
}
+ if (fail_limit_ && fail_count.get() >= fail_limit_)
+ {
+ break;
+ }
}
return results;
}
-result_list runner::test_one(runner::path_type const& style_path, config cfg, report_type & report) const
+result_list runner::test_one(runner::path_type const& style_path,
+ config cfg,
+ report_type & report,
+ std::atomic<std::size_t> & fail_count) const
{
mapnik::Map map(cfg.sizes.front().width, cfg.sizes.front().height);
result_list results;
@@ -317,7 +343,19 @@ result_list runner::test_one(runner::path_type const& style_path, config cfg, re
{
map.zoom_all();
}
- mapnik::util::apply_visitor(renderer_visitor(name, map, tiles_count, scale_factor, results, report, iterations_), ren);
+ mapnik::util::apply_visitor(renderer_visitor(name,
+ map,
+ tiles_count,
+ scale_factor,
+ results,
+ report,
+ iterations_,
+ fail_limit_,
+ fail_count), ren);
+ if (fail_limit_ && fail_count >= fail_limit_)
+ {
+ return results;
+ }
}
}
}
diff --git a/test/visual/runner.hpp b/test/visual/runner.hpp
index a4c91ba..65b19bb 100644
--- a/test/visual/runner.hpp
+++ b/test/visual/runner.hpp
@@ -55,6 +55,7 @@ public:
path_type const & reference_dir,
bool overwrite,
std::size_t iterations,
+ std::size_t fail_limit,
std::size_t jobs);
result_list test_all(report_type & report) const;
@@ -62,8 +63,13 @@ public:
private:
result_list test_parallel(std::vector<path_type> const & files, report_type & report, std::size_t jobs) const;
- result_list test_range(files_iterator begin, files_iterator end, std::reference_wrapper<report_type> report) const;
- result_list test_one(path_type const & style_path, config cfg, report_type & report) const;
+ result_list test_range(files_iterator begin,
+ files_iterator end,
+ std::reference_wrapper<report_type> report,
+ std::reference_wrapper<std::atomic<std::size_t>> fail_limit) const;
+ result_list test_one(path_type const & style_path,
+ config cfg, report_type & report,
+ std::atomic<std::size_t> & fail_limit) const;
void parse_map_sizes(std::string const & str, std::vector<map_size> & sizes) const;
const map_sizes_grammar<std::string::const_iterator> map_sizes_parser_;
@@ -72,6 +78,7 @@ private:
const path_type reference_dir_;
const std::size_t jobs_;
const std::size_t iterations_;
+ const std::size_t fail_limit_;
const renderer_type renderers_[boost::mpl::size<renderer_type::types>::value];
};
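The fail-limit change to test/visual/runner.cpp above shares one std::atomic counter between all worker ranges and stops each range once the counter reaches the limit. A minimal standalone sketch of that pattern follows. It is not mapnik code: `test_fn`, `run_range` and `run_parallel` are hypothetical names introduced only for illustration, and a test is reduced to a callable that returns true on failure. As in the diff, a limit of 0 is taken to mean "no limit" and a single job uses deferred launch.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Hypothetical stand-in for one visual test: returns true when the test fails.
using test_fn = std::function<bool()>;

// Runs tests[begin, end) in order and returns how many were executed.
// Stops early once the shared failure counter reaches fail_limit
// (assumption mirrored from the diff: fail_limit == 0 disables the limit).
std::size_t run_range(std::vector<test_fn> const& tests,
                      std::size_t begin,
                      std::size_t end,
                      std::size_t fail_limit,
                      std::atomic<std::size_t>& fail_count)
{
    assert(begin <= end && end <= tests.size());
    std::size_t executed = 0;
    for (std::size_t i = begin; i < end; ++i)
    {
        ++executed;
        if (tests[i]())
        {
            ++fail_count; // atomic increment, visible to all workers
        }
        if (fail_limit && fail_count >= fail_limit)
        {
            break; // some worker (possibly this one) has hit the limit
        }
    }
    return executed;
}

// Splits the tests into `jobs` contiguous ranges and runs them with a
// shared failure counter. Returns the total number of tests executed.
std::size_t run_parallel(std::vector<test_fn> const& tests,
                         std::size_t jobs,
                         std::size_t fail_limit)
{
    if (jobs == 0) jobs = 1;
    std::atomic<std::size_t> fail_count(0);
    // One job runs lazily on the calling thread; more jobs run asynchronously.
    std::launch policy = (jobs == 1) ? std::launch::deferred : std::launch::async;
    std::size_t chunk = tests.size() / jobs;

    std::vector<std::future<std::size_t>> futures;
    for (std::size_t i = 0; i < jobs; ++i)
    {
        std::size_t begin = i * chunk;
        // The last range also takes the remainder.
        std::size_t end = (i + 1 == jobs) ? tests.size() : begin + chunk;
        futures.push_back(std::async(policy, run_range, std::cref(tests),
                                     begin, end, fail_limit,
                                     std::ref(fail_count)));
    }

    std::size_t executed = 0;
    for (auto& f : futures)
    {
        executed += f.get();
    }
    return executed;
}
```

With more than one job the limit is approximate, since several workers may each finish a failing test before any of them observes the updated counter. The sketch checks the limit only after each test, whereas the diff also checks it between renderer runs inside test_one.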
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/pkg-grass/mapnik.git