TMCnet News

MapR Advances Support for Flexible and High Performance Analytics on JSON and S3 Data with Apache Drill
[January 30, 2019]


MapR® Technologies, Inc., provider of the industry's next generation data platform for AI and Analytics, today announced support for Apache Drill 1.15. The release adds enhancements for running powerful queries on highly complex nested data structures, including files, MapR JSON database tables, and cloud data sources, specifically S3.

"The latest Drill release is aimed at further improving intuitive access to different data types across on-premises and cloud data sources as well as enhancing performance and usability," said Neeraja Rentachintala, vice president of product management, MapR. "We evolved Drill by closely listening to our customers, and it is exciting to see our customers achieve true self service data exploration without compromising on analytic flexibility and performance."

Drill 1.15 expands ANSI SQL compliance and improves query performance for both Parquet and MapR-DB JSON tables. The new release also makes it easier to deploy Drill in multi-tenant environments alongside other analytic frameworks such as Hive and Spark, while achieving predictable SLAs, so that interactive analytics can be conducted at any scale.

With today's release, Drill 1.15 introduces the following features:

  • S3 Plug-In Support. Customers can now access data in S3 through Drill and join it with other supported data sources such as Parquet, Hive and JSON, all in a single query. Drill also supports writing to S3 buckets by creating tables. By bringing read and write capability to S3 buckets, MapR continues to integrate with cloud applications and add to the existing object tiering offering.
  • Expanded Spill to Disk Capability. Spill to disk for memory-intensive queries has been expanded to include all SQL operations that rely on memory, such as GROUP BY, JOIN, ORDER BY and DISTINCT. Memory controls can now be put in place so that large, memory-intensive queries that pass a defined threshold spill to disk.
  • Spin Up Multiple Drill Clusters and Set Resource Controls. Customers can now spin up multiple Drill clusters within a single MapR cluster to support multi-tenancy and segregate workloads by user persona, as well as set CPU resource limits through cgroups. Running separate Drill clusters for different user personas on a shared MapR cluster isolates Drill compute workloads and gives each guaranteed minimum resources.
  • Leverage MapR Document Database Secondary Indexes for Complex Types. Secondary indexes can now be created on complex nested types such as MAPs and ARRAYs. Drill can now leverage indexes on an entire ARRAY, on ARRAY elements, on an entire MAP or on MAP elements, regardless of whether they are primitive or complex.
  • Deeper Integration with Parquet. Drill 1.15 provides deeper integration with Parquet. Filters on strings can now be pushed down to the underlying Parquet API, so scanning Parquet files returns only the rows that match the query predicates. In addition, pushdown can occur across a broader range of queries, such as joined tables with predicates on only one table, as well as predicates on complex nested types.
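
As an illustration of the spill-to-disk idea described in the list above, the following is a minimal Python sketch of a GROUP BY style count that writes partial results to a temporary file once the number of in-memory groups passes a defined threshold. It is not Drill's implementation; the function name `spilling_count` and the parameter `max_groups_in_memory` are hypothetical and exist only for this example.

```python
import json
import tempfile
from collections import Counter


def spilling_count(keys, max_groups_in_memory):
    """Count occurrences per string key, spilling partial counts to disk.

    Hypothetical illustration only: `max_groups_in_memory` stands in for
    the "defined threshold" and is not a real Drill option.
    """
    in_memory = Counter()
    spill_files = []
    for key in keys:
        in_memory[key] += 1
        if len(in_memory) > max_groups_in_memory:
            # Threshold passed: write the partial counts to a temporary
            # file and start again with an empty in-memory table.
            spill = tempfile.TemporaryFile(mode="w+")
            json.dump(in_memory, spill)
            spill_files.append(spill)
            in_memory = Counter()

    # Merge phase: combine the spilled partial counts with whatever is
    # still in memory. For simplicity this sketch merges everything in
    # memory; a real engine would merge in bounded-size passes.
    totals = Counter(in_memory)
    for spill in spill_files:
        spill.seek(0)
        totals.update(json.load(spill))
        spill.close()
    return dict(totals), len(spill_files)
```

Calling `spilling_count(["a", "b", "a", "c"], 2)` returns the per-key counts together with the number of spill files written, so the effect of lowering or raising the threshold can be observed directly.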



SUPPORTING RESOURCES

Read more in the MapR blog: MapR Announces Drill 1.15 with S3 Cloud-Storage Plug-In


Tweet this: .@MapR Packs Enterprise Grade Features and Low Latency Capabilities to Apache Drill #apachedrill #apache #parquet #json https://mapr.com/company/press-releases/mapr-advances-analytics-support-on-json-and-s3-data-with-apache-drill

About MapR Technologies

MapR Technologies, provider of the industry's next generation data platform for AI and Analytics, enables enterprises to inject analytics into their business processes to increase revenue, reduce costs, and mitigate risks. MapR addresses the data complexities of high-scale and mission critical distributed processing from the cloud to the edge, IoT analytics, and container persistence. Global 2000 enterprises trust the MapR Data Platform to help them solve their most complex AI and analytics challenges. Amazon, Cisco, Google, Microsoft, SAP and other leading businesses are all part of the MapR ecosystem. For more information, visit mapr.com.

MapR is a registered trademark of MapR Technologies, Inc. in the United States and other countries. Other names and brands may be the property of others.

