Data Driver

Blog archive

MapR Bolsters Hadoop Distro with Apache Drill for SQL-Based Big Data Analytics

After stewarding the open source project from incubation to its new 1.0 release, MapR Technologies Inc. added Apache Drill for SQL-based Big Data analytics to its Apache Hadoop distribution.

The company -- one of the "big three" Hadoop vendors along with Hortonworks Inc. and Coudera Inc. -- this week announced the general availability of the open source Apache Drill 1.0 and its inclusion in the MapR Hadoop distribution.

Drill is a low-latency query engine based on ANSI SQL standards that facilitates self-service, interactive analytics at Big Data scales, including up to petabyte scale (1 PB is equal to 1 million GB). One of its key features is that it doesn't depend on traditional database schemas that describe how data is categorized. Discovering such schemas on the fly makes for quicker analytics, the company said.

MapR engineers including Jacques Nadeau and Steven Phillips have taken the lead on the open source project, which was incubated at the Apache Sofwtare Foundation (ASF) in September 2012 with the goal of wedding the familiar workings of relational databases with the huge new scalability demanded by the Big Data era and the agility of Hadoop systems and their heavy use of NoSQL databases.

"The project has been on the fast track in the last nine months since the developer preview in August 2014, delivering seven significant iterative releases, each adding exciting new features and most importantly, improving on the stability, scale and performance required for broader enterprise deployments," MapR exec Neeraja Rentachintala said in a blog post Tuesday.

The Apache Drill Project Timeline
[Click on image for larger view.] The Apache Drill Project Timeline (source: MapR Technologies Inc.)

In addition to SQL queries, the tool can work with varying types of data, including files, NoSQL databases and more complex types of data such as JSON and Parquet.

"Drill enables interactivity with data from both legacy transactional systems and new data sources, such as Internet of Things (IOT) sensors, Web click-streams and other semi-structured data, along with support for popular business intelligence (BI) and data visualization tools," MapR said in a news release. "Drill provides reliability and performance at Hadoop scale with integrated granular security and governance capabilities required for multi-tenant data lakes or enterprise data hubs."

Upcoming features planned for future editions of Drill include more functionality centered on JSON, SQL, complex data functions and new file formats, Rentachintala said.

Posted by David Ramel on 05/22/2015


comments powered by Disqus

Featured

  • Mads Kristensen Eyes MCP Server for Visual Studio Copilot

    "What MCP server would be helpful to use with Copilot in Visual Studio? I want to write one."

  • Two Different Takes on Cursor/Copilot Vibe Coding Supremacy

    Cursor and GitHub Copilot go head-to-head in a pair of firsthand reviews. One coder returns to Copilot after it adds support for top LLMs. A coding writer falls for Cursor’s conversational style and beginner-friendly flow.

  • Linear Regression with Two-Way Interactions Using C#

    Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of linear regression with two-way interactions between predictor variables. Compared to standard linear regression, which predicts a single numeric value based only on a linear combination of predictor values, linear regression with interactions can handle more complex data while retaining a high level of model interpretability.

  • Vibe Writing

    Why outline when you can prompt? Vibe writing is the new vibe coding, and yes, it’s exactly what it sounds like.

  • Next-gen SQL Projects with Microsoft.Build.Sql

    SQL development is evolving fast, and Microsoft's Drew Skwiers-Koballa will explain it all in a featured session at the VS Live! @ Microsoft HQ developer conference being held at the company's Redmond campus in August.

Subscribe on YouTube

Upcoming Training Events

0 AM
Visual Studio Live! San Diego
September 8-12, 2025
Live! 360 Orlando
November 16-21, 2025
Cloud & Containers Live! Orlando
November 16-21, 2025
Data Platform Live! Orlando
November 16-21, 2025
Visual Studio Live! Orlando
November 16-21, 2025