Data Driver

Blog archive

SQL Encroaches on Big Data Turf

Remember when SQL developers felt threatened by Big Data? Relational database management systems were old-school relics that couldn't cope with the vast amounts of unstructured, disparate data. NoSQL was the future. You needed to get onboard with Hadoop and MapReduce, running on Linux.

Well, not anymore.

Maybe not ever, really. There is just too big of an installed base of SQL developers and systems for the two camps, Big Data and SQL, to have remained apart. Even four or five years ago the convergence was underway with Hive, a data warehouse system for Hadoop that uses "a SQL-like language called HiveQL."

That convergence seems to be rapidly accelerating. Microsoft has been helping out, of course, with PolyBase in its SQL Server 2012 Parallel Data Warehouse to enable SQL queries of Big Data and initiatives such as HDInsight and the Hortonworks Data Platform to get Big Data into the Windows ecosystem.

But Redmond has plenty of company. Just this week I had the opportunity to interview Web coding pioneer Lloyd Tabb about the subject when his new company, Looker Data Sciences Inc., announced a query-based business intelligence (BI) platform called Looker. "SQL and relational querying is the best way to ask questions of large related data sets," Tabb told me.

He should know what he's talking about. He was a database and languages architect at Borland in the earlier days of RDBMS and went on to build LiveWire, the first application server for the World Wide Web. He was later a principal engineer at Netscape where he was architect of Netscape Navigator Gold (later named Composer), the first WYSIWYG HTML editor, and the engineering lead for Netscape Communicator. He helped found Mozilla.org and later became a pioneer in crowdsourcing, just to name a few of his accomplishments.

Looker, according to the company, "uses a new modeling language, LookML, which enhances SQL for analytics so end-users can perform powerful analytics without needing to know how a query is written."

I asked Tabb about the use of SQL instead of NoSQL, Hadoop or other Big Data technologies associated with BI analytics, and he gave me a little history lesson.

"Back in the day conventional wisdom was that if you were going to create an application for a PC you had to write it in Assembly language," Tabb said. "Higher-level languages generated code that was too big and too slow. Later, conventional wisdom was that you couldn't build a 'real-applicaiton' in an agile language--it was too big and too slow.

"Hadoop was designed because at the time there were no SQL engines that could deal with data sets that large. Developers regressed to hand coding queries in MapReduce. Both SQL and C are still in use today because they are the best abstractions for the kinds of problems they solve."

Looking around, I see lots of other evidence pointing to the Borg-like assimilation of Big Data by SQL. A few weeks ago GigaOM explored the subject with an article titled "SQL is what's next for Hadoop: Here's who's doing it," and just yesterday a PluralSight course on the topic was announced, described as "An investigation into the convergence of relational SQL database technologies from several vendors and Big Data technologies like Apache Hadoop."

And there are plenty more similar things going on out there. So rest easy, SQL data developers, your future is still bright.

What do you think about the convergence of Big Data and SQL? Share your thoughts by commenting here or by e-mail.

Posted by David Ramel on 03/08/2013


comments powered by Disqus

Featured

  • Mads Kristensen Eyes MCP Server for Visual Studio Copilot

    "What MCP server would be helpful to use with Copilot in Visual Studio? I want to write one."

  • Two Different Takes on Cursor/Copilot Vibe Coding Supremacy

    Cursor and GitHub Copilot go head-to-head in a pair of firsthand reviews. One coder returns to Copilot after it adds support for top LLMs. A coding writer falls for Cursor’s conversational style and beginner-friendly flow.

  • Linear Regression with Two-Way Interactions Using C#

    Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of linear regression with two-way interactions between predictor variables. Compared to standard linear regression, which predicts a single numeric value based only on a linear combination of predictor values, linear regression with interactions can handle more complex data while retaining a high level of model interpretability.

  • Vibe Writing

    Why outline when you can prompt? Vibe writing is the new vibe coding, and yes, it’s exactly what it sounds like.

  • Next-gen SQL Projects with Microsoft.Build.Sql

    SQL development is evolving fast, and Microsoft's Drew Skwiers-Koballa will explain it all in a featured session at the VS Live! @ Microsoft HQ developer conference being held at the company's Redmond campus in August.

Subscribe on YouTube

Upcoming Training Events

0 AM
Visual Studio Live! San Diego
September 8-12, 2025
Live! 360 Orlando
November 16-21, 2025
Cloud & Containers Live! Orlando
November 16-21, 2025
Data Platform Live! Orlando
November 16-21, 2025
Visual Studio Live! Orlando
November 16-21, 2025