Data Driver

Blog archive

SQL Encroaches on Big Data Turf

Remember when SQL developers felt threatened by Big Data? Relational database management systems were old-school relics that couldn't cope with the vast amounts of unstructured, disparate data. NoSQL was the future. You needed to get onboard with Hadoop and MapReduce, running on Linux.

Well, not anymore.

Maybe not ever, really. There is just too big of an installed base of SQL developers and systems for the two camps, Big Data and SQL, to have remained apart. Even four or five years ago the convergence was underway with Hive, a data warehouse system for Hadoop that uses "a SQL-like language called HiveQL."

That convergence seems to be rapidly accelerating. Microsoft has been helping out, of course, with PolyBase in its SQL Server 2012 Parallel Data Warehouse to enable SQL queries of Big Data and initiatives such as HDInsight and the Hortonworks Data Platform to get Big Data into the Windows ecosystem.

But Redmond has plenty of company. Just this week I had the opportunity to interview Web coding pioneer Lloyd Tabb about the subject when his new company, Looker Data Sciences Inc., announced a query-based business intelligence (BI) platform called Looker. "SQL and relational querying is the best way to ask questions of large related data sets," Tabb told me.

He should know what he's talking about. He was a database and languages architect at Borland in the earlier days of RDBMS and went on to build LiveWire, the first application server for the World Wide Web. He was later a principal engineer at Netscape where he was architect of Netscape Navigator Gold (later named Composer), the first WYSIWYG HTML editor, and the engineering lead for Netscape Communicator. He helped found Mozilla.org and later became a pioneer in crowdsourcing, just to name a few of his accomplishments.

Looker, according to the company, "uses a new modeling language, LookML, which enhances SQL for analytics so end-users can perform powerful analytics without needing to know how a query is written."

I asked Tabb about the use of SQL instead of NoSQL, Hadoop or other Big Data technologies associated with BI analytics, and he gave me a little history lesson.

"Back in the day conventional wisdom was that if you were going to create an application for a PC you had to write it in Assembly language," Tabb said. "Higher-level languages generated code that was too big and too slow. Later, conventional wisdom was that you couldn't build a 'real-applicaiton' in an agile language--it was too big and too slow.

"Hadoop was designed because at the time there were no SQL engines that could deal with data sets that large. Developers regressed to hand coding queries in MapReduce. Both SQL and C are still in use today because they are the best abstractions for the kinds of problems they solve."

Looking around, I see lots of other evidence pointing to the Borg-like assimilation of Big Data by SQL. A few weeks ago GigaOM explored the subject with an article titled "SQL is what's next for Hadoop: Here's who's doing it," and just yesterday a PluralSight course on the topic was announced, described as "An investigation into the convergence of relational SQL database technologies from several vendors and Big Data technologies like Apache Hadoop."

And there are plenty more similar things going on out there. So rest easy, SQL data developers, your future is still bright.

What do you think about the convergence of Big Data and SQL? Share your thoughts by commenting here or by e-mail.

Posted by David Ramel on 03/08/2013


comments powered by Disqus

Featured

  • Compare New GitHub Copilot Free Plan for Visual Studio/VS Code to Paid Plans

    The free plan restricts the number of completions, chat requests and access to AI models, being suitable for occasional users and small projects.

  • Diving Deep into .NET MAUI

    Ever since someone figured out that fiddling bits results in source code, developers have sought one codebase for all types of apps on all platforms, with Microsoft's latest attempt to further that effort being .NET MAUI.

  • Copilot AI Boosts Abound in New VS Code v1.96

    Microsoft improved on its new "Copilot Edit" functionality in the latest release of Visual Studio Code, v1.96, its open-source based code editor that has become the most popular in the world according to many surveys.

  • AdaBoost Regression Using C#

    Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the AdaBoost.R2 algorithm for regression problems (where the goal is to predict a single numeric value). The implementation follows the original source research paper closely, so you can use it as a guide for customization for specific scenarios.

  • Versioning and Documenting ASP.NET Core Services

    Building an API with ASP.NET Core is only half the job. If your API is going to live more than one release cycle, you're going to need to version it. If you have other people building clients for it, you're going to need to document it.

Subscribe on YouTube