UPDATED: Microsoft Exec Explains SDS About-Face
Microsoft in recent weeks began dropping hints that it would be announcing a revamped iteration of its SQL Data Services -- its cloud-based database service that's been available for testing for four months -- after the testers insisted they wanted SDS to have native relational capabilities.
In a surprise move, Microsoft said yesterday that it would expose its Tabular Data Stream (TDS) over-the-wire protocol for accessing SQL Server via its forthcoming Azure Services Platform. The move reverses the existing plan to offer SDS via the REST Web services interface. I spoke today with Niraj Nagrani, a senior product manager for SDS at Microsoft, about the changes.
Is it fair to say this is a major revamp from your initial plan?
The plan was always to deliver a relational database. A major part of this acceleration came from the feedback, but we always planned to deliver a relational database.
Did you, in effect, give up on the Entity Attribute Value [EAV] tables?
In the course of our acceleration, we heard a lot of feedback that people wanted the experience of a traditional SQL Server database with its T-SQL compatibility. To deliver that aspect of it we were kind of working around it. We always wanted to deliver the SQL Server experience, so we took the traditional Entity Model and tried to imitate what SQL Server does. But based on the feedback we heard, customers preferred the more traditional T-SQL-based support, so we decided to go in this direction.
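To make the contrast in this answer concrete, here is a minimal sketch of the same lookup against a generic entity-attribute-value layout and against an ordinary relational table. It is an illustration only: it uses Python's built-in sqlite3 module as a stand-in, and the table names, column names and sample rows are assumptions, not the actual SDS schema or API.

```python
import sqlite3

# In-memory SQLite is only a stand-in so the sketch runs anywhere;
# it is not SDS, and the schema below is hypothetical.
db = sqlite3.connect(":memory:")

# Generic EAV layout: one row per (entity, attribute, value) triple.
db.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
db.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        ("c1", "name", "Ada"), ("c1", "city", "London"),
        ("c2", "name", "Grace"), ("c2", "city", "New York"),
    ],
)

# Relational layout: one row per customer, one typed column per attribute.
db.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT, city TEXT)")
db.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("c1", "Ada", "London"), ("c2", "Grace", "New York")],
)

def names_in_city_eav(city):
    """Self-join the EAV table: one alias per attribute involved in the query."""
    rows = db.execute(
        "SELECT n.value FROM eav AS n "
        "JOIN eav AS c ON n.entity = c.entity "
        "WHERE n.attribute = 'name' AND c.attribute = 'city' AND c.value = ? "
        "ORDER BY n.value",
        (city,),
    ).fetchall()
    return [r[0] for r in rows]

def names_in_city_relational(city):
    """The same question against the relational table is a plain filter."""
    rows = db.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    return [r[0] for r in rows]
```

Both functions return the same names for a given city. The EAV version needs one self-join per attribute, which is one reason familiar SQL tooling tends to fit the relational form more naturally.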
Were you surprised at the reaction?
We were very happy with the reaction. Initially we were thinking of going with the traditional entity model, and we were calling it SQL Server. But it really was not similar to a SQL Server-type experience. So the question was, should we toy with the brand and not call it SQL Server or should we keep SQL Server and then deliver a traditional, more familiar experience to our existing customers? But we didn't have enough data points. Until we actually went to the market and got some data points, we didn't really have any justification to do it. Now we have enough proof points. We were not surprised, but we were happy to see that customers confirmed our hypothesis that they do want to have a traditional SQL-like experience.
How much did the fact that Azure Tables and SDS were seen as indistinguishable data storage services factor into the decision?
With the current acceleration to relational databases, definitely the T-SQL-based compatibility and working with the traditional TDS proxy protocol, SQL Server becomes more like a traditional RDBMS database. It's very similar to a SimpleDB-type storage, which is a simple, structured storage with no relational capabilities. So there was a big differentiation between somebody needing an RDBMS database in the cloud versus a shared distributed database that's a highly scalable database built in with HA [high availability], self-healing and data protection, as opposed to structured storage with stored metadata and files.
Are you basically not going to be offering SDS with the EAV tables any more?
We are looking into our future roadmap to make sure that Astoria [ADO.NET Data Services] can be leveraged on top of SDS and Entity Data Model continues to exist, and we will continue to provide for that through Astoria. We will continue to work with the Astoria framework and figure out how SDS can support that.
TDS is not meant to be an Internet-friendly protocol. Is that going to affect performance?
We actually did a lot of benchmarking and testing. We think it's appropriate for what we are doing and the direction we are taking it. We feel comfortable that, as we get more early-adopter customers and look at the type of workloads they are building, we will keep modifying and tweaking our protocols so they are more workload-friendly.
Are you looking at other protocols, as well?
Now we are going to take the TDS and see how we can scale our services and start working with early-adopter customers. SDS will support a breadth of protocols, including the existing TDS over TCP/IP and also options to support TDS over other transports such as HTTP for high-latency scenarios without making modifications to TDS.
So you're not concerned about the speed issues related to TDS?
If you look at any other product in a hosted environment, there is always going to be a latency issue coming from the typical service but also just going over the wire. There are always going to be workloads that are OK with the latency and will adopt the cloud initially, and as we go in the future, the whole cloud infrastructure will improve and support more high-performance workloads. As adoption grows and as we need efficiencies over the Web, I am sure the latency will become a non-issue for quite a few workloads.
What about the scalability questions of relational databases versus the EAV tables used in SDS?
SDS was built on SQL Server as a back end. The engineering team did a lot of re-engineering of the existing SQL Server architecture to have it work in a scale-out infrastructure manner. One of the biggest value benefits of SQL Data Services will be that it's a scale-out architecture and infrastructure, which means that workloads can scale out based on the usage. That covers not only the low-end workloads that don't need a scale-out architecture but also the high-end workloads that currently may have a limitation in the existing Web environment, in terms of how they scale out the infrastructure.
Will SDS support data partitioning?
In SDS V1, data partitioning will need to be handled in the application. Developers who need to scale out across multiple databases will need to shard the data out themselves. In the future, based on customer feedback we will provide automatic provisioning.
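As a rough illustration of what handling partitioning in the application can look like, the sketch below routes each record to one of several databases by hashing its key. Everything in it is an assumption made for the example: the in-memory SQLite connections stand in for separate SDS databases, and names such as NUM_SHARDS, shard_for and the customers table are hypothetical, not part of any SDS API.

```python
import hashlib
import sqlite3

# Hypothetical stand-ins for separate databases; in-memory SQLite is used
# only so the sketch is runnable without a server.
NUM_SHARDS = 4
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for db in shards:
    db.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT)")

def shard_for(key: str) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per
    process and would route the same key differently across runs.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def insert_customer(customer_id: str, name: str) -> None:
    """Write the row to the single shard that owns this key."""
    db = shards[shard_for(customer_id)]
    db.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (customer_id, name))
    db.commit()

def get_customer(customer_id: str):
    """Read from the owning shard; returns the name or None if absent."""
    db = shards[shard_for(customer_id)]
    row = db.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None

def count_all() -> int:
    """Cross-shard queries must fan out and combine results in the application."""
    return sum(
        db.execute("SELECT COUNT(*) FROM customers").fetchone()[0] for db in shards
    )
```

Single-key reads and writes touch one database, while anything spanning keys (counts, joins, transactions) has to be assembled by the application. Simple modulo routing like this also means that changing NUM_SHARDS moves most keys, so a real design would need a plan for rebalancing.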
In [senior program manager] David [Robinson's] blog posting yesterday, he wrote the majority of database applications will work. What types of applications are not going to be suited for this environment that developers should beg off?
There are certain workloads that are natural to clouds. In terms of Web workloads, we see them going to the cloud. We see a lot of data protection and storage-type workloads going to the cloud, like CRM applications, content management, product lifecycle management, supply chain and collaboration across enterprise. Where we continue to work toward is where we can have data warehouses and data-marts in the cloud. We are seeing a lot of excitement around BI workloads in the cloud. Or reporting-type applications living in the cloud. There is probably a natural tendency for these early-adopter workloads to go to the cloud right away and there is going to be a tendency of some other workloads like data warehouse and real OLTP workloads to go to the cloud in time.
What will be the surcharge for SDS over Azure Table Services?
We are still working on the pricing. I think sometime in the middle of the year, we will have some more information on the actual business model.
Do you think it will be competitive with Amazon's EC2 60-cent standard or more the $1.20-per-hour enterprise standard that Amazon is offering?
We are still working on that. We certainly don't have a range or a price point at this point.
Will the new SDS run on SQL Server 2008?
It is currently using 2005, but we have a roadmap to move to 2008.
Upon release?
That's the plan.
Will SDS use Enterprise Edition?
It will use Enterprise Edition. Just to be clear, when we say Enterprise Edition, we don't just take the box and put it in the cloud. You're really not going to take the code bit by bit and line by line and put it in a box and run it on SDS because it is not a hosted environment -- it's a shared database infrastructure. The code base is taken from the enterprise; we have an enhanced architecture to run on datacenter machines. We can leverage the cost benefit of running it on cheap hardware but deliver an enterprise-class, mission-critical database.
Will it be TDE [Transparent Data Encryption] Enabled?
We are looking at different security features of how we can enable it. The thing is, there is a list of features that are available on-premises and quite frankly there's going to be some features that we leverage from inside-out and there are going to be a lot of features coming from outside-in based on the customer feedback.
How will users of TDI [Trusted Database Interpretation] and column-level encryption protect their private keys from unauthorized access?
We are looking into the type of workload and requirements for row-level security and column-level security and based on the requirements, we will actually enable those features.
How will data partitioning be handled?
We built an intelligent software infrastructure across the nodes that actually knows the size of each node and partitions data across the nodes.
Will all SQL Server transaction scopes be supported?
That's the plan.
What should developers be on the lookout for next week regarding SDS?
People will see the code and the bits running. There will be a demo of our SDS relational data model and you will see it working, and there will be a good level of discussion about the architecture under the hood and the types of applications that can be built in real time. That will give a sense of how easy it is to actually use T-SQL in applications or run existing T-SQL applications in the cloud.
Posted by Jeffrey Schwartz on 03/11/2009