Tuesday, December 21, 2010

Getting started with Windows Azure

For the last couple of weeks, we've been looking into converting our solution to a Windows Azure Application, so this post is gonna explain the basics of Windows Azure. Note that I won't cover pricing here: I don't have the price list, and I'm not sure what is charged and what isn't.

Let's understand it.

How it works

We basically have two types of projects within a Windows Azure Application solution: Web Roles, which are Web Applications, and Worker Roles, which can be class libraries or console applications. Note that a Web Role can only be a Web Application, not a Web Site. See the post "Converting a Web Site to a Web Application" for more details.

The Web Role calls the Worker Roles to do heavy work in the background: something asynchronous that doesn't necessarily send a response back to the Web Role, although it could. How does it call them? Through a communication pipe, a Queue. It works much like a Queue object: roughly first in, first out, although Azure queues don't strictly guarantee the order. The Web Role puts a message in the Queue. At regular intervals, the Worker Roles check the Queue for new messages and verify whether a message is meant for them. If it is, they do their work and delete the message. Simple.

To add Roles, just right-click the Roles folder and select the desired option. You can also turn an existing project (if compatible) into a Role.
When you create a Role, you'll get a file named WebRole.cs or WorkerRole.cs, depending on which type of role you've created. Each file contains a class that inherits from RoleEntryPoint, with methods such as OnStart(), OnStop() and Run(). We'll come back to them later.

This is an image of what the Solution Explorer looks like with a Web Role and a Worker Role:


Notice that we have two configuration files: ServiceConfiguration.cscfg and ServiceDefinition.csdef. Those basically contain information about the roles and their settings.

Storage?

We have SQL Azure, Tables, and Blobs.

SQL Azure seems to work smoothly with basic instructions. It only requires that each table has a Clustered Index. You can also move an existing database there by scripting it. I haven't tested it yet, but it looks like SQL Server 2008 R2 supports connecting to SQL Azure as if it were an ordinary SQL Server instance, meaning that it shows the server in Object Explorer and lets you visually navigate through its objects. You can then script your existing database and run the scripts through Management Studio. For existing applications, connecting to it should be as simple as changing the Connection String.

Blobs. "Blobs are chunks of data". Well, they are written as byte arrays, so you could use them to store .txt files, .xml files, anything. You can create a container, which works like a folder, and put blobs inside of it. For example, each customer could have their own container, and inside it, their Blobs.

Tables work like an Excel Sheet, or an Access Database. They are used to store data that is not relational. An example would be storing each customer's code and connection string.

Clouding it

First, let's configure our Roles to use development storage. I've done this with Windows Azure SDK 1.3, so I'm assuming you have that installed.

Open the settings page of a role: right-click the role, click Properties, then go to the Settings page.


Add another setting and set its type to Connection String. Then click on the [...] button, and check Use the Windows Azure storage emulator. You can name the setting whatever you want, just keep that name in mind. Let's use something generic, like "DataConnectionString". Do that for each of your roles.

After you've done that, your project knows where its storage is, so the next step is to write to the Queue.

We're gonna write a string message that tells our Worker Role to create a Blob .txt file, and write something to it. Firstly, there's a piece of code that we need to execute before we can connect to our development storage account:

using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;


CloudStorageAccount.SetConfigurationSettingPublisher((configName, configSetter) =>
{
      // Provide the configSetter with the initial value
      configSetter(RoleEnvironment.GetConfigurationSettingValue(configName));

      RoleEnvironment.Changed += (s, arg) =>
      {
            // Only react to changes of the setting this publisher was asked about
            if (arg.Changes.OfType<RoleEnvironmentConfigurationSettingChange>().Any((change) => (change.ConfigurationSettingName == configName)))
            {
                  // The corresponding configuration setting has changed, propagate the value
                  if (!configSetter(RoleEnvironment.GetConfigurationSettingValue(configName)))
                  {
                        // In this case, the change to the storage account credentials in the
                        // service configuration is significant enough that the role needs to be
                        // recycled in order to use the latest settings. (for example, the 
                        // endpoint has changed)
                        RoleEnvironment.RequestRecycle();
                  }
            }
      };
});

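I couldn't find an official answer to "Why do I need to run this?", so if you do have one, leave a comment with it, and I'll add it here. As far as I can tell, CloudStorageAccount.FromConfigurationSetting() relies on this publisher to turn a setting name into its value, because the storage library doesn't know about RoleEnvironment on its own. I've left the code's original comments in. If you try to get the storage account without running the code above, you'll get an exception saying that you must call the method SetConfigurationSettingPublisher before accessing it. Also, I tried putting this in the OnStart() method in WebRole.cs, and it did not work (with SDK 1.3, the web application seems to run in a separate process from the role entry point). So just put it in Application_Start() in Global.asax. As for Worker Roles, you should put it in the OnStart() method in WorkerRole.cs.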

To add our Queue Message, I've put this on a button click event:

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;


// Gets our storage account
CloudStorageAccount storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");

// Gets a reference to our Queue
CloudQueue queue = storageAccount.CreateCloudQueueClient().GetQueueReference("blobqueue");
queue.CreateIfNotExist();
queue.AddMessage(new CloudQueueMessage("TXT;This is some text to be written"));

Notice that we call the CreateIfNotExist() method. This way we can always be sure the Queue exists before we add a message to it. And the Queue name MUST be entirely lowercase! You'll get a StorageClientException with message "One of the request inputs is out of range." if you have any uppercase character in its name.
Also notice that the message is a plain string, so we're gonna have to handle message parts our own way.
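Since the message is just a string, the "handle message parts our own way" bit can be sketched as a small helper. This is only a minimal sketch: the class name QueueMessageParser and the method TryParse are hypothetical (not part of the Azure SDK), and it assumes the "ACTION;content" convention used in this post. It splits on the first ';' only, so the content itself may contain semicolons.

```csharp
using System;

// Hypothetical helper (not part of the Azure SDK): parses "ACTION;content" messages.
public static class QueueMessageParser
{
    // Returns false when the message is empty, has no ';' separator,
    // or has an empty action type. On failure both outputs are empty strings.
    public static bool TryParse(string raw, out string actionType, out string content)
    {
        actionType = string.Empty;
        content = string.Empty;

        if (string.IsNullOrEmpty(raw))
        {
            return false;
        }

        // Limit the split to 2 parts so that ';' inside the content is preserved
        string[] parts = raw.Split(new[] { ';' }, 2);
        if (parts.Length != 2 || parts[0].Length == 0)
        {
            return false;
        }

        actionType = parts[0];
        content = parts[1];
        return true;
    }
}
```

In the Worker Role, you would pass msg.AsString to TryParse and decide what to do with messages that fail to parse (for example, log and delete them so they don't come back forever).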

So when the code above executes we have our first message on the Queue. Simple, right?
Now we're gonna get that message on our WorkerRole.


On the WorkerRole.cs file I've mentioned before, check the code that was generated inside the Run() method:

public override void Run()
{
      // This is a sample worker implementation. Replace with your logic.
      Trace.WriteLine("ClassLibraryRole entry point called", "Information");

      while (true)
      {
            Thread.Sleep(10000);
            Trace.WriteLine("Working", "Information");
      }
}

Yes, it's literally that: a while(true) loop. We trap the execution there, and then we can do whatever we want, including getting messages from the Queue. We could implement some sort of back-off logic to relieve the processor a bit: each time it checks for a message and notices there isn't any, it sets the sleep time a little bit higher. Of course we'd have to set a maximum on that sleep, so it doesn't grow into a huge amount of time.
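The "sleep a bit longer each time the Queue is empty, up to a maximum" idea could look something like the sketch below. The class PollingBackoff and its members are hypothetical names made up for this example, and doubling the delay is just one possible choice; it is not something the SDK provides.

```csharp
using System;

// Hypothetical helper: computes how long to sleep between queue checks.
public class PollingBackoff
{
    private readonly int minMs;
    private readonly int maxMs;
    private int currentMs;

    public PollingBackoff(int minMs, int maxMs)
    {
        if (minMs <= 0 || maxMs < minMs)
        {
            throw new ArgumentOutOfRangeException();
        }

        this.minMs = minMs;
        this.maxMs = maxMs;
        this.currentMs = minMs;
    }

    // Call when the queue had no message. Returns the delay to sleep now,
    // then doubles the next delay, never going above the maximum.
    public int NextDelayAfterEmptyPoll()
    {
        int delay = currentMs;
        currentMs = (int)Math.Min((long)currentMs * 2, maxMs);
        return delay;
    }

    // Call when a message was found, so polling becomes frequent again.
    public void Reset()
    {
        currentMs = minMs;
    }
}
```

Inside Run(), the idea would be to call Thread.Sleep(backoff.NextDelayAfterEmptyPoll()) when no message was found, and backoff.Reset() right after a message was processed.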

Now let's handle getting messages from the Queue:

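// Get the storage account and the Queue once, outside the polling loop
CloudStorageAccount storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
CloudQueue queue = storageAccount.CreateCloudQueueClient().GetQueueReference("blobqueue");

// The Worker Role may start before the Web Role has created the Queue
queue.CreateIfNotExist();

while (true)
{
      CloudQueueMessage msg = queue.GetMessage();

      if (msg != null)
      {
            // Split on the first ';' only, so the content may contain ';' itself
            string[] parts = msg.AsString.Split(new[] { ';' }, 2);
            string actionType = parts[0];
            string content = parts.Length > 1 ? parts[1] : string.Empty;

            switch (actionType)
            {
                  case "TXT":
                        // Write the Blob
                        break;
            }

            // After all work is done, delete the message
            queue.DeleteMessage(msg);
      }
      else
      {
            // Nothing to do, wait 10 seconds before checking again
            Thread.Sleep(10000);
      }
}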

Like I said, we could implement a better solution for that Sleep() over there, but, in some cases, it is fully functional just as it is: checking the queue within a certain time span.

Now we're gonna get that message, and write its content to a Blob:

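// Get the storage account and the Queue once, outside the polling loop
CloudStorageAccount storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
CloudQueue queue = storageAccount.CreateCloudQueueClient().GetQueueReference("blobqueue");

// The Worker Role may start before the Web Role has created the Queue
queue.CreateIfNotExist();

while (true)
{
      CloudQueueMessage msg = queue.GetMessage();

      if (msg != null)
      {
            // Split on the first ';' only, so the content may contain ';' itself
            string[] parts = msg.AsString.Split(new[] { ';' }, 2);
            string actionType = parts[0];
            string content = parts.Length > 1 ? parts[1] : string.Empty;

            switch (actionType)
            {
                  case "TXT":
                        // Get (or create) the container, then upload the text as a new Blob
                        CloudBlobContainer blobContainer = storageAccount.CreateCloudBlobClient().GetContainerReference("mycontainer");
                        blobContainer.CreateIfNotExist();
                        // "HH" is the 24-hour clock; "hh" would give the same name at 1 AM and 1 PM
                        CloudBlob cloudBlob = blobContainer.GetBlobReference(string.Concat(DateTime.Now.ToString("dd-MM-yyyy_HH-mm-ss"), ".txt"));
                        cloudBlob.UploadText(content);
                        break;
            }

            // After all work is done, delete the message
            queue.DeleteMessage(msg);
      }
      else
      {
            // Nothing to do, wait 10 seconds before checking again
            Thread.Sleep(10000);
      }
}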

That should create a new Blob container (a folder) and inside it, a .txt file (Blob). The same rule applies to the container name: it must not have any uppercase characters, or you will get a StorageClientException with the message "One of the request inputs is out of range."
Simple again, huh.
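To catch a bad name before the storage call throws, a check along these lines could help. It is only a sketch: StorageNames and IsValidContainerOrQueueName are hypothetical names, and the rules encoded here (3 to 63 characters, lowercase letters, digits and hyphens, with every hyphen sitting between two letters or digits) are my understanding of the documented naming rules for containers and queues, so double-check them against the official documentation.

```csharp
// Hypothetical helper: checks a container or queue name before using it.
public static class StorageNames
{
    public static bool IsValidContainerOrQueueName(string name)
    {
        // Assumed length limits: 3 to 63 characters
        if (name == null || name.Length < 3 || name.Length > 63)
        {
            return false;
        }

        for (int i = 0; i < name.Length; i++)
        {
            char ch = name[i];
            bool isLowerOrDigit = (ch >= 'a' && ch <= 'z') || (ch >= '0' && ch <= '9');
            if (isLowerOrDigit)
            {
                continue;
            }

            // A hyphen is only accepted in the middle of the name and
            // never right after another hyphen. Anything else
            // (uppercase letters, spaces, symbols) is rejected.
            bool hyphenOk = ch == '-' && i > 0 && i < name.Length - 1 && name[i - 1] != '-';
            if (!hyphenOk)
            {
                return false;
            }
        }

        return true;
    }
}
```

With this, "mycontainer" and "blobqueue" pass, while "MyContainer" is rejected because of the uppercase characters.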

Let's write something to the Table. In this example, we're gonna write customer records to the table: Customer ID, Customer's Name, and Customer's Connection String, assuming that each customer will have its own Database.

As a brief explanation of tables, the three important columns are: PartitionKey, RowKey, and Timestamp.

PartitionKey is a string that groups related records, so we can use it to filter them. For example, if we have two sources of customers, and we wanna know where they came from, we could have two different values for PartitionKey: "CustSource1" and "CustSource2". So when selecting those customers, we could filter them by PartitionKey.

RowKey is our primary key (together with the PartitionKey). It cannot be repeated within the same partition, and it has no auto-increment mechanism like identity fields do. Usually, GUIDs are used.

Timestamp is just a last modified time stamp.

To interact with Tables, we have to define a Model and a Context.
Our model will describe the record, and which columns it'll have. The context will interact directly with the table, inserting, deleting and selecting records.

It's important NOT to think of these tables as SQL Server tables. They don't maintain a fixed schema. You can insert a record with 3 columns in a certain table, and then a record with 6 completely different columns in the same table, and it will take it normally.
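To make the "no fixed schema" point concrete, here is a tiny in-memory model of how I picture a table. It is purely illustrative and has nothing to do with the real storage API: InMemoryTable and its members are hypothetical. Each record is identified by PartitionKey plus RowKey, and carries whatever set of columns it was inserted with.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of a schema-less table (not the Azure API).
public class InMemoryTable
{
    // One entry per record; each record has its own set of columns
    private readonly Dictionary<string, Dictionary<string, object>> rows =
        new Dictionary<string, Dictionary<string, object>>();

    // PartitionKey + RowKey together identify a record
    private static string Key(string partitionKey, string rowKey)
    {
        return partitionKey + "\n" + rowKey;
    }

    public int Count
    {
        get { return rows.Count; }
    }

    public void Insert(string partitionKey, string rowKey, Dictionary<string, object> columns)
    {
        string key = Key(partitionKey, rowKey);
        if (rows.ContainsKey(key))
        {
            throw new InvalidOperationException("A record with this PartitionKey and RowKey already exists.");
        }

        rows.Add(key, new Dictionary<string, object>(columns));
    }

    // Returns how many columns the record has, or -1 if it does not exist
    public int ColumnCount(string partitionKey, string rowKey)
    {
        Dictionary<string, object> columns;
        return rows.TryGetValue(Key(partitionKey, rowKey), out columns) ? columns.Count : -1;
    }
}
```

Inserting a record with 2 columns and then one with 3 different columns into the same InMemoryTable works fine, while inserting the same PartitionKey and RowKey twice throws.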

To define our Model, we have to inherit from TableServiceEntity. We can then define our own columns and constructors and whatever we want to.

class CustModel : TableServiceEntity
{
    private const string PARTITIONKEY = "Cust";
    public const string TABLENAME = "CustModel";

    // Parameterless constructor, needed so records can be
    // turned back into objects when we query the table
    public CustModel()
    {
    }

    // PartitionKey is fixed, RowKey is the customer ID
    public CustModel(string custID, string name, string connStr)
        : base(PARTITIONKEY, custID)
    {
        Name = name;
        ConnStr = connStr;
    }

    public CustModel(string custID)
        : base(PARTITIONKEY, custID)
    {
    }

    public string Name { get; set; }
    public string ConnStr { get; set; }
}


Once we have defined our Model, we'll work on our Context. The Context inherits from TableServiceContext, which provides some important key methods to work with. Our own interaction methods should also go here, like in the code below:

class CustDataContext : TableServiceContext
{
    private const string CUSTMODEL = "CustModel";

    public CustDataContext(string baseAddress, StorageCredentials credentials)
        : base (baseAddress, credentials)
    {
        CloudTableClient.CreateTablesFromModel(typeof(CustDataContext), baseAddress, credentials);
    }

    // Typed query over the "CustModel" table
    public IQueryable<CustModel> CustModel
    {
        get
        {
            return this.CreateQuery<CustModel>(CUSTMODEL);
        }
    }

    public void DeleteCustomer(CustModel cust)
    {
        this.DeleteObject(cust);
        this.SaveChanges();
    }

    public void AddCustomer(CustModel cust)
    {
        this.AddObject(CUSTMODEL, cust);
        this.SaveChanges();
    }

    public CustModel GetCustomerByID(string custID)
    {
        var results = from cust in this.CustModel
                      where cust.RowKey == custID && cust.PartitionKey == "Cust"
                      select cust;

        return results.FirstOrDefault();
    }

    public CustModel GetCustomerByName(string name)
    {
        var results = from cust in this.CustModel
                      where cust.Name == name && cust.PartitionKey == "Cust"
                      select cust;

        return results.FirstOrDefault();
    }
}

Notice the IQueryable property: it returns the table whose name is passed as a parameter to CreateQuery(). Whenever you want to access the table, go through that property. You can also define more than one property, but always one table per property. The only other key method is the one called inside the constructor.
Oh, and you have to add a reference to System.Data.Services.Client, or you won't be able to call the CreateQuery() method.

Now how do we work with this? With the code below!

CloudStorageAccount storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
tableClient.CreateTableIfNotExist(CustModel.TABLENAME); 
  
CustDataContext custDataContext = new CustDataContext(storageAccount.TableEndpoint.ToString(), storageAccount.Credentials);
CustModel cust = new CustModel("1", "Felipe", "MyConnStr");
custDataContext.AddCustomer(cust);

It should create a table with the name defined in the constant CustModel.TABLENAME, with the columns defined in the Model. You do not have to create the columns manually; that is done automatically.



So this post explained the basics of Windows Azure, and how simple it is to work with it. If you are going to do so, I suggest you download some applications from CodePlex, such as:



SQL Azure Migration Wizard helps you migrate your already existing database to SQL Azure.
Azure Storage Explorer lets you visually navigate through your storage objects, such as Queue Messages, Blobs, Containers, Tables, etc.
